-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: MDEV-34041 Display additional information for materialized subqueries… #3241
base: 10.5
Are you sure you want to change the base?
Conversation
… in EXPLAIN/ANALYZE FORMAT=JSON
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, there are two kinds of data structures:
- EXPLAIN data structure
- Tracker.
Explain_subq_materialization tries to be both. I know this is caused by
the fact that there's really no query plan.
I would still request to have the Explain class and the tracker class
explicitly. Even if the Explain class has only one member, the tracker.
The tracker class should follow other tracker classes:
-
data members are private.
-
execution code calls functions to report data
-
printing json output is done by local members.
-
constructor initalizes everything to some initial state which
can already be printed (currently it doesn't). This is to allow
SHOW ANALYZE and also the situation where subquery is never invoked.
sql/item_subselect.cc
Outdated
if (is_analyze) | ||
{ | ||
writer->add_member("r_exec_strategy").add_str( | ||
exec_strategy_str[exec_strategy]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is "exec" redundant in r_exec_strategy
, can we just use r_strategy
?
Looking at the enum of strategies, I don't see many that make sense:
{
"undefined", "complete_match", "partial_match", "partial_match_merge",
"partial_match_scan", "impossible"
};
undefined
makes sense if the execution never reached the code that determines the strategy.
complete match
means "only do index lookup", perhaps we should just print "index_lookup".
partial_match
- this does not occur in practice, does it?
partial_match_scan
and partial_match_merge
are ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with r_strategy
. The enum of strategies indeed looks excessive, partial_match
and impossible
don't seem to be used anywhere. Do you think we can shrink subselect_hash_sj_engine::exec_strategy
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not touch that in this patch. I would use a switch(...)
instead of exec_strategy_str
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it also makes sense to introduce r_loops
in subquery_materialization
(to be materialization
)
I don't understand what should be counted/tracked under this counter. Every lookup in the materialized subquery? |
Yes. Sometimes, it is possible to infer this number, but in general case it is not available anywhere else. So it makes a lot of sense to print it. |
Also, note that for non-NULL-aware materialization, it reports
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Input so far... @Olernov please wait with addressing it, I will let know when done
@@ -115,6 +116,12 @@ class Explain_node : public Sql_alloc | |||
*/ | |||
Expression_cache_tracker* cache_tracker; | |||
|
|||
/** | |||
This tracker is not NULL if the node explains a materialized subquery. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a tracker anymore.
@@ -2693,3 +2710,45 @@ void Explain_range_checked_fer::print_json(Json_writer *writer, | |||
writer->end_object(); | |||
} | |||
} | |||
|
|||
|
|||
void Explain_subq_materialization::print_explain_json(Json_writer *writer, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this function gets all the data from tracker, would it make sense to delegate all printing to the tracker?
if (in_subs->left_expr_has_null()) | ||
{ | ||
/* | ||
The case when all values in left_expr are NULL is handled by | ||
Item_in_optimizer::val_int(). | ||
*/ | ||
if (tracker) | ||
tracker->increment_table_scan_loops(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that table scan is NOT done if in_subs->is_top_level_item()
evaluates to TRUE right below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will cause wrong number of table scans to be reported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... and is this relevant at all ?
Why does subselect_uniquesubquery_engine::exec()
access the "materialization_tracker"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I do not see any testcase with r_table_scan_loops
. Please add one. (or remove r_table_scan_loops if it's dead code, that's what I suspect).
"materialization": { | ||
"r_strategy": "index_lookup", | ||
"r_loops": 3, | ||
"r_index_lookup_loops": 3, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not add "_loops" to all counters.
Please change to "r_index_lookups".
"subqueries": [ | ||
{ | ||
"materialization": { | ||
"r_strategy": "partial_match_merge", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks confusing: it is not clear that partial matches are only done when we can't do index lookup.
How about changing to
index_lookup;array merge for partial match
"r_partial_match_loops": 1,
Please change to r_partial_matches
.
"r_partial_merge_key_sizes": ["2"],
please change to r_partial_match_array_sizes
.
"subqueries": [ | ||
{ | ||
"materialization": { | ||
"r_strategy": "partial_match_scan", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following the above suggestion, let's change partial_match_scan
into
index_lookup;full scan for partial match
@Olernov , ok I think there are enough comments to get the next version of the patch. |
… in EXPLAIN/ANALYZE FORMAT=JSON
Description
This is a WIP (work-in-progress) PR, that's why the test suite is not updated. The ANALYZE output format is open for discussion, so it makes sense to update existing tests after the output format is approved
Release Notes
TODO: What should the release notes say about this change?
Include any changed system variables, status variables or behaviour. Optionally list any https://mariadb.com/kb/ pages that need changing.
How can this PR be tested?
TODO: modify the automated test suite to verify that the PR causes MariaDB to behave as intended.
Consult the documentation on "Writing good test cases".
If the changes are not amenable to automated testing, please explain why not and carefully describe how to test manually.
Basing the PR against the correct MariaDB version
PR quality check