Improve performance of distinct aggregations #21907
lgtm % comments
core/trino-main/src/main/java/io/trino/SystemSessionProperties.java
core/trino-main/src/main/java/io/trino/sql/planner/PlanOptimizers.java
...no-main/src/main/java/io/trino/sql/planner/iterative/rule/DistinctAggregationController.java
core/trino-main/src/main/java/io/trino/sql/planner/PlanOptimizers.java
...ino-main/src/main/java/io/trino/sql/planner/iterative/rule/DistinctAggregationToGroupBy.java
...ino-main/src/main/java/io/trino/sql/planner/iterative/rule/DistinctAggregationToGroupBy.java
lgtm % minor comments
.../java/io/trino/sql/planner/iterative/rule/TestMultipleDistinctAggregationToMarkDistinct.java
core/trino-main/src/main/java/io/trino/SystemSessionProperties.java
core/trino-main/src/main/java/io/trino/SystemSessionProperties.java
Please update the release notes section to specify which system session properties and configs were removed or made defunct, and what the new alternatives are.
Remove deprecated symbol usage.
core/trino-main/src/main/java/io/trino/SystemSessionProperties.java
core/trino-main/src/main/java/io/trino/sql/planner/OptimizerConfig.java
...no-main/src/main/java/io/trino/sql/planner/iterative/rule/DistinctAggregationController.java
...no-main/src/main/java/io/trino/sql/planner/iterative/rule/DistinctAggregationController.java
Extract the logic that determines whether direct distinct aggregation is applicable, so that it can be reused in multiple optimizer rules.
…oupBy Rename class to before refactoring to preserve history
The rule replaces `OptimizeMixedDistinctAggregations`, and adds support for multiple distinct aggregations.
Replace optimizer.optimize-mixed-distinct-aggregations with a new optimizer.distinct-aggregations-strategy value, `pre_aggregate`. Also rename the corresponding config property optimizer.mark-distinct-strategy to optimizer.distinct-aggregations-strategy, and rename its values: NONE -> SINGLE_STEP and ALWAYS -> MARK_DISTINCT.
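For anyone migrating configuration, the renames in this commit message can be sketched as a small lookup. This is only an illustration: `LegacyDistinctStrategyMapping` and its methods are hypothetical names and are not part of Trino. The commit message also does not say what a disabled optimizer.optimize-mixed-distinct-aggregations should map to, so the sketch returns an empty result in that case.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper (not Trino code): translates legacy settings to
// values of optimizer.distinct-aggregations-strategy.
final class LegacyDistinctStrategyMapping
{
    // Renames stated in the commit message for optimizer.mark-distinct-strategy values.
    private static final Map<String, String> RENAMED_VALUES = Map.of(
            "NONE", "SINGLE_STEP",
            "ALWAYS", "MARK_DISTINCT");

    private LegacyDistinctStrategyMapping() {}

    // Values with no rename listed in the commit message pass through unchanged.
    static String toNewValue(String legacyValue)
    {
        String normalized = legacyValue.trim().toUpperCase(Locale.ROOT);
        return RENAMED_VALUES.getOrDefault(normalized, normalized);
    }

    // Assumption: an enabled optimizer.optimize-mixed-distinct-aggregations
    // corresponds to PRE_AGGREGATE; a disabled flag implies no particular strategy.
    static Optional<String> fromMixedDistinctFlag(boolean enabled)
    {
        return enabled ? Optional.of("PRE_AGGREGATE") : Optional.empty();
    }

    public static void main(String[] args)
    {
        System.out.println(toNewValue("none"));
        System.out.println(toNewValue("always"));
        System.out.println(fromMixedDistinctFlag(true));
    }
}
```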
Use estimated aggregation source NDV and the number of grouping keys to decide if pre-aggregate strategy should be used for a given aggregation
Description
Improve performance of distinct aggregations by defining additional strategies based on source properties (for example, NDV).
The strategy to use for multiple distinct aggregations:
- `SINGLE_STEP`: Computes distinct aggregations in a single step without any pre-aggregation. This strategy performs poorly if the number of distinct grouping keys is small.
- `MARK_DISTINCT`: Uses `MarkDistinct` for multiple distinct aggregations or for a mix of distinct and non-distinct aggregations.
- `PRE_AGGREGATE`: Computes distinct aggregations using a combination of aggregation and pre-aggregation steps.
- `AUTOMATIC`: Chooses the strategy automatically. The single-step strategy is preferred. However, for cases with limited concurrency due to a small number of distinct grouping keys, it chooses an alternative strategy based on input data statistics.
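The automatic choice described above can be illustrated with a minimal sketch. It is not the rule implemented in this PR: the class name, the `minGroupsForSingleStep` threshold, and the handling of unknown statistics are all assumptions made for illustration, and `MARK_DISTINCT` is left out of the automatic branch for brevity.

```java
// Hypothetical sketch (not Trino code) of picking a distinct-aggregation strategy.
final class DistinctStrategyChooserSketch
{
    enum Strategy { SINGLE_STEP, MARK_DISTINCT, PRE_AGGREGATE, AUTOMATIC }

    private DistinctStrategyChooserSketch() {}

    // configured:             the strategy from configuration
    // groupingKeyCount:       number of grouping keys in the aggregation
    // estimatedGroupCount:    estimated NDV of the grouping keys, NaN when unknown
    // minGroupsForSingleStep: assumed threshold below which concurrency is considered limited
    static Strategy choose(
            Strategy configured,
            int groupingKeyCount,
            double estimatedGroupCount,
            double minGroupsForSingleStep)
    {
        // An explicitly configured strategy is used as-is.
        if (configured != Strategy.AUTOMATIC) {
            return configured;
        }
        // A global aggregation has a single group, so single-step would not parallelize.
        if (groupingKeyCount == 0) {
            return Strategy.PRE_AGGREGATE;
        }
        // Assumption: without statistics, fall back to the preferred single-step strategy.
        if (Double.isNaN(estimatedGroupCount)) {
            return Strategy.SINGLE_STEP;
        }
        // Few distinct grouping keys limit concurrency, so pre-aggregate instead.
        return estimatedGroupCount < minGroupsForSingleStep
                ? Strategy.PRE_AGGREGATE
                : Strategy.SINGLE_STEP;
    }

    public static void main(String[] args)
    {
        System.out.println(choose(Strategy.AUTOMATIC, 2, 10, 1_000));
        System.out.println(choose(Strategy.AUTOMATIC, 2, 1_000_000, 1_000));
    }
}
```

In this sketch the threshold is a plain parameter. A real rule would presumably derive it from cluster properties such as the number of workers, which the description does not spell out.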
Additional context and related issues
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: