-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add preferred timeout for small dynamic filters #22527
base: master
Are you sure you want to change the base?
Conversation
3e150d1
to
6ee8f40
Compare
a70fc00
to
c970010
Compare
core/trino-main/src/main/java/io/trino/SystemSessionProperties.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/execution/DynamicFilterConfig.java
Outdated
Show resolved
Hide resolved
...src/main/java/io/trino/sql/planner/optimizations/DeterminePreferredDynamicFilterTimeout.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/JdbcDynamicFilteringConfig.java
Outdated
Show resolved
Hide resolved
plugin/trino-hive/src/test/java/io/trino/plugin/hive/TestBackgroundHiveSplitLoader.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/sql/planner/planprinter/PlanPrinter.java
Outdated
Show resolved
Hide resolved
core/trino-main/src/main/java/io/trino/SystemSessionProperties.java
Outdated
Show resolved
Hide resolved
plugin/trino-base-jdbc/src/main/java/io/trino/plugin/jdbc/JdbcDynamicFilteringSplitManager.java
Outdated
Show resolved
Hide resolved
41e1320
to
5238c35
Compare
import static java.lang.Math.max; | ||
import static java.util.Objects.requireNonNull; | ||
|
||
public class DeterminePreferredDynamicFilterTimeout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why isn’t this a Rule or set of them? Visitor-based optimizers are legacy, and we’re moving away from them. We should avoid introducing any new ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're trying to change an attribute of the filter node on the probe side scan of join based on the estimated size of the build side of that join. I'm not sure how this would be designed as one or more series of Rules.
Do we have a similar Rule elsewhere that we could use as a reference ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rule approach would require matching against Join/SemiJoin/DynamicFilterSourceNode and processing all probe side FilterNodes, extracting conjuncts to match against join node dynamic filter. It would cause far more lookups and rewrites than extracting all dynamic filter sources and process filter nodes once.
20ba16f
to
09b8a31
Compare
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua |
@Dith3r can you rebase? |
Add a CBO rule to estimate if dynamic filter is worth waiting for. Focus on small tables that are most often used as dimension tables.
@sopel39 ptal |
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua |
@Dith3r please rebase :) |
/** | ||
* Returns preferred timeout in milliseconds if build side can be estimated otherwise empty, that connector should wait | ||
* for the dynamic filter to be narrowed down since split enumeration started. | ||
* Future from {@link DynamicFilter#isBlocked()} method should be acquired before getting preferred dynamic filter timeout. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Dith3r could you remind me why it was the case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you call getPreferredTimeout
before acquiring future from isBlocked
you can receive future with excluded already resolved DF associated with preferred timeout and wait for DFs that have no preferred timeout set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If DF is already resolved, then no waiting will happen anyway, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
Description
Add a CBO rule to estimate if dynamic filter is worth waiting for. Focus on small tables that are most often used as dimension tables and nodes that generate small number of distinct values.
Rule DeterminePreferredDynamicFilterTimeout could allow us to remove 20s forced wait time for every possible DF in JDBC connectors if table presents the least row count statistic.
Additional context and related issues
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(X) Release notes are required, with the following suggested text: