forked from apache/doris
Commit
[Fix](parquet-reader) Fix and optimize parquet min-max filtering. (apache#38277)

## Proposed changes

Refer to trino's implementation.

- Some historical versions of parquet-mr have bugs that affect the statistics they write. Use `CorruptStatistics::should_ignore_statistics()` to detect these files and ignore their statistics.
- Older versions of parquet write `min` and `max` stats; later versions added `min_value` and `max_value`. The `min`/`max` stats cannot be used for some types and in some cases, because their validity depends on how values are compared and sorted.
- For double or float columns, special cases such as NaN, -0, and 0 must be handled.
- If a string column only has `min` and `max` stats, but no `min_value` or `max_value`, use `ParquetPredicate::_try_read_old_utf8_stats()` to derive a wider range from the old stats so that min-max filtering can still be applied.
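
The float and double case can be illustrated with a minimal sketch. It is not the code from this commit: `FloatRange`, `sanitize_float_stats` and `may_contain` are hypothetical names, and the rules follow the Parquet format's guidance for floating-point statistics (a NaN bound makes the stats unusable, a `+0` minimum may hide `-0` values, and a `-0` maximum may hide `+0` values).

```cpp
#include <cmath>
#include <optional>

// Hypothetical type, not part of Doris: a usable [min, max] range for a
// float/double column chunk.
struct FloatRange {
    double min;
    double max;
};

// Returns a range that is safe to filter on, or std::nullopt when the
// statistics must be ignored.
inline std::optional<FloatRange> sanitize_float_stats(double min, double max) {
    // NaN does not order against other values, so a NaN bound proves nothing.
    if (std::isnan(min) || std::isnan(max)) {
        return std::nullopt;
    }
    // A +0 minimum may hide -0 values: widen it to -0.
    if (min == 0.0 && !std::signbit(min)) {
        min = -0.0;
    }
    // A -0 maximum may hide +0 values: widen it to +0.
    if (max == 0.0 && std::signbit(max)) {
        max = 0.0;
    }
    return FloatRange{min, max};
}

// Returns false only when `col == value` provably matches no row.
inline bool may_contain(const std::optional<FloatRange>& range, double value) {
    // Without usable stats, or for a NaN literal, nothing can be pruned.
    if (!range.has_value() || std::isnan(value)) {
        return true;
    }
    return value >= range->min && value <= range->max;
}
```

With these helpers, `may_contain(sanitize_float_stats(1.0, 2.0), 3.0)` is false, so the row group can be skipped, while any range with a NaN bound is never used for pruning. The zero widening matters mainly when bounds are later compared by sign or bit pattern rather than with IEEE `==`.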
1 parent 1810cba, commit 433b84a

Showing 9 changed files with 1,207 additions and 26 deletions.