-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add late model extension to RNTupleMerger #16080
Add late model extension to RNTupleMerger #16080
Conversation
Test Results 13 files 13 suites 3d 2h 17m 12s ⏱️ Results for commit edc5d49. ♻️ This comment has been updated with latest results. |
39ee934
to
3ea010a
Compare
e3fa48f
to
e4f81d5
Compare
51eb509
to
9b203c5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great improvement! Some comments.
9b203c5
to
9976d02
Compare
[ntuple] extract ExtendOutputModel from Merge
... rather than saving one per source.
- grouped most data that is commonly passed to merge helper functions into a struct to reduce the number of function params - added a page allocator to the merger - removed friendship with RPageStorage, created a public static version of SealPage() - other minor cleanups
9976d02
to
adfda4c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
struct RColumnInfo { | ||
// This column name is built as a dot-separated concatenation of the ancestry of | ||
// the columns' parent fields' names plus the index of the column itself. | ||
// e.g. "Muon.pt.x._0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do this in a follow-up PR: I think the representation index needs to be part of the name, too.
This Pull request:
refactors RNTupleMerger to properly support late model extension.
The "Union" merging mode is added, allowing the merger to late-model-extend the destination to include all the fields of the input ntuples (instead of ignoring unknown fields / complaining when models don't match).
Likewise, the "Strict" merging mode is added that checks that all inputs have the exact same structure.
By default, the old behavior (named "Filter") is used.
To better compare the RNTuples structures, the merging logic is now more properly considering the fields of each input, rather than just the columns as it was previously. This also allows for more descriptive messages to the user if some mismatch is found.
NOTE: the
RNTupleMerger::Merge
function now returns a RResult instead of throwing exceptions on error.TODO
Checklist: