Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ntuple] expose RNTuple(Un)ownedView instead of RNTupleView<T, bool> #16363

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

silverweed
Copy link
Contributor

This Pull request:

splits RNTupleView<T, bool> into RNTupleUnownedView and RNTupleOwnedView. RNTupleView is renamed to Internal::RNTupleViewBase and used as the base class for the new public classes.

Notes

  • I'm conflicted about the naming of the classes. For me it's ambiguous whether the Unowned and Owned labels should be changed to, respectively, Owning and Unowning (note the swap). On one hand you could say RNTupleView<T, false> is "Unowned" by the user; on the other, you could say it is "Owning" its memory. Thoughts about this?
  • I made RNTupleViewBase not constructible (aside from friend classes) by giving it a protected destructor. Maybe we don't want to exclude this possibility (e.g. to allow users to use the base class in template metaprogramming), but in this case maybe we don't want to make it Internal either.

Checklist:

  • tested changes locally
  • updated the docs (if necessary)

This PR fixes #16321

@enirolf
Copy link
Contributor

enirolf commented Sep 3, 2024

To me Owned would indicate ownership by the user, while Owning indicates the opposite (so exactly what you describe). In that regard I think the proposed naming makes sense. I personally would prefer to have the first, so Owned and Unowned (like it is now).

I think regarding the second point, unless we already have concrete cases where users would want to use RNTupleViewBase, it's okay to leave it non-constructible. It would be easier to relax this at a later point than do the opposite (and as you say, it's in Internal so it shouldn't really matter either way).

@vepadulano
Copy link
Member

I agree with previous comments about the meaning of the Owned/Unowned wording. But I also wanted to point out that we might want to consider the aspect of what we communicate to the user. My understanding is that the RNTupleUnownedView view is also the view that we would recommend in the vast majority of cases, whereas the Owned alternative is for power users (i.e. mostly experiment frameworks). To this end, I would propose something akin to RNTupleView for the Unowned and RNTupleUnsafeView for the Owned version.

@silverweed
Copy link
Contributor Author

@vepadulano I agree that if we foresee the Unowned version being the one used "by default" it should probably be just called RNTupleView, though I'm not sure if the "Unsafe" nomenclature communicates the concept well. I don't have better ideas right now either. Maybe "Unmanaged"?

Copy link

github-actions bot commented Sep 3, 2024

Test Results

    13 files      13 suites   2d 23h 54m 10s ⏱️
 3 027 tests  3 025 ✅ 0 💤 2 ❌
33 832 runs  33 830 ✅ 0 💤 2 ❌

For more details on these failures, see this check.

Results for commit da3cf8b.

@pcanal
Copy link
Member

pcanal commented Sep 3, 2024

It seems clear that the two name RNTupleUnownedView and RNTupleOwnedView are too ambiguous (no way to know who the subject of the '[un]owned' is (Is the the user or the system that owns (or not)).

Furthermore I am confused by the question itself. A view is typically a 'small' object that does not own what it gives access to (e.g. std::string_view or its cousin std::span). So is view even the right term? Completely related can you remind me what information we are trying to pass to the user with the 2 distinct names (what do they need to do differently)?

@silverweed
Copy link
Contributor Author

silverweed commented Sep 4, 2024

@pcanal in the currently named Unowned version the user does not own the memory that will be populated with the results. In the Owned version they do, so it's their responsibility to manage that memory.

@hahnjo
Copy link
Member

hahnjo commented Sep 6, 2024

As an idea for naming, and given that we foresee this being used by power users / frameworks, maybe communicate how it is created, with something like RNTuplePointerView? (because you get it by passing in a pointer to GetView)

@enirolf
Copy link
Contributor

enirolf commented Sep 9, 2024

As an idea for naming, and given that we foresee this being used by power users / frameworks, maybe communicate how it is created, with something like RNTuplePointerView? (because you get it by passing in a pointer to GetView)

Hmmm, I'm not so sure about this one. From this name alone I would expect to get a pointer as the return type, rather than indicating that I have to supply the pointer.

@jblomer
Copy link
Contributor

jblomer commented Sep 9, 2024

Just an idea: would it improve the situation if we kept the one name RNTupleView with a template parameter that distinguishes between owning and non-owning, but we make that template parameter an enum so that the meaning is clearly spelled out.

@silverweed
Copy link
Contributor Author

Just an idea: would it improve the situation if we kept the one name RNTupleView with a template parameter that distinguishes between owning and non-owning, but we make that template parameter an enum so that the meaning is clearly spelled out.

I guess we should ask the experiments about this, since this change proposal came from them.
Personally, I'm not sure how much this would help, given that we would just move the naming problem to the enum values rather than the classes.

@jblomer
Copy link
Contributor

jblomer commented Sep 9, 2024

Just an idea: would it improve the situation if we kept the one name RNTupleView with a template parameter that distinguishes between owning and non-owning, but we make that template parameter an enum so that the meaning is clearly spelled out.

I guess we should ask the experiments about this, since this change proposal came from them. Personally, I'm not sure how much this would help, given that we would just move the naming problem to the enum values rather than the classes.

I agree that we should discuss it with the API reviewers. The advantage I see is that I find it easier to be verbose in the enum constant names (e.g., kOwnedByCaller, kOwnedByROOT) than in the class name. Especially, if the non-owning view is the default value. E.g.

auto v1 = reader->GetView<float>(...);
auto v2 = reader->GetView<float, EViewType::kOwnedByCaller>(...)

@jblomer
Copy link
Contributor

jblomer commented Sep 20, 2024

Thinking about this again: the reason that we have a compile time option is essentially for zero-copy views of mappable fields. However, that's not really a critical use case. How about we drop the support for RField<...>::Map() for now, and every view has an actual backing RField::RValue. In this case, through the normal RValue API the user can provide their own storage and we don't need any other special handling.

Zero-copy views can be added at a later stage, e.g. as RNTupleZeroCopyView.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ntuple] Split RNTupleView<T, bool> in two classes
6 participants