-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Core: Make view metadata properties optional in JSON parser #8723
Core: Make view metadata properties optional in JSON parser #8723
Conversation
@@ -66,7 +67,9 @@ static void toJson(ViewMetadata metadata, JsonGenerator gen) throws IOException | |||
gen.writeStringField(VIEW_UUID, metadata.uuid()); | |||
gen.writeNumberField(FORMAT_VERSION, metadata.formatVersion()); | |||
gen.writeStringField(LOCATION, metadata.location()); | |||
JsonUtil.writeStringMap(PROPERTIES, metadata.properties(), gen); | |||
if (!metadata.properties().isEmpty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But In case of table metadata we are always writing it.
JsonUtil.writeStringMap(PROPERTIES, metadata.properties(), generator); |
Should we make it similar in a follow up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can probably leave the TableMetadata
as is. I'm curious why we want to make this change for views, sorry if this was already discussed before @nastra . Why can't we just keep it similar to tables, and have the property always exist, and by default it'll just be the empty collection that Immutables creates?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amogh-jahagirdar: Table metadata , the read is still optional. But always writes. So, else case is a dead code.
For view this PR is making both read and write optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amogh-jahagirdar in newer parsers we typically try to avoid sending empty collections, hence the suggested change.
We could still send an empty collection here for views, but we still need to adjust the read-side to not require properties
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ajantha-bhat Right, my point is I think what TableMetadata is doing already is fine and we should be able to follow the same idea here.
In other words, why make it optional in the first place for views on the write side when going through the Iceberg provided metadata writer?
I can see the argument for optional reads since there's no point failing reading view metadata if it's missing properties, in case somebody messed with metadata or has their own metadata writer (so the else block mentioned is not really dead code in this case)
. That's OK to me because properties are optional from a spec perspective.However, I think that on the write side the Iceberg library metadata writer should just write out the properties.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could still send an empty collection here for views, but we still need to adjust the read-side to not require properties
Yeah I'm on board with this @nastra. Checking on the read side is expected since properties is optional as per the spec. I think any Iceberg library parser should write out "richer" metadata even if it's empty. It makes it easy for someone to look at the metadata file and see the state (rather than determine a field is missing by having to have the spec side by side). Also minor point in this case, it simplifies the code to just write out the field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just discussed with @nastra , I also see the other argument that writing empty collections is a bit wasteful. Considering that, I'm okay either way.
e082a11
to
de9469c
Compare
Thanks, @nastra! |
No description provided.