Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core: Make view metadata properties optional in JSON parser #8723

Merged
merged 1 commit into from
Oct 20, 2023

Conversation

nastra
Copy link
Contributor

@nastra nastra commented Oct 5, 2023

No description provided.

@github-actions github-actions bot added the core label Oct 5, 2023
@@ -66,7 +67,9 @@ static void toJson(ViewMetadata metadata, JsonGenerator gen) throws IOException
gen.writeStringField(VIEW_UUID, metadata.uuid());
gen.writeNumberField(FORMAT_VERSION, metadata.formatVersion());
gen.writeStringField(LOCATION, metadata.location());
JsonUtil.writeStringMap(PROPERTIES, metadata.properties(), gen);
if (!metadata.properties().isEmpty()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But In case of table metadata we are always writing it.

JsonUtil.writeStringMap(PROPERTIES, metadata.properties(), generator);

Should we make it similar in a follow up?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can probably leave the TableMetadata as is. I'm curious why we want to make this change for views, sorry if this was already discussed before @nastra . Why can't we just keep it similar to tables, and have the property always exist, and by default it'll just be the empty collection that Immutables creates?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amogh-jahagirdar: Table metadata , the read is still optional. But always writes. So, else case is a dead code.
For view this PR is making both read and write optional.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amogh-jahagirdar in newer parsers we typically try to avoid sending empty collections, hence the suggested change.
We could still send an empty collection here for views, but we still need to adjust the read-side to not require properties

Copy link
Contributor

@amogh-jahagirdar amogh-jahagirdar Oct 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ajantha-bhat Right, my point is I think what TableMetadata is doing already is fine and we should be able to follow the same idea here.

In other words, why make it optional in the first place for views on the write side when going through the Iceberg provided metadata writer?

I can see the argument for optional reads since there's no point failing reading view metadata if it's missing properties, in case somebody messed with metadata or has their own metadata writer (so the else block mentioned is not really dead code in this case)
. That's OK to me because properties are optional from a spec perspective.However, I think that on the write side the Iceberg library metadata writer should just write out the properties.

Copy link
Contributor

@amogh-jahagirdar amogh-jahagirdar Oct 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could still send an empty collection here for views, but we still need to adjust the read-side to not require properties

Yeah I'm on board with this @nastra. Checking on the read side is expected since properties is optional as per the spec. I think any Iceberg library parser should write out "richer" metadata even if it's empty. It makes it easy for someone to look at the metadata file and see the state (rather than determine a field is missing by having to have the spec side by side). Also minor point in this case, it simplifies the code to just write out the field.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just discussed with @nastra , I also see the other argument that writing empty collections is a bit wasteful. Considering that, I'm okay either way.

@nastra nastra requested a review from rdblue October 5, 2023 13:11
@nastra nastra force-pushed the make-view-metadata-properties-optional branch from e082a11 to de9469c Compare October 19, 2023 12:52
@rdblue rdblue merged commit 94edb0e into apache:main Oct 20, 2023
45 checks passed
@rdblue
Copy link
Contributor

rdblue commented Oct 20, 2023

Thanks, @nastra!

@nastra nastra deleted the make-view-metadata-properties-optional branch October 21, 2023 07:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants