[FLINK-35717][table] Allow defining partition keys and table distribution in CREATE TABLE AS (CTAS) #24993

Merged (1 commit) on Jul 11, 2024

spena (Contributor) commented Jun 27, 2024

What is the purpose of the change

Allows defining PARTITIONED BY and DISTRIBUTED BY in the CTAS statement.

Syntax supported:

CREATE TABLE table_name
[PARTITIONED BY (cols)]
[DISTRIBUTED BY [HASH|RANGE|RANDOM](cols) [INTO n BUCKETS]]
AS SELECT query_expression;
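
As a rough illustration of the grammar above, the sketch below assembles one concrete statement in plain Java. CtasStatementSketch and render are hypothetical names, not Flink classes; the table and column names (t_copy, src, f0, f1) are placeholders, and the clause order simply follows the syntax as written in this description. Only the HASH variant is rendered.

```java
import java.util.List;

/** Hypothetical helper (not part of Flink): renders a CTAS statement per the syntax above. */
public class CtasStatementSketch {

    public static String render(
            String table,
            List<String> partitionKeys,
            List<String> hashKeys,
            int buckets,
            String query) {
        StringBuilder sql = new StringBuilder("CREATE TABLE ").append(table);
        // Optional PARTITIONED BY (cols)
        if (!partitionKeys.isEmpty()) {
            sql.append(" PARTITIONED BY (")
                    .append(String.join(", ", partitionKeys))
                    .append(")");
        }
        // Optional DISTRIBUTED BY HASH(cols) [INTO n BUCKETS]
        if (!hashKeys.isEmpty()) {
            sql.append(" DISTRIBUTED BY HASH(")
                    .append(String.join(", ", hashKeys))
                    .append(")");
            if (buckets > 0) {
                sql.append(" INTO ").append(buckets).append(" BUCKETS");
            }
        }
        // The query part is mandatory in CTAS
        return sql.append(" AS ").append(query).toString();
    }

    public static void main(String[] args) {
        // Placeholder table and column names, chosen only for illustration
        System.out.println(
                render("t_copy", List.of("f0", "f1"), List.of("f0"), 3, "SELECT f0, f1 FROM src"));
    }
}
```

In a Flink build that contains this change, the resulting string could be passed to TableEnvironment.executeSql(...). Whether the target catalog and connector accept these clauses is not covered by this sketch.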

Brief change log

  • Added support for DISTRIBUTED BY syntax in CTAS
  • Added support for PARTITIONED BY syntax in CTAS

Verifying this change

This change added tests and can be verified as follows:

  • Added unit tests to validation and converter classes
  • Manually verified the change by running a single-node cluster and the SQL Client

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? JavaDocs (a follow-up PR will update the docs)

flinkbot (Collaborator) commented Jun 27, 2024

Bot commands: the @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

spena changed the title from "[FLINK-35717][Table SQL / API] Allow DISTRIBUTED BY in CREATE TABLE AS (CTAS)" to "[FLINK-35717][table] Allow defining partition keys and table distribution in CREATE TABLE AS (CTAS)" on Jul 10, 2024

@@ -856,18 +858,30 @@ public void testMergingCreateTableAsWitDistribution() {
.build())
.distribution(TableDistribution.ofHash(Collections.singletonList("f0"), 3))
.partitionKeys(Arrays.asList("f0", "f1"))
.options(sourceProperties)

Contributor commented:

Fix the typo in the method name (testMergingCreateTableAsWitDistribution: "Wit" should be "With").

Contributor commented:

Remove the unused sourceProperties variable.

Contributor Author replied:

Done

spena requested a review from twalthr on July 10, 2024 15:31

twalthr (Contributor) commented Jul 11, 2024

@flinkbot run azure

twalthr merged commit 4041b24 into apache:master on Jul 11, 2024