initial Athena Clickhouse connector commit related to issue 1754 #1770

Merged: 20 commits, Jun 25, 2024
a59ee61
initial Athena Clickhouse connector commit related to https://github.…
Feb 22, 2024
9dbdb76
Merge branch 'master' into master
bishrtabbaa Feb 29, 2024
084ceee
Merge branch 'master' into master
bishrtabbaa Mar 5, 2024
0c6c207
Merge branch 'awslabs:master' into master
bishrtabbaa Mar 20, 2024
384d833
incorporating Athena team feedback to reuse and extend Athena MySqlMe…
Mar 21, 2024
7efdc55
incorporate Athena service team feedback
bishrtabbaa Apr 9, 2024
cd4a577
Merge pull request #1 from awslabs/master
bishrtabbaa Apr 9, 2024
48d9290
Delete athena-clickhouse/.aws-sam/build.toml per service team feedback
bishrtabbaa Apr 9, 2024
c3def5d
updating jar lib version number
bishrtabbaa Apr 10, 2024
d63b413
Merge branch 'master' into master
bishrtabbaa Apr 26, 2024
5acee22
Merge branch 'master' into master
bishrtabbaa May 3, 2024
37ada59
Merge branch 'awslabs:master' into master
bishrtabbaa Jun 6, 2024
953263f
incorporating Athena service team feedback related to pull/1770
bishrtabbaa Jun 6, 2024
15e2b0c
Merge branch 'master' into master
bishrtabbaa Jun 11, 2024
0190490
prepared README for public GA release and streamlined SAM CLI instruc…
bishrtabbaa Jun 12, 2024
80b0ecd
switched from curl to wget for jar file download instructions
bishrtabbaa Jun 13, 2024
303e084
cleaned up download instructions
bishrtabbaa Jun 13, 2024
9728d46
Merge branch 'master' into master
chngpe Jun 18, 2024
e85080e
incorporated Athena service team feedback from https://github.com/aws…
bishrtabbaa Jun 24, 2024
0979abd
Merge branch 'master' into master
aimethed Jun 25, 2024
incorporated Athena service team feedback from d82edd3
bishrtabbaa committed Jun 24, 2024
commit e85080e281049020614986d8a4dec1f6e4ce199b
53 changes: 16 additions & 37 deletions athena-clickhouse/README.md
@@ -6,51 +6,33 @@ Official Public documentation has moved [here](https://docs.aws.amazon.com/athen

This README walks through the SAM CLI installation method (not Serverless Application Repository via AWS Console).

## 1. Download Athena Clickhouse Connector source and release repositories
Deploy a Connector without Serverless Application Repository [link](https://github.com/awslabs/aws-athena-query-federation/wiki/Deploy-a-Connector-without-Serverless-Application-Repository)

Download latest Athena source
```
git clone https://github.com/awslabs/aws-athena-query-federation
cd athena-clickhouse
```

Download latest Athena Clickhouse JAR binary file. Browse to https://github.com/awslabs/aws-athena-query-federation/releases then choose latest release. Note the version of the file may change over time.
```
wget https://github.com/awslabs/aws-athena-query-federation/releases/download/v2024.19.1/athena-clickhouse-2024.19.1.jar
```

## 2. Copy Athena Clickhouse Connector JAR file to Amazon S3 bucket since it is >= 50 MB local file upload limit

You **MUST** change the S3 bucket and prefix folder where the Connector JAR file will be stored for the subsequent SAM deployment step (4).
SAM CLI will provide interactive experience to perform deployment.

```
aws s3 cp --region us-east-2 athena-clickhouse-2024.19.1.jar s3://my-athena-demo/code/
```

## 3. Validate Athena Clickhouse Connector as Serverless Cloudformation stack
### Example SAM CLI

SAM CLI guided:
```
sam validate --region us-east-2 --template-file athena-clickhouse.yaml
cd athena-clickhouse
sam deploy -g --template-file athena-clickhouse.yaml
```

## 4. Deploy Athena Clickhouse Connector as Serverless Cloudformation stack

You can change the Lambda function configuration at deployment time and also once the stack has been deployed. Parameters that **MUST** change are listed in section below.

Also, note that you **MUST** create and configure VPC endpoints for S3 (and *optionally* Secrets Manager) because the Athena connector's Lambda function will be deployed within a VPC.

Direct Configuration of Credentials in Connector's connection string:
Direct Configuration of Credentials in Connector's connection string without SAM cli guided:
```
sam deploy --guided --region us-east-2 --template-file athena-clickhouse.yaml --stack-name AthenaClickhouseConnectorStack --capabilities CAPABILITY_NAMED_IAM --parameter-overrides LambdaFunctionName=athenaclickhouseconnectorfunction DefaultConnectionString='clickhouse://jdbc:clickhouse:https://myclickhouseserver.xyzware.io:8443/default?user=foo&password=bar&sslmode=none' DisableSpillEncryption=true SecretNamePrefix=AthenaClickhouse SpillBucket=my-athena-demo SpillPrefix=athena-spill SecurityGroupIds=sg-ab9282d4 SubnetIds=subnet-bc1f0ac6,subnet-db9f40b0 LambdaS3CodeUriBucket=my-athena-demo LambdaS3CodeUriKey=code/athena-clickhouse-2024.19.1.jar
cd athena-clickhouse
sam deploy --resolve-s3 --region us-east-1 --template-file athena-clickhouse.yaml --stack-name <stack_name> --capabilities CAPABILITY_NAMED_IAM --parameter-overrides LambdaFunctionName=<function_name> DefaultConnectionString='clickhouse://jdbc:clickhouse:https://myclickhouseserver.xyzware.io:8443/default?user=<user>&password=<password>&sslmode=none' SpillBucket=my-athena-demo SecurityGroupIds=sg-1 SubnetIds=subnet-1,subnet-2
```

Indirect Configuration of Credentials in Connector's connection string using AWS Secrets Manager:
Indirect Configuration of Credentials in Connector's connection string using AWS Secrets Manager without SAM cli guided:
```
sam deploy --guided --region us-east-2 --template-file athena-clickhouse.yaml --stack-name AthenaClickhouseConnectorStack --capabilities CAPABILITY_NAMED_IAM --parameter-overrides LambdaFunctionName=athenaclickhouseconnectorfunction DefaultConnectionString='clickhouse://jdbc:clickhouse:https://myclickhouseserver.xyzware.io:8443/default?${AthenaClickhouse}&sslmode=none' DisableSpillEncryption=true SecretNamePrefix=AthenaClickhouse SpillBucket=my-athena-demo SpillPrefix=athena-spill SecurityGroupIds=sg-ab9282d4 SubnetIds=subnet-bc1f0ac6,subnet-db9f40b0 LambdaS3CodeUriBucket=my-athena-demo LambdaS3CodeUriKey=code/athena-clickhouse-2024.19.1.jar
cd athena-clickhouse
sam deploy --resolve-s3 --region us-east-1 --template-file athena-clickhouse.yaml --stack-name <stack_name> --capabilities CAPABILITY_NAMED_IAM --parameter-overrides LambdaFunctionName=<function_name> DefaultConnectionString='clickhouse://jdbc:clickhouse:https://myclickhouseserver.xyzware.io:8443/default?${AthenaClickhouse}&sslmode=none' SecretNamePrefix=AthenaClickhouse SpillBucket=my-athena-demo SecurityGroupIds=sg-1 SubnetIds=subnet-1,subnet-2
```
### References

**Parameters** listed below. You **MUST** change the `DefaultConnectionString`, `SpillBucket`, `SpillPrefix`, `SecurityGroupIds`, `SubnetIds`, `LambdaS3CodeUriBucket`, and `LambdaS3CodeUriKey`.
**Parameters** listed below. You **MUST** change the `DefaultConnectionString`, `SpillBucket`, `SecurityGroupIds` and `SubnetIds`.

Also, note that there are `DefaultConnectionString` differences depending on whether you directly configure within the URL or indirectly using AWS Secrets Manager.
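
The difference between the two `DefaultConnectionString` forms can be sketched as follows. This is a minimal illustration, assuming bash; `build_connection_string` is a hypothetical helper that is not part of the connector, and the host, port, database, and credentials are the placeholder values already used in this README. The only part that changes is the credentials segment of the query string: literal `user`/`password` pairs for direct configuration, or a `${SecretName}` placeholder for indirect configuration through AWS Secrets Manager.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the connector): assemble a
# DefaultConnectionString from its parts.
# Usage: build_connection_string <host> <port> <database> <credentials>
#   <credentials> is either 'user=<user>&password=<password>' (direct)
#   or a Secrets Manager placeholder such as '${AthenaClickhouse}' (indirect).
build_connection_string() {
  local host="$1" port="$2" database="$3" credentials="$4"
  printf 'clickhouse://jdbc:clickhouse:https://%s:%s/%s?%s&sslmode=none\n' \
    "$host" "$port" "$database" "$credentials"
}

# Direct: credentials are embedded in the URL itself.
direct="$(build_connection_string myclickhouseserver.xyzware.io 8443 default 'user=foo&password=bar')"

# Indirect: single quotes keep the shell from expanding the placeholder,
# so the literal text ${AthenaClickhouse} reaches the connector.
indirect="$(build_connection_string myclickhouseserver.xyzware.io 8443 default '${AthenaClickhouse}')"

echo "$direct"
echo "$indirect"
```

Single-quoting the whole value on the `sam deploy` command line serves the same purpose: it stops the shell from interpreting `&` and `${...}` before SAM sees them.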

@@ -61,15 +43,12 @@ If you decide to indirectly configure credentials using AWS Secrets Manager, mak
* LambdaFunctionName=athenaclickhouseconnectorfunction
* DisableSpillEncryption=true [optional]
* SecretNamePrefix=AthenaClickhouse [optional]
* LOG_LEVEL=info [optional]
* **DefaultConnectionString**=clickhouse://jdbc:clickhouse:https://myclickhouseserver.xyzware.io:8443/default?user=foo&password=bar&sslmode=none [direct]
* **DefaultConnectionString**=clickhouse://jdbc:clickhouse:https://myclickhouseserver.xyzware.io:8443/default?${AthenaClickhouse}&sslmode=none [indirect]
* **SpillBucket**=my-athena-demo
* **SpillPrefix**=athena-spill
* **SecurityGroupIds**=sg-ab9282d4
* **SubnetIds**=subnet-bc1f0ac6,subnet-db9f40b0
* **LambdaS3CodeUriBucket**=my-athena-demo
* **LambdaS3CodeUriKey**=code/athena-clickhouse-2024.19.1.jar
* **SpillPrefix**=athena-spill [default]
* **SecurityGroupIds**=sg-1
* **SubnetIds**=subnet-1,subnet-2

**Links**
* https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/what-is-sam.html
10 changes: 1 addition & 9 deletions athena-clickhouse/athena-clickhouse.yaml
@@ -56,12 +56,6 @@ Parameters:
Description: "(Optional) An IAM policy ARN to use as the PermissionsBoundary for the created Lambda function's execution role"
Default: ''
Type: String
-LambdaS3CodeUriBucket:
-Description: This must be set to a S3 mybucket because the JAR is greater than the local SAM 50MB limit and must be referenced based on prior s3 cp deploy step.
-Type: String
-LambdaS3CodeUriKey:
-Description: This must be set to a S3 folder/file (e.g. code/athena-clickhouse-2022.47.1.jar) because the JAR is greater than the local SAM 50MB limit and must be referenced based on prior s3 cp deploy step.
-Type: String
Conditions:
HasPermissionsBoundary: !Not [ !Equals [ !Ref PermissionsBoundaryARN, "" ] ]
NotHasLambdaRole: !Equals [!Ref LambdaRoleARN, ""]
@@ -77,9 +71,7 @@ Resources:
default: !Ref DefaultConnectionString
FunctionName: !Ref LambdaFunctionName
Handler: "com.amazonaws.athena.connectors.clickhouse.ClickHouseMuxCompositeHandler"
-CodeUri:
-Bucket: !Ref LambdaS3CodeUriBucket
-Key: !Ref LambdaS3CodeUriKey
+CodeUri: "./target/athena-clickhouse-2022.47.1.jar"
Description: "Enables Amazon Athena to communicate with ClickHouse using JDBC"
Runtime: java11
Timeout: !Ref LambdaTimeout
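
Because `CodeUri` now points at a local path instead of an S3 bucket and key, the JAR has to exist at that path, relative to the template, before `sam deploy` packages it. A pre-deploy check along these lines may help catch a missing build. It is only a sketch, assuming bash and awk; the function names `code_uri_from_template` and `jar_is_built` are hypothetical and not part of the connector or the SAM CLI.

```shell
#!/usr/bin/env bash
# Hypothetical pre-deploy check; these function names are illustrative only.

# Print the CodeUri value from a SAM template. Assumes the value is written
# in double quotes on the same line as the key, as in athena-clickhouse.yaml.
code_uri_from_template() {
  awk -F'"' '/CodeUri:/ { print $2; exit }' "$1"
}

# Succeed only if the file referenced by CodeUri exists, resolving the
# path relative to the directory that contains the template.
jar_is_built() {
  local template="$1" dir uri
  dir="$(dirname "$template")"
  uri="$(code_uri_from_template "$template")"
  [ -n "$uri" ] && [ -f "$dir/$uri" ]
}

# Example, run from the repository root after building the module:
# jar_is_built athena-clickhouse/athena-clickhouse.yaml && echo "JAR found"
```

If the check fails, the module has most likely not been built yet, or the version in the JAR file name no longer matches the one hard-coded in the template.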