Commit

Fixed a bug where the scripts would fetch Sales and Marketing datasets from a public AWS S3 bucket instead of the bucket to which files are uploaded as part of the ETL flow.
moanany committed Aug 29, 2018
1 parent e6103db commit 62958bb
Showing 2 changed files with 8 additions and 3 deletions.
2 changes: 1 addition & 1 deletion build.py
@@ -318,7 +318,7 @@ def deploygluescripts(**kwargs):
"ERROR: S3ETLScriptPath must be set in 'cloudformation/glue-resources-params.json'.")
return

- result = re.search('s3://(.*)/(.*)', s3_etl_script_path)
+ result = re.search('s3://(.+?)/(.*)', s3_etl_script_path)
if(result is None):
print("ERROR: S3ETLScriptPath is malformed.")
return
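
The effect of this one-character-class change is easiest to see on a path that has more than one slash after the bucket name. The following is a minimal sketch, not code from `build.py`: the helper `split_s3_path` and the path `s3://example-bucket/etl/scripts` are hypothetical and exist only for illustration. It suggests that the greedy `(.*)` backtracks to the last `/`, so part of the key prefix ends up in the bucket group, while the lazy `(.+?)` stops at the first `/`.

```python
import re

# The two patterns from the diff: the old one is greedy, the new one is lazy.
OLD_PATTERN = r"s3://(.*)/(.*)"
NEW_PATTERN = r"s3://(.+?)/(.*)"


def split_s3_path(pattern, path):
    """Hypothetical helper: return (bucket, key) or None if the path does not match."""
    match = re.search(pattern, path)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Hypothetical path with a nested key prefix.
path = "s3://example-bucket/etl/scripts"

# Greedy: the first group swallows everything up to the LAST slash.
old_result = split_s3_path(OLD_PATTERN, path)

# Lazy: the first group stops at the FIRST slash, which is the bucket boundary.
new_result = split_s3_path(NEW_PATTERN, path)

# The lazy pattern also requires at least one character in the bucket name.
empty_bucket_old = split_s3_path(OLD_PATTERN, "s3:///key")
empty_bucket_new = split_s3_path(NEW_PATTERN, "s3:///key")

print(old_result)
print(new_result)
```

With a single-level path such as `s3://example-bucket/scripts` both patterns give the same split, which may be why the original pattern appeared to work before.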
9 changes: 7 additions & 2 deletions cloudformation/glue-resources.yaml
@@ -48,6 +48,11 @@ Parameters:
MinLength: "10"
Description: "Name of the S3 output path to which this CloudFormation template's AWS Glue jobs are going to write ETL output."

+ SourceDataBucketName:
+ Type: String
+ MinLength: "1"
+ Description: "Name of the S3 bucket in which the source Marketing and Sales data will be uploaded. Bucket is created by this CFT."

Resources:

### AWS GLUE RESOURCES ###
@@ -165,7 +170,7 @@ Resources:
}
SerializationLibrary: "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"
Compressed: False
- Location: "s3://quicksightsampledata/SalesPipeline_QuickSightSample.csv"
+ Location: !Sub "s3://${SourceDataBucketName}/sales/"
Retention: 0
Name: !Ref SalesPipelineTableName
DatabaseName: !Ref MarketingAndSalesDatabaseName
@@ -255,7 +260,7 @@ Resources:
}
SerializationLibrary: "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"
Compressed: False
- Location: "s3://quicksightsampledata/MarketingData_QuickSightSample.csv"
+ Location: !Sub "s3://${SourceDataBucketName}/marketing/"
Retention: 0
Name: !Ref MarketingTableName
DatabaseName: !Ref MarketingAndSalesDatabaseName
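
To make the two `Location` changes concrete, here is a rough Python sketch of how a `!Sub` string with a single `${SourceDataBucketName}` placeholder would resolve at deploy time. This is an approximation under stated assumptions: `resolve_sub` is a hypothetical helper, `example-source-bucket` is a made-up parameter value, and `string.Template` only mimics the simple `${Name}` case, not the full behaviour of CloudFormation's `Fn::Sub` (for example `${!Literal}` escapes or `${Resource.Attribute}` references).

```python
from string import Template


def resolve_sub(template, params):
    """Hypothetical helper: substitute ${Name} placeholders with parameter values."""
    return Template(template).substitute(params)


# Made-up parameter value; the real one is supplied when the stack is created.
params = {"SourceDataBucketName": "example-source-bucket"}

sales_location = resolve_sub("s3://${SourceDataBucketName}/sales/", params)
marketing_location = resolve_sub("s3://${SourceDataBucketName}/marketing/", params)

print(sales_location)
print(marketing_location)
```

Note that the new locations are prefixes ending in `/` rather than paths to a single CSV file, so each table reads whatever objects are uploaded under its prefix in the source bucket.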