Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
GlueJobRatesToParquet:
Type: AWS::Glue::Job
Properties:
GlueVersion: 1.0
Command:
Name: glueetl
PythonVersion: 3
ScriptLocation: !Sub "s3://${ScriptBucketName}/glue_scripts/rates_xml_to_parquet.py"
DefaultArguments: {
"--s3_output_path": !Sub "s3://${DataBucketName}/electricity_rates_parquet",
"--source_glue_database": !Ref GlueDatabase,
"--source_glue_table": "electricity_rates_xml",
"--job-bookmark-option": "job-bookmark-enable",
"--enable-spark-ui": "true",
"--spark-event-logs-path": !Sub "s3://${LogBucketName}/glue-etl-jobs/"
}
Description: "Convert electrical rates XML data to Parquet"
ExecutionProperty:
MaxConcurrentRuns: 2
MaxRetries: 0
Name: rates-xml-to-parquet
Role: !GetAtt "CrawlerRole.Arn"
DependsOn:
- CrawlerRole
- GlueDatabase
- DataBucket
- ScriptBucket
- LogBucket
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.