# Created December 30, 2020 14:59
resources:
  Resources:
    DataLakeBucket:
      Type: AWS::S3::Bucket
      Properties:
        BucketName: ${self:custom.dataLakeBucketName}
    GlueDataLake:
      Type: AWS::Glue::Database
      Properties:
        CatalogId: ${self:provider.environment.AWS_ACCOUNT_ID}
        DatabaseInput:
          Name: ${self:custom.dataLakeIdentifier}
    GlueDataLakeInteractionsTable:
      DependsOn: GlueDataLake
      Type: AWS::Glue::Table
      Properties:
        CatalogId: ${self:provider.environment.AWS_ACCOUNT_ID}
        DatabaseName: ${self:custom.dataLakeIdentifier}
        TableInput:
          Name: interactions
          TableType: EXTERNAL_TABLE
          Parameters:
            classification: parquet
            projection.enabled: true
            projection.dt.format: yyyy-MM-dd-HH
            projection.dt.interval: 1
            projection.dt.interval.unit: HOURS
            projection.dt.range: 2020-12-01-00,NOW
            projection.dt.type: date
            storage.location.template:
              Fn::Join:
                - ''
                - - 's3://'
                  - ${self:custom.dataLakeBucketName}
                  - '/'
                  - 'interactions'
                  - '/dt=${dt}'
          PartitionKeys:
            - Name: dt
              Type: string
          StorageDescriptor:
            OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
            InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
            SerdeInfo:
              SerializationLibrary: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
            Location:
              Fn::Join:
                - ''
                - - 's3://'
                  - ${self:custom.dataLakeBucketName}
                  - '/'
                  - 'interactions'
                  - '/'
            Columns:
              - Name: id
                Type: string
              - Name: created_at
                Type: timestamp
              - Name: created_by
                Type: string
              - Name: entity
                Type: string
              - Name: type
                Type: string
    InteractionsDataDeliveryStream:
      DependsOn:
        - DataLakeKinesisFirehoseS3Role
        - DataLakeBucket
      Type: AWS::KinesisFirehose::DeliveryStream
      Properties:
        DeliveryStreamType: DirectPut
        DeliveryStreamName: ${self:custom.dataDeliveryStreamNameInteractions}
        ExtendedS3DestinationConfiguration:
          RoleARN:
            Fn::GetAtt:
              - DataLakeKinesisFirehoseS3Role
              - Arn
          BucketARN:
            Fn::GetAtt:
              - DataLakeBucket
              - Arn
          Prefix:
            Fn::Join:
              - ''
              - - Ref: GlueDataLakeInteractionsTable
                - '/dt=!{timestamp:yyyy}-!{timestamp:MM}-!{timestamp:dd}-!{timestamp:HH}/'
          ErrorOutputPrefix:
            Fn::Join:
              - ''
              - - Ref: GlueDataLakeInteractionsTable
                - '/error/!{firehose:error-output-type}/dt=!{timestamp:yyyy}-!{timestamp:MM}-!{timestamp:dd}-!{timestamp:HH}/'
          BufferingHints:
            SizeInMBs: 128
            IntervalInSeconds: 900
          CloudWatchLoggingOptions:
            Enabled: false
          S3BackupMode: Disabled
          DataFormatConversionConfiguration:
            Enabled: True
            SchemaConfiguration:
              CatalogId: ${self:provider.environment.AWS_ACCOUNT_ID}
              RoleARN:
                Fn::GetAtt:
                  - DataLakeKinesisFirehoseS3Role
                  - Arn
              DatabaseName:
                Ref: GlueDataLake
              TableName:
                Ref: GlueDataLakeInteractionsTable
              Region: ${self:provider.region}
              VersionId: LATEST
            InputFormatConfiguration:
              Deserializer:
                OpenXJsonSerDe: {}
            OutputFormatConfiguration:
              Serializer:
                ParquetSerDe: {}
    DataLakeKinesisFirehoseS3Role:
      Type: AWS::IAM::Role
      DependsOn: DataLakeBucket
      Properties:
        RoleName:
          Fn::Join:
            - '-'
            - - ${self:custom.dataLakeIdentifier}
              - 'kinesis-firehose-s3-role'
        AssumeRolePolicyDocument:
          Version: '2012-10-17'
          Statement:
            - Sid: ''
              Effect: Allow
              Principal:
                Service: firehose.amazonaws.com
              Action: 'sts:AssumeRole'
              Condition:
                StringEquals:
                  'sts:ExternalId': ${self:provider.environment.AWS_ACCOUNT_ID}
        Path: '/'
        Policies:
          - PolicyName:
              Fn::Join:
                - '-'
                - - ${self:custom.dataLakeIdentifier}
                  - 'kinesis-firehose-s3-policy'
            PolicyDocument:
              Version: '2012-10-17'
              Statement:
                - Effect: Allow
                  Action:
                    - 's3:AbortMultipartUpload'
                    - 's3:GetBucketLocation'
                    - 's3:GetObject'
                    - 's3:ListBucket'
                    - 's3:ListBucketMultipartUploads'
                    - 's3:PutObject'
                  Resource:
                    - Fn::Join:
                        - ''
                        - - 'arn:aws:s3:::'
                          - Ref: DataLakeBucket
                    - Fn::Join:
                        - ''
                        - - 'arn:aws:s3:::'
                          - Ref: DataLakeBucket
                          - '/*'
                - Effect: Allow
                  Action: 'glue:GetTableVersions'
                  Resource: '*'
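
# The resources above read several Serverless variables that this snippet does
# not define: custom.dataLakeBucketName, custom.dataLakeIdentifier,
# custom.dataDeliveryStreamNameInteractions, provider.region and
# provider.environment.AWS_ACCOUNT_ID. The commented block below is only a
# sketch of how they might be declared elsewhere in serverless.yml. Every value
# is a hypothetical placeholder, not taken from the original file; it is left
# commented out so it cannot collide with existing provider/custom sections.
#
# provider:
#   name: aws
#   region: eu-central-1                  # assumed region
#   environment:
#     AWS_ACCOUNT_ID: '123456789012'      # placeholder account ID, quoted to keep it a string
#
# custom:
#   dataLakeIdentifier: datalake          # placeholder; used as the Glue database name and IAM name prefix
#   dataLakeBucketName: example-datalake-bucket             # placeholder; S3 bucket names must be globally unique
#   dataDeliveryStreamNameInteractions: interactions-stream # placeholder Firehose delivery stream name
#
# With values like these, Firehose would write objects under
# s3://example-datalake-bucket/interactions/dt=<yyyy-MM-dd-HH>/, which is the
# layout that storage.location.template and projection.dt.format expect.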
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment