@gcchaan
Created September 2, 2019 01:36
CloudFormation template for Athena
AWSTemplateFormatVersion: "2010-09-09"
Description: Athena Stack
Resources:
  GlueDatabase:
    Type: AWS::Glue::Database
    Properties:
      CatalogId: !Ref 'AWS::AccountId'
      DatabaseInput:
        Description: for athena
        Name: gcchaan_database
  GlueTable:
    Type: AWS::Glue::Table
    Properties:
      CatalogId: !Ref 'AWS::AccountId'
      DatabaseName: !Ref GlueDatabase
      TableInput:
        Description: User Log
        Name: user_table
        Owner: gcchaan
        PartitionKeys:
          - Name: year
            Type: int
          - Name: month
            Type: int
          - Name: day
            Type: int
        StorageDescriptor:
          Columns:
            - Name: user
              Type: string
            - Name: message
              Type: string
            - Name: timestamp
              Type: string
          Compressed: False
          Location: !Join
            - ''
            - - 's3://'
              - !Ref s3bucket
              - /
              - user_table
              - data/
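          # Firehose writes Parquet files, so pair the Parquet SerDe with the Parquet input/output formats
          InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
          OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
          SerdeInfo:
            SerializationLibrary: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
        TableType: EXTERNAL_TABLE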
  DeliveryStream:
    DependsOn:
      - deliveryPolicy
    Type: AWS::KinesisFirehose::DeliveryStream
    Properties:
      ExtendedS3DestinationConfiguration:
        BucketARN: !Join
          - ''
          - - 'arn:aws:s3:::'
            - !Ref s3bucket
        BufferingHints:
          IntervalInSeconds: '60'
          SizeInMBs: '64'
        CompressionFormat: UNCOMPRESSED
        # yyyy is the calendar year; in Java date patterns YYYY is the week-based year,
        # which differs from the calendar year around New Year
        Prefix: !Join
          - ''
          - - !Ref GlueTable
            - 'data/year=!{timestamp:yyyy}/month=!{timestamp:MM}/day=!{timestamp:dd}/hour=!{timestamp:HH}/'
        ErrorOutputPrefix: !Join
          - ''
          - - !Ref GlueTable
            - 'error/!{firehose:error-output-type}/year=!{timestamp:yyyy}/month=!{timestamp:MM}/day=!{timestamp:dd}/hour=!{timestamp:HH}/'
        S3BackupMode: Disabled
        DataFormatConversionConfiguration:
          SchemaConfiguration:
            CatalogId: !Ref AWS::AccountId
            RoleARN: !GetAtt deliveryRole.Arn
            DatabaseName: !Ref GlueDatabase
            TableName: !Ref GlueTable
            Region: !Ref AWS::Region
            VersionId: LATEST
          InputFormatConfiguration:
            Deserializer:
              OpenXJsonSerDe: {}
          OutputFormatConfiguration:
            Serializer:
              ParquetSerDe: {}
          Enabled: True
        RoleARN: !GetAtt deliveryRole.Arn
  s3bucket:
    Type: AWS::S3::Bucket
    Properties:
      VersioningConfiguration:
        Status: Enabled
  deliveryRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service: firehose.amazonaws.com
            Action: 'sts:AssumeRole'
            Condition:
              StringEquals:
                'sts:ExternalId': !Ref 'AWS::AccountId'
  deliveryPolicy:
    Type: AWS::IAM::Policy
    Properties:
      PolicyName: firehose_delivery_policy
      PolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action:
              - 's3:AbortMultipartUpload'
              - 's3:GetBucketLocation'
              - 's3:GetObject'
              - 's3:ListBucket'
              - 's3:ListBucketMultipartUploads'
              - 's3:PutObject'
            Resource:
              - !Join
                - ''
                - - 'arn:aws:s3:::'
                  - !Ref s3bucket
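              # Objects in the bucket; '/*' keeps the ARN from matching other buckets that share the name prefix
              - !Join
                - ''
                - - 'arn:aws:s3:::'
                  - !Ref s3bucket
                  - '/*'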
          - Effect: Allow
            Action:
              - 'glue:*'
            Resource:
              - !Join
                - ''
                - - 'arn:aws:glue:*:*:table'
                  - '/'
                  - !Ref GlueDatabase
                  - '/'
                  - !Ref GlueTable
              - !Join
                - ''
                - - 'arn:aws:glue:*:*:database'
                  - '/'
                  - !Ref GlueDatabase
              - 'arn:aws:glue:*:*:catalog'
      Roles:
        - !Ref deliveryRole
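
The table declares `year`, `month` and `day` partition keys, but nothing in the template registers partitions in the Glue catalog, so Athena will likely return no rows until each day's partition is added. Below is a minimal sketch of a helper that prints the `ALTER TABLE ... ADD PARTITION` statement for one day. The function name and the bucket argument are hypothetical; the `user_tabledata/` path is what the template's `!Join` expressions produce from `user_table` and `data/`. The statement would then be run in the Athena console or through `aws athena start-query-execution`.

```shell
# Hypothetical helper, not part of the gist: prints the Athena DDL that
# registers one day's partition for gcchaan_database.user_table.
# Usage: add_partition_sql <bucket-name> <YYYY-MM-DD>
add_partition_sql() (
  bucket=$1
  day=$2
  y=${day%%-*}        # 2019
  rest=${day#*-}      # 09-02
  m=${rest%%-*}       # 09
  d=${rest#*-}        # 02
  # Partition values are ints, so drop one leading zero; the S3 path keeps it.
  printf "ALTER TABLE gcchaan_database.user_table ADD IF NOT EXISTS PARTITION (year=%s, month=%s, day=%s) LOCATION 's3://%s/user_tabledata/year=%s/month=%s/day=%s/'\n" \
    "$y" "${m#0}" "${d#0}" "$bucket" "$y" "$m" "$d"
)
```

For example, `add_partition_sql my-bucket 2019-09-02` prints a statement whose location is `s3://my-bucket/user_tabledata/year=2019/month=09/day=02/`. The `hour=` subfolders written by Firehose sit below that location.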

gcchaan commented Sep 2, 2019

aws firehose put-record --delivery-stream-name kinesis-firehose-DeliveryStream-0123456789abc --record Data='"{\"user\": \"Bob\", \"message\": \"hello\", \"timestamp\": \"2099-01-01 00:00:00\"}"'
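
The quoting in the command above is easy to get wrong by hand. Here is a rough sketch that builds the same `--record` argument from three values. The function names are hypothetical, and the values are inserted verbatim, so it assumes they contain no double quotes or backslashes. `send_record` is only defined, not called, since it needs AWS credentials and a deployed stack. On AWS CLI v2 you would probably also need to pass `--cli-binary-format raw-in-base64-out` so that `Data` is treated as raw text.

```shell
# Hypothetical helpers, not part of the gist.

# Print the JSON document the table expects: user, message, timestamp.
make_record() {
  printf '{"user": "%s", "message": "%s", "timestamp": "%s"}\n' "$1" "$2" "$3"
}

# Print the shorthand --record argument in the same form as the command
# above: the JSON wrapped in double quotes, with inner quotes escaped.
record_arg() (
  json=$(make_record "$1" "$2" "$3")
  printf 'Data="%s"' "$(printf '%s' "$json" | sed 's/"/\\"/g')"
)

# Send one record to the delivery stream.
# Usage: send_record <stream-name> <user> <message> <timestamp>
send_record() {
  aws firehose put-record \
    --delivery-stream-name "$1" \
    --record "$(record_arg "$2" "$3" "$4")"
}
```

Calling `record_arg Bob hello '2099-01-01 00:00:00'` yields the same `Data=...` string that the original command passes.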
