@jdorfman
Created August 28, 2019 03:04
CREATE EXTERNAL TABLE IF NOT EXISTS segment_logs.eventlogs (
  anonymousid string,              -- pick the columns you care about
  context map<string,string>,      -- a map holds the nested JSON object
  messageid string,
  `timestamp` timestamp,           -- backticks: timestamp is a reserved word in DDL
  type string,
  userid string,
  traits map<string,string>,
  event string
)
PARTITIONED BY (sourceid string)   -- partition on the columns you expect to filter by most often; each sourceid identifies one source of data
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION 's3://your-s3-bucket/segment-logs'  -- location of your data in S3
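
A partitioned external table returns no rows until its partitions are registered. The sketch below shows one way to do that and then run a query that filters on the partition column. The source ID 'your-source-id' and the per-source folder under the bucket are placeholders, and the layout (one folder per source ID under segment-logs) is an assumption, so adjust the path to match how your logs are actually stored.

```sql
-- Register one partition. 'your-source-id' and the S3 path are placeholders;
-- the one-folder-per-source layout is an assumption.
ALTER TABLE segment_logs.eventlogs ADD IF NOT EXISTS
  PARTITION (sourceid = 'your-source-id')
  LOCATION 's3://your-s3-bucket/segment-logs/your-source-id/';

-- Filtering on sourceid lets the engine read only that partition's files.
SELECT event, COUNT(*) AS events
FROM segment_logs.eventlogs
WHERE sourceid = 'your-source-id'
  AND type = 'track'
GROUP BY event
ORDER BY events DESC
LIMIT 10;
```

Repeat the ALTER TABLE statement for each source you want to query.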