Druid setup
## git clone https://github.com/Banno/druid-docker
## git clone https://github.com/zcox/druid-pageviews
cd druid-docker
fig kill && fig rm --force
./build.sh
fig up -d druid
#### docker
## `boot2docker shellinit`
## docker ps
##
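## On a Mac with boot2docker, you typically load the Docker environment into the shell and then
## confirm the Druid containers are up. A minimal sketch (the "druid" container-name filter is an
## assumption; check your own `docker ps` output):
eval "$(boot2docker shellinit)"
docker ps | grep druid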
## Hadoop Batch index
cd druid-pageviews
## Learn about the files
cat ingest.sh
cat task1.json   ## change "baseDir" to your local druid-pageviews path, e.g. "/Users/nsabharwal/druid-pageviews",
### and check "filter": "e.json"; the e.json file will be copied into HDFS below
cat hdfs-commands
# Log into the Hadoop container and copy e.json into /data, as described in hdfs-commands
docker ps | grep hadoop
docker exec -it <containerid> bash   ## use the container ID from the previous command
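## If you'd rather not copy the container ID by hand, and assuming the Hadoop container's name
## contains "hadoop" (an assumption; verify with `docker ps`), this one-liner does the same thing:
docker exec -it $(docker ps -q --filter "name=hadoop") bash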
/usr/local/hadoop/bin/hdfs dfs -mkdir -p /data
cat > e.json
{"eventId":"e1", "timestamp":"2016-05-25T14:00:00Z", "userId":"u1", "url":"http://site.com/1"}
{"eventId":"e2", "timestamp":"2016-05-25T14:00:01Z", "userId":"u2", "url":"http://site.com/1"}
{"eventId":"e3", "timestamp":"2016-05-25T14:00:02Z", "userId":"u3", "url":"http://site.com/1"}
{"eventId":"e4", "timestamp":"2016-05-25T14:00:03Z", "userId":"u1", "url":"http://site.com/2"}
{"eventId":"e5", "timestamp":"2016-05-25T14:00:04Z", "userId":"u2", "url":"http://site.com/2"}
{"eventId":"e6", "timestamp":"2016-05-25T14:00:05Z", "userId":"u1", "url":"http://site.com/3"}
{"eventId":"e7", "timestamp":"2016-05-25T15:00:00Z", "userId":"u1", "url":"http://site.com/1"}
{"eventId":"e8", "timestamp":"2016-05-25T15:00:01Z", "userId":"u4", "url":"http://site.com/1"}
{"eventId":"e9", "timestamp":"2016-05-25T15:00:02Z", "userId":"u3", "url":"http://site.com/2"}
{"eventId":"e10", "timestamp":"2016-05-25T15:00:03Z", "userId":"u4", "url":"http://site.com/2"}
{"eventId":"e11", "timestamp":"2016-05-25T16:00:00Z", "userId":rl":"http://site.com/1"}
{"eventId":"e12", "timestamp":"2016-05-25T16:00:01Z", "userId":"u4", "url":"http://site.com/1"}
Ctrl+D   ## end the input to cat
/usr/local/hadoop/bin/hdfs dfs -put e.json /data
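## Optional sanity check that the file landed in HDFS:
/usr/local/hadoop/bin/hdfs dfs -ls /data
/usr/local/hadoop/bin/hdfs dfs -cat /data/e.json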
## exit the container (or use a different terminal), then run the ingest task from the druid-pageviews directory
./ingest.sh task1.json
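## Once the task finishes, a quick timeseries query against the broker confirms the data is queryable.
## This is a sketch: the broker address (boot2docker VM on port 8082) and the dataSource name
## ("pageviews") are assumptions; adjust them to match your setup and task spec.
BROKER="$(boot2docker ip):8082"
curl -s -X POST "http://$BROKER/druid/v2/?pretty" \
  -H 'Content-Type: application/json' \
  -d '{
        "queryType": "timeseries",
        "dataSource": "pageviews",
        "granularity": "hour",
        "intervals": ["2016-05-25/2016-05-26"],
        "aggregations": [{"type": "count", "name": "rows"}]
      }'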