Keybase proof

I hereby claim:

  • I am fixone on github.
  • I am fixone (https://keybase.io/fixone) on keybase.
  • I have a public key ASAoIULIrRQLc8X5ylgzOftj-0D1mv3B3E_EPYttQhn57Qo

To claim this, I am signing this object:

# Install a Java 8 runtime (Confluent Platform runs on the JVM).
sudo apt-get update
sudo apt-get install -y openjdk-8-jre
CONFLUENT_VER="5.2.1"
SCALA_VER="2.12"
# Download the release tarball, e.g. http://packages.confluent.io/archive/5.2/confluent-5.2.1-2.12.tar.gz
wget "http://packages.confluent.io/archive/5.2/confluent-${CONFLUENT_VER}-${SCALA_VER}.tar.gz" -O /tmp/confluent.tar.gz
cd /tmp
tar -xzf ./confluent.tar.gz
# The archive unpacks to confluent-<version>; move it to /opt/confluent.
sudo mv "./confluent-${CONFLUENT_VER}" /opt/confluent
cd /opt/confluent
CREATE TABLE ratings_all_hive (userid int, age int, gender string, occupation int, zip string, rating double, rating_time timestamp, movieid int, title string, year int, genres string)
COMMENT 'data loaded with serde org.apache.hadoop.hive.serde2.OpenCSVSerde'
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ( "separatorChar" = "\,", "quoteChar" = "\"")
STORED AS TEXTFILE tblproperties("skip.header.line.count"="1");
LOAD DATA INPATH '/tmp/ratings-all.csv' OVERWRITE into table ratings_all_hive;
create table ratings_all_impala as select cast (userid as int) As userid, cast (age as int) As age, gender, cast (occupation as int) As occupation, zip, cast (rating as double) As rating, rating_time, cast (movieid as int) As movieid, title, genres, cast (year as int) As year from ratings_all_hive;
2018-04-14T18:41:29.894Z lightning_gossipd(5114): Received node_announcement for node 03d3f875b88f083dd4d1d5af4bcd5483d7bf0303ec1f585879e2a1012961ec9a9d
2018-04-14T19:57:09.737Z lightning_gossipd(5114): Connected out for 03d3f875b88f083dd4d1d5af4bcd5483d7bf0303ec1f585879e2a1012961ec9a9d
2018-04-14T19:57:09.955Z lightning_gossipd(5114): Handing back peer 03d3f875b88f083dd4d1d5af4bcd5483d7bf0303ec1f585879e2a1012961ec9a9d to master
2018-04-14T19:57:09.955Z lightning_gossipd(5114): hand_back_peer 03d3f875b88f083dd4d1d5af4bcd5483d7bf0303ec1f585879e2a1012961ec9a9d: now local again
2018-04-14T19:57:30.211Z lightningd(5106): lightning_openingd-03d3f875b88f083dd4d1d5af4bcd5483d7bf0303ec1f585879e2a1012961ec9a9d chan #13: pid 24047, msgfd 22
2018-04-14T19:57:30.239Z lightningd(5106): lightning_openingd-03d3f875b88f083dd4d1d5af4bcd5483d7bf0303ec1f585879e2a1012961ec9a9d chan #13: First per_commit_point = 02eb8b81f9b986f59787a6baa6cbb5c06cf433098c830e1e3f785012f896632d5d
2018-04-14T19:57:30.239Z lightningd(5106): lightning
fixone / codiax
Last active November 16, 2017 13:08
Demo - what can you do with Hive and Impala
Apache Logs in Hive
Take a set of Apache HTTP server logs in combined log format (that is, including the Referer and User-Agent fields). Each line of the log should look similar to this one:
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"
Place them in an HDFS directory, for example /data/logs/.
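The upload step can be sketched in shell roughly as follows. This is a minimal sketch, not part of the original demo: the function names `is_combined_log_line` and `upload_logs`, the `COMBINED_RE` pattern, and the local source directory are all hypothetical, and it assumes the `hdfs` client is installed and configured on the machine where it runs. The pattern is only a loose sanity check for the combined log format, not a full parser.

```shell
# Loose ERE for one combined-log-format line (assumption: fields are
# host, identity, user, [timestamp], "request", status, bytes, "referer", "user agent").
COMBINED_RE='^[^ ]+ [^ ]+ [^ ]+ \[[^]]+\] "[^"]*" [0-9]{3} ([0-9]+|-) "[^"]*" "[^"]*"$'

# Returns 0 if the given string looks like a combined-log-format line.
is_combined_log_line() {
  printf '%s\n' "$1" | grep -Eq "$COMBINED_RE"
}

# Copies every file from a local directory into an HDFS directory.
# $1: local directory with the log files (hypothetical, e.g. ./logs)
# $2: HDFS target directory, defaults to /data/logs/
upload_logs() {
  src_dir="$1"
  dest_dir="${2:-/data/logs/}"
  hdfs dfs -mkdir -p "$dest_dir"
  hdfs dfs -put "$src_dir"/* "$dest_dir"
}
```

For example, `upload_logs ./logs` would create /data/logs/ if needed and copy the files into it. Checking the first line of each file with `is_combined_log_line` beforehand can catch logs written in the shorter common log format, which lacks the Referer and User-Agent fields.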