Skip to content

Instantly share code, notes, and snippets.

@thanoojgithub
Last active February 15, 2016 09:21
Show Gist options
  • Save thanoojgithub/972cbc9aea6307453af1 to your computer and use it in GitHub Desktop.
Save thanoojgithub/972cbc9aea6307453af1 to your computer and use it in GitHub Desktop.
JSON file into Hive table using SerDe
hive> LIST jars;
hive-hcatalog-core-1.2.1.jar
hive> DELETE JAR hive-hcatalog-core-1.2.1.jar;
Deleted [hive-hcatalog-core-1.2.1.jar] from class path
hive> ADD JAR /home/ubuntu/hive-1.2.1/hcatalog/share/hcatalog/hive-hcatalog-core-1.2.1.jar;
Added [/home/ubuntu/hive-1.2.1/hcatalog/share/hcatalog/hive-hcatalog-core-1.2.1.jar] to class path
Added resources: [/home/ubuntu/hive-1.2.1/hcatalog/share/hcatalog/hive-hcatalog-core-1.2.1.jar]
hive> drop table thanooj.EmpJson;
OK
Time taken: 0.02 seconds
hive> CREATE TABLE thanooj.EmpJson ( name STRING, salary FLOAT, mobile BIGINT, location STRING) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';
OK
Time taken: 1.776 seconds
hive> LOAD DATA LOCAL INPATH '/home/ubuntu/input/EmpJson.txt' OVERWRITE INTO TABLE THANOOJ.EmpJson;
Loading data to table thanooj.EmpJson
Table thanooj.EmpJson stats: [numFiles=1, numRows=0, totalSize=233, rawDataSize=0]
OK
Time taken: 0.212 seconds
hive> select * from thanooj.EmpJson;
OK
sriram 5000.0 1234567890 ayodhya
seetha 4500.0 9876543210 midhila
lakshman 2000.0 5432167890 ayodhya
Time taken: 0.07 seconds, Fetched: 3 row(s)
json data:EmpJson.txt
---------------------
{"name": "sriram","salary": 5000,"mobile": 1234567890,"location": "ayodhya"}
{"name": "seetha","salary": 4500,"mobile": 9876543210,"location": "midhila"}
{"name": "lakshman","salary": 2000,"mobile": 5432167890,"location": "ayodhya"}
## complex type JSON::
---------------------
hive> CREATE TABLE thanooj.employeesCol ( name STRUCT<first:STRING,last:STRING>, salary FLOAT, mobile ARRAY<BIGINT>, address STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';
hive> LOAD DATA LOCAL INPATH '/home/ubuntu/input/empcol.txt' OVERWRITE INTO TABLE THANOOJ.employeesCol;
hive> set hive.cli.print.header=true;
hive> select name.first, name.last, salary, mobile[0], mobile[1], address.street, address.city, address.state, address.zip from thanooj.employeescol;
OK
first last salary _c3 _c4 street city state zip
John Doe 100000.0 1234567890 9876543210 1 Michigan Ave. Chicago IL 60600
Time taken: 0.061 seconds, Fetched: 1 row(s)
hive> select name.first, name.last, salary, mobile[0] as primaryMobile, mobile[1] as secondaryMobile, address.street, address.city, address.state, address.zip from thanooj.employeescol;
OK
first last salary primarymobile secondarymobile street city state zip
John Doe 100000.0 1234567890 9876543210 1 Michigan Ave. Chicago IL 60600
Time taken: 0.044 seconds, Fetched: 1 row(s)
hive>
JSON data : empcol.txt
----------------------
{"name": {"first":"John","last":"Doe"},"salary": 100000, "mobile": [1234567890, 9876543210], "address": {"street": "1 Michigan Ave.", "city": "Chicago", "state": "IL", "zip": 60600}}
@thanoojgithub
Copy link
Author

simple JSON data into Hive table using SerDe

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment