@vinodkc
Last active November 15, 2020 13:57

Storm-Hive-Integration - on HDP 3.1.0.0-78

Download the storm-hive-examples jar (version 1.2.1.3.1.0.0-78) from https://mvnrepository.com/artifact/org.apache.storm/storm-hive-examples/1.2.1.3.1.0.0-78 or build it from https://github.com/hortonworks/storm/blob/HDP-3.1.0.0-78-tag/examples/storm-hive-examples

Note: There is no need to set up Kafka, as this demo topology simulates the input data from a local spout.

We will save records with the following fields into a Hive table:

{"id","name","phone","street","city","state"} 

A) Storm Hive managed partitioned transactional table integration Demo.

beeline>

create database if not exists stormdb;
use stormdb;
CREATE TABLE `storm_person_partTransTable` (
  `id` int,
  `name` string,
  `phone` string,
  `street` string)
PARTITIONED BY (
  `state` string,
  `city` string)
STORED AS ORC TBLPROPERTIES ('transactional' = 'true');

1.1) Give the 'storm' user access to the table/database. Example using HDFS ACLs (alternatively, use Ranger):

hdfs dfs -setfacl -m default:user:storm:rwx /warehouse/tablespace/managed/hive/stormdb.db
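
A default ACL entry is inherited only by files and directories created after it is set, so it can be worth checking what the database directory actually carries before submitting the topology. A minimal sketch, assuming the hdfs client is on the PATH; the DB_DIR variable name is only illustrative, and the path is copied from the setfacl command above:

```shell
# Warehouse path copied from the setfacl command above.
DB_DIR=/warehouse/tablespace/managed/hive/stormdb.db

if command -v hdfs >/dev/null 2>&1; then
  # Print owner, group, and all ACL entries (including default: entries) of the database directory.
  hdfs dfs -getfacl "$DB_DIR"
else
  echo "hdfs client not found on this host"
fi
```

Look for a default:user:storm:rwx line in the output. If the topology later fails with permission errors on the table directory, that directory may have been created before the default ACL existed and may need its own entry.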

1.2) Submit storm topology

/usr/hdp/current/storm-client/bin/storm jar ./storm-hive-examples-1.2.1.3.1.0.0-78.jar  org.apache.storm.hive.bolt.HiveTopologyPartitioned  'thrift://c4114-node3.coelab.cloudera.com:9083'  stormdb storm_person_partTransTable storm_person_partTransTable_topology
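
After submitting, you may want to confirm that the topology is running and stop it once the demo is done, since the spout keeps generating records. A sketch using the storm client's list and kill commands; the STORM_BIN and TOPOLOGY variable names are illustrative, and the topology name is the last argument of the submit command above:

```shell
STORM_BIN=/usr/hdp/current/storm-client/bin/storm
TOPOLOGY=storm_person_partTransTable_topology

if [ -x "$STORM_BIN" ]; then
  # List running topologies and keep only the demo's line.
  "$STORM_BIN" list | grep "$TOPOLOGY" || echo "$TOPOLOGY is not running"
  # When finished, uncomment to stop the demo topology:
  # "$STORM_BIN" kill "$TOPOLOGY"
else
  echo "storm client not found at $STORM_BIN"
fi
```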

1.3) Verify the output from Hive

beeline> select count(*) from stormdb.storm_person_partTransTable;
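
To see which partitions the topology has written, the same check can be run non-interactively. This is only a sketch: the JDBC URL below is a hypothetical placeholder (the original does not give the HiveServer2 address), and your cluster may also need a user name or Kerberos principal in the connection.

```shell
# Hypothetical HiveServer2 JDBC URL; replace with your cluster's.
JDBC_URL='jdbc:hive2://localhost:10000/stormdb'
TABLE=storm_person_partTransTable

if command -v beeline >/dev/null 2>&1; then
  # Partitions created so far by the streaming writes.
  beeline -u "$JDBC_URL" -e "show partitions $TABLE;"
  # Row count per partition.
  beeline -u "$JDBC_URL" -e "select state, city, count(*) from $TABLE group by state, city;"
else
  echo "beeline not found on this host"
fi
```

Running it a second time after a short wait should show higher counts while the topology is still active.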