Rajkumar Singh (rajkrrsingh)

@rajkrrsingh
rajkrrsingh / Druid_Batch_Mode_Ingestion.md
Last active July 1, 2018 18:23
Quick-start guide to ingesting data into Druid using batch mode on the HDP platform.

Source: http://druid.io/docs/latest/tutorials/tutorial-batch.html
ENV: HDP-2.6.4

pageview.json

{"time": "2015-09-01T00:00:00Z", "url": "/foo/bar", "user": "alice", "latencyMs": 32}
{"time": "2015-09-01T01:00:00Z", "url": "/", "user": "bob", "latencyMs": 11}
{"time": "2015-09-01T01:30:00Z", "url": "/foo/bar", "user": "bob", "latencyMs": 45}

index task json
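The task spec itself is cut off in the preview. Following the linked tutorial, a minimal native index task for this data might look like the sketch below (file locations, granularities, and the Overlord host are assumptions):

cat > pageviews-index.json <<'EOF'
{
  "type": "index",
  "spec": {
    "dataSchema": {
      "dataSource": "pageviews",
      "parser": {
        "type": "string",
        "parseSpec": {
          "format": "json",
          "timestampSpec": {"column": "time", "format": "auto"},
          "dimensionsSpec": {"dimensions": ["url", "user"]}
        }
      },
      "metricsSpec": [
        {"type": "count", "name": "views"},
        {"type": "doubleSum", "name": "latencyMs", "fieldName": "latencyMs"}
      ],
      "granularitySpec": {"type": "uniform", "segmentGranularity": "DAY", "queryGranularity": "NONE", "intervals": ["2015-09-01/2015-09-02"]}
    },
    "ioConfig": {"type": "index", "firehose": {"type": "local", "baseDir": "/tmp", "filter": "pageview.json"}}
  }
}
EOF

# submit the task to the Overlord (default port 8090)
curl -X POST -H 'Content-Type: application/json' -d @pageviews-index.json http://localhost:8090/druid/indexer/v1/task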

@rajkrrsingh
rajkrrsingh / hive_druid_integration.md
Last active July 4, 2019 10:44
Hive-Druid integration: a quick test creating a Druid table from a Hive table
Generate data for the Hive table:
echo "generating sample data for hive table"
echo {-1..-181451}hours | xargs -n1 date +"%Y-%m-%d %H:%M:%S" -d >> /tmp/dates.data
echo {-1..-18145}minutes | xargs -n1 date +"%Y-%m-%d %H:%M:%S" -d >> /tmp/dates.data
echo {-1..-1825}days | xargs -n1 date +"%Y-%m-%d %H:%M:%S" -d >> /tmp/dates.data
cat /tmp/dates.data | while read LINE ; do echo $LINE,"user"$((1 + RANDOM % 10000)),$((1 + RANDOM % 1000)) >> /tmp/hive_user_table.data; done

create hive table
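The DDL is truncated in the preview. A sketch of what follows, assuming the comma-delimited data generated above (column names and the beeline URL are assumptions):

beeline -u "jdbc:hive2://localhost:10000/" <<'EOF'
-- plain Hive table over the generated CSV data
CREATE TABLE hive_user_table (`timecolumn` timestamp, `username` string, `credit_rating` int)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
LOAD DATA LOCAL INPATH '/tmp/hive_user_table.data' INTO TABLE hive_user_table;

-- Druid-backed table: the storage handler requires a timestamp column named __time
CREATE TABLE druid_user_table
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.segment.granularity" = "DAY")
AS SELECT `timecolumn` AS `__time`, `username`, `credit_rating` FROM hive_user_table;
EOF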

@rajkrrsingh
rajkrrsingh / Hive_Compaction_Failing.md
Created June 27, 2018 23:56
hive compaction failing with FileAlreadyExistsException

ENV

HDP263

Exception

Client
ERROR [Thread-123]: compactor.Worker (Worker.java:run(191)) - Caught exception while trying to compact id:123,dbname:hive_acid,tableName:hive_acid_table,partName:hive_acid_part=part_name,state:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestTxnId:0. Marking failed to avoid repeated failures
java.io.IOException: Minor compactor job failed for Hadoop JobId:job_XXXXXX_XXXX
    at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.launchCompactionJob(CompactorMR.java:314)
    at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:269)
    at org.apache.hadoop.hive.ql.txn.compactor.Worker$1.run(Worker.java:175)
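The resolution is not shown in the preview; as a diagnostic starting point, the compaction queue can be inspected and a new major compaction queued for the affected partition (names below follow the log line above; the beeline URL is an assumption):

# list past and queued compactions with their states
beeline -u "jdbc:hive2://localhost:10000/" -e "SHOW COMPACTIONS;"

# re-queue a major compaction for the partition named in the error
beeline -u "jdbc:hive2://localhost:10000/" -e \
  "ALTER TABLE hive_acid.hive_acid_table PARTITION (hive_acid_part='part_name') COMPACT 'major';"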
@rajkrrsingh
rajkrrsingh / Hive_Metastore_Event_listners.md
Created January 17, 2018 22:20
how to create your own metastore event listener

Create a DROP TABLE listener that gets triggered when a DROP TABLE event happens.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.metastore.MetaStoreEventListener;
import org.apache.hadoop.hive.metastore.api.MetaException;
import org.apache.hadoop.hive.metastore.events.DropTableEvent;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class DropTableListner extends MetaStoreEventListener {
  private static final Logger LOG = LoggerFactory.getLogger(DropTableListner.class);

  public DropTableListner(Configuration config) { super(config); }

  @Override // called by the metastore whenever a table is dropped; logging here is a minimal example body
  public void onDropTable(DropTableEvent tableEvent) throws MetaException {
    LOG.info("table {} dropped", tableEvent.getTable().getTableName());
  }
}
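To wire the listener into the metastore (a sketch; the jar name and HDP paths are assumptions), place the jar on the metastore classpath and register the class in hive-site.xml:

# assumed jar name; build it from the class above
cp drop-table-listner.jar /usr/hdp/current/hive-metastore/lib/

# in hive-site.xml (or Ambari > Hive > Custom hive-site), then restart the metastore:
#   hive.metastore.event.listeners=DropTableListner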
@rajkrrsingh
rajkrrsingh / Druid_installation_HDP_2.6.2.md
Last active October 26, 2017 11:09
steps to install Druid on HDP 2.6.2 on CentOS 7

During the installation I hit several issues where the install failed for various reasons; I have documented some of the hurdles I faced and how I overcame them. There is an issue with the Superset installation when you select SQLite as the storage, so select MySQL or Postgres as the Superset storage in the Ambari installation wizard.

Issue1: Requires: openblas-devel

Druid broker installation failed with the following exception:

resource_management.core.exceptions.ExecutionFailed: Execution of '/usr/bin/yum -d 0 -e 0 -y install superset_2_6_2_0_205' returned 1. Error: Package: superset_2_6_2_0_205-0.15.0.2.6.2.0-205.x86_64 (HDP-2.6)
           Requires: openblas-devel
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest
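One way past this dependency error (assuming CentOS 7, where openblas-devel ships in EPEL rather than the base repos) is to install it manually and retry the failed step in Ambari:

yum install -y epel-release        # enables the EPEL repository
yum install -y openblas-devel
# then click Retry on the failed install step in the Ambari wizard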
@rajkrrsingh
rajkrrsingh / Kafka-Metrics.md
Last active July 23, 2022 12:21
Monitoring Kafka Broker JMX using jolokia JVM Agent
Download the Jolokia JVM agent from the following location:
https://jolokia.org/download.html
wget http://search.maven.org/remotecontent?filepath=org/jolokia/jolokia-jvm/1.3.7/jolokia-jvm-1.3.7-agent.jar

mv jolokia-jvm-1.3.7-agent.jar agent.jar
Here is a small shell script to get the metrics you are interested in:
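The script itself is truncated in the preview; a minimal version might look like the sketch below (the MBean name is one example; the Jolokia JVM agent listens on port 8778 by default):

#!/bin/bash
# attach the Jolokia agent to the running Kafka broker (one-time)
KAFKA_PID=$(pgrep -f kafka.Kafka)
java -jar agent.jar start "$KAFKA_PID"

# read a broker metric over Jolokia's REST API
curl -s "http://127.0.0.1:8778/jolokia/read/kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec"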
@rajkrrsingh
rajkrrsingh / Kafka-MirrorMaker-Set-Up.md
Created September 5, 2017 09:14
Kafka Mirror Maker - from source non-kerberized cluster to kerberized cluster


Env:

source cluster:
HDP242
un-secure
hostname: rksnode1

destination cluster:
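The destination details and the invocation are cut off in the preview. End to end, a MirrorMaker run typically looks like the sketch below (destination hostname, topic pattern, and property contents are assumptions; PLAINTEXTSASL is HDP's name for SASL_PLAINTEXT):

# consumer.properties -> source (non-kerberized) cluster
#   bootstrap.servers=rksnode1:6667
#   security.protocol=PLAINTEXT
# producer.properties -> destination (kerberized) cluster
#   bootstrap.servers=<dest-broker>:6667
#   security.protocol=PLAINTEXTSASL

kinit -kt /etc/security/keytabs/kafka.service.keytab kafka/$(hostname -f)
/usr/hdp/current/kafka-broker/bin/kafka-mirror-maker.sh \
  --consumer.config consumer.properties \
  --producer.config producer.properties \
  --whitelist 'test.*'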
@rajkrrsingh
rajkrrsingh / Custom_UDF_with_LLAP.md
Last active January 25, 2018 20:44
steps to add custom udf to LLAP

Creating and running temporary functions is discouraged when running queries on LLAP for security reasons: since many users share the same LLAP instances, a temporary function can create conflicts. You can still create temporary functions using add jar together with hive.llap.execution.mode=auto.

With the exclusive LLAP execution mode (hive.llap.execution.mode=only) you will run into a ClassNotFoundException; hive.llap.execution.mode=auto allows some parts of the query (the map tasks) to run in Tez containers instead.

Here are the steps to create a custom permanent function in LLAP (tested on HDP-2.6.0):

  1. create a jar for the UDF function (in this case a simple UDF; the remaining steps are sketched below):
git clone https://github.com/rajkrrsingh/SampleCode
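The later steps are truncated in the preview; under the usual permanent-function flow they would be roughly as follows (jar name, UDF class, and the HiveServer2 Interactive URL are assumptions):

# build the UDF jar and stage it on HDFS so every LLAP daemon can fetch it
cd SampleCode && mvn -q package          # assumes a Maven project
hdfs dfs -mkdir -p /tmp/udf
hdfs dfs -put target/sample-udf.jar /tmp/udf/

# register a permanent function backed by the HDFS jar
beeline -u "jdbc:hive2://llap-host:10500/" -e \
  "CREATE FUNCTION myudf AS 'com.example.MyUDF' USING JAR 'hdfs:///tmp/udf/sample-udf.jar';"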
@rajkrrsingh
rajkrrsingh / SparkKafkaIntegration.md
Last active December 18, 2019 09:25
Spark Kafka consumer in a secure (Kerberos) environment

Sample Application

using direct stream
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming._
import org.apache.spark.streaming.kafka._

object SparkKafkaConsumer2 {
  def main(args: Array[String]): Unit = {
    // broker, topic, and security settings below are placeholders
    val ssc = new StreamingContext(new SparkConf().setAppName("SparkKafkaConsumer2"), Seconds(10))
    val kafkaParams = Map("metadata.broker.list" -> "broker1:6667", "security.protocol" -> "PLAINTEXTSASL")
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, Set("test"))
    stream.map(_._2).print()   // payloads only
    ssc.start(); ssc.awaitTermination()
  }
}
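Against a Kerberized Kafka the job is typically submitted with a JAAS configuration and keytab shipped to the driver and executors; a sketch (file and jar names are assumptions):

spark-submit --master yarn --deploy-mode client \
  --files kafka_client_jaas.conf,kafka.service.keytab \
  --driver-java-options "-Djava.security.auth.login.config=./kafka_client_jaas.conf" \
  --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=./kafka_client_jaas.conf" \
  --class SparkKafkaConsumer2 spark-kafka-consumer.jar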
@rajkrrsingh
rajkrrsingh / Spark LLAP Setup for Thrift server.md
Last active May 27, 2017 15:51
configuration required to set up Spark-LLAP

ENV HDP-2.6.0.3-8

Download spark-llap assembly jar from http://repo.hortonworks.com/content/repositories/releases/com/hortonworks/spark-llap/

Add the following to Custom spark-thrift-sparkconf:

spark_thrift_cmd_opts --jars /usr/hdp/current/spark-client/lib/spark-llap-1.0.0.2.6.0.3-8-assembly.jar
spark.executor.extraClassPath /usr/hdp/current/spark-client/lib/spark-llap-1.0.0.2.6.0.3-8-assembly.jar
spark.hadoop.hive.llap.daemon.service.hosts @llap0
spark.jars /usr/hdp/current/spark-client/lib/spark-llap-1.0.0.2.6.0.3-8-assembly.jar
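After restarting the Spark Thrift Server with these settings, connectivity can be sanity-checked from Beeline (10015 is HDP's default Spark Thrift Server port; host and user are assumptions):

beeline -u "jdbc:hive2://localhost:10015/" -n hive -e "show tables;"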