Rajkumar Singh rajkrrsingh

@rajkrrsingh
rajkrrsingh / Docker_HIVESERVER_Remote_Debugging.md
Last active Jul 25, 2019
Debugging HiveServer2 Docker container Remotely
HiveServer2 Dockerfile

Note the JAVA_TOOL_OPTIONS setting, which carries the remote debugging options.

FROM centos

# Basic hygiene
RUN yum upgrade -y && \
    yum update -y && \
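
The remote debugging options mentioned above are normally a JDWP agent string. The following is a minimal sketch of how such a value can be assembled; the port `8000`, the helper names, and the `ENV` rendering are assumptions for illustration, not taken from the original Dockerfile.

```python
def jdwp_tool_options(port=8000, suspend=False, bind_all=False):
    """Build a JAVA_TOOL_OPTIONS value that enables JDWP remote debugging.

    port, suspend and bind_all are illustrative defaults (assumptions).
    On JDK 9 and later the agent listens on localhost only unless the
    address is written as "*:<port>", which matters inside a container.
    """
    address = f"*:{port}" if bind_all else str(port)
    suspend_flag = "y" if suspend else "n"
    return (
        "-agentlib:jdwp=transport=dt_socket,server=y,"
        f"suspend={suspend_flag},address={address}"
    )


def dockerfile_env_line(port=8000, suspend=False, bind_all=False):
    """Render the option string as a Dockerfile ENV instruction."""
    value = jdwp_tool_options(port, suspend, bind_all)
    return f'ENV JAVA_TOOL_OPTIONS="{value}"'


if __name__ == "__main__":
    print(dockerfile_env_line())
```

The debug port also has to be published when the container is started (for example with `-p 8000:8000`) so that an IDE on the host can attach.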
rajkrrsingh / Hive_Docker.md
Last active Jul 24, 2019
Running HiveServer2 and Beeline as docker containers

Create and run hiveserver2 docker container

mkdir hive-3-image
cd hive-3-image/
wget https://raw.githubusercontent.com/alanfgates/sqltest/master/dbs/hive/v3_1/Dockerfile
docker build . -t hive3-image
# Run it (the Docker network myNetwork must already exist)
docker run -it --net=myNetwork -p 10000:10000 hive3-image

If everything goes well, you will see the following logs on stdout:
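
Independently of the log output, a quick way to confirm that the published port is reachable is a plain TCP probe. This is a sketch only: the function name and timeout are assumptions, and an open port shows that something is listening on 10000, not that HiveServer2 has finished initialising.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host and timeouts.
        return False


if __name__ == "__main__":
    # 10000 is the port published by the docker run command above.
    print(port_is_open("localhost", 10000))
```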

rajkrrsingh / Hive Replication V2 startup guide.md
Last active Nov 4, 2021
Jump-start guide for Hive Replication V2. To learn more about Hive replication, refer to https://cwiki.apache.org/confluence/display/Hive/HiveReplicationv2Development

Prerequisite Hive settings:

set hive.server2.logging.operation.level=execution;
set hive.metastore.transactional.event.listeners=org.apache.hive.hcatalog.listener.DbNotificationListener;
set hive.metastore.dml.events=true;

Set up database and tables

rajkrrsingh / Create_Bulk_Topic_In_Kafka.md
Last active May 19, 2020
Create Kafka topics in bulk from the admin client

import kafka.admin.AdminUtils;
import kafka.admin.RackAwareMode;
import kafka.utils.ZKStringSerializer$;
import kafka.utils.ZkUtils;
import org.I0Itec.zkclient.ZkClient;
import org.I0Itec.zkclient.ZkConnection;

import java.io.BufferedReader;
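
As a rough alternative to a Java admin client, the same bulk creation can be scripted around the `kafka-topics.sh` CLI. The sketch below only builds the command lines; the function name, the ZooKeeper address, and the partition and replication values are assumptions. The `--zookeeper` flag matches older, ZooKeeper-based Kafka releases; newer releases use `--bootstrap-server` instead.

```python
def topic_create_commands(topic_names, zookeeper="localhost:2181",
                          partitions=1, replication_factor=1):
    """Build one kafka-topics.sh --create command per non-empty topic name.

    zookeeper, partitions and replication_factor are placeholder values.
    """
    commands = []
    for name in topic_names:
        name = name.strip()
        if not name:
            continue  # skip blank lines, e.g. from a topic list file
        commands.append([
            "kafka-topics.sh", "--create",
            "--zookeeper", zookeeper,
            "--partitions", str(partitions),
            "--replication-factor", str(replication_factor),
            "--topic", name,
        ])
    return commands


if __name__ == "__main__":
    for command in topic_create_commands(["topic-1", "topic-2"]):
        print(" ".join(command))
```

Each returned list can be passed to `subprocess.run` on a host where the Kafka CLI is on the PATH.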
rajkrrsingh / Hive LLAP Workload Management.md
Last active Jul 20, 2019
Hive LLAP workload management commands

Create RESOURCE PLAN testrp;

CREATE RESOURCE PLAN testrp;
CREATE POOL testrp.default.c1 WITH ALLOC_FRACTION=0.3, QUERY_PARALLELISM=3, SCHEDULING_POLICY='fair';
CREATE POOL testrp.default.c2 WITH ALLOC_FRACTION=0.5, QUERY_PARALLELISM=1, SCHEDULING_POLICY='fair';
SELECT * FROM SYS.WM_POOLS;
ALTER RESOURCE PLAN testrp VALIDATE;
ALTER RESOURCE PLAN testrp ENABLE ACTIVATE;
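
Before running VALIDATE it can help to check the pool arithmetic by hand. The sketch below sums ALLOC_FRACTION per parent pool for the two pools created above. It assumes that sibling pools should not add up to more than 1.0; Hive performs its own validation server-side, and this is not a reimplementation of it.

```python
from collections import defaultdict


def alloc_fraction_by_parent(pools):
    """Sum ALLOC_FRACTION of pools grouped by their parent pool path."""
    totals = defaultdict(float)
    for path, fraction in pools.items():
        # "default.c1" has parent "default"; a top-level pool has parent "".
        parent = path.rsplit(".", 1)[0] if "." in path else ""
        totals[parent] += fraction
    return dict(totals)


def over_allocated_parents(pools, tolerance=1e-9):
    """Return parents whose children sum to more than 1.0 (assumed limit)."""
    totals = alloc_fraction_by_parent(pools)
    return sorted(p for p, total in totals.items() if total > 1.0 + tolerance)


# Pool paths and fractions from the CREATE POOL statements above.
POOLS = {"default.c1": 0.3, "default.c2": 0.5}

if __name__ == "__main__":
    print(alloc_fraction_by_parent(POOLS))
    print(over_allocated_parents(POOLS))
```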
rajkrrsingh / Hive Kafka Integration
Last active May 19, 2020
A quick-start guide to querying a Kafka topic from a Hive table
#### ENV: HDP-3.1
#### Data setup:
```
cat sample-data.json
{"name": "Raj","address": {"a": "b","c": "d","e": "f"}}
{"name": "Raj1","address": {"a": "bb","c": "dd","e": "ff"}}
```
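
Each line of sample-data.json is a separate JSON document, which suits line-oriented producers that send one message per input line. A small sketch for checking the file before ingesting it; the function names are hypothetical and the sample text is copied from the listing above.

```python
import json

SAMPLE = (
    '{"name": "Raj","address": {"a": "b","c": "d","e": "f"}}\n'
    '{"name": "Raj1","address": {"a": "bb","c": "dd","e": "ff"}}\n'
)


def parse_json_lines(text):
    """Parse newline-delimited JSON, failing loudly on a malformed line."""
    records = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number} is not valid JSON: {exc}") from exc
    return records


def to_message_lines(records):
    """Serialise each record compactly, one message per line."""
    return [json.dumps(record, separators=(",", ":")) for record in records]


if __name__ == "__main__":
    for message in to_message_lines(parse_json_lines(SAMPLE)):
        print(message)
```
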
#### Create a topic in Kafka and ingest data into it.
rajkrrsingh / docker-help.md
Created Feb 2, 2019 — forked from bradtraversy/docker-help.md
Docker Commands, Help & Tips

Docker Commands, Help & Tips

Show commands & management commands

$ docker

Docker version info

View Druid_Supervisor_Rest_API.md

Supervisor REST API

Get Supervisor Spec

curl http://host215-node2:8090/druid/indexer/v1/supervisor/metrics-kafka
{"type":"kafka","dataSchema":{"dataSource":"metrics-kafka","parser":{"type":"string","parseSpec":{"format":"json","timestampSpec":{"column":"timestamp","format":"auto"},"dimensionsSpec":{"dimensions":[],"dimensionExclusions":["timestamp","value"]}}},"metricsSpec":[{"type":"count","name":"count"},{"type":"doubleSum","name":"value_sum","fieldName":"value","expression":null},{"type":"doubleMin","name":"value_min","fieldName":"value","expression":null},{"type":"doubleMax","name":"value_max","fieldName":"value","expression":null}],"granularitySpec":{"type":"uniform","segmentGranularity":"HOUR","queryGranularity":{"type":"none"},"rollup":true,"intervals":null}},"tuningConfig":{"type":"kafka","maxRowsInMemory":75000,"maxRowsPerSegment":5000000,"intermediatePersistPeriod":"PT10M","basePersistDirectory":"/tmp/1530551810169-0","maxPendingPersists":0,"indexSpec":{"bitmap":{"type":"concise"},"
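
The same request can be made from a script. This is a minimal sketch using only the standard library; the helper names are hypothetical, and the host and supervisor id are the ones from the curl call above. The fetch itself needs a reachable Overlord, so only the URL builder runs offline.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def supervisor_url(overlord, supervisor_id):
    """Build the 'get supervisor spec' URL for the given Overlord base URL."""
    base = overlord.rstrip("/")
    return f"{base}/druid/indexer/v1/supervisor/{quote(supervisor_id, safe='')}"


def get_supervisor_spec(overlord, supervisor_id, timeout=10):
    """Fetch the supervisor spec and return it as a parsed dict."""
    with urlopen(supervisor_url(overlord, supervisor_id), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    print(supervisor_url("http://host215-node2:8090", "metrics-kafka"))
```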
rajkrrsingh / Druid_Kafka_Indexing_Service.md
Last active Jul 2, 2018
How to use the Druid Kafka indexing service

kafka indexing service

The Overlord and MiddleManagers should have the following extensions loaded:
druid.extensions.loadList = ["druid-datasketches", "druid-hdfs-storage", "druid-kafka-indexing-service", "mysql-metadata-storage"]

supervisor spec

cat supervisor-spec.json
{
  "type": "kafka",
  "dataSchema": {
rajkrrsingh / Druid_Batch_Mode_Ingestion.md
Last active Jul 1, 2018
Quick-start guide to ingesting data into Druid using batch mode on the HDP platform.

Source: http://druid.io/docs/latest/tutorials/tutorial-batch.html
ENV: HDP-2.6.4

pageview.json

{"time": "2015-09-01T00:00:00Z", "url": "/foo/bar", "user": "alice", "latencyMs": 32}
{"time": "2015-09-01T01:00:00Z", "url": "/", "user": "bob", "latencyMs": 11}
{"time": "2015-09-01T01:30:00Z", "url": "/foo/bar", "user": "bob", "latencyMs": 45}
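
A batch index task is usually given the time interval that the input data covers. As a sketch, the day-level interval for the rows above can be derived as follows; the function name is hypothetical, and the choice of whole-day boundaries is an assumption rather than something taken from the original task JSON.

```python
import json
from datetime import datetime, timedelta

PAGEVIEWS = (
    '{"time": "2015-09-01T00:00:00Z", "url": "/foo/bar", "user": "alice", "latencyMs": 32}\n'
    '{"time": "2015-09-01T01:00:00Z", "url": "/", "user": "bob", "latencyMs": 11}\n'
    '{"time": "2015-09-01T01:30:00Z", "url": "/foo/bar", "user": "bob", "latencyMs": 45}\n'
)


def covering_day_interval(json_lines, time_field="time"):
    """Return a 'start/end' interval of whole days covering every row.

    The end date is exclusive, so one day is added to the last row's date.
    """
    stamps = [
        datetime.strptime(json.loads(line)[time_field], "%Y-%m-%dT%H:%M:%SZ")
        for line in json_lines.splitlines()
        if line.strip()
    ]
    start = min(stamps).date()
    end = max(stamps).date() + timedelta(days=1)
    return f"{start.isoformat()}/{end.isoformat()}"


if __name__ == "__main__":
    print(covering_day_interval(PAGEVIEWS))
```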

index task json