Rajkumar Singh rajkrrsingh

rajkrrsingh / SettableFutureTest
Created August 3, 2019 06:47
Quick test of Google Guava SettableFuture
package listenablefuture;
import com.google.common.util.concurrent.FutureCallback;
import com.google.common.util.concurrent.Futures;
import com.google.common.util.concurrent.ListenableFuture;
import com.google.common.util.concurrent.ListeningExecutorService;
import com.google.common.util.concurrent.MoreExecutors;
import com.google.common.util.concurrent.SettableFuture;
import com.google.common.util.concurrent.ThreadFactoryBuilder;
rajkrrsingh / Docker_HIVESERVER_Remote_Debugging.md
Last active July 25, 2019 20:22
Remotely debugging a HiveServer2 Docker container
HiveServer2 Dockerfile

Note the JAVA_TOOL_OPTIONS setting, which carries the remote debugging options.
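For orientation, remote debugging options passed through JAVA_TOOL_OPTIONS usually take the form of a JDWP agent string. The sketch below is a shell equivalent, not the value used in the gist's Dockerfile: the port 8000 and the `DEBUG_PORT` variable name are assumptions chosen for illustration.

```shell
# Assumed debug port; the gist's actual value is not shown in this excerpt.
DEBUG_PORT=8000

# Standard JDWP agent options: listen on a socket, do not block JVM startup.
JAVA_TOOL_OPTIONS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=${DEBUG_PORT}"
export JAVA_TOOL_OPTIONS

echo "$JAVA_TOOL_OPTIONS"
```

On JDK 9 and later, a bare port binds to localhost only, so a debugger attaching from outside the container would likely need `address=*:${DEBUG_PORT}` instead.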

FROM centos

# Basic hygiene
RUN yum upgrade -y && \
    yum update -y && \
rajkrrsingh / Hive_Docker.md
Last active July 24, 2019 23:10
Running HiveServer2 and Beeline as Docker containers

Create and run the HiveServer2 Docker container

mkdir hive-3-image
cd hive-3-image/
wget https://raw.githubusercontent.com/alanfgates/sqltest/master/dbs/hive/v3_1/Dockerfile
docker build . -t hive3-image
# Run it
docker run -it --net=myNetwork -p 10000:10000 hive3-image
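The `docker run` command above attaches the container to a user-defined network named `myNetwork`, which has to exist before the container starts. A minimal sketch for creating it only when it is missing follows; the helper name `ensure_network` is hypothetical and not part of the gist.

```shell
# Hypothetical helper: create the named Docker network unless it already exists.
ensure_network() {
  net_name="$1"
  # `docker network inspect` exits non-zero when the network is unknown.
  if ! docker network inspect "$net_name" >/dev/null 2>&1; then
    docker network create "$net_name"
  fi
}

# Usage, before the `docker run` step:
#   ensure_network myNetwork
```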

If everything goes well, you will see the following logs on stdout:

rajkrrsingh / Hive Replication V2 startup guide.md
Last active November 4, 2021 08:24
Jump-start guide for Hive Replication V2. To learn more about Hive replication, see https://cwiki.apache.org/confluence/display/Hive/HiveReplicationv2Development

Prerequisite Hive settings:

set hive.server2.logging.operation.level=execution;
set hive.metastore.transactional.event.listeners=org.apache.hive.hcatalog.listener.DbNotificationListener;
set hive.metastore.dml.events=true;

Set up the database and tables

rajkrrsingh / Create_Bulk_Topic_In_Kafka.md
Last active May 19, 2020 11:13
Create Kafka topics in bulk from the admin client

import kafka.admin.AdminUtils;
import kafka.admin.RackAwareMode;
import kafka.utils.ZKStringSerializer$;
import kafka.utils.ZkUtils;
import org.I0Itec.zkclient.ZkClient;
import org.I0Itec.zkclient.ZkConnection;

import java.io.BufferedReader;
rajkrrsingh / Hive LLAP Workload Management.md
Last active July 20, 2019 12:06
Hive LLAP workload management commands

Create RESOURCE PLAN testrp;

CREATE RESOURCE PLAN testrp;
CREATE POOL testrp.default.c1 WITH ALLOC_FRACTION=0.3, QUERY_PARALLELISM=3, SCHEDULING_POLICY='fair';
CREATE POOL testrp.default.c2 WITH ALLOC_FRACTION=0.5, QUERY_PARALLELISM=1, SCHEDULING_POLICY='fair';
SELECT * FROM SYS.WM_POOLS;
ALTER RESOURCE PLAN testrp VALIDATE;
ALTER RESOURCE PLAN testrp ENABLE ACTIVATE;
rajkrrsingh / Hive Kafka Integration
Last active May 19, 2020 11:17
A quick-start guide to querying a Kafka topic from a Hive table
#### ENV: HDP-3.1
#### Data setup:
```
cat sample-data.json
{"name": "Raj","address": {"a": "b","c": "d","e": "f"}}
{"name": "Raj1","address": {"a": "bb","c": "dd","e": "ff"}}
```
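One way to produce the file shown above is a heredoc. This sketch writes the same two records to `sample-data.json` in the current directory and prints the record count as a quick sanity check; it assumes one JSON object per line, as in the listing.

```shell
# Write the two sample records, one JSON object per line.
cat > sample-data.json <<'EOF'
{"name": "Raj","address": {"a": "b","c": "d","e": "f"}}
{"name": "Raj1","address": {"a": "bb","c": "dd","e": "ff"}}
EOF

# Count the records that were written.
wc -l < sample-data.json
```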
#### Create a topic in Kafka and ingest data into it.
rajkrrsingh / docker-help.md
Created February 2, 2019 03:11 — forked from bradtraversy/docker-help.md
Docker Commands, Help & Tips

Docker Commands, Help & Tips

Show commands & management commands

$ docker

Docker version info

rajkrrsingh / Druid_Supervisor_Rest_API.md
Created July 2, 2018 21:11
Druid Supervisor REST API

Supervisor REST API

Get Supervisor Spec

curl http://host215-node2:8090/druid/indexer/v1/supervisor/metrics-kafka
{"type":"kafka","dataSchema":{"dataSource":"metrics-kafka","parser":{"type":"string","parseSpec":{"format":"json","timestampSpec":{"column":"timestamp","format":"auto"},"dimensionsSpec":{"dimensions":[],"dimensionExclusions":["timestamp","value"]}}},"metricsSpec":[{"type":"count","name":"count"},{"type":"doubleSum","name":"value_sum","fieldName":"value","expression":null},{"type":"doubleMin","name":"value_min","fieldName":"value","expression":null},{"type":"doubleMax","name":"value_max","fieldName":"value","expression":null}],"granularitySpec":{"type":"uniform","segmentGranularity":"HOUR","queryGranularity":{"type":"none"},"rollup":true,"intervals":null}},"tuningConfig":{"type":"kafka","maxRowsInMemory":75000,"maxRowsPerSegment":5000000,"intermediatePersistPeriod":"PT10M","basePersistDirectory":"/tmp/1530551810169-0","maxPendingPersists":0,"indexSpec":{"bitmap":{"type":"concise"},"
rajkrrsingh / Druid_Kafka_Indexing_Service.md
Last active July 2, 2018 21:10
How to use the Druid Kafka indexing service

Kafka indexing service

The Overlord and MiddleManagers should have the following extensions loaded:
druid.extensions.loadList = ["druid-datasketches", "druid-hdfs-storage", "druid-kafka-indexing-service", "mysql-metadata-storage"]

Supervisor spec

cat supervisor-spec.json
{
  "type": "kafka",
  "dataSchema": {