Create MapR-DB Table
Created Aug 9, 2015
Code snippet to create a MapR-DB table
private void createTable(String tableName, List<String> cfList)
        throws IOException {
    final String table = tableName;
    final List<String> cfs = cfList;
    try {
        ugi.doAs(new PrivilegedExceptionAction<Void>() {
            public Void run() throws Exception {
                if (!admin.tableExists(table)) {
Redis Server installation on CentOS
Created Nov 23, 2015
# tar xzf redis-3.0.2.tar.gz
# cd redis-3.0.2
# make
Run Redis Server
HiveServer2 connection with MySQL DB as metastore over SSL
Last active Jul 18, 2016
Quick guide to connecting HS2 to a MySQL DB metastore over SSL
Setting up MySQL SSL

# Create clean environment
shell> rm -rf newcerts
shell> mkdir newcerts && cd newcerts

# Create CA certificate
shell> openssl genrsa 2048 > ca-key.pem
shell> openssl req -new -x509 -nodes -days 3600 \
         -key ca-key.pem -out ca.pem
Reading parquet files using the parquet tools
Created Aug 10, 2015
Reading Parquet files and inspecting their metadata
// Build parquet-tools
git clone
cd parquet-mr/parquet-tools/
mvn clean package -Plocal
// Show the schema of the Parquet file
java -jar parquet-tools-1.6.0rc3-SNAPSHOT.jar schema sample.parquet
// Print the contents of the Parquet file
java -jar parquet-tools-1.6.0rc3-SNAPSHOT.jar cat sample.parquet
rajkrrsingh /
Created Nov 22, 2016
Java program to run a Sqoop command using the Google SSHXCUTE framework
import net.neoremind.sshxcute.core.SSHExec;
import net.neoremind.sshxcute.core.ConnBean;
import net.neoremind.sshxcute.task.CustomTask;
import net.neoremind.sshxcute.task.impl.ExecCommand;
public class RunSqoopCommand {
    public static void main(String[] args) throws Exception {
Spark Streaming Sample program using scala
Created Nov 27, 2016
mkdir spark-streaming-example
cd spark-streaming-example/
mkdir -p src/main/scala
cd src/main/scala
vim TestStreaming.scala
Add the following lines of code to TestStreaming.scala:
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.StreamingContext._
ATLAS REST API
Last active Nov 27, 2016
Quick reference guide to the Atlas REST API
[root@rksnode ~]# curl http://rksnode:21000/api/atlas/admin/version
{"Version":"","Name":"apache-atlas","Description":"Metadata Management and Data Governance Platform over Hadoop"}
[root@rksnode ~]# curl http://rksnode:21000/api/atlas/types
{"results":["DataSet","hive_order","Process","hive_table","hive_db","hive_process","hive_principal_type","hive_resource_type","hive_object_type","Infrastructure","hive_index","hive_column","hive_resourceuri","hive_storagedesc","hive_role","hive_partition","hive_serde","hive_type"],"count":18,"requestId":"qtp1286783232-60 - 0128be6a-076e-4ad3-972a-58783a1f7180"}
[root@rksnode ~]# curl http://rksnode:21000/api/atlas/types/hive_process
{"typeName":"hive_process","definition":"{\n \"enumTypes\":[\n \n ],\n \"structTypes\":[\n \n ],\n \"traitTypes\":[\n \n ],\n \"classTypes\":[\n {\n \"superTypes\":[\n
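The same three GET requests can also be issued from Java. The following is a minimal sketch, assuming Java 11 or later (for java.net.http) and an Atlas endpoint that needs no authentication, as in the curl calls above. The class name AtlasRestSketch and the helper methods endpoint and get are hypothetical names chosen for illustration; only the host, port, and paths come from the transcript.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class AtlasRestSketch {
    // Base URL taken from the curl calls in the transcript; adjust for your cluster.
    static final String BASE = "http://rksnode:21000/api/atlas";

    // Joins the API root and a relative path with exactly one slash between them.
    static String endpoint(String base, String path) {
        String b = base.endsWith("/") ? base.substring(0, base.length() - 1) : base;
        String p = path.startsWith("/") ? path.substring(1) : path;
        return b + "/" + p;
    }

    // Sends a GET request and returns the raw response body (JSON text).
    static String get(HttpClient client, String url) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // The three paths queried with curl: version, all types, one type definition.
        for (String path : new String[] {"admin/version", "types", "types/hive_process"}) {
            System.out.println(get(client, endpoint(BASE, path)));
        }
    }
}
```

Running main requires a reachable Atlas server; endpoint can be exercised on its own without one.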
sample spark2 application using scala
Created Nov 30, 2016
mkdir Spark2StarterApp
cd Spark2StarterApp/
mkdir -p src/main/scala
cd src/main/scala
vim Spark2Example.scala
import org.apache.spark.sql.SparkSession
object Spark2Example {
rajkrrsingh / Spark2DataSetDemo
Created Nov 30, 2016
Sample Spark 2 application demonstrating the Dataset API
[root@rkk1 Spark2StarterApp]# /usr/hdp/current/spark2-client/bin/spark-shell
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/11/30 18:01:48 WARN SparkContext: Use an existing SparkContext, some configuration may not take effect.
Spark context Web UI available at
Spark context available as 'sc' (master = local[*], app id = local-1480528906336).
Spark session available as 'spark'.
Welcome to
____ __
/ __/__ ___ _____/ /__
End-to-end Latency
0.0543 ms (median)
0.003125 ms (99th percentile)
5 ms (99.9th percentile)
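For readers unfamiliar with how such summary figures are derived from raw samples, here is a minimal sketch of the nearest-rank percentile method. This is an assumption for illustration: the source does not say which method or tool produced the numbers above, and the class name LatencyPercentiles and the sample values are made up.

```java
import java.util.Arrays;

public class LatencyPercentiles {
    // Nearest-rank percentile: the smallest sample such that at least p percent
    // of all samples are less than or equal to it. Assumes a non-empty array
    // and 0 < p <= 100.
    static double percentile(double[] samples, double p) {
        double[] sorted = samples.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Made-up latencies in milliseconds, not data from the benchmark.
        double[] latenciesMs = {1, 2, 3, 4, 5, 6, 7, 8, 9, 100};
        System.out.println("median: " + percentile(latenciesMs, 50));
        System.out.println("99th:   " + percentile(latenciesMs, 99));
        System.out.println("99.9th: " + percentile(latenciesMs, 99.9));
    }
}
```

With this definition a higher percentile is never smaller than a lower one, since the rank only grows with p.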
Producer and consumer
Producer - 1431170.2 records/sec (136.49 MB/sec)
Consumer - 3276754.7021 records/sec (312.4957 MB/sec)
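The two throughput units can be cross-checked against each other: dividing bytes per second by records per second gives the implied record size. A minimal sketch, assuming that "MB" here means 1024 * 1024 bytes (the source does not state this); ThroughputCheck and bytesPerRecord are hypothetical names.

```java
public class ThroughputCheck {
    // Bytes per record implied by a (records/sec, MB/sec) pair.
    // Assumption: 1 MB = 1024 * 1024 bytes.
    static double bytesPerRecord(double recordsPerSec, double mbPerSec) {
        return mbPerSec * 1024 * 1024 / recordsPerSec;
    }

    public static void main(String[] args) {
        // Figures copied from the producer and consumer lines above.
        System.out.printf("producer: %.1f bytes/record%n",
                bytesPerRecord(1431170.2, 136.49));
        System.out.printf("consumer: %.1f bytes/record%n",
                bytesPerRecord(3276754.7021, 312.4957));
    }
}
```

Under that assumption both lines work out to roughly 100 bytes per record, so the producer and consumer figures are consistent with the same record size.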