SUMANTAL slgithub

@slgithub
slgithub / FIleSystemOperations.java
Last active September 3, 2015 12:02 — forked from ashrithr/FIleSystemOperations.java
HDFS FileSystems API example
package com.cloudwick.mapreduce.FileSystemAPI;
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
slgithub / elk_puppet.md
Last active September 3, 2015 12:03 — forked from ashrithr/elk_puppet.md
Installing & Configuring logstash, elasticsearch, logstash-forwarder using Puppet

Install puppet server and clients

Puppet Server

curl -s https://raw.githubusercontent.com/cloudwicklabs/scripts/master/puppet_install.sh | bash /dev/stdin -s -a -v

Puppet Client

slgithub / oracle_jdk.sh
Last active September 3, 2015 12:04 — forked from ashrithr/oracle_jdk.sh
wget oracle jdk
# RPM
wget --no-check-certificate \
--no-cookies \
--header "Cookie: oraclelicense=accept-securebackup-cookie" \
http://download.oracle.com/otn-pub/java/jdk/7u45-b18/jdk-7u45-linux-x64.rpm \
-O jdk-7u45-linux-x64.rpm
# TAR GZ
wget --no-check-certificate \
--no-cookies \
slgithub / readme.md
Last active September 3, 2015 12:04 — forked from ashrithr/readme.md
Installing ELK on a single machine

Installing ELK (CentOS)

This is a short step-by-step guide to installing the Elasticsearch, Logstash, and Kibana (ELK) stack on CentOS to gather and analyze logs.

I. Install JDK

rpm -ivh https://dl.dropboxusercontent.com/u/5756075/jdk-7u45-linux-x64.rpm
Producer
Setup
bin/kafka-topics.sh --zookeeper esv4-hcl197.grid.linkedin.com:2181 --create --topic test-rep-one --partitions 6 --replication-factor 1
bin/kafka-topics.sh --zookeeper esv4-hcl197.grid.linkedin.com:2181 --create --topic test --partitions 6 --replication-factor 3
Single thread, no replication
bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 50000000 100 -1 acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196
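To make the size of this run concrete, the sketch below names the four positional arguments of the command above and computes the total payload. The variable names are labels chosen for illustration, and reading the arguments as topic, record count, record size in bytes, and target throughput (with -1 meaning no throttling) is an assumption about this version of the tool rather than something stated in the gist.

```shell
#!/bin/sh
# Positional arguments copied from the ProducerPerformance command above.
# The variable names are hypothetical labels, not official option names.
topic=test7
num_records=50000000
record_size=100      # assumed to be bytes per record
throughput=-1        # assumed to mean "do not throttle"

# Total payload the producer sends over the whole run, in bytes.
total_bytes=$((num_records * record_size))
echo "topic=$topic total_bytes=$total_bytes"
```

Under those assumptions the run pushes 5,000,000,000 bytes (about 5 GB) of record payload through a single producer thread.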
slgithub / ssh_tunneling.md
Last active September 3, 2015 12:07 — forked from ashrithr/ssh_tunneling.md
ssh tunneling and port forwarding

###Single hop tunneling:

ssh -f -N -L 9906:127.0.0.1:3306 user@dev.example.com

where,

  • -f puts ssh in background
  • -N makes it not execute a remote command
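
As a rough sketch of how the pieces of the command above fit together, the helper below assembles the same single-hop tunnel from its parts and prints it instead of running it. The function name `tunnel_cmd` and its parameter names are hypothetical; the comment on `-L` reflects the standard OpenSSH meaning of that option.

```shell
#!/bin/sh
# Hypothetical helper: build the single-hop tunnel command from its parts.
# Usage: tunnel_cmd LOCAL_PORT TARGET_HOST TARGET_PORT LOGIN
tunnel_cmd() {
    local_port=$1
    target_host=$2
    target_port=$3
    login=$4
    # -f: go to the background after authentication
    # -N: do not execute a remote command
    # -L: listen on local_port here and forward each connection to
    #     target_host:target_port as resolved from the SSH server's side
    printf 'ssh -f -N -L %s:%s:%s %s\n' \
        "$local_port" "$target_host" "$target_port" "$login"
}

# Print the command from the example above rather than opening a connection.
tunnel_cmd 9906 127.0.0.1 3306 user@dev.example.com
```

Once such a tunnel is up, a client on the local machine would connect to 127.0.0.1:9906 and reach whatever listens on port 3306 on dev.example.com.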
slgithub / spark_on_yarn.md
Last active September 3, 2015 12:08 — forked from ashrithr/spark_on_yarn.md
spark 0.9 on yarn (hadoop-2.2)

##Using YARN as the resource manager, you can deploy a Spark application in two modes:

  1. yarn-standalone mode, in which the driver program runs as a thread of the YARN application master, which itself runs on one of the node managers in the cluster. The YARN client only polls the application master for status. This mode resembles a MapReduce job, where the MR application master coordinates the containers that run the map/reduce tasks.

In this mode, the application actually runs on the remote machine that hosts the application master, so applications that involve local interaction, such as spark-shell, will not work well.

  2. yarn-client mode, in which the driver program runs on the YARN client, the machine where you type the command to submit the Spark application (it need not be a machine in the YARN cluster). Although the driver program runs on the client machine, the tasks are executed on the executors in the node managers of the YARN cluster.

Putting it together:

slgithub / mongo_setup.md
Last active September 3, 2015 12:13 — forked from ashrithr/mongo_setup.md
Mongo Setup Instructions

#Mongo UseCase:

Installing MongoDB on 5 machines with the following daemon configurations:

| Host | Mongo Role |
| --- | --- |
| router.mongo.cw.com | Router (mongos), Application Server, Arbiter (Shard1), Arbiter (Shard2), Config1, Config2, Config3 |
| shard1r1.mongo.cw.com | shard1 replica primary |
| shard1r2.mongo.cw.com | shard1 replica secondary |
| shard2r1.mongo.cw.com | shard2 replica primary |
slgithub / storm.md
Last active September 3, 2015 12:16 — forked from ashrithr/storm.md
Intro to storm

Storm

Storm is a distributed and fault-tolerant realtime computation system.

###Features of storm:

  • Scalable and robust
  • Fault-tolerant (automatic reassigning of tasks)
  • Reliable (all messages are processed at least once)
  • Fast
slgithub / TwitterStreamExample.java
Last active September 3, 2015 12:18 — forked from ashrithr/TwitterStreamExample.java
Twitter4j and GeoCode Parsing
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.StatusLine;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.json.simple.JSONArray;
import org.json.simple.JSONObject;
import org.json.simple.JSONValue;