SUMANTAL slgithub

slgithub / git_brushup.md
Last active September 3, 2015 12:41 — forked from arttuladhar/git_brushup.md
Basic GIT Commands for everyday use.
slgithub / kafka
Last active September 3, 2015 12:30 — forked from superscott/kafka
Simple Kafka Ubuntu init.d Startup Script
DAEMON_PATH=/opt/kafka/bin
DAEMON_NAME=kafka
# Check that networking is up.
#[ ${NETWORKING} = "no" ] && exit 0
PATH=$PATH:$DAEMON_PATH
# See how we were called.
case "$1" in
start)
| Mac Shortcut | Windows Shortcut | Description |
| --- | --- | --- |
| ⌘ + ⇧ + R | Ctrl + Shift + R | Open / search for resources |
| ⌘ + ⇧ + T | Ctrl + Shift + T | Open / search for types (useful for finding classes) |
| ⌘ + O | Ctrl + O | Show quick outline of the Java class |
| ⌘ + T | Ctrl + T | Show type hierarchy |
| Alt + ↑ or ↓ | Alt + ↑ or ↓ | Move line/block |
| ⌃ + Space | Ctrl + Space | Content assist and code completion |
| ⌘ + ⇧ + F | Ctrl + Shift + F | Format source code |
| ⌘ + L | Ctrl + L | Go to line number |
slgithub / kafka.md
Last active September 3, 2015 12:22 — forked from ashrithr/kafka.md
kafka introduction

Introduction to Kafka

Kafka acts as a kind of write-ahead log (WAL): it records messages to a persistent store (disk) and lets subscribers read those messages and apply the changes to their own stores on a time frame appropriate to each system.

Terminology:

  • Producers send messages to brokers
  • Consumers read messages from brokers
  • Messages are sent to a topic
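
The terminology above can be illustrated with a toy in-memory model. This is only a sketch of the append-only log idea, not Kafka's client API: `ToyBroker`, `ToyConsumer`, and the `events` topic are hypothetical names invented for illustration, and nothing is written to disk.

```python
from collections import defaultdict


class ToyBroker:
    """Hypothetical in-memory stand-in for a broker: one append-only log per topic."""

    def __init__(self):
        self._logs = defaultdict(list)

    def append(self, topic, message):
        """Producer side: append a message to the topic log and return its offset."""
        log = self._logs[topic]
        log.append(message)
        return len(log) - 1

    def read(self, topic, offset, max_messages=10):
        """Consumer side: return up to max_messages starting at the given offset."""
        return self._logs[topic][offset:offset + max_messages]


class ToyConsumer:
    """Tracks its own offset, so each consumer can catch up at its own pace."""

    def __init__(self, broker, topic):
        self.broker = broker
        self.topic = topic
        self.offset = 0

    def poll(self, max_messages=10):
        batch = self.broker.read(self.topic, self.offset, max_messages)
        self.offset += len(batch)
        return batch


broker = ToyBroker()
for event in ["user=1 login", "user=2 login", "user=1 logout"]:
    broker.append("events", event)

fast = ToyConsumer(broker, "events")
slow = ToyConsumer(broker, "events")

fast_batch = fast.poll()                # reads everything currently in the log
slow_batch = slow.poll(max_messages=1)  # reads only the first message for now
```

Because the log is never mutated by reads, the slow consumer can poll again later and pick up exactly where it left off, independently of the fast one.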
slgithub / flume-ng-agent.sh
Last active September 3, 2015 12:21 — forked from ashrithr/flume-ng-agent.sh
Custom Flume NG agent init script for CentOS for running multiple agents on the same machine
#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
slgithub / RandomHttpLogGen.scala
Last active September 3, 2015 12:18 — forked from ashrithr/RandomHttpLogGen.scala
Scala script that generates random HTTP log events and writes them to a specified output file; the number of events generated per second is configurable. The IPGenerator class also takes a number of sessions and a session length, which can be used to simulate a returning user.
#!/bin/sh
exec scala -savecompiled "$0" "$@"
!#
import scala.collection.mutable.Map
import scala.util.Random
import scala.collection.mutable.ArrayBuffer
import java.io._
class IPGenerator(var sessionCount: Int, var sessionLength: Int) {
slgithub / TwitterStreamExample.java
Last active September 3, 2015 12:18 — forked from ashrithr/TwitterStreamExample.java
Twitter4j and GeoCode Parsing
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.StatusLine;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.json.simple.JSONArray;
import org.json.simple.JSONObject;
import org.json.simple.JSONValue;
slgithub / storm.md
Last active September 3, 2015 12:16 — forked from ashrithr/storm.md
Intro to storm

Storm

Storm is a distributed, fault-tolerant, real-time computation system.

### Features of Storm:

  • Scalable and robust
  • Fault-tolerant (automatic reassignment of tasks)
  • Reliable (all messages are processed at least once)
  • Fast
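
The "at least once" guarantee in the list above means a message may be handled more than once if an earlier attempt fails. The following toy sketch shows that behaviour in plain Python. It does not use Storm's API: `process_at_least_once`, `flaky_handler`, and the `max_attempts` limit are hypothetical names chosen for illustration.

```python
from collections import deque


def process_at_least_once(messages, handler, max_attempts=3):
    """Replay each message until the handler acknowledges it by returning True."""
    pending = deque((msg, 1) for msg in messages)
    acked = []
    while pending:
        msg, attempt = pending.popleft()
        if handler(msg):
            acked.append(msg)
        elif attempt < max_attempts:
            pending.append((msg, attempt + 1))  # failed: queue it for replay
        else:
            raise RuntimeError(f"gave up on {msg!r} after {attempt} attempts")
    return acked


seen = []


def flaky_handler(msg):
    """Hypothetical handler that fails the first time it sees 'b', then succeeds."""
    seen.append(msg)
    return not (msg == "b" and seen.count("b") == 1)


acked = process_at_least_once(["a", "b", "c"], flaky_handler)
```

Every message ends up acknowledged, but the handler sees `"b"` twice, so downstream logic has to tolerate duplicates.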
slgithub / mongo_setup.md
Last active September 3, 2015 12:13 — forked from ashrithr/mongo_setup.md
Mongo Setup Instructions

# Mongo Use Case:

Installing MongoDB on 5 machines with the following daemon configurations:

| Host | Mongo Role |
| --- | --- |
| router.mongo.cw.com | Router (mongos), Application Server, Arbiter (Shard1), Arbiter (Shard2), Config1, Config2, Config3 |
| shard1r1.mongo.cw.com | shard1 replica primary |
| shard1r2.mongo.cw.com | shard1 replica secondary |
| shard2r1.mongo.cw.com | shard2 replica primary |
slgithub / spark_on_yarn.md
Last active September 3, 2015 12:08 — forked from ashrithr/spark_on_yarn.md
spark 0.9 on yarn (hadoop-2.2)

## Using YARN as the resource manager, you can deploy a Spark application in two modes:

  1. yarn-standalone mode, in which your driver program runs as a thread of the YARN application master, which itself runs on one of the node managers in the cluster. The YARN client just pulls status from the application master. This mode works the same way as a MapReduce job, where the MR application master coordinates the containers that run the map/reduce tasks.

With this mode, your application actually runs on the remote machine that hosts the application master. Applications that involve local interaction, e.g. spark-shell, therefore do not work well.

  2. yarn-client mode, in which your driver program runs on the YARN client where you type the command to submit the Spark application (which may not be a machine in the YARN cluster). In this mode, although the driver program runs on the client machine, the tasks are executed on the executors in the node managers of the YARN cluster.
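
The difference between the two modes comes down to where the driver runs. As a rough sketch, the descriptions above can be captured in a small lookup table. `DeployMode`, `MODES`, and `pick_mode` are hypothetical names, not part of Spark, and marking yarn-client as suitable for interactive use is an inference from the note about spark-shell rather than something the text states directly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeployMode:
    name: str
    driver_location: str
    executor_location: str
    suits_interactive_use: bool


# Summary of the two modes as described in the text above.
MODES = {
    "yarn-standalone": DeployMode(
        name="yarn-standalone",
        driver_location="application master (on a node manager)",
        executor_location="node managers",
        suits_interactive_use=False,
    ),
    "yarn-client": DeployMode(
        name="yarn-client",
        driver_location="client machine",
        executor_location="node managers",
        suits_interactive_use=True,  # assumption inferred from the spark-shell note
    ),
}


def pick_mode(needs_local_interaction):
    """Illustrative rule of thumb: keep the driver local only when interaction needs it."""
    return "yarn-client" if needs_local_interaction else "yarn-standalone"


shell_mode = MODES[pick_mode(needs_local_interaction=True)]
batch_mode = MODES[pick_mode(needs_local_interaction=False)]
```

In both modes the tasks run on executors in the node managers; only the driver's location changes.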

Putting it together: