Prateek Goel (goelprateek)

  • Lentra AI Pvt Ltd
  • Pune
goelprateek / gist:af0809e358fee501340f2efb9a3fe66c
Created April 23, 2017 06:18 — forked from stuart11n/gist:9628955
rename git branch locally and remotely
git branch -m old_branch new_branch # Rename branch locally
git push origin :old_branch # Delete the old branch
git push --set-upstream origin new_branch # Push the new branch, set local branch to track the new remote
goelprateek / SparkJoin
Created August 13, 2017 17:45 — forked from amithn/SparkJoin
Example showing how to join two RDDs using Apache Spark's Java API
package com.voicestreams.spark;
import org.apache.commons.io.FileUtils;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;
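The preview above shows only the imports. As a rough, hedged sketch of the same idea (not the forked gist's actual code), the snippet below joins two JavaPairRDDs keyed on a common id using Spark's Java API; the file names and tab-separated record layout are assumptions for illustration.

import scala.Tuple2;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SparkJoinSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("spark-join-sketch").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Two tab-separated inputs keyed by the same id (illustrative paths).
        JavaRDD<String> users  = sc.textFile("users.tsv");   // id \t name
        JavaRDD<String> orders = sc.textFile("orders.tsv");  // id \t amount

        JavaPairRDD<String, String> usersById = users.mapToPair(line -> {
            String[] f = line.split("\t", 2);
            return new Tuple2<>(f[0], f[1]);
        });
        JavaPairRDD<String, String> ordersById = orders.mapToPair(line -> {
            String[] f = line.split("\t", 2);
            return new Tuple2<>(f[0], f[1]);
        });

        // Inner join on the key: each result is (id, (name, amount)).
        JavaPairRDD<String, Tuple2<String, String>> joined = usersById.join(ordersById);
        joined.saveAsTextFile("joined-output");

        sc.stop();
    }
}

The forked gist imports PairFunction and Function2, so it presumably uses anonymous classes rather than lambdas; the join call itself is the same either way.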
goelprateek / Sharded mongodb environment on localhost
Created October 25, 2017 18:38 — forked from joewagner/Sharded mongodb environment on localhost
Bash shell script that sets up a sharded MongoDB cluster on a single machine. Handy for testing or development when a sharded deployment is required. Note that this will remove everything in the data/config and data/shard directories; if you are using those for something else, you may want to edit this...
# clean everything up
echo "killing mongod and mongos"
killall mongod
killall mongos
echo "removing data files"
rm -rf data/config
rm -rf data/shard*
# On macOS, make sure rlimits are high enough to open all necessary connections
ulimit -n 2048
goelprateek / 00-ReduceSideJoin
Created December 21, 2017 19:06 — forked from airawat/00-ReduceSideJoin
ReduceSideJoin - Sample Java MapReduce program for joining datasets with a cardinality of 1..1 and 1..many on the join key
My blog has an introduction to reduce-side joins in Java MapReduce:
http://hadooped.blogspot.com/2013/09/reduce-side-join-options-in-java-map.html
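The program itself is not shown in this preview. As a hedged sketch of the general pattern (not airawat's actual code), a reduce-side join tags each record with its source in the mappers, groups by the join key, and pairs the 1..1 side with the 1..many side in the reducer. The class names and CSV layouts (customers as custId,name and transactions as custId,amount) below are illustrative assumptions.

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ReduceSideJoinSketch {

    // Tag customer records (the 1..1 side) with "C|".
    public static class CustomerMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            String[] parts = value.toString().split(",", 2);  // custId,name
            ctx.write(new Text(parts[0]), new Text("C|" + parts[1]));
        }
    }

    // Tag transaction records (the 1..many side) with "T|".
    public static class TransactionMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            String[] parts = value.toString().split(",", 2);  // custId,amount
            ctx.write(new Text(parts[0]), new Text("T|" + parts[1]));
        }
    }

    // Buffer the single customer record, then emit it once per transaction (inner join).
    public static class JoinReducer extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text key, Iterable<Text> values, Context ctx)
                throws IOException, InterruptedException {
            String customer = null;
            List<String> txns = new ArrayList<>();
            for (Text v : values) {
                String s = v.toString();
                if (s.startsWith("C|")) customer = s.substring(2);
                else txns.add(s.substring(2));
            }
            if (customer == null) return;  // drop transactions with no matching customer
            for (String t : txns) ctx.write(key, new Text(customer + "," + t));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "reduce-side-join-sketch");
        job.setJarByClass(ReduceSideJoinSketch.class);
        MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, CustomerMapper.class);
        MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, TransactionMapper.class);
        job.setReducerClass(JoinReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileOutputFormat.setOutputPath(job, new Path(args[2]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

The linked blog post discusses the available options; a production version would typically use a secondary sort so the 1..1 record arrives first at the reducer, avoiding the buffering done here.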
goelprateek / 00-MapSideJoinLargeDatasets
Created December 23, 2017 07:11 — forked from airawat/00-MapSideJoinLargeDatasets
MapsideJoinOfTwoLargeDatasets(Old API) - Joining (inner join) two large datasets on the map side
**********************
**Gist
**********************
This gist details how to inner join two large datasets on the map side, leveraging the join capability
in MapReduce. Such a join makes sense if both input datasets are too large to qualify for distribution
through the DistributedCache, and it can be implemented if both datasets can be joined by the join key
and both are sorted in the same order by the join key.
There are two critical pieces to engaging the join behavior:
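In the old (mapred) API, those two pieces are typically the input format and the join expression: the job is configured with CompositeInputFormat, and the mapred.join.expr property names the join type, the underlying input format, and the pre-sorted, identically partitioned inputs. Below is a hedged driver-plus-mapper sketch along those lines; the paths, key/value layout, and class names are illustrative assumptions, not the gist's own code.

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.KeyValueTextInputFormat;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.mapred.join.CompositeInputFormat;
import org.apache.hadoop.mapred.join.TupleWritable;

public class MapSideJoinSketch {

    // The framework hands the mapper an already-joined tuple: one value from each input.
    public static class JoinMapper extends MapReduceBase
            implements Mapper<Text, TupleWritable, Text, Text> {
        public void map(Text key, TupleWritable value, OutputCollector<Text, Text> out, Reporter reporter)
                throws IOException {
            out.collect(key, new Text(value.get(0) + "\t" + value.get(1)));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(MapSideJoinSketch.class);
        conf.setJobName("map-side-join-sketch");

        // Piece 1: use CompositeInputFormat instead of a regular input format.
        conf.setInputFormat(CompositeInputFormat.class);

        // Piece 2: the join expression (join type, underlying input format, the two inputs).
        conf.set("mapred.join.expr", CompositeInputFormat.compose(
                "inner", KeyValueTextInputFormat.class, new Path(args[0]), new Path(args[1])));

        conf.setNumReduceTasks(0);  // map-only job
        conf.setMapperClass(JoinMapper.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);
        conf.setOutputFormat(TextOutputFormat.class);
        FileOutputFormat.setOutputPath(conf, new Path(args[2]));

        JobClient.runJob(conf);
    }
}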
goelprateek / ubuntu_agnoster_install.md
Created January 17, 2018 16:40 — forked from renshuki/ubuntu_agnoster_install.md
Ubuntu 16.04 + Terminator + Oh My ZSH with Agnoster Theme

Install Terminator (shell)

sudo add-apt-repository ppa:gnome-terminator
sudo apt-get update
sudo apt-get install terminator

Terminator should be set up as the default now. Restart your terminal (shortcut: "Ctrl+Alt+T").

Install ZSH

cd /opt
wget http://apache-mirror.rbc.ru/pub/apache/kafka/0.10.1.0/kafka_2.11-0.10.1.0.tgz
tar xvzf kafka_2.11-0.10.1.0.tgz
ln -s kafka_2.11-0.10.1.0/ kafka
vi /etc/systemd/system/kafka-zookeeper.service
[Unit]
Description=Apache Zookeeper server (Kafka)
Documentation=http://zookeeper.apache.org
goelprateek / nginx-tuning.md
Created February 13, 2018 12:46 — forked from denji/nginx-tuning.md
NGINX tuning for best performance

Moved to git repository: https://github.com/denji/nginx-tuning

NGINX Tuning For Best Performance

For this configuration you can use any web server you like; I decided to use nginx because it is what I work with most.

Generally, a properly configured nginx can handle up to 400K to 500K requests per second (clustered); the most I have seen is 50K to 80K requests per second (non-clustered) at around 30% CPU load. Of course, that was on 2 x Intel Xeon CPUs with Hyper-Threading enabled, but it can work without problems on slower machines.

Keep in mind that this config was used in a testing environment and not in production, so you will need to find the best way to implement these features for your own servers.

goelprateek / bash
Created July 30, 2018 05:39 — forked from jonashackt/bash
Remote debugging Spring Boot
### java -jar
java -Xdebug -Xrunjdwp:transport=dt_socket,server=y,address=8001,suspend=y -jar target/cxf-boot-simple-0.0.1-SNAPSHOT.jar
### Maven
Debug Spring Boot app with Maven:
mvn spring-boot:run -Drun.jvmArguments="-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8001"
goelprateek / delete-from-v2-docker-registry.md
Created December 6, 2018 08:36 — forked from jaytaylor/delete-from-v2-docker-registry.md
One liner for deleting images from a v2 docker registry

Just plug in your own values for registry and repo/image name.

registry='localhost:5000'
name='my-image'
curl -v -sSL -X DELETE "http://${registry}/v2/${name}/manifests/$(
    curl -sSL -I \
        -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \