Setup Spark + Hadoop cluster
# Download: prebuilt Spark + Hadoop
From: https://spark.apache.org/downloads.html
Prefer the prebuilt version; it is easier to deploy to all workers.
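A minimal sketch of grabbing and unpacking a prebuilt release (the version, mirror, and install path below are assumptions; pick the build matching your Hadoop version on the downloads page):
```
# Hypothetical release; substitute the one you selected on the downloads page
wget https://archive.apache.org/dist/spark/spark-1.4.1/spark-1.4.1-bin-hadoop2.6.tgz
tar -xzf spark-1.4.1-bin-hadoop2.6.tgz
mv spark-1.4.1-bin-hadoop2.6 ~/spark
export SPARK_HOME=~/spark
```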
# Configuration
1. Slaves configuration (a consolidated sketch follows these steps)
* Client:
sudo adduser worker
sudo /usr/sbin/visudo (add sudo privileges for worker)
ssh worker@192.168.1.15
mkdir .ssh
nano .ssh/authorized_keys (copy the master's id_rsa.pub into this file)
copy the Spark directory to /home/worker/spark
* Server:
cd $SPARK_HOME/conf
cp slaves.template slaves
add one line per worker host, e.g. worker@192.168.1.15
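A consolidated sketch of the steps above, run mostly from the master (it assumes the master already has a key pair in ~/.ssh and that 192.168.1.15 stands in for each worker's address):
```
# On each worker ("client"): create the account, then grant sudo via visudo
sudo adduser worker

# On the master ("server"): push the public key for passwordless SSH
ssh-copy-id worker@192.168.1.15   # appends id_rsa.pub to the worker's authorized_keys

# Deploy the same Spark build to the worker
rsync -a $SPARK_HOME/ worker@192.168.1.15:/home/worker/spark/

# Register the worker in conf/slaves (one line per host)
echo "worker@192.168.1.15" >> $SPARK_HOME/conf/slaves
```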
2. Spark environment
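This step is left empty above; a minimal conf/spark-env.sh sketch for the master follows (all values are assumptions, adjust them to your master address and hardware):
```
cd $SPARK_HOME/conf
cp spark-env.sh.template spark-env.sh
# Assumed settings; 192.168.1.1 stands in for the master's address.
# Copy the same file to each worker's Spark install as well.
cat >> spark-env.sh <<'EOF'
export SPARK_MASTER_IP=192.168.1.1
export SPARK_WORKER_CORES=4       # cores each worker offers to Spark
export SPARK_WORKER_MEMORY=4g     # memory each worker offers to Spark
EOF
```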
# Start Cluster
cd $SPARK_HOME/sbin
./start-all.sh
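To verify the cluster came up (port 8080 is the standalone master's default web UI; 192.168.1.1 is an assumed master address):
```
# The master web UI lists every registered worker
curl -s http://192.168.1.1:8080 | grep -i worker

# If something failed to start, the daemon logs are under $SPARK_HOME/logs
tail $SPARK_HOME/logs/*.out

# Shut the cluster down with the matching script
# ./stop-all.sh
```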
# Setup pyspark + ipython
Follow these instructions: http://ramhiser.com/2015/02/01/configuring-ipython-notebook-support-for-pyspark/
However, replace the corresponding export line with the following, or you will never get it working:
```
export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"
```
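For a quick sanity check without the full IPython profile from the post, the same variables can be exported by hand before launching (the paths and driver settings here are assumptions for this sketch, not the post's exact setup):
```
export SPARK_HOME=~/spark                 # assumed install path
export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"
export PYSPARK_DRIVER_PYTHON=ipython      # run the pyspark driver inside IPython
$SPARK_HOME/bin/pyspark                   # SparkContext is available as sc
```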