Setting up Spark on Ubuntu in a clustered environment

Getting Spark and setting up the spark user

sudo adduser spark
su spark
cd
wget http://apache.mirror.anlx.net/spark/spark-2.2.0/spark-2.2.0-bin-hadoop2.7.tgz
tar -xvf spark-2.2.0-bin-hadoop2.7.tgz
rm spark-2.2.0-bin-hadoop2.7.tgz
mv spark-2.2.0-bin-hadoop2.7/ spark/
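As a sketch, the steps above can be parameterised by version so the URL, archive name, and extracted directory stay in sync (the variable names here are illustrative; the values match the 2.2.0/Hadoop 2.7 release used above):

```shell
# Keep the mirror URL, archive name and extracted directory consistent.
SPARK_VERSION=2.2.0
HADOOP_VERSION=2.7
ARCHIVE="spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz"
URL="http://apache.mirror.anlx.net/spark/spark-${SPARK_VERSION}/${ARCHIVE}"
echo "$ARCHIVE"
echo "$URL"
```

Swapping the version variables is then the only change needed when a newer release is required.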

Setting up Java

sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer

Note: Spark 2.2 is built for Java 8; Java 9 is not supported by this release.
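A quick sanity check of the active JVM (this is just a convenience sketch; it prints exactly one line whether or not java is on the PATH):

```shell
# Print the active JVM version, or a fallback message if java is missing.
if command -v java >/dev/null 2>&1; then
    JAVA_LINE="$(java -version 2>&1 | head -n 1)"
else
    JAVA_LINE="java not found on PATH"
fi
echo "$JAVA_LINE"
```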

Getting the master/slave to run on startup

Create a new file at /etc/init.d/spark with the following contents:

#!/bin/sh

# Number of worker instances to launch on this machine; one per core is a
# common choice (cores per worker are controlled by SPARK_WORKER_CORES).
export SPARK_WORKER_INSTANCES=8

case "$1" in
        start)
                start-stop-daemon --start --quiet --chuid spark --exec /home/spark/spark/sbin/start-master.sh
                start-stop-daemon --start --quiet --chuid spark --exec /home/spark/spark/sbin/start-slave.sh -- spark://localhost:7077
                ;;
        stop)
                start-stop-daemon --start --quiet --chuid spark --exec /home/spark/spark/sbin/stop-slave.sh
                start-stop-daemon --start --quiet --chuid spark --exec /home/spark/spark/sbin/stop-master.sh
                ;;
        *)
                echo "Usage: $0 {start|stop}"
                exit 1
                ;;
esac

exit 0
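The dispatch logic of the script can be exercised without root or a Spark install by replacing the start-stop-daemon calls with echoes (a dry-run sketch, not the real init script):

```shell
# Dry-run of the init script's case-statement dispatch.
dispatch() {
    case "$1" in
        start) echo "would start master and slave" ;;
        stop)  echo "would stop slave and master" ;;
        *)     echo "usage: {start|stop}" ;;
    esac
}
dispatch start    # -> would start master and slave
dispatch stop     # -> would stop slave and master
dispatch restart  # -> usage: {start|stop}
```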

Make it executable: sudo chmod +x /etc/init.d/spark

Register it to run at boot: sudo update-rc.d spark defaults
