# Installing Toree+Spark 2.1 on Ubuntu 16.04
## 23rd January 2017
At the time of writing, the pip install of Toree is not compatible with Spark 2.x, so we need to build from the master branch on git.
```
sudo apt install openjdk-8-jdk-headless
sudo apt install git
```
sbt isn't available in the Ubuntu repos. Install it manually or do the following:
```
echo "deb https://dl.bintray.com/sbt/debian /" | sudo tee -a /etc/apt/sources.list.d/sbt.list
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 642AC823
sudo apt-get update
sudo apt-get install sbt
```
## Install Anaconda Python
Anaconda Python can be installed to a user's home directory and contains most of the Python modules needed by the majority of researchers. It can coexist with the standard Ubuntu Python packages.
```
wget https://repo.continuum.io/archive/Anaconda3-4.2.0-Linux-x86_64.sh
chmod +x ./Anaconda3-4.2.0-Linux-x86_64.sh
./Anaconda3-4.2.0-Linux-x86_64.sh
```
Follow the instructions. When you are asked the following, say yes:

> Do you wish the installer to prepend the Anaconda3 install location to PATH in your /home/walkingrandomly/.bashrc ? [yes|no]
Start a new terminal session so that the `.bashrc` changes get applied.
## Install Spark

Download and build Spark 2.1.0:

```
wget http://d3kbcqa49mib13.cloudfront.net/spark-2.1.0.tgz
tar -xvzf ./spark-2.1.0.tgz
cd spark-2.1.0/
build/mvn -DskipTests clean package
```
Check that you can run the Spark shell.
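Assuming the build above succeeded and you are still in the `spark-2.1.0` directory, one way to do this is:

```shell
# Launch the interactive Spark shell from the build directory
bin/spark-shell
```

You should get a `scala>` prompt.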
Press CTRL-D to exit the shell
## Install Toree from source
```
git clone https://github.com/apache/incubator-toree
cd incubator-toree/
make dist
```
You'll get the following error, which you can ignore:
```
/bin/sh: 1: docker: not found
Makefile:212: recipe for target 'dist/toree-pip/toree-0.2.0.dev1.tar.gz' failed
make: *** [dist/toree-pip/toree-0.2.0.dev1.tar.gz] Error 127
```
Now we can install the built package:
```
cd dist/toree-pip/
python setup.py install
```
Install the Jupyter kernel. I call this one `bespoke_spark` to differentiate it from any others you may have. Be sure to change the value of `--spark_home` to yours.
```
jupyter toree install --kernel_name=bespoke_spark --spark_home=/home/walkingrandomly/spark-2.1.0/ --user
```
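You can confirm that the kernel was registered; the kernel name shown below assumes the `--kernel_name` used above:

```shell
# List installed Jupyter kernels; bespoke_spark should appear in the output
jupyter kernelspec list
```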
Now launch Jupyter with `jupyter notebook` and you'll be able to select the `bespoke_spark` kernel and use Spark 2.1.