Installing Toree+Spark 2.1 on Ubuntu 16.04

## 23rd January 2017

At the time of writing, the pip-installable release of Toree is not compatible with Spark 2.x, so we need to build Toree from the master branch of its git repository.

Install Java and git

sudo apt install openjdk-8-jdk-headless
sudo apt install git
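
Before moving on, it may be worth confirming that both tools ended up on the PATH. The following is only a minimal sketch; `have_tool` is a hypothetical helper name, not part of any package.

```shell
#!/bin/sh
# have_tool: hypothetical helper; succeeds if the named command is on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report the status of the tools installed above
# (javac ships with the JDK package, git with the git package).
for tool in javac git; do
    if have_tool "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: missing"
    fi
done
```

If either tool is reported as missing, rerun the corresponding apt command before continuing.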

Install sbt

sbt isn't available in the Ubuntu repositories. Either install it manually or add the sbt apt repository as follows:

echo "deb /" | sudo tee -a /etc/apt/sources.list.d/sbt.list
sudo apt-key adv --keyserver hkp:// --recv 642AC823
sudo apt-get update
sudo apt-get install sbt

Install Anaconda Python

Anaconda Python can be installed to a user's home directory, and it contains most of the Python modules needed by the majority of researchers. It can coexist with the normal Ubuntu Python packages.

chmod +x ./

Follow the installer's instructions. When you are asked the following, answer yes:

Do you wish the installer to prepend the Anaconda3 install location
to PATH in your /home/walkingrandomly/.bashrc ? [yes|no]

Start a new terminal session so that the .bashrc changes get applied.
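
To check that the new session picks up Anaconda's Python rather than the system one, something like the following may help. It is a sketch that assumes the installer's default directory name, `anaconda3`, under your home directory; `is_under` is a hypothetical helper.

```shell
#!/bin/sh
# is_under: hypothetical helper; succeeds if path $1 lies inside directory $2.
is_under() {
    case "$1" in
        "$2"/*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Assumption: Anaconda was installed to the default ~/anaconda3 location.
ANACONDA_DIR="$HOME/anaconda3"
PYTHON_PATH=$(command -v python || true)

if is_under "$PYTHON_PATH" "$ANACONDA_DIR"; then
    echo "python resolves to Anaconda: $PYTHON_PATH"
else
    echo "python does not resolve to $ANACONDA_DIR (found: ${PYTHON_PATH:-none})"
fi
```

If you chose a different install location, change `ANACONDA_DIR` to match.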

Install Spark

tar -xvzf ./spark-2.1.0.tgz
cd spark-2.1.0/
build/mvn -DskipTests clean package

Check that you can run the spark shell


Press CTRL-D to exit the shell
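
For a non-interactive check, the version banner can be printed instead. This sketch assumes the current directory is the spark-2.1.0 directory built above; `spark_version` is a hypothetical wrapper around the `--version` flag of Spark's `bin/spark-submit` launcher.

```shell
#!/bin/sh
# spark_version: hypothetical wrapper; prints the Spark version banner for the
# installation rooted at $1, or fails if the launcher script is not there.
spark_version() {
    if [ -x "$1/bin/spark-submit" ]; then
        "$1/bin/spark-submit" --version
    else
        echo "no executable bin/spark-submit under $1" >&2
        return 1
    fi
}

# Assumption: run from inside the spark-2.1.0 directory.
spark_version . || echo "Spark does not appear to be built here"
```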

Install toree from source

git clone
cd incubator-toree/
make dist

Then do

make release

You'll get the following error because docker is not installed. You can ignore it:

/bin/sh: 1: docker: not found
Makefile:212: recipe for target 'dist/toree-pip/toree-0.2.0.dev1.tar.gz' failed
make: *** [dist/toree-pip/toree-0.2.0.dev1.tar.gz] Error 127
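
The failure comes from a step that calls docker, so the files needed for the next step may already be in place. A quick way to check, as a sketch only, is to confirm that the dist/toree-pip directory exists and is not empty; `dir_has_files` is a hypothetical helper.

```shell
#!/bin/sh
# dir_has_files: hypothetical helper; succeeds if directory $1 exists and is not empty.
dir_has_files() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Assumption: run from the top of the incubator-toree checkout.
if dir_has_files dist/toree-pip; then
    echo "dist/toree-pip is populated; safe to continue"
else
    echo "dist/toree-pip is missing or empty; the build may have failed earlier"
fi
```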

Now we can install the built package:

cd dist/toree-pip/
python install

Install the Jupyter kernel. I call this one bespoke_spark to distinguish it from any others you may have. Be sure to change the value of --spark_home to point at your own Spark directory.

jupyter toree install --kernel_name=bespoke_spark --spark_home=/home/walkingrandomly/spark-2.1.0/ --user
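
To avoid hard-coding someone else's home directory, the command can be assembled from a variable first. This is a sketch under the assumption that Spark was unpacked and built in your own home directory; `SPARK_HOME_DIR` and `toree_install_cmd` are hypothetical names, and the helper only prints the command rather than running it.

```shell
#!/bin/sh
# Assumption: Spark lives in ~/spark-2.1.0 unless SPARK_HOME_DIR is already set.
SPARK_HOME_DIR="${SPARK_HOME_DIR:-$HOME/spark-2.1.0}"

# toree_install_cmd: hypothetical helper that prints the install command
# for a given kernel name ($1) and Spark directory ($2) without running it.
toree_install_cmd() {
    printf 'jupyter toree install --kernel_name=%s --spark_home=%s --user\n' "$1" "$2"
}

# Print the command so it can be inspected before being run by hand.
toree_install_cmd bespoke_spark "$SPARK_HOME_DIR"
```

After running the printed command, `jupyter kernelspec list` should show the newly registered kernel.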

Now launch Jupyter with

jupyter notebook

and you'll be able to select the new kernel and use Spark 2.1.

