@frank-leap
Last active January 27, 2018 18:15
Steps to configure Jupyter (IPython Notebook) with a Python (3.5.1) and Spark (1.6.0) kernel on Mac OS X (El Capitan)

Install Python3, Scala and Apache Spark via Brew (http://brew.sh/)

brew update
brew install python3
brew install scala
brew install apache-spark

Set environment variables

echo 'export SPARK_HOME="/usr/local/Cellar/apache-spark/1.6.0/libexec"' >> ~/.bashrc
echo 'export PYTHONPATH="$SPARK_HOME/python/:$PYTHONPATH"' >> ~/.bashrc
echo 'export PYTHONPATH="$SPARK_HOME/python/lib/py4j-0.9-src.zip:$PYTHONPATH"' >> ~/.bashrc
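The py4j version in the path above (0.9) is specific to Spark 1.6.0 and differs between Spark releases. As a rough sketch, the PYTHONPATH entries could be discovered with a glob instead of hardcoded. The helper name `spark_python_paths` is hypothetical, and the sketch assumes the `python/lib/py4j-*-src.zip` layout used in this guide.

```python
import glob
import os


def spark_python_paths(spark_home):
    """Return the two PYTHONPATH entries PySpark needs under spark_home."""
    python_dir = os.path.join(spark_home, "python")
    # Match whichever py4j version is bundled instead of hardcoding 0.9.
    pattern = os.path.join(python_dir, "lib", "py4j-*-src.zip")
    py4j_zips = sorted(glob.glob(pattern))
    if not py4j_zips:
        raise FileNotFoundError("no py4j-*-src.zip under " + python_dir)
    return [python_dir, py4j_zips[-1]]
```

Calling `spark_python_paths(os.environ["SPARK_HOME"])` in a new shell is also a quick way to confirm that SPARK_HOME was exported and points at a real Spark install.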

Install Jupyter (IPython Notebook) via pip(3)

pip3 install jupyter

Create an IPython profile

ipython profile create pyspark

Create a startup script for that profile

nano ~/.ipython/profile_pyspark/startup/00-spark-setup.py
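import os
import sys

spark_home = os.environ.get('SPARK_HOME')
if spark_home is None:
    raise RuntimeError('SPARK_HOME is not set')
sys.path.insert(0, os.path.join(spark_home, 'python'))
sys.path.insert(0, os.path.join(spark_home, 'python/lib/py4j-0.9-src.zip'))
# execfile() was removed in Python 3; read and exec the PySpark shell script instead
shell_path = os.path.join(spark_home, 'python/pyspark/shell.py')
with open(shell_path) as f:
    exec(compile(f.read(), shell_path, 'exec'))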

Verify that profile works

ipython --profile=pyspark
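If the profile loaded correctly, the PySpark shell script should have defined a SparkContext named `sc`. A minimal smoke test, assuming `sc` exists in the session, might look like this (the function name `spark_smoke_test` is made up for illustration):

```python
def spark_smoke_test(sc):
    """Run a tiny job on the SparkContext `sc`: sum of squares of 0..9."""
    rdd = sc.parallelize(range(10))
    return rdd.map(lambda x: x * x).sum()


# Inside the pyspark profile shell:
# spark_smoke_test(sc)  # 0 + 1 + 4 + ... + 81 = 285
```

A NameError on `sc` usually means the startup script did not run or SPARK_HOME was not set when IPython started.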

Create a kernel spec for Jupyter

mkdir -p ~/.ipython/kernels/pyspark
nano ~/.ipython/kernels/pyspark/kernel.json
{
    "display_name": "PySpark (Spark 1.6.0)",
    "language": "python",
    "argv": [
        "/usr/local/bin/python3",
        "-m",
        "ipykernel",
        "--profile=pyspark",
        "-f",
        "{connection_file}"
    ]
}
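The kernel spec above hardcodes /usr/local/bin/python3, which may not match the interpreter that has Jupyter installed. As a sketch, the same file could be generated from Python so that `argv` points at the running interpreter. The helper name `write_pyspark_kernel_spec` is hypothetical; the fields mirror the JSON above.

```python
import json
import os
import sys


def write_pyspark_kernel_spec(kernel_dir, display_name="PySpark (Spark 1.6.0)"):
    """Write kernel.json into kernel_dir and return the path of the file."""
    spec = {
        "display_name": display_name,
        "language": "python",
        "argv": [
            sys.executable,  # interpreter running this script, not a fixed path
            "-m", "ipykernel",
            "--profile=pyspark",
            "-f", "{connection_file}",
        ],
    }
    os.makedirs(kernel_dir, exist_ok=True)
    path = os.path.join(kernel_dir, "kernel.json")
    with open(path, "w") as f:
        json.dump(spec, f, indent=4)
    return path
```

Running it with python3 as `write_pyspark_kernel_spec(os.path.expanduser("~/.ipython/kernels/pyspark"))` would write to the same location used in this guide.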

Verify Jupyter works

jupyter notebook