@bryanyang0528
Created January 28, 2015 15:01
# Configure the necessary Spark environment. PySpark needs SPARK_HOME to be set
# so that it knows how to start the Spark master and some local workers for you to use.
import os
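# Fill this in with the path to your Spark installation, for example the
# spark-0.9.1-bin-cdh4 folder you just downloaded
# (e.g., /home/saasbook/spark-0.9.1-bin-cdh4).
path_to_spark = "/usr/local/spark"
# Uncomment the next line if SPARK_HOME is not already set in your environment.
#os.environ['SPARK_HOME'] = path_to_spark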
# Add the PySpark sources and the bundled py4j archive to the Python path
# so that `import pyspark` can find them.
import sys
path_to_pyspark = os.path.join(path_to_spark, "python")
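# py4j ships inside the Spark distribution under python/lib; the version in the
# file name must match the one bundled with your Spark release.
path_to_py4j = os.path.join(path_to_pyspark, "lib", "py4j-0.8.2.1-src.zip")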
sys.path.insert(0, path_to_pyspark)
sys.path.insert(0, path_to_py4j)
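
The setup above hard-codes the py4j version, so it breaks when the Spark installation bundles a different one. Below is a minimal sketch of the same steps wrapped in a function that looks up the archive by pattern instead. The helper name `add_pyspark_to_path` is hypothetical (it is not part of Spark), and the sketch assumes the archive lives at `python/lib/py4j-*-src.zip` under the Spark folder, as in the paths used above.

```python
import glob
import os
import sys


def add_pyspark_to_path(spark_home):
    """Put PySpark and its bundled py4j archive on sys.path.

    Hypothetical helper, not part of Spark. Returns the two path
    entries it resolved: [<spark_home>/python, <py4j zip>].
    """
    python_dir = os.path.join(spark_home, "python")
    lib_dir = os.path.join(python_dir, "lib")

    # Match whichever py4j version is bundled instead of hard-coding one.
    py4j_zips = sorted(glob.glob(os.path.join(lib_dir, "py4j-*-src.zip")))
    if not py4j_zips:
        raise IOError("no py4j-*-src.zip found under %s" % lib_dir)

    entries = [python_dir, py4j_zips[-1]]
    for entry in entries:
        if entry not in sys.path:
            sys.path.insert(0, entry)

    # Respect an existing SPARK_HOME; only set it when it is missing.
    os.environ.setdefault("SPARK_HOME", spark_home)
    return entries
```

Calling `add_pyspark_to_path("/usr/local/spark")` before `from pyspark import SparkContext` should have the same effect as the manual steps, provided Spark is installed at that location.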