@purukaushik
Last active September 1, 2016 08:14
iPython pyspark notebook settings

Install findspark: clone https://github.com/minrk/findspark, cd findspark, then run pip install -e .

Start a notebook (jupyter notebook).

Enter the following in a cell:

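import os

import findspark
findspark.init()  # locates Spark and adds pyspark to sys.path; call before importing pyspark

import pyspark

sc = pyspark.SparkContext()
# Read the file as an RDD with one element per line
lines = sc.textFile(os.path.expanduser('~/dev/ipython/setup.py'))
# Keep only lines that contain at least one character
lines_nonempty = lines.filter(lambda x: len(x) > 0)
lines_nonempty.count()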
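
To sanity-check the number the cell returns, the same count can be reproduced without Spark. The snippet below is only a sketch: count_nonempty_lines, demo_path and demo_count are made-up names that are not part of findspark or pyspark, and the file contents are a throwaway example. It applies the same test as the filter above (length greater than zero once the line terminator is removed), so whitespace-only lines still count as non-empty.

```python
import os
import tempfile


def count_nonempty_lines(path):
    """Count lines that hold at least one character, ignoring the newline."""
    with open(os.path.expanduser(path), encoding="utf-8") as handle:
        # Same predicate as the Spark filter: len(x) > 0 once the newline is stripped.
        return sum(1 for line in handle if len(line.rstrip("\n")) > 0)


# Demonstration on a temporary file with hypothetical contents:
# two code lines, one empty line, one whitespace-only line.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as tmp:
    tmp.write("import os\n\nprint(os.getcwd())\n   \n")
    demo_path = tmp.name

demo_count = count_nonempty_lines(demo_path)
os.remove(demo_path)
print(demo_count)
```

Calling count_nonempty_lines('~/dev/ipython/setup.py') should give the same number as lines_nonempty.count(), assuming the file is UTF-8 text and Spark is reading it from the local filesystem.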