PySpark test code
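from pyspark import SparkContext

# Directory to read; textFile() loads every file in it, one record per line.
dataFile = "./sbin/"
# The master URL is truncated in the original; fill in your cluster's host and port.
sc = SparkContext("spark://", "Simple App")
textRdd = sc.textFile(dataFile)
print("Number of lines:", textRdd.count())
print("Number of lines with 8080:", textRdd.filter(lambda x: '8080' in x).count())
sc.stop()  # release the cluster resources held by this application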
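
To sanity-check the two counts without a running cluster, the same logic can be sketched in plain Python. This is not Spark code: `count_lines` is a hypothetical helper introduced here for illustration. It assumes the files are readable as text and, to roughly mirror what Hadoop-style input handling is understood to do, skips subdirectories and files whose names start with `.` or `_`. Results may still differ from Spark's in edge cases.

```python
import os


def count_lines(directory, needle=None):
    """Count lines across the regular files directly inside `directory`.

    If `needle` is given, only lines containing it are counted.
    Hypothetical helper for cross-checking; not part of PySpark.
    """
    total = 0
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Assumption: skip subdirectories and hidden files (".", "_" prefixes).
        if not os.path.isfile(path) or name.startswith((".", "_")):
            continue
        with open(path, encoding="utf-8", errors="replace") as handle:
            for line in handle:
                if needle is None or needle in line:
                    total += 1
    return total


if __name__ == "__main__":
    data_dir = "./sbin/"  # same path as in the script above
    if os.path.isdir(data_dir):
        print("Number of lines:", count_lines(data_dir))
        print("Number of lines with 8080:", count_lines(data_dir, "8080"))
```

If the numbers printed here match the ones from the Spark job, the RDD `count()` and `filter()` calls are behaving as expected on that directory.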