@lucemia
Created October 19, 2015 16:51
test.py
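from pyspark import SparkContext

# Glob pattern over log files stored in a gs:// bucket.
logFile = "gs://tagtoo-track-log/log2bq-2014120100*"
sc = SparkContext()
# Cache the RDD so first() can reuse the data already loaded by count().
logData = sc.textFile(logFile).cache()
# Print the number of lines and the first line; formatted so it runs on Python 2 and 3.
print("%d %s" % (logData.count(), logData.first()))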