@thekensta
Created October 27, 2015 16:04
Spark Shell Script
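# Print the schema Spark infers from a set of gzipped JSON files.
# Submit with: spark-submit hello_pyspark.py
# Written against Spark 1.5.1
from pyspark import SparkContext, SparkConf
from pyspark.sql import SQLContext

# "local" runs Spark in-process with a single worker thread.
conf = SparkConf().setAppName("showMeTheSchema").setMaster("local")
sc = SparkContext(conf=conf)
sqlContext = SQLContext(sc)

# .gz input is decompressed transparently; each line must hold one JSON object.
df = sqlContext.read.json("data/20151026/*.gz")
df.printSchema()

# Release the context's resources before exiting.
sc.stop()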
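
The script above reads whatever matches data/20151026/*.gz, and the gist does not show what those files contain. To try the script without the original data, a small input file can be generated first. The following is a minimal sketch using only the Python standard library; the helper names `write_sample_gz` and `read_sample_gz`, the file name `sample.json.gz`, and the sample records are all hypothetical and not part of the original gist. It writes gzip-compressed JSON Lines (one object per line), the layout Spark's JSON reader expects by default.

```python
import gzip
import json
import os


def write_sample_gz(directory, filename, records):
    """Write records as gzip-compressed JSON Lines (hypothetical helper)."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, filename)
    # "wt" opens the gzip stream in text mode so we can write str lines.
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")
    return path


def read_sample_gz(path):
    """Read the file back as a list of dicts, to check what was written."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


if __name__ == "__main__":
    # Made-up records purely for illustration.
    sample = [
        {"user": "alice", "clicks": 3},
        {"user": "bob", "clicks": 7},
    ]
    # The directory mirrors the glob used in the script above.
    written = write_sample_gz("data/20151026", "sample.json.gz", sample)
    print(written)
```

Running this once from the directory that holds hello_pyspark.py should leave a file that the glob in the script matches, so printSchema() has something to infer from.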