@tmcgrath
Last active January 6, 2016 17:07
Happy Path Spark SQL with JSON input source
todd-mcgraths-macbook-pro:spark-1.4.1-bin-hadoop2.4 toddmcgrath$ bin/spark-shell
2016-01-06 10:54:58.540 java[25147:1203] Unable to load realm info from SCDynamicStore
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.4.1
      /_/
Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_65)
Type in expressions to have them evaluated.
Type :help for more information.
Spark context available as sc.
SQL context available as sqlContext.
scala> val customers = sqlContext.jsonFile("customers.json")
warning: there were 1 deprecation warning(s); re-run with -deprecation for details
2016-01-06 10:55:26.027 java[25147:1203] Unable to load realm info from SCDynamicStore
customers: org.apache.spark.sql.DataFrame = [address: struct<city:string,state:string,street:string,zip:string>, first_name: string, last_name: string]
scala> customers.registerTempTable("customers")
scala> val firstCityState = sqlContext.sql("SELECT first_name, address.city, address.state FROM customers")
firstCityState: org.apache.spark.sql.DataFrame = [first_name: string, city: string, state: string]
scala> firstCityState.collect.foreach(println)
[James,New Orleans,LA]
[Josephine,Brighton,MI]
[Art,Bridgeport,NJ]
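
The deprecation warning in the session above comes from `sqlContext.jsonFile`. A minimal sketch of the same happy path using the `DataFrameReader` API (`sqlContext.read.json`, available since Spark 1.4) might look like the following. It assumes it is pasted into `spark-shell`, where `sqlContext` is predefined, and that `customers.json` contains one JSON object per line with the fields shown in the inferred schema above; the contents of the file itself are not shown in this gist.

```scala
// Paste into spark-shell (Spark 1.4+); sqlContext is provided by the shell.
// Assumption: customers.json holds one JSON object per line, with a nested
// "address" object (city, state, street, zip) plus first_name and last_name.

// Non-deprecated replacement for sqlContext.jsonFile("customers.json")
val customers = sqlContext.read.json("customers.json")

// Print the inferred schema to confirm the nested address struct
customers.printSchema()

// Register the DataFrame so it can be queried with SQL
customers.registerTempTable("customers")

// Same query as above: nested fields are reached with dot notation
val firstCityState = sqlContext.sql(
  "SELECT first_name, address.city, address.state FROM customers")
firstCityState.collect().foreach(println)

// Equivalent projection through the DataFrame API, without SQL text
customers.select("first_name", "address.city", "address.state").show()
```

The SQL and DataFrame API forms should return the same three columns; which one to use is largely a matter of preference.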