@nrshrivatsan
Created September 1, 2015 02:08
A simple hack to load CSV contents into Apache Spark DataFrames
// Start spark-shell with the spark-csv package built for Scala 2.11; see
// https://github.com/databricks/spark-csv#spark-compiled-with-scala-211
// $SPARK_HOME/bin/spark-shell --packages com.databricks:spark-csv_2.11:1.2.0
import org.apache.spark.sql.SQLContext
val sqlContext = new SQLContext(sc)
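// Download the Google stock info CSV from Quandl and save it as YAHOO-GOOG.csv.
// The "header" option makes spark-csv use the first line as column names.
val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").load("YAHOO-GOOG.csv")
// List the column names of the loaded DataFrame
df.columns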
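// A possible next step, not part of the original gist: inspect what was loaded.
// Assumption: no schema options are set beyond "header", so spark-csv is
// expected to read every column as a string.
df.printSchema()
// Show the first 5 rows
df.show(5)
// Count the data rows (the header line is not counted)
df.count()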