@girisandeep
Created June 21, 2017 20:40
A CSV parsing example that uses the opencsv library. It is inefficient because it constructs a new CSVParser for every line.
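// CSV parsing program using the opencsv library.
// Start the shell with: spark-shell --packages net.sf.opencsv:opencsv:2.3
// Or add this to sbt: libraryDependencies += "net.sf.opencsv" % "opencsv" % "2.3"
import au.com.bytecode.opencsv.CSVParser

// Read the file as an RDD of lines (sc is the SparkContext provided by spark-shell).
val a = sc.textFile("/data/spark/temps.csv")

// Parse each line into an array of fields.
// A new CSVParser is created for every line, which is what makes this inefficient.
val p = a.map { line =>
  val parser = new CSVParser(',')
  parser.parseLine(line)
}

p.take(1)
// Array(Array(20, " NYC", " 2014-01-01"))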
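
One common way to avoid the per-line parser cost is to create the parser once per partition rather than once per record. The sketch below is not part of the original gist. `parsePartition` is a hypothetical helper name, and it assumes the same opencsv 2.3 dependency as above.

```scala
import au.com.bytecode.opencsv.CSVParser

// Hypothetical helper (not from the original gist): parse every line of one
// partition with a single CSVParser instance instead of one parser per line.
def parsePartition(lines: Iterator[String]): Iterator[Array[String]] = {
  val parser = new CSVParser(',') // created once per call, i.e. once per partition
  lines.map(line => parser.parseLine(line))
}
```

In spark-shell this could be applied as `sc.textFile("/data/spark/temps.csv").mapPartitions(parsePartition)`. The parser is still built inside the function rather than on the driver, so it never has to be serialized and shipped to the executors.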