@erichgess
Last active August 29, 2015 14:08
Spark Demo
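// First Example:
// cassandraTable and saveToCassandra come from the DataStax connector implicits;
// this import is needed in any shell that does not preload them.
import com.datastax.spark.connector._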
val first_example = sc.cassandraTable("spark_demo", "first_example")
first_example.first
first_example.first.get[Int]("id")
//---
case class FirstExample(Id: Int, Value: Int )
val first_example = sc.cassandraTable[FirstExample]("spark_demo", "first_example")
first_example.first
first_example.take(2)
// Count how many rows share each Value. size works whether groupBy yields a Seq or an Iterable.
first_example.map( x => x.Value).groupBy( x => x ).map( g => (g._1, g._2.size) ).take(3)
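// A local sketch of the same count-by-value step on a plain Scala collection.
// The sample values below are made up for illustration; they are not rows from the table.
val sample_values = Seq(1, 2, 2, 3, 3, 3)
val sample_counts = sample_values.groupBy( x => x ).map( g => (g._1, g._2.size) )
// sample_counts maps 1 -> 1, 2 -> 2, 3 -> 3
// On an RDD, countByValue() is an alternative that returns the counts to the driver as a Map.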
// ######
// Raw Data:
case class RawFileData(Filename: String, LineNumber: Int, LineText: String )
val raw = sc.cassandraTable[RawFileData]("spark_demo", "raw_files" )
// -- read the user data "users.dat"
// File format: UserID::Gender::Age::Occupation::Zip-code
val raw_users = raw.filter( r => r.Filename == "users.dat" )
raw_users.first
case class User(Id: Int, Age: Int, Gender: String, Occupation: Int, Zip: String )
// Field order in the file is UserID, Gender, Age, so Gender is v(1) and Age is v(2).
val users = raw_users.map( l => l.LineText.trim.split("::") ).map( v =>
  User(Id = v(0).toInt, Age = v(2).toInt, Gender = v(1), Occupation = v(3).toInt, Zip = v(4)) )
users.saveToCassandra("spark_demo", "users" )
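
The parsing step above can be tried without a cluster. The following is a minimal sketch in plain Scala, assuming each line follows the `UserID::Gender::Age::Occupation::Zip-code` format noted in the comments. The `parseUser` helper and the sample line are hypothetical, made up for illustration; `User` is redefined only so the block runs on its own.

```scala
import scala.util.Try

// Same shape as the User case class used in the session above.
case class User(Id: Int, Age: Int, Gender: String, Occupation: Int, Zip: String)

// Hypothetical helper: returns None when a line does not have exactly
// five "::"-separated fields or when a numeric field fails to parse.
def parseUser(line: String): Option[User] =
  line.trim.split("::") match {
    case Array(id, gender, age, occupation, zip) =>
      Try(User(Id = id.toInt, Age = age.toInt, Gender = gender,
               Occupation = occupation.toInt, Zip = zip)).toOption
    case _ => None
  }

val sample = "1::F::1::10::48067" // made-up sample line, not taken from the data
println(parseUser(sample))            // prints Some(User(1,1,F,10,48067))
println(parseUser("not a user line")) // prints None
```

In the shell, something like `raw_users.flatMap( l => parseUser(l.LineText) )` should skip malformed lines instead of failing the job, at the cost of dropping them silently.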
@erichgess

These are the shell commands I used during the Kindling presentation at the Chicago Cassandra Meetup.
