@helena
Last active August 29, 2015 14:11
Spark SQL with Cassandra
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.cassandra.CassandraSQLContext
import com.datastax.spark.connector._ // brings saveToCassandra into scope

/**
 * Spark SQL integration with the Spark Cassandra Connector.
 * Uses the Spark SQL simple query parser, so query support is limited.
 * Supports selection, write, and join queries.
 */
object SampleJoin extends App {

  // Target row type for table3; the column types here (all text/String)
  // are assumptions, since the gist does not show the CQL schema.
  case class Data(a: String, b: String, c: String, d: String)

  val conf = new SparkConf(true)
    .set("spark.cassandra.connection.host", "127.0.0.1")
    .setMaster("local[*]")
    .setAppName("app")

  val sc = new SparkContext(conf)
  val cc = new CassandraSQLContext(sc)
  cc.setKeyspace("nosql_join")

  cc.sql("""
      SELECT test1.a, test1.b, test1.c, test2.a
      FROM test1 AS test1
      JOIN test2 AS test2 ON test1.a = test2.a
        AND test1.b = test2.b
        AND test1.c = test2.c
    """)
    .map(row => Data(row.getString(0), row.getString(1), row.getString(2), row.getString(3)))
    .saveToCassandra("nosql_join", "table3")

  sc.stop()
}
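For the join above to run, the `nosql_join` keyspace needs `test1`, `test2`, and the target `table3` to exist. The gist does not include the schema, so the following CQL is a hypothetical sketch: the column names come from the query, but the types (all `text`) and primary keys are assumptions.

```
-- Hypothetical schema: names taken from the query; types and keys assumed.
CREATE KEYSPACE IF NOT EXISTS nosql_join
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

CREATE TABLE IF NOT EXISTS nosql_join.test1 (a text, b text, c text, PRIMARY KEY (a, b, c));
CREATE TABLE IF NOT EXISTS nosql_join.test2 (a text, b text, c text, PRIMARY KEY (a, b, c));

-- table3 receives the four selected columns (test1.a, test1.b, test1.c, test2.a).
CREATE TABLE IF NOT EXISTS nosql_join.table3 (a text, b text, c text, d text, PRIMARY KEY (a, b, c));
```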