@csbond007
Last active October 17, 2016 20:20
bin/spark-shell --conf spark.cassandra.connection.host=127.0.0.1 --packages datastax:spark-cassandra-connector:2.0.0-M2-s_2.11
import com.datastax.spark.connector._
///////////////////////////////////////////////////
//// Cassandra table creation (CQL: run these statements in cqlsh, not in spark-shell)
CREATE KEYSPACE test WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 1 };
CREATE TABLE test.words (word text PRIMARY KEY, count int);
INSERT INTO test.words (word, count) VALUES ('foo', 20);
INSERT INTO test.words (word, count) VALUES ('bar', 20);
///////////////////////////////////////////////////////////////////
val rdd = sc.cassandraTable("test", "words")
Store the first item of the rdd in the firstRow value.
val firstRow = rdd.first
// firstRow: com.datastax.spark.connector.rdd.reader.CassandraRow = CassandraRow{word: bar, count: 20}
Get the number of columns and column names:
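firstRow.columnNames // expected: Stream(word, count); this call does not work with the connector version loaded above (firstRow.metaData.columnNames may be the replacement, unverified)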
firstRow.size // 2
Use one of the getXXX getters to obtain a column value converted to the desired type:
firstRow.getInt("count") // 20
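The getters above work on a generic CassandraRow. A minimal sketch of two related options follows: other typed getters on the same row, and reading the table directly as tuples. It assumes the same spark-shell session (so `sc` exists) and the `test.words` table with the two rows inserted above; the value names `row`, `pairs`, and `counts` are made up for illustration.

```scala
import com.datastax.spark.connector._

// Fetch one generic row, as in the lines above.
val row = sc.cassandraTable("test", "words").first

// Typed getters: getString converts the column to a String,
// getIntOption returns None instead of failing when the column is null.
val word: String = row.getString("word")
val maybeCount: Option[Int] = row.getIntOption("count")

// Read the table as (String, Int) tuples instead of CassandraRow objects.
// select() fixes the column order so it matches the tuple fields.
val pairs = sc.cassandraTable[(String, Int)]("test", "words").select("word", "count")

// Collect to the driver; only sensible for a small table like this one.
val counts: Map[String, Int] = pairs.collect().toMap
```

Reading as tuples avoids calling a getter per column when every row is processed the same way; the generic row is handier for ad hoc inspection in the shell.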
//////////// Vinsent Code //////////////////////
sc.cassandraTable("seerdata", "incidencedata")
val rdd = sc.cassandraTable("seerdata", "incidencedata")
scala> rdd.count
res5: Long = 9176963
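`rdd.count` fetches every row into Spark and tallies them there, which is slow for a table of this size. As a hedged sketch, the connector also provides `cassandraCount()`, which asks Cassandra to do the counting and returns only the partial counts to Spark. It assumes the same session and the `seerdata.incidencedata` table used above; the value names are illustrative.

```scala
import com.datastax.spark.connector._

val incidence = sc.cassandraTable("seerdata", "incidencedata")

// Spark-side count: all rows are transferred to the executors, then counted.
val sparkSideCount: Long = incidence.count

// Cassandra-side count: each token range is counted by the server,
// so far less data crosses the network.
val serverSideCount: Long = incidence.cassandraCount()
```

Both calls should report the same number as long as the table is not being written to between them.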
////////////////////////////////////////////////////////////