After successfully following http://planetcassandra.org/blog/kindling-an-introduction-to-spark-with-cassandra/ with spark-shell, one might reasonably expect the following to work in pyspark, though I may be missing something trivial.
## invoking pyspark as follows: ##
# /path/to/spark-1.2.0-bin-hadoop2.4/bin/pyspark --jars /path/to/spark-1.2.0-bin-hadoop2.4/jars/spark-cassandra-connector-assembly-1.2.0-SNAPSHOT.jar
# first, stop the spark context launched by pyspark to avoid the conflict
sc.stop()
from py4j.java_gateway import java_import
from pyspark import SparkConf, SparkContext
conf = (SparkConf()
        .setMaster("local")
        .setAppName("pyspark_cassandra")
        .set("spark.cassandra.connection.host", "127.0.0.1"))
spark_context = SparkContext(conf=conf)
java_import(spark_context._gateway.jvm, "com.datastax.spark.connector._")
java_import(spark_context._gateway.jvm, "com.datastax.spark.SparkContext")
java_import(spark_context._gateway.jvm, "com.datastax.spark.SparkContext._")
java_import(spark_context._gateway.jvm, "com.datastax.spark.SparkConf")
rdd = spark_context.cassandraTable("spark_test", "test")
## results in ##
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
# AttributeError: 'SparkContext' object has no attribute 'cassandraTable'

tigrus commented Mar 9, 2016

Have you found a solution?

tigrus commented Mar 9, 2016

What worked for me was `from pyspark_cassandra import CassandraSparkContext` and replacing `SparkContext` with `CassandraSparkContext`.
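
A minimal sketch of that change, assuming the third-party pyspark-cassandra package (and its jar) is already available to the driver; how to load it depends on the package version and is not shown here. The helper `build_context` and the `CASSANDRA_SETTINGS` dict are hypothetical names used only for illustration; the host, app name, keyspace, and table are the ones from the gist.

```python
# Connection settings copied from the gist; adjust the host for your cluster.
CASSANDRA_SETTINGS = {
    "spark.cassandra.connection.host": "127.0.0.1",
}


def build_context(master="local", app_name="pyspark_cassandra"):
    """Hypothetical helper: build a context that exposes cassandraTable()."""
    # Imported lazily so this file can be loaded without Spark installed.
    from pyspark import SparkConf
    # Third-party package; assumed to be importable on the driver.
    from pyspark_cassandra import CassandraSparkContext

    conf = SparkConf().setMaster(master).setAppName(app_name)
    for key, value in CASSANDRA_SETTINGS.items():
        conf = conf.set(key, value)
    # Used in place of pyspark's plain SparkContext, which has no
    # cassandraTable() method on the Python side.
    return CassandraSparkContext(conf=conf)
```

In the pyspark shell, after `sc.stop()`, something like `build_context().cassandraTable("spark_test", "test")` should then give back an RDD instead of raising the AttributeError above. The `java_import` calls in the gist only change what is visible through the Py4J gateway, so they do not add methods to the Python `SparkContext` class.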
