Peter Corless (PeterCorless)

@PeterCorless
PeterCorless / DataFrameWriter.scala
Last active October 8, 2018 16:26
Hooking up Spark and ScyllaDB: Part 3
val writer = df.write.cassandraFormat(table = "test", keyspace = "test")
// writer: org.apache.spark.sql.DataFrameWriter[org.apache.spark.sql.Row] = org.apache.spark.sql.DataFrameWriter@6cf47d05
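For context, a writer created this way would typically be finished off with a save mode and a call to save(); a minimal sketch, assuming the test.test table already exists in Scylla:

// Append the DataFrame's rows to the existing test.test table.
writer.mode(org.apache.spark.sql.SaveMode.Append).save()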
@PeterCorless
PeterCorless / IEX-batch-quote.json
Last active October 22, 2018 23:07
Hooking up Spark and ScyllaDB: Part 4
{
  "AAPL": {
    "quote": {
      "latestPrice": 221.43,
      "latestSource": "IEX real time price",
      "latestUpdate": 1537455071032,
      "latestVolume": 18919004,
      "previousClose": 218.37,
      "symbol": "AAPL"
    }
  }
}
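A hypothetical sketch of pulling such a batch-quote document into Spark for inspection (the SparkSession named spark and the local file name iex-batch-quote.json are assumptions, not part of the original gist):

// Load the JSON payload into a DataFrame and project one symbol's quote fields.
import org.apache.spark.sql.functions.col
val quotes = spark.read.option("multiLine", true).json("iex-batch-quote.json")
quotes.select(col("AAPL.quote.symbol"), col("AAPL.quote.latestPrice")).show()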
@PeterCorless
PeterCorless / 01-tar-confluent.sh
Last active May 16, 2019 21:29
scylla-kafka-akka-mqtt-blog
tar -xzf confluent-5.0.0-2.11.tar.gz
cd confluent-5.0.0/
@PeterCorless
PeterCorless / 1-streaming-populate-node.yaml
Last active January 22, 2019 17:03
Improved Performance in Scylla Open Source 3.0: Streaming and Hinted Handoffs
# Populate the cluster in four cassandra-stress runs, each writing its own
# 175M-row slice of the key range (i = 0..3), as in the original pseudocode.
n=175000000
for i in 0 1 2 3; do
  cassandra-stress write no-warmup n=$n cl=ALL \
    -rate threads=500 \
    -schema 'replication(factor=2) keyspace=keyspace' \
    -col 'size=FIXED(4000) n=FIXED(1)' \
    -mode cql3 native connectionsPerHost=66 \
    -pop seq=$((i*n+1))..$(((i+1)*n)) \
    -node node2 \
    -errors ignore
done
@PeterCorless
PeterCorless / 1-deploying-spark.sh
Created February 7, 2019 09:39
Scylla Migrator
# On one node, start the master:
spark-2.4.0-bin-hadoop2.7 $ ./sbin/start-master.sh
# On the same node, and on the other nodes, start the Spark workers:
spark-2.4.0-bin-hadoop2.7 $ SPARK_WORKER_INSTANCES=8 SPARK_WORKER_CORES=2 ./sbin/start-slave.sh spark://<spark-master-ip>:7077
@PeterCorless
PeterCorless / 01-hello-world.py
Last active April 8, 2020 15:48
Scylla's Portable Python Interpreter, or Snakes on a Data Plane
#!/usr/bin/python3
import yaml
a = yaml.safe_load("string: hello world!")  # safe_load is preferred; a bare load() warns in PyYAML >= 5.1
print(a['string'])
cqlsh> desc SCHEMA

CREATE KEYSPACE catalog WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;

CREATE TABLE catalog.apparel (
    sku text,
    color text,
    size text,
    brand text,
    gender text,
    group text,
val connector =
  new CassandraConnector(CassandraConnectorConf(sparkContext.getConf))
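A connector obtained like this is typically used to run CQL directly against the cluster via withSessionDo; a minimal sketch (the catalog.apparel table comes from the schema above, and the query itself is only illustrative):

// Borrow a session from the connector's pool, execute one statement, and release it.
connector.withSessionDo { session =>
  session.execute("SELECT count(*) FROM catalog.apparel")
}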
@PeterCorless
PeterCorless / 01-tar-confluent.sh
Last active May 16, 2019 23:45
Scylla + Confluent IoT Examples - Updated 16 May 2019
$ tar -xzf confluent-5.2.0-2.12.tar.gz
$ cd confluent-5.2.0/
@PeterCorless
PeterCorless / 01-create-table.cql
Last active September 3, 2019 20:07
cache-antipatterns
CREATE TABLE ks.tbl (
    uuid int,
    time timestamp,
    property text,
    PRIMARY KEY (uuid, time));