John jmichaels

// To view the default settings, hold "alt" while clicking on the "Settings" button.
// For documentation on these settings, see: https://aka.ms/terminal-documentation
{
"$schema": "https://aka.ms/terminal-profiles-schema",
"defaultProfile": "{61c54bbd-c2c6-5271-96e7-009a87ff44bf}",
"profiles": {
@jmichaels
jmichaels / install_spark_and_jupyter.sh
Last active October 2, 2019 12:41
Set up Spark and Jupyter Notebooks on Ubuntu
# Install Java 8 (open JDK)
sudo add-apt-repository ppa:openjdk-r/ppa
sudo apt-get update
sudo apt-get install openjdk-8-jre
sudo update-java-alternatives --list
# Check the version
java -version
# If the current version is not 1.8:
@jmichaels
jmichaels / spark_streaming_kafka_in_kafka_out_example.scala
Last active February 22, 2019 03:56
Spark Streaming DStream Example - Consuming and Producing Kafka Messages
// Data from:
//
// https://catalog.data.gov/dataset/metro-bike-share-trip-data
// https://bikeshare.metro.net/about/data/
//
// Looks like:
//
// Trip ID,Duration,Start Time,End Time,Starting Station ID,Starting Station Latitude,Starting Station Longitude,Ending Station ID,Ending Station Latitude,Ending Station Longitude,Bike ID,Plan Duration,Trip Route Category,Passholder Type,Starting Lat-Long,Ending Lat-Long
// 1912818,180,07/07/2016 04:17:00 AM,07/07/2016 04:20:00 AM,3014,34.0566101,-118.23721,3014,34.0566101,-118.23721,6281,30,Round Trip,Monthly Pass,"(34.0566101, -118.23721)","(34.0566101, -118.23721)"
// 1919661,1980,07/07/2016 06:00:00 AM,07/07/2016 06:33:00 AM,3014,34.0566101,-118.23721,3014,34.0566101,-118.23721,6281,30,Round Trip,Monthly Pass,"(34.0566101, -118.23721)","(34.0566101, -118.23721)"
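//
// A minimal sketch (not part of the original gist) of splitting one of these
// lines into its 16 fields. The two Lat-Long columns are quoted and contain
// commas, so a plain split(",") would break them apart. `TripCsv` and
// `splitCsvLine` are hypothetical names used only for illustration.
object TripCsv {
  // Matches a comma only when an even number of double quotes follows it,
  // i.e. a comma that sits outside any quoted field.
  private val commaOutsideQuotes = ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)"

  // Splits a line, keeps trailing empty fields (limit -1), and strips the
  // surrounding quotes from quoted fields.
  def splitCsvLine(line: String): Array[String] =
    line
      .split(commaOutsideQuotes, -1)
      .map(_.trim.stripPrefix("\"").stripSuffix("\""))
}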
@jmichaels
jmichaels / SparkVehicleDataApplication.scala
Created February 21, 2019 03:24
Example Spark RDD Application - Vehicle Data
// Sensor data from delivery trucks or autonomous vehicles
//
// Lines look like this:
//
// 123456,-59.42801,88.03186,1550708357,50.2
//
// vehicle_id | latitude | longitude | timestamp | speed
// 123456 | -59.42801 | 88.03186 | 1550708357 | 50.2
import org.apache.spark.rdd.RDD
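import scala.util.Try

// A minimal sketch (not part of the original gist) of turning one line in the
// format above into a typed record. `VehicleReading` and `parseLine` are
// hypothetical names; the field types (Long id, epoch-second Long timestamp,
// Double speed) are assumptions read off the sample line.
case class VehicleReading(
  vehicleId: Long,
  latitude: Double,
  longitude: Double,
  timestamp: Long,
  speed: Double
)

object VehicleReading {
  // Returns None for lines with the wrong field count or unparseable numbers.
  def parseLine(line: String): Option[VehicleReading] =
    line.split(",").map(_.trim) match {
      case Array(id, lat, lon, ts, speed) =>
        Try(VehicleReading(id.toLong, lat.toDouble, lon.toDouble, ts.toLong, speed.toDouble)).toOption
      case _ => None
    }
}

// With an RDD[String] named `lines` (assumed), malformed rows could be dropped with:
//   val readings: RDD[VehicleReading] = lines.flatMap(l => VehicleReading.parseLine(l))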