import cloudflow.sbt.CloudflowKeys.{cloudflowDockerImageName, cloudflowDockerRegistry, cloudflowDockerRepository}
import cloudflow.sbt.ImagePlugin
import sbt.{AutoPlugin, Def, taskKey}
trait Key {
  val cloudflowImageName = taskKey[String]("The name of the Docker image to publish.")
}
object ImageNamePlugin extends AutoPlugin {
  override def requires = ImagePlugin
maasg / rampupRateSource.snb.ipynb
Last active May 2, 2018 19:09
Comparison of Rate Source implementation PR
maasg / test-execution.md
Last active April 22, 2018 13:05
Spark Streaming 2.2.0 Earliest offset reset

Earliest Offset Reset Test Scenario

Env: Spark 2.2.0 using Kafka integration 0.10

./spark-shell --packages org.apache.spark:spark-streaming-kafka-0-10_2.11:2.2.0

Welcome to
      ____              __
     / __/__  ___ _____/ /__

Keybase proof

I hereby claim:

  • I am maasg on github.
  • I am maasg (https://keybase.io/maasg) on keybase.
  • I have a public key ASChpDgzfLIUh_ndWgJ1dBXEyYCfN9PkTdBVlYmqx6GUYgo

To claim this, I am signing this object:

maasg / SimpleWordTCPServer.scala
Last active August 15, 2017 19:39
Simple TCP server delivering random words from a predefined dictionary
package so
import java.io.PrintStream
import java.net.Socket
import java.net._
import scala.concurrent.Future
class SocketHandler(socket: Socket) {
  def deliver(data: Iterator[String]): Unit = {
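
The preview above is cut off before the body of `deliver`. As a rough, self-contained sketch of the idea in the gist description (a TCP server that sends random words from a fixed dictionary), something like the following could work. This is not the gist's actual code: the object name `WordServerSketch`, the dictionary contents, and the `randomWords` and `serve` helpers are all assumptions made for illustration.

```scala
import java.io.PrintStream
import java.net.{ServerSocket, Socket}
import scala.concurrent.{ExecutionContext, Future}
import scala.util.Random

object WordServerSketch {
  // Hypothetical dictionary; the gist's real word list is not visible in the preview.
  val dictionary: Vector[String] = Vector("spark", "kafka", "stream", "scala")

  // An endless iterator of words picked at random from the dictionary.
  def randomWords(rnd: Random): Iterator[String] =
    Iterator.continually(dictionary(rnd.nextInt(dictionary.size)))

  // Writes one word per line until the data runs out or a write fails
  // (PrintStream.checkError flushes and reports earlier I/O errors).
  def deliver(socket: Socket, data: Iterator[String]): Unit = {
    val out = new PrintStream(socket.getOutputStream, true)
    try data.takeWhile(_ => !out.checkError()).foreach(word => out.println(word))
    finally socket.close()
  }

  // Accepts clients forever and serves each one on the given execution context.
  def serve(port: Int)(implicit ec: ExecutionContext): Unit = {
    val server = new ServerSocket(port)
    try {
      while (true) {
        val client = server.accept()
        Future(deliver(client, randomWords(new Random())))
      }
    } finally server.close()
  }
}
```

To try it, one could call `WordServerSketch.serve(9999)(ExecutionContext.global)` (the port is arbitrary) and connect with `nc localhost 9999`. Each client occupies a thread for as long as it stays connected, so a dedicated thread pool may be a better fit than the global execution context when more than a few clients are expected.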
maasg / UniqueGlobalStateChains.snb.ipynb
Created July 27, 2017 14:26
Build path for different events and assign globalID
maasg / StreamingFileTest.scala
Created May 30, 2017 16:02
Minimalistic SparkStreaming-FileStream project
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
import org.apache.spark.sql._
import org.apache.spark.sql.types.{StringType, StructField, StructType}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.{SparkConf, SparkContext}
import scala.util.Try
import java.io.File
maasg / streaming-files.md
Created May 30, 2017 07:38
Spark Streaming job that uses a file stream
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
import org.apache.spark.sql._
import org.apache.spark.sql.types.{StringType, StructField, StructType}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.{SparkConf, SparkContext}
import scala.util.Try
maasg / unique_match_count.ipynb
Created January 20, 2017 12:41
Calculate the count of unique matching elements between two dataframes
maasg / so-data-exploration.snb.ipynb
Last active December 19, 2016 20:15
Data Exploration.