AccumulatorListener.scala

Accumulator Examples
import scala.collection.mutable.Map
import org.apache.spark.{Accumulator, AccumulatorParam, SparkContext}
import org.apache.spark.scheduler.{SparkListenerStageCompleted, SparkListener}
import org.apache.spark.SparkContext._
/**
 * just print out the values for all accumulators from the stage.
 * you will only get updates from *named* accumulators, though
 */
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration.{Duration, FiniteDuration}
import scala.util.{Failure, Success}
def performSequentially[A](items: Seq[A])(f: A => Future[Unit]): Future[Unit] = {
  items.headOption match {
    case Some(nextItem) =>
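The preview above is cut off after the first case, so the original body is not shown. As a minimal sketch of one way to run a `Future`-returning function over a sequence one item at a time, assuming an implicit `ExecutionContext` parameter (not part of the signature above) and `flatMap` chaining rather than whatever the original used, with the wrapper object `SequentialFutures` being a hypothetical name:

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object SequentialFutures {
  // Sketch only: call f on the head, and start the rest of the sequence
  // only after the Future returned for the head has completed.
  // The implicit ExecutionContext is an assumption added for this example.
  def performSequentially[A](items: Seq[A])(f: A => Future[Unit])
                            (implicit ec: ExecutionContext): Future[Unit] = {
    items.headOption match {
      case Some(nextItem) =>
        f(nextItem).flatMap(_ => performSequentially(items.tail)(f))
      case None =>
        // nothing left to do
        Future.successful(())
    }
  }
}
```

If any call to `f` fails, the returned `Future` fails with that error and the remaining items are never started.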
test-time-parse.scala

parsing the scalatest timing output from sbt
case class TestTime(suite: String, test: String, time: Int)
def parse(input: Iterator[String]): Seq[TestTime] = {
  val TimingPattern = """(.*)\(((\d+) seconds?, )?(.*) milliseconds?\)""".r
  // keep only sbt "[info]" lines, skipping ignored tests and "+" lines, then strip the prefix
  val x = input
    .filter { line =>
      line.startsWith("[info]") && !line.contains("!!! IGNORED !!!") && !line.startsWith("[info] +")
    }
    .map { _.substring("[info] ".length) }
  var suiteName: String = null
  var result = IndexedSeq[TestTime]()
  while (x.hasNext) {
    val line =
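The loop body is truncated in this preview. To show how `TimingPattern` behaves on a single line, here is a small sketch. The helper `parseLine`, the wrapper object `TimingPatternDemo`, and the choice to report the time as total milliseconds are assumptions made for illustration, not taken from the original.

```scala
object TimingPatternDemo {
  // Same regex as in the gist: group 1 is the test name, group 3 the optional
  // seconds, group 4 the milliseconds.
  val TimingPattern = """(.*)\(((\d+) seconds?, )?(.*) milliseconds?\)""".r

  // Hypothetical helper: returns (test name, total milliseconds) when the whole
  // line matches the pattern, None otherwise.
  def parseLine(line: String): Option[(String, Int)] = line match {
    case TimingPattern(name, _, seconds, millis) =>
      // an unmatched optional group comes back as null, so wrap it in Option
      val secs = Option(seconds).map(_.toInt).getOrElse(0)
      Some((name.trim, secs * 1000 + millis.toInt))
    case _ => None
  }
}
```

Under these assumptions, `parseLine("- parses (1 second, 234 milliseconds)")` gives `Some(("- parses", 1234))`, and a suite header such as `MySuite:` gives `None`.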
logfile screenlogs/log.%n
deflog on
defscrollback 10000
screen 1
#other settings
hardstatus alwayslastline
hardstatus string '%{= kG}[ %{G}%H %{g}][%= %{= kw}%?%-Lw%?%{r}(%{W}%n*%f%t%?(%u)%?%{r})%{w}%?%+Lw%?%?%= %{g}][%{B} %d/%m %{W}%c %{g}]'
vbell off
autodetach on
0_reuse_code.js

Here are some things you can do with Gists in GistBox.
// Use Gists to store code you would like to remember later on
console.log(window); // log the "window" object to the console
Build.scala

avoid sbt assemblies
// define your projects, with a custom "settings"
lazy val common = Project(
  id = "common",
  base = file("common"),
  settings = baseSettings ++ Seq(
    libraryDependencies ++= commonLibraryDependencies
mylib.R

searchable source dirs for R (aka, a "classpath" for R)
## this should get loaded by your ~/.Rprofile
## either just stick all these definitions right in ~/.Rprofile, or
## have ~/.Rprofile source this file, etc.
source.dirs <- c(
  ## add any "base" locations here. this is like a "classpath" in java
  ## my default is c("~/myRUtils/src","~/companyRUtils/src","~/publicRUtils"),
  ## but you can use whatever you like
  paste(Sys.getenv("HOME"),"myRUtils","src", sep="/"),
  paste(Sys.getenv("HOME"),"companyRUtils","src", sep="/"),
binomial_confidence.R

Some hacky R code to explore confidence interval estimation for binomial rates, and the difference between two binomial rates.
alpha <- 0.8
z <- 1.28
normal.theta.conf <- function(x, n) {
  # normal approximation to a binomial. Generally agreed to be horrible
  phat <- x / n
  t <- z * sqrt(phat * (1 - phat)) / sqrt(n)
  c(max(0, phat - t), min(1, phat + t))
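Written out, the interval that `normal.theta.conf` computes is the normal-approximation (Wald) interval, clipped to [0, 1]. Here x is the number of successes, n the number of trials, and z the normal quantile. Reading `alpha` as a two-sided confidence level of 0.8 is an assumption, but it is consistent with z = 1.28, which is roughly the 90th percentile of the standard normal.

```latex
\hat{p} = \frac{x}{n}, \qquad
t = z \sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}, \qquad
\text{CI} = \bigl[\max(0,\ \hat{p} - t),\ \min(1,\ \hat{p} + t)\bigr]
```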
➜ tmp javap -c -private TaggedTypes$
Compiled from "TaggedTypes.scala"
public final class TaggedTypes$ {
public static final TaggedTypes$ MODULE$;
public static {};
0: new #2 // class TaggedTypes$
3: invokespecial #12 // Method "<init>":()V
6: return


scala enums inferior

Every so often, somebody asks why I still use Java enums instead of scala "enums". To avoid having to search for the links every time:

  1. What is a scala "enum"? Scala itself can't decide. This Stack Overflow question mostly summarizes the alternatives, and why each one is incomplete.

If you really take some time to understand those answers, you'll see people have invested quite a bit of effort to try and mimic the behavior of Java enums, and it's still lacking. If you want to see even more craziness:

  2. On top of that, the syntax for all scala versions is weird ... I need to type more to get less
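For readers who have not seen the alternatives side by side, here is a minimal sketch of two commonly used ones: `scala.Enumeration`, and a sealed trait with case objects. The names `Color`, `Shape`, and `EnumAlternatives` are hypothetical, and this is not necessarily the same set of alternatives the linked answers cover.

```scala
object EnumAlternatives {
  // Alternative 1: scala.Enumeration. You get values and withName for free,
  // but every member has the same type (Value) and the compiler does not
  // check pattern matches for exhaustiveness.
  object Color extends Enumeration {
    val Red, Green, Blue = Value
  }

  // Alternative 2: sealed trait + case objects. Matches are checked for
  // exhaustiveness, but there is no built-in list of values or lookup by
  // name, so both are written by hand here.
  sealed trait Shape
  object Shape {
    case object Circle extends Shape
    case object Square extends Shape
    val values: Seq[Shape] = Seq(Circle, Square)
    def withName(name: String): Option[Shape] = values.find(_.toString == name)
  }
}
```

A Java enum gives you the member list, name lookup, and a distinct type per enum in a single declaration, which is the "type more, to get less" comparison.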