
j14159 /
Created February 17, 2020 22:20
Working out `assert_match` that I've been missing from Erlang.
open Ppxlib

exception No_match

let assert_match_ext =
  Ast_pattern.(ppat __ __)
    (fun ~loc ~path:_ patt guard ->
j14159 / main.go
package main

type mesa struct {
	height int
	width  int
}

type position struct {
	x int
	y int
}
j14159 / gist:aa869c3d04cac59e567f
Created March 18, 2016 02:24
Progress on list type inferencing
list_test_() ->
    [?_assertMatch({{t_list, t_float}, _},
                   top_typ_of("1.0 : []")),
     ?_assertMatch({{t_list, t_int}, _},
                   top_typ_of("1 : 2 : []")),
     ?_assertMatch({error, _}, top_typ_of("1 : 2.0 : []")),
     ?_assertMatch({{t_arrow,
                     [{unbound, A, _}, {t_list, {unbound, A, _}}],
                     {t_list, {unbound, A, _}}}, _},
                   top_typ_of("f x y = x : y"))].
j14159 / gist:19d100a556effacd1475

I'm putting this list together as a reading plan for myself, to learn more about general cluster scheduling/utilization and the various ways of programming against clusters generically. Direct links to PDFs are listed here in the order I think makes sense after skimming their reference sections.

Happy to hear of any additions that might be sensible.

The Basics

  1. Google File System since everything references it and data locality is a thing.
  2. Google MapReduce because it's one of the earlier well-known functional approaches to programming against a cluster.
  3. Dryad for a more general (iterative?) programming model.
  4. Quincy for a different take on scheduling.
  5. [Delay Scheduling](h
j14159 / gist:404f1dc86aeafff53a12
Created September 17, 2014 18:57
Updated S3N RDD
/**
 * A more recent version of my S3N RDD. This exists because I needed
 * a reliable way to distribute the fetching of S3 data using instance
 * credentials as well as a simple way to filter out the inputs that
 * I didn't want in the RDD.
 * This version is more eager than the last one and also provides a
 * simple RDD that allows you to tag each line with information about
 * its partition/source.
 */
j14159 / gist:88a91fc9b5e926d86f86
Created September 4, 2014 21:18
mesos-worker startup attempt
REGION="`curl | sed 's/[a-zA-Z]$//'`"
ZK_HOSTS="`aws ec2 describe-tags --filters "Name=resource-id,Values=$INSTANCE_ID" "Name=key,Values=mesos-master" --region $REGION --output=text | cut -f5`"
ulimit -n 200000
LD_LIBRARY_PATH=/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/amd64/jamvm nohup /usr/local/sbin/mesos-slave --master=${ZK_HOSTS} --ip= --isolation=cgroups --no-switch_user --log_dir=/var/log/mesos &
j14159 / gist:79631a8beab70f83f1cf
(require 'package)
(add-to-list 'package-archives
             '("marmalade" . ""))
(add-to-list 'package-archives
             '("melpa" . "") t)
(setq-default indent-tabs-mode nil)
(define-key global-map (kbd "C-c SPC") 'ace-jump-mode)
(define-key global-map (kbd "C-x g") 'magit-status)
j14159 / gist:dce718012e971b624236
Created August 12, 2014 20:19
Adapted a couple of encrypted ephemeral disk examples for simple temp storage on mesos-worker nodes (e.g. with Spark)
# WARNING: This will wipe and encrypt the device given. For Mesos workers,
# this is run on EVERY BOOT so you will constantly lose existing data.
# I have based this script on the following links:
j14159 / gist:d5355107ebb0aad59930
val pool = new BoneCP(config)
val getConn = () => pool.getConnection()
val relConn = (c: Connection) => pool.releaseConnection(c)
class MyActor(get: () => Connection, rel: Connection => Unit) extends Actor {
lazy val c = get()
override def preRestart(why: Throwable, msg: Option[Any]): Unit = {
j14159 / gist:ca191b61a73382316f9c
import akka.actor.ActorRef
import akka.util.Timeout
import scala.concurrent.Future
import scala.concurrent.duration._

trait PersonClient {
// supply a router with a pool of PersonDao:
val personPool: ActorRef
// how long should we wait for a response from PersonDao:
val timeoutInMillis: Long
implicit val timeout = Timeout(timeoutInMillis millis)
def addPerson(p: Person): Future[Int] =