Skip to content

Instantly share code, notes, and snippets.

View krishnanraman's full-sized avatar

Krishnan Raman krishnanraman

View GitHub Profile
@pchiusano
pchiusano / GADTs.scala
Created November 16, 2011 04:30
GADT support in Scala
/** GADTs in Scala and their limitations */
/** Background: what is an algebraic data type (ADT) ?
* ADT: (possibly) recursive datatype with sums and products
* In scala - a trait with case classes (case class is product, subtyping is sum)
*/
/** Motivation: untyped embedded DSL doesn't prevent nonsensical expressions */
sealed trait Expr {
def apply(other: Expr) = Ap(this, other)
@rui314
rui314 / gist:2018964
Created March 12, 2012 00:51
N queen
int print_board(int board[][8]) {
for (int i = 0; i < 8; i++) {
for (int j = 0; j < 8; j++) {
if (board[i][j])
printf("Q ");
else
printf(". ");
}
printf("\n");
}
@gclaramunt
gclaramunt / Ids.scala
Created September 9, 2012 03:05
Tagged Ids using Phantom types
trait Entity
trait User extends Entity
trait Product extends Entity
case class Id[T<:Entity](id:String)
def buy(pId:Id[Product],uId:Id[User])="Bought product %s for user %s".format(pId.id,uId.id)
val pId=new Id[Product]("1")
@tonymorris
tonymorris / FList.scala
Last active December 10, 2015 02:18
foldMap is equivalent to scala.List
trait Semigroup[A] {
def op(a1: A, a2: A): A
}
trait Monoid[A] extends Semigroup[A] {
def id: A
}
object Monoid {
implicit def ListMonoid[A]: Monoid[List[A]] =
@johnynek
johnynek / heterfn.scala
Created June 18, 2013 04:29
An example showing how to make a function that takes one of many input types.
// Taken from: http://jnordenberg.blogspot.com/2008/08/hlist-in-scala.html
object HListTest {
sealed trait HList
final class HNil extends HList {
def ::[T](v : T) = HCons(v, this)
}
val HNil = new HNil()
class WriteOnce[T] {
private val ref = new java.util.concurrent.atomic.AtomicReference[Option[T]](None)
def init(t: T) { if(!ref.compareAndSet(None, Some(t))) sys.error("Already written"); }
def get: Option[T] = ref.get
}
scala> val w = new WriteOnce[Int]
w: WriteOnce[Int] = WriteOnce@215d55cb
scala> w.get
@johnynek
johnynek / WeakRefMonad.scala
Created December 10, 2013 23:53
Weak Reference is a monad (h/t https://twitter.com/J_ )
import java.lang.ref.WeakReference
/** An option-like type that doesn't keep
* the pointed item in scope
*/
trait WeakRef[+T] {
// If the reference is still live, return it, else None
def toOption: Option[T]
// If either reference is dead, return Empty, else give the live ref
@azymnis
azymnis / ItemSimilarity.scala
Created December 13, 2013 05:17
Approximate item similarity using LSH in Scalding.
import com.twitter.scalding._
import com.twitter.algebird.{ MinHasher, MinHasher32, MinHashSignature }
/**
* Computes similar items (with a string itemId), based on approximate
* Jaccard similarity, using LSH.
*
* Assumes an input data TSV file of the following format:
*
* itemId userId
@tonymorris
tonymorris / scalaz.scala
Created January 5, 2014 01:16
Questions about \/ and Validation
// Questions about \/ and Validation
object scalaz {
/*
This explanation might help. This is a compilable source file with revisions
available at https://gist.github.com/tonymorris/8263051
Fact: All monads are applicative functors. As has been seen we can witness the
`Applicative` that arises from the `Monad` primitives. Let's illustrate this:
@johnynek
johnynek / gist:8961994
Last active August 29, 2015 13:56
Some Questions with Sketch Monoids

Unifying Sketch Monoids

As I discussed in Algebra for Analytics, many sketch monoids, such as Bloom filters, HyperLogLog, and Count-min sketch, can be described as a hashing (projection) of items into a sparse space, then using two different commutative monoids to read and write respectively. Finally, the read monoids always have the property that (a + b) <= a, b and the write monoids has the property that (a + b) >= a, b.

##Some questions:

  1. Note how similar CMS and Bloom filters are. The difference: bloom hashes k times onto the same space, CMS hashes k times onto a k orthogonal subspaces. Why the difference? Imagine a fixed space bloom that hashes onto k orthogonal spaces, or an overlapping CMS that hashes onto k * m length space. How do the error asymptotics change?
  2. CMS has many query modes (dot product, etc...) can those generalize to other sketchs (HLL, Bloom)?
  3. What other sketch or non-sketch algorithms can be expressed in this dual mo