Skip to content

Instantly share code, notes, and snippets.

View krishnanraman's full-sized avatar

Krishnan Raman krishnanraman

View GitHub Profile
import csv
import math
import matplotlib.pyplot as plt
import locale
def frac(x):
return x - math.floor(x)
def first_digit(x):
return int(math.floor(math.pow(10, frac(math.log10(x))) + 0.0001))
library(mgcv)
library(ggplot2)
library(dplyr)
library(XML)
library(weatherData)
us.airports.url <- 'http://www.world-airport-codes.com/us-top-40-airports.html'
us.airports <- readHTMLTable(us.airports.url)[[1]] %>%
filter(!is.na(IATA)) %>%
@johnynek
johnynek / AliceInAggregatorLand.scala
Last active January 24, 2024 19:38
A REPL Example of using Aggregators in scala
/**
* To get started:
* git clone https://github.com/twitter/algebird
* cd algebird
* ./sbt algebird-core/console
*/
/**
* Let's get some data. Here is Alice in Wonderland, line by line
*/
@johnynek
johnynek / scalding_alice.scala
Created July 18, 2014 17:15
Learn Scalding with Alice
/**
git clone https://github.com/twitter/scalding.git
cd scalding
./sbt scalding-repl/console
*/
import scala.io.Source
val alice = Source.fromURL("http://www.gutenberg.org/files/11/11.txt").getLines
// Add the line numbers, which we might want later
val aliceLineNum = alice.zipWithIndex.toList
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US">
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8"/>
<link rel="stylesheet" href="/stylesheets/export.css"/>
<link rel="shortcut icon" type="image/x-icon" href="/favicon.ico"/>
<link rel="canonical" href="http://www.stackprinter.com/export?service=programmers.stackexchange&amp;question=51245&amp;printer=false&amp;linktohome=true"/>
<title>What kind of things are easy in Haskell and hard in Scala, and vice versa?</title>
<script type="text/javascript" src="/javascripts/jquery-1.4.2.min.js"></script>
<script type="text/javascript" src="/javascripts/main.js"></script>
import numpy as np
from matplotlib import pylab as plt
#from mpltools import style # uncomment for prettier plots
#style.use(['ggplot'])
# generate all bernoulli rewards ahead of time
def generate_bernoulli_bandit_data(num_samples,K):
CTRs_that_generated_data = np.tile(np.random.rand(K),(num_samples,1))
true_rewards = np.random.rand(num_samples,K) < CTRs_that_generated_data
return true_rewards,CTRs_that_generated_data
@danclien
danclien / scalaz_console_io_free_monad.scala
Last active August 29, 2015 13:57
Attempt to write a monad for console IO using scalaz's Free monad
:paste
import scalaz._, Scalaz._, scalaz.Free.{Suspend, Return}
// Console grammar
sealed trait ConsoleF[+A]
object Console {
case class WriteLine[A](msg: String, o: A) extends ConsoleF[A]
case class ReadLine[A](o: String => A) extends ConsoleF[A]
}
@johnynek
johnynek / sized_list.scala
Last active July 3, 2021 17:54
Simple example of how to write a linked-list in scala that knows its length at compile-time. This allows you to write a zip method that is always exact.
// I was reading through these examples: http://apocalisp.wordpress.com/2010/06/08/type-level-programming-in-scala/
// and I thought it would be nice to more quickly get to something useful, to show the power of the techniques.
object SizedListExample {
// Type of all Non-negative integers
sealed trait Nat
// This is zero.
sealed trait _0 extends Nat
// Successor to some non-negative number
sealed trait Succ[N <: Nat] extends Nat
List(1,2,3,4,5).foldRight(List.empty[Int])(_ :: _)
//res1: List[Int] = List(1, 2, 3, 4, 5)
/*
Passing in the type constructors for an abstract data type to its fold method
is the identity function.
List is made up of `Nil` and `Cons` objects.
*/
@johnynek
johnynek / gist:8961994
Last active August 29, 2015 13:56
Some Questions with Sketch Monoids

Unifying Sketch Monoids

As I discussed in Algebra for Analytics, many sketch monoids, such as Bloom filters, HyperLogLog, and Count-min sketch, can be described as a hashing (projection) of items into a sparse space, then using two different commutative monoids to read and write respectively. Finally, the read monoids always have the property that (a + b) <= a, b and the write monoids has the property that (a + b) >= a, b.

##Some questions:

  1. Note how similar CMS and Bloom filters are. The difference: bloom hashes k times onto the same space, CMS hashes k times onto a k orthogonal subspaces. Why the difference? Imagine a fixed space bloom that hashes onto k orthogonal spaces, or an overlapping CMS that hashes onto k * m length space. How do the error asymptotics change?
  2. CMS has many query modes (dot product, etc...) can those generalize to other sketchs (HLL, Bloom)?
  3. What other sketch or non-sketch algorithms can be expressed in this dual mo