Durran Jordan durran

@pbailis
pbailis / list.md
Last active April 15, 2018 08:54
Quick and dirty (incomplete) list of interesting, mostly recent data warehousing/"big data" papers

A friend asked me for a few pointers to interesting, mostly recent papers on data warehousing and "big data" database systems, with an eye towards real-world deployments. I figured I'd share the list. It's biased and rather incomplete but maybe of interest to someone. While many are obvious choices (I've omitted several, like MapReduce), I think there are a few underappreciated gems.

### Dataflow Engines:

Dryad--general-purpose distributed parallel dataflow engine
http://research.microsoft.com/en-us/projects/dryad/eurosys07.pdf

Spark--in-memory dataflow
http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf

@durran
durran / moped.txt
Created February 16, 2012 10:59
First run perf numbers, Moped.
##################################################################
# ruby 1.9.2p290 (2011-07-09 revision 32553) [x86_64-darwin11.0.0]
# 10gen: mongo-1.5.2
# bson-1.5.2 (BSON::BSON_C)
##################################################################
                                                    user    system     total        real
10gen: insert 10,000 blank documents            0.670000  0.060000  0.730000 (  0.744400)
10gen: insert 10,000 blank documents safe mode  1.200000  0.140000  1.340000 (  1.800714)
10gen: insert 1,000 normal documents            0.090000  0.010000  0.100000 (  0.091035)
@tomafro
tomafro / instrumentation.rb
Created February 18, 2011 09:13
Experimental Mongo Instrumentation for Rails 3
module Mongo
  module Instrumentation
    def self.instrument(clazz, *methods)
      clazz.module_eval do
        methods.each do |m|
          class_eval %{def #{m}_with_instrumentation(*args, &block)
            ActiveSupport::Notifications.instrumenter.instrument "mongo.mongo", :name => "#{m}" do
              #{m}_without_instrumentation(*args, &block)
            end
          end