@kdrakon
kdrakon / Avro4s.scala
Last active December 16, 2022 15:29
Using Avro4s with Confluent Kafka Avro Serializer + Schema Registry
import java.util
import com.sksamuel.avro4s.RecordFormat
import org.apache.avro.generic.GenericRecord
import org.apache.kafka.common.serialization.{Deserializer, Serde, Serializer}
object Avro4s {
implicit class CaseClassSerde(inner: Serde[GenericRecord]) {
def forCaseClass[T](implicit recordFormat: RecordFormat[T]): Serde[T] = {
val caseClassSerializer: Serializer[T] = new Serializer[T] {
@corey
corey / list.md
Created March 2, 2013 20:46 — forked from pbailis/list.md

A friend asked me for a few pointers to interesting, mostly recent papers on data warehousing and "big data" database systems, with an eye towards real-world deployments. I figured I'd share the list. It's biased and rather incomplete, but maybe of interest to someone. While many are obvious choices (I've omitted several, like MapReduce), I think there are a few underappreciated gems.

### Dataflow Engines:

Dryad--general-purpose distributed parallel dataflow engine
http://research.microsoft.com/en-us/projects/dryad/eurosys07.pdf

Spark--in-memory dataflow
http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf

@mudge
mudge / init.pp
Created May 5, 2011 16:35
Puppet class for a CentOS/RedHat server with RVM installed.
# rvm_server/manifests/init.pp
class rvm_server($version='latest') {
# Dependencies for RVM and Ruby.
package { 'rvm-dependencies':
ensure => installed,
name => ['bash', 'gawk', 'sed', 'grep', 'which', 'coreutils', 'tar',
'curl', 'gzip', 'bzip2', 'gcc-c++', 'autoconf', 'patch',
'readline', 'readline-devel', 'zlib', 'zlib-devel',
@mbostock
mbostock / .block
Last active February 8, 2016 23:17
Heatmap
license: gpl-3.0
require 'rake/clean'
HAML = FileList['**/*.haml']
LESS = FileList['**/*.less']
COFFEE = FileList['**/*.coffee']
HTML = HAML.ext('html')
CSS = LESS.ext('css')
JS = COFFEE.ext('js')