@jvns
jvns / mailman-to-gmane.js
Last active August 29, 2015 13:56
gmane.js
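// Take the first 100 characters of the first <pre> block on the page.
var text = $('pre')[0].innerText.substr(0, 100);
// Flatten every newline (not just the first), then drop the last word, which may be cut off.
text = text.replace(/\n/g, ' ').replace(/ \w+$/, '');
var search = '"' + text + '"' + ' site:comments.gmane.org';
console.log(search);
// Keep the original "! " prefix, and URL-encode the query so quotes and colons survive.
window.location = "http://duckduckgo.com/?q=" + encodeURIComponent("! " + search);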
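
The same query-building steps can be pulled into plain functions that don't touch the page, which makes them easy to try in a console. This is only a sketch: `buildGmaneQuery` and `buildSearchUrl` are hypothetical names that are not part of the gist, and collapsing all whitespace (rather than just newlines) is an assumption about what the snippet is after.

```javascript
// Hypothetical helper (not in the original gist): builds the quoted,
// site-restricted search string from the text of a mailing list message.
function buildGmaneQuery(messageText, maxLength) {
  maxLength = maxLength || 100; // the gist uses the first 100 characters
  var text = messageText.substr(0, maxLength)
    .replace(/\s+/g, ' ')   // assumption: collapse newlines and whitespace runs
    .replace(/ \S+$/, '');  // drop the last word, which may be cut off mid-way
  return '"' + text + '" site:comments.gmane.org';
}

// Hypothetical helper: wraps the query in a DuckDuckGo URL, keeping the
// "! " prefix from the gist and URL-encoding the whole query.
function buildSearchUrl(messageText) {
  return 'http://duckduckgo.com/?q=' +
    encodeURIComponent('! ' + buildGmaneQuery(messageText));
}
```

In a browser console, `window.location = buildSearchUrl($('pre')[0].innerText)` would then do the same job as the snippet. Note that the last word is always dropped, even when the text is shorter than the limit, which matches what the gist does.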

Keybase proof

I hereby claim:

  • I am jvns on github.
  • I am bork (https://keybase.io/bork) on keybase.
  • I have a public key whose fingerprint is 2A40 AC07 2BE2 4C7E F9C4 8FD1 E680 F9BA 62EF 229E

To claim this, I am signing this object:

@jvns
jvns / debuggers.md
Created March 23, 2014 13:43
debuggers

Hello! Welcome to RailsBridge!

We're so happy you're here. We're going to start out with some basic things you'll need to know before getting started with programming.

By the end, we'll have run our first program!

Step 1: Open a terminal!

We'll need to run programs from a terminal. Mine looks like this. Yours might look a little different.

Some open spaces I'd love to see at PyCon, but am not qualified to run:

Diving into CPython: how it works!

In this open space, a CPython maintainer will talk about how CPython works internally! We'll talk about how the interpreter works and do a high-level overview of the source. Then we will go SOURCE DIVING!!! We'll pair up and open up the CPython source! We'll change silly things and find out what goes wrong! You'll come away with some idea of how CPython works, and you'll have successfully broken it.

Why I can't run this: I don't have a good enough understanding of CPython.

I want to spend a week (during Hacker School alumni reunion week) better understanding performance (probably of things in the Hadoop ecosystem) on a few different dataset sizes (8GB, 100GB, 1TB). I have $1000 of AWS credit that I can spend on this (yay!)

Some things I want:

  • Get a much better grasp of the performance of in-memory operations (put 8GB of data into memory and be done) vs running a distributed map/reduce.
  • Understand what goes into the performance (how much time is spent copying data? sending data over the network? CPU?)
  • Learn something about tradeoffs

I'd love suggestions for experiments to run and setups to use. At work I've been using HDFS / Impala / Scalding, so my current thought is to spend time looking in depth at running a map/reduce with Scalding vs an Impala query vs running a non-distributed job in memory, because I already know about those things. But I'm open to other ideas!