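require 'pathname'

SEPARATOR_PAT = Pathname::SEPARATOR_PAT

# Split path into [everything before the basename, basename];
# returns nil when the basename is empty or just a separator.
def chop_basename(path)
  base = File.basename(path)
  if /\A#{SEPARATOR_PAT}?\z/ =~ base
    return nil
  else
    return path[0, path.rindex(base)], base
  end
end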

things I don't know

I took this list from What CS majors should know.

I think it is fun to list things I don't know so I did it =D. I actually found it to be a cool exercise -- maybe I should do a fun graphics project and learn about OpenGL!

I wrote this because, while I think the things on this list are potentially worth knowing (and it's an awesome list of project ideas, as well as good food for thought for people developing CS curricula: many of the things I don't know are great exercises!), I thought it was really weird to say that every CS student should know all of them. I have a CS degree, and I learned very few of the things I do know inside my degree.

I classify "do know" as anything that I have a reasonable grasp of or at least some basic experience with -- the kind of experience I'd expect a CS student to be able to get. If I say I don't know something, it means either I know pretty much nothing about it (for "gr

@jvns
jvns / a_networking_puzzle.md
Last active September 11, 2017 21:36
A networking puzzle.

You are trying to make a lot of simultaneous network connections to localhost. You get up to about 2000 HTTP requests per second, at which point your CPU usage goes up to 100% on all cores.

perf top reports the following:

 31.01%  [kernel]                    [k] inet_csk_bind_conflict
 20.60%  [kernel]                    [k] inet_csk_get_port
  9.82%  [kernel]                    [k] _raw_spin_lock
  4.62%  perf                        [.] 0x0000000000038ba1
  3.90%  [kernel]                    [k] _raw_read_unlock_bh
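One way to start poking at this: every connection that is open at the same time to the same localhost address needs its own local (source) port. Here's a minimal sketch of that, assuming a throwaway TCP server in the same process. The function name `count_local_ports` and the connection count are made up for illustration, and this is not the puzzle's answer, just a way to watch the kernel hand out ports.

```python
import socket
import threading


def count_local_ports(num_connections=200):
    """Open many simultaneous TCP connections to a throwaway localhost
    server and return the set of local (source) ports that were assigned."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    server.listen(num_connections)
    address = server.getsockname()

    accepted = []

    def accept_all():
        # Accept every incoming connection and keep it open.
        for _ in range(num_connections):
            conn, _peer = server.accept()
            accepted.append(conn)

    acceptor = threading.Thread(target=accept_all)
    acceptor.start()

    # Hold all client sockets open at once, so no local port can be reused.
    clients = [socket.create_connection(address) for _ in range(num_connections)]
    acceptor.join()

    ports = {client.getsockname()[1] for client in clients}

    for sock in clients + accepted + [server]:
        sock.close()
    return ports


if __name__ == "__main__":
    ports = count_local_ports(200)
    print(len(ports), "distinct local ports, from", min(ports), "to", max(ports))
```

Running `perf top` while a much larger version of this loop runs is one way to see where the kernel time goes on your own machine.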
import random
import gevent
from collections import defaultdict


class ReadError(Exception):
    pass


class Connection(object):
    def __init__(self):
        pass
@jvns
jvns / bundler.md
Last active September 28, 2015 02:27

why I didn't understand Bundler

Yesterday I was having drinks with @sferik, and I mentioned that I find Bundler really confusing, more confusing than virtualenv. And then he called me on it and was like "okay but why?"

And I think we figured it out! And it wasn't just that I know the Python ecosystem better than the Ruby one (though I do). Here's the story. (it doesn't have much to do with bundler, and it might not be true, but it felt satisfying to me)

In 2012, I wanted to install Octopress. I already had some Rubies on my computer, and Octopress had helpful instructions telling me to bundle install. It did not work and I was real sad.

the story gets real simple real fast: if you want to install a Python package, it almost always works with Python 2.7. If you have any Python on your computer and it was installed in the last couple years, you have Python 2.7.

Diversity needs resources.

ideas are great. here are some great ideas. you however cannot implement ideas without resources, which basically means money. I am a lot less interested in ideas for a company right now that aren't backed by that company's money / staff.

a few ways to use resources:

  • do you have 2 awesome recruiters who want to do more? let them do it full time.
  • sponsor your employees to give talks about things they care about. (mentorship? good management? =D)
  • systematically reward people for doing diversity work. give them promotions or raises. let everyone know that's how it works.
  • sponsor events (like AlterConf) in your office, and give them organizational support.

Spying on Hadoop with strace

As you may already know, I really like strace. (It has a whole category on this blog). So when the people at Big Data Montreal asked if I wanted to give a talk about stracing Hadoop, the answer was YES OBVIOUSLY.

I set up a small Hadoop cluster (1 master, 2 workers, replication set to 1) on Google Compute Engine to get this working, so that's what we'll be talking about. It has one 14GB CSV file, which contains part of this Wikipedia revision history dataset.

Let's start diving into HDFS! (If this is familiar to you, I talked about a lot of this already in Diving into HDFS. There are new things, though! At the end of this we edit the blocks on the data node and see what happens and it's GREAT.)

$ snakebite ls -h /

<3


Ageism hits women harder than it does men.
All the "which do you think would make you happier" section answers put the responsibility on me to fix my issues with being a woman in STEM. I need my male colleagues to not talk over me and to verify they heard what I have to say. I would like my fair share of eye contact during technical discussions. I'm sick of the constant pissing matches that come from a solution not being a particular dev's solution so we have to refactor said existing solution AGAIN. As someone who is talked over and is denied eye contact, how hard do you think I have to fight to be heard or have my solutions last? Being told I'm recognized as talented in my field, only to have some basic computer science concept explained to me in the speaker's next breath (I do think this is a general geek problem so I won't call it mansplaining but the behavior is exhausting all the same).
A lot of my male co-workers don't want to see or are completely oblivious to the sexism that their f
command     count     num_people_using
cd          347893.0  803.0
ls          326227.0  783.0
rm          80394.0   744.0
git         426701.0  696.0
sudo        109042.0  690.0
mkdir       22987.0   655.0
cat         54491.0   642.0
mv          39297.0   640.0
ssh         83035.0   623.0
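The rows above are sorted by how many people use each command, which hides how heavily each command gets used. Here's a small sketch that divides `count` by `num_people_using` to get average uses per person. I'm assuming `count` is the total number of times the command appeared and `num_people_using` is the number of distinct people who used it; the table doesn't say so explicitly. The helper name `uses_per_person` is made up for illustration.

```python
# Rows copied from the table above: (command, count, num_people_using).
# Assumption: count = total uses, num_people_using = distinct users.
ROWS = [
    ("cd", 347893.0, 803.0),
    ("ls", 326227.0, 783.0),
    ("rm", 80394.0, 744.0),
    ("git", 426701.0, 696.0),
    ("sudo", 109042.0, 690.0),
    ("mkdir", 22987.0, 655.0),
    ("cat", 54491.0, 642.0),
    ("mv", 39297.0, 640.0),
    ("ssh", 83035.0, 623.0),
]


def uses_per_person(rows):
    """Map each command to its average number of uses per person."""
    return {command: count / people for command, count, people in rows}


if __name__ == "__main__":
    averages = uses_per_person(ROWS)
    # Print the heaviest-used commands (per person) first.
    for command, average in sorted(averages.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{command:<6} {average:8.1f}")
```

Under that assumption, `git` comes out on top per person even though more people use `cd` and `ls`.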