
Ryan Witt (ryanwitt)

  • New York, NY

Keybase proof

I hereby claim:

To claim this, I am signing this object:

@ryanwitt
ryanwitt / RedisTools.py
Created November 10, 2015 18:15 — forked from agconti/RedisTools.py
A class for checking redis queues and managing orphan jobs.
import redis  # assumed: the original import is cut off by the gist preview


class RedisTools:
    '''
    A set of utility tools for interacting with a redis cache
    '''
    def __init__(self):
        self._queues = ["default", "high", "low", "failed"]
        self.get_redis_connection()

    def get_redis_connection(self):
        # The preview is truncated here; a plausible body using redis-py:
        self._connection = redis.Redis()
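The preview cuts off before the queue-checking methods. As a rough illustration of the idea in the description (not the gist's own code), a minimal redis-py sketch for counting pending jobs might look like the following; the rq:queue:<name> key format and the function name are assumptions:

import redis

def queue_lengths(queues=("default", "high", "low", "failed"), host="localhost"):
    # Report how many jobs are sitting in each RQ-style queue list.
    conn = redis.Redis(host=host)
    return {name: conn.llen("rq:queue:%s" % name) for name in queues}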
@ryanwitt
ryanwitt / install-nodejs.sh
Created April 11, 2015 21:07
Node.js install script for Linux
#!/bin/sh
VERSION=0.12.2
PLATFORM=linux
ARCH=x64
PREFIX=/usr/local
mkdir -p "$PREFIX" && \
curl http://nodejs.org/dist/v$VERSION/node-v$VERSION-$PLATFORM-$ARCH.tar.gz \
| tar xzvf - --strip-components=1 -C "$PREFIX"
def collect_ranges(s):
    """
    Returns a generator of tuples of consecutive numbers found in the input.
    >>> list(collect_ranges([]))
    []
    >>> list(collect_ranges([1]))
    [(1, 1)]
    >>> list(collect_ranges([1,2,3]))
    [(1, 3)]
    """
    # The gist preview is truncated; the body below is a reconstruction that
    # satisfies the doctests above, not necessarily the original implementation.
    start = end = None
    for n in s:
        if start is None or n != end + 1:
            if start is not None:
                yield (start, end)
            start = n
        end = n
    if start is not None:
        yield (start, end)
@ryanwitt
ryanwitt / cpuse.js
Last active December 26, 2015 09:18
cpuse: cpu monitor for node
//
// cpuse.js - simple continuous cpu monitor for node
//
// Intended for programs wanting to monitor and take action on overall CPU load.
//
// The monitor starts as soon as you require the module, then you can query it at
// any later time for the average cpu:
//
// > var cpuse = require('cpuse');
// > cpuse.averages();
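The preview ends before the implementation. As an illustration of the underlying technique (sample cumulative CPU counters, wait, then diff them), here is a small Python sketch that reads Linux's /proc/stat; everything in it is illustrative and is not the gist's code:

import time

def cpu_percent(interval=1.0):
    # Diff the aggregate "cpu" counters in /proc/stat across a short interval.
    def read_counters():
        with open("/proc/stat") as f:
            fields = [float(x) for x in f.readline().split()[1:]]
        idle = fields[3] + (fields[4] if len(fields) > 4 else 0.0)  # idle + iowait
        return idle, sum(fields)
    idle1, total1 = read_counters()
    time.sleep(interval)
    idle2, total2 = read_counters()
    return 100.0 * (1.0 - (idle2 - idle1) / (total2 - total1))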
// Check MongoDB working set size (Mongo 2.4+).
// Paste this into the mongo console; it returns the size in GB.
db.runCommand({
    serverStatus: 1, workingSet: 1, metrics: 0, locks: 0
}).workingSet.pagesInMemory * 4096 / Math.pow(2, 30);
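Roughly the same check from Python with pymongo, as a hedged sketch (it assumes a local mongod whose serverStatus output includes the workingSet section, which only MMAP-era servers provide):

from pymongo import MongoClient

client = MongoClient("localhost", 27017)
# Extra keyword arguments are folded into the command document by pymongo.
status = client.admin.command("serverStatus", workingSet=1, metrics=0, locks=0)
pages = status["workingSet"]["pagesInMemory"]
print("working set: %.2f GB" % (pages * 4096 / float(2 ** 30)))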
@ryanwitt
ryanwitt / README.md
Created December 5, 2012 20:44
Indexing code for the NPI database and Medicare doctor referral graph.

Doctor referral graph / NPI database full-text indexer

You need 7zip installed to grab the NPI database (on OS X: brew install p7zip).

To create the index, run the init_* scripts. You need the doctor referral graph data to use the *_refer.* scripts, but the NPI database is downloaded automatically. Indexing runs on all cores and takes less than 10 minutes on my 8-core machine.

To grab lines matching a search term, run python search_npi.py term.

Note: index performance is good when you have plenty of memory, because the index file blocks stay hot in the page cache. They are still reloaded every time the program runs, though, which is inefficient; a better design would be an on-disk hash table whose slot offsets can be computed directly instead of loading the index on each run.
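For illustration only (none of this is the repo's code, and the file layout is assumed), the term-to-byte-offset idea behind such an index looks roughly like this in Python: record where each key's lines start, then seek() straight to the matches at query time:

import json

def build_offset_index(data_path, index_path, key_column=0):
    # Map each key to the byte offsets of its lines in the data file.
    index = {}
    offset = 0
    with open(data_path, "rb") as f:
        for line in f:
            key = line.decode("utf-8", "replace").split(",")[key_column].strip()
            index.setdefault(key, []).append(offset)
            offset += len(line)
    with open(index_path, "w") as out:
        json.dump(index, out)

def search(data_path, index_path, term):
    # Loads the whole index each run, which is exactly the inefficiency noted above.
    with open(index_path) as f:
        index = json.load(f)
    with open(data_path, "rb") as f:
        for off in index.get(term, []):
            f.seek(off)
            yield f.readline().decode("utf-8", "replace")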

# Count referral out-degree (froms) and in-degree (tos) per doctor
# from the from,to,count edge list.
froms = {}
tos = {}
for i, line in enumerate(open('refer.2011.csv')):
    try:
        fr, to, count = line.strip().split(',')
        froms[fr] = froms.get(fr, 0) + 1
        tos[to] = tos.get(to, 0) + 1
    except Exception:
        import traceback; traceback.print_exc()
@ryanwitt
ryanwitt / stream_sample_experiment.py
Created November 15, 2012 16:54
Can you create an unbiased sample of size k from a large stream in constant memory?
import random
import matplotlib.pyplot as plt

k = 1000
array = []
for n, x in enumerate([range(k)[random.randrange(k)] for x in range(100000)]):
    if n < k:
        array.append(x)
    else:
        if random.random() < k/float(n):
            # The preview is truncated here; the reservoir step presumably
            # replaces a random existing element:
            array[random.randrange(k)] = x
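For reference (this is not the gist's code), the textbook answer to the question in the description is reservoir sampling, Algorithm R: keep the first k items, then replace a uniformly chosen slot with probability k/(n+1) for the n-th item, 0-indexed; note the snippet above tests k/float(n), a slight variation. A minimal sketch:

import random

def reservoir_sample(stream, k):
    # Uniform random sample of k items from an iterable of unknown length,
    # using O(k) memory.
    sample = []
    for n, x in enumerate(stream):
        if n < k:
            sample.append(x)
        elif random.random() < k / float(n + 1):
            sample[random.randrange(k)] = x
    return sample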