@nhoffman
nhoffman / wang_fig1_extended.R
Created January 20, 2011 18:19
Extends Wang et al., 2007, Fig. 1 (PMID 17600086) to show the detection threshold at higher coverage
## Extends Wang, et al, 2007 Fig 1 (pmid 17600086) to show detection threshold at higher coverage
##
library(lattice)
## mu = error rate
mu_hp <- 0.0044
mu_nhp <- 0.0007
thresh <- function(N, mu, p=0.001){
nhoffman / mutinfo.R
Created June 29, 2011 05:13
Use mutual information to find a cutoff separating two distributions
## Use mutual information to define a value separating two
## distributions.
entropy <- function(x,y){
  ## Shannon entropy of x or joint entropy of x and y
  if(missing(y)){
    freqs <- table(x)/length(x)
  }else{
    stopifnot(length(x) == length(y))
    freqs <- table(paste(x,y))/length(x)
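The preview above stops before the cutoff search, so here is a minimal Python sketch of how the idea might be completed. It rests on assumptions the truncated snippet does not confirm: entropy is measured in bits (log base 2), mutual information is computed as H(X) + H(Y) - H(X,Y), and the cutoff is the observed value that maximizes mutual information between the class labels and the indicator `value > cutoff`. The names `mutual_information` and `best_cutoff` are hypothetical.

```python
import math
from collections import Counter


def entropy(x, y=None):
    """Shannon entropy of x, or joint entropy of x and y, in bits.

    Base 2 is an assumption; the original snippet does not show its log base.
    """
    if y is None:
        items = list(x)
    else:
        if len(x) != len(y):
            raise ValueError("x and y must have the same length")
        items = list(zip(x, y))
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def best_cutoff(values, labels):
    """Hypothetical helper: the observed value c that maximizes the mutual
    information between labels and the indicator (value > c)."""
    candidates = sorted(set(values))
    return max(
        candidates,
        key=lambda c: mutual_information(labels, [v > c for v in values]),
    )
```

With two well-separated groups, for example `best_cutoff([1, 2, 8, 9], ["a", "a", "b", "b"])`, the search picks the largest value of the lower group, because splitting there makes the indicator fully informative about the label.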
;; gist.el
;; https://github.com/defunkt/gist.el
;; added as a submodule:
;; % git submodule add https://github.com/defunkt/gist.el.git
;; now, to clone .emacs.d elsewhere:
;; % git clone git@github.com:nhoffman/.emacs.d.git
;; % cd .emacs.d
;; % git submodule init && git submodule update
(condition-case nil
(require 'gist "~/.emacs.d/gist.el/gist.el")
nhoffman / read_dcw.py
Created September 7, 2011 03:02
Example demonstrating openpyxl
import itertools
from openpyxl.reader.excel import load_workbook
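def getrows(sheet, cols):
    # cols: sequence of (source_header, output_key) pairs
    rows = list(sheet.rows)  # sheet.rows is a generator in current openpyxl
    headers = [c.value for c in rows[0] if c.value]
    for row in rows[1:]:
        d = dict(zip(headers, [c.value for c in row]))
        yield dict((k2, d[k1]) for k1, k2 in cols)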
nhoffman / mysql2sqlite.sh
Created April 19, 2012 04:36 — forked from esperlu/mysql2sqlite.sh
MySQL to Sqlite converter
#!/bin/sh
# Converts a mysqldump file into a Sqlite 3 compatible file. It also extracts the MySQL `KEY xxxxx` from the
# CREATE block and creates them in separate commands _after_ all the INSERTs.
# Awk is chosen because it's fast and portable. You can use gawk, original awk or even the lightning fast mawk.
# The mysqldump file is traversed only once.
# Usage: $ ./mysql2sqlite mysqldump-opts db-name | sqlite3 database.sqlite
# Example: $ ./mysql2sqlite --no-data -u root -pMySecretPassWord myDbase | sqlite3 database.sqlite
nhoffman / gist.bash
Last active September 23, 2020 19:51
Function for getting the contents of a gist
gist () {
    # Print the content of the first file in a gist (default id: 2968328)
    gid=${1:-2968328}
    curl -s "https://api.github.com/gists/$gid" | python3 -c 'import json, sys; print(next(iter(json.load(sys.stdin)["files"].items()))[1]["content"])'
}
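The Python one-liner inside the function is dense, so the following sketch spells out the same extraction step as a standalone function. It works on a JSON string rather than on a live request; the `sample` payload is a hypothetical, trimmed-down stand-in for an API response that keeps only the `files` mapping and the `content` field the one-liner relies on.

```python
import json


def first_file_content(payload):
    """Return the content of the first file listed in a gist API response."""
    files = json.loads(payload)["files"]
    # "files" maps each filename to an object holding that file's "content".
    return next(iter(files.values()))["content"]


# Hypothetical response body, for illustration only.
sample = json.dumps({"files": {"gist.bash": {"content": "gist () { ...; }"}}})
print(first_file_content(sample))
```

Like the shell version, this returns only the first file, so a gist with several files would need a loop over `files.values()` instead.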
nhoffman / pyscript.py
Last active February 20, 2024 22:30
Python script template
#!/usr/bin/env python3
"""A simple python script template.
"""
import os
import sys
import argparse
nhoffman / prompt.sh
Created July 9, 2012 20:05
Prompt and window titles for zsh (and maybe bash)
#########################################
## Set prompt, iTerm2 window and tabs ##
#########################################
# Color constants used in the prompt
BLACK="%{%}"
BOLD_BLACK="%{%}"
RED="%{%}"
BOLD_RED="%{%}"
GREEN="%{%}"
nhoffman / parallel.sh
Created July 24, 2012 19:11
Demonstrate parallel processes using xargs
#!/bin/bash
# try me out:
# two processors
# $ echo {1..9} | xargs -n1 -P2 ./ps.sh
# four processors
# $ echo {1..9} | xargs -n1 -P4 ./ps.sh
# sleeptime set to a random integer between 2 and 5
sleeptime=$(shuf -i 2-5 -n 1)
nhoffman / unclassified.py
Created November 16, 2012 20:54
More patterns for matching unclassified sequences
#!/usr/bin/env python
import re
import sys
rexp = re.compile(r'|'.join([
r'\bactinomycete\b',
r'\bcrenarchaeote\b',
r'\bculture\b',
r'\bchimeric\b',