Skip to content

Instantly share code, notes, and snippets.

View ericminikel's full-sized avatar

Eric Minikel ericminikel

View GitHub Profile
import os
import sys
working_dir = '/data/HD/analysis/038ea/analysis/1/fastqc/'
fastqc_summary = open(working_dir+'fastqc_summary.txt',mode='wb')
fastqc_details = open(working_dir+'fastqc_details.txt',mode='wb')
for root, dirs, files in os.walk(working_dir): # walk a directory containing FastQC output for multiple samples
for name in files:
drop table if exists ea1.fastqc_summary;
create table ea1.fastqc_summary (
fileid varchar,
module varchar,
status varchar
);
drop table if exists ea1.fastqc_details;
create table ea1.fastqc_details (
id serial primary key,
@ericminikel
ericminikel / findbadbams.bash
Last active December 28, 2015 11:09
A bash + R script to plot the file size of BAMs vs. that of corresponding FASTQs to identify outliers where a BAM appears to have gotten truncated.
cd /my/working/dir/
du -sb *.bam > bam.filesize # two-column list of BAMs and their size in bytes
cat fastq.12col | awk '{print "du -sb "$1}' | bash > fastq.filesize # two-column list of FASTQs and their size in bytes
# switch to R
R
fastq = read.table('fastq.filesize',header=FALSE)
bam = read.table('bam.filesize',header=FALSE)
bam$name = substr(bam$V2,1,21) # parse a unique ID for the BAMs from the path
fastq$name = substr(fastq$V2,71,91) # same for FASTQs
bam$bamsize=bam$V1
@ericminikel
ericminikel / antiprion-vs-all.r
Created November 18, 2013 19:48
Plotting antiprion compounds in the chemical space of drugs and libraries. Introduced in blog post: http://www.cureffi.org/2013/11/06/plotting-antiprion-compounds-in-the-chemical-space-of-drugs-and-libraries/
# Eric Minikel
# CureFFI.org
# 2013-11-06
# Anti-prion compounds plotted in the space of drugs and chemical libraries
setwd('c:/sci/037cinf/analysis/1')
options(stringsAsFactors=FALSE)
library(stringr)
library(rcdk) # cheminformatics
library(sp) # for point-in-polygon operations
@ericminikel
ericminikel / prpsc-diffeq-derivation
Last active December 28, 2015 22:39
Derivation of exponential decay equations for mRNA, PrPC and PrPSc.
mRNA
====
R(0) = R0
R(infty) = R1
R(t) = R1 + (R0 - R1)exp(-lambda*t)
PrPC
====
@ericminikel
ericminikel / degradation-model.r
Last active December 30, 2015 01:19
Comparison of simulated vs. analytical solutions for degradation of PrP in cell culture
# Eric Minikel
# CureFFI.org
# 2013-12-03
# Comparison of numerically simulated and analytically derived PrP degradation models
# half lives from literature
mrna_thalf = 7 # Pfeiffer 1993 http://www.ncbi.nlm.nih.gov/pubmed/8095862
prpc_thalf = 5 # Borcheldt 1990 http://www.ncbi.nlm.nih.gov/pubmed/1968466/
prpsc_thalf = 30 # Peretz 2001 http://www.ncbi.nlm.nih.gov/pubmed/11507642
division_time = 24 # Ghaemmaghami 2007 http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2084281/
@ericminikel
ericminikel / max-inhibition.r
Created December 2, 2013 20:03
What percent of maximum inhibition will be achieved after 1, 5, or 6 days for assay readouts if we inhibit PrP transcription?
########### what percent of possible degradation is achieved by 1 day, 5 days and 6 days?
maxint = .Machine$integer.max # longest possible time in future to model
# transcription inhibition
r = l0 * 0.50 #
l = l0 * 1.00 #
a = m0 * 1.00
m = m0 * 1.00
b = v0 * 1.00
@ericminikel
ericminikel / prpc-downreg.r
Last active December 31, 2015 18:49
Simulation of effects of PrPC downregulation on kinetics of prion infection in vivo
# simulate PrPC downregulation
kdeg = log(2)/1.5 # 36 half life for PrPSc
kconv = 1.7 # a conversion rate that works out well for wt mice
PrPSc0 = 10^1.5 # initial titer as of 4 dpi
symptomatic_threshold = 10^9 # titer when mice become asymptomatic
downreg_threshold = sqrt(PrPSc0 * symptomatic_threshold) # geometric mean for 1/2way through disease course
plot(NA,NA,xlim=range(times),ylim=c(1,10^10),log="y", # empty plot over 300 days and 10 log10s
xlab='days post-inoculation', ylab='prion titer', main=paste("kconversion = ",kconv,sep=''))
@ericminikel
ericminikel / nthprime.py
Created January 8, 2014 22:05
Finds the nth prime number.
primes = []
x = 2
nthprime = 1000 # set to 1000 to get the 1000th prime number
while len(primes) < nthprime:
isPrime = True
for item in primes:
if item > x**.5:
break
if x % item == 0:
isPrime = False
@ericminikel
ericminikel / inoculum-incubationtime.r
Created January 13, 2014 02:26
Model the relationship between prion inoculum titer and incubation time
# Eric Minikel
# 2014-01-12
# explore relationship between inoculated dose and incubation time
# this is all the code from this post:
# http://www.cureffi.org/2014/01/12/prion-kinetic-models-relationship-inoculum-titer-incubation-time/
logdose = runif(n=100,min=0,max=10) # simulate 100 different experiments with diff titers
days = 10^( (logdose-26.66)/(-12.99) ) + 40 + rnorm(n=100,m=0,s=2) # apply Prusiner's model
plot(logdose,days,xlab="log10(LD50 units)",ylab="days to illness",pch=19,
main="incubation time interval bioassay\nPrusiner 1982b - Sc237 in hamsters",