@JohannesBuchner
JohannesBuchner / toybootstrap.py
Last active August 29, 2015 14:05
Toy linefitting: bootstrapped estimator
import numpy
from numpy import log, log10, sin, cos, tan, arctan, arccos, arcsin, abs, any, pi
import sys
import matplotlib.pyplot as plt
data = numpy.loadtxt(sys.argv[1],
    dtype=[(colname, 'f') for colname in ('x', 'x_err', 'y', 'y_err', 'cor')],
    skiprows=1)
plt.figure(figsize=(7,7))
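The preview cuts off after the plot setup. A minimal sketch of how a bootstrapped line estimator could continue from here, assuming a straight-line model and resampling the rows of data with replacement; the variable names and the use of numpy.polyfit are illustrative, not taken from the gist:

# bootstrap: refit a straight line on resampled, error-perturbed data sets
nboot = 1000
slopes, intercepts = [], []
for _ in range(nboot):
    # draw rows with replacement
    idx = numpy.random.randint(0, len(data), size=len(data))
    sample = data[idx]
    # perturb each point within its reported uncertainty
    x = numpy.random.normal(sample['x'], sample['x_err'])
    y = numpy.random.normal(sample['y'], sample['y_err'])
    k, d = numpy.polyfit(x, y, 1)
    slopes.append(k)
    intercepts.append(d)
print('slope: %.3f +- %.3f' % (numpy.mean(slopes), numpy.std(slopes)))
print('intercept: %.3f +- %.3f' % (numpy.mean(intercepts), numpy.std(intercepts)))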
@JohannesBuchner
JohannesBuchner / gen.tab
Last active August 29, 2015 14:05
Toy linefitting: new test data with known true values
# "x" "x_err" "y" "y_err" "cor"
10.191 0.125 20.128 0.125 0.731
9.808 0.050 20.286 0.050 0.662
9.700 0.039 20.437 0.039 0.580
9.831 0.065 20.058 0.065 0.720
9.912 0.058 20.194 0.058 0.502
9.861 0.083 19.989 0.083 0.769
9.971 0.060 20.229 0.060 0.563
9.859 0.060 20.164 0.060 0.752
9.720 0.044 20.318 0.044 0.646
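The table above is truncated. A hedged sketch of how such test data could be generated from a known true line, assuming the cor column is the correlation between the x and y errors; the true slope and intercept below are placeholders chosen only to roughly match the table's range:

import numpy
# assumed true relation (placeholder values, not the gist's)
k_true, d_true = 1.5, 5.0
rows = []
for i in range(40):
    x_true = numpy.random.uniform(9.6, 10.3)
    y_true = k_true * x_true + d_true
    err = numpy.random.uniform(0.03, 0.13)  # the table uses the same error for x and y
    cor = numpy.random.uniform(0.5, 0.8)
    # draw correlated x/y offsets from a bivariate normal
    cov = [[err**2, cor * err**2], [cor * err**2, err**2]]
    dx, dy = numpy.random.multivariate_normal([0, 0], cov)
    rows.append((x_true + dx, err, y_true + dy, err, cor))
with open('gen.tab', 'w') as f:
    f.write('# "x" "x_err" "y" "y_err" "cor"\n')
    for row in rows:
        f.write('%.3f %.3f %.3f %.3f %.3f\n' % row)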
@JohannesBuchner
JohannesBuchner / bd.py
Created February 6, 2015 12:57
birthday problem for 4 people
import numpy
import matplotlib.pyplot as plt
def prob(M):
    # for M people, compute the probability of having more than 4 with the same birthday
    hits = 0
    # number of simulation instances
    N = 1000
    I = numpy.arange(365).reshape((1, -1))
    for j in range(N):
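The preview stops inside the simulation loop. A minimal sketch of how the Monte Carlo count could be finished, following the comment above (more than 4 people with the same birthday); this shows the same idea, not the gist's vectorized code:

def prob_sketch(M, N=1000):
    # fraction of simulated groups of M people in which some birthday occurs more than 4 times
    hits = 0
    for j in range(N):
        birthdays = numpy.random.randint(0, 365, size=M)
        counts = numpy.bincount(birthdays, minlength=365)
        if (counts > 4).any():
            hits += 1
    return hits / float(N)

# example: the probability rises with group size M
for M in (50, 100, 200):
    print('M=%d: %.3f' % (M, prob_sketch(M)))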
def generateTuple():
    if numpy.random.uniform() > 0.05:
        # generate from normal data set, e.g. normal distribution around some values -- here, a line
        k = 1.16
        d = 8.9
        x = numpy.random.uniform(6, 12)
        y = k * (x - 11) + d
        return numpy.random.normal(x, 1), numpy.random.normal(y, 3)
    else:
        # generate from outlier distribution, e.g. uniform distribution over full parameter space
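The preview ends inside the else branch. A small sketch of what the outlier branch could look like, with the bounds being assumptions rather than the gist's values:

def generateOutlierTuple():
    # outlier: uniform over the full plotted x/y range (assumed bounds)
    return numpy.random.uniform(6, 12), numpy.random.uniform(-5, 15)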
import numpy
from numpy import cos, sin, exp, log, pi, tan, arccos, arcsin, arctan
import matplotlib.pyplot as plt
# make a square figure
plt.figure(figsize=(6, 6))
# generate 40 points between 0 and 1
t = numpy.linspace(0, 1, 40)
print 't = ', t
@JohannesBuchner
JohannesBuchner / codememoize.py
Last active August 29, 2015 14:21
For tests/builds that should only re-run when code or data files have changed (memoized tests)
"""
Memoizes a given function, given its code dependencies (loaded modules and
additional data files)
Example::
import douglasadams
def costlyfunction():
# compute answer to the universe and everything
return douglasadams.compute_answer() == 42
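The docstring describes the idea but the preview stops before the implementation. A rough sketch of one way such memoization could work, hashing the source of the function's module plus the listed data files and caching the result on disk; the function name, cache-file layout and single-module simplification are assumptions, not the gist's API:

import hashlib
import inspect
import os
import pickle

def memoize_sketch(func, datafiles=(), cachefile=None):
    # fingerprint the function's module source and its data files (assumed scheme)
    cachefile = cachefile or func.__name__ + '.cache.pkl'
    h = hashlib.sha1()
    h.update(inspect.getsource(inspect.getmodule(func)).encode('utf-8'))
    for filename in datafiles:
        with open(filename, 'rb') as f:
            h.update(f.read())
    key = h.hexdigest()
    # reuse the cached result if nothing changed
    if os.path.exists(cachefile):
        with open(cachefile, 'rb') as f:
            cached_key, value = pickle.load(f)
        if cached_key == key:
            return value
    # otherwise recompute and store
    value = func()
    with open(cachefile, 'wb') as f:
        pickle.dump((key, value), f)
    return value

Usage would then be something like result = memoize_sketch(costlyfunction, datafiles=['answers.dat']), where answers.dat is a hypothetical data dependency.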
@JohannesBuchner
JohannesBuchner / overlapgauss.py
Created June 20, 2015 19:00
Probability that two measurements actually have the same value
import numpy
import matplotlib.pyplot as plt
import scipy.stats
# two gaussian uncertainties with width sigma
# at distance delta
# what is the probability that they actually have the same value?
def compute_bayes(delta, border=5):
    a = scipy.stats.norm()
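A hedged sketch of one way compute_bayes could proceed: compare the marginal likelihood of a shared true value (integrated over a flat prior of half-width border) against independent true values for the two measurements. This is a standard model-comparison setup, not necessarily the gist's exact calculation:

def compute_bayes_sketch(delta, border=5):
    # measurements at 0 and delta, each with unit Gaussian uncertainty
    x1, x2 = 0.0, delta
    mu = numpy.linspace(-border, border + delta, 2000)
    prior = numpy.ones_like(mu) / (mu[-1] - mu[0])  # flat prior over the grid
    a = scipy.stats.norm()
    # model A: both measurements share the same true value mu
    Z_same = numpy.trapz(a.pdf(x1 - mu) * a.pdf(x2 - mu) * prior, mu)
    # model B: independent true values, each marginalized over the same flat prior
    Z1 = numpy.trapz(a.pdf(x1 - mu) * prior, mu)
    Z2 = numpy.trapz(a.pdf(x2 - mu) * prior, mu)
    # posterior probability of "same value", assuming equal prior odds
    return Z_same / (Z_same + Z1 * Z2)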
@JohannesBuchner
JohannesBuchner / pvalue.py
Created June 20, 2015 19:24
p-value reliability
import matplotlib.pyplot as plt
import numpy
import scipy.stats
# http://www.medpagetoday.com/Blogs/TheMethodsMan/52171
def calc_reliability(p, power=0.8, frac_true=0.1):
    """
    Given this p-value, power of the test and fraction of hypotheses that
    are actually true.
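The docstring is cut off here; the linked MedPage Today post is about the chance that a significant result reflects a real effect. A sketch of that calculation, treating the p-value as the false-positive rate; this is the standard positive-predictive-value formula, not necessarily the gist's exact code:

def calc_reliability_sketch(p, power=0.8, frac_true=0.1):
    # true positives: real effects that the test detects
    true_pos = power * frac_true
    # false positives: true nulls that nevertheless reach this p-value
    false_pos = p * (1 - frac_true)
    # probability that a "significant" finding is a real effect
    return true_pos / (true_pos + false_pos)

print('p=0.05: %.2f' % calc_reliability_sketch(0.05))  # about 0.64 with these defaults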
@JohannesBuchner
JohannesBuchner / console-progress.py
Last active August 29, 2015 14:23
Console progress bar -- takes stdin from arbitrary commands and plots a progress bar
"""
SYNOPSIS: ./myprog | python console-progress.py
example for myprog:
#!/bin/bash
echo 100
for i in $(seq 1 100)
do
sleep 1
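The docstring shows the intended protocol: the first line on stdin is the total, and later lines report progress. A minimal sketch of a reader for that protocol, assuming each later line carries the current count; the bar formatting is illustrative, not the gist's output:

import sys

def progress_sketch():
    total = int(sys.stdin.readline())  # first line: total number of steps
    for line in sys.stdin:
        try:
            done = int(line)
        except ValueError:
            continue  # ignore lines that are not counts
        frac = min(done / float(total), 1.0)
        bar = '#' * int(frac * 40)
        sys.stderr.write('\r[%-40s] %3d%%' % (bar, int(frac * 100)))
        sys.stderr.flush()
    sys.stderr.write('\n')

if __name__ == '__main__':
    progress_sketch()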
@JohannesBuchner
JohannesBuchner / statistics-minimal.rst
Last active July 31, 2016 08:47
arXiv minimal statistics checklist

arXiv minimal statistics checklist

This checklist helps you identify and fix common errors and misinterpretations in your analysis, or in a paper you are refereeing.

  1. If you use p-values (from a KS test, Pearson correlation, etc.):

     What do you think a low p-value says?

     1. You have absolutely disproved the null hypothesis (e.g. "no correlation" is ruled out, the data are not sampled from this model, there is no difference between the population means).
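A tiny simulation that shows why statement 1 is a misinterpretation: when the null hypothesis is true by construction, low p-values still turn up at the nominal rate, so a single low p-value cannot absolutely disprove anything (Pearson correlation is used here as the example test):

import numpy
import scipy.stats

numpy.random.seed(1)
pvalues = []
for _ in range(2000):
    # two unrelated variables: "no correlation" is true by construction
    a = numpy.random.normal(size=50)
    b = numpy.random.normal(size=50)
    pvalues.append(scipy.stats.pearsonr(a, b)[1])  # [1] is the p-value
# roughly 5% of the experiments still come out "significant" at p < 0.05
print('fraction with p < 0.05: %.3f' % numpy.mean(numpy.array(pvalues) < 0.05))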