Skip to content

Instantly share code, notes, and snippets.

View tslumley's full-sized avatar
😐

Thomas Lumley tslumley

😐
View GitHub Profile
expit<-function(eta) exp(eta)/(1+exp(eta))
one.nri<-function(N=1000){
x<-rnorm(50)
y<-rbinom(50,1,expit(x))
z<-rnorm(50)
m_old<-glm(y~x,family=binomial())
m_new<-glm(y~x+z,family=binomial())
one.nri<-function(N=1000){
x<-rnorm(50)
y<-rbinom(50,1,x/10+0.5)
z<-rnorm(50)
m_old<-lm(y~x)
m_new<-lm(y~x+z)
x<-rnorm(N)
y<-rbinom(N,1,x/10+0.5)
@tslumley
tslumley / Readme.md
Last active August 29, 2015 14:16
Electorate candidate donations, all parties, 2014

Election donations, New Zealand, 2014.

@tslumley
tslumley / hexmap.R
Created March 11, 2015 21:29
An exercise in transparency: unedited ugly scripts used in making electoral hexmaps.
fogmap=
" '..','..','..','..','..','..','..','..','..','..','..','..','..',
'..','69','..','..','..','..','..','35','63','..','..','..','..',
'..','..','67','..','..','..','..','..','45','..','..','..','..',
'..','..','65','71','..','..','..','15','11','33','..','..','..',
'..','..','68','66','..','..','..','..','55','34','..','..','..',
'..','70','..','..','..','..','..','53','01','49','..','..','..',
'..','..','..','..','..','..','..','21','27','12','26','38','..',
'..','..','..','..','..','..','..','31','28','24','03','..','..',
@tslumley
tslumley / README.md
Created March 29, 2015 21:49
Data from paper helicopter optimisation, George Box style

Data from two replicates of a 2^{5-1} fractional factorial experiment on paper helicopters

We used the helicopter template from the R SixSigma package. Our experimental factors were body height, wing length, paper type, paper size (A4 vs A5), and paperclip or not. The paperclip and size are interesting because they have a big effect on how the flight looks, but not a big effect on the flight time.

The experiments were performed at the Science Street Fair hosted by the Auckland Museum of Transportation and Technology (MOTAT), March 29, 2015. Replicate 1 was done in the morning; replicate 2 in the afternoon. The drop height was approximately 2.2m. Times are in seconds.

@tslumley
tslumley / zip.Rmd
Created May 26, 2015 01:31
Zero-Inflated Poisson with complex surveys
---
title: "Zero-inflated Poisson from complex samples"
author: "Thomas Lumley"
date: "18 May 2015"
output: html_document
---
The Zero-Inflated Poisson model is a model for count data with excess zeros. The response distribution is a mixture of a point mass at zero and a Poisson distribution: if $Z$ is Bernoulli with probability $1-p_0$ and $P$ is Poisson with mean $\lambda$ then
$$Y=Z+(1-Z)P$$
is zero-inflated Poisson. The ZIP is a latent-class model; we can have $Y=0$ either because $Z=0$ or because $P=0$. That's natural in some ecological examples: if you didn't see any moose it could be because the area is moose-free (it's downtown Montréal) or because you just randomly didn't see any.
@tslumley
tslumley / README.md
Created May 27, 2015 04:40
Evil tricks in R for evil people who are evil.

Along the lines of the 'jammr' package, an example to show it's quite easy to hide things from even moderately careful inspection.

If you run the code in evil.R, there isn't any obvious impact on your session, as evil-output.txt shows. But now look at what pie() and attach() do.

This isn't a security issue: anyone who can run arbitrary R code on your system can do much worse.

@tslumley
tslumley / moe
Created August 21, 2014 23:53
Margin of error in polls, based on actual percentage and allowing for design effects
moe = function(pct, N=1000, deff=c(1,2)){
p.L = function(x, n) {
ifelse (x == 0,0,qbeta(0.025, x, n - x + 1))
}
p.U = function(x, n) {
ifelse(x == n,1,qbeta(0.975, x + 1, n - x))
}
N = rep(N,length(pct))
lower = function(pct,deff){
@tslumley
tslumley / redpeak.R
Last active September 9, 2015 14:02
redpeak=function(s,w){
x=c(0,0,1,NA, 1,2,2,NA,0.5,1,1.5)
y=c(0,1,1,NA,1,1,0,NA,0,0.5,0)
polygon(x*w+s[1],y*w+s[2],col=c("black","navyblue","#98332f"))
}
@tslumley
tslumley / neighbours.R
Created September 14, 2015 03:04
Nearest-neighbour distances and dimension
library(FNN)
x1<-runif(5e4)
x2<-matrix(runif(5e4*2),ncol=2)
x10<-matrix(runif(5e4*10),ncol=10)
x100<-matrix(runif(5e4*100),ncol=100)
system.time(d1<-knn.dist(x1,k=1))
system.time(d2<-knn.dist(x2,k=1))
system.time(d10<-knn.dist(x10,k=1))