Skip to content

Instantly share code, notes, and snippets.

@MichaelChirico
Last active September 12, 2016 00:43
Show Gist options
  • Save MichaelChirico/5e2437282e3b337c97bf71506c2ad7bb to your computer and use it in GitHub Desktop.
Save MichaelChirico/5e2437282e3b337c97bf71506c2ad7bb to your computer and use it in GitHub Desktop.
Simulating ways to get predictions from a fitted IV regression
nn = 1e6
set.seed(9112016)
#fake data
z = rnorm(nn)
v = rnorm(nn)
err1 = rnorm(nn)
err2 = rnorm(nn)
#z & v influence x
x = 2 + 3 * z + 2 * v + err1
#x & v influence y
y = 4 + 5 * x + 3 * v + err2
#since not including v, bias
ols = lm(y ~ x)
summary(ols)
#would be spot if knew v
ols_full = lm(y ~ x + v)
summary(ols_full)
#spot on
stage1 = lm(x ~ z)
summary(stage1)
#spot on for x coefficient
stage2 = lm(y ~ stage1$fitted.values)
#trying predictions
new.z = rnorm(nn)
new.v = rnorm(nn)
new.err1 = rnorm(nn)
new.err2 = rnorm(nn)
new.x = 2 + 3 * z + 2 * v + err1
new.y = 4 + 5 * x + 3 * v + err2
#predict new.y using only new.x and stage2
using.new.x =
predict(stage2, newdata = data.frame(`stage1$fitted.values` = new.x))
#predict new.y using only new.z and stage1 -> stage 2
using.new.z =
predict(stage2, newdata =
data.frame(`stage1$fitted.values` =
predict(stage1, newdata = data.frame(z = new.z))))
sapply(list(using.new.x, using.new.z, new.y), mean)
#actually they give the exact same estimate?
cbind(head(using.new.x), head(using.new.z))
all.equal(using.new.x, using.new.z)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment