Kelly Finn's Monkey Data - McElreath's Model
library(rethinking) # requires rstan, inline, Rcpp
# Predict: y in {0, 1} (a 0/1 outcome)
# With predictor: x
m_glm <- glm(y ~ x, data = your_data, family = binomial) # the binomial family defaults to a logit link. This is essentially the same as Emilio's model, except Emilio included a green effect qualifier
# Mathematical form:
# y_i ~ binomial(1, p_i)
# logit(p_i) = a + b*x_i
# a ~ normal(0, 100)
# b ~ normal(0, 100)
# the (inverse) logit link transforms the predictor space into values between 0 and 1, for probabilities
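# A quick base-R sketch of the link: plogis() is the inverse logit and qlogis() the logit
# (the eta values here are made up for illustration):
eta <- c(-2, 0, 2)   # hypothetical linear-predictor values a + b*x_i
plogis(eta)          # squashed onto (0, 1): about 0.12, 0.50, 0.88
qlogis(plogis(eta))  # the logit undoes it, recovering -2, 0, 2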
# To specify this model within a Bayesian framework (the same idea exists in the frequentist framework too, under the name penalized likelihood), you have to choose a prior.
# glm() uses flat priors (i.e., no likelihood penalty), which is not the best thing to do: a flat prior says you don't know ANYTHING
# about the possibilities of monkey behavior. You can nearly always do better than a flat prior.
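# A sketch of the same model with the priors made explicit, using rethinking's map2stan()
# (assumes your_data is a data frame with a 0/1 column y and a numeric column x;
# the priors match the mathematical form above):
m_glm_bayes <- map2stan(
  alist(
    y ~ dbinom(1, p),
    logit(p) <- a + b*x,
    a ~ dnorm(0, 100),
    b ~ dnorm(0, 100)
  ),
  data = your_data
)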
# Now with varying intercepts:
# Grouping variable: group
library(lme4) # glmer() lives in lme4
m_glmm <- glmer(y ~ (1|group) + x, data = your_data, family = binomial) # you cannot change the priors, but there are modifying packages (like blme)
# that allow you to change the priors while still using glmer
# (1|group) is how you specify varying effects: it says intercepts (1) are conditional (|) on group, so you get an intercept for each group. The parentheses mean there will be pooling. You know that x is a blue (fixed) effect because it's outside the parentheses.
# These parenthesis conventions are specific to lme4.
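# A minimal blme sketch, assuming the same data: bglmer() is a drop-in replacement for
# glmer() that adds priors (here a normal prior on the fixed effects):
library(blme)
m_bglmm <- bglmer(y ~ (1|group) + x, data = your_data, family = binomial,
                  fixef.prior = normal)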
# Mathematical form of the second model:
# y_i ~ binomial(1, p[i])
# logit(p[i]) = a[group[i]] + b*x[i]  # read a[group[i]] as: the intercept for the group on row i
# a[group] ~ normal(a_bar, sigma_group)  # assign a common distribution with a new parameter a_bar, the average intercept, and a standard deviation sigma_group (which you learn from the groups)
# a_bar ~ normal(0, 10)
# sigma_group ~ cauchy(0, 2)
# b ~ normal(0, 100)  # the slope is still a blue (fixed) effect
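# A sketch of this varying-intercepts model in map2stan(), mirroring the lines above
# (assumes group is coded as consecutive integers 1, 2, 3, ... as in the example data below):
m_glmm_bayes <- map2stan(
  alist(
    y ~ dbinom(1, p),
    logit(p) <- a[group] + b*x,
    a[group] ~ dnorm(a_bar, sigma_group),
    a_bar ~ dnorm(0, 10),
    sigma_group ~ dcauchy(0, 2),
    b ~ dnorm(0, 100)
  ),
  data = your_data
)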
# Example data: this might help you understand the a[group[i]] indexing in the model specification above
d <- data.frame(i = 1:7, y = c(1, 0, 0, 1, 1, 0, 1), group = c(1, 1, 2, 2, 2, 3, 3))
d
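# To see the a[group[i]] indexing in action, index some made-up group intercepts by the group column:
a <- c(0.2, -0.5, 1.0)  # hypothetical intercepts, one per group
a[d$group]              # the intercept used on each of the 7 rows: 0.2 0.2 -0.5 -0.5 -0.5 1.0 1.0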
# The flat priors used by lme4 are often okay, but often they are not: if you get a varying effects estimate of 0 for any of your clusters,
# that's the estimation technique failing. Don't publish a varying effects estimate of 0. Flat priors will overfit your sample, and you
# don't expect the future to be exactly like your sample. Regularization (which is what priors let you do) guards against overfitting the sample.