% Constrained multivariate normals in Stan % Chris Hanretty % 2014/02/27
Suppose we are interested in recovered the location parameters of a multivariate normal distribution with a known covariance matrix.
Specifically, suppose we are interested in recovering the location parameters given a covariance matrix that forces draws from this multivariate normal to sum to one.
This would be useful for recovering information about `true' latent levels of party support given manifest polling data.
In this case, we could use polling figures (
We could then pass this to Stan. However, Stan turns out to be very strict about the positive-definiteness of the covariance matrices it receives. This note is an aide memoire that this strictness can cause seemingly valid code to choke.
We begin by loading all the libraries we need for this note.
Here,
library(corpcor)
library(MCMCpack)
library(MASS)
library(rstan)
set.seed(1982)
We can now generate some combinations of party support -- about fifty realizations for eight parties should be enough.
nSims <- 50
nParties <- 8
shape <- rdirichlet(1, rep(1, nParties))
shape
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] 0.08621 0.03882 0.0337 0.02104 0.07379 0.673 0.02248 0.05096
party.means <- rdirichlet(nSims, shape)
summary(party.means)
## V1 V2 V3 V4
## Min. :0.0000 Min. :0.0000 Min. :0.0000 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000
## Median :0.0002 Median :0.0000 Median :0.0000 Median :0.0000
## Mean :0.1009 Mean :0.0371 Mean :0.0395 Mean :0.0189
## 3rd Qu.:0.0844 3rd Qu.:0.0001 3rd Qu.:0.0001 3rd Qu.:0.0000
## Max. :0.9205 Max. :0.5739 Max. :0.7923 Max. :0.4899
## V5 V6 V7 V8
## Min. :0.0000 Min. :0.0404 Min. :0.0000 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.:0.3976 1st Qu.:0.0000 1st Qu.:0.0000
## Median :0.0000 Median :0.8040 Median :0.0000 Median :0.0000
## Mean :0.0514 Mean :0.6775 Mean :0.0377 Mean :0.0370
## 3rd Qu.:0.0256 3rd Qu.:0.9679 3rd Qu.:0.0000 3rd Qu.:0.0105
## Max. :0.7460 Max. :1.0000 Max. :0.9399 Max. :0.6055
We now create the covariance matrices described above, and create a separate covariance matrix for each realization of party support.
These we save in the list
compCov <- function(x) {
retval <- outer(x, -x, "*")
diag(retval) <- x * (1 - x)
return(retval)
}
## And now apply to each draw
res <- vector("list", nSims)
for (i in 1:nSims) {
res[[i]] <- compCov(party.means[i, ])
}
We can draw multivariate normal samples using these covariance matrices without any problems in R. Here is just such a draw, and a test to show that the draws are all (approximately) equal to one.
out <- vector("list", nSims)
for (i in 1:nSims) {
out[[i]] <- mvrnorm(n = nSims, mu = party.means[i, ], Sigma = res[[i]])
}
## Do the samples sum to one ?
sum1.test <- unlist(lapply(out, function(x) all.equal(rowSums(x), rep(1, nrow(x)))))
table(sum1.test)
## sum1.test
## Mean relative difference: 1.718e-08 Mean relative difference: 1.814e-08
## 1 1
## TRUE
## 48
The problem arises because not all of these matrices are strictly positive definite for certain tolerance thresholds.
MASS's
## Are all covariance matrices positive-definite?
PosDef.test <- unlist(lapply(res, is.positive.definite))
table(PosDef.test)
## PosDef.test
## FALSE
## 50
We must therefore transform these matrices so that they become positive definite for certain tolerance thresholds. Here, I create two lists of positive definite matrices.
res2 <- lapply(res, make.positive.definite)
res3 <- lapply(res, make.positive.definite, tol = 1e-05)
### How close are these?
cor(unlist(res), unlist(res2))
## [1] 1
cor(unlist(res), unlist(res3))
## [1] 1
cor(unlist(res2), unlist(res3))
## [1] 1
The problem is that Stan will not accept the first of these two lists, because it is too picky. Here is some Stan code that will try and recover the means of these realizations.
stan_code <- '
data{
int<lower=0> k; // size of vectors
int<lower=0> nObs; //
vector[k] y[nObs]; //
cov_matrix[k] Sigma[nObs];
}
parameters {
simplex[k] mu;
}
model {
for (n in 1:nObs)
y[n] ~ multi_normal(mu, Sigma[n]);
}
'
We can estimate this by passing the following data to R.
forStan <- list(k = nParties, nObs = nSims, Sigma = res2, y = party.means)
fit <- stan(model_code = stan_code, data = forStan, iter = 2000, chains = 2)
### Error: Invalid value of Sigma: Error in function
### model32aa4661e444_stan_code_namespace::model32aa4661e444_stan_code(d):
### Sigma[k0__]Sigma[k0__] is not positive definite. Sigma[k0__](0,0) is
### 0.00028837592811670703. failed to create the sampler; sampling not done
... but we would get the message shown above. We must therefore pass the second list of covariance matrices, which Stan just about accepts.
### Try with a slightly different set of cov. mats
forStan <- list(k = nParties, nObs = nSims, Sigma = res3, y = party.means)
fit <- stan(model_code = stan_code, data = forStan, iter = 2000, chains = 2)
##
## TRANSLATING MODEL 'stan_code' FROM Stan CODE TO C++ CODE NOW.
## COMPILING THE C++ CODE FOR MODEL 'stan_code' NOW.
## SAMPLING FOR MODEL 'stan_code' NOW (CHAIN 1).
##
Iteration: 1 / 2000 [ 0%] (Warmup)
Iteration: 200 / 2000 [ 10%] (Warmup)
Iteration: 400 / 2000 [ 20%] (Warmup)
Iteration: 600 / 2000 [ 30%] (Warmup)
Iteration: 800 / 2000 [ 40%] (Warmup)
Iteration: 1000 / 2000 [ 50%] (Warmup)
Iteration: 1200 / 2000 [ 60%] (Sampling)
Iteration: 1400 / 2000 [ 70%] (Sampling)
Iteration: 1600 / 2000 [ 80%] (Sampling)
Iteration: 1800 / 2000 [ 90%] (Sampling)
Iteration: 2000 / 2000 [100%] (Sampling)
## Elapsed Time: 58.52 seconds (Warm-up)
## 50.62 seconds (Sampling)
## 109.14 seconds (Total)
##
## SAMPLING FOR MODEL 'stan_code' NOW (CHAIN 2).
##
Iteration: 1 / 2000 [ 0%] (Warmup)
Iteration: 200 / 2000 [ 10%] (Warmup)
Iteration: 400 / 2000 [ 20%] (Warmup)
Iteration: 600 / 2000 [ 30%] (Warmup)
Iteration: 800 / 2000 [ 40%] (Warmup)
Iteration: 1000 / 2000 [ 50%] (Warmup)
Iteration: 1200 / 2000 [ 60%] (Sampling)
Iteration: 1400 / 2000 [ 70%] (Sampling)
Iteration: 1600 / 2000 [ 80%] (Sampling)
Iteration: 1800 / 2000 [ 90%] (Sampling)
Iteration: 2000 / 2000 [100%] (Sampling)
## Elapsed Time: 63.05 seconds (Warm-up)
## 81.29 seconds (Sampling)
## 144.34 seconds (Total)
In this instance, because there's almost no information, Stan ends up with one mode where mu[6] = 1;
fit
## Inference for Stan model: stan_code.
## 2 chains, each with iter=2000; warmup=1000; thin=1;
## post-warmup draws per chain=1000, total post-warmup draws=2000.
##
## mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
## mu[1] 0.0 0.0 0.0 0.0 0 0.0 0.0 0.0 1304 1
## mu[2] 0.0 0.0 0.0 0.0 0 0.0 0.0 0.0 1082 1
## mu[3] 0.0 0.0 0.0 0.0 0 0.0 0.0 0.0 1311 1
## mu[4] 0.0 0.0 0.0 0.0 0 0.0 0.0 0.0 1404 1
## mu[5] 0.0 0.0 0.0 0.0 0 0.0 0.0 0.0 1500 1
## mu[6] 1.0 0.0 0.0 1.0 1 1.0 1.0 1.0 1276 1
## mu[7] 0.0 0.0 0.0 0.0 0 0.0 0.0 0.0 1541 1
## mu[8] 0.0 0.0 0.0 0.0 0 0.0 0.0 0.0 1380 1
## lp__ -111.9 0.1 2.1 -117.1 -113 -111.6 -110.3 -108.9 433 1
##
## Samples were drawn using NUTS(diag_e) at Thu Feb 27 15:20:51 2014.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at
## convergence, Rhat=1).