Instantly share code, notes, and snippets.

Embed
What would you like to do?
# multivariate normal example using the MASS package
require(MASS)
#make this reproducible
set.seed(2)
#how many draws in our starting data set?
n <- 1e4
# how many draws do we want from this distribution?
drawCount <- 1e4
myData <- rnorm(n, 5, .6)
yourData <- myData + rnorm(n, 8, .25)
hisData <- myData + rnorm(n, 6, .4)
herData <- hisData + rnorm(n, 8, .35)
ourData <- data.frame(myData, yourData, hisData, herData)
# now we have raw correlations in the 70s, 80s, and 90s. Funny how that
# works out
cor(ourData)
#build the mvrnorm and take draws
#this replaces all my normalizing and
#copula building in the previous example
myDraws <- mvrnorm(1e5, mu=mean(ourData),
Sigma=cov(ourData)
)
myDraws <- data.frame(myDraws)
#check the mean and sd
apply(myDraws, 2, mean)
apply(myDraws, 2, sd)
#let's look at the mean and standard dev of the starting data
apply(ourData, 2, mean)
apply(ourData, 2, sd)
# so myDraws contains the final draws
# let's check Kolmogorov-Smirnov between the starting data
# and the final draws
for (i in 1:ncol(ourData)){
print(ks.test(myDraws[[i]], ourData[[i]]))
}
#look at the correlation matrices
cor(myDraws)
cor(ourData)
#it's fun to plot the variables and see if the PDFs line up
#It's a good sanity check. Using ggplot2 to plot
require(ggplot2)
# rearrange the data to be "tall" not "wide"
meltDraws <-melt(myDraws)
meltDraws$dataSource <- "simulated"
meltData <- melt(ourData)
meltData$dataSource <- "original"
plotData <- rbind(meltData, meltDraws)
qplot(value, colour=dataSource, data=plotData, geom="density")+ facet_wrap(~variable)
@DataPsycho

This comment has been minimized.

Copy link

DataPsycho commented Jan 7, 2015

Ok I am not so much experienced. But in "mvrnorm" , mean should be a vector. The syntax "mean" cant find the mean for a data frame like "ourData". So an error should be shown when running this code.

@sobradob

This comment has been minimized.

Copy link

sobradob commented Dec 5, 2016

Can confirm I get an error.

Error in mvrnorm(1e+05, mu = mean(ourData), Sigma = cov(ourData)) : 
  incompatible arguments
In addition: Warning message:
In mean.default(ourData) : argument is not numeric or logical: returning NA
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment