@m-Py
Last active May 13, 2020 08:05
Function to generate bivariate normal data with specified correlation
## Year 2019 - 2020
## Author: Martin Papenberg
## This code is in the public domain, do with it whatever you like.
# Generate bivariate normal data with specified correlation
# param n: how many data points
# param mx: the mean of the first variable
# param my: the mean of the second variable
# param sdx: the standard deviation of the first variable
# param sdy: the standard deviation of the second variable
# param r: the correlation between the two measures (in the sample or in
# the population, depending on `empirical`)
# param empirical: Boolean; does r refer to the correlation in the sample
# (-> empirical = TRUE) or the population (-> empirical = FALSE)
#
# return: the data set (two columns named `x` and `y` with random normal
# data generated via MASS::mvrnorm)
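paired_data <- function(n, mx = 0, my = 0, sdx = 1, sdy = 1, r, empirical = FALSE) {
  # Correlation matrix of the two variables
  cor_matrix <- matrix(c(1, r, r, 1), ncol = 2)
  # Outer product of the standard deviations: variances on the diagonal,
  # sdx * sdy off the diagonal
  sds <- c(sdx, sdy)
  vars <- sds %*% t(sds)
  # Element-wise product turns the correlation matrix into a covariance matrix
  cov_matrix <- vars * cor_matrix
  data <- MASS::mvrnorm(
    n,
    mu = c(mx, my),
    Sigma = cov_matrix,
    empirical = empirical
  )
  data <- data.frame(data)
  colnames(data) <- c("x", "y")
  data
}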
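The following is a minimal sketch of how `paired_data()` builds its covariance matrix and of what the `empirical` argument changes. It calls `MASS::mvrnorm` directly so that it runs on its own. The values of `sdx`, `sdy` and `r`, the sample size, and the names `sample_exact` and `sample_random` are illustrative assumptions, not part of the original gist.

```r
# Hypothetical example values (not taken from the gist)
sdx <- 2
sdy <- 5
r <- 0.3

# Same construction as in paired_data(): outer product of the standard
# deviations, multiplied element by element with the correlation matrix
cor_matrix <- matrix(c(1, r, r, 1), ncol = 2)
sds <- c(sdx, sdy)
cov_matrix <- (sds %*% t(sds)) * cor_matrix

# The diagonal holds the variances, the off-diagonal holds r * sdx * sdy
print(cov_matrix)

set.seed(1)

# empirical = TRUE: the sample reproduces the requested covariance matrix
# (and therefore r) up to floating point error
sample_exact <- MASS::mvrnorm(100, mu = c(0, 0), Sigma = cov_matrix, empirical = TRUE)
print(cor(sample_exact)[1, 2])

# empirical = FALSE: r describes the population, so the sample correlation
# varies from draw to draw
sample_random <- MASS::mvrnorm(100, mu = c(0, 0), Sigma = cov_matrix, empirical = FALSE)
print(cor(sample_random)[1, 2])
```

Setting `empirical = TRUE` is useful when a data set must show an exact correlation, for example in teaching material. `empirical = FALSE` is the appropriate choice for simulations in which sampling error should be preserved.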