bgall / covariate_adjustment_bias.R
Last active February 9, 2023 22:00
Show bias in covariate adjustment and linear combination "issue"
library(dplyr)
# Wrap data simulation in a function
sim_data <- function(N) {
###########################################################################
# parameters
###########################################################################
set.seed(123)
bgall / benchmarking_dummy_functions
Last active May 19, 2022 08:02
Benchmarking fastDummies::dummy_cols() against modeldb::add_dummy_variables()
# Compare the performance of two purportedly fast ways of generating dummy variables from a character vector.
# NOTE: fastDummies retains the original variable by default while modeldb does not
# Dependencies
library(microbenchmark)
library(fastDummies)
library(modeldb)
library(dplyr)
# Simulate data: 1 million rows, 1 variable with 26 unique values
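The preview stops at the simulation step. A minimal sketch of what that step could look like is below. The names `dat`, `letter`, and `n_rows` are assumptions for illustration, and the row count is reduced from the 1 million named above so the sketch runs quickly; it is not the gist's own code.

```r
# Hypothetical sketch: one character variable with 26 unique values.
# n_rows is deliberately smaller than the 1 million rows described above.
set.seed(123)
n_rows <- 1e5

dat <- data.frame(
  letter = sample(letters, size = n_rows, replace = TRUE),
  stringsAsFactors = FALSE
)

# Sanity checks before benchmarking anything on these data
nrow(dat)
length(unique(dat$letter))
```

A data frame like `dat` could then be passed to each dummy-variable function inside `microbenchmark()`.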
bgall / social_desirability_bias.R
Last active February 5, 2020 18:39
Social Desirability can produce incorrect inferences about the sign of an effect
# Show that social desirability not only produces biased
# "descriptive statistics" (e.g. group means), but can
# produce treatment effects of the opposite sign of
# the true effect. You cannot simply ignore social
# desirability under the assumption that it does
# not affect estimates of causal parameters since
# measures pre-treatment and post-treatment are
# both subject to social desirability.
#
# Note that this arises due to social desirability
bgall / post-hoc-power-plants-mental.r
Last active January 6, 2020 15:26
Calculate post-hoc power of study of the effect of indoor plants on mental health
set.seed(123)
##############################################
# Study parameters
##############################################
# Sample size
n <- 63
# Mean of outcome
bgall / create_predictors.R
Last active December 1, 2019 21:23
Wrapper for create_attributes, create_design_df, and create_value_dummies
################################################
# Define function: create_predictors
# Creates randomly generated data sets containing
# all design variables, attributes, and dummy
# variables for the right-hand side of the conjoint
# analysis (excludes the outcome variable),
# based on specified data parameter values.
# Data are "long," such that each row of the data
# is a conjoint profile.
#
bgall / create_value_dummies.R
Created December 1, 2019 21:01
Function to generate dummy variables from the possible sets of values
#########################################################
# Define function: create_value_dummies
# Create dummy variables indicating if a given
# profile takes on a specific value from its
# possible values. Create a dummy for the range
# of the variable, independent of whether the actual
# random sampling does (not) draw that value and assign
# it to a profile.
#########################################################
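The idea in the header comment, one dummy per possible value whether or not that value was actually drawn, can be sketched as follows. The function name `make_value_dummies` and its arguments are hypothetical; this is an illustration of the technique, not the gist's implementation.

```r
# Hypothetical sketch, not the gist's implementation.
# `drawn` holds the values actually assigned to profiles;
# `possible` is the full set of values the variable can take.
make_value_dummies <- function(drawn, possible, prefix = "value") {
  # Loop over the possible values (not the observed ones) so that a
  # value that was never drawn still gets its own all-zero column.
  cols <- lapply(possible, function(v) as.integer(drawn == v))
  names(cols) <- paste(prefix, possible, sep = "_")
  as.data.frame(cols)
}

drawn    <- c("low", "high", "low")
possible <- c("low", "medium", "high")

dummies <- make_value_dummies(drawn, possible)
dummies
```

Here "medium" is never drawn, yet `value_medium` still appears as a column of zeros, which keeps the set of columns identical across simulated data sets.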
bgall / create_attributes.R
Last active December 1, 2019 20:49
Function to generate arbitrary conjoint attributes whose values are selected with specified probabilities
##############################################################
# Define function: create_attributes
#
# Creates randomly generated vectors of values from sets
# of potential values for a specified number of attributes
#
# *Arguments*
#
# attr_names (optional)
# vector of attribute names of attr_n length. If no values
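The argument list is cut off in this preview, so the sketch below only illustrates the general technique the description names: drawing attribute values with specified selection probabilities. The helper `draw_attribute`, its arguments, and the example attributes are all hypothetical.

```r
# Hypothetical sketch: draw values for one attribute with given probabilities.
# `values` should be a character vector; sample() treats a single number n
# as 1:n, which is not what is wanted here.
draw_attribute <- function(n_profiles, values, probs = NULL) {
  # probs = NULL means every value is equally likely
  sample(values, size = n_profiles, replace = TRUE, prob = probs)
}

set.seed(123)
attrs <- data.frame(
  price = draw_attribute(1000, c("low", "high"), probs = c(0.8, 0.2)),
  color = draw_attribute(1000, c("red", "green", "blue")),
  stringsAsFactors = FALSE
)

# Observed shares should be close to the requested probabilities
prop.table(table(attrs$price))
```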
bgall / create_design_df.R
Last active December 1, 2019 20:46
Creates "skeleton" data frame of conjoint parameters.
#########################################################
# Define function: create_design_df
#
# Description: Generates a data frame with participant
# IDs, choice set ID, profile ID, for specified numbers of
# participants, choice tasks per participant, and profiles
# per choice task
#
# Arguments:
# N = # of participants
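A skeleton of this kind can be built by crossing the three ID sequences. The sketch below assumes hypothetical names (`make_design_df`, `n_tasks`, `n_profiles`, and the column names); only `N` comes from the comment above.

```r
# Hypothetical sketch; every name except N is an assumption.
make_design_df <- function(N, n_tasks, n_profiles) {
  # expand.grid() varies its first argument fastest, so listing
  # profile first yields rows ordered by participant, then task, then profile.
  design <- expand.grid(
    profile_id     = seq_len(n_profiles),
    task_id        = seq_len(n_tasks),
    participant_id = seq_len(N)
  )
  # Put the columns in participant / task / profile order
  design[, c("participant_id", "task_id", "profile_id")]
}

design <- make_design_df(N = 3, n_tasks = 2, n_profiles = 2)
nrow(design)  # 3 * 2 * 2 = 12 rows, one per conjoint profile
```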
bgall / load_pwrc_pkgs.R
Created December 1, 2019 20:29
Load packages for simulation-based power calculation
#########################################################################
# 'loadpkg' function
# Checks whether each package in a vector is installed and installs
# any that are missing. Then loads all packages in the vector.
#########################################################################
loadpkg <- function(toLoad){
for(lib in toLoad){
if(! lib %in% installed.packages()[,1]) {
install.packages(lib, repos='http://cran.rstudio.com/')
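The preview cuts off before the function ends. A self-contained sketch of the same check, install, then load pattern is shown below. The name `load_pkgs`, the `repos` default, and the use of `requireNamespace()` in place of `installed.packages()` are assumptions, not the gist's code.

```r
# Hypothetical sketch of the check / install / load pattern.
load_pkgs <- function(pkgs, repos = "https://cloud.r-project.org") {
  for (pkg in pkgs) {
    # requireNamespace() returns FALSE when the package is not installed
    if (!requireNamespace(pkg, quietly = TRUE)) {
      install.packages(pkg, repos = repos)
    }
    # character.only = TRUE lets library() take the name from a variable
    library(pkg, character.only = TRUE)
  }
  invisible(pkgs)
}

# Usage with packages that ship with R, so nothing is downloaded
load_pkgs(c("stats", "utils"))
```

Checking one package at a time with `requireNamespace()` avoids building the full `installed.packages()` matrix on every loop iteration, which can be slow on machines with large libraries.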
#########################################################################################
# This gist contains a quick walk-through of several ways to produce scales capturing
# the average value of one or more variables. Each row (observation) gets its own
# value. We'll assume your data are not fully "tidy." What I mean by this is that you
# have an observation for each row and you want to calculate that observation's value
# on the scale, but each variable that should go into your scale is in its own column.
#########################################################################################
#########################################################################################
# Set-up (packages, fake data, etc.)
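One of the simplest routes to such a scale is a row-wise mean over the item columns. The sketch below uses base R only; the data frame `survey` and the column names are made up for illustration and are not the fake data the gist goes on to create.

```r
# Hypothetical fake data: one row per observation, one column per item.
survey <- data.frame(
  id    = 1:3,
  item1 = c(1, 4, 5),
  item2 = c(2, NA, 5),
  item3 = c(3, 2, 5)
)

# Columns that make up the scale
items <- c("item1", "item2", "item3")

# Each row gets the mean of its own item values; na.rm = TRUE averages
# over whichever items are observed instead of returning NA.
survey$scale_mean <- rowMeans(survey[, items], na.rm = TRUE)

survey$scale_mean  # 2 3 5
```

Whether to average over the available items or to return `NA` for incomplete rows is a substantive choice; setting `na.rm = FALSE` gives the stricter behavior.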