apoorvalal / neymanAllocationStrata.R
Last active August 22, 2022 23:04
Compute optimal treatment assignment for treatment effect precision
#' Compute Neyman allocation propensity scores for inference-optimal treatment assignment in data table [very fast]
#' @param df data.table
#' @param y outcome name
#' @param w treatment name
#' @param x covariate names (must all be discrete)
#' @return data.table with strata level conditional means, variances, propensity scores,
#' and neyman allocation propensities.
#' @export
neymanAllocation = function(df, y, w, x){
df1 = copy(df); N = nrow(df1)
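The preview above is cut off, so the following is only a rough sketch of the idea rather than the gist's implementation. It assumes the standard Neyman allocation rule, where the treated share within each stratum is sd1 / (sd1 + sd0), the ratio of the treated-arm outcome standard deviation to the sum of both arms' standard deviations. The function name, the `(stratum, w, y)` row layout, and the even-split fallback are all hypothetical.

```python
from collections import defaultdict
from statistics import stdev


def neyman_allocation(rows):
    """Map each stratum to the Neyman treatment share sd1 / (sd1 + sd0).

    rows: iterable of (stratum, w, y) with w in {0, 1}. This layout is a
    hypothetical stand-in, not the data.table interface of the R function.
    """
    by_stratum = defaultdict(lambda: {0: [], 1: []})
    for stratum, w, y in rows:
        by_stratum[stratum][w].append(y)

    shares = {}
    for stratum, arms in by_stratum.items():
        # Sample standard deviation of the outcome in each arm
        # (requires at least two observations per arm).
        sd1, sd0 = stdev(arms[1]), stdev(arms[0])
        total = sd1 + sd0
        # Assumption: fall back to an even split when neither arm varies.
        shares[stratum] = sd1 / total if total > 0 else 0.5
    return shares


rows = [
    ("a", 1, 0.0), ("a", 1, 4.0), ("a", 0, 0.0), ("a", 0, 2.0),
    ("b", 1, 1.0), ("b", 1, 3.0), ("b", 0, 1.0), ("b", 0, 3.0),
]
shares = neyman_allocation(rows)
```

In stratum "a" the treated outcomes are twice as dispersed as the control outcomes, so the rule assigns two thirds of that stratum to treatment; stratum "b" has equal dispersion and gets an even split.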
apoorvalal / omnibusTestsOfHeterogeneity.R
Created August 12, 2022 02:33
Omnibus tests of treatment effect heterogeneity with linear and nonlinear hetfx estimators.
rm(list = ls())
libreq(data.table, estimatr,
DoubleML, mlr3, mlr3learners, dmlUtils)
# %% linear effect heterogeneity
dfm_omnibus = function(y, w, X){
n1 = sum(w); n0 = sum(1-w); K = ncol(X)
# separate outcome models
m1 =[w==1,], y[w==1]); m0 =[w==0,], y[w==0])
apoorvalal /
Last active May 5, 2022 19:57
example of text parsing using the Gettysburg Address.
# %%
from bs4 import BeautifulSoup
from urllib import request
import re
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize
from collections import Counter
apoorvalal / Thompson.R
Created March 18, 2022 17:23
Thompson sampling minimal example
rm(list = ls())
libreq(data.table, ggplot2)
# %%
thompson = function(n, K, reward_probs){
# init choices and reward vectors
choices <- rewards <- rep(NA, n)
# (n+1) x 2K matrix of S and F counts: successes stored in the first K columns, failures in the next K
s_f = matrix(NA, nrow = n+1, K * 2) # +1 to accommodate last update step
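The preview stops before the sampling loop. As a rough illustration of what such a loop typically looks like, here is a minimal Beta-Bernoulli Thompson sampler. It reuses the `n`, `K`, `reward_probs` argument names from the R signature, but the uniform Beta(1, 1) prior, the `seed` argument, and the list-based bookkeeping (instead of the `s_f` matrix) are assumptions of this sketch, not the gist's code.

```python
import random


def thompson(n, K, reward_probs, seed=0):
    """Run n rounds of Beta-Bernoulli Thompson sampling over K arms."""
    rng = random.Random(seed)
    successes = [0] * K  # S counts per arm
    failures = [0] * K   # F counts per arm
    choices, rewards = [], []
    for _ in range(n):
        # One draw per arm from its Beta(S + 1, F + 1) posterior
        # (assumes a uniform prior on each arm's reward probability).
        draws = [rng.betavariate(successes[k] + 1, failures[k] + 1)
                 for k in range(K)]
        arm = draws.index(max(draws))  # play the arm with the largest draw
        reward = int(rng.random() < reward_probs[arm])
        successes[arm] += reward
        failures[arm] += 1 - reward
        choices.append(arm)
        rewards.append(reward)
    return choices, rewards, successes, failures


choices, rewards, successes, failures = thompson(
    n=1000, K=2, reward_probs=[0.1, 0.9]
)
```

With arms this far apart, the sampler should concentrate its pulls on the second arm after a short exploration phase.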
apoorvalal / HC0_4_manual.R
Created February 14, 2022 18:29
manual implementation of HC0-HC4 robust standard errors for reference
library(car); library(sandwich)
# %%
fit = lm(price ~ mpg + weight, data = auto)
X = model.matrix(fit); n = nrow(X); k = ncol(X)
e = resid(fit)
A = crossprod(X)
H = X %*% solve(A) %*% t(X); h_ii = diag(H)
# cla = (t(e) %*% e)/(n-k) %*% solve(t(X) %*% X)
CLA = as.numeric(crossprod(e)/(n-k)) * solve(A)
apoorvalal / deltaMethodExamples.R
Created February 13, 2022 18:11
Hypothesis testing on linear and nonlinear combinations of coefficients using the Delta method
# %%
library(car); library(sandwich)
m1 <- lm(time ~ t1 + t2, data = Transact)
m1 |> summary()
vcovmat = vcovHC(m1)
# %%
deltaMethod(m1, "t1/t2")
deltaMethod(m1, "t1/t2", vcov = vcovmat)
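For reference, the calculation behind a ratio such as `t1/t2` can be written out by hand. The sketch below assumes the usual first-order delta method: with g(b1, b2) = b1 / b2, the gradient is (1 / b2, -b1 / b2^2) and the variance is approximated by grad' V grad. The function name and the numbers are illustrative only; they are not estimates from the `Transact` data.

```python
import math


def delta_method_ratio(b1, b2, vcov):
    """Estimate and delta-method standard error of b1 / b2.

    vcov is the 2x2 covariance matrix of (b1, b2) as nested lists.
    """
    estimate = b1 / b2
    # Gradient of g(b1, b2) = b1 / b2 evaluated at the estimates.
    grad = (1.0 / b2, -b1 / b2 ** 2)
    # Quadratic form grad' V grad.
    variance = sum(
        grad[i] * vcov[i][j] * grad[j] for i in range(2) for j in range(2)
    )
    return estimate, math.sqrt(variance)


# Made-up coefficients and covariance matrix, for illustration only.
estimate, se = delta_method_ratio(2.0, 4.0, [[0.04, 0.0], [0.0, 0.16]])
```

Passing a heteroskedasticity-robust covariance matrix in place of the classical one changes only the `vcov` argument, which mirrors the second `deltaMethod` call above.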
apoorvalal / OaxacaBlinderATTEstimation.R
Created February 12, 2022 08:11
Implementation of OB treatment effect estimators
rm(list = ls())
libreq(data.table, fixest, rio, ggplot2, ebal)
# %%
cps3 = import("cps3re74.dta") |> setDT()
cps3 = cps3 |> na.omit()
setnames(cps3, c("re78", "treat"), c("y", "d"))
xs = c("age", "age2", "ed", "black", "hisp", "married", "nodeg", "re74", "re75")
apoorvalal / eur_leagues_competitiveness.R
Last active December 18, 2021 21:06
construct measures of competitiveness of european leagues using match level data from
# %% ####################################################
rm(list = ls())
LalRUtils::libreq(ggplot2, data.table, fixest, magrittr,
patchwork, IRdisplay, did, panelView, plotly)
options(repr.plot.width=12, repr.plot.height=9)
options(ggplot2.discrete.fill = RColorBrewer::brewer.pal(7, "Set2"))
options(ggplot2.discrete.colour = RColorBrewer::brewer.pal(7, "Set2"))
options(ggplot2.continuous.fill = "viridis"); options(ggplot2.continuous.colour = "viridis")

Figlet for code signposting

  • Data prep code is long, horrible, and difficult to wrap in functions
  • Typical data-scraping/cleaning scripts end up being thousands of lines long
  • Parsing these scripts is a chore for future-you / other people
  • Do yourself/others a favour by clearly signposting your code (via the minimap feature in most modern editors)
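One way to turn a banner into a signpost is to comment out each of its lines so it can be pasted above a code section. The helper names below are hypothetical, and the sketch assumes only that the `figlet` command prints a banner for the text given as its argument; when `figlet` is not on the PATH, it falls back to the plain text.

```python
import shutil
import subprocess


def render_banner(text):
    """Render text with figlet if it is installed, else return it unchanged."""
    if shutil.which("figlet") is None:
        return text
    result = subprocess.run(
        ["figlet", text], capture_output=True, text=True, check=True
    )
    return result.stdout


def as_comment_block(banner, prefix="# "):
    """Prefix every banner line with a comment marker, trimming trailing spaces."""
    lines = banner.rstrip("\n").split("\n")
    return "\n".join((prefix + line).rstrip() for line in lines)


signpost = as_comment_block(render_banner("data prep"))
```

The resulting block is tall enough to stand out in an editor minimap, which is the point of the signposting.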

example of minimap in Atom

  • To do this, use the trusty old unix tool figlet (available on all
apoorvalal / event_study_comparison.R
Created August 23, 2021 22:50
comparing estimators for event studies
rm(list = ls())
libreq(data.table, fixest, did2s, ggplot2)
# %%
simulate = function() {
# dimensions
N = 250
T = 35