@NaimKabir
Last active November 25, 2020 08:52
---
title: "Significance Figs"
output: html_notebook
---
```{r}
library(tidyverse)
library(ggthemes)
library(gridExtra)
```
Learning a bit of R to take advantage of `ggplot`, since it looks neat.
In one figure, I'd like to illustrate the equation for Shiva's chosen measure, which is:
$$ y = -x + Z; Z \sim D $$
Where $x$ is the percent of straight ticket Republican voters in a precinct, and $Z$ is a random variable representing the percent of split ticket voters voting for Trump, drawn from some distribution $D$.
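
As a quick sanity check on the shape of the feasible region (a sketch, assuming only that both $x$ and $Z$ are percentages in $[0, 100]$): since $y = -x + Z$, each $x$ admits $-x \le y \le 100 - x$, which traces a parallelogram with corners $(0, 0)$, $(0, 100)$, $(100, 0)$ and $(100, -100)$. The helper `inBounds` and the `corner*`/`check*` variables are hypothetical names introduced only for this illustration.

```{r}
# Hypothetical helper: TRUE where (x, y) lies inside the feasible parallelogram
inBounds <- function(x, y) {
  x >= 0 & x <= 100 & y >= -x & y <= 100 - x
}

# The four corners of the region should all be feasible
cornerX <- c(0, 100, 100, 0)
cornerY <- c(100, 0, -100, 0)
all(inBounds(cornerX, cornerY))

# Z = 100 sits on the upper edge and is feasible; Z = 101 is not
checkX <- seq(0, 100, length.out = 11)
all(inBounds(checkX, -checkX + 100))
any(inBounds(checkX, -checkX + 101))
```

Every point plotted from a valid $Z$ should pass this check, so it can serve as a guard on simulated data.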
```{r}
# Get boundaries for all data given the structure of the equation and the limitation that percents must be within
# [0,100]
bounds <- data.frame(
x = c(0, 100, 100, 0),
y = c(100, 0, -100, 0)
)
# Simulate Z drawn from a point mass at 0 (a delta distribution), so y = -x
numPoints <- 100
linspace <- seq(1,100,length.out=numPoints)
scatter <- data.frame(
x = linspace,
y = -1*linspace
)
zeroes <- geom_point(data=scatter, mapping=aes(x=x, y=y))
# Helper: build (x, y) points from a vector of Z draws, using y = -x + Z
getData <- function(Z) {
  d <- data.frame(
    x = linspace,
    y = -1*linspace + Z
  )
  return(d)
}
# Draw Z from the uniform distribution on [0, 100]
uniformZ <- runif(numPoints, min=0, max=100)
uniform <- geom_point(data=getData(uniformZ), mapping=aes(x=x, y=y))
# Draw from a linear model with normally distributed noise and arbitrary slope/intercept,
# but reject samples outside the possible bounds.
sampleLinearZ <- function(slope, intercept, sd, numPoints) {
  z <- slope*linspace + intercept + rnorm(numPoints, mean=0, sd=sd)
  # Redraw any value outside [0, 100] until every value is within bounds.
  # Both bounds are checked together, so a redraw that fixes one bound
  # cannot silently violate the other.
  idx <- z < 0 | z > 100
  while (any(idx)) {
    z[idx] <- slope*linspace[idx] + intercept + rnorm(sum(idx), mean=0, sd=sd)
    idx <- z < 0 | z > 100
  }
  return(z)
}
lineDownZ <- sampleLinearZ(0.5, 10, 20, numPoints)
linearDown <- geom_point(data=getData(lineDownZ), mapping=aes(x=x, y=y))
lineUpZ <- sampleLinearZ(1.5, 10, 20, numPoints)
linearUp <- geom_point(data=getData(lineUpZ), mapping=aes(x=x, y=y))
# Build the four panels on top of the shaded region of possible (x, y) values
p <- ggplot(bounds, aes(x = x, y = y)) + geom_polygon(alpha=0.15) + theme_light()
p1 <- p + zeroes + labs(tag = "A")
p2 <- p + uniform + labs(tag = "B")
p3 <- p + linearDown + labs(tag = "C")
p4 <- p + linearUp + labs(tag = "D")
grid.arrange(p1, p2, p3, p4, nrow = 2, ncol = 2)
# Save the 2x2 grid to a PNG file
g <- arrangeGrob(p1, p2, p3, p4, nrow = 2, ncol = 2)
ggsave(filename="4fig-ayyadurai.png", plot=g)
```
**A:** $Z = 0$. Baseline extreme assumption: the split-ticket Trump vote percentage is 0 in every precinct.

**B:** $Z \sim U(0, 100)$. The split-ticket Trump vote percentage is drawn uniformly at random.

**C:** $Z = 0.5x + 10 + \epsilon$, with $\epsilon \sim \mathcal{N}(0, 20^2)$ and draws restricted to $0 \le Z \le 100$. Assumes a linear relationship between the straight-ticket Republican percentage in a precinct and the split-ticket Trump percentage, insofar as that relationship yields possible values.

**D:** $Z = 1.5x + 10 + \epsilon$, with $\epsilon \sim \mathcal{N}(0, 20^2)$ and draws restricted to $0 \le Z \le 100$. Assumes a steeper linear relationship between the straight-ticket Republican percentage and the split-ticket Trump percentage, insofar as that relationship yields possible values.
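
The restriction to possible values in panels C and D bites harder as $x$ grows. A rough way to see this, assuming normal noise with standard deviation 20 as in the code above, is to compute the chance that a single unrestricted draw already lands in $[0, 100]$. The function name `acceptProb` and the variables `xs`, `probC` and `probD` are made up for this sketch.

```{r}
# Hypothetical helper: probability that slope*x + intercept + noise falls in [0, 100]
# when noise ~ Normal(0, sd)
acceptProb <- function(x, slope, intercept, sd) {
  mu <- slope * x + intercept
  pnorm(100, mean = mu, sd = sd) - pnorm(0, mean = mu, sd = sd)
}

xs <- c(1, 50, 100)
probC <- acceptProb(xs, slope = 0.5, intercept = 10, sd = 20)  # panel C parameters
probD <- acceptProb(xs, slope = 1.5, intercept = 10, sd = 20)  # panel D parameters
data.frame(x = xs, panelC = probC, panelD = probD)
```

At $x = 100$ in panel D the unrestricted mean is 160, three standard deviations above the upper bound, so nearly every draw is rejected and the accepted values pile up just under 100.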