Last active
November 25, 2020 08:52
-
-
Save NaimKabir/1d14e7cc636f4e398355457e702bc106 to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
--- | |
title: "Significance Figs" | |
output: html_notebook | |
--- | |
```{r} | |
library(tidyverse) | |
library(ggthemes) | |
library(gridExtra) | |
``` | |
Learning a bit of R to take advantage of `ggplot`, since it looks neat. | |
In one figure, I'd like to demonstrate that the equation for Shiva's chosen measure, which is: | |
$$ y = -x + Z; Z \sim D $$ | |
Where $x$ is the percent of straight ticket Republican voters in a precinct, and $Z$ is a random variable representing the percent of split ticket voters voting for Trump, drawn from some distribution $D$. | |
```{r} | |
# Get boundaries for all data given the structure of the equation and the limitation that percents must be within | |
# [0,100] | |
bounds <- data.frame( | |
x = c(0, 100, 100, 0), | |
y = c(100, 0, -100, 0) | |
) | |
# Simulate drawing from the delta(X) distribution | |
numPoints <- 100 | |
linspace <- seq(1,100,length=numPoints) | |
scatter <- data.frame( | |
x = linspace, | |
y = -1*linspace | |
) | |
zeroes <- geom_point(data=scatter, mapping=aes(x=x, y=y)) | |
# Draw from the uniform distribution [0, 100] | |
getData <- function(Z) { | |
d <- data.frame( | |
x = linspace, | |
y = -1*linspace + Z | |
) | |
return(d) | |
} | |
uniformZ = runif(numPoints, min=0, max=100) | |
uniform <- geom_point(data=getData(uniformZ), mapping=aes(x=x, y=y)) | |
# Draw from a linear model with normally distributed noise and arbitrary slope/intercept-- | |
# but reject samples outside the possible bounds. | |
lm <- function(slope, intercept, sd, numPoints) { | |
lm_ <- slope*linspace + intercept + rnorm(numPoints, mean=0, sd=sd) | |
# reject samples to avoid vals that exceed our boundaries | |
while (any(lm_ < 0)) { | |
idx <- lm_ < 0 | |
lm_[idx] = slope*linspace[idx] + intercept + rnorm(length(lm_[idx]), mean=0, sd=sd) | |
} | |
while (any(lm_ > 100)) { | |
idx <- lm_ > 100 | |
lm_[idx] = slope*linspace[idx] + intercept + rnorm(length(lm_[idx]), mean=0, sd=sd) | |
} | |
return(lm_) | |
} | |
lineDownZ <- lm(0.5, 10, 20, numPoints) | |
linearDown <- geom_point(data=getData(lineDownZ), mapping=aes(x=x, y=y)) | |
lineUpZ <- lm(1.5, 10, 20, numPoints) | |
linearUp <- geom_point(data=getData(lineUpZ), mapping=aes(x=x, y=y)) | |
# Draw from a linear model with log-normally distributed noise and a slope more than 1 | |
p <- ggplot(bounds, aes(x = x, y = y)) + geom_polygon(alpha=0.15) + theme_light() | |
p1 <- p + zeroes + labs(tag = "A") | |
p2 <- p + uniform + labs(tag = "B") | |
p3 <- p + linearDown + labs(tag = "C") | |
p4 <- p + linearUp + labs(tag = "D") | |
grid.arrange(p1, p2, p3, p4, nrow = 2, ncol = 2) | |
# save | |
g <- arrangeGrob(p1, p2, p3, p4, nrow = 2, ncol = 2) #generates g | |
ggsave(file="4fig-ayyadurai.png", g) #saves g | |
``` | |
**A:** $z = 0$. Baseline extreme assumption, where split-ticket Trump vote percentage in all precincts is 0. | |
**B:** $Z \sim U(0, 100)$. Assuming split-ticket Trump vote percentage is uniformly drawn. | |
**C:** $Z \sim 0.5x + 10 + \epsilon; \epsilon \in \mathcal{N}(0, 20); \{0 \ge z \ge 100\}$. Assuming a linear relationship between Republican votes in a district and split-ticket Trump votes, insofar as that linear relationship results in possible values. | |
**D:** $Z \sim 1.5x + 10 + \epsilon; \epsilon \in \mathcal{N}(0, 20); \{0 \ge z \ge 100\}$. Assuming a strong linear relationship between Republican votes and split-ticket Trump votes, insofar as that linear relationship admits possible values. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment