Christian Testa ctesta01

# Suppose we have two populations for which we'd like to infer the incidence rate ratio.
#
# We could use a Poisson, quasi-Poisson, or negative binomial model.
#
# We'd like to know whether each of these models gives the same incidence rate ratio estimates.

# dependencies
library(MASS)
library(broom)
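
A minimal sketch of the comparison described in the comments above, run on simulated data because the preview is cut off before the gist defines its own. The data frame `sim`, the helpers `irr()` and `se()`, and the simulation parameters are all assumptions made for illustration, not the gist's code.

```r
library(MASS)

set.seed(1)
n <- 500  # observations per group (assumed for illustration)
sim <- data.frame(group = rep(c("a", "b"), each = n))

# overdispersed counts: group b has twice the mean of group a (true IRR = 2)
sim$y <- rnbinom(2 * n, size = 2, mu = ifelse(sim$group == "a", 5, 10))

# fit the three candidate models to the same data
fit_pois <- glm(y ~ group, family = poisson, data = sim)
fit_qp   <- glm(y ~ group, family = quasipoisson, data = sim)
fit_nb   <- MASS::glm.nb(y ~ group, data = sim)

# hypothetical helpers: the IRR is the exponentiated group coefficient
irr <- function(fit) exp(coef(fit)[["groupb"]])
se  <- function(fit) summary(fit)$coefficients["groupb", "Std. Error"]

comparison <- data.frame(
  model = c("poisson", "quasipoisson", "negative binomial"),
  irr   = c(irr(fit_pois), irr(fit_qp), irr(fit_nb)),
  se    = c(se(fit_pois), se(fit_qp), se(fit_nb))
)
print(comparison)
```

The Poisson and quasi-Poisson fits share the same point estimate by construction; the quasi-Poisson model only rescales the standard errors by the square root of the estimated dispersion. In a design this simple (one binary covariate, no offset) the negative binomial estimate should agree closely as well, though that need not hold for more complex models.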
@ctesta01
ctesta01 / branches_and_PRs.md
Last active March 23, 2022 19:35
Notes on using Git and GitHub with Pull Requests

Using Feature Branches and Pull Requests on GitHub

The following instructions walk through cloning a repository, creating a feature branch, and submitting a pull request.

You'll need to be ready to use git[^1] to follow along, and I suggest setting up ssh-key based authentication[^2] so you don't have to enter your username and password every time you are transferring code to and from GitHub.

@ctesta01
ctesta01 / lubridate_quarters.R
Created February 17, 2022 15:19
Create year-quarter variables from dates
library(dplyr)
library(lubridate)
library(stringr)
# let's say my data has some dates in it -- these are just random dates for an example
# and I want to code them to year and quarter
df <- data.frame(
# random dates between 2020-01-01 and 2022-02-14
date = lubridate::ymd('2020-01-01') + sample.int(775, 100, replace = TRUE))
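
One way to code dates like these to a year-quarter label, sketched under the assumption that a string such as "2021-Q2" is the desired format; the preview does not show the gist's own approach. The helper `year_quarter()` and the `example_dates` vector are hypothetical names.

```r
library(lubridate)

# hypothetical helper: label each date as "YYYY-Qn"
year_quarter <- function(date) {
  paste0(lubridate::year(date), "-Q", lubridate::quarter(date))
}

example_dates <- lubridate::ymd(c("2020-01-15", "2021-05-03", "2022-02-14"))
year_quarter(example_dates)
# "2020-Q1" "2021-Q2" "2022-Q1"
```

Labels in this format sort chronologically as plain strings, which can be convenient for grouping and plotting.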
library(ggforce)
tw <- -2:1 + .5 # triangle width points
th <- 0:3 + .5 # triangle height points
triangle <- rbind(c(-2,0), c(2,0), c(0,4))
nudge_factor <- 0.15
df <- tibble::tribble(
@ctesta01
ctesta01 / weekday_effect_on_reported_COVID_deaths.R
Last active February 14, 2022 19:00
Plot the weekday variation in reported COVID-19 deaths in the United States
library(readr)
library(ggdist)
library(tidyverse)
library(magrittr)
library(cowplot)
library(ISOweek)
df <- readr::read_csv("https://github.com/nytimes/covid-19-data/raw/master/rolling-averages/us.csv")
df %<>% mutate(wday = lubridate::wday(lubridate::ymd(date)))
weekdays <- c('Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday')
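
As a small illustration of how the numeric `wday` codes line up with a Sunday-first vector of names like the one above: `lubridate::wday()` numbers days from 1 when the week starts on Sunday. The names `weekday_names`, `example_dates`, and `wday_label` are assumptions for this sketch, and `week_start = 7` is passed explicitly so the result does not depend on a user-level option.

```r
library(lubridate)

weekday_names <- c('Sunday', 'Monday', 'Tuesday', 'Wednesday',
                   'Thursday', 'Friday', 'Saturday')

# 2022-02-13 was a Sunday and 2022-02-14 a Monday
example_dates <- lubridate::ymd(c("2022-02-13", "2022-02-14"))

# week_start = 7 makes Sunday day 1, matching the order of weekday_names
wday_index <- lubridate::wday(example_dates, week_start = 7)

# an ordered factor keeps weekdays in calendar order on plot axes
wday_label <- factor(weekday_names[wday_index], levels = weekday_names)
as.character(wday_label)
# "Sunday" "Monday"
```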
@ctesta01
ctesta01 / aligning_two_sided_dotplots.R
Last active February 7, 2022 15:43
Putting dotplots for two distinct groups on either side of the x-axis
library(ggplot2)
library(dplyr)
library(magrittr)
# generate fake data
total_N_obs <- 1000
example_data <-
data.frame(
year = sample(2000:2020, replace = TRUE, size = total_N_obs),
type = sample(c('a', 'b'), replace = TRUE, size = total_N_obs)
@ctesta01
ctesta01 / using_consistent_color_scales.R
Last active February 3, 2022 17:16
Using consistent color scales across different plots with scale_color_gradient
library(palmerpenguins)
library(purrr)
library(ggplot2)
library(dplyr)
library(patchwork)
# let's say I want to show bill_length_mm across different species (one plot for each species),
# but each plot should have the same color scale
# one common issue is that if you just use scale_color_gradient() or scale_color_brewer(), each
@ctesta01
ctesta01 / name_that_color.R
Last active January 29, 2022 19:20
Using Good Color Names for Color Variables in R
# Using Appropriate Color Names in R
#
# Oftentimes the right color palette and color choices can greatly elevate the
# quality of a data visualization; the colourlovers.com website has long been
# host to a great number of 5-color palettes which users can favorite and share.
# Moreover, there is now an R package to interface with colourlovers and
# automatically pull the hex codes of a specified palette into R.
#
# To take this to the next step of usability, there are times it's appropriate
# to identify and name individual colors (ideally with memorable names) to
@ctesta01
ctesta01 / black_history_milestones.R
Last active February 3, 2022 17:54
Black History Milestones Scraped from History.com and Visualized
library(ggplot2)
library(magrittr)
library(rvest)
library(dplyr)
# url to fetch html from
html_data <- "https://www.history.com/topics/black-history/black-history-milestones"
# parse html
html_text <- read_html(html_data)
@ctesta01
ctesta01 / dorling_style_covid_cases_and_deaths.R
Last active January 27, 2022 16:47
Plot US COVID Cases and Deaths in the "SLOWDOWN" (or phase-plane) style
library(tidyverse)
library(magrittr)
library(geomtextpath)
df_usa <- readr::read_csv("https://raw.githubusercontent.com/nytimes/covid-19-data/master/rolling-averages/us.csv")
ma <- function(x, n = 31){stats::filter(x, rep(1 / n, n), sides = 2)}
df_usa %<>% mutate(
deaths_avg = as.numeric(ma(deaths_avg)),
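
To make the behavior of the `ma()` helper defined above concrete, here is a small worked example on a toy vector instead of the COVID data; the input `1:5` and the window `n = 3` are chosen only for illustration. With `sides = 2` the filter is centered, so the ends of the series, where the window does not fit, come back as `NA`.

```r
# same definition as in the gist preview
ma <- function(x, n = 31) {
  stats::filter(x, rep(1 / n, n), sides = 2)
}

# centered 3-point moving average of 1, 2, 3, 4, 5
smoothed <- as.numeric(ma(1:5, n = 3))
smoothed
# NA 2 3 4 NA
```

The explicit `stats::` prefix matters here: loading tidyverse attaches dplyr, whose `filter()` masks `stats::filter()`.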