Christopher Peters statwonk

@statwonk
statwonk / ai_data_engineer_copilot.py
Created March 24, 2024 15:07
A tool to make data engineering easier, written by Cora and Alan https://www.loom.com/share/b5c46b716f23476ba13d22fca1dd1d72
import pandas as pd
import statsmodels.api as sm
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
class AIDataEngineer:
    def __init__(self):
        self.data = None
        self.model = None
        self.X_train = None
@statwonk
statwonk / ivermectin.R
Created March 13, 2024 01:45
An analysis of ivermectin efficacy
library(tidyverse)
library(brms)
library(tidybayes)
# https://docs.google.com/spreadsheets/d/1vG0WdjZaYlS4_7_if-OaE3uMv3aX6zfvQMx5JvDREF0/edit?usp=sharing
# Chi.sq
prop.test(c(103, 135), c(5947, 5609),
          alternative = "less",
          conf.level = 0.99)
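The one-sided comparison of two proportions in the preview above can be sketched in plain Python. This is a minimal illustration, not the gist's method: it uses a pooled two-proportion z-test without a continuity correction, whereas R's `prop.test` applies Yates' correction by default and reports a chi-squared statistic, so the numbers will differ slightly. The function names `normal_cdf` and `two_proportion_z_test_less` are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_proportion_z_test_less(x1, n1, x2, n2):
    """One-sided pooled z-test of H1: p1 < p2 (no continuity correction).

    x1, x2 are event counts; n1, n2 are group sizes.
    Returns the z statistic and the one-sided p-value.
    """
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, normal_cdf(z)


# Counts and group sizes taken from the prop.test call above.
z, p_value = two_proportion_z_test_less(103, 5947, 135, 5609)
print(f"z = {z:.3f}, one-sided p = {p_value:.4f}")
```

A negative `z` with a small p-value means the first group's event rate is lower than the second's by more than chance alone would typically produce.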
@statwonk
statwonk / religiousity_by_firearm_deaths.R
Created April 16, 2023 16:34
Analysis of how religiosity contributes to firearm deaths.
library(tidyverse)
read_csv("~/Downloads/Religion by Firearm Deaths Data - Firearm Deaths by State.csv",
         skip = 4) %>%
  janitor::clean_names() -> firearm_deaths
read_csv("~/Downloads/Religion by Firearm Deaths Data - Religiousity by State.csv",
         skip = 3) %>%
  janitor::clean_names() %>%
  rename(state = region) %>%
  mutate(religious = 1 - irreligion_percent/100) %>%
@statwonk
statwonk / inflation.R
Created August 17, 2022 23:04
Studying inflation dynamics using an ARDL model
library(tidyverse)
# CPI
# https://data.bls.gov/timeseries/CUSR0000SA0&output_view=pct_1mth
readxl::read_xlsx("~/Downloads/SeriesReport-20220817182003_bb0f80.xlsx", skip = 10) %>%
  mutate(date = seq.POSIXt(as.POSIXct("1957-02-01"), by = "month", length.out = n())) %>%
  select(date, cpi = Value) %>%
  inner_join(
    # Short-term Interest Rates - https://fred.stlouisfed.org/series/DGS1MO
    read_csv("~/Downloads/DGS1MO (1).csv", col_types = cols(
@statwonk
statwonk / gist:55a29e2a8792b6f2d6833d376d4754da
Last active August 7, 2022 16:29
Examining US CPI persistence using the fractional-differencing technique of time series analysis. In this analysis, I roll the differencing procedure over a 120-month window to expose changes in the underlying process of inflation persistence.
library(tidyverse)
# CPI
# https://data.bls.gov/timeseries/CUSR0000SA0&output_view=pct_1mth
readxl::read_xlsx("~/Downloads/SeriesReport-20220807120638_733b06.xlsx", skip = 10) %>%
  mutate(date = seq.POSIXt(as.POSIXct("1947-02-01"), by = "month", length.out = n())) %>%
  select(date, cpi = Value) -> cpi
cpi %>%
  filter(date >= as.POSIXct("1949-01-01")) %>%
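For readers unfamiliar with fractional differencing, the filter itself can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the code from this gist: it applies the binomial-expansion weights of (1 - B)^d truncated to a fixed number of lags, and it does not estimate d or roll anything over a 120-month window. The names `frac_diff_weights`, `frac_diff`, and `n_weights` are hypothetical.

```python
def frac_diff_weights(d, n_weights):
    """First n_weights coefficients of the expansion of (1 - B)^d.

    Uses the recursion w_0 = 1, w_k = w_{k-1} * (k - 1 - d) / k.
    """
    weights = [1.0]
    for k in range(1, n_weights):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff(series, d, n_weights):
    """Apply a truncated fractional-difference filter of order d.

    The output starts at the first index where n_weights past values
    are available, so it is shorter than the input by n_weights - 1.
    """
    weights = frac_diff_weights(d, n_weights)
    out = []
    for t in range(n_weights - 1, len(series)):
        out.append(sum(w * series[t - k] for k, w in enumerate(weights)))
    return out


# With d = 1 the filter reduces to an ordinary first difference.
print(frac_diff([1.0, 3.0, 6.0, 10.0], d=1.0, n_weights=2))
# With 0 < d < 1 the weights decay slowly, which is what lets the
# filter capture long-memory (persistent) behaviour.
print(frac_diff_weights(0.5, 4))
```

Values of d between 0 and 1 sit between "no differencing" and a full first difference, so an estimate of d that drifts over time is one way to read changes in persistence.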
@statwonk
statwonk / inflation.R
Last active January 30, 2022 17:43
Using a time series model to inspect the seasonally adjusted US CPI inflation measure (% change)
library(tidyverse)
library(forecast)
library(urca)
library(dynlm)
library(lmtest)
library(sandwich)
# https://data.bls.gov/timeseries/CUSR0000SA0&output_view=pct_1mth
read_csv("~/Downloads/inflation.csv", skip = 11) %>%
  janitor::clean_names() %>%
@statwonk
statwonk / stock_prices.R
Created January 9, 2022 18:43
Analysis of stock prices as a function of the risk free rate and market risk.
library(tidyverse)
library(gamlss); select <- dplyr::select
library(fmpapi)
library(Quandl)
fmp_daily_prices("ASAN") -> d
fmp_daily_prices("TEAM") -> team
Quandl("USTREASURY/YIELD") -> yc
yc %>%
@statwonk
statwonk / clustering.R
Created March 4, 2021 12:58
Let's explore Dr. Wooldridge's clustering comment on Twitter. https://twitter.com/jmwooldridge/status/1366515323923488768?s=20
library(tidyverse)
library(lmtest)
library(sandwich)
5e2 -> students
20 -> schools
tibble(student_id = 1:students) %>%
  mutate(school_id = rep(1:schools, max(student_id) / schools)) %>%
  left_join(tibble(school_id = 1:schools, school_effect = rnorm(schools)),
@statwonk
statwonk / beta_interval_data.R
Last active December 31, 2020 18:26
Using beta-distributed interval-censored data to produce an estimate for the median share of US adults having taken the vaccine by end of Q2 2021.
library(tidyverse)
library(fitdistrplus)
dplyr::select -> select
.Machine$double.eps -> eps
1975 -> N # number of responses
tibble(lower = c(0, 0.25, 0.5, 0.75), # lower bins
       upper = c(0.25 + eps, 0.5 + eps, 0.75 + eps, 1), # upper bins
       pct = c(0.32, 0.51, 0.15, 1 - sum(0.32, 0.51, 0.15)), # response shares
       n = floor(pct * N)) %>% # implied responses + eps
@statwonk
statwonk / mktcap.R
Last active December 25, 2020 21:53
Some rough beliefs about market capitalization for a new entrant from a variety of sectors, sub-sectors and states.
library(tidyverse)
library(rvest)
library(gamlss)
library(brms)
library(tidybayes)
select <- dplyr::select
####################################################################################
# Model the market capitalizations of members of the S&P 500.
####################################################################################