@FrankRuns
FrankRuns / analyze-ontime-three-models.py
Created April 28, 2024 15:19
Supporting analysis for "How supply chain leaders improve on-time delivery with multiple models"
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import dowhy
from dowhy import CausalModel
import networkx as nx
import math
import sklearn
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
FrankRuns / white-noise.R
Created January 23, 2024 11:20
Calculations supporting the Substack post "Analytics, now what?"
# Replicate Bob's results from this LinkedIn post:
# https://www.linkedin.com/posts/bob-wilson-77a22ab_people-sometimes-say-ab-testing-requires-activity-7152792859878871040-X1Sr?utm_source=share&utm_medium=member_desktop
### Implement Fisher's Exact Test
# Create the contingency table
contingency_table <- matrix(c(0, 4, 7, 3), nrow = 2)
dimnames(contingency_table) <- list(c("Control", "Treatment"),
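
The preview above is cut off before the test itself is run. As a rough illustration of what Fisher's exact test computes for this 2x2 table, here is a minimal Python sketch (not the author's R code) that sums hypergeometric probabilities directly. The function name `fisher_exact_two_sided` is hypothetical, and the row layout assumes R's column-wise fill of `matrix(c(0, 4, 7, 3), nrow = 2)`, which gives Control = (0, 7) and Treatment = (4, 3).

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test p-value for a 2x2 table (hypothetical helper)."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )


# Rows are Control and Treatment, assuming R's column-wise matrix fill.
table = [[0, 7], [4, 3]]
p_value = fisher_exact_two_sided(table)
print(round(p_value, 4))
```

With both row totals equal to 7, the two most extreme tables are equally likely, so the two-sided p-value is twice the probability of the observed table.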
FrankRuns / data-analyst-mistakes.R
Created January 10, 2024 19:47
Script to simulate TV holiday episode data for the "data analysts make mistakes" article
# Load Required Libraries
if (!require("MASS")) install.packages("MASS")
library(MASS)
# Define TV Shows
# A vector of TV show titles
tv_shows <- c(
"Breaking Bad", "Game of Thrones", "The Wire",
"Stranger Things", "The Crown", "Mad Men",
"The Sopranos", "Friends", "The Office",
FrankRuns / tidytuesday-drwho.R
Created November 28, 2023 11:59
Quick inspection and vis of Dr. Who dataset for TidyTuesday 2023-11-28
# load packages
library(tidyverse)
library(tidytuesdayR) # Used for loading datasets from the TidyTuesday project
# load datasets
tuesdata <- tidytuesdayR::tt_load('2023-11-28')
drwho_episodes <- tuesdata$drwho_episodes
drwho_directors <- tuesdata$drwho_directors
drwho_writers <- tuesdata$drwho_writers
FrankRuns / selection-bias-visual.r
Last active June 12, 2022 10:56
Very quick ggplot2 scatterplot visualization for selection bias article.
# purpose: visualize linear trend for all data and subset of data
# libraries
library(dplyr)
library(ggplot2)
# read data
d <- read.csv("mycsvfile.csv")
# quick look
# purpose: helper script to determine my max heart rate
# load libraries
library(dplyr)
library(rethinking)
#### Read and Filter Data ----
# grab data
---
title: "Is-Behavior-Change-Happening"
author: "frank-corrigan"
date: "3/9/2022"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
---
title: "Marathon-Insights"
author: "frank-corrigan"
date: "3/1/2022"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
# toy simulation to determine how likely or
# unlikely the poll results from blog post
# are (Do you plan to relocate in 2022?)
# load packages
library(dplyr)
library(ggplot2)
# Single Instance ----
from tabula.io import read_pdf
import pandas as pd
import matplotlib.pyplot as plt
# set path to file
pdf_path = "Container-Vessels-In-Port.pdf"
# creates a python list, each page is a list item stored as pandas df
dfs = read_pdf(pdf_path, stream=True, pages='2-10')
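
Since `read_pdf` returns one DataFrame per page, a common next step is to stack the pages into a single table. The sketch below shows one way to do that with `pd.concat`. It uses small stand-in DataFrames instead of the real PDF, and the column names `vessel` and `days_in_port` are assumptions for illustration only, not the actual columns in the source file.

```python
import pandas as pd

# Hypothetical stand-ins for two of the per-page DataFrames in `dfs`;
# the real column names and values come from the PDF and are not known here.
dfs = [
    pd.DataFrame({"vessel": ["A", "B"], "days_in_port": [3, 5]}),
    pd.DataFrame({"vessel": ["C", "D"], "days_in_port": [2, 7]}),
]

# Stack the pages into one table and rebuild the row index from 0.
combined = pd.concat(dfs, ignore_index=True)
print(combined.shape)
```

This only works cleanly when every page parses to the same columns. If the extracted pages differ, `pd.concat` will still run but will fill the gaps with `NaN`, so it is worth checking each page's columns first.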