@steveharoz
steveharoz / generate graph analyze.R
Last active June 13, 2023 08:38
Example data for analysis
library(tidyverse)
library(lmerTest)
# subject count
COUNT = 5
set.seed(8675309)
# generate a unique intercept per subject
data = tibble(
steveharoz / image.md
Last active May 18, 2023 11:42
perceived correlation of rank

steveharoz / extract citation numbers.R
Created January 7, 2023 08:07
Extract IEEE citation numbers from a PDF
# Extract all citation numbers such as [1] from a PDF's text
# It also includes cases for multiples [1, 3] and ranges [1-5]
# It tries to exclude confidence intervals by skipping
#
# written by Steve Haroz with help from ChatGPT
# MIT license
library(tidyverse)
library(pdftools)
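The matching rules described in the comments above can be sketched on a plain string. This is only an illustration under stated assumptions, not the gist's actual code: the sample text, the regular expression, and the helper `expand_citation` are all hypothetical, and it uses base R only rather than text read from a PDF. Decimal intervals such as `[0.25, 0.75]` are skipped simply because the pattern accepts integers only. An integer-valued interval would still match.

```r
# Hypothetical sample text; a real run would take the text from a PDF instead
text <- "Prior work [1] and others [2, 5] or [7-9]; the CI was [0.25, 0.75]."

# Assumed pattern: a bracket holding integers separated by commas or hyphens
pattern <- "\\[\\d+(\\s*[,-]\\s*\\d+)*\\]"
matches <- regmatches(text, gregexpr(pattern, text, perl = TRUE))[[1]]

# Hypothetical helper: turn one match such as "[7-9]" into c(7, 8, 9)
expand_citation <- function(m) {
  inner <- gsub("\\[|\\]", "", m)
  parts <- trimws(strsplit(inner, ",")[[1]])
  unlist(lapply(parts, function(p) {
    bounds <- as.integer(strsplit(p, "-")[[1]])
    if (length(bounds) == 2) seq(bounds[1], bounds[2]) else bounds
  }))
}

# All distinct citation numbers found in the text
numbers <- sort(unique(unlist(lapply(matches, expand_citation))))
print(numbers)
```

Expanding ranges before de-duplicating means a reference cited both alone and inside a range is counted once.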
steveharoz / multiple comparison simulation.R
Last active December 1, 2022 02:28
Simulate multiple comparisons to show that an adjustment is needed
library(tidyverse)  # provides %>%
COUNT = 100000
# How often does a single t-test of random data yield p<0.05?
replicate(COUNT,
  t.test(rnorm(20))$p.value < 0.05
) %>% mean()
#> 0.05073
# 5% false positive rate
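The preview stops at the single-test case. A possible continuation, sketched here under assumptions rather than taken from the gist, runs several independent t-tests per simulated experiment and compares the familywise false positive rate with and without a Bonferroni adjustment. The names `SIMS` and `TESTS`, the seed, and the choice of Bonferroni are illustrative assumptions. `SIMS` is kept smaller than the gist's `COUNT` so the sketch runs quickly.

```r
set.seed(1)     # assumed seed, only for reproducibility of this sketch
SIMS  <- 2000   # simulated experiments (hypothetical, smaller than COUNT)
TESTS <- 10     # comparisons per experiment (hypothetical)

# TESTS x SIMS matrix of p-values from t-tests on pure noise
p_values <- replicate(SIMS, replicate(TESTS, t.test(rnorm(20))$p.value))

# Share of experiments with at least one unadjusted p < 0.05
fwer_unadjusted <- mean(apply(p_values, 2, function(p) any(p < 0.05)))

# Same, after adjusting each experiment's p-values with Bonferroni
fwer_bonferroni <- mean(apply(p_values, 2, function(p) {
  any(p.adjust(p, method = "bonferroni") < 0.05)
}))

print(c(unadjusted = fwer_unadjusted, bonferroni = fwer_bonferroni))
```

With 10 independent tests, the unadjusted rate should land near 1 - 0.95^10, about 0.40, while the adjusted rate should stay at or below roughly 0.05.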
steveharoz / hierarchical pie.R
Created April 5, 2022 02:16
hierarchical pie
library(tidyverse)
COUNT = 40
data = tibble(
  car = paste0(
    sample(LETTERS, COUNT, TRUE),
    sample(letters, COUNT, TRUE),
    sample(letters, COUNT, TRUE)
  ),
  value = rnorm(COUNT, 3),
  group = c(
    rep("Petrol", COUNT/2), rep("Hybrid", COUNT/4),
    rep("Pure Electric", COUNT/8), rep("Diesel", COUNT/8)
  )
)
steveharoz / endpoint.R
Last active February 16, 2022 14:42
Endpoint stat for ggplot
library(tidyverse)  # ggplot2 for ggproto() and Stat, dplyr for arrange()
StatEndpoint <- ggproto("StatEndpoint", Stat,
  compute_group = function(data, scales) {
    # sort by x so indexing is meaningful
    data = arrange(data, x)
    # grab only the first and last row
    data[c(1, nrow(data)), ]
  },
  required_aes = c("x", "y")
)
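One way the stat might be used is to mark the first and last point of each line. The sketch below repeats the `StatEndpoint` definition so it runs on its own. The `demo` data frame, the `series` column, and the plot layers are hypothetical and not part of the gist.

```r
library(ggplot2)
library(dplyr)

# Same stat as above, repeated so this block is self-contained
StatEndpoint <- ggproto("StatEndpoint", Stat,
  compute_group = function(data, scales) {
    data <- arrange(data, x)      # sort by x so first/last are meaningful
    data[c(1, nrow(data)), ]      # keep only the first and last row
  },
  required_aes = c("x", "y")
)

# Hypothetical demo data: two series over the same x range
demo <- data.frame(
  x = rep(1:10, 2),
  y = c(cumsum(rep(1, 10)), cumsum(rep(-0.5, 10))),
  series = rep(c("a", "b"), each = 10)
)

# Draw the full lines, then points only at each line's endpoints
p <- ggplot(demo, aes(x, y, colour = series)) +
  geom_line() +
  geom_point(stat = StatEndpoint, size = 3)

# Inspect what the stat computed for the point layer (layer 2)
endpoints <- layer_data(p, 2)
print(endpoints[, c("x", "y", "group")])
```

Because the work happens in `compute_group`, endpoints are picked per group, so mapping `colour` to `series` gives two points per line rather than two points overall.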
x,y,value
1,205,0.3125
1,204,0.3125
1,203,0.3125
1,202,0.3125
1,201,0.3125
1,200,0.3125
1,199,0.3125
1,198,0.3125
1,197,0.3125
steveharoz / readme.md
Last active November 5, 2021 09:16
XKCD colormap
steveharoz / pie chart - subcategories.R
Created October 28, 2021 10:36
Pie chart with subcategories
library(tidyverse)
set.seed(999)
data = tibble(
name = c("A1", "A2", "A3", "A4", "B1", "B2", "B3", "B4", "C1", "C2"),
value = rnorm(10, 10, sd = 3),
color = c(
hcl(220, seq(60, 30, -10), seq(50, 80, 10)),
hcl(0, seq(60, 30, -10), seq(50, 80, 10)),
steveharoz / Texas congressional district simulation.R
Created October 21, 2021 16:53
Texas congressional district simulation
library(tidyverse)
# arbitrary number
district_count = 38
# population from stephanie's figure
# https://twitter.com/evergreendata/status/1450862060972216320
population = c(
rep("White", 40),
rep("Latino", 39),