Eric R. Scott Aariq
@Aariq
Aariq / baked_goods.csv
Last active February 20, 2023 14:27
Muffin and cupcake ingredients in US cups
type,recipe_id,baking powder,baking soda,butter,buttermilk,chocolate,cornmeal,cream cheese,eggs,flour,fruit,fruit juice,honey,margarine,milk,nut,oats,oil,other,salt,sour cream,spice,starch,sugar,unitless,vanilla,vegetable,vinegar,water,yogurt
Cupcake,145206,0.0017361083333333333,8.680541666666666e-4,0,0,0,0,0,0,0.09375,0,0,0.010416666666666666,0,0.0625,0.04513888333333333,0,0.027777777777777776,0,4.340270833333333e-4,0,0,0,0.05555555555555555,0,0.0034722166666666665,0,0.0034722166666666665,0,0
Cupcake,240140,8.680541666666666e-4,6.944433333333333e-4,0.016666666666666666,0,0,0,0.03333333333333333,0.13333333333333333,0.08333333333333333,0,0,0,0,0,0,0,0.05,0,3.4722166666666663e-4,0,0.0024305516666666667,0,0.16666666666666666,0,0,0.05,0,0,0
Cupcake,161019,8.680541666666666e-4,0.0017361083333333333,0,0.041666666666666664,0.033854166666666664,0,0,0.08333333333333333,0.08333333333333333,0,0,0,0,0,0,0,0.020833333333333332,0.041666666666666664,4.340270833333333e-4,0,0,0,0.08333333333333333,0.041666666666666664,0,0,0,0
Aariq / read_mult_csv.R
Created February 3, 2023 19:30
Combining multiple .csv files into a single data.frame in R in one line of code
library(readr)
library(dplyr)
library(purrr)
# Create data for testing: one data frame per value of cyl
split_cars <- mtcars |>
  group_by(cyl) |>
  group_split()
tmp <- tempdir()
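The preview above stops before the combining step. A minimal sketch of how it might continue, assuming readr 2.0.0 or later (where `read_csv()` accepts a vector of paths); the `out_dir` folder, the `cars_*.csv` file names, and the `file` id column are hypothetical and not taken from the gist:

```r
library(readr)
library(dplyr)
library(purrr)

# Test data: one data frame per cylinder count, as in the gist
split_cars <- mtcars |>
  group_by(cyl) |>
  group_split()

# Hypothetical output folder and file names (not from the original gist)
out_dir <- file.path(tempdir(), "split_cars")
dir.create(out_dir, showWarnings = FALSE)
paths <- file.path(out_dir, paste0("cars_", seq_along(split_cars), ".csv"))

# Write each group to its own .csv file
walk2(split_cars, paths, \(df, path) write_csv(df, path))

# The "one line": read_csv() takes a vector of paths and row-binds the results;
# id = "file" records which file each row came from
combined <- read_csv(paths, id = "file", show_col_types = FALSE)
```

The `id` argument is optional; without it the rows are still combined, but the source file of each row is not recorded.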
site                ensemble  date_start  date_end    finished  ed_error  ed_error_reason                                         model2netcdf_error
MANDIFORE-PNW-4538  1         2002-06-01  2012-05-01  TRUE      FALSE     NA                                                      FALSE
MANDIFORE-PNW-4538  2         2002-06-01  2012-05-01  TRUE      FALSE     NA                                                      FALSE
MANDIFORE-PNW-4538  3         2002-06-01  2012-05-01  TRUE      FALSE     NA                                                      FALSE
MANDIFORE-PNW-4538  4         2002-06-01  2012-05-01  FALSE     TRUE      Program received signal SIGABRT: Process abort signal.  FALSE
MANDIFORE-PNW-4538  5         2002-06-01  2012-05-01  FALSE     FALSE     NA                                                      TRUE
MANDIFORE-PNW-4538  6         2002-06-01  2012-05-01  FALSE     FALSE     NA                                                      TRUE
MANDIFORE-PNW-4538  7         2002-06-01  2012-05-01  FALSE     TRUE      Program received signal SIGABRT: Process abort signal.  FALSE
MANDIFORE-PNW-4538  8         2002-06-01  2010-08-01  FALSE     FALSE     NA                                                      FALSE
MANDIFORE-PNW-4538  9         2002-06-01  2012-05-01  TRUE      FALSE     NA                                                      FALSE
Aariq / extract_hashes.R
Last active January 31, 2022 15:50
"hashes" from Elsevier PDF metadata
# My janky r-code wrapping commandline tool `exiftool`
library(stringr)
library(purrr)
library(glue)
pull_hashes <- function(file) {
  xml <- system(glue("exiftool -b -xmp '{file}'"), intern = TRUE)
  doi <-
Aariq / anova-bad-Anova-good.R
Created October 1, 2021 18:53
don't use anova(), use car::Anova()
# anova() uses type I (sequential) sums of squares, so unless you have only
# categorical predictors and balanced sample sizes, the order of terms in the
# formula changes the p-values, sometimes drastically!
m1 <- lm(Volume ~ Height + Girth, data = trees)
m2 <- lm(Volume ~ Girth + Height, data = trees)
anova(m1)
#> Analysis of Variance Table
#>
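The output above is cut off in the preview. The order dependence can be checked with base R alone; the sketch below compares the sequential p-value for Height when it is entered first versus last, and uses `drop1()` as a base-R stand-in for a marginal test. For a model without interactions this should correspond to the type II tests that `car::Anova()` reports by default. The `p_*` variable names are illustrative, and the `car` call runs only if that package is installed:

```r
# Two fits that differ only in the order of the predictors
m1 <- lm(Volume ~ Height + Girth, data = trees)
m2 <- lm(Volume ~ Girth + Height, data = trees)

# Sequential (type I) p-value for Height: entered first vs. entered last
p_height_first <- anova(m1)["Height", "Pr(>F)"]
p_height_last  <- anova(m2)["Height", "Pr(>F)"]

# Marginal test from base R: each term is tested after all the others,
# so the result does not depend on the order of the formula
p_height_marginal <- drop1(m1, test = "F")["Height", "Pr(>F)"]

# car::Anova() is the gist's recommendation; run it only if car is installed
if (requireNamespace("car", quietly = TRUE)) {
  print(car::Anova(m1))
  print(car::Anova(m2))
}
```

Comparing `p_height_first` with `p_height_last` shows how much the sequential test moves when only the formula order changes.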
Aariq / read-multiple.R
Created February 25, 2021 14:26
Read in data spread across multiple .csv files or multiple sheets of .xlsx files.
# Read in multiple .csv files with the same structure (same column names, numbers of columns, types of data in each column) and join them into a single data frame:
library(readr)
library(purrr)
root <- "path/to/where/my/data/is/"
# list.files() takes a regular expression, not a glob, so match the extension explicitly
file_paths <- list.files(root, pattern = "\\.csv$", full.names = TRUE)
# Name the elements of file_paths. Here the file names are used, but any labels will do.
names(file_paths) <- basename(file_paths)
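The preview ends before the files are read. One way the named vector might be used is sketched below, assuming purrr 1.0.0 or later for `list_rbind()`. The demo folder, the `a.csv` and `b.csv` files, and the `file` column name are hypothetical stand-ins so that the example runs on its own:

```r
library(readr)
library(purrr)

# Hypothetical demo data: write two small files to a temporary folder
root <- file.path(tempdir(), "read_multiple_demo")
dir.create(root, showWarnings = FALSE)
write_csv(head(iris, 3), file.path(root, "a.csv"))
write_csv(tail(iris, 3), file.path(root, "b.csv"))

# Regular expression anchored to the .csv extension
file_paths <- list.files(root, pattern = "\\.csv$", full.names = TRUE)
names(file_paths) <- basename(file_paths)

# Read every file into a list, then row-bind the list;
# the names of file_paths become the values of the "file" column
all_data <- file_paths |>
  map(\(path) read_csv(path, show_col_types = FALSE)) |>
  list_rbind(names_to = "file")
```

Naming the vector first is what lets `names_to` label each row with its source file instead of a bare index.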
Aariq / convex_hulls.png
Created September 5, 2018 18:38 — forked from mbedward/convex_hulls.png
Example of drawing convex hulls around grouped points in R using dplyr and ggplot.
convex_hulls.png
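Only the image is shown for this gist, so the code below is not the forked gist's code. It is a minimal sketch of one common way to get the described result, assuming the built-in `iris` data as a stand-in and using base R's `chull()` to pick the hull vertices within each group:

```r
library(dplyr)
library(ggplot2)

# For each species, keep only the rows that lie on the convex hull.
# chull() returns the row indices of the hull vertices in polygon order.
hulls <- iris |>
  group_by(Species) |>
  slice(chull(Sepal.Length, Sepal.Width)) |>
  ungroup()

# Draw all points, then one semi-transparent hull polygon per species
p <- ggplot(iris, aes(Sepal.Length, Sepal.Width, colour = Species)) +
  geom_point() +
  geom_polygon(data = hulls, aes(fill = Species), alpha = 0.2)
```

Printing `p` draws the plot. Because `slice()` runs once per group, `chull()` only ever sees the points of a single species at a time.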