Levi Waldron lwaldron

@lwaldron
lwaldron / cmd_giant_table.R
Created July 11, 2024 11:19
Summarize curatedMetagenomicData studies in one giant Epi Table 1
library(curatedMetagenomicData)
library(dplyr)
library(table1)
dat <- sampleMetadata |>
  select(study_name, body_site, study_condition, age_category, age, BMI)
# labeling is optional, just to make the table nicer
label(dat$body_site) <- "Body Site"
label(dat$study_condition) <- "Study Condition"
@lwaldron
lwaldron / knn-matching.R
Created June 25, 2024 18:45
One way to age match using k-nearest neighbors
library(nabor)
# suppose you have two vectors of propensity scores
propensity_scores1 <- c(0.1, 0.2, 0.3, 0.4, 0.5) # more controls
propensity_scores2 <- c(0.15, 0.25, 0.35) # fewer cases
# use the knn function from the nabor package to find, for each score in
# propensity_scores1, the index of the closest match in propensity_scores2
matches <- nabor::knn(matrix(propensity_scores2), matrix(propensity_scores1), k = 1)$nn.idx
# print the matches
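As a dependency-free illustration of the same lookup, the sketch below does a 1-nearest-neighbor match in base R. The helper `nearest_index()` and the example scores are hypothetical (they are not from the gist), and the scores were chosen so that no two distances tie. For k = 1 on one-dimensional scores this is the same kind of index that the `nn.idx` column above holds, though the two have not been compared here.

```r
# Hypothetical helper: for each query score, return the index of the
# closest reference score (1-nearest neighbor in one dimension).
nearest_index <- function(reference, query) {
  vapply(query, function(q) which.min(abs(reference - q)), integer(1))
}

# Illustrative scores (assumed values, chosen so that no distances tie)
controls <- c(0.10, 0.22, 0.31, 0.44, 0.58)
cases <- c(0.15, 0.25, 0.35)

# For each control, the index of the closest case
matches <- nearest_index(reference = cases, query = controls)
print(matches)
```

Matching this way is with replacement: several controls can point to the same case, so a further step would be needed if one-to-one matching is required.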
@lwaldron
lwaldron / cmd_healthycontrols.R
Created May 3, 2024 14:30
curatedMetagenomicData healthy control samples, relab + metadata csv file per age category
library(curatedMetagenomicData)
library(dplyr)
agecats <- unique(sampleMetadata$age_category) |> na.omit()
sm <- filter(sampleMetadata, study_condition=="control") |>
  filter(disease == "healthy") |>
  filter(body_site == "stool") |>
  filter(!is.na(age_category))
for (agecat in agecats){
  sm1 <- filter(sm, age_category == agecat)
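The preview above stops inside the loop. As a rough sketch of the general "one CSV file per age category" pattern, the base R example below splits a toy table by category and writes each piece to a temporary directory. The toy data, the `healthy_` file name prefix, and the use of `tempdir()` are all assumptions made for illustration; they are not taken from the gist.

```r
# Toy metadata (assumed values) standing in for the filtered sample table
toy_meta <- data.frame(
  sample_id = c("s1", "s2", "s3", "s4"),
  age_category = c("adult", "child", "adult", "senior")
)

# split() returns a named list with one data frame per age category
by_age <- split(toy_meta, toy_meta$age_category)

# Write each group to its own CSV file in a temporary directory
out_dir <- tempdir()
out_files <- vapply(names(by_age), function(agecat) {
  path <- file.path(out_dir, paste0("healthy_", agecat, ".csv"))
  write.csv(by_age[[agecat]], path, row.names = FALSE)
  path
}, character(1))
```

Using `split()` instead of an explicit loop keeps the grouping and the writing as separate steps, which can make it easier to inspect each group before it is written.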
@lwaldron
lwaldron / lefser_pathwayab.R
Last active March 18, 2024 10:34
lefser on pathway abundances using ZellerG_2014 from cMD
suppressPackageStartupMessages({
  library(lefser)
  library(curatedMetagenomicData)
})
zeller <-
  curatedMetagenomicData("ZellerG_2014.pathway_abundance",
                         counts = TRUE,
                         dryrun = FALSE)[[1]]
zeller <- zeller[, zeller$study_condition != "adenoma"]
zeller <- relativeAb(zeller)
@lwaldron
lwaldron / gist:edea48dfda3c9db34b80a326f50fc5d1
Last active February 24, 2024 21:23
Select some UniRef IDs from curatedMetagenomicData studies, join, write to file
suppressPackageStartupMessages({
  library(curatedMetagenomicData)
  library(mia)
  library(dplyr)
  library(purrr)
})
datasets <- sampleMetadata |>
  group_by(study_name) |>
  count() |>
@lwaldron
lwaldron / openai_analyzecomments.R
Last active February 14, 2024 13:45
Simple use of openai package for sentiment analysis of written teaching evaluation comments
Before using this script, you need to create an OpenAI API key (https://platform.openai.com/api-keys)
and put it in ~/.Renviron:
OPENAI_API_KEY='my_key_here'
# libraries used
library(openai)
library(dplyr)
library(stringr)
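R reads `~/.Renviron` at startup, so a key stored there is available through `Sys.getenv()`. The sketch below shows one way a script could check for the key before making any API calls. The helper name `get_api_key()` is hypothetical and is not part of the gist or of the openai package.

```r
# Hypothetical helper: read an API key from an environment variable and
# fail early with a clear message if it is missing.
get_api_key <- function(var = "OPENAI_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var,
         " is not set; add it to ~/.Renviron and restart R.")
  }
  key
}

# Check for the key without stopping the session if it is absent
key_available <- tryCatch(nzchar(get_api_key()), error = function(e) FALSE)
```

Failing early in this way avoids a less informative authentication error later in the script.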
@lwaldron
lwaldron / alphadiversity.R
Last active November 13, 2023 02:40
Analyze alpha diversity by year from bugsigdb.org
library(bugsigdbr)
bsdb <- importBugSigDB(version = "devel")
# Create a stacked barplot of the proportion of Pielou, Shannon, Chao1, Simpson, Inverse Simpson, and Richness as a function of year
library(tidyverse)
bsdb_by_year <- bsdb |>
  filter(Year > 2014) |>
  dplyr::group_by(Year) |>
  dplyr::summarize(
    Pielou = sum(!is.na(Pielou)) / n(),
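The `sum(!is.na(x)) / n()` expression above is the proportion of rows in each year where a measure was reported. The base R sketch below computes the same kind of proportion on a toy table. The data values are invented for illustration, and only the column names `Year`, `Pielou`, and `Shannon` come from the preview.

```r
# Toy data (assumed values): one row per signature, NA where a measure
# was not reported
toy <- data.frame(
  Year = c(2015, 2015, 2016, 2016, 2016, 2016),
  Shannon = c("increased", NA, "decreased", NA, NA, "unchanged"),
  Pielou = c(NA, NA, "increased", NA, NA, NA)
)

# mean() of a logical vector is the proportion of TRUE values, so this
# matches sum(!is.na(x)) / n() computed within each year
measures <- c("Shannon", "Pielou")
prop_reported <- sapply(measures, function(m) {
  tapply(!is.na(toy[[m]]), toy$Year, mean)
})
print(prop_reported)
```

The result has one row per year and one column per measure, which is a convenient shape to reshape to long format before plotting stacked bars.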
@lwaldron
lwaldron / stepwise_treadmill_test.Rmd
Created September 17, 2023 09:34
Analysis of Garmin tcx heart rate data from a stepwise treadmill test to identify lactate threshold
---
title: "Stepwise treadmill test"
author: "Levi Waldron"
date: "`r Sys.Date()`"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
@lwaldron
lwaldron / useProbeInfo.Rnw
Created July 14, 2023 10:02
Rnw vignette from the `annotate` package
% \VignetteIndexEntry{Using Affymetrix Probe Level Data}
% \VignetteDepends{hgu95av2.db, rae230a.db, rae230aprobe, Biostrings}
% \VignetteKeywords{Annotation}
%\VignettePackage{annotate}
\documentclass{article}
\newcommand{\Rfunction}[1]{{\texttt{#1}}}
\newcommand{\Rmethod}[1]{{\texttt{#1}}}
@lwaldron
lwaldron / testlefsercounts
Created July 3, 2023 09:40
Compare lefser output with relab and counts
suppressPackageStartupMessages(library(curatedMetagenomicData))
zeller <- curatedMetagenomicData::curatedMetagenomicData("Zeller.+relative_abundance", counts = FALSE, dryrun = FALSE)[[1]]
zellercounts <- curatedMetagenomicData::curatedMetagenomicData("Zeller.+relative_abundance", counts = TRUE, dryrun = FALSE)[[1]]
zeller <- zeller[, zeller$study_condition != "adenoma"]
zellercounts <- zellercounts[, zellercounts$study_condition != "adenoma"]
suppressPackageStartupMessages(library(lefser))
res_group <- lefser(zeller, groupCol = "study_condition")
res_group_counts <- lefser(zellercounts, groupCol = "study_condition")