Thomas Sandmann tomsing1

@tomsing1
tomsing1 / gorilla.R
Created August 28, 2023 16:26
R script to create a dataset similar to Yanai and Lercher, Selective attention in hypothesis-driven data analysis, bioRxiv, 2020
# source: Matt Dray's blog: https://www.rostrum.blog/posts/2021-10-05-gorilla/
library(magick)
# download and read the image
img_file <- tempfile(fileext = ".jpg")
download.file(
  paste0(
    "https://classroomclipart.com/images/gallery/",
    "Clipart/Black_and_White_Clipart/Animals/",
tomsing1 / datalad.md
Created August 22, 2023 15:49
Datalad tutorial, based on its amazing documentation

Installation on Mac OS X

brew install datalad wget

Creating a new dataset

First, we create a new, empty dataset. (Reminder: a dataset refers to a folder of files, not any single file.)

tomsing1 / statistical_misunderstandings.md
Created August 21, 2023 19:49
Statistical misunderstandings

Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations

Excerpted from Greenland et al., Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations, 2016

What P values, confidence intervals, and power calculations don't tell us

Common misinterpretations of single P values

  1. The P value is the probability that the test hypothesis is true; for example, if a test of the null hypothesis gave P = 0.01, the null hypothesis has only a 1 % chance of being true; if instead it gave P = 0.40, the null hypothesis has a 40 % chance of being true.
tomsing1 / duckdb_notes.txt
Created May 1, 2023 17:05
Notes on first steps with duckdb
brew install duckdb
duckdb -c "INSTALL httpfs"
REMOTE_FILE="https://raw.githubusercontent.com/mwaskom/seaborn-data/master/penguins.csv"
# read a remote CSV file into an in-memory duckdb database
duckdb -c "SELECT * FROM '${REMOTE_FILE}';"
# By default, the CLI will open a temporary in-memory database.
tomsing1 / ena_controlled_vocabulary.Rmd
Created April 3, 2023 22:18
Rmd file that retrieves and parses ENA's XML specification for NGS experiments
---
title: "Controlled vocabulary for sequencing experiments"
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
tomsing1 / json_in_sqlite.qmd
Created December 31, 2022 22:50
Experimenting with JSON fields in a SQLite database
---
title: "Experimenting with SQLite"
editor_options:
  chunk_output_type: inline
---
```{r}
library(glue)
library(RSQLite)
library(jsonlite)
tomsing1 / GO.R
Created December 28, 2022 04:27
Retrieving Gene ontology (GO) annotations using Bioconductor annotation packages
library(AnnotationDbi)
library(org.Hs.eg.db)
library(GO.db)
kTerm <- "GO:0007265"
# retrieve all genes annotated with the GO term
df <- AnnotationDbi::select(org.Hs.eg.db, keys = c(kTerm),
                            columns = c("ENTREZID", "ENSEMBL"),
                            keytype = "GO")
nrow(df) # 117 (some duplicate EntrezIds because we requested Ensembl ids, too)
tomsing1 / plotly-nflfastR.Rmd
Last active December 17, 2022 20:50
Updated version of Tom Mock's plotly-nflfastR.R gist
---
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
library(crosstalk)
library(gsisdecoder) # for nflfastR::build_nflfastR_pbp
library(htmltools)
library(nflfastR)
tomsing1 / ploty_dt_crosstalk.Rmd
Last active December 15, 2022 17:46
Combining a plotly graph and an interactive table with crosstalk in R
---
title: "DT, plotly and crosstalk"
---
```{r}
library(plotly)
library(DT)
library(crosstalk)
library(htmltools)
```
tomsing1 / sparrow_reactable.Rmd
Last active December 21, 2022 02:13
Nested reactable table to display the results of a gene set enrichment analysis with the sparrow Bioconductor package
---
title: "Presenting gene set enrichment results with reactable and plotly"
format:
  html:
    page-layout: full
    code-fold: true
    code-summary: "Show the code"
---
```{r}