Skip to content

Instantly share code, notes, and snippets.

View mikelove's full-sized avatar

Michael Love mikelove

View GitHub Profile
@mikelove
mikelove / frozen_vst.R
Created February 19, 2024 14:02
Frozen variance stabilizing transformation for count data
mat <- matrix(rnbinom(2e5, mu=100, size=1/.01), ncol=100)
library(DESeq2)
d <- DESeqDataSetFromMatrix(mat, DataFrame(x=rep(1,100)), ~1)
# library size correction, centered log ratio to reference sample
d <- estimateSizeFactors(d)
# variance
d <- estimateDispersionsGeneEst(d)
# trend
@mikelove
mikelove / element_level.R
Last active February 9, 2024 20:00
Element level analysis with mpralm
set.seed(5)
n <- 1000
reps <- 10
rna <- matrix(
rnbinom(n * reps, mu = 10, size = 100),
ncol=reps
)
dna <- matrix(
rnbinom(n * reps, mu = 10, size = 100),
@mikelove
mikelove / R_Bioc_tidy_data.R
Created November 3, 2023 02:51
Demonstration of various classes in R
# dataframes vs lm S3 vs Bioc S4
# Michael Love
# Nov 1 2023
dat <- data.frame(genotype=c("wt","wt","mut","mut"),
count=c(10,20,30,40),
score=c(-1.2,0,3.4,-5),
gene=c("Abc","Abc","Xyz","Xyz"))
library(tibble)
dat |> as_tibble()
@mikelove
mikelove / tree_example.Rmd
Created October 16, 2023 14:34
Toy tree example for collapsing
---
title: "Toy tree example for collapsing"
author: "Michael Love"
---
Example data with 20 inferential replicates, here we just have 1
sample per condition and we calculate the LFC at each level of the
tree.
From the below simulation setup (see first chunk), the true DE signal
library(plyranges)
library(readr)
bindata <- read_tsv("bindata.40000.hg19.tsv.gz")
fire <- read_bed("fire-adult-hg19.txt")
# not necessary, but nice to have
si <- Seqinfo(genome="hg19")
si <- keepStandardChromosomes(si)
@mikelove
mikelove / minimap2_to_Salmon
Created September 21, 2023 11:45
minimap2 -> Salmon
# minimap2 map reads:
minimap2 -t {threads} -a -x map-ont -p 1.0 -N 100 {input.index} {input.fastq} | samtools view -Sb > {output.bam}
# salmon Quant:
salmon quant --ont --noLengthCorrection -p 8 -t {input.transcriptome} -l U -a {input.bam} -o {output}
@mikelove
mikelove / explore_tidyverse_omics.R
Last active November 1, 2023 15:26
Simple exploration of data with tidyverse and tidyomics
library(palmerpenguins) # penguins!
library(ggplot2) # "grammar of graphics" plots
suppressPackageStartupMessages(
library(dplyr) # data pliers
)
penguins |>
slice(1:3) |>
select(species, island)
@mikelove
mikelove / conditional_scz_gwas_2022_hg19.csv
Created August 10, 2023 15:43
Trubetskoy et al (2022) Supp Table 10b, conditionally independent associations (hg19)
We can make this file beautiful and searchable if this error is corrected: It looks like row 6 should actually have 28 columns, instead of 7. in line 5.
Index CHR,Index SNP,BP Left,BP Right,# Conditional independent,Index BP,Index P value,Merged Indexes,2nd SNP,2nd BP,2nd P value,3rd SNP,3rd BP,3rd P value,4th SNP,4th BP,4th P value,5th SNP,5th BP,5th P value,,,,,,,,
6,rs13195636,24939493,30559493,4,27509493,3.33E-40,rs13195636,rs9461856,33395199,4.13E-14,rs8192589,32187637,2.27E-10,rs9368789,34001678,2.03E-07,rs10946808,26233387,1.11E-06,,,,,,,,
18,rs9636107,52666708,53680258,4,53200117,1.92E-13,"rs12969453,rs74914300,rs72934602,rs9636107",rs12969453,52751708,3.93E-11,rs78322266,53063676,6.89E-08,rs72934602,53568458,6.79E-07,rs9952704,53447690,2.60E-04,,,,,,,,
3,rs2710323,48132866,53603354,3,52815905,1.02E-21,"rs11715134,rs11917680,rs2710323,rs13080668",rs2236989,50505395,1.81E-10,rs7633840,48719638,4.41E-07,rs4687724,53408177,1.10E-04,,,,,,,,,,,
18,rs72980087,77324421,77981194,2,77632194,4.80E-16,"rs56040937,rs7238071,rs72980087,rs72970145",rs11664298,77578986,5.24E-16,rs12455965,77688830,0.0001109,,,,,,,,,,,,,,
3,rs13090130,161092491,161969035,2,161777035,
@mikelove
mikelove / join_overlap_distance.R
Created August 3, 2023 14:03
join_overlap_* but with distance as a metadata column
library(plyranges)
x <- data.frame(seqnames=1,
start=sample(1000,10,FALSE),
width=1, id=1:10) %>%
as_granges()
y <- data.frame(seqnames=1,
start=sample(1000,10,FALSE),
width=1, id=letters[1:10]) %>%
as_granges()
@mikelove
mikelove / many_groups.R
Created July 14, 2023 13:47
issues with summarizing over many groups in plyranges - related to Issue #30
library(plyranges)
library(microbenchmark)
library(dplyr)
library(tibble)
make_rand_gr <- function(N, grps) {
data.frame(seqnames = sample(c("seq1", "seq2", "seq3"), N, replace = TRUE),
strand = sample(c("+", "-", "*"), N, replace = TRUE), start = rpois(N,
N), width = rpois(N, N), grps = sample(grps, N, replace = TRUE),
score = runif(N)) %>% as_granges()