Michael Love mikelove

@mikelove
mikelove / fluent-genomics-v2.qmd
Last active June 11, 2024 20:54
fluent genomics update
---
title: "Differential chromatin accessibility and gene expression"
format: html
---
# Differential expression from RNA-seq
```{r}
#| eval: FALSE
dir <- system.file("extdata", package="macrophage")
mikelove / hg38_seqlens.tsv
Created June 9, 2024 11:03
Speed testing plyranges joins at various numbers of features 1e3 to 1e6
chr1 248956422
chr2 242193529
chr3 198295559
chr4 190214555
chr5 181538259
chr6 170805979
chr7 159345973
chr8 145138636
chr9 138394717
chr10 133797422
mikelove / segment_example.R
Created May 10, 2024 12:12
Example of producing simple segmentation
library(plyranges)
library(nullranges)
x <- data.frame(
  seqnames=rep(c("1","2","3"), each=10),
  start=rep(c(0:4, 10:14) * 1000 + 1, times=3),
  width=100) |>
  as_granges()
seqlengths(x) <- c("1"=20123, "2"=20123, "3"=20123)
mikelove / join_se.R
Last active May 11, 2024 11:05
Example code for joining two range sets where second one is also attached to an SE
library(SummarizedExperiment)
library(plyranges)
# example data
m <- matrix(rnorm(600), nrow=100)
r1 <- data.frame(seqnames=1, start=1:50 * 100 + 2501,
                 width=5, id1=paste0("u",formatC(1:50,width=3,flag="0"))) |>
  as_granges()
r2 <- data.frame(seqnames=1, start=1:100 * 100 + 1,
width=5, id2=paste0("v",formatC(1:100,width=3,flag="0"))) |>
mikelove / 1_app.R
Last active April 26, 2024 12:10
IGVF ancestry dashboard sketch
library(shiny)
library(UpSetR)
library(dplyr)
library(tidyr)
library(readr)
library(ggplot2)
dat <- read_delim("ancestry_dataframe.tsv")
ui <- fluidPage(
titlePanel("IGVF Ancestry Dashboard"),
mikelove / frozen_vst.R
Created February 19, 2024 14:02
Frozen variance stabilizing transformation for count data
mat <- matrix(rnbinom(2e5, mu=100, size=1/.01), ncol=100)
library(DESeq2)
d <- DESeqDataSetFromMatrix(mat, DataFrame(x=rep(1,100)), ~1)
# library size correction, centered log ratio to reference sample
d <- estimateSizeFactors(d)
# variance
d <- estimateDispersionsGeneEst(d)
# trend
mikelove / element_level.R
Last active February 9, 2024 20:00
Element level analysis with mpralm
set.seed(5)
n <- 1000
reps <- 10
rna <- matrix(
  rnbinom(n * reps, mu = 10, size = 100),
  ncol=reps
)
dna <- matrix(
rnbinom(n * reps, mu = 10, size = 100),
mikelove / R_Bioc_tidy_data.R
Created November 3, 2023 02:51
Demonstration of various classes in R
# dataframes vs lm S3 vs Bioc S4
# Michael Love
# Nov 1 2023
dat <- data.frame(genotype=c("wt","wt","mut","mut"),
                  count=c(10,20,30,40),
                  score=c(-1.2,0,3.4,-5),
                  gene=c("Abc","Abc","Xyz","Xyz"))
library(tibble)
dat |> as_tibble()
mikelove / tree_example.Rmd
Created October 16, 2023 14:34
Toy tree example for collapsing
---
title: "Toy tree example for collapsing"
author: "Michael Love"
---
Example data with 20 inferential replicates. Here we have just 1
sample per condition, and we calculate the LFC at each level of the
tree.
From the below simulation setup (see first chunk), the true DE signal
library(plyranges)
library(readr)
bindata <- read_tsv("bindata.40000.hg19.tsv.gz")
fire <- read_bed("fire-adult-hg19.txt")
# not necessary, but nice to have
si <- Seqinfo(genome="hg19")
si <- keepStandardChromosomes(si)