
Henry Miller millerh1

@millerh1
millerh1 / auto-enrichr-links.Rmd
Last active August 15, 2021 16:26
DESeq2 + enrichr -- DGE for minimalists
millerh1 / auto-sys-dep-shiny-renv.yml
Created August 15, 2021 16:16
GitHub Actions for R-Shiny + renv + automatic system dependencies
env:
  RENV_PATHS_ROOT: ~/.local/share/renv
on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main
millerh1 / available_genomes.tsv
Created August 26, 2021 02:59
Available Genomes from UCSC with effective genome sizes
UCSC_orgID description nibPath organism defaultPos active orderKey genome scientificName htmlPath hgNearOk hgPbOk sourceName taxId genes_available year eff_genome_size_36bp eff_genome_size_50bp eff_genome_size_75bp eff_genome_size_100bp eff_genome_size_125bp eff_genome_size_150bp eff_genome_size_200bp eff_genome_size_250bp eff_genome_size_300bp genome_length homer_anno_available
ailMel1 Dec. 2009 (BGI-Shenzhen 1.0/ailMel1) /gbdb/ailMel1 Panda GL192818.1:558576-566855 1 16070 Panda Ailuropoda melanoleuca /gbdb/ailMel1/html/description.html 0 0 BGI-Shenzhen AilMel 1.0 Dec. 2009 9646 TRUE 2009 2131884815 2195257147 2167233843 2206501040 2213941269 2196290908 2233702932 2212959190 2168863899 2299509015 FALSE
allMis1 Aug. 2012 (allMis0.2/allMis1) /gbdb/allMis1 American alligator JH731472:504271-884586 1 1425 American alligator Alligator mississippiensis /gbdb/allMis1/html/description.html 0 0 International Crocodilian Genomes Working Group 8496 TRUE 2012 2050346072 2080458921 2117005123 2101224009 2132575305 21480
millerh1 / getPublicSampleInfo.R
Last active August 26, 2021 03:42
Get SRA sample info with any NCBI-friendly accessions
get_public_run_info <- function(accessions) {
  suppressWarnings(suppressPackageStartupMessages(require(XML)))
  # Set the HTTP version to 1.1. http_version takes a libcurl enum value
  # (0 = none, 1 = HTTP/1.0, 2 = HTTP/1.1, 3 = HTTP/2), so 2 selects HTTP/1.1.
  httr::set_config(httr::config(http_version = 2))
  #### Bug testing ##
  #accessions <- c("SRX2481503", "SRX2481504", "GSE134101", "SRP150774", "GSE127329", "SRS1466492")
  #accessions <- c("SRX2918366", "SRX2918367", "GSM3936516", "SRX5129664")
millerh1 / mv_filenames_rseqtesting.R
Created August 27, 2021 15:45
Changing filenames on AWS S3 from R (Example, RSeq Testing Data)
# Script for wrangling test files to correct naming conventions
library(tidyverse)
library(parallel)
S3_BAM_URI <- "s3://rseq-testing/bam-files/"
bamsAvail <- system(paste0("aws s3 ls ", S3_BAM_URI), intern = TRUE)
oldnew <- tibble(
  oldfls = gsub(bamsAvail, pattern = ".+ ([ES]{1}RX[0-9]+_.+\\.[hgmm]{2}[0-9]+\\.bam)", replacement = "\\1"),
  newfls = gsub(bamsAvail, pattern = ".+ ([ES]{1}RX[0-9]+)_.+\\.([hgmm]{2}[0-9]+\\.bam)", replacement = "\\1_\\2")
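To make the two `gsub()` patterns above concrete, here is a minimal base-R sketch that applies them to two made-up `aws s3 ls` lines. The timestamps, sizes, and file names are hypothetical and not taken from the bucket; real listings may be laid out differently.

```r
# Hypothetical `aws s3 ls` output lines (illustrative only)
bamsAvail <- c(
  "2021-08-27 15:00:00   12345 SRX1234567_some-sample.hg38.bam",
  "2021-08-27 15:00:01   67890 ERX7654321_other.mm10.bam"
)

# Same pattern as the script: strip the listing prefix, keep the full file name
oldfls <- gsub(bamsAvail,
               pattern = ".+ ([ES]{1}RX[0-9]+_.+\\.[hgmm]{2}[0-9]+\\.bam)",
               replacement = "\\1")

# Same pattern as the script: keep only the accession and the genome suffix
newfls <- gsub(bamsAvail,
               pattern = ".+ ([ES]{1}RX[0-9]+)_.+\\.([hgmm]{2}[0-9]+\\.bam)",
               replacement = "\\1_\\2")

print(data.frame(oldfls, newfls))
```

Note that `[hgmm]{2}` is a character class, so it accepts any two characters drawn from `h`, `g`, and `m`, not only the literal prefixes `hg` and `mm`.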
millerh1 / shuffle_bam.sh
Created August 27, 2021 22:21
Shuffle Bam File
#!/bin/bash
bamToBed -i "$BAM_ORIG" -bed12 -cigar | shuffleBed -i - -g "$CHROM_SIZES" | bedToBam -bed12 -g "$CHROM_SIZES" | samtools sort > "$BAM_SHUFFLED"
millerh1 / encExp2Peaks.R
Last active September 14, 2021 00:19
Encode Experiment ID -> Narrow Peak URLs
encExp2Peaks <- function(acc, getIDR=FALSE) {
  resp <- httr::GET(url = paste0("https://www.encodeproject.org/search/?type=File&dataset=/experiments/",
                                 acc,"/&file_type=bed+narrowPeak&format=json&frame=object&limit=all"), httr::accept_json()) %>%
    httr::content(as="parsed")
  res <- lapply(seq(resp$`@graph`), function(i) {
    entry <- resp$`@graph`[[i]]
    tibble(
      accfile = entry$accession,
      sup = ! is.null(entry$superseded_by %>% unlist()),
      type = entry$file_type,
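The `sup` flag above relies on `unlist()` returning `NULL` for an empty list. A minimal base-R sketch of that check on mock entries follows. The accessions `FILE_A`, `FILE_B`, and `FILE_C` are placeholders, and the assumption that a non-superseded file carries an empty `superseded_by` list is mine, not confirmed by the source.

```r
# Mock entries standing in for resp$`@graph`; all values are placeholders
entries <- list(
  list(accession = "FILE_A", file_type = "bed narrowPeak", superseded_by = list()),
  list(accession = "FILE_B", file_type = "bed narrowPeak", superseded_by = list("FILE_C"))
)

# One row per entry; unlist(list()) is NULL, so an empty list means "not superseded"
res <- do.call(rbind, lapply(entries, function(entry) {
  data.frame(
    accfile = entry$accession,
    sup = !is.null(unlist(entry$superseded_by)),
    type = entry$file_type,
    stringsAsFactors = FALSE
  )
}))

# Keep only files that have not been superseded
current <- res[!res$sup, ]
print(current)
```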
millerh1 / infoLinksForGenomics.R
Created September 11, 2021 00:25
Info links for Genomics Reports
# Create links in datatables for gene info modal
createGeneInfoLink <- function(val) {
  sprintf(paste0('<a href="https://www.genecards.org/cgi-bin/carddisp.pl?gene=', val ,'" target="_blank" class="tooltip-test" onClick="gene_click(this.id)" title="', val, '">', val, '</a>'))
}
# Create links in datatables for GSEA info from MSIGDB
createGSEAInfoLink <- function(val, valTitle) {
  sprintf(paste0('<a href="http://software.broadinstitute.org/gsea/msigdb/cards/', val,'" id="', val ,'" target="_blank" class="tooltip-test" onClick="gsea_click(this.id)" title="', val, '">', valTitle, '</a>'))
}
# Create links in datatables for GEO accessions
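The helpers above pass an already pasted string to `sprintf()` as the format, so a value containing `%` would be read as a format specifier. As a minimal sketch of an alternative, the hypothetical helper `make_gene_link` below (not part of the gist) uses `%s` placeholders instead and is vectorized over a column of gene symbols. It drops the `class` and `onClick` attributes for brevity.

```r
# Hypothetical variant of createGeneInfoLink using sprintf placeholders
make_gene_link <- function(val) {
  sprintf(
    '<a href="https://www.genecards.org/cgi-bin/carddisp.pl?gene=%s" target="_blank" title="%s">%s</a>',
    val, val, val
  )
}

# sprintf recycles over the vector, yielding one link per gene symbol
genes <- c("TP53", "BRCA1")
links <- make_gene_link(genes)
print(links)
```

When such HTML strings are rendered in a table widget, HTML escaping usually has to be turned off for that column, otherwise the tags are shown as text.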
millerh1 / tidy_venn.diagram.R
Created September 20, 2021 17:19
Tidy venn.diagram
library(magrittr)
# Venn diagram plot in tidy pipe
plt <- list(
  "A" = 1:100,
  "B" = 96:140
) %>%
  VennDiagram::venn.diagram(filename = NULL) %>%
  grid::grid.draw(.) %>%
  ggplotify::grid2grob() %>%
millerh1 / gsmemmat_generator.R
Created January 21, 2022 15:51
Gene set membership matrix generator
library(msigdbr)
library(dplyr)
library(tidyr)  # provides pivot_wider()
msigdbr(category = "C8") %>%
  dplyr::select(gs_name, gene_symbol) %>%
  distinct(gs_name, gene_symbol) %>%
  mutate(val = 1) %>%
  pivot_wider(
    id_cols = gs_name, names_from = gene_symbol,
    values_from = val,
    values_fill = 0
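The pipeline above turns long-format (gene set, gene) pairs into a wide 0/1 membership matrix. A minimal base-R sketch of the same idea on made-up data follows. The set and gene names are placeholders, and `table()` is swapped in for `pivot_wider()` so the example runs without extra packages.

```r
# Hypothetical long-format pairs, including one duplicate row
pairs <- data.frame(
  gs_name = c("SET_A", "SET_A", "SET_B", "SET_B", "SET_B"),
  gene_symbol = c("G1", "G2", "G2", "G3", "G3"),
  stringsAsFactors = FALSE
)

# Drop duplicates so each (set, gene) pair is counted once
pairs <- unique(pairs)

# Cross-tabulate: rows are gene sets, columns are genes, cells are 0 or 1
mem <- table(pairs$gs_name, pairs$gene_symbol)
mem_mat <- matrix(
  as.integer(mem),
  nrow = nrow(mem),
  dimnames = list(rownames(mem), colnames(mem))
)
print(mem_mat)
```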