Skip to content

Instantly share code, notes, and snippets.

View vjcitn's full-sized avatar

Vince Carey vjcitn

  • Boston
View GitHub Profile
@vjcitn
vjcitn / ad_mats.R
Last active August 10, 2022 14:53
use VariantAnnotation::readVcf etc. to get allelic depth count matrices from VCF
#' produce matrices of AD values from VcfFile instance for a given genomic interval
#' @param vf instance of VcfFile, should be tabix indexed
#' @param rng compatible GRanges instance
#' @param genome character(1) obligatory for readVcf
#' @return list with matrices allele1 and allele2, similar to the AD matrix, and ref and alt as obtained directly
#' @examples
#' x = VariantAnnotation::VcfFile("ALL.chrX.BI_Beagle.20100804.genotypes.vcf.gz")
#' param = GRanges("X", IRanges(60000, width=10000))
#' m = ad_mats(x, param)
#' m$allele1[1:4,1:10]
#' produce matrices of allele calls from VcfFile instance for a given genomic interval
#' @param vf instance of VcfFile, should be tabix indexed
#' @param rng compatible GRanges instance
#' @param genome character(1) obligatory for readVcf
#' @param pat1 character(1) gsub regexp to isolate first allele code (might need to have
#' / instead of | if unphased)
#' @param pat2 character(1) gsub regexp to isolate second allele code (might need to have
#' / instead of | if unphased)
#' x = VariantAnnotation::VcfFile("ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz")
#' param = GRanges("22", IRanges(16e6, width=200000))
@vjcitn
vjcitn / cgxapp.R
Created February 1, 2022 19:32
sketch of app to work with mtmorgan/cellxgenedp
library(shiny)
library(cellxgenedp)
library(jsonlite)
db = try(db())
if (inherits(db, "try-error")) stop("can't get db")
jsdb = fromJSON(db) # is character, makes data.frame!
nn = jsdb$name
names(nn) = paste(seq_len(nrow(jsdb)), nn)
@vjcitn
vjcitn / revise_source.R
Created July 28, 2021 22:19
code that can revise DESCRIPTION to add a package that is needed, bump version, and take care of git operations
#' @param dry_run logical(1) if TRUE, just return revised DESCRIPTION
#' @return `desc::description` instance; will write revised DESCRIPTION to `path` if `dry_run` FALSE.
#' @note Bumps third component of version tag
#' @export
revise_desc = function(path, type="Suggests", to_add="rmarkdown", dry_run=TRUE) {
stopifnot(is.atomic(to_add) && length(to_add)==1)
init = desc::description$new(path)
deps = init$get_deps()
if (to_add %in% deps$package) stop("already depended upon")
init = init$set_dep(type=type, package=to_add)
@vjcitn
vjcitn / foo.md
Last active November 19, 2020 21:30
testing
title leader tab_date comment
Cloud computing cost measurement and control Vince 2020-09-03 cloud agnostic?
Orchestra for workshops (Guest talk) Sean Davis 2020-10-01 Infra. for Bioc2020 wkshops
The Bioc Hubs: enhancements for training Mike Love NA scale/formats
Large data methods Stephanie Hicks NA many players
Reliability and code validation Levi Waldron NA goals/resources
Build system automation NA NA build on commit report
Documentation evaluation and maintenance CAB NA doc rot reporting mechanism
@vjcitn
vjcitn / gist:48fdf3ecd17d15bcf6771e8f1efd2fbe
Last active April 25, 2020 17:27
highly specialized function to demonstrate how to produce an estimated Rt based on JHU confirmed case time-series
library(sars2pack)
easyRt = function(alpha3="ITA", src = (suppressWarnings(enriched_jhu_data())),
init = "2020-03-01", ...) {
cur = cumulative_events_ejhu(src, eventtype="confirmed",
alpha3=alpha3)
inc = form_incident_events( trim_from (cur, init ) )
newd = data.frame(I=inc$count, dates=inc$dates)
EpiEstim::estimate_R(newd, ...)
}
rtBya3 = function(a3="ITA") {
library(dplyr)
library(magrittr)
library(RCurl)
library(R0)
fetch_JHU_Data = function (as.data.frame = FALSE)
{
csv <- getURL("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Confirmed.csv")
data <- read.csv(text = csv, check.names = F)
names(data)[1] <- "ProvinceState"
names(data)[2] <- "CountryRegion"
# probably needs to do something to inform R that a new package is available?
# the code can be used but will installed.packages() work properly?
lib2 = function(pkgname, dry.run=TRUE, repos=BiocManager::repositories(),
character.only=FALSE, source_gs="gs://biocbbs_2020a/packs_3.10/",
target=.libPaths()[1], ...) {
if (!character.only) pkgname = as.character(substitute(pkgname))
stopifnot(is.character(pkgname) && is.atomic(pkgname) && length(pkgname)==1)
# verify need
ino = options(no.readonly=TRUE)
options(repos=repos)
@vjcitn
vjcitn / mktexpk
Created January 5, 2020 13:31
a copy of mktexpk, which i could not find in tinytex 0.18 installation with docker image bioconductor_full:release
#!/bin/sh
# original mktexpk -- make a new PK font, because one wasn't found.
#
# (If you change or delete the word `original' on the previous line,
# installation won't write this script over yours.)
#
# Originally written by Thomas Esser, Karl Berry, and Olaf Weber.
# Public domain.
version='$Id: mktexpk 34656 2014-07-18 23:38:50Z karl $'
@vjcitn
vjcitn / highly_vbl.R
Created December 7, 2019 18:16
highly_vbl R function from Brussels talk
# genesym is a gene symbol as a character(1)
# compend is a SummarizedExperiment 'like' result of HumanTranscriptomeCompendium::htx_load()
# stat is a function that will compute a statistic on the log(selected gene's expression+1) over samples
# pctile is the percentile for selecting studies
highly_vbl = function(genesym, compend, stat=mad, pctile=.9) {
stopifnot("gene_name" %in% colnames(rowData(compend)))
stopifnot("study_accession" %in% colnames(colData(compend)))
stopifnot("study_title" %in% colnames(colData(compend)))
ind = which(rowData(compend)$gene_name == genesym)[1]
if (is.na(ind)) stop("could not find genesym in compend")