Vince Carey vjcitn

  • Boston
@vjcitn
vjcitn / lincpet.R
Created October 15, 2016 11:21
Simple implementation of Lincoln-Petersen population size estimation, applied to letters
lincpet = function(ssize=17) {
  s1 = sample(letters, size=ssize, replace=FALSE)
  s2 = sample(letters, size=ssize, replace=FALSE)
  ssize^2/length(intersect(s1,s2))
}
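A minimal sketch of the general estimator behind `lincpet`, assuming the two samples may differ in size: with n1 items in the first sample, n2 in the second, and m items seen in both, the Lincoln-Petersen estimate of the population size is n1*n2/m. The helper name `lp_estimate` and the second sample size of 10 are illustrative and not part of the gist.

```r
# Hypothetical helper, not in the original gist.
lp_estimate = function(n1, n2, m) {
  # no recaptures: the estimate is unbounded
  if (m == 0) return(Inf)
  n1 * n2 / m
}

set.seed(1)
s1 = sample(letters, size = 17, replace = FALSE)
s2 = sample(letters, size = 10, replace = FALSE)
est = lp_estimate(length(s1), length(s2), length(intersect(s1, s2)))
est
```

Because 17 + 10 exceeds 26, the two samples always share at least one letter, so the estimate here is finite. The true population size is 26.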
vjcitn / Dockerfile
Created March 21, 2019 11:00 — forked from seandavi/Dockerfile
Dockerfile for blog post on using GCR. Builds SRA-toolkit with dbGaP access as an example
FROM ubuntu:18.04
RUN apt-get update && apt-get install -y wget
# We do things this way to keep the docker image
# size down. See https://nickjanetakis.com/blog/docker-tip-3-chain-your-docker-run-instructions-to-shrink-your-images
RUN wget http://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/2.9.2/sratoolkit.2.9.2-ubuntu64.tar.gz \
&& tar -xvzf sratoolkit.2.9.2-ubuntu64.tar.gz \
&& rm sratoolkit.2.9.2-ubuntu64.tar.gz
vjcitn / omicidx-beta-graphql-intro.ipynb
Created March 21, 2019 11:00 — forked from seandavi/omicidx-beta-graphql-intro.ipynb
Quick introduction to the OmicIDX GraphQL API
vjcitn / gist:9cb7373c5fe2ee8d514260e5b9dd910c
Created April 18, 2019 11:29
personal version of library that reduces information on attachment events
libstats = function(inisess, newsess) {
  inibase = inisess$basePkgs
  inioth = names(inisess$otherPkgs)
  newbase = newsess$basePkgs
  newoth = names(newsess$otherPkgs)
  iniatt = length(unique(c(inibase,inioth)))
  newatt = length(unique(c(newbase,newoth)))
  addatt = newatt-iniatt
  inilo = names(inisess$loadedOnly)
  newlo = names(newsess$loadedOnly)
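The two arguments look like `sessionInfo()` results taken before and after some packages are attached. A small sketch of that usage, under that assumption; `stats4` is only an example of a package that ships with R, and the helper `n_attached` is illustrative, not part of the gist.

```r
# Assumed usage: snapshot the session before and after attaching a package.
before = sessionInfo()
library(stats4)
after = sessionInfo()

# Same counting as in libstats: base packages plus other attached packages.
n_attached = function(sess) {
  length(unique(c(sess$basePkgs, names(sess$otherPkgs))))
}
added = n_attached(after) - n_attached(before)
added
```

If `stats4` was already attached when the first snapshot was taken, `added` is 0.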
vjcitn / ubsetup_bbs.txt
Created July 8, 2019 15:39
ubuntu 18.04 apt-get for BiocBBSpack preparation
sudo apt-get install -y accountsservice acl acpid adduser adwaita-icon-theme and ansible apparmor apport apport-symptoms apt apt-utils aspell aspell-en at at-spi2-core autoconf automake autopoint autotools-dev base-files base-passwd bash bash-completion bc bcache-tools bcc bin86 bind9-host binutils binutils-common binutils-x86-64-linux-gnu bison bridge-utils bsdmainutils bsdutils btrfs-progs btrfs-tools build-essential busybox-initramfs busybox-static byobu bzip2 bzip2-doc ca-certificates ca-certificates-java cabextract cgroupfs-mount cloud-guest-utils cloud-init cloud-initramfs-copymods cloud-initramfs-dyn-netconf cmake cmake-data command-not-found command-not-found-data console-setup console-setup-linux containerd coreutils cpio cpp cpp-7 cron cryptsetup cryptsetup-bin curl cwltool dash dbus dbus-user-session dbus-x11 dconf-gsettings-backend dconf-service debconf debconf-i18n debhelper debianutils default-jdk default-jdk-headless default-jre default-jre-headless default-libmysqlclient-dev desktop-base deskt
vjcitn / gist:7295067e6592213823c28b9421eb5fdc
Created September 3, 2019 16:49
function to use the Ensembl REST API to convert rsIDs to hg38 coordinates ... note the seqnames
get_rslocs_38 = function(rsids = c("rs6060535", "rs56116432")) {
  server <- "https://rest.ensembl.org"
  ext <- "/variant_recoder/homo_sapiens"
  r <- httr::POST(paste(server, ext, sep = ""),
      httr::content_type("application/json"),
      httr::accept("application/json"),
      body = list(ids=rsids), encode="json")
  httr::stop_for_status(r)
  ans = rjson::fromJSON( rjson::toJSON( httr::content(r)))
  ids = lapply(ans, "[[", "id")
vjcitn / highly_vbl.R
Created December 7, 2019 18:16
highly_vbl R function from Brussels talk
# genesym is a gene symbol as a character(1)
# compend is a SummarizedExperiment 'like' result of HumanTranscriptomeCompendium::htx_load()
# stat is a function that will compute a statistic on the log(selected gene's expression+1) over samples
# pctile is the percentile for selecting studies
highly_vbl = function(genesym, compend, stat=mad, pctile=.9) {
  stopifnot("gene_name" %in% colnames(rowData(compend)))
  stopifnot("study_accession" %in% colnames(colData(compend)))
  stopifnot("study_title" %in% colnames(colData(compend)))
  ind = which(rowData(compend)$gene_name == genesym)[1]
  if (is.na(ind)) stop("could not find genesym in compend")
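The rest of the function is not shown here. One plausible reading of the comments above is: compute `stat` on log(expression + 1) within each study, then keep the studies at or above the `pctile` quantile of that statistic. The base R sketch below illustrates that reading on simulated toy data; the data frame `toy` and its values are made up for illustration and do not come from HumanTranscriptomeCompendium.

```r
set.seed(42)
# Toy data (simulated): expression of one gene in 5 hypothetical studies, 20 samples each.
toy = data.frame(
  study_accession = rep(paste0("S", 1:5), each = 20),
  expr = c(rpois(20, 5), rpois(20, 50), rpois(20, 5),
           rnbinom(20, size = 0.5, mu = 100), rpois(20, 5))
)

stat = mad
pctile = 0.9

# statistic of log(expr + 1) within each study
per_study = tapply(log(toy$expr + 1), toy$study_accession, stat)

# keep studies whose statistic reaches the chosen quantile
cutoff = quantile(per_study, pctile)
high = names(per_study)[per_study >= cutoff]
high
```

Swapping `stat` for `sd` or `IQR` changes the notion of variability without touching the selection step.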
vjcitn / mktexpk
Created January 5, 2020 13:31
a copy of mktexpk, which I could not find in the tinytex 0.18 installation with docker image bioconductor_full:release
#!/bin/sh
# original mktexpk -- make a new PK font, because one wasn't found.
#
# (If you change or delete the word `original' on the previous line,
# installation won't write this script over yours.)
#
# Originally written by Thomas Esser, Karl Berry, and Olaf Weber.
# Public domain.
version='$Id: mktexpk 34656 2014-07-18 23:38:50Z karl $'
# probably needs to do something to inform R that a new package is available?
# the code can be used but will installed.packages() work properly?
lib2 = function(pkgname, dry.run=TRUE, repos=BiocManager::repositories(),
    character.only=FALSE, source_gs="gs://biocbbs_2020a/packs_3.10/",
    target=.libPaths()[1], ...) {
  if (!character.only) pkgname = as.character(substitute(pkgname))
  stopifnot(is.character(pkgname) && is.atomic(pkgname) && length(pkgname)==1)
  # verify need
  # options() returns the previous values, so ino can be used to restore repos later
  ino = options(repos=repos)
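The `character.only` argument mirrors the behaviour of `library()`: by default the package name is taken from the unevaluated argument. A small sketch of that idiom in isolation; the function name `name_of` is hypothetical and not part of the gist.

```r
# Hypothetical helper showing the substitute() idiom used in lib2.
name_of = function(pkgname, character.only = FALSE) {
  # capture the argument as written, without evaluating it
  if (!character.only) pkgname = as.character(substitute(pkgname))
  stopifnot(is.character(pkgname), length(pkgname) == 1)
  pkgname
}

name_of(dplyr)                # unquoted name, never evaluated
p = "magrittr"
name_of(p, character.only = TRUE)   # value of the variable is used
```

With `character.only = FALSE`, passing the variable `p` would yield the string "p" rather than its value, which is why the flag exists.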
library(dplyr)
library(magrittr)
library(RCurl)
library(R0)
fetch_JHU_Data = function (as.data.frame = FALSE)
{
  csv <- getURL("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Confirmed.csv")
  data <- read.csv(text = csv, check.names = FALSE)
  names(data)[1] <- "ProvinceState"
  names(data)[2] <- "CountryRegion"
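The `check.names` setting matters because the date columns in this kind of file have names that are not valid R identifiers. A small offline sketch with a made-up two-date CSV string (the values are toy data, not real counts) shows the effect:

```r
# Toy CSV text standing in for the downloaded file; values are invented.
csv = "Province/State,Country/Region,Lat,Long,1/22/20,1/23/20\n,Thailand,15,101,2,3\n"

data = read.csv(text = csv, check.names = FALSE)
names(data)[1] = "ProvinceState"
names(data)[2] = "CountryRegion"
names(data)

# With the default check.names = TRUE, R would rewrite the date headers,
# e.g. "1/22/20" would become "X1.22.20".
make.names("1/22/20")
```

Keeping the original headers makes it straightforward to parse the column names as dates later on.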