View geom_flat_violin.R
# somewhat hackish solution to:
# https://twitter.com/EamonCaddigan/status/646759751242620928
# based mostly on copy/pasting from ggplot2 geom_violin source:
# https://github.com/hadley/ggplot2/blob/master/R/geom-violin.r
library(ggplot2)
library(dplyr)
"%||%" <- function(a, b) {
View three_gotchas_R_genomics.md
View tile_chromHMM_segments.md

When chromHMM is called with a bin size of, say, 1000 bp, consecutive bins that share the same state are merged into a single segment.

I want to tile the segment file back into the original fixed-size bins, undoing that merge. See https://support.bioconductor.org/p/102775/#102777

library(GenomicRanges)
library(rtracklayer)
chromHMM_seg<- import("data/chromHMM/SKCM-M852-P008_18_segments.bed", format = "BED")
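
The tiling idea described above can be sketched independently of Bioconductor. This is a minimal illustration in plain Python, not the approach taken in this gist or in the linked thread: the function name `tile_segments`, the tuple layout `(chrom, start, end, state)`, and the toy segments are all hypothetical, and coordinates are assumed to be 0-based, half-open as in BED.

```python
def tile_segments(segments, bin_size=1000):
    """Split merged segments back into fixed-width bins.

    `segments` is an iterable of (chrom, start, end, state) tuples with
    0-based, half-open coordinates (an assumption, matching BED).
    Each output tile keeps the state of the segment it came from.
    """
    tiles = []
    for chrom, start, end, state in segments:
        # Walk the segment in steps of bin_size; the last tile is
        # clipped so it never extends past the segment end.
        for bin_start in range(start, end, bin_size):
            bin_end = min(bin_start + bin_size, end)
            tiles.append((chrom, bin_start, bin_end, state))
    return tiles


# Toy input (made up for illustration): one 3 kb segment and one 1 kb segment.
segments = [
    ("chr1", 0, 3000, "E1"),
    ("chr1", 3000, 4000, "E2"),
]
tiles = tile_segments(segments)
for tile in tiles:
    print(*tile, sep="\t")
```

Because every tile carries its parent segment's state, the result has one row per bin, which is the layout the segment file would have had before consecutive same-state bins were merged.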
View ebi_url_from_srr.py
#API created by @apfejes (Anthony Fejes) on top of my half-cooked script
#python ebi_url_from_srr.py --file srr_list.txt | xargs -I {} wget {}
import argparse
def prepareURL(srr_name, prefix="ftp://ftp.sra.ebi.ac.uk/vol1/fastq/"):
    dir_1 = srr_name[:6]
    dir_2 = ""
    url = ""
    num_digits = sum(s.isdigit() for s in srr_name)
    if num_digits == 6:
View umap.R
#install UMAP from https://github.com/lmcinnes/umap
#install.packages("rPython")
umap <- function(x,n_neighbors=10,min_dist=0.1,metric="euclidean"){
  x <- as.matrix(x)
  colnames(x) <- NULL
  rPython::python.exec( c( "def umap(data,n,mdist,metric):",
                           "\timport umap" ,
                           "\timport numpy",
                           "\tembedding = umap.UMAP(n_neighbors=n,min_dist=mdist,metric=metric).fit_transform(data)",
View clinical_data_TCGA.md
library('TCGAbiolinks')
library('plyr')
library('devtools')

projects<- "TCGA-LUAD"

clin <- lapply(projects, function(p) {
View mouse2hum_biomart_ens87.txt
Gene ID	Transcript ID	Human associated gene name	Human gene stable ID	Associated Gene Name
ENSMUSG00000064336	ENSMUST00000082387			mt-Tf
ENSMUSG00000064337	ENSMUST00000082388			mt-Rnr1
ENSMUSG00000064338	ENSMUST00000082389			mt-Tv
ENSMUSG00000064339	ENSMUST00000082390			mt-Rnr2
ENSMUSG00000064340	ENSMUST00000082391			mt-Tl1
ENSMUSG00000064341	ENSMUST00000082392	MT-ND1	ENSG00000198888	mt-Nd1
ENSMUSG00000064342	ENSMUST00000082393			mt-Ti
ENSMUSG00000064343	ENSMUST00000082394			mt-Tq
View convert_msigdb_human_gmt_to_mouse.md
View call_peaks.sh
#! /bin/bash
set -e
set -u
set -o pipefail
root=$(pwd)
# -p keeps a re-run from aborting under `set -e` when the directory already exists
mkdir -p macs14_pbs
cat bam_names.txt | while read -r IP Input