Ming Tang crazyhottommy

@crazyhottommy
crazyhottommy / natural_sort_vcf.sh
Last active June 21, 2016 21:03 — forked from arq5x/example.sh
Natural sort a VCF
chmod a+x vcfsort.sh
vcfsort.sh trio.trim.vep.vcf.gz
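Only the invocation of `vcfsort.sh` is shown here, not the script itself. As a rough illustration of what "natural sort" means for a VCF, the sketch below orders records so that `chr2` comes before `chr10`, then by position. It is not the gist's implementation: the helper names `natural_key` and `sort_vcf_lines` are hypothetical, and it assumes tab-separated records with CHROM and POS in the first two columns and header lines starting with `#`.

```python
import re


def natural_key(text):
    """Split text into digit and non-digit runs so 'chr2' sorts before 'chr10'.

    Each run is tagged (0, int) or (1, str) so numbers and strings are never
    compared directly; numeric runs sort ahead of text runs.
    """
    return [
        (0, int(tok)) if tok.isdigit() else (1, tok)
        for tok in re.split(r"(\d+)", text)
        if tok
    ]


def sort_vcf_lines(lines):
    """Return header lines unchanged, followed by naturally sorted records.

    Hypothetical helper: assumes tab-separated records with CHROM in column 1
    and an integer POS in column 2. Blank lines are dropped.
    """
    header = [ln for ln in lines if ln.startswith("#")]
    records = [ln for ln in lines if ln and not ln.startswith("#")]

    def record_key(line):
        chrom, pos = line.split("\t", 2)[:2]
        return (natural_key(chrom), int(pos))

    return header + sorted(records, key=record_key)


if __name__ == "__main__":
    demo = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID",
        "chr10\t5\t.",
        "chr2\t300\t.",
        "chr2\t20\t.",
        "chrX\t1\t.",
    ]
    for line in sort_vcf_lines(demo):
        print(line)
```

For a compressed file such as `trio.trim.vep.vcf.gz`, the lines could be read with Python's `gzip.open(path, "rt")` before being passed to the sorter.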
## devtools::install_github("stephenturner/msigdf")
library(msigdf)
library(dplyr)
library(clusterProfiler)
c2 <- msigdf.human %>%
filter(collection == "c2") %>% select(geneset, entrez) %>% as.data.frame
data(geneList)
de <- names(geneList)[1:100]
@crazyhottommy
crazyhottommy / maf_legacy.R
Created February 22, 2017 15:39 — forked from tiagochst/maf_legacy.R
Get MAF files aligned against hg19
query.maf.hg19 <- GDCquery(project = "TCGA-COAD",
data.category = "Simple nucleotide variation",
data.type = "Simple somatic mutation",
access = "open",
legacy = TRUE)
# Check which MAF files are available
knitr::kable(getResults(query.maf.hg19)[,c("created_datetime","file_name")])
query.maf.hg19 <- GDCquery(project = "TCGA-COAD",
data.category = "Simple nucleotide variation",
# This code will get all clinical indexed data from TCGA
library(TCGAbiolinks)
library(data.table)
clinical <- TCGAbiolinks:::getGDCprojects()$project_id %>%
  regexPipes::grep("TCGA", value = TRUE) %>%
  sort %>%
  plyr::alply(1, GDCquery_clinic, .progress = "text") %>%
  rbindlist
# Pass the output file positionally so the call does not depend on the argument name
readr::write_csv(clinical, "all_clin_indexed.csv")
@crazyhottommy
crazyhottommy / mp_primer_v2.sh
Created November 14, 2017 16:58 — forked from mbk0asis/mp_primer_v2.sh
Batch bisulfite primer design tool, version 2 (new: multiple results per template; the number of C's allowed within a primer can be set)
#!/bin/bash
printf "\n *** BIS BATCH PRIMER version 2.0 ***"
printf "\n\n !!! 'Primer3 & fastx-toolkit' must be installed on the system.\n\n !!! Edit parameters (e.g. sizes, Tm, etc.) before starting\n\n "
printf "\n\n Usage : \n ./mp_primer.sh FASTA PARAMETER \n\n"
# Pass the arguments through %s so spaces or % signs in file names are printed literally
printf " >>> input FASTA = %s" "$1"
printf " \n >>> parameters = %s" "$2"
printf "\n\n\n ()()() Running... \n\n"
if [ -f "$1" ] && [ -f "$2" ]; then
#API created by @apfejes (Anthony Fejes) on top of my half-cooked script
#python ebi_url_from_srr.py --file srr_list.txt | xargs -I {} wget {}
import argparse
def prepareURL(srr_name, prefix="ftp://ftp.sra.ebi.ac.uk/vol1/fastq/"):
    dir_1=srr_name[:6]
    dir_2=""
    url=""
    num_digits=sum(s.isdigit() for s in srr_name)
    if(num_digits == 6):
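The preview of `prepareURL` is cut off at the first branch, so the rest of the gist's logic is not visible here. The standalone sketch below shows one way to build the directory part of such a URL. It is not the gist's code: the function name `ena_fastq_dir` is hypothetical, and it relies on an assumption about the ENA FTP layout, namely that accessions with six digits live directly under their six-character prefix, while longer accessions get an extra subdirectory made of the digits beyond the sixth, zero-padded to three characters.

```python
def ena_fastq_dir(accession, prefix="ftp://ftp.sra.ebi.ac.uk/vol1/fastq/"):
    """Build the FASTQ directory URL for a run accession such as SRR1016916.

    Hypothetical helper; the directory layout is an assumption (see the text
    above), not something confirmed by the gist preview.
    """
    letters, digits = accession[:3], accession[3:]
    if not (letters.isalpha() and digits.isdigit() and 6 <= len(digits) <= 9):
        raise ValueError(f"unexpected run accession: {accession!r}")

    dir_1 = accession[:6]  # three letters plus the first three digits
    if len(digits) == 6:
        dir_2 = ""  # six-digit accessions have no extra subdirectory
    else:
        # digits beyond the sixth, left-padded with zeros to width 3
        dir_2 = digits[6:].zfill(3) + "/"

    return f"{prefix}{dir_1}/{dir_2}{accession}/"


if __name__ == "__main__":
    for run in ("ERR164407", "SRR1016916"):
        print(ena_fastq_dir(run))
```

The returned value is a directory; the FASTQ file names inside it (single-end or paired) still have to be appended before handing the URL to `wget`.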
# somewhat hackish solution to:
# https://twitter.com/EamonCaddigan/status/646759751242620928
# based mostly on copy/pasting from ggplot2 geom_violin source:
# https://github.com/hadley/ggplot2/blob/master/R/geom-violin.r
library(ggplot2)
library(dplyr)
"%||%" <- function(a, b) {
@crazyhottommy
crazyhottommy / umap.R
Created February 15, 2018 17:51 — forked from schochastics/umap.R
Quick and dirty way of using UMAP in R using rPython
#install UMAP from https://github.com/lmcinnes/umap
#install.packages("rPython")
umap <- function(x,n_neighbors=10,min_dist=0.1,metric="euclidean"){
x <- as.matrix(x)
colnames(x) <- NULL
rPython::python.exec( c( "def umap(data,n,mdist,metric):",
"\timport umap" ,
"\timport numpy",
"\tembedding = umap.UMAP(n_neighbors=n,min_dist=mdist,metric=metric).fit_transform(data)",
@crazyhottommy
crazyhottommy / plot_aligned_series.R
Created February 4, 2019 19:40 — forked from tomhopper/plot_aligned_series.R
Align multiple ggplot2 graphs with a common x axis and different y axes, each with different y-axis labels.
#' When plotting multiple data series that share a common x axis but different y axes,
#' we can just plot each graph separately. This suffers from the drawback that the shared axis will typically
#' not align across graphs due to different plot margins.
#' One easy solution is to reshape2::melt() the data and use ggplot2's facet_grid() mapping. However, there is
#' no way to label individual y axes.
#' facet_grid() and facet_wrap() were designed to plot small multiples, where both x- and y-axis ranges are
#' shared across all plots in the faceting. While the facet_ calls allow us to use different scales with
#' the \code{scales = "free"} argument, they should not be used this way.
#' A more robust approach is to use the grid package's grid.draw(), rbind() and ggplotGrob() to create a grid of
#' individual plots where the plot axes are properly aligned within the grid.