Daniel Nicorici ndaniel

ndaniel / do_hyper.R
Created January 30, 2019 10:59 — forked from slowkow/do_hyper.R
Compute a hypergeometric p-value for a gene set of interest.
# Try this with:
# -
# -
#' Compute a hypergeometric p-value for your gene set of interest relative to
#' a universe of genes that you have defined.
#' @param ids A vector with genes of interest.
#' @param universe A vector with all genes, including the genes of interest.
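The preview above stops before the function body, so the following is only a minimal sketch of how such a p-value could be computed with base R's `phyper`, not the gist's actual code. The function name `hyper_pvalue` and the third argument `gene_set` are assumptions made for illustration; only `ids` and `universe` appear in the documented parameters.

```r
# Hypothetical helper (not the gist's code): one-sided enrichment p-value
# for the overlap between `ids` and a single `gene_set` (assumed argument).
hyper_pvalue <- function(ids, universe, gene_set) {
  ids <- unique(ids)
  universe <- unique(universe)
  # The documented contract says the universe includes the genes of interest.
  stopifnot(all(ids %in% universe))
  # Only set members that are part of the universe can be drawn.
  gene_set <- intersect(unique(gene_set), universe)

  x <- length(intersect(ids, gene_set))  # genes of interest that are in the set
  m <- length(gene_set)                  # "successes" available in the universe
  n <- length(universe) - m              # "failures" available in the universe
  k <- length(ids)                       # number of genes drawn

  # phyper() returns P(X <= q); P(X >= x) is the upper tail evaluated at x - 1.
  phyper(x - 1, m, n, k, lower.tail = FALSE)
}

# Toy data, invented for illustration only.
universe <- paste0("g", 1:100)
gene_set <- paste0("g", 1:10)
ids <- paste0("g", c(1:5, 50:54))
p <- hyper_pvalue(ids, universe, gene_set)
print(p)
```

Passing `x - 1` with `lower.tail = FALSE` makes the test inclusive of the observed overlap, which is the usual convention for enrichment; the result should match a one-sided Fisher's exact test on the corresponding 2x2 table.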
ndaniel /
Created May 24, 2017 10:27 — forked from lh3/
Mapping short reads with a ~50bp INDEL

This is a small experiment on aligning reads that contain ~50bp INDELs. The query sequences are shown in 0.01.fq below: seq_ori is a 204bp sequence extracted from the human reference genome, seq_del54 contains a 54bp deletion in the middle, seq_del84 contains an 84bp deletion in a 120bp read, and seq_ins40 contains a 40bp insertion in a 140bp read. These four short sequences were mapped to the human reference genome with Bowtie2, BWA-MEM, LAST, Novoalign, SNAP and Stampy using default settings. Non-default scoring functions were also tested for Bowtie2 (--rdg 5,1 --rfg 5,1), BWA-MEM (-A2 -E1) and LAST (-r2 -q4). The output from the various mappers and settings can be found in this gist. The following table gives my summary:

| Mapper | Setting | -84bp | -54bp | +40bp |
|---|---|---|---|---|
| BBMAP | default | Yes | Yes | Yes |
| Bowtie2 | default | No | No | No |
| Bowtie2 | --rdg 5,1 --rfg 5,1 | as insertion | as insertion | Yes |
| BWA-MEM | default | as split | Yes | Yes |
| BWA-MEM | -A2 -E1 | Yes | Yes | Yes |
| LAST | default | as split | as split | |