Skip to content

Instantly share code, notes, and snippets.

View ndaniel's full-sized avatar

Daniel Nicorici ndaniel

View GitHub Profile
@lh3
lh3 / 0.00README.md
Last active April 28, 2022 21:04
Mapping short reads with a ~50bp INDEL

This is a small experiment on the alignment of ~50bp INDELs. The query sequences are shown in 0.01.fq below, where seq_ori is a 204bp sequence extracted from the human reference genome, seq_del54 contains a 54bp deletion in the middle, seq_del84 contains a 84bp deletion in a 120bp read, and seq_ins40 contains a 40bp insertion in a 140bp read. These four short sequences were mapped to the human reference genome with Bowtie2, BWA-MEM, LAST, Novoalign, SNAP and Stampy with default settings. Non-default scoring functions were also tested for Bowtie2 (--rdg 5,1 --rfg 5,1), BWA-MEM (-A2 -E1) and LAST (-r2 -q4). The output by various mappers/settings can be found in this gist. The following table gives my summary:

Mapper Setting -84bp -54bp +40bp
BBMAP default Yes Yes Yes
Bowtie2 default No No No
Bowtie2 --rdg 5,1 --rfg 5,1 as insertion as insertion Yes
BWA-MEM default as split Yes Yes
BWA-MEM -A2 -E1 Yes Yes Yes
LAST default as split as split
@why-not
why-not / gist:4582705
Last active February 1, 2024 00:44
Pandas recipe. I find pandas indexing counter intuitive, perhaps my intuitions were shaped by many years in the imperative world. I am collecting some recipes to do things quickly in pandas & to jog my memory.
"""making a dataframe"""
df = pd.DataFrame([[1, 2], [3, 4]], columns=list('AB'))
"""quick way to create an interesting data frame to try things out"""
df = pd.DataFrame(np.random.randn(5, 4), columns=['a', 'b', 'c', 'd'])
"""convert a dictionary into a DataFrame"""
"""make the keys into columns"""
df = pd.DataFrame(dic, index=[0])