Daniel Nicorici ndaniel

## 0.00README.md

lh3 / 0.00README.md

Last active April 28, 2022 21:04

Mapping short reads with a ~50bp INDEL

This is a small experiment on the alignment of ~50bp INDELs. The query sequences are shown in 0.01.fq below: seq_ori is a 204bp sequence extracted from the human reference genome, seq_del54 contains a 54bp deletion in the middle, seq_del84 contains an 84bp deletion in a 120bp read, and seq_ins40 contains a 40bp insertion in a 140bp read. These four short sequences were mapped to the human reference genome with Bowtie2, BWA-MEM, LAST, Novoalign, SNAP and Stampy using default settings. Non-default scoring functions were also tested for Bowtie2 (`--rdg 5,1 --rfg 5,1`), BWA-MEM (`-A2 -E1`) and LAST (`-r2 -q4`). The output of the various mappers and settings can be found in this gist. The following table gives my summary:


| Mapper | Setting | -84bp | -54bp | +40bp |
|---|---|---|---|---|
| BBMAP | default | Yes | Yes | Yes |
| Bowtie2 | default | No | No | No |
| Bowtie2 | `--rdg 5,1 --rfg 5,1` | as insertion | as insertion | Yes |
| BWA-MEM | default | as split | Yes | Yes |
| BWA-MEM | `-A2 -E1` | Yes | Yes | Yes |
| LAST | default | as split | as split | |


## gist:4582705
import numpy as np
import pandas as pd

"""making a dataframe"""
df = pd.DataFrame([[1, 2], [3, 4]], columns=list('AB'))

"""quick way to create an interesting data frame to try things out"""
df = pd.DataFrame(np.random.randn(5, 4), columns=['a', 'b', 'c', 'd'])

"""convert a dictionary into a DataFrame"""
"""make the keys into columns"""
dic = {'a': 1, 'b': 2}  # hypothetical example dictionary; not defined in the original
df = pd.DataFrame(dic, index=[0])
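
The `index=[0]` argument in the last snippet matters when every dictionary value is a scalar. The sketch below, which uses a made-up dictionary (`dic` and its contents are assumptions for illustration), shows what happens without the index and contrasts the keys-as-columns layout with a keys-as-rows layout built with `DataFrame.from_dict`.

```python
import pandas as pd

# Hypothetical example dictionary whose values are all scalars.
dic = {"mapper": "BWA-MEM", "setting": "-A2 -E1"}

# Without an index, pandas cannot tell how many rows to create
# from scalar values and raises ValueError.
try:
    pd.DataFrame(dic)
    scalar_error = None
except ValueError as exc:
    scalar_error = exc

# Keys become columns; index=[0] supplies the single row label.
df_cols = pd.DataFrame(dic, index=[0])

# Alternative layout: keys become the row index instead of the columns.
df_rows = pd.DataFrame.from_dict(dic, orient="index")

print(df_cols.shape, df_rows.shape)
```

If the dictionary values are lists of equal length rather than scalars, `pd.DataFrame(dic)` works without an explicit index.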