Sometimes FASTQ data has already been aligned to a reference and is stored as a BAM file, instead of the usual FASTQ read files. This is okay, because the raw FASTQ files can be recreated from the BAM file. The following outlines this process. Both samtools and bedtools are required.
From each BAM file, we need to extract:
1. reads that mapped properly as pairs
2. reads that didn’t map properly as pairs (both didn’t map, or one didn’t map)
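These categories correspond to bits in the SAM FLAG field of each alignment record. As a rough sketch of that mapping (the function name `classify_flag` and the category labels are made up for illustration, and it assumes the aligner sets the standard flag bits: 2 for "properly paired", 4 for "read unmapped", 8 for "mate unmapped"):

```shell
# Hypothetical helper: classify a SAM FLAG value into the categories above.
# Assumes the standard SAM flag bits: 2 = properly paired, 4 = read unmapped,
# 8 = mate unmapped.
classify_flag() {
    if [ $(( $1 & 2 )) -ne 0 ]; then
        echo "proper_pair"
    elif [ $(( $1 & 4 )) -ne 0 ] && [ $(( $1 & 8 )) -ne 0 ]; then
        echo "both_unmapped"
    elif [ $(( $1 & 4 )) -ne 0 ]; then
        echo "read_unmapped_mate_mapped"
    elif [ $(( $1 & 8 )) -ne 0 ]; then
        echo "read_mapped_mate_unmapped"
    else
        # Both ends mapped, but not flagged as a proper pair;
        # this case is not one of the two listed above.
        echo "mapped_not_proper_pair"
    fi
}

classify_flag 99   # 99 = 1 + 2 + 32 + 64
classify_flag 77   # 77 = 1 + 4 + 8 + 64
```

Here flag 99 falls in the first category and flag 77 is a pair where neither read mapped. The same bit values are what samtools uses when filtering reads by flag.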
For #1 (reads that mapped properly as pairs), the following command will work. It was taken from this webpage.