Skip to content

Instantly share code, notes, and snippets.

@crazyhottommy
Created October 31, 2013 18:46
Show Gist options
  • Save crazyhottommy/7254815 to your computer and use it in GitHub Desktop.
Save crazyhottommy/7254815 to your computer and use it in GitHub Desktop.
sam_to_bedgraph_HTSeq
import HTSeq
alignment_file = HTSeq.SAM_Reader("SRR817000.sam")
# HTSeq also has a BAM_Reader function to handle the bam file
# initialize a Genomic Array (a class defined in the HTSeq package to deal with NGS data,
# it allows transparent access of the data through the GenomicInterval object)
# more reading http://www-huber.embl.de/users/anders/HTSeq/doc/genomic.html#genomic
coverage = HTSeq.GenomicArray("auto", stranded = True, typecode = 'i')
# go through the alignment file, add count by 1 if in that Interval there is a read mapped there
for alignment in alignment_file:
if alignment.aligned:
coverage[ alignment.iv] += 1
# it takes a while to construct this coverage array since python goes through every read in the big SAM file
# write the file to bedgraph
coverage.write_bedgraph_file ( "plus.wig", "+")
coverage.write_bedgraph_file ( "minus.wig", "-")
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment