Shicheng Guo Shicheng-Guo
@nachocab
nachocab / mds.jpg
Last active December 18, 2015 08:46
@willtownes
willtownes / refgene2bed.py
Last active January 30, 2024 16:31
Splits a refGene.txt file into multiple .bed files for various genome features (exon, intron, etc.), suitable for input to bedtools coverage
"""
Python script that reads a reference genome annotation keyed by RefSeq IDs and writes several tab-delimited BED (text) files suitable for use with bedtools coverage, for counting ChIP-seq reads that map to various gene features.
All output files have the structure expected by bedtools, namely:
CHROM POSITION1 POSITION2 REFSEQ_ID
Possible output files include:
1. distal promoter (transcription start [-5KB,-1KB]); KB means kilobase pairs, not kilobytes
2. proximal promoter (transcription start [-1KB,1KB])
3. gene body (anywhere between transcription start and transcription end)
4. transcript (anywhere in an exon): outputs each exon as a separate line
5. first 1/3 transcript: outputs each exon as a separate line