@deliaBlue
Created January 24, 2024 19:41
Format and example of the BED6 file to be provided in the MIRFLOWZ pipeline to generate ASCII-style alignment pileups.

To generate the ASCII-style alignment pileups, the BED file containing the genomic regions of interest must have 6 fields:

  1. chrom - Name of the chromosome or scaffold in the Ensembl format (i.e. the chromosome name without the chr prefix). If you are using a custom sequence, make sure to fill this field with the first word of the name you used in the provided reference genome file.
  2. chromStart - Start position of the feature in standard chromosomal coordinates (i.e. the first base is numbered 0).
  3. chromEnd - End position of the feature in standard chromosomal coordinates.
  4. name - Feature name to be displayed in the pileups.
  5. score - This column is ignored when generating the pileups, so you can fill it with a placeholder such as . (a single dot).
  6. strand - Defined as + (forward) or - (reverse).
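
As a rough illustration of the six fields above, the following Python sketch splits one BED line and runs a few sanity checks. The function name `parse_bed6_line` and the checks themselves are assumptions made for this example; they are not part of MIRFLOWZ, and the pipeline may validate its input differently.

```python
def parse_bed6_line(line):
    """Split one BED6 line into named fields and run basic sanity checks.

    Hypothetical helper for illustration only; not part of MIRFLOWZ.
    """
    # Split on any whitespace so both tab- and space-separated lines work.
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, found {len(fields)}")

    chrom, start, end, name, score, strand = fields

    # int() raises ValueError by itself if a coordinate is not numeric.
    start, end = int(start), int(end)
    if start < 0 or end < start:
        raise ValueError(f"invalid coordinates: {start}, {end}")

    if strand not in ("+", "-"):
        raise ValueError(f"strand must be '+' or '-', found {strand!r}")

    # The score is kept as-is: it is ignored when generating the pileups.
    return {
        "chrom": chrom,
        "chromStart": start,
        "chromEnd": end,
        "name": name,
        "score": score,
        "strand": strand,
    }
```

The chrom field is deliberately not checked here, because a custom sequence name can be any word taken from the reference genome file.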

For instance, suppose the following sequence in the reference genome (our regions of interest are in capital letters):

>REF_SEQ (putative sequence in the positive strand)
tcttgatttaattaaagagcttaagaaAGATTTCAGCCTGTCTGACTTCCGCAGGGCAGC
CAGGAGGTCAGAATGGCGCTCAGAAGTCCTCCTCCACAGGAATTCTAACCCGGAGCGCCT
GCTGGCTACTGCCCAGAaactgagtcatgaagaaaccccacgtGTAAAATAATCCTTCAG
GCAAATGGGAAACGGTACCTTAGAATGGACTGtatcagagccatggactcaagatttgaa
tgaaatacagagccagctaagttccctcccgctggagccattcattcag

The two entries in the BED file will be:

REF_SEQ    27    136    region_1    .    +
REF_SEQ    163   211    region_2    .    +
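
To check where these coordinates come from, the short Python sketch below locates every run of capital letters in the example sequence and reports the 0-based index of its first and last base. The function name `uppercase_regions` is an assumption made for this example. Note that the entries above use the index of the last capital base as chromEnd; under the half-open convention of the UCSC BED specification the end would be one greater (137 and 212). Which convention the pipeline expects is not stated here, so treat this as a sketch rather than a rule.

```python
import re


def uppercase_regions(seq):
    """Return (first, last) 0-based indices of every run of capital letters."""
    # m.end() is exclusive, so the last capital base sits at m.end() - 1.
    return [(m.start(), m.end() - 1) for m in re.finditer(r"[A-Z]+", seq)]


# REF_SEQ from the example, with the FASTA line breaks removed.
ref_seq = (
    "tcttgatttaattaaagagcttaagaaAGATTTCAGCCTGTCTGACTTCCGCAGGGCAGC"
    "CAGGAGGTCAGAATGGCGCTCAGAAGTCCTCCTCCACAGGAATTCTAACCCGGAGCGCCT"
    "GCTGGCTACTGCCCAGAaactgagtcatgaagaaaccccacgtGTAAAATAATCCTTCAG"
    "GCAAATGGGAAACGGTACCTTAGAATGGACTGtatcagagccatggactcaagatttgaa"
    "tgaaatacagagccagctaagttccctcccgctggagccattcattcag"
)

for number, (first, last) in enumerate(uppercase_regions(ref_seq), start=1):
    print(f"region_{number}", first, last)
```

For this sequence the two runs span indices 27 to 136 and 163 to 211, which matches the chromStart and chromEnd values of region_1 and region_2.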