@kescobo
Last active August 29, 2015 14:17
HGT analysis pipeline
import annotated genomes
for genome in genomes:
    write genome to genome_database
database{
    species_name:
        CDS{
            annotation:
            aa sequence:
            nt sequence:
            3' position:
            5' position:
        }
}
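
One way the database layout above could be sketched in Python. The class names `CDS` and `Genome`, the `add_genome` helper, and the use of a per-CDS identifier as dictionary key are assumptions for illustration; the outline only fixes the fields themselves.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CDS:
    """One coding sequence, with the fields listed in the outline."""
    annotation: str
    aa_sequence: str
    nt_sequence: str
    three_prime: int  # 3' position on the genome
    five_prime: int   # 5' position on the genome


@dataclass
class Genome:
    """All CDS records for one species, keyed by a hypothetical CDS id."""
    species_name: str
    cds: Dict[str, CDS] = field(default_factory=dict)


# genome_database maps species_name -> Genome
genome_database: Dict[str, Genome] = {}


def add_genome(db: Dict[str, Genome], genome: Genome) -> None:
    """Write one genome into the database (hypothetical helper)."""
    db[genome.species_name] = genome
```

A nested dictionary would work just as well; dataclasses only make the field names explicit.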
for CDS in genome_database:
    BLAST against all other species
    if hit is > 99% identical:
        write to HGT_database:
            query_CDS{
                hit_CDS1:
                hit_CDS2:
                ... etc
            }
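
A minimal sketch of the filtering step, assuming BLAST was run with tabular output (`-outfmt 6`), where the first three columns are query id, subject id, and percent identity. The function name `collect_hgt_hits` and the `species_of` lookup (CDS id to species name) are hypothetical and not part of the outline.

```python
import csv
from collections import defaultdict
from typing import Dict, Iterable, List


def collect_hgt_hits(
    blast_lines: Iterable[str],
    species_of: Dict[str, str],
    min_identity: float = 99.0,
) -> Dict[str, List[str]]:
    """Build HGT_database: query CDS id -> list of hit CDS ids."""
    hgt_database = defaultdict(list)
    for row in csv.reader(blast_lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, pident = row[0], row[1], float(row[2])
        if species_of[query] == species_of[subject]:
            continue  # only hits against *other* species count
        if pident > min_identity:
            hgt_database[query].append(subject)
    return dict(hgt_database)
```

Any iterable of lines can be passed in, for example an open file handle on the BLAST results.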
for hit in HGT_database:
    if 3' or 5' within ~100bp of other_hit:
        add hit and other_hit to group
        align hits and deduplicate
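
The grouping step could look roughly like this. It is a sketch under two assumptions: each hit carries leftmost/rightmost genome coordinates, and "within ~100bp" means the gap between one hit's end and the next hit's start on the same genome. `Hit` and `group_neighbouring_hits` are hypothetical names, and the alignment and deduplication are not covered here.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    cds_id: str
    genome: str
    start: int  # leftmost coordinate of the CDS
    end: int    # rightmost coordinate of the CDS


def group_neighbouring_hits(hits: Iterable[Hit], max_gap: int = 100) -> List[List[Hit]]:
    """Chain hits on the same genome whose gap to the current group is <= max_gap."""
    groups: List[List[Hit]] = []
    for hit in sorted(hits, key=lambda h: (h.genome, h.start)):
        last = groups[-1] if groups else None
        if (
            last
            and last[-1].genome == hit.genome
            and hit.start - max(h.end for h in last) <= max_gap
        ):
            last.append(hit)      # close enough: extend the current group
        else:
            groups.append([hit])  # otherwise start a new group
    return groups
```

Hits with no neighbour come back as single-member groups; drop those with `len(group) > 1` if only clustered hits are of interest.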
for CDS in groups:
    get annotations