@mdshw5
Created March 31, 2014 14:23
Biostars 96573 solution
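from pyfaidx import Fasta

# regions.txt holds whitespace-delimited rows like the ones listed below;
# sequence.fasta is the FASTA file those coordinates refer to.
with open("regions.txt") as regions, Fasta("sequence.fasta") as fasta:
    for line in regions:
        fields = line.rstrip().split()
        # Columns 5-7: sequence name, start, end
        rname, start, end = fields[4:7]
        # Columns 10-11: repeat name and repeat class
        repeat = ' '.join(fields[9:11])
        # start/end are 1-based and inclusive, so the 0-based half-open
        # slice is [start-1:end]; slicing to end-1 would drop the last base.
        seq = fasta[rname][int(start)-1:int(end)]
        print(seq.name + repeat)
        print(seq.seq)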
21 3.6 0.0 0.0 gi|411024077|gb|CM001634.1| 11554 11581 (28623556) + AT_rich Low_complexity 1 28 (0) 1
22 0.0 0.0 0.0 gi|411024077|gb|CM001634.1| 12224 12245 (28622892) + AT_rich Low_complexity 1 22 (0) 2
22 8.0 0.0 0.0 gi|411024077|gb|CM001634.1| 12609 12658 (28622479) + AT_rich Low_complexity 1 50 (0) 3
22 5.6 0.0 0.0 gi|411024077|gb|CM001634.1| 12691 12726 (28622411) + AT_rich Low_complexity 1 36 (0) 4
26 3.0 0.0 0.0 gi|411024077|gb|CM001634.1| 13965 13997 (28621140) + AT_rich Low_complexity 1 33 (0) 5
222 3.7 0.0 0.0 gi|411024077|gb|CM001634.1| 16373 16399 (28618738) + (TA)n Simple_repeat 1 27 (0) 6
219 12.5 0.0 0.0 gi|411024077|gb|CM001634.1| 17247 17286 (28617851) + (G)n Simple_repeat 1 40 (0) 7
198 0.0 0.0 0.0 gi|411024077|gb|CM001634.1| 20074 20095 (28615042) + (CAAAAA)n Simple_repeat 3 24 (0) 8
189 0.0 0.0 0.0 gi|411024077|gb|CM001634.1| 20344 20364 (28614773) + (TA)n Simple_repeat 1 21 (0) 9
23 0.0 0.0 0.0 gi|411024077|gb|CM001634.1| 21437 21459 (28613678) + AT_rich Low_complexity 1 23 (0) 10
198 0.0 0.0 0.0 gi|411024077|gb|CM001634.1| 22420 22441 (28612696) + (GAA)n Simple_repeat 2 23 (0) 11
189 0.0 0.0 0.0 gi|411024077|gb|CM001634.1| 27191 27211 (28607926) + (TTTTTG)n Simple_repeat 4 24 (0) 12
21 0.0 0.0 0.0 gi|411024077|gb|CM001634.1| 27481 27501 (28607636) + AT_rich Low_complexity 1 21 (0) 13
24 16.1 0.0 0.0 gi|411024077|gb|CM001634.1| 33144 33174 (28601963) + AT_rich Low_complexity 1 31 (0) 14
186 4.3 0.0 0.0 gi|411024077|gb|CM001634.1| 34503 34525 (28600612) + (CGAAT)n Simple_repeat 2 24 (0) 15
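
The coordinate handling can be checked without a FASTA file. The following is a minimal sketch, assuming the start and end columns in the rows above are 1-based and inclusive (as in RepeatMasker-style output, where the span 11554-11581 corresponds to repeat positions 1-28). The helper names `parse_region` and `to_slice` are hypothetical and not part of pyfaidx or the original script.

```python
def parse_region(line):
    """Split one whitespace-delimited row into (rname, start, end, label)."""
    fields = line.split()
    rname = fields[4]
    start, end = int(fields[5]), int(fields[6])
    label = " ".join(fields[9:11])  # repeat name and repeat class
    return rname, start, end, label


def to_slice(start, end):
    """Convert a 1-based inclusive interval to a 0-based half-open slice."""
    return slice(start - 1, end)


# First row of the data above.
row = ("21 3.6 0.0 0.0 gi|411024077|gb|CM001634.1| 11554 11581 "
       "(28623556) + AT_rich Low_complexity 1 28 (0) 1")

rname, start, end, label = parse_region(row)
window = to_slice(start, end)
span = window.stop - window.start  # number of bases the slice covers

print(rname, label, span)
```

For this row the slice covers 28 bases, matching the repeat positions 1 to 28 reported in the same row; slicing to `end - 1` instead would cover only 27.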