Skip to content

Instantly share code, notes, and snippets.

@mdshw5
Last active March 9, 2020 15:51
Show Gist options
  • Save mdshw5/6c06a272a7fde2a5288214080210c8cf to your computer and use it in GitHub Desktop.
Save mdshw5/6c06a272a7fde2a5288214080210c8cf to your computer and use it in GitHub Desktop.
biostars 265811
from pyfaidx import Fasta
# positions.txt:
# chr1 1 T
# chr1 100 C
# chr2 10 G
# ...
with open('positions.txt') as mut_table:
# mutable Fasta modifies input file in-place
# make sure you're editing a copy of the original file
with Fasta('input.fasta', mutable=True) as fasta:
for line in mut_table:
rname, pos, base = line.rstrip().split()
# convert 1-based to 0-based coordinates
fasta[rname][int(pos) - 1] = base
@mdshw5
Copy link
Author

mdshw5 commented Mar 9, 2020

Sorry to hear that you're having trouble. I've tried to reproduce your issue using the example you provided, however I do see that the sites are mutated when running the code in this gist.

$ cat input.fasta
>ACVR1_R206H_1
CACCGCTGGCGAGCCACTGTTCTTTAACAGTGTAATCTGGCGAGCCACTGTTCTTTGTACCAGAAGAATTC
>ACVR1_R206H_2
CACCGTCTGGCGAGCCACTGTTCTTCAACAGTGTAATCTGGCGAGCCACTGTTCTTTGTACCAGAGAATTC
>ACVR1_S290L_3
CACCGATCGTTGTACGACTATCTTTCATGAAATGGGATCGTTGTACGACTATCTTCAGCTTACTAGAATTC
$ cat positions.txt 
ACVR1_R206H_1 1 T
ACVR1_R206H_2 1 T
ACVR1_S290L_3 1 T
ACVR1_R206H_1 4 T
ACVR1_R206H_2 4 T
ACVR1_S290L_3 4 T
$ cat mutate_guides.py
from pyfaidx import Fasta

# positions.txt:
# chr1 1 T
# chr1 100 C
# chr2 10 G
# ...

with open('positions.txt') as mut_table:
  # mutable Fasta modifies input file in-place
  # make sure you're editing a copy of the original file
  with Fasta('input.fasta', mutable=True) as fasta:
    for line in mut_table:
      rname, pos, base = line.rstrip().split()
      # convert 1-based to 0-based coordinates
      fasta[rname][int(pos) - 1] = base
$ python --version
Python 3.6.5
$ pip freeze
pyfaidx==0.5.8
six==1.14.0

Running the script produces the expected output:

$ python mutate_guides.py
$ cat input.fasta
>ACVR1_R206H_1
TACTGCTGGCGAGCCACTGTTCTTTAACAGTGTAATCTGGCGAGCCACTGTTCTTTGTACCAGAAGAATTC
>ACVR1_R206H_2
TACTGTCTGGCGAGCCACTGTTCTTCAACAGTGTAATCTGGCGAGCCACTGTTCTTTGTACCAGAGAATTC
>ACVR1_S290L_3
TACTGATCGTTGTACGACTATCTTTCATGAAATGGGATCGTTGTACGACTATCTTCAGCTTACTAGAATTC

Hopefully this can give you a starting point to figure out what's going wrong. Do note that the Fasta(mutable=True) does mean that the FASTA file will be edited in-place, and so you'll want to run the script on a copy of your guide sequence file so that you can compare the changes to the original.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment