@AlaaALatif
Last active April 13, 2022 09:23
import bjorn_support as bs
import mutations as bm
# FASTA must include reference NC_045512.2 (e.g. use cat to add the reference)
fasta_filepath = '/valhalla/2021-02-08_release/msa/2021-02-08_release.fa'
# specify name for output alignment
msa_filepath = 'msa.fa'
# run alignment (uses MAFFT but can be changed from bjorn_support.py)
bs.align_fasta(fasta_filepath, msa_filepath)
# load alignment
msa_data = bs.load_fasta(msa_filepath, is_aligned=True)
# identify variants for each sample
# must identify insertions before anything else, otherwise information is lost
try:
    insertions, _ = bm.identify_insertions_per_sample(msa_data)
except Exception:
    # fall back to no insertions if identification fails
    insertions = None
substitutions, _ = bm.identify_replacements_per_sample(msa_data)
deletions, _ = bm.identify_deletions_per_sample(msa_data)
@AlaaALatif

Script for tabulating mutations for each sample in the input FASTA file. Mutations are computed relative to the reference sequence named 'NC_045512.2', which must be present in the input FASTA.
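
Since the script depends on the reference being in the input FASTA, a quick check before running the alignment can catch a missing reference early. The following is a minimal sketch, not part of bjorn: `fasta_ids` and `has_reference` are hypothetical helper names, and it assumes the reference is identified by the first whitespace-delimited token of its header line.

```python
REFERENCE_ID = "NC_045512.2"  # reference name used by the script above


def fasta_ids(fasta_path):
    """Return the ID (first whitespace-delimited token) of every FASTA header."""
    ids = []
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                header = line[1:].strip()
                ids.append(header.split()[0] if header else "")
    return ids


def has_reference(fasta_path, reference_id=REFERENCE_ID):
    """Report whether a record named reference_id is present in the FASTA."""
    return reference_id in fasta_ids(fasta_path)
```

Calling `has_reference(fasta_filepath)` before `bs.align_fasta(...)` would show whether the reference still needs to be added to the file.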

@AlaaALatif

AlaaALatif commented Feb 12, 2021

As a rough estimate of runtime, this takes a total of 43.7 seconds on 245 SARS-CoV-2 samples collected between December 2020 and February 2021, using 8 cores on a Linux machine.

@AlaaALatif

link to supporting code (bjorn): https://github.com/andersen-lab/bjorn

@niemasd

niemasd commented Feb 23, 2021

Writing myself a comment to remind myself: in the code snippet above, insertions, substitutions, and deletions are just DataFrames, which can then be exported to a spreadsheet using the DataFrame.to_csv() function
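
The export described above can be sketched as follows. This is only an illustration: `export_mutation_tables` is a hypothetical helper, the toy DataFrame and its column names are made up rather than bjorn's actual output, and the `None` check covers the case where the script sets `insertions = None`.

```python
import pandas as pd


def export_mutation_tables(tables, prefix="mutations"):
    """Write each non-None DataFrame to '<prefix>_<name>.csv'; return the paths written."""
    written = []
    for name, df in tables.items():
        if df is None:  # e.g. insertions when identification failed
            continue
        path = f"{prefix}_{name}.csv"
        df.to_csv(path, index=False)  # index=False omits the row index column
        written.append(path)
    return written


if __name__ == "__main__":
    # Toy stand-in for the script's output; column names are hypothetical.
    substitutions = pd.DataFrame({"sample": ["s1", "s2"], "mutation": ["m1", "m2"]})
    print(export_mutation_tables({"insertions": None, "substitutions": substitutions}))
```

With the variables from the script, the call would be `export_mutation_tables({"insertions": insertions, "substitutions": substitutions, "deletions": deletions})`.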

@AlaaALatif

Hi Niema, please do let me know if there are any issues or lack of clarity

@niemasd

niemasd commented Feb 23, 2021

Thank you, will do! I appreciate it!

@liamxg

liamxg commented Apr 13, 2022

import bjorn_support as bs
ModuleNotFoundError: No module named 'bjorn_support'
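
This error generally means Python cannot find `bjorn_support.py` on its import path. One possible fix, assuming the bjorn repository linked above has been cloned locally and you have located the directory containing `bjorn_support.py` and `mutations.py`, is to add that directory to `sys.path` before the imports. `make_importable` is a hypothetical helper and the path in the usage note is a placeholder, not the repository's confirmed layout.

```python
import importlib.util
import sys


def make_importable(module_dir, module_name="bjorn_support"):
    """Prepend module_dir to sys.path and report whether module_name can now be found."""
    if module_dir not in sys.path:
        sys.path.insert(0, module_dir)
    importlib.invalidate_caches()  # make sure newly added files are noticed
    return importlib.util.find_spec(module_name) is not None
```

For example, `make_importable("/path/to/directory/with/bjorn_support")` should return `True` once the directory is correct, after which `import bjorn_support as bs` can run.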
