@finswimmer
Last active April 8, 2021 02:58
Replace amino acid 1-letter code with 3-letter code in annovar output
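import sys


def replace(text, replacement):
    """Return text with every character found in replacement swapped for its mapped value."""
    new = ''
    for c in text:
        try:
            new += replacement[c]
        except KeyError:
            # Characters without a mapping (digits, "p.", lowercase letters) pass through unchanged.
            new += c
    return new


# Amino acid 1-letter code -> 3-letter code; "*" marks a stop codon.
aa_code = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
    "*": "Ter"
}

# Column index of HGVS_P; stays -1 until the header line has been read.
hgvs_p = -1

with open(sys.argv[1], "r") as csv:
    for line in csv:
        # Strip only the line ending so that empty leading or trailing fields are preserved.
        data = line.rstrip("\r\n").split("\t")
        if hgvs_p == -1:
            # The first line is the header: locate the HGVS_P column.
            hgvs_p = data.index("HGVS_P")
        else:
            aa = data[hgvs_p]
            data[hgvs_p] = replace(aa, aa_code)
        print("\t".join(data))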
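
The same per-character substitution could also be written with Python's built-in str.translate. The following is only a minimal sketch, not part of the original script: the name to_three_letter and the sample values are made up for illustration, and it assumes the field holds nothing but the protein change (for example p.R175H), since any other uppercase letter in the field would be converted as well.

```python
# Sketch of the 1-letter -> 3-letter substitution using str.translate.
# Assumption: the input is a bare protein change such as "p.R175H".

AA_CODE = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
    "*": "Ter",
}

# str.maketrans accepts a dict whose keys are single characters;
# the values may be strings of any length.
AA_TABLE = str.maketrans(AA_CODE)


def to_three_letter(hgvs_p):
    """Hypothetical helper: replace each mapped character, leave all others untouched."""
    return hgvs_p.translate(AA_TABLE)


if __name__ == "__main__":
    print(to_three_letter("p.R175H"))
    print(to_three_letter("p.W26*"))
```

Characters that are not keys in the table (digits, the lowercase "p", the dot) are left as they are, which matches the KeyError fallback in the replace function above.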
@finswimmer
Author

Answer to the Biostars question: https://www.biostars.org/p/316062/
