@meren
Last active April 10, 2018 16:24
#!/usr/bin/env python
# Click 'Download > Multiple-file JSON' from NCBI search results page,
# unzip it, run this script in it without any parameters, get the
# markdown formatted table.
import json
import glob
# poor man's accessors: each lambda reads the globals `hits` and `index`
# that are set in the loop below
QUERY = lambda: hits['BlastOutput2']['report']['results']['search']['query_title'].split('___')[0]
QLEN = lambda: hits['BlastOutput2']['report']['results']['search']['query_len']
HIT = lambda: hits['BlastOutput2']['report']['results']['search']['hits'][index]
DESC = lambda: HIT()['description'][index]
TITLE = lambda: DESC()['title']
SCINAME = lambda: '_%s_' % DESC()['sciname']
ACC = lambda: '[%(desc)s](https://www.ncbi.nlm.nih.gov/protein/%(desc)s)' % {'desc': DESC()['accession']}
HSPS = lambda: HIT()['hsps'][0]
# 100.0 keeps the division floating-point under Python 2 as well
PCTALIGN = lambda: '%.2f%%' % (HSPS()['align_len'] * 100.0 / QLEN())
PCTID = lambda: '%.2f%%' % (HSPS()['identity'] * 100.0 / HSPS()['align_len'])
print('|'.join(['', 'Found in the assembly', 'Best hit on NCBI', 'Percent alignment', 'Percent identity', 'Accession', '']))
print('|'.join(['', ':--', ':--', ':--:', ':--:', ':--:', '']))
# go through every json file in the directory:
for j in glob.glob('*.json'):
    with open(j) as f:
        hits = json.load(f)

    # skip any JSON file that does not hold a BLAST report
    if 'BlastOutput2' not in hits:
        continue

    # report the best hit:
    index = 0

    # unless the best hit resolves to a multispecies entry; if it does,
    # move on to the next hit
    while 'MULTISPECIES' in TITLE():
        index += 1

    print('|'.join(['', QUERY(), SCINAME(), PCTALIGN(), PCTID(), ACC(), '']))
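
The loop above keeps incrementing `index` until it meets a title without 'MULTISPECIES', so a report in which every hit is a multispecies entry would end in an `IndexError`. A minimal sketch of a bounded variant follows. It assumes the first description of each hit is the one of interest (the script indexes descriptions with the same `index` it uses for hits), and both `first_non_multispecies` and the sample record are hypothetical, built only from the keys the script already reads.

```python
def first_non_multispecies(search):
    """Return the index of the first hit whose first description title
    lacks 'MULTISPECIES', or None if there is no such hit."""
    for i, hit in enumerate(search['hits']):
        # assumption: look at description[0] of every hit
        if 'MULTISPECIES' not in hit['description'][0]['title']:
            return i
    return None


# made-up record that mimics the keys under
# ['BlastOutput2']['report']['results']['search']
search = {
    'query_title': 'gene_1___extra',
    'query_len': 200,
    'hits': [
        {'description': [{'title': 'MULTISPECIES: hypothetical protein',
                          'sciname': 'Bacteria',
                          'accession': 'ACC_0001'}],
         'hsps': [{'align_len': 200, 'identity': 200}]},
        {'description': [{'title': 'hypothetical protein',
                          'sciname': 'Escherichia coli',
                          'accession': 'ACC_0002'}],
         'hsps': [{'align_len': 190, 'identity': 180}]},
    ],
}

best = first_non_multispecies(search)
if best is None:
    print('no single-species hit')
else:
    print(search['hits'][best]['description'][0]['sciname'])
```

Returning `None` lets the caller decide whether to skip the query or print a placeholder row instead of crashing partway through the table.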
@ShaiberAlon

This line seems wrong to me:

PCTALIGN = lambda: '%.2f%%' % (100 - ((HSPS()['align_len']* 100 / QLEN()) - 100))

And I think it should be:

PCTALIGN = lambda: '%.2f%%' % (HSPS()['align_len']* 100 / QLEN())
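
A quick check with made-up numbers shows the difference between the two expressions. The sketch below assumes a hypothetical 200-residue query with an alignment length of 150; `pct_align_old` and `pct_align_fixed` are illustrative names, not part of the gist.

```python
def pct_align_old(align_len, query_len):
    # the expression questioned above
    return 100 - ((align_len * 100.0 / query_len) - 100)


def pct_align_fixed(align_len, query_len):
    # the proposed replacement: alignment length as a share of the query
    return align_len * 100.0 / query_len


align_len, query_len = 150, 200  # made-up values
print('%.2f%%' % pct_align_old(align_len, query_len))
print('%.2f%%' % pct_align_fixed(align_len, query_len))
```

The first expression simplifies to 200 minus the percentage, so under these assumptions it reports 125.00% where the second reports 75.00%, and a shorter alignment would score even higher.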

@meren
Author

meren commented Apr 10, 2018

When did I disagree with you, Alon? Fixed!
