Skip to content

Instantly share code, notes, and snippets.

View aschreyer's full-sized avatar

Adrian Schreyer aschreyer

View GitHub Profile
@aschreyer
aschreyer / usrcat-query.sql
Created September 10, 2012 10:41
Example USRCAT query demonstrating the use of the cube extension
-- select a query molecule
WITH query AS
(
SELECT *
FROM dude.usr_moments_fnta
WHERE category = 'ACTIVE' AND mol_idx = 1 AND conformer = 0
)
-- run the virtual screen with the given query
SELECT category, mol_idx, MAX(sq.similarity) as similarity
FROM (
WITH conformer AS
(
SELECT openeye.omega('CC1=CC(=C(N1C2=CC(=CC(=C2)C(F)(F)F)C(F)(F)F)C)/C=C\3/C(=O)N=C(S3)NC4=CC=CC=C4') as oeb
),
query AS
(
SELECT openeye.usrcat(oeb) AS usr_moments
FROM conformer
),
screen AS
@aschreyer
aschreyer / credo-rest-usrcat.py
Created February 26, 2013 12:01
USRCAT search with one of the OSDD Malaria compounds, OSM-S-35. The SMILES strings were obtained by using chemicalize.org on the relevant OSDD blog post. Two stereoisomers were found for OSM-S-35 and a conformer generated for each with OpenEye's OMEGA toolkit.
import requests
headers = {'accept': 'application/json'}
files = {'file': open('/home/adrian/Dropbox/OSM-S-35-I.sdf')}
response = requests.post('http://marid.bioc.cam.ac.uk/credo/chemcomps/sim/usr/',
headers=headers, files=files, params={'usrcat':'t', 'limit':25})
loop_
_pdbx_struct_oper_list.id
_pdbx_struct_oper_list.type
_pdbx_struct_oper_list.name
_pdbx_struct_oper_list.symmetry_operation
_pdbx_struct_oper_list.matrix[1][1]
_pdbx_struct_oper_list.matrix[1][2]
_pdbx_struct_oper_list.matrix[1][3]
_pdbx_struct_oper_list.vector[1]
_pdbx_struct_oper_list.matrix[2][1]
@aschreyer
aschreyer / pgopeneye-benchmark.sql
Last active December 12, 2015 09:49
PostgreSQL OpenEye cartridge benchmark based on ChEMBL15.
-- set search_path = public, openeye;
DO $$
DECLARE
molregno integer;
fp oefp;
BEGIN
FOR molregno, fp IN SELECT cfp.molregno, cfp.circular_fp
FROM chembl.compound_oefps cfp
ORDER BY random() LIMIT 100
@paginate
def fetch_all_descendants(self, fragment_id, *expr, **kwargs):
"""
Returns all the descending fragments of the fragment with the given
fragment_id using an recursive SQL query.
"""
# query part that will be used in both parts of the recursive query
query = Fragment.query.with_entities(Fragment.fragment_id)
query = query.join(FragmentHierarchy, FragmentHierarchy.child_id==Fragment.fragment_id)
@aschreyer
aschreyer / chemcomprdmol-hybrid-method.py
Last active December 12, 2015 07:39
More credoscript examples to show the implementation of SQLAlchemy.
@hybrid_method
@requires.rdkit
def contains(self, smiles):
return self.rdmol.HasSubstructMatch(MolFromSmiles(str(smiles)))
@contains.expression
def contains(self, smiles):
return self.rdmol.op('OPERATOR(rdkit.@>)')(smiles)
import urllib2
import json
headers = {'Accept' : 'application/json'}
req = urllib2.Request('http://marid.bioc.cam.ac.uk/credo/chemcomps/001', headers=headers)
resp = urllib2.urlopen(req)
jsondata = json.load(resp)
# output
{u'data': [7464,
@aschreyer
aschreyer / openeye-multimcs-chembl-threshold.sql
Created November 12, 2012 17:15
The multimcs() function in the PostgreSQL OpenEye cartridge
SELECT assay_id,
count(DISTINCT molregno) AS compounds,
openeye.multimcs(array_agg(ism), threshold:=0.8) AS mcs
FROM chembl.target_dictionary
JOIN chembl.assay2target USING(tid)
JOIN chembl.activities a USING(assay_id)
JOIN chembl.compound_smiles cs USING(molregno)
WHERE protein_accession = 'P53779'
GROUP BY assay_id
HAVING count(DISTINCT molregno) >= 5
@aschreyer
aschreyer / unichem-fdw-query.sql
Created November 2, 2012 10:20
UniChem PostgreSQL Foreign Data Wrapper
SELECT *
FROM credo.unichem_pdb_to_kegg
LIMIT 10;
het_id compound_id
------ -----------
LQZ C07073
HC4 C00811
DC5 C14355
POD C10874