Skip to content

Instantly share code, notes, and snippets.

View aschreyer's full-sized avatar

Adrian Schreyer aschreyer

View GitHub Profile
@aschreyer
aschreyer / ChEMBL11.sql
Created August 12, 2011 08:25
PostgreSQL version of the ChEMBL11 database schema
--
-- PostgreSQL database dump
--
-- Dumped from database version 9.0.4
-- Dumped by pg_dump version 9.0.1
-- Started on 2011-08-12 09:23:17
SET statement_timeout = 0;
SET client_encoding = 'UTF8';
@aschreyer
aschreyer / chembl10.sql
Created August 9, 2011 07:48
PostgreSQL version of the ChEMBL10 database schema
--
-- PostgreSQL database dump
--
-- Dumped from database version 9.0.4
-- Dumped by pg_dump version 9.0.1
-- Started on 2011-07-22 13:30:36
SET statement_timeout = 0;
SET client_encoding = 'UTF8';
@aschreyer
aschreyer / sql-lingo.sql
Created June 13, 2011 16:41
Using the PostgreSQL pg_trgm extension for chemical similarity searching
SELECT molregno, s.ism, similarity(s.ism, 'C[NH+](C)CC[C@H](CSc1ccccc1)Nc2ccc(cc2[N+](=O)[O-])S(=O)(=O)NC(=O)c3ccc(cc3)N4CC[NH+](CC4)Cc5ccccc5c6ccc(cc6)Cl') As sim
FROM chembl.compound_smiles s
WHERE s.ism % 'C[NH+](C)CC[C@H](CSc1ccccc1)Nc2ccc(cc2[N+](=O)[O-])S(=O)(=O)NC(=O)c3ccc(cc3)N4CC[NH+](CC4)Cc5ccccc5c6ccc(cc6)Cl'
ORDER BY s.ism <-> 'C[NH+](C)CC[C@H](CSc1ccccc1)Nc2ccc(cc2[N+](=O)[O-])S(=O)(=O)NC(=O)c3ccc(cc3)N4CC[NH+](CC4)Cc5ccccc5c6ccc(cc6)Cl'
LIMIT 20;
@aschreyer
aschreyer / credo-rdkit.py
Created June 1, 2011 16:07
RDKit integration in CREDO
from credo.credoscript import *
ca = ChemCompAdaptor()
# GET IMATINIB
sti = ca.fetch_by_het_id('STI')
# GET ALL SIMILAR CHEMICAL COMPONENTS USING CIRCULAR FINGERPRINTS
ca.fetch_all_by_sim(sti.ism, fp='circular')
@aschreyer
aschreyer / credo-contrib-chemresolver.py
Created June 1, 2011 16:05
Implementing the Chemical Identifier Resolver
from credo.credoscript import *
from credo.contrib.chemidresolver import *
ca = ChemCompAdaptor()
rsvr = ChemIdResolver()
# GET THE SDF FILE OF LAPATINIB WITH 3D COORDINATES
molstring = rsvr.get_structure('lapatinib','sdf', get3d=True)
# GET THE USR MOMENTS FOR THE STRUCTURE
@aschreyer
aschreyer / credo-restful-example.py
Created June 1, 2011 15:59
Implementation of RESTful web services for CREDO
import json
from urllib import urlopen
# GET A CHEMICAL COMPONENT
response = urlopen('http://0.0.0.0:8000/credo/rest/chemcomps/sti').read()
# PARSE JSON INTO PYTHON DICTIONARY
sti = json.loads(response)
# COLUMN DATA/ATTRIBUTES ARE CONVERTED TO DICTIONARY KEYS
@aschreyer
aschreyer / vector3d-angle.sql
Created June 1, 2011 15:50
Eigen Vector3D PostgreSQL extension
SELECT degrees('(3,0,0)'::vector3d @ '(5,5,0)'::vector3d) AS angle;
angle
-------
45
(1 row)
@aschreyer
aschreyer / fuzcav-calpha.sql
Created June 1, 2011 12:16
Binding site similarity search in CREDO using the FuzCav algorithm
SELECT s.pdb, l.ligand_name, s.title, sq.similarity
FROM (
WITH query AS (SELECT calphafp FROM credo.ligand_fuzcav WHERE ligand_id = 3919)
SELECT f.ligand_id, arrayxi_fuzcavsim_global(query.calphafp, f.calphafp) AS similarity
FROM credo.ligand_fuzcav f, query
WHERE arrayxi_fuzcavsim_global(query.calphafp, f.calphafp) > 0.17
) sq,
credo.ligands l, credo.biomolecules b, credo.structures s
WHERE sq.ligand_id = l.ligand_id
AND l.biomolecule_id = b.biomolecule_id
@aschreyer
aschreyer / epinephrines.sql
Created May 24, 2011 12:55
Stereochemistry functions in the PostgreSQL OpenEye cartridge
SELECT canonical_smiles, openeye.has_undef_stereo(canonical_smiles)
FROM chembl.compound_structures s
JOIN chembl.chembl_id_lookup i
ON s.molregno = i.entity_id AND i.entity_type = 'COMPOUND'
WHERE i.chembl_id = 'CHEMBL287587' OR i.chembl_id = 'CHEMBL1437';
canonical_smiles | has_undef_stereo
----------------------------+-----------------------
'NC[C@H](O)c1ccc(O)c(O)c1' | f
@aschreyer
aschreyer / array-intersection.sql
Created May 20, 2011 15:05
Examples of the Eigen PostgreSQL extension
SELECT '{1,2,3,4,5}'::arrayxi & '{5,4,3,2,1}'::arrayxi;
?column?
---------------
'{0,0,3,0,0}'
(1 row)
Time: 0.290 ms