@aschreyer
Created October 24, 2011 15:07
RDKit PostgreSQL cartridge KNN-GIST benchmark
# select molregno, tanimoto_sml(morganbv_fp('CC(C(=O)c1ccccc1)C[NH+]1CC[NH+](CC(c2ccc(F)cc2)[NH+]2CC[NH+](C)CC2)CC1',2), circular_fp) as tanimoto
from chembl.fps
where morganbv_fp('CC(C(=O)c1ccccc1)C[NH+]1CC[NH+](CC(c2ccc(F)cc2)[NH+]2CC[NH+](C)CC2)CC1',2) % circular_fp
order by 2 desc
limit 10;
 molregno |     tanimoto
----------+-------------------
    10464 |                 1
    10451 | 0.869565217391304
    10344 | 0.836734693877551
   536673 |              0.82
    10770 | 0.803921568627451
    10644 | 0.773584905660377
    10541 | 0.773584905660377
    10634 | 0.759259259259259
    10769 | 0.740740740740741
    10370 | 0.740740740740741
(10 rows)
Time: 1747.021 ms
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=3216.78..3216.80 rows=10 width=67) (actual time=1742.996..1743.085 rows=10 loops=1)
   ->  Sort  (cost=3216.78..3219.27 rows=999 width=67) (actual time=1742.988..1743.019 rows=10 loops=1)
         Sort Key: (tanimoto_sml('\xe0ffffff000400002d000000023e0c2e1e12881606d61642161c0830041ccc982850061e2e0c020a280a06682218000e263c0016142a440e5812'::bfp, circular_fp))
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Bitmap Heap Scan on fps  (cost=104.96..3195.19 rows=999 width=67) (actual time=1742.251..1742.813 rows=39 loops=1)
               Recheck Cond: ('\xe0ffffff000400002d000000023e0c2e1e12881606d61642161c0830041ccc982850061e2e0c020a280a06682218000e263c0016142a440e5812'::bfp % circular_fp)
               ->  Bitmap Index Scan on morganbvidx  (cost=0.00..104.71 rows=999 width=0) (actual time=1742.204..1742.204 rows=39 loops=1)
                     Index Cond: ('\xe0ffffff000400002d000000023e0c2e1e12881606d61642161c0830041ccc982850061e2e0c020a280a06682218000e263c0016142a440e5812'::bfp % circular_fp)
 Total runtime: 1743.159 ms
(9 rows)
# select molregno, tanimoto_sml(morganbv_fp('CC(C(=O)c1ccccc1)C[NH+]1CC[NH+](CC(c2ccc(F)cc2)[NH+]2CC[NH+](C)CC2)CC1',2), circular_fp) as tanimoto
from chembl.fps
where morganbv_fp('CC(C(=O)c1ccccc1)C[NH+]1CC[NH+](CC(c2ccc(F)cc2)[NH+]2CC[NH+](C)CC2)CC1',2) % circular_fp
order by morganbv_fp('CC(C(=O)c1ccccc1)C[NH+]1CC[NH+](CC(c2ccc(F)cc2)[NH+]2CC[NH+](C)CC2)CC1',2) <%> circular_fp
limit 10;
 molregno |     tanimoto
----------+-------------------
    10464 |                 1
    10451 | 0.869565217391304
    10344 | 0.836734693877551
   536673 |              0.82
    10770 | 0.803921568627451
    10644 | 0.773584905660377
    10541 | 0.773584905660377
    10634 | 0.759259259259259
    10370 | 0.740740740740741
    10769 | 0.740740740740741
(10 rows)
Time: 1147.311 ms
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.05..40.16 rows=10 width=67) (actual time=3.998..1148.691 rows=10 loops=1)
   ->  Index Scan using morganbvidx on fps  (cost=0.05..4007.17 rows=999 width=67) (actual time=3.989..1148.612 rows=10 loops=1)
         Index Cond: ('\xe0ffffff000400002d000000023e0c2e1e12881606d61642161c0830041ccc982850061e2e0c020a280a06682218000e263c0016142a440e5812'::bfp % circular_fp)
         Order By: (circular_fp <%> '\xe0ffffff000400002d000000023e0c2e1e12881606d61642161c0830041ccc982850061e2e0c020a280a06682218000e263c0016142a440e5812'::bfp)
 Total runtime: 1148.768 ms
(5 rows)
Time: 1152.219 ms