@iwatobipen
Created March 19, 2022 08:22
@cthoyt

cthoyt commented Mar 19, 2022

You can replace

tree = etree.parse('./chembl_30_monomer_library.xml')
root = tree.getroot()

with the following code using chembl-downloader, so that this notebook automatically downloads the latest ChEMBL data (or pins a specific release with version="30"):

import chembl_downloader

# Option 1: Always get the latest
root = chembl_downloader.get_monomer_library_root()

# Option 2: Pin to a specific version
root = chembl_downloader.get_monomer_library_root(version="30")

This has the added benefit that anyone can re-run your notebook without messing around with wget and making sure the file paths are all correct. I demonstrated this works on my fork of this gist at https://gist.github.com/cthoyt/a48376238cabf1d368401550f5f3e5ee.
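
Either way, `root` is an XML element that the rest of the notebook can use as before. For a quick sanity check after loading, something like the sketch below may help. It assumes nothing about the monomer library schema and only counts element tags; `summarize_tags` is a hypothetical helper written for this comment, not part of chembl-downloader, and the demo XML is made up.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def summarize_tags(root):
    """Count how often each element tag occurs under (and including) root."""
    counts = Counter()
    for element in root.iter():
        # lxml yields comments and processing instructions with non-string tags; skip those
        if isinstance(element.tag, str):
            counts[element.tag] += 1
    return counts


# Tiny stand-in document for illustration only (not the real ChEMBL schema)
demo_root = ET.fromstring("<lib><item/><item/><other/></lib>")
print(summarize_tags(demo_root).most_common())
```

Calling `summarize_tags(root)` on the real root should work the same way, since both lxml and the standard library expose `iter()` and `tag`.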

@charlesxu90

Another alternative is to update the URL in the wget command to the following:

! wget https://ftp.ebi.ac.uk/pub/databases/chembl/ChEMBLdb/releases/chembl_30/chembl_30_monomer_library.xml
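
If wget is not available, the same download can probably be done from Python. This is a rough sketch using only the standard library. `monomer_library_url`, `fetch_monomer_library` and `load_root` are hypothetical helper names, and the URL pattern is simply generalized from the chembl_30 link above, so it is an assumption that other releases follow the same layout.

```python
import urllib.request
import xml.etree.ElementTree as ET
from pathlib import Path

# Base of the release directory, taken from the chembl_30 link above
BASE = "https://ftp.ebi.ac.uk/pub/databases/chembl/ChEMBLdb/releases"


def monomer_library_url(version="30"):
    """Build the monomer library URL (pattern assumed from the chembl_30 link)."""
    return f"{BASE}/chembl_{version}/chembl_{version}_monomer_library.xml"


def fetch_monomer_library(version="30", directory="."):
    """Download the XML file into directory unless it is already there."""
    url = monomer_library_url(version)
    path = Path(directory) / url.rsplit("/", 1)[-1]
    if not path.exists():
        urllib.request.urlretrieve(url, path)
    return path


def load_root(version="30", directory="."):
    """Fetch the file if needed, then parse it and return the XML root."""
    return ET.parse(fetch_monomer_library(version, directory)).getroot()
```

With this, `root = load_root("30")` would stand in for the wget call plus the two parsing lines.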