@j-wags
Last active August 31, 2022 17:24
biopolymer_alpha_notebook.ipynb
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "view-in-github"
},
"source": [
"[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gist/j-wags/4b9716cfd7c4c5efee03ff63ec6a964e/HEAD)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jGKy7ZaVSMNK"
},
"source": [
"# Let's load a protein from PDB\n",
"\n",
"Right now, the `Molecule.from_polymer_pdb` method can handle the canonical amino acids (26 including protonation states) in capped and uncapped forms. We may add an API to allow registration of new polymer substructures in a future release, but right now we just want to get this release out the door."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Warning: Unable to load toolkit 'OpenEye Toolkit'. The Open Force Field Toolkit does not require the OpenEye Toolkits, and can use RDKit/AmberTools instead. However, if you have a valid license for the OpenEye Toolkits, consider installing them for faster performance and additional file format support: https://docs.eyesopen.com/toolkits/python/quickstart-python/linuxosx.html OpenEye offers free Toolkit licenses for academics: https://www.eyesopen.com/academic-licensing\n"
]
},
{
"data": {
"text/plain": [
"'./T4-protein.pdb'"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# This code block just copies an example protein PDB into the current directory \n",
"# for your convenience\n",
"import shutil\n",
"from openff.toolkit.utils import get_data_file_path\n",
"file_path = get_data_file_path('proteins/T4-protein.pdb')\n",
"shutil.copy(file_path, './T4-protein.pdb')"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"id": "Ctz4Qshz0pkz"
},
"outputs": [],
"source": [
"from openff.toolkit import Molecule, Topology\n",
"# Load structures and perceive info\n",
"# This PDB file must have explicit hydrogens\n",
"protein = Molecule.from_polymer_pdb('T4-protein.pdb')"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "xdmbmufAEpBR",
"outputId": "56b9f8fa-7bc6-471d-b3c2-d09122a35eee"
},
"outputs": [
{
"data": {
"text/plain": [
"OrderedDict([('atom1', 17),\n",
" ('atom2', 18),\n",
" ('bond_order', 2),\n",
" ('is_aromatic', False),\n",
" ('stereochemistry', None),\n",
" ('fractional_bond_order', None)])"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Let's take a look at a peptide C=O bond. The bond orders have been filled \n",
"# in for a library of known substructures. So the information that this C=O\n",
"# bond has an order of 2 came from the chemical components dictionary \n",
"# published by the RCSB - _this information was NOT explicitly in the PDB file_\n",
"protein.bonds[1].to_dict()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"protein"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "EKmJ1RuyKdHw"
},
"source": [
"# \"HierarchySchemes\", AKA residues and chains\n",
"\n",
"Different software ecosystems have different expectations about what constitutes a \"residue\" or \"chain\". We try to be very hands-off, and allow atoms to have \"metadata\" (a dict of `(str): (str or int)`). Then we allow users to group atoms according to this metadata (for example, atoms with the same residue name, number, and chain ID could be in a single \"residue\"). Here, the \"residue\" is an instance of an OpenFF \"HierarchyElement\", and the machinery that groups them by their metadata is called a \"HierarchyScheme\".\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "nn25SwfEGM1B",
"outputId": "e95fde0d-e291-45ea-e671-4a4dd5e5f68f"
},
"outputs": [
{
"data": {
"text/plain": [
"OrderedDict([('atomic_number', 8),\n",
" ('formal_charge', 0),\n",
" ('is_aromatic', False),\n",
" ('stereochemistry', None),\n",
" ('name', 'O'),\n",
" ('metadata',\n",
" {'residue_name': 'MET',\n",
" 'residue_number': '0',\n",
" 'chain_id': 'A'})])"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Check out the new \"metadata\" field that this atom has\n",
"protein.atoms[18].to_dict()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "LQlkBQ8rsXId",
"outputId": "a391550e-ed06-49af-a24a-8cdb7096ba9c"
},
"outputs": [
{
"data": {
"text/plain": [
"{'chains': HierarchyScheme with uniqueness_criteria '('chain_id',)', iterator_name 'chains', and 1 elements,\n",
" 'residues': HierarchyScheme with uniqueness_criteria '('chain_id', 'residue_number', 'residue_name')', iterator_name 'residues', and 164 elements}"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# The OFFMol knows about residues and chains\n",
"protein.hierarchy_schemes"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "311Hqe8DIFuh",
"outputId": "989f32d8-b991-4b64-9eef-1d9b2ce6fe6e"
},
"outputs": [
{
"data": {
"text/plain": [
"[HierarchyElement ('A', '0', 'MET') of iterator 'residues' containing 19 atom(s),\n",
" HierarchyElement ('A', '1', 'ASN') of iterator 'residues' containing 14 atom(s),\n",
" HierarchyElement ('A', '2', 'ILE') of iterator 'residues' containing 19 atom(s),\n",
" HierarchyElement ('A', '3', 'PHE') of iterator 'residues' containing 20 atom(s),\n",
" HierarchyElement ('A', '4', 'GLU') of iterator 'residues' containing 15 atom(s)]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# We offer lightweight residue and chain iteration functionality\n",
"protein.residues[:5]"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "wHJEN5JzOEJU",
"outputId": "f0629b69-dc31-46e6-9625-3e5712d71652"
},
"outputs": [
{
"data": {
"text/plain": [
"[Atom(name=N, atomic number=7),\n",
" Atom(name=H, atomic number=1),\n",
" Atom(name=CA, atomic number=6),\n",
" Atom(name=HA, atomic number=1),\n",
" Atom(name=CB, atomic number=6),\n",
" Atom(name=HB2, atomic number=1),\n",
" Atom(name=HB3, atomic number=1),\n",
" Atom(name=CG, atomic number=6),\n",
" Atom(name=HG2, atomic number=1),\n",
" Atom(name=HG3, atomic number=1),\n",
" Atom(name=CD, atomic number=6),\n",
" Atom(name=OE1, atomic number=8),\n",
" Atom(name=OE2, atomic number=8),\n",
" Atom(name=C, atomic number=6),\n",
" Atom(name=O, atomic number=8)]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# The underlying atoms can be accessed using the \"particles\" iterator\n",
"[*protein.residues[4].atoms]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "KQ_2NoF2ZoiA"
},
"source": [
"## Hierarchy metadata is preserved when going to/from OpenMM, RDKit, OpenEye (not shown here), and PDB \n",
"\n",
"It's worth emphasizing again that we are VERY hands-off with hierarchy info handling - Our main goal is to have as few restrictions as possible. But the other side of this coin is that there are cases where we're converting to a format with limitations on their hierarchy info and information is lost or other unexpected things happen. We are adding a special docs page to help people understand this here: https://docs.openforcefield.org/projects/toolkit/en/main/users/molecule_conversion.html"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "ssqNn3xeWrAv",
"outputId": "2e3fcb21-ee60-4442-e595-3163a43bbb2c"
},
"outputs": [
{
"data": {
"text/plain": [
"<bound method Topology.residues of <Topology; 1 chains, 164 residues, 2634 atoms, 2654 bonds>>"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Here's an example of where hierarchy info goes when converting an OFFMol\n",
"# to OpenMM\n",
"omm_top = protein.to_topology().to_openmm()\n",
"omm_top.residues"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "3amen4QzanB-",
"outputId": "e258b90a-288e-4735-9a02-b3918ddb7cd9"
},
"outputs": [
{
"data": {
"text/plain": [
"<Atom 100 (H51x) of chain 0 residue 5 (MET)>"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"[*omm_top.atoms()][100]"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "2dr_znOZZ1ar",
"outputId": "1d4291fe-ac2d-473e-ddd8-c60d7f6a4827"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"MET 5 A\n"
]
}
],
"source": [
"# Here's where hierarchy info goes in RDKit\n",
"rd_protein = protein.to_rdkit()\n",
"res_info = rd_protein.GetAtomWithIdx(100).GetPDBResidueInfo()\n",
"print(res_info.GetResidueName(), res_info.GetResidueNumber(), res_info.GetChainId())"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "KfReaGusabGu",
"outputId": "403f918e-446e-4d05-a5e4-8a2c44fba23f"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"ATOM 1 N MET A 0 33.430 11.280 44.490 1.00 0.00 N1+\r\n",
"ATOM 2 H MET A 0 32.860 11.870 45.090 1.00 0.00 H \r\n",
"ATOM 3 H2 MET A 0 32.970 10.380 44.440 1.00 0.00 H \r\n",
"ATOM 4 H3 MET A 0 34.270 11.280 45.050 1.00 0.00 H \r\n",
"ATOM 5 CA MET A 0 33.550 11.970 43.200 1.00 0.00 C \r\n",
"ATOM 6 HA MET A 0 34.090 12.890 43.410 1.00 0.00 H \r\n",
"ATOM 7 CB MET A 0 34.550 11.390 42.270 1.00 0.00 C \r\n",
"ATOM 8 HB2 MET A 0 35.470 11.330 42.850 1.00 0.00 H \r\n",
"ATOM 9 HB3 MET A 0 34.320 10.370 41.960 1.00 0.00 H \r\n",
"ATOM 10 CG MET A 0 34.730 12.320 41.020 1.00 0.00 C \r\n",
"ATOM 11 HG2 MET A 0 35.430 11.840 40.340 1.00 0.00 H \r\n",
"ATOM 12 HG3 MET A 0 33.720 12.350 40.620 1.00 0.00 H \r\n",
"ATOM 13 SD MET A 0 35.270 14.050 41.300 1.00 0.00 S \r\n",
"ATOM 14 CE MET A 0 37.050 13.800 41.760 1.00 0.00 C \r\n",
"ATOM 15 HE1 MET A 0 37.170 13.650 42.830 1.00 0.00 H \r\n",
"ATOM 16 HE2 MET A 0 37.500 12.990 41.190 1.00 0.00 H \r\n",
"ATOM 17 HE3 MET A 0 37.570 14.740 41.600 1.00 0.00 H \r\n",
"ATOM 18 C MET A 0 32.100 12.250 42.660 1.00 0.00 C \r\n",
"ATOM 19 O MET A 0 31.370 11.320 42.330 1.00 0.00 O \r\n",
"ATOM 20 N ASN A 1 31.740 13.550 42.670 1.00 0.00 N \r\n",
"ATOM 21 H ASN A 1 32.490 14.160 42.970 1.00 0.00 H \r\n",
"ATOM 22 CA ASN A 1 30.410 14.180 42.580 1.00 0.00 C \r\n",
"ATOM 23 HA ASN A 1 29.790 13.840 41.750 1.00 0.00 H \r\n",
"ATOM 24 CB ASN A 1 29.680 13.970 43.910 1.00 0.00 C \r\n",
"ATOM 25 HB2 ASN A 1 28.610 14.090 43.760 1.00 0.00 H \r\n",
"ATOM 26 HB3 ASN A 1 29.880 13.020 44.410 1.00 0.00 H \r\n",
"ATOM 27 CG ASN A 1 29.930 15.080 44.920 1.00 0.00 C \r\n",
"ATOM 28 OD1 ASN A 1 29.060 15.930 45.070 1.00 0.00 O \r\n",
"ATOM 29 ND2 ASN A 1 31.150 15.160 45.390 1.00 0.00 N \r\n",
"ATOM 30 HD21 ASN A 1 31.300 15.760 46.190 1.00 0.00 H \r\n"
]
}
],
"source": [
"# We also write this hierarchy info out to PDB if requested (this all goes \n",
"# through OpenMM's excellent PDBFile class)\n",
"protein.to_file('test.pdb', file_format='pdb')\n",
"!head -n 30 test.pdb"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_qv0_hqDZuRM"
},
"source": [
"# OpenFF units \n",
"\n",
"We've replaced our use of OpenMM's units with a more [interoperable units package called `openff-units`](https://github.com/openforcefield/openff-units#getting-started). This is a thin wrapper over `pint` which is made to happily interoperate with `openmm.unit`, and to make it easy to implement interoperation with additional units packages in the future.\n",
"\n",
"When it comes to handling units, the major user-facing difference is that _getters_ will now return `openff-units` `Quantity` objects (previously they returned `openmm` `Quantity` objects). On the other hand, _setters_ are made to work with `openmm`, `pint`, or `openff-units` `Quantity` objects. \n",
"\n",
"https://docs.openforcefield.org/projects/toolkit/en/main/releasehistory.html#migration-guide"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "U1Xw0BQbhgji",
"outputId": "015ce584-20e7-4938-a1c5-023877f0e395"
},
"outputs": [
{
"data": {
"text/plain": [
"openff.units.units.Quantity"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Quantities are now returned as openff-units Quantities\n",
"first_conformer = protein.conformers[0]\n",
"type(first_conformer)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "V4tOa15Ph42i",
"outputId": "16d828bb-9c65-4ef7-bb3f-75901016f864"
},
"outputs": [
{
"data": {
"text/plain": [
"openmm.unit.quantity.Quantity"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# We can convert an openff-units Quantity to openmm\n",
"from openff.units.openmm import to_openmm, from_openmm\n",
"type(to_openmm(first_conformer))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "csdwreNwSUOV"
},
"source": [
"# (probably not that useful) Assigning residue information to a protein that we made from SMILES\n",
"\n",
"If we somehow have a chemical description of a protein without residue numbers/names, we can use the `perceive_residues` method to assign that information."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 322,
"referenced_widgets": [
"f76efb6ffa1240678342be81a9bd2462"
]
},
"id": "yLiDkOxl8U3x",
"outputId": "9b458103-f7d9-4f4f-c962-2b23007a31ae"
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "5779eccd390147b99b82de30cb896e07",
"version_major": 2,
"version_minor": 0
},
"text/plain": []
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/svg+xml": [
"<svg xmlns=\"http://www.w3.org/2000/svg\" xmlns:rdkit=\"http://www.rdkit.org/xml\" xmlns:xlink=\"http://www.w3.org/1999/xlink\" version=\"1.1\" baseProfile=\"full\" xml:space=\"preserve\" width=\"500px\" height=\"300px\" viewBox=\"0 0 500 300\">\n",
"<!-- END OF HEADER -->\n",
"<rect style=\"opacity:1.0;fill:#FFFFFF;stroke:none\" width=\"500.0\" height=\"300.0\" x=\"0.0\" y=\"0.0\"> </rect>\n",
"<path class=\"bond-0 atom-0 atom-1\" d=\"M 68.0,138.0 L 104.5,116.7\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-1 atom-1 atom-2\" d=\"M 100.5,116.9 L 99.8,100.1\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-1 atom-1 atom-2\" d=\"M 99.8,100.1 L 99.0,83.4\" style=\"fill:none;fill-rule:evenodd;stroke:#FF0000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-1 atom-1 atom-2\" d=\"M 108.4,116.5 L 107.6,99.8\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-1 atom-1 atom-2\" d=\"M 107.6,99.8 L 106.8,83.0\" style=\"fill:none;fill-rule:evenodd;stroke:#FF0000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-2 atom-1 atom-3\" d=\"M 104.5,116.7 L 121.8,127.2\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-2 atom-1 atom-3\" d=\"M 121.8,127.2 L 139.1,137.7\" style=\"fill:none;fill-rule:evenodd;stroke:#0000FF;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-3 atom-3 atom-4\" d=\"M 156.0,139.0 L 174.0,130.9\" style=\"fill:none;fill-rule:evenodd;stroke:#0000FF;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-3 atom-3 atom-4\" d=\"M 174.0,130.9 L 191.9,122.8\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-4 atom-4 atom-5\" d=\"M 191.9,122.8 L 171.9,83.3\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-5 atom-4 atom-6\" d=\"M 191.9,122.8 L 228.5,146.7\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-6 atom-6 atom-7\" d=\"M 232.4,146.9 L 231.9,163.3\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-6 atom-6 atom-7\" d=\"M 231.9,163.3 L 231.4,179.7\" style=\"fill:none;fill-rule:evenodd;stroke:#FF0000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-6 atom-6 atom-7\" d=\"M 224.6,146.6 L 224.1,163.0\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-6 atom-6 atom-7\" d=\"M 224.1,163.0 L 223.6,179.5\" style=\"fill:none;fill-rule:evenodd;stroke:#FF0000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-7 atom-6 atom-8\" d=\"M 228.5,146.7 L 244.3,137.8\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-7 atom-6 atom-8\" d=\"M 244.3,137.8 L 260.2,128.8\" style=\"fill:none;fill-rule:evenodd;stroke:#0000FF;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-8 atom-8 atom-9\" d=\"M 277.1,128.9 L 292.3,137.6\" style=\"fill:none;fill-rule:evenodd;stroke:#0000FF;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-8 atom-8 atom-9\" d=\"M 292.3,137.6 L 307.5,146.4\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-9 atom-9 atom-10\" d=\"M 307.5,146.4 L 324.3,185.0\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-10 atom-10 atom-11\" d=\"M 324.3,185.0 L 315.7,197.1\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-10 atom-10 atom-11\" d=\"M 315.7,197.1 L 307.1,209.2\" style=\"fill:none;fill-rule:evenodd;stroke:#FF0000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-11 atom-9 atom-12\" d=\"M 307.5,146.4 L 352.4,123.6\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-12 atom-12 atom-13\" d=\"M 348.5,123.5 L 349.0,107.1\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-12 atom-12 atom-13\" d=\"M 349.0,107.1 L 349.5,90.8\" style=\"fill:none;fill-rule:evenodd;stroke:#FF0000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-12 atom-12 atom-13\" d=\"M 356.3,123.8 L 356.8,107.4\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-12 atom-12 atom-13\" d=\"M 356.8,107.4 L 357.3,91.0\" style=\"fill:none;fill-rule:evenodd;stroke:#FF0000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-13 atom-12 atom-14\" d=\"M 352.4,123.6 L 369.4,133.6\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-13 atom-12 atom-14\" d=\"M 369.4,133.6 L 386.3,143.6\" style=\"fill:none;fill-rule:evenodd;stroke:#0000FF;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-14 atom-14 atom-15\" d=\"M 403.2,144.6 L 418.9,137.0\" style=\"fill:none;fill-rule:evenodd;stroke:#0000FF;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-14 atom-14 atom-15\" d=\"M 418.9,137.0 L 434.5,129.5\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-15 atom-0 atom-16\" d=\"M 68.0,138.0 L 83.9,164.2\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-16 atom-0 atom-17\" d=\"M 68.0,138.0 L 53.0,165.1\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-17 atom-0 atom-18\" d=\"M 68.0,138.0 L 38.6,121.5\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-18 atom-3 atom-19\" d=\"M 146.9,153.4 L 146.1,164.5\" style=\"fill:none;fill-rule:evenodd;stroke:#0000FF;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-18 atom-3 atom-19\" d=\"M 146.1,164.5 L 145.4,175.6\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-19 atom-4 atom-20\" d=\"M 195.9,119.6 L 194.6,118.4\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-19 atom-4 atom-20\" d=\"M 199.8,116.3 L 197.2,114.0\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-19 atom-4 atom-20\" d=\"M 203.8,113.1 L 199.8,109.6\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-19 atom-4 atom-20\" d=\"M 207.7,109.8 L 202.4,105.3\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-19 atom-4 atom-20\" d=\"M 211.6,106.6 L 205.0,100.9\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-20 atom-5 atom-21\" d=\"M 171.9,83.3 L 197.0,64.5\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-21 atom-5 atom-22\" d=\"M 171.9,83.3 L 168.3,51.6\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-22 atom-5 atom-23\" d=\"M 171.9,83.3 L 146.5,85.2\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-23 atom-8 atom-24\" d=\"M 268.5,113.3 L 268.4,102.2\" style=\"fill:none;fill-rule:evenodd;stroke:#0000FF;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-23 atom-8 atom-24\" d=\"M 268.4,102.2 L 268.3,91.0\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-24 atom-9 atom-25\" d=\"M 307.5,146.4 L 295.2,172.8 L 287.9,167.9 Z\" style=\"fill:#000000;fill-rule:evenodd;fill-opacity:1;stroke:#000000;stroke-width:1.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;\"/>\n",
"<path class=\"bond-25 atom-10 atom-26\" d=\"M 324.3,185.0 L 351.4,172.5\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-26 atom-10 atom-27\" d=\"M 324.3,185.0 L 351.0,203.8\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-27 atom-11 atom-28\" d=\"M 304.7,231.0 L 308.8,239.7\" style=\"fill:none;fill-rule:evenodd;stroke:#FF0000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-27 atom-11 atom-28\" d=\"M 308.8,239.7 L 312.9,248.4\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-28 atom-14 atom-29\" d=\"M 394.7,159.3 L 394.7,170.4\" style=\"fill:none;fill-rule:evenodd;stroke:#0000FF;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-28 atom-14 atom-29\" d=\"M 394.7,170.4 L 394.7,181.4\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-29 atom-15 atom-30\" d=\"M 434.5,129.5 L 421.7,100.8\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-30 atom-15 atom-31\" d=\"M 434.5,129.5 L 452.5,105.4\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"bond-31 atom-15 atom-32\" d=\"M 434.5,129.5 L 461.4,148.9\" style=\"fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:2.0px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1\"/>\n",
"<path class=\"atom-2\" d=\"M 93.9 72.2 Q 93.9 67.8, 96.1 65.3 Q 98.3 62.8, 102.4 62.8 Q 106.4 62.8, 108.6 65.3 Q 110.8 67.8, 110.8 72.2 Q 110.8 76.7, 108.6 79.2 Q 106.4 81.8, 102.4 81.8 Q 98.3 81.8, 96.1 79.2 Q 93.9 76.7, 93.9 72.2 M 102.4 79.7 Q 105.2 79.7, 106.7 77.8 Q 108.2 75.9, 108.2 72.2 Q 108.2 68.6, 106.7 66.7 Q 105.2 64.9, 102.4 64.9 Q 99.5 64.9, 98.0 66.7 Q 96.5 68.5, 96.5 72.2 Q 96.5 75.9, 98.0 77.8 Q 99.5 79.7, 102.4 79.7 \" fill=\"#FF0000\"/>\n",
"<path class=\"atom-3\" d=\"M 143.5 133.5 L 149.6 143.3 Q 150.2 144.3, 151.1 146.0 Q 152.1 147.8, 152.1 147.9 L 152.1 133.5 L 154.6 133.5 L 154.6 152.0 L 152.1 152.0 L 145.6 141.3 Q 144.8 140.0, 144.0 138.6 Q 143.2 137.2, 143.0 136.7 L 143.0 152.0 L 140.6 152.0 L 140.6 133.5 L 143.5 133.5 \" fill=\"#0000FF\"/>\n",
"<path class=\"atom-7\" d=\"M 218.7 190.4 Q 218.7 186.0, 220.8 183.5 Q 223.0 181.0, 227.1 181.0 Q 231.2 181.0, 233.4 183.5 Q 235.6 186.0, 235.6 190.4 Q 235.6 194.9, 233.4 197.5 Q 231.2 200.0, 227.1 200.0 Q 223.1 200.0, 220.8 197.5 Q 218.7 194.9, 218.7 190.4 M 227.1 197.9 Q 230.0 197.9, 231.5 196.0 Q 233.0 194.1, 233.0 190.4 Q 233.0 186.8, 231.5 185.0 Q 230.0 183.1, 227.1 183.1 Q 224.3 183.1, 222.8 184.9 Q 221.3 186.8, 221.3 190.4 Q 221.3 194.1, 222.8 196.0 Q 224.3 197.9, 227.1 197.9 \" fill=\"#FF0000\"/>\n",
"<path class=\"atom-8\" d=\"M 264.5 114.8 L 270.6 124.5 Q 271.2 125.5, 272.2 127.3 Q 273.1 129.0, 273.2 129.1 L 273.2 114.8 L 275.6 114.8 L 275.6 133.2 L 273.1 133.2 L 266.6 122.5 Q 265.8 121.3, 265.0 119.9 Q 264.3 118.4, 264.0 118.0 L 264.0 133.2 L 261.6 133.2 L 261.6 114.8 L 264.5 114.8 \" fill=\"#0000FF\"/>\n",
"<path class=\"atom-11\" d=\"M 291.0 220.0 Q 291.0 215.6, 293.2 213.1 Q 295.3 210.6, 299.4 210.6 Q 303.5 210.6, 305.7 213.1 Q 307.9 215.6, 307.9 220.0 Q 307.9 224.5, 305.7 227.1 Q 303.5 229.6, 299.4 229.6 Q 295.4 229.6, 293.2 227.1 Q 291.0 224.5, 291.0 220.0 M 299.4 227.5 Q 302.3 227.5, 303.8 225.6 Q 305.3 223.7, 305.3 220.0 Q 305.3 216.4, 303.8 214.6 Q 302.3 212.7, 299.4 212.7 Q 296.6 212.7, 295.1 214.5 Q 293.6 216.4, 293.6 220.0 Q 293.6 223.8, 295.1 225.6 Q 296.6 227.5, 299.4 227.5 \" fill=\"#FF0000\"/>\n",
"<path class=\"atom-13\" d=\"M 345.3 79.9 Q 345.3 75.4, 347.5 73.0 Q 349.7 70.5, 353.7 70.5 Q 357.8 70.5, 360.0 73.0 Q 362.2 75.4, 362.2 79.9 Q 362.2 84.4, 360.0 86.9 Q 357.8 89.4, 353.7 89.4 Q 349.7 89.4, 347.5 86.9 Q 345.3 84.4, 345.3 79.9 M 353.7 87.4 Q 356.6 87.4, 358.1 85.5 Q 359.6 83.6, 359.6 79.9 Q 359.6 76.2, 358.1 74.4 Q 356.6 72.6, 353.7 72.6 Q 350.9 72.6, 349.4 74.4 Q 347.9 76.2, 347.9 79.9 Q 347.9 83.6, 349.4 85.5 Q 350.9 87.4, 353.7 87.4 \" fill=\"#FF0000\"/>\n",
"<path class=\"atom-14\" d=\"M 390.7 139.4 L 396.7 149.2 Q 397.3 150.2, 398.3 151.9 Q 399.3 153.6, 399.3 153.8 L 399.3 139.4 L 401.8 139.4 L 401.8 157.9 L 399.2 157.9 L 392.7 147.2 Q 392.0 145.9, 391.2 144.5 Q 390.4 143.1, 390.2 142.6 L 390.2 157.9 L 387.8 157.9 L 387.8 139.4 L 390.7 139.4 \" fill=\"#0000FF\"/>\n",
"<path class=\"atom-16\" d=\"M 83.1 165.7 L 85.6 165.7 L 85.6 173.5 L 95.1 173.5 L 95.1 165.7 L 97.6 165.7 L 97.6 184.2 L 95.1 184.2 L 95.1 175.6 L 85.6 175.6 L 85.6 184.2 L 83.1 184.2 L 83.1 165.7 \" fill=\"#000000\"/>\n",
"<path class=\"atom-17\" d=\"M 39.9 166.5 L 42.4 166.5 L 42.4 174.4 L 51.9 174.4 L 51.9 166.5 L 54.4 166.5 L 54.4 185.0 L 51.9 185.0 L 51.9 176.5 L 42.4 176.5 L 42.4 185.0 L 39.9 185.0 L 39.9 166.5 \" fill=\"#000000\"/>\n",
"<path class=\"atom-18\" d=\"M 22.7 107.3 L 25.2 107.3 L 25.2 115.2 L 34.7 115.2 L 34.7 107.3 L 37.2 107.3 L 37.2 125.8 L 34.7 125.8 L 34.7 117.3 L 25.2 117.3 L 25.2 125.8 L 22.7 125.8 L 22.7 107.3 \" fill=\"#000000\"/>\n",
"<path class=\"atom-19\" d=\"M 137.4 177.1 L 139.9 177.1 L 139.9 184.9 L 149.4 184.9 L 149.4 177.1 L 151.9 177.1 L 151.9 195.6 L 149.4 195.6 L 149.4 187.0 L 139.9 187.0 L 139.9 195.6 L 137.4 195.6 L 137.4 177.1 \" fill=\"#000000\"/>\n",
"<path class=\"atom-20\" d=\"M 211.2 82.7 L 213.7 82.7 L 213.7 90.5 L 223.2 90.5 L 223.2 82.7 L 225.7 82.7 L 225.7 101.2 L 223.2 101.2 L 223.2 92.6 L 213.7 92.6 L 213.7 101.2 L 211.2 101.2 L 211.2 82.7 \" fill=\"#000000\"/>\n",
"<path class=\"atom-21\" d=\"M 198.4 48.8 L 200.9 48.8 L 200.9 56.7 L 210.4 56.7 L 210.4 48.8 L 212.9 48.8 L 212.9 67.3 L 210.4 67.3 L 210.4 58.7 L 200.9 58.7 L 200.9 67.3 L 198.4 67.3 L 198.4 48.8 \" fill=\"#000000\"/>\n",
"<path class=\"atom-22\" d=\"M 159.8 31.7 L 162.3 31.7 L 162.3 39.5 L 171.8 39.5 L 171.8 31.7 L 174.3 31.7 L 174.3 50.1 L 171.8 50.1 L 171.8 41.6 L 162.3 41.6 L 162.3 50.1 L 159.8 50.1 L 159.8 31.7 \" fill=\"#000000\"/>\n",
"<path class=\"atom-23\" d=\"M 130.6 76.6 L 133.1 76.6 L 133.1 84.5 L 142.5 84.5 L 142.5 76.6 L 145.0 76.6 L 145.0 95.1 L 142.5 95.1 L 142.5 86.6 L 133.1 86.6 L 133.1 95.1 L 130.6 95.1 L 130.6 76.6 \" fill=\"#000000\"/>\n",
"<path class=\"atom-24\" d=\"M 261.0 71.1 L 263.5 71.1 L 263.5 78.9 L 273.0 78.9 L 273.0 71.1 L 275.5 71.1 L 275.5 89.5 L 273.0 89.5 L 273.0 81.0 L 263.5 81.0 L 263.5 89.5 L 261.0 89.5 L 261.0 71.1 \" fill=\"#000000\"/>\n",
"<path class=\"atom-25\" d=\"M 276.2 173.2 L 278.7 173.2 L 278.7 181.1 L 288.2 181.1 L 288.2 173.2 L 290.7 173.2 L 290.7 191.7 L 288.2 191.7 L 288.2 183.2 L 278.7 183.2 L 278.7 191.7 L 276.2 191.7 L 276.2 173.2 \" fill=\"#000000\"/>\n",
"<path class=\"atom-26\" d=\"M 352.8 159.2 L 355.3 159.2 L 355.3 167.1 L 364.8 167.1 L 364.8 159.2 L 367.3 159.2 L 367.3 177.7 L 364.8 177.7 L 364.8 169.2 L 355.3 169.2 L 355.3 177.7 L 352.8 177.7 L 352.8 159.2 \" fill=\"#000000\"/>\n",
"<path class=\"atom-27\" d=\"M 352.4 200.7 L 354.9 200.7 L 354.9 208.5 L 364.4 208.5 L 364.4 200.7 L 366.9 200.7 L 366.9 219.1 L 364.4 219.1 L 364.4 210.6 L 354.9 210.6 L 354.9 219.1 L 352.4 219.1 L 352.4 200.7 \" fill=\"#000000\"/>\n",
"<path class=\"atom-28\" d=\"M 310.7 249.9 L 313.2 249.9 L 313.2 257.7 L 322.7 257.7 L 322.7 249.9 L 325.2 249.9 L 325.2 268.3 L 322.7 268.3 L 322.7 259.8 L 313.2 259.8 L 313.2 268.3 L 310.7 268.3 L 310.7 249.9 \" fill=\"#000000\"/>\n",
"<path class=\"atom-29\" d=\"M 387.4 182.9 L 389.9 182.9 L 389.9 190.7 L 399.4 190.7 L 399.4 182.9 L 401.9 182.9 L 401.9 201.4 L 399.4 201.4 L 399.4 192.8 L 389.9 192.8 L 389.9 201.4 L 387.4 201.4 L 387.4 182.9 \" fill=\"#000000\"/>\n",
"<path class=\"atom-30\" d=\"M 409.7 80.9 L 412.2 80.9 L 412.2 88.8 L 421.7 88.8 L 421.7 80.9 L 424.2 80.9 L 424.2 99.4 L 421.7 99.4 L 421.7 90.9 L 412.2 90.9 L 412.2 99.4 L 409.7 99.4 L 409.7 80.9 \" fill=\"#000000\"/>\n",
"<path class=\"atom-31\" d=\"M 453.2 85.4 L 455.7 85.4 L 455.7 93.3 L 465.1 93.3 L 465.1 85.4 L 467.6 85.4 L 467.6 103.9 L 465.1 103.9 L 465.1 95.4 L 455.7 95.4 L 455.7 103.9 L 453.2 103.9 L 453.2 85.4 \" fill=\"#000000\"/>\n",
"<path class=\"atom-32\" d=\"M 462.8 146.0 L 465.3 146.0 L 465.3 153.8 L 474.8 153.8 L 474.8 146.0 L 477.3 146.0 L 477.3 164.5 L 474.8 164.5 L 474.8 155.9 L 465.3 155.9 L 465.3 164.5 L 462.8 164.5 L 462.8 146.0 \" fill=\"#000000\"/>\n",
"</svg>"
],
"text/plain": [
"<IPython.core.display.SVG object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"ala_ser = Molecule.from_smiles('CC(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(=O)NC')\n",
"ala_ser"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"id": "kZA8qZV7gwo9"
},
"outputs": [],
"source": [
"ala_ser.generate_conformers()"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "ED0efjzmR-zW",
"outputId": "369fbf10-af60-48d5-e4ee-b249a030cc3e"
},
"outputs": [
{
"data": {
"text/plain": [
"{}"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# The atoms of this molecule initially have no metadata\n",
"ala_ser.atoms[0].metadata"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "dLbXUk3QTIWe",
"outputId": "b13a2663-9af1-48fc-d2fa-803b318e6905"
},
"outputs": [
{
"data": {
"text/plain": [
"{'residue_name': 'ACE', 'residue_number': 1, 'atom_name': 'CH3'}"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# But we can add standard protein residue metadata using the perceive_residues method\n",
"ala_ser.perceive_residues()\n",
"ala_ser.atoms[0].metadata"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "R48ZsItsTPAH",
"outputId": "73b5f154-6c4b-460f-f934-9e0715cc95f0"
},
"outputs": [
{
"data": {
"text/plain": [
"[HierarchyElement ('None', 1, 'ACE') of iterator 'residues' containing 6 atom(s),\n",
" HierarchyElement ('None', 2, 'ALA') of iterator 'residues' containing 10 atom(s),\n",
" HierarchyElement ('None', 3, 'SER') of iterator 'residues' containing 11 atom(s),\n",
" HierarchyElement ('None', 4, 'NME') of iterator 'residues' containing 6 atom(s)]"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ala_ser.residues"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "6yS4BapCTkee"
},
"source": [
"# Iterators like `topology.topology_molecules` and `topology.topology_atoms` are being deprecated\n",
"\n",
"The difference between `Topology.reference_molecules` and `Topology.topology_molecules` managed to confuse everyone, and would have added a lot of complexity by keeping them around in this refactor. While the code still takes advantage of knowing which molecules are identical to each other, it now does so outside the public API. \n",
"\n",
"Since we didn't give a lot of warning for this, the old code paths will still exist for a few months, but you'll get these annoying deprecation warnings. To reach the new recommended API points, just take out the `topology_` part of the method name. So for example, `topology.topology_atoms` should now be accessed as `topology.atoms`. The objects that are returned are no longer weird views of atoms (`TopologyAtoms`), instead they're just the same atoms as you'd get from calling `Molecule.atoms`. "
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "pVsPUNtLTSiQ",
"outputId": "8e80f371-ca67-4d08-8b56-adfa1ae651e7"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/jeffreywagner/conda/envs/offtk-0-11-0-rc/lib/python3.9/site-packages/openff/toolkit/topology/topology.py:51: TopologyDeprecationWarning: Topology.topology_atoms is deprecated. Use Topology.atoms instead.\n",
" warnings.warn(\n"
]
},
{
"data": {
"text/plain": [
"2667"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"top = Topology.from_molecules([protein, ala_ser])\n",
"# This will emit a deprecation warning\n",
"len([*top.topology_atoms])"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "oLGpLhr2VgJT",
"outputId": "cc26da2e-bb7c-4f27-978e-cf0b1b129a93"
},
"outputs": [
{
"data": {
"text/plain": [
"2667"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# This will give the same result with no deprecation warning\n",
"len([*top.atoms])"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "7ldyVVBrURoE",
"outputId": "010efca5-a648-4de5-949a-326582415a45"
},
"outputs": [
{
"data": {
"text/plain": [
"[2634, 33]"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# The molecules returned by Topology.molecules are now just full copies of the input molecules.\n",
"[mol.n_atoms for mol in top.molecules]"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "ioiDXIPocXLu",
"outputId": "5a8bbca3-fd7e-4680-804f-6607f4bdc7b6"
},
"outputs": [
{
"data": {
"text/plain": [
"[164, 4]"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# You can still access hierarchy iterators on the underlying molecules\n",
"[len(mol.residues) for mol in top.molecules]"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "UclsjlOhWFje",
"outputId": "b6b4107f-6ede-4ecf-ee58-bdc919437972"
},
"outputs": [
{
"data": {
"text/plain": [
"168"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Hierarchies also expose the underlying hierarchy iterators through this crummy API.\n",
"# This simply returns the result of calling the desired hierarchy iterator on each \n",
"# molecule in the order that they appear in the hierarchy. \n",
"len([*top.hierarchy_iterator('residues')])\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "759GW-yxWJnm"
},
"source": [
"# Now we can make Interchange objects\n",
"\n",
"We finally have our first alternative to `ForceField.create_openmm_system` - It's `ForceField.create_interchange`. Interchange is very cool, we're building it to replace ParmEd in workflows and do other things good too. See [the docs](https://docs.openforcefield.org/projects/interchange/en/latest/using/intro.html) for details.\n",
"\n",
"We also have a protein force field that can be used for testing - It's a port of AMBER's ff14SB in SMIRNOFF format. Due to the difference between atomtype-based parameterization and the chemical perception-based parameterization, this force field is gigantic. So it takes a while (~60 seconds) to apply, and while we've tested it a reasonable amount there may still be energy-affecting bugs to shake out."
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "TWzNie0_Vv3_",
"outputId": "7afd9e8a-aaf3-41e5-9b5e-8cbf454cbd79"
},
"outputs": [],
"source": [
"from openff.toolkit import ForceField\n",
"ff = ForceField('ff14sb_off_impropers_0.0.3.offxml')"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"id": "dXxgTjNHXuU2"
},
"outputs": [],
"source": [
"# Make the topology periodic\n",
"from openff.units import unit\n",
"top.box_vectors = [10., 10., 10.] * unit.nanometer"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"id": "4HXMuaJ8X_r3"
},
"outputs": [],
"source": [
"interchange = ff.create_interchange(top)"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "S98SVBCsYYF7",
"outputId": "0d7f46e4-5bae-449c-c788-4610f09f9766"
},
"outputs": [
{
"data": {
"text/plain": [
"Interchange with 6 potential handlers, periodic topology with 2667 atoms."
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"interchange"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "0XpTTQJehuxz",
"outputId": "7e1a1d9a-a5a1-4a8d-a10e-59d726cd6815"
},
"outputs": [
{
"data": {
"text/plain": [
"Potential(parameters={'k': <Quantity(868.0, 'kilocalorie / angstrom ** 2 / mole')>, 'length': <Quantity(1.01, 'angstrom')>}, map_key=None)"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Interchange contains the underlying parameters\n",
"[*interchange['Bonds'].potentials.values()][0]"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Qw22uF-Wf4Gr",
"outputId": "cea011fd-c75e-4725-f4e4-682a8bd7d3e6"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[3.3429999999999995 1.128 4.449000000000001] [3.2859999999999996 1.187 4.509] [3.2969999999999997 1.038 4.443999999999999] ... [-0.460730402913539 -0.22444411880710183 0.01874313443855259] [-0.5825801296383424 -0.1305705161684031 0.10692123816470528] [-0.5482891464956796 -0.0988002309477765 -0.07205858865949857]] nanometer\n"
]
}
],
"source": [
"# Because both input molecules had positions (T4 was loaded from PDB and ala_ser \n",
"# had conformer generation run), the resulting Interchange object has positions.\n",
"print(interchange.positions)"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "9XGZw35oYyqK",
"outputId": "d12d3b0a-8deb-4e3e-96ca-b405fab70307"
},
"outputs": [
{
"data": {
"text/plain": [
"(<openmm.openmm.System; proxy of <Swig Object of type 'OpenMM::System *' at 0x1731a1570> >,\n",
" <Topology; 2 chains, 172 residues, 2667 atoms, 2686 bonds>)"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# The Interchange can export to an OpenMM system, which is pretty much the \n",
"# same as you'd get by using ForceField.create_openmm_system. It also contains\n",
"# an OpenFF Topology object (a copy of the one that made it) that can be turned\n",
"# into an OpenMM Topology\n",
"(interchange.to_openmm(), interchange.topology.to_openmm())"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"id": "3pA-8EtoKJRn"
},
"outputs": [],
"source": [
"# As a little taste of what's to come, it can also export to prmtop and inpcrd.\n",
"# This is better validated for small molecules than for proteins, there's more \n",
"# development to come here\n",
"interchange.to_prmtop('proteins.prmtop')\n",
"interchange.to_inpcrd('proteins.inpcrd')"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "hqQxl12-CS5u",
"outputId": "d213a437-3cc0-4e3d-95c7-d9bac0d3009a"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"%VERSION VERSION_STAMP = V0001.000 DATE = 07/07/22 13:23:29\r\n",
"%FLAG TITLE\r\n",
"%FORMAT(20a4)\r\n",
"\r\n",
"%FLAG POINTERS\r\n",
"%FORMAT(10I8)\r\n",
" 2667 293 1345 1341 3039 1805 6743 6142 0 0\r\n",
" 14634 168 1341 1805 6142 276 448 924 1 0\r\n",
" 0 0 0 0 0 0 0 1 2667 0\r\n",
" 0 0\r\n"
]
}
],
"source": [
"!head proteins.prmtop"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Cos00kdOC6NJ",
"outputId": "13645f28-5ac9-40b2-9984-8e6c33192f04"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r\n",
" 2667 0.0000000e+00\r\n",
" 33.4300000 11.2800000 44.4900000 32.8600000 11.8700000 45.0900000\r\n",
" 32.9700000 10.3800000 44.4400000 34.2700000 11.2800000 45.0500000\r\n",
" 33.5500000 11.9700000 43.2000000 34.0900000 12.8900000 43.4100000\r\n",
" 34.5500000 11.3900000 42.2700000 35.4700000 11.3300000 42.8500000\r\n",
" 34.3200000 10.3700000 41.9600000 34.7300000 12.3200000 41.0200000\r\n",
" 35.4300000 11.8400000 40.3400000 33.7200000 12.3500000 40.6200000\r\n",
" 35.2700000 14.0500000 41.3000000 37.0500000 13.8000000 41.7600000\r\n",
" 37.1700000 13.6500000 42.8300000 37.5000000 12.9900000 41.1900000\r\n"
]
}
],
"source": [
"!head proteins.inpcrd"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "XkffoOC5C8jL"
},
"outputs": [],
"source": []
}
],
"metadata": {
"colab": {
"collapsed_sections": [],
"include_colab_link": true,
"name": "biopolymer_alpha_notebook.ipynb",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
},
"widgets": {
"application/vnd.jupyter.widget-state+json": {
"f76efb6ffa1240678342be81a9bd2462": {
"model_module": "nglview-js-widgets",
"model_module_version": "3.0.1",
"model_name": "ColormakerRegistryModel",
"state": {
"_dom_classes": [],
"_model_module": "nglview-js-widgets",
"_model_module_version": "3.0.1",
"_model_name": "ColormakerRegistryModel",
"_msg_ar": [],
"_msg_q": [],
"_ready": false,
"_view_count": null,
"_view_module": "nglview-js-widgets",
"_view_module_version": "3.0.1",
"_view_name": "ColormakerRegistryView",
"layout": "IPY_MODEL_4cf7ce207b0d418a88f1546d8fb24816"
}
}
}
}
},
"nbformat": 4,
"nbformat_minor": 1
}
name: test
channels:
- jaimergp/label/unsupported-cudatoolkit-shim
- conda-forge
dependencies:
# Base depends
- python
- pip
# Testing
- pytest
- pytest-cov
- pytest-xdist
- pytest-rerunfailures
- nbval
- codecov
- coverage
- numpy
- networkx
- ambertools
- rdkit
- packaging
# Removed until a compatible release is made
# - openmmforcefields
- openmm >=7.6
- openff-toolkit
- nglview
- notebook
- openff-amber-ff-ports
- openff-forcefields
- openff-units >=0.1.6
- openff-utilities
- smirnoff99Frosst
- cachetools
- cached-property
- pyyaml
- toml
- bson
- msgpack-python
- xmltodict
- python-constraint
- qcelemental
- qcportal >=0.15
- qcengine
- mdtraj
- distributed