@apahl
Created September 3, 2019 08:40
{
"cells": [
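{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Non-round-trippable Molecule\n",
"\n",
"The problem with round-tripping (I like the word) this molecule seems to be that the input and the output SMILES use different aromaticity models: the input marks only the benzene rings as aromatic, whereas the SMILES written by RDKit also writes the stilbene carbons and the ether oxygens of the macrocycle as aromatic atoms (lowercase `c=c` and `-o-`)."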
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"ExecuteTime": {
"end_time": "2019-09-03T08:39:41.555070Z",
"start_time": "2019-09-03T08:39:41.287722Z"
}
},
"outputs": [],
"source": [
"from rdkit.Chem import AllChem as Chem\n",
"from rdkit.Chem import Draw\n",
"from rdkit.Chem import Descriptors as Desc\n",
"from rdkit.Chem.Draw import IPythonConsole"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"RDKit can parse the original SMILES into a valid molecule."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"ExecuteTime": {
"end_time": "2019-09-03T08:39:41.604157Z",
"start_time": "2019-09-03T08:39:41.564019Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"680.8420000000002\n"
]
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<rdkit.Chem.rdchem.Mol at 0x7fb51009fe40>"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mol = Chem.MolFromSmiles(r\"c12c(\\C=C/c(ccc3OC)cc3Oc4ccc(cc4)\\C=C/c(cc5O1)c(CCN(C)C)cc5OC)c(CCN(C)C)c(OC)c(OC)c2OC\")\n",
"print(Desc.MolWt(mol))\n",
"mol"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And write it back out as a SMILES string."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"ExecuteTime": {
"end_time": "2019-09-03T08:39:41.619655Z",
"start_time": "2019-09-03T08:39:41.610242Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"'COc1ccc2cc1-o-c1ccc(cc1)/c=c\\\\c1cc(c(OC)cc1CCN(C)C)-o-c1c(c(CCN(C)C)c(OC)c(OC)c1OC)/c=c\\\\2'"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"smi = Chem.MolToSmiles(mol)\n",
"smi"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"But the SMILES generated by RDKit cannot be parsed back into a valid molecule, because kekulization fails."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"ExecuteTime": {
"end_time": "2019-09-03T08:39:41.636858Z",
"start_time": "2019-09-03T08:39:41.628453Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"RDKit ERROR: [10:39:41] Can't kekulize mol. Unkekulized atoms: 2 3 4 5 6 7 9 10 11 12 13 14 15 16 17 18 19 20 23 24 31 32 33 39 42 45 48 49\n",
"RDKit ERROR: \n"
]
}
],
"source": [
"tmp = Chem.MolFromSmiles(smi)\n",
"tmp"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"hide_input": false,
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": true,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 2
}
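
If the failure in the notebook above really comes from aromaticity perception, one possible workaround is to keep aromaticity out of the written SMILES altogether. The following is a minimal sketch, not the author's method: it sanitizes the input with every step except `SetAromaticity`, so the molecule stays in Kekulé form, and then writes and re-parses that Kekulé SMILES. The names `SMILES_IN` and `to_kekule_smiles` are hypothetical helpers introduced only for this illustration.

```python
from rdkit import Chem

# Original input SMILES from the notebook (raw string because of the backslashes).
SMILES_IN = r"c12c(\C=C/c(ccc3OC)cc3Oc4ccc(cc4)\C=C/c(cc5O1)c(CCN(C)C)cc5OC)c(CCN(C)C)c(OC)c(OC)c2OC"


def to_kekule_smiles(smiles):
    """Return a Kekule SMILES without running RDKit's aromaticity perception.

    Hypothetical helper for illustration; assumes the input itself can be kekulized.
    """
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    if mol is None:
        raise ValueError("could not parse SMILES: " + smiles)
    # Run all sanitization steps except SetAromaticity. The Kekulize step
    # converts the aromatic input bonds to explicit single/double bonds and
    # clears the aromatic flags, and nothing sets them again afterwards.
    ops = Chem.SanitizeFlags.SANITIZE_ALL ^ Chem.SanitizeFlags.SANITIZE_SETAROMATICITY
    Chem.SanitizeMol(mol, sanitizeOps=ops)
    # Stereochemistry is not assigned when parsing with sanitize=False,
    # so assign it explicitly before writing the SMILES.
    Chem.AssignStereochemistry(mol, cleanIt=True, force=True)
    return Chem.MolToSmiles(mol)


kekule_smi = to_kekule_smiles(SMILES_IN)
roundtrip = Chem.MolFromSmiles(kekule_smi)  # normal parsing, default sanitization
print(kekule_smi)
print("round trip parsed:", roundtrip is not None)
```

The Kekulé SMILES is not identical to RDKit's default canonical (aromatic) SMILES, so it should only be compared with strings produced the same way.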