@jose-manuel
Created February 28, 2020 15:10
Draw Murcko Scaffolds Extraction as Reaction with Kekulized Molecules
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Representing Kekulized Molecules in a Reaction"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"import rdkit\n",
"from rdkit.Chem import AllChem as Chem\n",
"from rdkit.Chem import PandasTools\n",
"from rdkit.Chem import Draw\n",
"from rdkit.Chem import rdChemReactions\n",
"from rdkit.Chem.Draw import rdMolDraw2D\n",
"from rdkit.Chem.Draw import IPythonConsole\n",
"from rdkit.Chem.Scaffolds import MurckoScaffold\n",
"from IPython.display import SVG"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"RDKIT: 2019.03.4\n"
]
}
],
"source": [
"print('RDKIT:'.ljust(20) + f\"{rdkit.__version__}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I would like to represent the Murcko scaffold extraction as an image (preferably SVG). \n",
"To get started, I define an example molecule."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<rdkit.Chem.rdchem.Mol at 0x7f7e2b396d50>"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"smi1 = 'OC1=CC=CC=C1'\n",
"mol1 = Chem.MolFromSmiles(smi1)\n",
"mol1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I extract the Murcko scaffold from the molecule above. \n",
"I kekulize the scaffold and export it as a Kekulé SMILES because I want the rendered image to show kekulized molecules rather than the aromatic representation."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<rdkit.Chem.rdchem.Mol at 0x7f7e29c26c10>"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mol2 = MurckoScaffold.GetScaffoldForMol(mol1)\n",
"Chem.Kekulize(mol2)\n",
"smi2 = Chem.MolToSmiles(mol2, kekuleSmiles=True)\n",
"mol2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I build the reaction with `ReactionFromSmarts` (I could not find an equivalent function for SMILES). Its `useSmiles` argument looks promising."
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"rxn_str=OC1=CC=CC=C1>>C1=CC=CC=C1\n"
]
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<rdkit.Chem.rdChemReactions.ChemicalReaction at 0x7f7e29c269e0>"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"rxn_str = f\"{smi1}>>{smi2}\"\n",
"print(f\"rxn_str={rxn_str}\")\n",
"rxn = rdChemReactions.ReactionFromSmarts(rxn_str, useSmiles=True)\n",
"rxn"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The molecules are shown with an aromatic representation instead of a kekulized one. \n",
"Am I missing something?"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
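
The notebook imports `rdMolDraw2D` and `SVG` but only shows the default PNG rendering, so here is a minimal sketch of the SVG route the question asks about. The helper names `reaction_smiles` and `reaction_to_svg` are hypothetical (they are not part of RDKit), and the default canvas size is an arbitrary assumption. This only shows how to get an SVG string out of `MolDraw2DSVG.DrawReaction`; whether the rings come out kekulized or aromatic may depend on the RDKit version and has not been verified here.

```python
def reaction_smiles(reactants, products, agents=()):
    """Join molecule SMILES into 'reactants>agents>products' form."""
    for smi in (*reactants, *agents, *products):
        if ">" in smi:
            raise ValueError(f"not a plain molecule SMILES: {smi!r}")
    # Molecules on the same side are separated by '.', sides by '>'.
    return ">".join(".".join(part) for part in (reactants, agents, products))


def reaction_to_svg(rxn_str, width=600, height=200):
    """Render a reaction SMILES to an SVG string (hypothetical helper)."""
    # Imported lazily so reaction_smiles() can be used without RDKit.
    from rdkit.Chem import rdChemReactions
    from rdkit.Chem.Draw import rdMolDraw2D

    # Same call as in the notebook: parse the string as SMILES, not SMARTS.
    rxn = rdChemReactions.ReactionFromSmarts(rxn_str, useSmiles=True)
    drawer = rdMolDraw2D.MolDraw2DSVG(width, height)
    drawer.DrawReaction(rxn)
    drawer.FinishDrawing()
    return drawer.GetDrawingText()


# Same reactant and kekulized scaffold as in the notebook cells.
rxn_str = reaction_smiles(["OC1=CC=CC=C1"], ["C1=CC=CC=C1"])
print(rxn_str)  # OC1=CC=CC=C1>>C1=CC=CC=C1
```

In a notebook, `SVG(reaction_to_svg(rxn_str))` should display the result, using the `SVG` class already imported from `IPython.display`. If the rings still render as aromatic, that would suggest the aromaticity is being perceived during drawing rather than during parsing, but that is a guess to check against the RDKit version in use.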