Andrew White whitead

@whitead
whitead / randomize_smi.py
Created March 21, 2024 21:26
Randomize SMILES
from rdkit.Chem import MolFromSmiles, MolToSmiles
smi = "..."
MolToSmiles(MolFromSmiles(smi), canonical=False, doRandom=True, isomericSmiles=True, kekuleSmiles=True)
@whitead
whitead / bart.py
Last active February 12, 2024 16:23
Bart Vestaboard
import click
import time
import requests
import xml.etree.ElementTree as ET
from vesta import vesta_layout, send_to_vesta
def get_departures(station_name):
    api_key = "MW9S-E7SL-26DU-VV8V"
    base_url = "https://api.bart.gov/api/etd.aspx"
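The preview stops before the request is sent. Since the gist imports `xml.etree.ElementTree`, a rough sketch of how an ETD-style XML reply could be turned into per-destination minutes might look like the following. The sample XML is hand-written for illustration (not real API output), and the `parse_departures` helper and the element names `etd`, `destination`, `estimate`, and `minutes` are assumptions about the response layout rather than anything shown in the gist.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, NOT real API output; the element names are assumed.
SAMPLE_XML = """
<root>
  <station>
    <name>Example Station</name>
    <etd>
      <destination>Destination A</destination>
      <estimate><minutes>4</minutes></estimate>
      <estimate><minutes>19</minutes></estimate>
    </etd>
    <etd>
      <destination>Destination B</destination>
      <estimate><minutes>Leaving</minutes></estimate>
    </etd>
  </station>
</root>
"""


def parse_departures(xml_text):
    """Return {destination: [minutes, ...]} from an ETD-style XML string."""
    root = ET.fromstring(xml_text)
    departures = {}
    for etd in root.iter("etd"):
        destination = etd.findtext("destination")
        # Keep the raw strings: a value may be non-numeric (e.g. "Leaving").
        minutes = [est.findtext("minutes") for est in etd.findall("estimate")]
        departures[destination] = minutes
    return departures


departures = parse_departures(SAMPLE_XML)
print(departures)
```

Keeping the minutes as strings is a deliberate choice in this sketch, so that a non-numeric value does not raise during parsing.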
@whitead
whitead / review.txt
Created February 15, 2023 20:50
RoboReview
The paper by Caldas (2023) explored an approach that avoids the cost and maintenance of a web server by hosting a static file on sites like GitHub. The application was built with a JavaScript implementation of the TensorFlow framework to predict the solubility of small molecules. The model uses a deep ensemble approach to report uncertainty alongside each prediction. The model was evaluated using RMSE, MAE, and correlation coefficient and outperformed the baseline models (Caldas2023 pages 6-7). The paper also provides a review of methods for calculating solution free energies and modelling systems in solution (Caldas2023 pages 11-12). The authors' model, kde10LSTM Aug, achieved an RMSE of 0.983 and a %±0.5log of 40.0% on the solubility challenge 1 dataset, outperforming 62% of the published RMSE values and 50% of the published %±0.5log values (Caldas2023 pages 9-10). This paper is significant because it provides an efficient and cost-effective approach to predicting the solubility of small molecules with improved accuracy.
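For readers unfamiliar with the metrics named in the review, here is a minimal sketch of how RMSE, MAE, and %±0.5log could be computed. It reads %±0.5log as the percentage of predictions within 0.5 log units of the measured value, which is an assumption about the paper's definition. The function names and the numbers are made up for illustration and are not taken from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pct_within(y_true, y_pred, tol=0.5):
    """Percentage of predictions whose absolute error is at most `tol`."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tol)
    return 100.0 * hits / len(y_true)


# Made-up log-solubility values, for illustration only.
measured = [-2.0, -3.5, -1.0, -4.0]
predicted = [-2.5, -3.0, -2.0, -4.0]

print(rmse(measured, predicted))
print(mae(measured, predicted))
print(pct_within(measured, predicted))
```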
@whitead
whitead / name.py
Created November 1, 2022 21:22
Automated naming of compounds
import rdkit
from rdkit import Chem
from rdkit.Chem import AllChem
import exmol
import skunk
import math
import matplotlib.pyplot as plt
import textwrap
import matplotlib.font_manager as font_manager
@whitead
whitead / fetch_pdb.py
Last active May 2, 2022 08:38
Here's a Python function that takes a search string (like "human albumin") and returns a PDB file using @rcsbPDB's top result.
import requests
import tempfile
def get_pdb(query_string):
    url = "https://search.rcsb.org/rcsbsearch/v1/query"
    query = {
        "query": {
            "type": "terminal",
            "service": "full_text",
            "parameters": {"value": query_string},
        },
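The preview cuts off inside the query dictionary, so the rest of the function is not shown. As a hedged, offline sketch of the overall idea (build a full-text query, take the top hit, form a download URL), something like the following could work. The `return_type` field, the `result_set`/`identifier` response shape, the `files.rcsb.org` download pattern, and both helper names are assumptions, not the gist's actual code. The sample response is hand-written.

```python
def build_query(query_string):
    """Build a full-text search payload like the one started above."""
    return {
        "query": {
            "type": "terminal",
            "service": "full_text",
            "parameters": {"value": query_string},
        },
        # Assumption: request entry-level results (PDB IDs).
        "return_type": "entry",
    }


def top_result_url(response_json):
    """Return a download URL for the first hit, or None if there are no hits."""
    results = response_json.get("result_set") or []
    if not results:
        return None
    pdb_id = results[0]["identifier"]
    # Assumed download URL pattern for PDB-format files.
    return f"https://files.rcsb.org/download/{pdb_id}.pdb"


# Hand-written stand-in for a search response; "1ABC" is a placeholder ID.
fake_response = {"result_set": [{"identifier": "1ABC", "score": 1.0}]}

payload = build_query("human albumin")
url = top_result_url(fake_response)
print(payload["query"]["parameters"]["value"], url)
```

Splitting the payload construction from the response handling keeps both pieces testable without network access.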
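import pandas as pd

# tranches.txt has no header row, so header=None keeps the first URL as data
tranches = pd.read_csv('https://gist.githubusercontent.com/whitead/f47887e45bbd2f38332182d2d422da6b/raw/a3948beac9b9034dab432b697c5ec238503ac5d0/tranches.txt', header=None)

def get_mol_batch(batch_size=32):
    # Read each tranche file in turn and yield full batches from its first column
    for t in tranches.values:
        d = pd.read_csv(t[0], sep=' ')
        for i in range(len(d) // batch_size):
            yield d.iloc[i * batch_size:(i + 1) * batch_size, 0].values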
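One detail of the loop above is easy to miss: because it iterates over `len(d) // batch_size`, only full batches are yielded and any leftover rows at the end of a file are skipped. A small pure-Python sketch of the same slicing pattern shows this. The `iter_batches` name and the placeholder items are hypothetical.

```python
def iter_batches(items, batch_size=32):
    """Yield consecutive full batches; a trailing partial batch is skipped."""
    for i in range(len(items) // batch_size):
        yield items[i * batch_size:(i + 1) * batch_size]


# Ten placeholder items with a batch size of four leave two items over.
smiles = [f"mol{i}" for i in range(10)]
batches = list(iter_batches(smiles, batch_size=4))
print(len(batches), [len(b) for b in batches])
```

Dropping the remainder keeps every batch the same shape, which is often convenient for training loops. If every row matters, the range would need to be extended to cover the final partial slice.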
@whitead
whitead / tranches.txt
Created December 5, 2021 05:37
zinc20 tranches
http://files.docking.org/2D/AA/AAAA.smi
http://files.docking.org/2D/AA/AAAB.smi
http://files.docking.org/2D/AA/AAAC.smi
http://files.docking.org/2D/AA/AAAD.smi
http://files.docking.org/2D/AA/AABA.smi
http://files.docking.org/2D/AA/AABB.smi
http://files.docking.org/2D/AA/AABD.smi
http://files.docking.org/2D/AA/AACA.smi
http://files.docking.org/2D/AA/AACB.smi
http://files.docking.org/2D/AA/AACD.smi
@whitead
whitead / animate.py
Created August 27, 2021 14:48
Some code for animating
from matplotlib.collections import LineCollection
fps = 60.
stride = 1
duration = (T - 5) / fps / stride
print(duration, fps)
all_segments = [make_segments(paths, i) for i in range(N)]
fig = plt.figure(figsize=(1080 // 180, 1080 // 180), dpi=180)
ALA ARG ASN ASP CYS GLN GLU HIS ILE LEU LYS MET PHE PRO SER THR TRP TYR VAL
ALA -0.34 0.08 0.04 0.15 -0.13 0.01 0.03 0.12 -0.2 -0.2 0.02 -0.18 -0.16 -0.01 0.03 -0.02 0.03 -0.02 -0.2
ARG 0.08 0.08 -0.08 -0.74 0.26 -0.2 -0.78 0.08 0.23 0.14 0.3 0.17 0.15 -0.1 -0.08 -0.03 -0.11 0.01 0.26
ASN 0.04 -0.08 -0.54 -0.47 0.12 -0.32 -0.26 -0.08 0.42 0.34 -0.25 0.23 0.14 -0.1 -0.32 -0.24 -0.01 0 0.25
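The table above is whitespace-delimited: a header row of column labels, followed by rows that start with a row label. A minimal standard-library sketch for loading such a table into nested dictionaries is shown below, using only the three rows visible here. The `parse_matrix` name is hypothetical, and nothing is assumed about what the values represent.

```python
TABLE = """\
ALA ARG ASN ASP CYS GLN GLU HIS ILE LEU LYS MET PHE PRO SER THR TRP TYR VAL
ALA -0.34 0.08 0.04 0.15 -0.13 0.01 0.03 0.12 -0.2 -0.2 0.02 -0.18 -0.16 -0.01 0.03 -0.02 0.03 -0.02 -0.2
ARG 0.08 0.08 -0.08 -0.74 0.26 -0.2 -0.78 0.08 0.23 0.14 0.3 0.17 0.15 -0.1 -0.08 -0.03 -0.11 0.01 0.26
ASN 0.04 -0.08 -0.54 -0.47 0.12 -0.32 -0.26 -0.08 0.42 0.34 -0.25 0.23 0.14 -0.1 -0.32 -0.24 -0.01 0 0.25
"""


def parse_matrix(text):
    """Parse a labeled, whitespace-delimited table into {row: {column: value}}."""
    lines = [line.split() for line in text.strip().splitlines()]
    columns = lines[0]
    matrix = {}
    for fields in lines[1:]:
        label, values = fields[0], fields[1:]
        if len(values) != len(columns):
            raise ValueError(f"row {label!r} has {len(values)} values, expected {len(columns)}")
        matrix[label] = {col: float(v) for col, v in zip(columns, values)}
    return matrix


matrix = parse_matrix(TABLE)
print(matrix["ARG"]["ASP"])
```

In the rows shown, mirrored entries agree (for example `matrix["ALA"]["ARG"]` and `matrix["ARG"]["ALA"]`), which is a cheap sanity check to run after loading.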
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_context('talk')
sns.set_style('darkgrid')
ps = [0.001, 0.005, 0.01, 0.02, 0.035]
cities = ['Rural Montana', 'Seattle', 'Minneapolis', 'Chicago', 'Miami']
cp = sns.cubehelix_palette(len(ps), start=0.5, rot=-0.75)