Linear Regression - lecture series II-IV

Model representation

Notation (used throughout course)

  m = number of training samples
  x = “input” variable / feature
  y = “output” variable / “target” variable
  (x,y) = one training example
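As a rough illustration of this notation, the sketch below stores a training set as a list of (x, y) pairs. The numbers and the name `training_set` are made up for the example and do not come from the lecture notes.

```python
# Hypothetical toy data: each pair is one training example (x, y).
# The values are invented purely to illustrate the notation.
training_set = [(2104, 460), (1416, 232), (1534, 315), (852, 178)]

# m = number of training samples
m = len(training_set)

# (x, y) = one training example: x is the input feature, y is the target
x, y = training_set[0]

print("m =", m)
print("first example: x =", x, ", y =", y)
```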

@danielparton
danielparton / analyze_ssbonds.py
Created May 9, 2015 01:21
Analysis of disulfide bonds in protein kinases
# for each template:
# from PDB entry:
# print SS bonds within template span
import os
import gzip
import ensembler
from ensembler.uniprot import get_uniprot_xml
from ensembler.initproject import extract_template_pdbchains_from_uniprot_xml, parse_sifts_xml
from ensembler.utils import set_loglevel
@danielparton
danielparton / param_parsers.py
Created April 1, 2015 23:26
Parser for sending parameters from a CLI to an API, and also for evaluating simtk unit quantities
import ast
import re
import operator as op
import simtk.unit
unit_membernames = [name for name in simtk.unit.__dict__]
quantity_as_number_space_unit_regex = re.compile(
'([0-9.]+) ?({0})'.format('|'.join(unit_membernames))
) # e.g. "2 picoseconds"
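To show how a pattern of this shape might be used, here is a minimal sketch that does not depend on simtk. The list `unit_names` is a hypothetical stand-in for the names taken from `simtk.unit`, and `split_quantity` is an illustrative helper, not part of the gist.

```python
import re

# Hypothetical stand-in for the member names of simtk.unit.
unit_names = ["picoseconds", "nanometers", "kelvin"]

# Same shape as the pattern above: a number, an optional space, a unit name.
quantity_regex = re.compile(
    "([0-9.]+) ?({0})".format("|".join(re.escape(name) for name in unit_names))
)

def split_quantity(text):
    """Return (value, unit_name) for strings like '2 picoseconds', else None."""
    match = quantity_regex.fullmatch(text.strip())
    if match is None:
        return None
    try:
        value = float(match.group(1))
    except ValueError:
        # The character class [0-9.]+ also accepts strings such as "1.2.3".
        return None
    return value, match.group(2)

print(split_quantity("2 picoseconds"))
```

Using `fullmatch` rather than `match` means a name that is a prefix of another name (for example a singular and plural form) cannot capture only part of the unit.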
@danielparton
danielparton / gist:a7b83c85bc7e06dc5189
Created October 7, 2014 15:02
Retrieve UniProt function (removes namespace stuff)
def retrieve_uniprot(search_string, maxreadlength=100000000):
    '''
    Searches the UniProt database given a search string, and retrieves an XML
    file, which is returned as a string.
    maxreadlength is the maximum size in bytes which will be read from the website
    (default 100MB)
    Example search string: 'domain:"Protein kinase" AND reviewed:yes'
    The function also removes the xmlns attribute from <uniprot> tag, as this
    makes xpath searching annoying
@danielparton
danielparton / build-mpirun-configfile.py
Created September 30, 2014 20:57
Script for building a configfile for MPICH2 mpirun from Torque/Moab $PBS_GPUFILE contents
#!/usr/bin/env python
"""
Construct a configfile for MPICH2 mpirun from Torque/Moab $PBS_GPUFILE contents.
Usage:
python build-mpirun-configfile.py executable [args...]
mpirun -configfile configfile
@danielparton
danielparton / openmm-gpu-mpi-test.py
Last active August 29, 2015 14:06
Test script for tracking down problems on cbio cluster when using multiple GPUs in parallel
import socket
import mpi4py.MPI
import gzip
import simtk.openmm as openmm
import simtk.openmm.app as app
import simtk.unit as unit
comm = mpi4py.MPI.COMM_WORLD
rank = comm.rank
size = comm.size
tokenized_string = "2 + 3 * 4".split()
# parsing arithmetic expressions with + and *
# A parser returns (parsed_thing, new_index) on success.
def fail(): raise Exception("failed to parse")
def is_num(x):
    try: int(x)
    except ValueError: return False
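One way the convention "a parser returns (parsed_thing, new_index)" could play out is sketched below. This is an assumption-laden illustration, not the gist's own code: the names `parse_num`, `parse_product`, and `parse_sum` are hypothetical, `is_num` is completed with an assumed `return True`, and the parsers evaluate the expression directly instead of building a tree.

```python
def fail():
    raise Exception("failed to parse")

def is_num(x):
    # Assumed completion: report True when int() accepts the token.
    try:
        int(x)
    except ValueError:
        return False
    return True

def parse_num(tokens, i):
    # Consume one integer token and return (value, new_index).
    if i < len(tokens) and is_num(tokens[i]):
        return int(tokens[i]), i + 1
    fail()

def parse_product(tokens, i):
    # product := num ('*' num)*
    value, i = parse_num(tokens, i)
    while i < len(tokens) and tokens[i] == "*":
        rhs, i = parse_num(tokens, i + 1)
        value *= rhs
    return value, i

def parse_sum(tokens, i):
    # sum := product ('+' product)*, so '*' binds tighter than '+'
    value, i = parse_product(tokens, i)
    while i < len(tokens) and tokens[i] == "+":
        rhs, i = parse_product(tokens, i + 1)
        value += rhs
    return value, i

result, end = parse_sum("2 + 3 * 4".split(), 0)
print(result, end)  # prints "14 5"
```

Each parser hands back the index where it stopped, so the caller can continue from there or check that the whole token list was consumed.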