Cathal Garvey cathalgarvey

@cathalgarvey
cathalgarvey / seqio_answers.py
Created March 31, 2014 19:02
Suggested Solutions to SeqIO Exercises
from Bio import SeqIO
from Bio.Seq import Seq
sequence_generator = SeqIO.parse("br_sequences.fasta", "fasta")
all_sequences = list(sequence_generator)
# * How many records are in the file?
print("Number of records:", len(all_sequences))
# * How many records have a sequence of length 249?
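One way to answer the length question is sketched below. It assumes only that each record exposes its sequence as `.seq`, as Biopython's `SeqRecord` does. The `count_records_of_length` helper and the `FakeRecord` stand-in type are hypothetical names, introduced so the snippet runs without a FASTA file or Biopython installed.

```python
from collections import namedtuple

# Hypothetical stand-in for Bio.SeqRecord.SeqRecord; only `.seq` is used here.
FakeRecord = namedtuple("FakeRecord", ["id", "seq"])


def count_records_of_length(records, length):
    """Count the records whose sequence has exactly `length` letters."""
    return sum(1 for record in records if len(record.seq) == length)


demo_records = [
    FakeRecord("a", "ACGT"),
    FakeRecord("b", "ACGTAC"),
    FakeRecord("c", "TTGA"),
]
print("Records of length 4:", count_records_of_length(demo_records, 4))
```

With the list built from the FASTA file, the equivalent call would be `count_records_of_length(all_sequences, 249)`.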
@cathalgarvey
cathalgarvey / bioinfo_funcs.py
Created March 31, 2014 19:11
Features missing from Python's string/list types that are handy for bio-informatics
"Functions missing from Python's string/list types that are handy for bio-informatics."
def codonise(seq):
    '''Returns a list of codons, not including trailing 1/2n.
    To get codons starting from letter X, pass seq[X:].'''
    mylist = []
    for i in range(0, len(seq), 3):
        this_codon = seq[i:i+3]
        # This bit ensures that only whole codons,
        # not trailing bits, are added:
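The preview stops before the length check. A minimal sketch of how the function might finish is given here; the `len(...) == 3` test and the `codons` variable name are assumptions rather than the gist's actual code.

```python
def codonise(seq):
    """Return the whole codons of seq, dropping a trailing 1 or 2 letters."""
    codons = []
    for i in range(0, len(seq), 3):
        this_codon = seq[i:i + 3]
        # Assumed check: keep only complete triplets.
        if len(this_codon) == 3:
            codons.append(this_codon)
    return codons


print(codonise("ATGGCCTAAGG"))
```

To read in a different frame, slice first, as the docstring suggests: `codonise(seq[1:])`.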
@cathalgarvey
cathalgarvey / seq_searcher
Created March 31, 2014 19:14
Simplified method of searching for forward/reverse-complement sequence in a multi-sequence file
import sys
from Bio import SeqIO
from Bio.Seq import Seq
filename = sys.argv[1]
usersequence = Seq(sys.argv[2])
usersequence = usersequence.upper()
user_reverse = usersequence.reverse_complement()
records = SeqIO.parse(filename, "fasta")
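The search step itself is not shown in the preview. The sketch below illustrates the forward/reverse-complement idea in plain Python, without Biopython. The helper names `reverse_complement` and `find_hits`, the `(name, sequence)` record shape, and the restriction to unambiguous A/C/G/T letters are all assumptions made for illustration.

```python
# Complement table for unambiguous DNA letters only (an assumption;
# Biopython's Seq.reverse_complement also handles IUPAC ambiguity codes).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an A/C/G/T string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def find_hits(records, query):
    """Yield (name, strand, position) for the first hit on each strand.

    `records` is an iterable of (name, sequence) pairs.
    """
    forward = query.upper()
    reverse = reverse_complement(forward)
    for name, sequence in records:
        sequence = sequence.upper()
        for strand, pattern in (("+", forward), ("-", reverse)):
            position = sequence.find(pattern)
            if position != -1:
                yield name, strand, position


demo = [("seq1", "TTTACGGATTT"), ("seq2", "GGGATCCGTCC"), ("seq3", "AAAAAAA")]
for hit in find_hits(demo, "ACGGAT"):
    print(hit)
```

A palindromic query equals its own reverse complement, so this sketch would report it on both strands.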
@cathalgarvey
cathalgarvey / dncode.py
Last active August 29, 2015 13:58
Python 8-line DNA compression tool. Usage: 'python3 dncode.py [e/d] <somefile>', output to stdout.
import sys as s, itertools as i
t,u,i2b,o=b'T',b'U',lambda i:i.to_bytes(1,'big'),[s.stdout.buffer.write,lambda s:0]
A,e=s.argv,dict(zip((''.join(x).encode()for x in i.product(*('ACGT',)*4)),map(i2b,range(256))))
with open(A[2],'rb')as I:D,Rr=b''.join(I.read().strip().split()),lambda s:s.replace(t,u)
S,R,M,o,d=D[:-1],D[-1]&4,D[-1]&2,o[::-1]if A[1]=="d"else o,dict(zip(e.values(),e)).get
[o[0](e.get(q.replace(u,t)+(b'A'*(4-len(q)))))for q in(D[i: i+4] for i in range(0,len(D),4))]
o[0](i2b((len( [D[i: i+4] for i in range(0,len(D),4)][-1] )%4)|(4 if u in S else 0)))
o[1](b''.join(Rr(d(i2b(x)))if R else d(x)for x in S)[:-M if M else None]+b'\n')
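The golfed code above maps each group of four bases to one byte. A readable sketch of that general technique follows. It is not the gist's exact file format: the trailing padding byte, the `pack`/`unpack` names, and the lack of RNA (U) support are simplifying assumptions.

```python
BASES = "ACGT"
CODE = {base: index for index, base in enumerate(BASES)}


def pack(dna):
    """Pack an A/C/G/T string at 2 bits per base.

    The final byte of the output records how many padding bases were added
    (an assumed layout, chosen for clarity).
    """
    padding = (-len(dna)) % 4
    padded = dna + "A" * padding
    out = bytearray()
    for i in range(0, len(padded), 4):
        byte = 0
        for base in padded[i:i + 4]:
            byte = (byte << 2) | CODE[base]
        out.append(byte)
    out.append(padding)
    return bytes(out)


def unpack(data):
    """Invert pack(): expand each byte to four bases, then drop the padding."""
    *body, padding = data
    letters = []
    for byte in body:
        for shift in (6, 4, 2, 0):
            letters.append(BASES[(byte >> shift) & 3])
    return "".join(letters[:len(letters) - padding])


packed = pack("ACGTAC")
print(packed, unpack(packed))
```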
@cathalgarvey
cathalgarvey / pruner.py
Created April 10, 2014 13:34
An object for recursively pruning empty containers and Nonetypes from containers or sequences.
class Prune:
    'Treat like a function; call Prune.prune on any datatype to prune NoneTypes and empty Tuples/Lists/Dicts.'
    _t = tuple() # Empty tuple, as "(,)" literal doesn't work.
    @classmethod
    def prune(self, some_data):
        if isinstance(some_data, (list, tuple)):
            return self.prune_sequence(some_data)
        elif isinstance(some_data, dict):
            return self.prune_tree(some_data)
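The `prune_sequence` and `prune_tree` methods are not shown in the preview. As a rough illustration of recursive pruning, here is a function-based sketch; it is an assumption about the behaviour described above, not the class's actual implementation, and `_is_empty` is a hypothetical helper.

```python
def _is_empty(value):
    """True for None and for empty lists, tuples and dicts (not for 0 or '')."""
    return value is None or (
        isinstance(value, (list, tuple, dict)) and len(value) == 0
    )


def prune(data):
    """Recursively drop None values and empty lists/tuples/dicts."""
    if isinstance(data, dict):
        pruned = {key: prune(value) for key, value in data.items()}
        return {key: value for key, value in pruned.items() if not _is_empty(value)}
    if isinstance(data, (list, tuple)):
        kept = [item for item in (prune(item) for item in data) if not _is_empty(item)]
        # Preserve the container kind: tuples stay tuples, lists stay lists.
        return tuple(kept) if isinstance(data, tuple) else kept
    return data


print(prune({"a": None, "b": [], "c": {"d": (), "e": [None, 1]}, "f": 0}))
```

Children are pruned before the emptiness test, so a container that holds only empty containers is removed as well.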
@cathalgarvey
cathalgarvey / strict_decorator.py
Created May 1, 2014 14:30
A decorator for when you want strictness in Python3.
def strict(func):
    '''A decorator for methods or functions that requires annotations for all
    arguments and the return value, throws typeerrors on deviations.
    Remember that for more than one return value, the return type is "tuple".
    Container-type arguments or return values are only inspected at top-level.
    Note that as written, this does not handle catchall argument types "*args", or "**kwargs".
    '''
    import inspect, collections
    NoneType = type(None)
    def die_on_untyped_annotation(par, ann_type="argument annotation"):
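The rest of the decorator is not shown. A compact sketch of the same idea using `inspect.signature` might look like the following. It assumes every annotation is a plain class (or `None`), shares the stated limitation around `*args`/`**kwargs`, and does not reproduce the gist's helper functions; the `expected` helper and the `add` example are hypothetical.

```python
import functools
import inspect


def strict(func):
    """Require annotations everywhere and enforce them with isinstance()."""
    signature = inspect.signature(func)
    empty = inspect.Parameter.empty
    for name, parameter in signature.parameters.items():
        if parameter.annotation is empty:
            raise TypeError("Missing annotation for argument '{}'".format(name))
    if signature.return_annotation is empty:
        raise TypeError("Missing return annotation")

    def expected(annotation):
        # An annotation of None is read as "the value must be None".
        return type(None) if annotation is None else annotation

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()
        for name, value in bound.arguments.items():
            wanted = expected(signature.parameters[name].annotation)
            if not isinstance(value, wanted):
                raise TypeError("Argument '{}' must be {}".format(name, wanted.__name__))
        result = func(*args, **kwargs)
        wanted = expected(signature.return_annotation)
        if not isinstance(result, wanted):
            raise TypeError("Return value must be {}".format(wanted.__name__))
        return result

    return wrapper


@strict
def add(a: int, b: int) -> int:
    return a + b


print(add(1, 2))
```

Missing annotations fail at decoration time, while type mismatches fail at call time.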
@cathalgarvey
cathalgarvey / 2048_arbgrid.py
Created June 16, 2014 10:57
An almost-2048 clone for CLI use, with optional arbitrary grid dimensions (Python 3.3+)
import random
import shutil
import tty
import sys
import termios
def getchar():
    'Linux-only: Could not be bothered making arrow-getting code WinMac-compatible.'
    fd = sys.stdin.fileno()
    old_settings = termios.tcgetattr(fd)
@cathalgarvey
cathalgarvey / mullis.py
Last active August 29, 2015 14:04
A micro-lisp-like for processing 'formulas' in and upon JSON data, in Python / Coffeescript
# This is a micro-language designed to be embedded in JSON, allowing
# data in the JSON tree to specify a formula for how it should be derived
# based on the rest of the tree. Formulas are constructed of prefix-notation
# lists naming a function and passing arguments, which are recursively
# evaluated.
#
# Operations are mostly mathematical, with one ternary function that allows
# for simple conditional operations or code-branching.
#
# An example:
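The gist's own example is not included in this listing. As a rough illustration of the idea described above, a minimal prefix-notation evaluator could look like the sketch below. The operator set, the `"if"` and `"get"` function names, and the dotted-path lookup are assumptions chosen for the sketch, not the micro-language's actual vocabulary.

```python
import operator

# Assumed operator table; the real language may name these differently.
OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "<": operator.lt,
    ">": operator.gt,
    "=": operator.eq,
}


def lookup(tree, path):
    """Follow a dotted path such as 'order.price' through nested dicts."""
    for key in path.split("."):
        tree = tree[key]
    return tree


def evaluate(formula, tree):
    """Recursively evaluate a prefix-notation list against the JSON tree."""
    if not isinstance(formula, list):
        return formula  # Plain values evaluate to themselves.
    name, *args = formula
    if name == "if":
        # The ternary: only the chosen branch is evaluated.
        condition, when_true, when_false = args
        branch = when_true if evaluate(condition, tree) else when_false
        return evaluate(branch, tree)
    if name == "get":
        # Referenced values may themselves be formulas.
        return evaluate(lookup(tree, args[0]), tree)
    return OPERATORS[name](*(evaluate(arg, tree) for arg in args))


tree = {
    "price": 20,
    "quantity": 3,
    "total": ["*", ["get", "price"], ["get", "quantity"]],
    "shipping": ["if", [">", ["get", "total"], 50], 0, 5],
}
print(evaluate(tree["total"], tree), evaluate(tree["shipping"], tree))
```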
@cathalgarvey
cathalgarvey / biorad_dialogue_killer.txt
Last active August 29, 2015 14:06
Bookmarklet to remove the goddamned "pick your location" landing dialog from bio-rad.com
Bio-rad.com has one of those intensely aggravating landing dialogs that demands
to know where you are so it can forcibly remove you from what you're trying to
view, and perhaps gouge you for more money.
This is annoying, invasive and generally not-cool, so here's a little
bookmarklet to remove the dialog and get on with things.
To use, just copy the below code into a new bookmark using your browser's
bookmark manager. Don't omit the "javascript:" bit or the final empty
parentheses: "()".
@cathalgarvey
cathalgarvey / patch_wgetted_4chan.py
Last active August 29, 2015 14:18
A CSS/JS-fetching and HTML-patching script for correctly archiving sites (well, 4chan at least) using wget, as requested on Reddit.
#!/usr/bin/env python3
# by Cathal Garvey, copyright 2015, released under the AGPL: https://gnu.org/licenses/agpl.txt
# Commissioned by a 4chan user on reddit /r/linux who wanted backups but wget couldn't fetch most JS/CSS
# correctly. Only tested on 4chan in keeping with request.
# Usage e.g. (papercraft sub on 4chan):
# wget --recursive --no-clobber --page-requisites --html-extension --convert-links --no-parent http://boards.4chan.org/po/
# cd boards.4chan.org
# # (Directory contains subdirectory "po/" which contains all HTML)
# # (Provide root domain of crawled site to help resolve relative links, and target folder)
# python3 <this script> boards.4chan.org po