Cathal Garvey cathalgarvey

@cathalgarvey
cathalgarvey / jlgz_dump.go
Created April 5, 2017 12:09
How to Read Lines from GZIP-Compressed Files in Go
package main

import (
	"bufio"
	"compress/gzip"
	"fmt"
	"log"
	"os"
)
cathalgarvey / chacha20
Created August 15, 2014 14:43
ChaCha20 stream cipher in Python 3
# Pure Python ChaCha20
# Based on NumPy implementation: https://gist.github.com/chiiph/6855750
# Based on http://cr.yp.to/chacha.html
#
# I wanted an implementation of ChaCha in clean, understandable Python
# as a way to get a handle on the algorithm for porting to another language.
# There are plenty of bindings but few pure implementations, because
# pure Python is too slow for practical use in cryptography.
#
# The preceding implementation used NumPy, which avoided a lot of the
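The preview above ends before any cipher code appears. As a rough illustration of the core operation such a port would need, here is a minimal sketch of the ChaCha quarter round on 32-bit words. The names `MASK32`, `rotl32` and `quarter_round` are chosen for this example and are not taken from the gist.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value within 32 bits


def rotl32(value, count):
    """Rotate a 32-bit word left by `count` bits (0 < count < 32)."""
    return ((value << count) & MASK32) | (value >> (32 - count))


def quarter_round(a, b, c, d):
    """Apply one ChaCha quarter round to four 32-bit words."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d
```

Python integers are unbounded, so every addition is masked back to 32 bits by hand. This is one reason a pure Python version tends to be slow compared with bindings.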
cathalgarvey / setupbiopython.sh
Created March 31, 2014 19:16
A quick two-liner that should install Biopython on a Linux Mint/Ubuntu/Debian system.
#!/bin/bash
# This script installs Python3, ipython, pip, and uses pip to install biopython.
# You'll be asked for your password.
sudo apt-get install python3-dev ipython3 python3-pip build-essential
sudo pip3 install biopython
# That's it, two lines!
cathalgarvey / VersionedDict
Created September 21, 2013 23:10
A revision-enabled dict subclass, so your dicts don't forget their prior entries.
class VersionedDict(dict):
    '''A dictionary subclass that remembers all, or a defined number of, prior entries for a key.
    Allows reversion by number from "head" or by absolute reference in the revision list.
    Allows retrieval of the currently retained revision history for a key.
    Deletion deletes all revisions, not merely the most recent.
    If instantiated with the "revisions" keyword and an integer argument, only retains that many revisions per entry.'''
    def __init__(self, *args, **kwargs):
        revisions = kwargs.pop('revisions', None)
        # None (the default) means no limit; int(None) would raise a TypeError.
        self._allowed_revisions = abs(int(revisions)) if revisions is not None else None
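The preview stops inside the constructor, so the rest of the class is not visible here. The following is a minimal sketch of how the behaviour described in the docstring could be implemented. It is not the gist's code: the class name `RevisionDictSketch`, the `_history` attribute and the `history` and `revert` methods are hypothetical.

```python
class RevisionDictSketch(dict):
    """Hypothetical illustration of a dict that retains prior values per key."""

    def __init__(self, *args, **kwargs):
        revisions = kwargs.pop('revisions', None)
        # None means "keep every prior value".
        self._allowed_revisions = None if revisions is None else abs(int(revisions))
        self._history = {}
        super().__init__()
        # Route initial items through __setitem__ so they are tracked too.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value

    def __setitem__(self, key, value):
        if key in self:
            past = self._history.setdefault(key, [])
            past.append(super().__getitem__(key))
            if self._allowed_revisions is not None:
                # Drop the oldest values beyond the retention limit.
                del past[:max(0, len(past) - self._allowed_revisions)]
        super().__setitem__(key, value)

    def __delitem__(self, key):
        # Deleting a key discards every retained revision as well.
        self._history.pop(key, None)
        super().__delitem__(key)

    def history(self, key):
        """Return retained prior values, oldest first."""
        return list(self._history.get(key, []))

    def revert(self, key, steps=1):
        """Restore the value `steps` revisions behind head, discarding newer ones."""
        past = self._history.get(key, [])
        if not 1 <= steps <= len(past):
            raise IndexError("no such revision for key %r" % (key,))
        value = past[-steps]
        del past[-steps:]
        super().__setitem__(key, value)
```

Note that `dict.update` and `dict.setdefault` bypass `__setitem__` in CPython, so a fuller version would need to override those as well.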
cathalgarvey / grep_tweets
Created April 14, 2013 20:06
grep_tweets, a companion script to cat_tweets that allows searching and filtering of Twitter tweet archive data by regex or a bunch of other useful parameters.
#!/usr/bin/env python3
import time
import datetime
import json
import re
timestamp_format = '%a %b %d %H:%M:%S %z %Y'
def twitter_timestamp_to_obj(time_string):
    'Returns a timezone-aware datetime object.'
    return datetime.datetime.strptime(time_string, timestamp_format)
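The filtering logic itself is not part of the preview. As a sketch of the kind of regex filtering the description mentions, the function below keeps tweets whose text matches a pattern and, optionally, that were posted on or after a given time. The function name `filter_tweets` and its parameters are hypothetical, and the example assumes each tweet is a dict with `text` and `created_at` keys.

```python
import datetime
import re

timestamp_format = '%a %b %d %H:%M:%S %z %Y'


def filter_tweets(tweets, pattern, since=None):
    """Yield tweets whose text matches `pattern`, optionally no older than `since`.

    `since` must be a timezone-aware datetime, because the parsed
    timestamps are timezone-aware.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    for tweet in tweets:
        if not regex.search(tweet["text"]):
            continue
        if since is not None:
            created = datetime.datetime.strptime(tweet["created_at"], timestamp_format)
            if created < since:
                continue
        yield tweet
```

Writing it as a generator means a large archive can be filtered without holding every match in memory at once.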
cathalgarvey / cat_tweets
Created April 14, 2013 13:06
A script to concatenate the monthly JSON twitter files given in Twitter's tweets archive, and to add a UTC Unix timestamp to each tweet for easy parsing with other tools.
#!/usr/bin/env python3
import time
import datetime
import os
import json
timestamp_format = '%a %b %d %H:%M:%S %z %Y'
def twitter_timestamp_to_obj(time_string):
    'Returns a timezone-aware datetime object.'
    return datetime.datetime.strptime(time_string, timestamp_format)
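The description mentions adding a UTC Unix timestamp to each tweet, but that step falls outside the preview. A minimal sketch of the conversion, assuming the same timestamp format, might look like this. The function name `twitter_timestamp_to_unix` is hypothetical.

```python
import datetime

timestamp_format = '%a %b %d %H:%M:%S %z %Y'


def twitter_timestamp_to_unix(time_string):
    """Convert a Twitter-style timestamp string to integer Unix seconds."""
    parsed = datetime.datetime.strptime(time_string, timestamp_format)
    # The parsed datetime is timezone-aware, so timestamp() already
    # accounts for the UTC offset given in the string.
    return int(parsed.timestamp())


# The same instant written with two different offsets gives one Unix time.
utc_form = twitter_timestamp_to_unix("Sun Apr 14 13:06:00 +0000 2013")
offset_form = twitter_timestamp_to_unix("Sun Apr 14 14:06:00 +0100 2013")
print(utc_form == offset_form)  # True
```

Because the `%z` directive captures the offset, no separate timezone handling is needed before calling `timestamp()`.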
cathalgarvey / wordsoupfixer.py
Created January 20, 2013 18:39
Word soup fixer, for emails written in one long line full of ellipses. I get a surprising number of these.
#!/usr/bin/env python3
import sys
fixfile = sys.argv[1]
with open(fixfile) as InputFile:
    word_soup = InputFile.read()
# Strip surrounding whitespace, then any leading or trailing '.', '!' or '?' (including a trailing ellipsis).
word_soup = word_soup.strip().strip(".!?")
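The preview ends after the initial clean-up, so the actual fixing step is not shown. One plausible approach, sketched below, treats each ellipsis as a sentence break. The function name `fix_word_soup` and the splitting rule are assumptions made for this example, not the gist's method.

```python
import re


def fix_word_soup(word_soup):
    """Turn one long ellipsis-separated line into capitalised sentences."""
    # Same clean-up as the preview: trim whitespace, then edge punctuation.
    word_soup = word_soup.strip().strip(".!?")
    # Assumed rule: a run of two or more dots, or a Unicode ellipsis,
    # marks the end of a sentence.
    fragments = re.split(r"\s*(?:\.{2,}|\u2026)\s*", word_soup)
    sentences = [frag[0].upper() + frag[1:] for frag in fragments if frag]
    return ". ".join(sentences) + "." if sentences else ""


print(fix_word_soup("hi there...i saw your post....can you help..."))
# Hi there. I saw your post. Can you help.
```

Single full stops are left alone, so abbreviations and ordinary sentences inside a fragment survive unchanged.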
cathalgarvey / countfiles
Last active December 10, 2015 22:38 — forked from anonymous/countfiles
A little script I wrote to perform a quick census on my music library, to help me identify which artists/albums contain the most mp3s/m4as. This was intended to help me replace my music library with oggs, but could be used for all sorts of other handy things, too.
#!/usr/bin/env python3
import os
from sys import argv
# Walk through folders recursively, list the full path and number of (extension) files found in each.
basefolder = os.path.expanduser(argv[1])
filetype = str(argv[2]).lower()
output = []
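The walking and counting code is cut off by the preview. The comment above describes the intent, which can be sketched with `os.walk` as follows. The function name `count_files` and the choice to sort by count are assumptions made for this example.

```python
import os


def count_files(basefolder, filetype):
    """Return (folder, count) pairs for folders containing `filetype` files.

    Folders with no matching files are omitted, and the result is sorted
    so the folders with the most matches come first.
    """
    suffix = "." + filetype.lower().lstrip(".")
    counts = []
    for folder, _subfolders, filenames in os.walk(os.path.expanduser(basefolder)):
        matches = sum(1 for name in filenames if name.lower().endswith(suffix))
        if matches:
            counts.append((folder, matches))
    counts.sort(key=lambda item: item[1], reverse=True)
    return counts
```

Calling `count_files("~/Music", "mp3")` would list each folder under a hypothetical `~/Music` directory that still contains mp3 files, largest first.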
cathalgarvey / EcoliK12_OptimalCodons
Created July 12, 2012 20:02
E. coli K-12 Optimal Table, a compilation of information from (Welch et al., Sept 2009) and K-12 genomic codon frequencies.
{
    "End": {
        "TAG": {
            "frequency": 0.0,
            "relfreq": 0.0
        },
        "localfrequency": 2.74,
        "TGA": {
            "frequency": 0.98,
            "relfreq": 0.3576642335766423
cathalgarvey / RestrictionEnzymes.json
Created July 12, 2012 09:39
A somewhat exhaustive collection of restriction enzymes in JSON format.
{
    "binsi": {
        "target_site": "CCWGG",
        "name": "BinSI",
        "suppliers": [],
        "source": "ATCC 15702",
        "references": [
            "Khosaka, T., Kiwaki, M., Rak, B., (1983) FEBS Lett., vol. 163, pp. 170-174."
        ],
        "prototype": "EcoRII",