Joe Healey jrjhealey
@jrjhealey
jrjhealey / tabulateHHpred.py
Created July 5, 2017 21:06
Turn the output of HHsearch into tab-delimited text
# -*- coding: utf-8 -*-
"""
This script takes the .hhr files output by HHSuite and
turns the rather verbose output into a fully tabulated
version with all the fields separated, one line per
file. The result can then be viewed easily in Excel etc.
It requires the non-standard pandas module.
"""
jrjhealey / MacMAC.sh
Last active September 20, 2017 10:55
Get wireless and ethernet MAC addresses.
#!/bin/bash
ethernet=$(ifconfig en0 | awk '/ether/{print $2}')
wifi=$(ifconfig en1 | awk '/ether/{print $2}')
echo "Ethernet MAC Address: $ethernet"
echo "Wifi MAC Address: $wifi"
jrjhealey / Genbank_slicer.py
Last active September 20, 2017 19:11
Creating subsetted operons/gene genbank files from a 'parent' sequence!
#!/usr/bin/python
# This script is designed to take a genbank file and 'slice out'/'subset'
# regions (genes/operons etc.) and produce a separate file. This can be
# done explicitly by telling the script which base sites to use, or can
# 'decide' for itself by blasting a fasta of the sequence you're interested
# in against the Genbank you want to slice a record out of.
# Note, the script (obviously) does not preserve the index number of the
# bases from the original
# Getting python logging info output in a colour coded and customised manner.
# Stolen from M. Galardini and https://stackoverflow.com/questions/384076/how-can-i-color-python-logging-output/2532931#2532931
import logging
import sys
BLACK, RED, GREEN, YELLOW, BLUE, MAGENTA, CYAN, WHITE = range(8)
COLORS = {
'WARNING' : YELLOW,
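The preview above stops partway through the colour table. As a rough illustration of where this kind of setup usually leads, here is a minimal sketch of a `logging.Formatter` subclass that wraps each message in an ANSI colour escape. Only the `WARNING` to `YELLOW` pairing comes from the gist; the other level colours, the names `LEVEL_COLORS`, `ColorFormatter` and `make_logger`, and the message format are assumptions made for the example.

```python
import logging
import sys

BLACK, RED, GREEN, YELLOW, BLUE, MAGENTA, CYAN, WHITE = range(8)

# Hypothetical mapping: only WARNING -> YELLOW is visible in the gist preview.
LEVEL_COLORS = {
    "DEBUG": BLUE,
    "INFO": GREEN,
    "WARNING": YELLOW,
    "ERROR": RED,
    "CRITICAL": MAGENTA,
}

RESET_SEQ = "\033[0m"      # reset all attributes
COLOR_SEQ = "\033[1;%dm"   # bold + foreground colour (30-37)


class ColorFormatter(logging.Formatter):
    """Wrap the formatted record in the colour assigned to its level."""

    def format(self, record):
        message = super().format(record)
        color = LEVEL_COLORS.get(record.levelname)
        if color is None:
            return message
        return COLOR_SEQ % (30 + color) + message + RESET_SEQ


def make_logger(name="demo", stream=sys.stderr):
    """Build a logger whose single handler uses ColorFormatter."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(ColorFormatter("%(levelname)s: %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicate output via the root logger
    return logger
```

Calling `make_logger().warning("careful")` should print the line in bold yellow on a terminal that understands ANSI escapes.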
jrjhealey / Fiddling_with_fastas.sh
Created November 27, 2017 14:13
A useful pure bash construct for dealing with FASTA files. Can be tweaked to perform all sorts of actions on the headers of sequences (e.g. rearrangement, regex, text matching)
#!/bin/bash
#### Print out all fastas ending in a certain string. ####
# Change *"$string" to *"$string"* to match headers containing the string,
# or to "$string"* to match headers starting with it.
file="$1"
string="$2"
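The same header-matching idea can be sketched in Python rather than the gist's bash. This is not the author's script: `read_fasta`, `filter_by_header` and the `mode` parameter are hypothetical names, and the three modes mirror the "ends with", "contains" and "starts with" variants described in the comments above.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, seq = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        elif header is not None:
            seq.append(line.strip())
    if header is not None:
        yield header, "".join(seq)


def filter_by_header(lines, string, mode="endswith"):
    """Yield records whose header ends with, contains, or starts with `string`."""
    tests = {
        "endswith": str.endswith,
        "contains": str.__contains__,
        "startswith": str.startswith,
    }
    match = tests[mode]
    for header, seq in read_fasta(lines):
        if match(header, string):
            yield header, seq
```

With an open file handle `fh`, `for header, seq in filter_by_header(fh, "_partial"): print(f">{header}\n{seq}")` would print only the matching records.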
jrjhealey / Iterative_R_images.r
Created January 30, 2018 17:39
Making a video from an R loop
#!/usr/bin/env Rscript
# Plotting surface plots via ggplot2/plotly
# Usage:
# $ Rscript CDmeltplot.R -i data.csv -o filename
############################################################
# General purpose heatmap plotting script for consistency. #
# This script can be slow as it was designed to be pretty #
jrjhealey / prime_anagrams.py
Last active February 1, 2018 10:54
Work out if 2 words are anagrams based on their unique prime products.
#!/usr/bin/env python
"""
Using products of primes to work out whether 2 strings are anagrams of one another.
Inspired by: https://twitter.com/fermatslibrary/status/958700402647674880
Map each of the 26 English letters (A, B, C, ...) to a (unique) prime number.
Multiply the values mapped to each character together (A x B x C x ...).
Since every integer greater than 1 is either a prime or a unique product of primes
(according to the fundamental theorem of arithmetic), two words are anagrams of one another if their
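The preview cuts off here, but the idea described so far can be sketched as follows. This is an illustrative reconstruction, not the gist's actual code: the helper names are made up, letters are mapped to the first 26 primes in alphabetical order, and the choices to ignore case and skip non-letters are assumptions.

```python
from string import ascii_lowercase


def first_primes(n):
    """Return the first n primes using trial division by earlier primes."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


# a -> 2, b -> 3, c -> 5, ... z -> 101
PRIME_FOR = dict(zip(ascii_lowercase, first_primes(26)))


def prime_product(word):
    """Multiply together the primes assigned to each letter of `word`."""
    product = 1
    for ch in word.lower():
        if ch in PRIME_FOR:  # assumption: non-letters are ignored
            product *= PRIME_FOR[ch]
    return product


def are_anagrams(a, b):
    """Two words are anagrams exactly when their prime products match."""
    return prime_product(a) == prime_product(b)
```

Because multiplication is commutative, letter order does not matter, and unique factorisation guarantees that different multisets of letters give different products. Python integers are arbitrary precision, so long words do not overflow.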
jrjhealey / getPDB.sh
Last active April 26, 2018 10:59
Fetching PDB structures from the Protein Data Bank
#!/bin/bash
# Script to retrieve PDBs via the command line from the PDB HTTP/FTP
# Capture inputs
usage()
{
cat << EOF
usage: $0 options
#!/usr/bin/env Rscript
# Perform cell-wise averaging of matrices (ignoring headers and row names)
# Standard install if missing
list.of.packages <- c("argparse", "abind")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
for(i in list.of.packages){suppressMessages(library(i, character.only = TRUE))}
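For readers who want to see what "cell-wise averaging" means concretely, here is a small Python sketch using plain nested lists. It is not a port of the R script (which relies on `abind`); the function name `cellwise_mean` is hypothetical, and it assumes headers and row names have already been stripped so that each matrix is purely numeric and all matrices share one shape.

```python
def cellwise_mean(matrices):
    """Average several equally shaped matrices cell by cell.

    `matrices` is a list of matrices, each a list of rows of numbers.
    """
    if not matrices:
        raise ValueError("need at least one matrix")
    n_rows, n_cols = len(matrices[0]), len(matrices[0][0])
    for m in matrices:
        if len(m) != n_rows or any(len(row) != n_cols for row in m):
            raise ValueError("all matrices must share the same shape")
    count = len(matrices)
    return [
        [sum(m[i][j] for m in matrices) / count for j in range(n_cols)]
        for i in range(n_rows)
    ]
```

For example, averaging `[[1, 2], [3, 4]]` with `[[3, 4], [5, 6]]` gives `[[2.0, 3.0], [4.0, 5.0]]`.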
jrjhealey / seqcol.sh
Created October 10, 2018 11:49
Entirely unnecessarily complex script to colour DNA sequences at arbitrary positions.. because...reasons? Even the help is colourful!
#!/bin/bash
# Set colours:
df=$(tput sgr0)
tr=$(tput setaf 1)
tg=$(tput setaf 2)
ty=$(tput setaf 3)
tb=$(tput setaf 4)
usage(){
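The preview ends before the colouring logic appears. As a loose illustration of the general idea (highlighting chosen positions in a sequence), here is a short Python sketch. It is not the gist's bash implementation: the function name, the 1-based position convention, and the hard-coded ANSI escapes (used here in place of `tput`) are all assumptions.

```python
RESET = "\033[0m"
# Assumed ANSI equivalents of the red/green/yellow/blue colours set above.
ANSI = {
    "red": "\033[31m",
    "green": "\033[32m",
    "yellow": "\033[33m",
    "blue": "\033[34m",
}


def colour_sequence(seq, positions, colour="red"):
    """Return `seq` with the bases at the given 1-based positions coloured."""
    start = ANSI[colour]
    wanted = set(positions)
    out = []
    for index, base in enumerate(seq, start=1):
        out.append(f"{start}{base}{RESET}" if index in wanted else base)
    return "".join(out)
```

Printing `colour_sequence("ACGTACGT", [2, 5], "green")` on an ANSI-capable terminal should show the second and fifth bases in green.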