Andrew Perry pansapiens

  • Melbourne, Australia
@pansapiens
pansapiens / Dockerfile
Last active July 21, 2020 09:43
biobloom in Docker
FROM ubuntu:18.04
RUN apt-get -y update && \
    apt-get -y install git build-essential autogen autotools-dev automake \
        libsdsl-dev libsdsl3 libsparsehash-dev zlib1g-dev libboost-all-dev libgomp1 && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
WORKDIR /
RUN git clone --depth 1 --branch 2.3.2 https://github.com/bcgsc/biobloom.git && \
    cd /biobloom && \
    git submodule update --init && ./autogen.sh && \
pansapiens / _README.md
Last active April 14, 2022 11:01
The Aspera ascp key you are looking for

Installing and running ascp is a PITA.

Here's the private key you need (at least for NCBI/ENA downloads).

ascp is now available in a Docker container. This makes things easier.

Do this:

# We need the not-actually-secret private key that comes packaged with Aspera Connect but is inexplicably not used as the default
# when no key is specified. Here's one I prepared earlier.
pansapiens / kmer_vector_example.ipynb
Created May 11, 2020 11:07
k-mer counts as input feature vectors in Python
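The notebook itself is not shown in this listing, so the following is only a minimal sketch of the general idea named in the title, not the author's code. It assumes a DNA alphabet of `ACGT`, overlapping windows, and a fixed lexicographic k-mer ordering so that every sequence maps to a vector of the same length; the function name `kmer_vector` is hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_vector(seq, k=3, alphabet="ACGT"):
    """Return (kmers, counts), where counts[i] is how often kmers[i] occurs in seq.

    Hypothetical helper for illustration. Windows overlap, and any window
    containing a character outside `alphabet` (e.g. N) is simply not counted.
    """
    seq = seq.upper()
    # Fixed ordering of all len(alphabet)**k possible k-mers.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    # Count every overlapping window of length k.
    observed = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return kmers, [observed.get(kmer, 0) for kmer in kmers]


kmers, counts = kmer_vector("ACGTACGT", k=2)
```

Because the k-mer ordering is fixed, vectors from different sequences can be stacked row-wise into a feature matrix for a downstream model.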
pansapiens / custom_environments_in_jupyter.ipynb
Last active July 21, 2020 01:53
Using custom environments in Jupyter
pansapiens / aaindex.py
Created March 13, 2020 01:15
AAIndex parsing
import urllib.request as request
from collections import defaultdict
def parse_aaindex2(lines, default=None):
    """
    Parse the lines of an AAIndex2 substitution matrix and return a dict of the entire database,
    keyed by AAIndex identifier.
    The aaindex[id]['matrix'] dictionary has the same structure as Biopython's `Bio.SubsMat.MatrixInfo`
    substitution matrices.
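The preview above is cut off, so the rest of the parser is not reproduced here. As a small sketch of the structure the docstring refers to: `Bio.SubsMat.MatrixInfo` matrices are plain dicts keyed by residue-pair tuples, typically storing only one orientation of each pair. The helper `pair_score` and the toy values below are hypothetical and for illustration only.

```python
def pair_score(matrix, a, b, default=None):
    """Look up the score for residues a and b in a MatrixInfo-style dict.

    Hypothetical helper: tries (a, b) first, then the mirrored key (b, a),
    since only one orientation of each pair may be stored.
    """
    if (a, b) in matrix:
        return matrix[(a, b)]
    return matrix.get((b, a), default)


# Toy matrix with illustrative values, keyed by residue-pair tuples.
toy_matrix = {("A", "A"): 4, ("R", "A"): -1, ("R", "R"): 5}

score_ra = pair_score(toy_matrix, "R", "A")  # stored orientation
score_ar = pair_score(toy_matrix, "A", "R")  # mirrored lookup
missing = pair_score(toy_matrix, "A", "W")   # absent pair falls back to default
```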
pansapiens / 00_README.md
Last active November 20, 2019 23:59
Quick scripts for setting up Data Carpentry Genomics: https://datacarpentry.org/genomics-workshop/setup.html (option B)
pansapiens / gff2gtf.R
Created August 8, 2019 03:25
gff2gtf.R
#!/usr/bin/env Rscript
# Based on https://www.biostars.org/p/45791/#45804
#
# Usage:
# ./gff2gtf.R genes.gff genes.gtf
#
# Dependencies can be easily installed like:
# conda create -n gff2gtf bioconductor-rtracklayer
# conda activate gff2gtf
pansapiens / 0README.md
Last active November 22, 2022 00:58
RStudio Server in Singularity (via rocker), M3 HPC flavour
pansapiens / Dockerfile
Last active June 6, 2019 02:26
nullarbor Dockerfile
FROM continuumio/miniconda3
LABEL maintainer="Andrew Perry <andrew.perry@monash.edu>"
ARG KRAKEN_DB_URL=https://ccb.jhu.edu/software/kraken/dl/minikraken_20171019_4GB.tgz
ARG CENTRIFUGE_DB_URL=ftp://ftp.ccb.jhu.edu/pub/infphilo/centrifuge/data/p_compressed+h+v.tar.gz
ENV KRAKEN_DB_PATH=/databases/minikraken
ENV CENTRIFUGE_DB_PATH=/databases/centrifuge-db
ENV KRAKEN_DEFAULT_DB=$KRAKEN_DB_PATH/minikraken_20171013_4GB
ENV CENTRIFUGE_DEFAULT_DB=$CENTRIFUGE_DB_PATH/p_compressed+h+v
pansapiens / keybase.md
Created January 14, 2019 04:32
Keybase identification

Keybase proof

I hereby claim:

  • I am pansapiens on github.
  • I am pansapiens (https://keybase.io/pansapiens) on keybase.
  • I have a public key whose fingerprint is CC4E 5E55 3A06 BC36 2B20 F907 DB09 40BE B37B 0426

To claim this, I am signing this object: