Michael Schubert mschubert

🧬
View GitHub Profile
@mschubert
mschubert / bsub
Created January 31, 2014 17:05
A python script that emulates the LSF "bsub" command so you can test configurations locally before running them on a cluster
#!/usr/bin/env python
"""
Fake qsub command, for testing qsub jobs locally
original: https://bitbucket.org/dalloliogm/fake-qsub
"""
import argparse
import subprocess
import logging
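The preview above stops after the imports. As a rough sketch of how such an emulation might work, assuming only the `-J`, `-o`, and `-q` options need to be accepted (the names `parse_args` and `run_locally` are hypothetical and not taken from the gist):

```python
import argparse
import subprocess


def parse_args(argv):
    """Parse a small, assumed subset of bsub options (not the full CLI)."""
    parser = argparse.ArgumentParser(prog="bsub")
    parser.add_argument("-J", dest="job_name", default="job")
    parser.add_argument("-o", dest="log_file", default=None)
    parser.add_argument("-q", dest="queue", default=None)  # accepted but ignored locally
    parser.add_argument("command", nargs=argparse.REMAINDER)
    return parser.parse_args(argv)


def run_locally(args):
    """Run the job command in a local shell instead of submitting it."""
    cmd = " ".join(args.command)
    if args.log_file is None:
        return subprocess.call(cmd, shell=True)
    # Append stdout and stderr to the log file
    with open(args.log_file, "a") as log:
        return subprocess.call(cmd, shell=True, stdout=log, stderr=subprocess.STDOUT)
```

Calling `run_locally(parse_args(["-J", "test", "echo", "hello"]))` would run `echo hello` on the local machine and return its exit code.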
mschubert / unix.R
Last active August 29, 2015 13:56
A unix-like pipe operator in R to perform CLI operations (or call R functions)
# a unix-like pipe operator in R to perform CLI operations (or call R functions)
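`%|%` = function(x, command) {
    # call R functions directly; anything else is treated as a shell command
    if (is.function(command)) {
        command(x)
    } else {
        # inherits() returns a single TRUE/FALSE even for multi-class objects
        stopifnot(inherits(x, c('character', 'numeric', 'integer')))
        system(command, input=as.character(x), intern=TRUE)
    }
}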
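For comparison, the same dispatch can be sketched in Python. This is an analogue under stated assumptions, not part of the gist: the function name `pipe` is hypothetical, and each element of the input is written to the command's standard input on its own line, mirroring what `system(..., input=)` does in R.

```python
import subprocess


def pipe(x, command):
    """Apply a callable to x, or feed x to a shell command via stdin."""
    if callable(command):
        return command(x)
    # Treat scalars as a one-element sequence, one item per input line
    items = x if isinstance(x, (list, tuple)) else [x]
    text = "".join(f"{item}\n" for item in items)
    result = subprocess.run(
        command, shell=True, input=text,
        capture_output=True, text=True, check=True,
    )
    # Return the command output as a list of lines, like intern=TRUE in R
    return result.stdout.splitlines()
```

With this sketch, `pipe([3, 1, 2], "sort -n")` passes the numbers through the shell's `sort`, while `pipe([3, 1, 2], len)` simply calls the Python function.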
mschubert / BatchJobsWrapper.R
Last active August 29, 2015 13:57
Wrapper for the BatchJobs library that simplifies the interface and performs more checks
library(stringr)
library(BatchJobs)
library(plyr)
# Rationale
# This script uses BatchJobs to run functions locally, on multiple cores, or on LSF,
# depending on your BatchJobs configuration. It has a simpler interface, does more error
# checking than the library itself, and can queue different function calls. The
# function supplied *MUST* be self-sufficient, i.e. it must load the libraries and
# scripts it needs itself.
# BatchJobs on the EBI cluster is already set up when using the gentoo prefix.
mschubert / delay.py
Last active August 29, 2015 13:58
When starting a command using delay.py, delays the start of subsequent calls. This can be useful to e.g. limit peak file system load on a computing cluster.
#!/usr/bin/env python2.7
import sys, os.path
import subprocess
import time
import lockfile
import threading
lock = lockfile.FileLock(os.path.join(os.path.expanduser("~"), ".delay"))
lock.acquire()
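The preview ends once the lock is acquired. One way the delaying behaviour described above could be realised is sketched here, assuming a Unix system and using the standard-library `fcntl` module in place of the third-party `lockfile` package; the function name `run_delayed` and its parameters are hypothetical.

```python
import fcntl
import subprocess
import time


def run_delayed(command, lock_path, delay):
    """Start `command`, then hold the lock for `delay` seconds.

    A second caller blocks on the same lock, so its command starts
    at least `delay` seconds after this one did.
    """
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks while another call holds the lock
        proc = subprocess.Popen(command)       # start the actual command right away
        time.sleep(delay)                      # keep later calls waiting
        fcntl.flock(lock_file, fcntl.LOCK_UN)
    return proc.wait()
```

The command itself is not slowed down; only the start of the next call that uses the same lock file is held back.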
#
# loading the table, converting it to an expression matrix
#
df = read.table(
    "GBM.rnaseqv2__illuminahiseq_rnaseqv2__unc_edu__Level_3__RSEM_genes__data.data.txt",
    header=TRUE, sep="\t", stringsAsFactors=FALSE)
expr = data.matrix(df[2:nrow(df), df[1,]=="raw_count"])
rownames(expr) = df[2:nrow(df), 1]
#
mschubert / LSF.tmpl
Last active August 29, 2015 14:27
Minimal example of using rzmq to submit a worker job using LSF
#BSUB-J {{ job_name }} # name of the job / array jobs
#BSUB-o {{ log_file | /dev/null }} # output is sent to logfile, stdout + stderr by default
#BSUB-q {{ queue }} # Job queue
#BSUB-W {{ walltime }} # Walltime in minutes
#BSUB-M {{ memory | 4096 }} # Memory requirements in Mbytes
#BSUB-R rusage[mem={{ memory | 4096 }}] # Memory requirements in Mbytes
#BSUB-R select[panfs_nobackup_research]
R --no-save --no-restore --args "{{ args }}" < "{{ rscript }}"
mschubert / encfs-rsync.sh
Last active May 4, 2016 04:33
Script to (decrypt encfs and) selectively sync a remote sshfs directory
#!/bin/bash
# Script to (decrypt encfs and) selectively sync a remote sshfs directory
REMOTEDIR=server:/path/to/some/encfs
LOCALDIR=/path/to/local/directory
MOUNTFUNC=_ssh+encfs
# verbose
# set -vx
n = 1:1e7
# use a loop
new_n = rep(NA, length(n))
for (i in seq_along(n))
    new_n[i] = n[i] * 2 - 5
# use *apply
new_n = sapply(n, function(x) x * 2 - 5)
mschubert / .bashrc
Created December 22, 2016 21:34
Type the first letters of an old command and then "up-arrow" search
# enable line editing for terminal
if [ -t 1 ]; then
    bind '"\e[A": history-search-backward'
    bind '"\e[B": history-search-forward'
fi
# ignore system installation
# e.g. https://github.com/EBI-predocs/research-software/issues/57
.libPaths("~/.R")
# no factor coercion
# no graphical package install menu
# warn if partial matching arguments
options(
stringsAsFactors = FALSE,