Josh Herr jrherr

@jrherr
jrherr / excel_my_barplot.R
Created January 13, 2016 16:07 — forked from mw55309/excel_my_barplot.R
R code to create Excel-like barplot
x <- data.frame(d=runif(12), g=rep(1:4, each =3))
my.col <- c("deepskyblue3","darkorange2","darkgray","gold")
spacer <- c(1, 0.1, 0.1, 1, 0.1, 0.1, 1, 0.1, 0.1, 1, 0.1, 0.1)
bw <- 0.8
xmax <- (sum(spacer) * bw) + (nrow(x) * bw)
jrherr / read_hdf5_biom.R
Created December 8, 2015 17:55 — forked from jnpaulson/read_hdf5_biom.R
Hacks to load an hdf5 biom file and be able to add it to metagenomeSeq or phyloseq
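# biocLite() has been retired; BiocManager is the current Bioconductor installer
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install(c("rhdf5", "biom"))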
library(rhdf5)
library(biom)
# This generates the matrix column-wise
generate_matrix <- function(x){
    # HDF5 indices are 0-based; shift to R's 1-based indexing
    indptr = x$sample$matrix$indptr+1
    indices = x$sample$matrix$indices+1
    data = x$sample$matrix$data
jrherr / fq-strip-contam.sh
Last active September 9, 2015 15:28 — forked from standage/fq-strip-contam.sh
Procedure for removing contaminants from paired-end sequence data. The bwa-mem algorithm is used to map reads against a database of contaminants, a small Perl one-liner is used to filter out reads that map to the contaminants, the SAM data is converted to BAM format, which is then fed (via process substitution) to Tophat's bam2fastx to convert b…
# -q: output in Fastq format
# -Q: ignore BAM quality flags
# -P: paired-end data
bam2fastx -qQP -o clean.fq <(bwa mem contaminants.fasta reads.1.fq reads.2.fq | \
    perl -ne '@v = split(/\t/); print if(m/^@/ or ($v[1] & 4 and $v[1] & 8))' | \
    samtools view -bhS -)
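
The flag test in the Perl one-liner can be looked at in isolation. The sketch below is not part of the original gist: `both_unmapped` and `filter_unmapped_pairs` are hypothetical helper names, and the awk filter is only a stand-in for the Perl step. It assumes flag bit 0x4 means "read unmapped" and bit 0x8 means "mate unmapped", as defined in the SAM specification, so a record is kept only when neither mate aligned to the contaminant database.

```shell
# Hypothetical helper (not from the gist): succeeds when a SAM flag has both
# bit 0x4 (read unmapped) and bit 0x8 (mate unmapped) set.
both_unmapped() {
    [ $(( $1 & 4 )) -ne 0 ] && [ $(( $1 & 8 )) -ne 0 ]
}

# Hypothetical helper (not from the gist): read SAM text on stdin, keep header
# lines (starting with "@") and records whose flag field has bits 0x4 and 0x8 set.
# Integer division stands in for bitwise AND, which POSIX awk does not provide.
filter_unmapped_pairs() {
    awk -F'\t' '/^@/ || (int($2 / 4) % 2 && int($2 / 8) % 2)'
}
```

For example, `both_unmapped 77` succeeds (paired, read unmapped, mate unmapped, first in pair), while `both_unmapped 99` fails because that pair aligned and would be discarded as contaminant.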
jrherr / genome_plots.r
Last active August 29, 2015 14:27 — forked from roblanf/genome_plots.r
Make plots about genome sequencing, size, and gene content
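# biocLite() has been retired; BiocManager is the current Bioconductor installer
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("genomes")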
library(genomes)
library(ggplot2)
valid <- c("released", "created", "submitted")
data(proks)
update(proks)
jrherr / search_download.md
Last active August 29, 2015 14:26 — forked from sckott/search_download.md
occ_search and occ_download

to get similar results for GBIF search and download APIs

Load rgbif

library("rgbif")

occ_search() method

jrherr / 01.cmake
Last active August 29, 2015 14:22 — forked from anonymous/01.cmake
Wed Jun 10 19:52:40 +0000 2015
cmake
..
-DCMAKE_C_FLAGS_RELEASE=
-DCMAKE_CXX_FLAGS_RELEASE=
-DCMAKE_INSTALL_PREFIX=/home/ubuntu/.linuxbrew/Cellar/spades/3.5.0
-DCMAKE_BUILD_TYPE=Release
-DCMAKE_FIND_FRAMEWORK=LAST
-DCMAKE_VERBOSE_MAKEFILE=ON
jrherr / remove_redundant_sequences
Created May 18, 2015 17:47
remove redundant sequences from fasta file
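# one record per line as "header<TAB>sequence", then keep one record per distinct sequence (field 2)
sed -e '/^>/s/$/@/' -e 's/^>/#/' name.fasta | tr -d '\n' | tr "#" "\n" | tr "@" "\t" | sort -u -t $'\t' -f -k 2,2 | sed -e 's/^/>/' -e 's/\t/\n/'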
# kill all qsub jobs
qselect -u $USER -S -Q | xargs qdel
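
The de-duplication idea in the sed/sort pipeline above (collapse each record to one header and one sequence, then keep one record per distinct sequence) can also be sketched with awk. This is an alternative under stated assumptions, not the gist's method: `dedupe_fasta` is a hypothetical helper name, it keeps the first record seen for each sequence in input order instead of sorting, and it compares sequences case-insensitively to mirror the `-f` flag of `sort`.

```shell
# Hypothetical helper (not from the gist): read FASTA on stdin and print the
# first record for each distinct sequence. Wrapped sequence lines are joined,
# and comparison ignores case.
dedupe_fasta() {
    awk '
        # Print the pending record unless its sequence was already seen.
        function emit() {
            if (header != "" && !(toupper(seq) in seen)) {
                seen[toupper(seq)] = 1
                print header
                print seq
            }
        }
        /^>/ { emit(); header = $0; seq = ""; next }   # new record starts
             { seq = seq $0 }                          # accumulate sequence lines
        END  { emit() }                                # flush the last record
    '
}
```

A call such as `dedupe_fasta < name.fasta > name.unique.fasta` would write the reduced set; the output file name here is only illustrative.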
jrherr / gist:2c5fed226a1a63f5643a
Created February 8, 2015 16:55
Rename SPAdes output FASTA header lines for pipeline
sed -r "s/>NODE(_[0-9]+)_(.*)/>${input.name}\1 \2/g" $input > $output
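
The `${input.name}`, `$input`, and `$output` placeholders above look like variables filled in by a pipeline tool. `${input.name}` is not valid in plain shell, so the command would not run as-is in bash. A minimal sketch of the same substitution as a standalone function follows. `rename_spades_headers` and its prefix argument are hypothetical names, and the sketch assumes the prefix contains no characters that are special to sed (such as `/` or `&`).

```shell
# Hypothetical helper (not from the gist): rewrite SPAdes headers of the form
# ">NODE_<n>_<rest>" to "><prefix>_<n> <rest>". FASTA is read on stdin and the
# prefix is passed as the first argument.
rename_spades_headers() {
    sed -E "s/^>NODE(_[0-9]+)_(.*)/>${1}\1 \2/"
}
```

With the prefix `sampleA`, a header such as `>NODE_1_length_1000_cov_12.5` becomes `>sampleA_1 length_1000_cov_12.5`, and sequence lines pass through unchanged.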