@jnpaulson
jnpaulson / convert_to_biom.R
Last active August 29, 2015 13:57
A hack to convert data frames and matrices to biom format so they can be exported with the write_biom function
################################################################################
#' Convert data frame objects to biom-format \code{biom-class} objects.
#'
#' Wrapper to convert data frame objects to \code{biom-class} objects. It cannot
#' currently create sparse biom-format objects. The converted object can then be
#' exported with \code{write_biom}. See
#' \href{http://biom-format.org/documentation/biom_format.html}{the
#' biom-format definition}.
#'
#' The BIOM file format (canonically pronounced biome) is designed to be a
#' general-use format for representing biological sample by observation
#' contingency tables. BIOM is a recognized standard for the
#' \href{http://www.earthmicrobiome.org/}{Earth Microbiome Project} and is a
#' \href{http://gensc.org/}{Genomic Standards Consortium} candidate project.
#' Please see \href{http://biom-format.org/}{the biom-format home page} for more details.
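The preview above is cut off before the conversion code, so the following is only a minimal sketch of the dense-matrix preparation such a wrapper would plausibly need, written in base R. The toy table and the names `df` and `counts` are hypothetical, and the layout (observations in rows, samples in columns) is an assumption based on how BIOM describes its tables.

```r
# Hypothetical toy table: rows are observations (e.g. OTUs), columns are samples.
df <- data.frame(
  S1 = c(3L, 0L, 5L),
  S2 = c(1L, 2L, 0L),
  row.names = c("OTU1", "OTU2", "OTU3")
)

# Coerce to a dense integer matrix, keeping observation and sample IDs
# as row and column names.
counts <- as.matrix(df)
storage.mode(counts) <- "integer"

# IDs are needed on both axes before the table can be described as BIOM.
stopifnot(!is.null(rownames(counts)), !is.null(colnames(counts)))
```

A dense matrix like `counts` is the kind of input a constructor such as `make_biom` (named in another gist on this page) works from. The exact call is not shown here because the original wrapper body is not visible.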
jnpaulson / make_biom.R
Created April 13, 2014 23:00
Version of make_biom that allows a change in matrix_element_type
library(plyr) # provides alply()
make_biom2 <- function(data, sample_metadata = NULL, observation_metadata = NULL,
                       id = NULL, matrix_element_type = "int")
{
    if (!is.null(observation_metadata)) {
        rows = mapply(list, SIMPLIFY = FALSE, id = as.list(rownames(data)),
                      metadata = alply(as.matrix(observation_metadata),
                                       1, .expand = FALSE, .dims = TRUE))
    }
    else {
        rows = mapply(list, id = as.list(rownames(data)), metadata = NA,
jnpaulson / bubblePlot.R
Last active August 29, 2015 14:00
Bubble plot function!
bubblePlot<-function(yvector,xvector,sigvector=NULL,nbreaks=10,ret=FALSE,scale=1,...){
  #if(names(yvector)%in%names(xvector)){
  #  stop("Name the y and x vectors -- ideally the same name ;-)")
  #}
  # Bin each vector at its empirical quantiles
  ybreaks = cut(yvector,breaks=quantile(yvector,probs=seq(0,1,length.out=nbreaks)),include.lowest=TRUE)
  xbreaks = cut(xvector,breaks=quantile(xvector,probs=seq(0,1,length.out=nbreaks)),include.lowest=TRUE)
  # Count the features falling in each (x bin, y bin) pair
  numFeatures = lapply(levels(xbreaks),function(i){
    k = which(xbreaks==i)
    sapply(levels(ybreaks),function(j){
      length(which(ybreaks[k]==j))
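The binning and counting step in the preview can be illustrated on its own. This is a sketch under assumptions rather than the rest of `bubblePlot`: the simulated vectors `x` and `y` and the helper `bin_by_quantile` are hypothetical names, and `table()` is swapped in for the nested `lapply`/`sapply` loop because it produces the same per-cell counts in one call.

```r
set.seed(42)
x <- rnorm(200)
y <- rnorm(200)
nbreaks <- 10

# Hypothetical helper: cut a vector at its empirical quantiles.
# nbreaks quantile points give nbreaks - 1 bins.
bin_by_quantile <- function(v, nbreaks) {
  cut(v,
      breaks = quantile(v, probs = seq(0, 1, length.out = nbreaks)),
      include.lowest = TRUE)
}

xb <- bin_by_quantile(x, nbreaks)
yb <- bin_by_quantile(y, nbreaks)

# Rows are y bins, columns are x bins; each cell counts the points in that pair.
bin_counts <- table(y = yb, x = xb)
```

Note that `cut()` fails if two quantiles coincide, which can happen with heavily tied data such as sparse counts.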
jnpaulson / read_hdf5_biom.R
Last active March 25, 2020 17:24
Hacks to load an HDF5 biom file so that it can be used with metagenomeSeq or phyloseq
source("http://bioconductor.org/biocLite.R")
biocLite(c("rhdf5","biom"))
library(rhdf5)
library(biom)
# This generates the matrix column-wise.
# HDF5 biom indices are 0-based, so shift them by 1 for R's 1-based indexing.
generate_matrix <- function(x){
  indptr = x$sample$matrix$indptr+1
  indices = x$sample$matrix$indices+1
  data = x$sample$matrix$data
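The rest of `generate_matrix` is not visible in the preview. The fields it reads (`indptr`, `indices`, `data`) follow the usual compressed sparse column layout, so a base R sketch of expanding such arrays into a dense matrix might look like the following. The function name `csc_to_dense` and its `nrow` argument are hypothetical, and the inputs are assumed to be 0-based as stored in the file.

```r
# Expand 0-based compressed-sparse-column arrays into a dense matrix.
# indptr has one entry per column plus one; column j's values sit at
# 0-based positions indptr[j] .. indptr[j + 1] - 1 of indices and data.
csc_to_dense <- function(indptr, indices, data, nrow) {
  ncol <- length(indptr) - 1L
  mat <- matrix(0, nrow = nrow, ncol = ncol)
  for (j in seq_len(ncol)) {
    start <- indptr[j] + 1L   # first 1-based position for column j
    end <- indptr[j + 1L]     # last 1-based position for column j
    if (end >= start) {       # skip empty columns
      k <- start:end
      mat[indices[k] + 1L, j] <- data[k]
    }
  }
  mat
}

# Toy 3 x 2 example: column 1 holds 5 (row 1) and 7 (row 3); column 2 holds 2 (row 2).
dense <- csc_to_dense(indptr = c(0, 2, 3),
                      indices = c(0, 2, 1),
                      data = c(5, 7, 2),
                      nrow = 3)
```

The explicit `end >= start` check matters because `start:end` counts downward in R when a column is empty.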
jnpaulson / MRexperiment.Rmd
Created October 22, 2014 18:40
metagenomeSeq RMarkdown file to share MRexperiment objects through shinyapps
---
title: "metagenomeSeq"
author: "jpaulson"
date: "October 22, 2014"
output: html_document
runtime: shiny
---
This R Markdown document is made interactive using Shiny and allows you to explore MRexperiment objects.
Replace
jnpaulson / dcmetromap.geojson
Last active August 29, 2015 14:08
DC metro stations geojson
jnpaulson / dcbusmap.geojson
Last active August 29, 2015 14:08
DC bus station geojson
jnpaulson / grabAG.sh
Created November 15, 2014 23:49
Grab data from the American Gut consortium
#!/bin/bash
# grab the biom data
wget https://github.com/biocore/American-Gut/raw/master/data/PGP/PGP_100nt.biom.gz
wget https://github.com/biocore/American-Gut/raw/master/data/HMP/HMPv35_100nt.biom.gz
wget https://github.com/biocore/American-Gut/raw/master/data/GG/GG_100nt.biom.gz
wget https://github.com/biocore/American-Gut/raw/master/data/AG/AG_100nt.biom.gz
# decompress only the downloaded archives, not every file in the directory
gunzip *.gz
# grab the phenodata
jnpaulson / plotAG.R
Last active August 29, 2015 14:09
PCA/PCoA plots of the American Gut consortium data
library(vegan)
library(biom)
library(metagenomeSeq)
# Escape the dot and anchor the pattern so only real file extensions match
files = grep("\\.biom$",list.files(),value=TRUE)
files2= grep("\\.txt$",list.files(),value=TRUE)
files3 = sprintf("%s_MRexperiment.rdata",files)
# This generates the data - only run once.
for(i in seq_along(files)){
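The plotting code itself is not shown in the preview. As a rough illustration of what the description refers to, here is a base R sketch of PCA and PCoA coordinates on a simulated count table. The matrix `toy_counts`, the log2 transform with a pseudocount, and the Euclidean distance are all assumptions. The original loads vegan and may well use a different dissimilarity.

```r
set.seed(1)
# Hypothetical toy count table: 20 features (rows) by 10 samples (columns).
toy_counts <- matrix(rpois(200, lambda = 5), nrow = 20,
                     dimnames = list(paste0("OTU", 1:20), paste0("S", 1:10)))
log_counts <- log2(toy_counts + 1)

# PCA on samples: prcomp expects samples in rows, hence the transpose.
pca <- prcomp(t(log_counts))
pca_coords <- pca$x[, 1:2]

# PCoA (classical multidimensional scaling) on a sample-by-sample distance.
sample_dist <- dist(t(log_counts))
pcoa_coords <- cmdscale(sample_dist, k = 2)

# plot(pcoa_coords, xlab = "PCoA 1", ylab = "PCoA 2")
```

With Euclidean distances the two sets of coordinates agree up to the sign of each axis. PCoA becomes the more interesting choice once an ecological dissimilarity is substituted for `dist()`.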
jnpaulson / predictPatient.R
Created November 23, 2014 22:20
Predict patient from k MDS coordinates and accuracy
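library(metagenomeSeq) # MRcounts
library(matrixStats)   # rowSds
# Select the samples listed in elements 2 and 3 of `locals`
j = unlist(locals[2:3])
obj2 = obj[,j]
sampleID2 = sampleID[j]
# Normalized, log-transformed counts (features x samples)
mat = MRcounts(obj2,norm=TRUE,log=TRUE)
# Drop all-zero OTUs, then keep up to the 1000 most variable ones
otusToKeep <- which(rowSums(mat)>0)
otuVars<-rowSds(mat[otusToKeep,])
otuIndices<-otusToKeep[head(order(otuVars,decreasing=TRUE),1000)]
mat <- mat[otuIndices,]
# Transpose so that samples are rows
mat = t(mat)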
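The filtering step above can be tried on a toy matrix. This sketch uses base R's `apply(..., 1, sd)` in place of `rowSds`, and the names `toy`, `n_top`, `naive` and `safe` are hypothetical. It also shows one pitfall: indexing with `[1:n_top]` pads the result with `NA` when fewer than `n_top` features survive, whereas `head()` does not.

```r
set.seed(7)
# Hypothetical toy matrix: 5 features (rows) by 8 samples (columns), non-negative.
toy <- matrix(rexp(40), nrow = 5,
              dimnames = list(paste0("OTU", 1:5), NULL))
toy[2, ] <- 0   # an all-zero feature that should be dropped

keep <- which(rowSums(toy) > 0)
sds <- apply(toy[keep, , drop = FALSE], 1, sd)

n_top <- 1000
# Fixed-length indexing: positions beyond the available features become NA.
naive <- keep[order(sds, decreasing = TRUE)[1:n_top]]
# head() stops at the number of features actually available.
safe <- keep[head(order(sds, decreasing = TRUE), n_top)]

filtered <- t(toy[safe, , drop = FALSE])   # samples in rows, as in the original
```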