@aaronwolen
aaronwolen / dataframetools.r
Created July 5, 2012 21:37
dataframetools: A few simple functions for performing simple tasks with data.frames
# dataframetools
# A few simple functions for performing simple tasks with data.frames
# ---
# Includes functions for:
#
# - reordering data.frames
# - identifying invariant or blank columns
# - identifying groups of columns that are redundant with each other
# - converting all columns of class factor to class character
aaronwolen / ggpairs.r
Last active May 25, 2016 13:12
Modified version of the ggplot2 plotmatrix function that accepts additional variables for aesthetic mapping.
#' Modified version of the ggplot2 plotmatrix function that accepts additional
#' variables for aesthetic mapping.
#'
#' example
#' data(iris)
#' iris.vars <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
#' ggpairs(data = iris, facet.vars = iris.vars,
#' mapping = aes(color = Species, shape = Species))
ggpairs <- function (data, facet.vars = colnames(data), facet.scale = "free",
aaronwolen / sample_fastq.py
Created September 3, 2012 17:38
Random sample of fastq sequences from paired-end files
# Code written by brentp in response to BioStars question:
# http://www.biostars.org/post/show/6544/
import random
import sys
def write_random_records(fqa, fqb, N=100000):
    """ get N random headers from a fastq file without reading the
    whole thing into memory"""
    # each FASTQ record spans 4 lines; integer division keeps the count an int
    records = sum(1 for _ in open(fqa)) // 4
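The sampling idea in the docstring above can be sketched end to end as follows. This is a minimal illustration, not the original gist's code: it assumes every record is exactly four lines and that both files list mates in the same order. The function name `sample_paired_fastq` and the `out_a`, `out_b`, and `seed` parameters are hypothetical.

```python
import random


def sample_paired_fastq(fq_a, fq_b, out_a, out_b, n=100000, seed=None):
    """Write the same n randomly chosen records from two paired FASTQ files.

    Hypothetical helper: assumes 4-line records and identical mate order.
    Returns the number of record pairs written.
    """
    # First pass: count records by streaming lines, never loading the file.
    with open(fq_a) as handle:
        records = sum(1 for _ in handle) // 4

    # Pick record indices up front so both files yield matching pairs.
    rng = random.Random(seed)
    keep = set(rng.sample(range(records), min(n, records)))

    written = 0
    # Second pass: walk both files in lockstep, one record at a time.
    with open(fq_a) as in_a, open(fq_b) as in_b, \
            open(out_a, "w") as sub_a, open(out_b, "w") as sub_b:
        for index in range(records):
            rec_a = [in_a.readline() for _ in range(4)]
            rec_b = [in_b.readline() for _ in range(4)]
            if index in keep:
                sub_a.writelines(rec_a)
                sub_b.writelines(rec_b)
                written += 1
    return written
```

Only the set of selected indices and one record per file are held in memory at a time, at the cost of reading the first file twice.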
aaronwolen / annotate_chip.r
Created September 12, 2012 17:21
Generate data.frame of feature annotations using Bioconductor
#' Generate data.frame of feature annotations
#'
#' Use Bioconductor annotation packages to create a data.frame of feature/probe
#' annotations.
#'
#' @param chip character string identifying chip model (e.g., "illuminaHumanv2")
#' @param features optional character vector of chip features (i.e., probeset ids)
#' @param vars character vector of desired annotations. These must match objects
#' provided by the annotation package (e.g., "CHR")
#' @param duplicate.values how should duplicate values be handled? The default
aaronwolen / map_ftp.r
Created October 31, 2012 14:42
FTP tree mapper: save an FTP site's directory structure as a list
#' FTP tree mapper
#' Save an FTP site's directory structure as a list.
#' @author Aaron Wolen
#'
#' @examples
#' url <- 'ftp://ftp.genboree.org/EpigenomeAtlas/Current-Release/experiment-sample'
#' roadmap <- map_ftp(url = url, dirs = "Histone_H2BK120ac", recursive = TRUE)
map_ftp <- function(url, dirs, recursive = FALSE) {
  require(RCurl, quietly = TRUE)
aaronwolen / wig-sources.csv
Created November 4, 2012 16:08
List of sources of wig/bigwig genomic data
Name,URL
ENCODE,ftp://encodeftp.cse.ucsc.edu/pipeline/hg19/
ENCODE (test),http://hgdownload-test.cse.ucsc.edu/goldenPath/hg19/encodeDCC/
RoadMap,ftp://ftp.genboree.org/EpigenomeAtlas/Current-Release
aaronwolen / load-bigwig.r
Created November 20, 2012 17:08
Example: loading data from a bigWig file
library(IRanges)
library(GenomicRanges)
library(rtracklayer)
# Select a BigWig file
bw.dir <- "/home/chromatin/roadmap/DNase_hypersensitivity/brain_fetal"
# `pattern` is a regular expression, not a glob, so anchor on the extension
bw.file <- dir(bw.dir, full.names = TRUE, pattern = "\\.bigWig$")[1]
# Specify a genomic range
selection <- GRanges(seqnames = "chr4",
aaronwolen / bioc_archive.r
Created November 21, 2012 15:39
Install Bioconductor for older versions of R
install.packages(c("RCurl", "XML"))
bioc.v <- tools:::.BioC_version_associated_with_R_version
repos <- tools:::.read_repositories(file.path(R.home("etc"), "repositories"))
bioc.repo <- repos["BioCsoft",]$URL
bioc.repo <- sub("2\\.\\d+", bioc.v, bioc.repo)
packages <- c("zlibbioc", "BiocGenerics", "Biobase", "IRanges",
"AnnotationDbi", "GenomicRanges", "Biostrings", "Rsamtools",
aaronwolen / clipboard.r
Created January 11, 2013 20:11
Paste directly to Mac clipboard
clipboard <- function(x, sep.lines = FALSE){
  clipboard <- pipe('pbcopy', 'w')
  if(sep.lines){
    x <- unlist(strsplit(as.character(x), split = ","))
    x <- sub(" ", "", x)
  }
  write.table(x, clipboard, sep = "\t",
              quote = FALSE, col.names = FALSE, row.names = FALSE)
aaronwolen / slides.md
Last active November 11, 2022 23:57
Pandoc template to generate reveal.js slideshows.

% Title
% Name
% Date

My first slide

List