Mike Smith grimbough

grimbough / fastq-memory.R
Last active August 29, 2015 14:18
Examining memory usage in ShortReadQ class
library(ShortRead)
## create a single string with the number of characters defined by 'length'
## and the letter sampled from 'alphabet'
generateString <- function(length = 50, alphabet = c("A","C","G","T") ) {
    paste0(sample(alphabet, size = length, replace = TRUE), collapse = "")
}
## create a ShortReadQ object containing 'nreads' reads, each of which has length 'length'
createFastq <- function(nreads, length) {
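
The preview above cuts off before the body of `createFastq`, so that part is left as-is. As a rough sketch of how `generateString` on its own could be exercised in base R (the seed, the number of reads, and the read length are arbitrary choices for illustration, not values from the gist):

```r
## same definition as in the gist preview
generateString <- function(length = 50, alphabet = c("A","C","G","T")) {
    paste0(sample(alphabet, size = length, replace = TRUE), collapse = "")
}

set.seed(1)  # arbitrary seed, only so the illustration is repeatable

## five hypothetical reads of 20 bases each, held as a plain character vector
reads <- vapply(1:5, function(i) generateString(length = 20), character(1))

## every read should have the requested length
nchar(reads)

## size of the plain character vector, as a baseline for comparison
print(utils::object.size(reads), units = "B")
```

This only measures a base R character vector; it says nothing about the footprint of a `ShortReadQ` object, which is what the gist itself examines.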
grimbough / BioMart_query.R
Last active October 28, 2017 22:16
Query BioMart with httr and RCurl
fullXmlQuery <- "<?xml version='1.0' encoding='UTF-8'?> <!DOCTYPE Query>
<Query virtualSchemaName = 'default' uniqueRows = '1' count = '0' datasetConfigVersion = '0.6' header='1' requestid= 'biomaRt'>
<Dataset name = 'hsapiens_gene_ensembl'>
<Attribute name = 'ensembl_transcript_id'/>
<Attribute name = 'ensembl_exon_id'/>
<Attribute name = 'rank'/>
<Attribute name = 'genomic_coding_start'/>
<Attribute name = 'cds_start'/>
<Attribute name = '5_utr_start'/>
</Dataset>
fullXmlQuery <- "<?xml version='1.0' encoding='UTF-8'?> <!DOCTYPE Query>
<Query virtualSchemaName = 'default' uniqueRows = '1' count = '0' datasetConfigVersion = '0.6' header='1' requestid= 'biomaRt'>
<Dataset name = 'hsapiens_gene_ensembl'>
<Attribute name = 'phenotype_description'/>
<Attribute name = 'ensembl_gene_id'/>
<Filter name = 'ensembl_gene_id' value = 'ENSG00000094804'/>
</Dataset>
</Query>"
rcurl_return <- RCurl::postForm(uri = "http://www.ensembl.org:80/biomart/martservice?",
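
Both query strings above follow the same pattern: a fixed header, a dataset, a list of attributes, and optionally a filter. A minimal sketch of assembling such a string programmatically is given here; `build_biomart_query` is a hypothetical helper introduced for illustration and is not part of the gist, biomaRt, httr, or RCurl. It only builds the text and makes no network request.

```r
## Hypothetical helper (not from the gist): assemble a BioMart XML query string.
## 'filters' is a named character vector, e.g. c(filter_name = "value").
build_biomart_query <- function(dataset, attributes, filters = character()) {
    header <- paste0(
        "<?xml version='1.0' encoding='UTF-8'?> <!DOCTYPE Query>\n",
        "<Query virtualSchemaName = 'default' uniqueRows = '1' count = '0' ",
        "datasetConfigVersion = '0.6' header='1' requestid= 'biomaRt'>")
    attr_xml <- sprintf("<Attribute name = '%s'/>", attributes)
    filt_xml <- if (length(filters) > 0) {
        sprintf("<Filter name = '%s' value = '%s'/>", names(filters), filters)
    } else {
        character()
    }
    paste(c(header,
            sprintf("<Dataset name = '%s'>", dataset),
            attr_xml, filt_xml,
            "</Dataset>", "</Query>"),
          collapse = "\n")
}

## rebuild the second query shown in the preview
query <- build_biomart_query(
    dataset    = "hsapiens_gene_ensembl",
    attributes = c("phenotype_description", "ensembl_gene_id"),
    filters    = c(ensembl_gene_id = "ENSG00000094804"))
cat(query, "\n")
```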
grimbough / BDAvSVD.R
Created February 25, 2018 18:36
Timing BigDataAlgorithms::rsvd against base::svd
library(TENxBrainData)
library(ggplot2)
library(tidyr)
tenx <- TENxBrainData()
ncols <- c(1,2,5,10,20,50,100,200,500,1000)
svd <- bda <- numeric(length = length(ncols))
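
The preview stops before the timing loop. A minimal sketch of what timing `base::svd` over an increasing number of columns might look like is shown here, under several assumptions: a small random matrix stands in for the TENxBrainData counts, the row count and the shortened `ncols` vector are arbitrary, and the `BigDataAlgorithms::rsvd` half of the comparison is omitted because its interface is not shown in the preview. The result vector is named `svd_time` rather than `svd` purely for readability.

```r
set.seed(42)                        # arbitrary seed for the synthetic data
ncols <- c(1, 2, 5, 10, 20, 50)     # shortened version of the gist's 'ncols'
nrows <- 200                        # hypothetical row count

## synthetic stand-in for the real count matrix
mat <- matrix(rnorm(nrows * max(ncols)), nrow = nrows)

svd_time <- numeric(length = length(ncols))
for (i in seq_along(ncols)) {
    ## take the first ncols[i] columns, keeping the matrix shape
    sub <- mat[, seq_len(ncols[i]), drop = FALSE]
    svd_time[i] <- system.time(base::svd(sub))[["elapsed"]]
}

data.frame(ncols = ncols, svd_time = svd_time)
```

On a matrix this small the elapsed times will mostly be at or near zero; the structure of the loop is the point, not the numbers.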
grimbough / hierarchical_layout.R
Created August 3, 2018 20:37
Demonstrating laying out plots in a 'tree' structure using gridExtra
library(ggplot2)
library(gridExtra)
## create 17 plots, where the only element is a number in the centre
res <- list()
for(i in 1:17) {
    x <- data.frame(x = 1, y = 1, num = as.character(i))
    res[[i]] <- ggplot(x, aes(x = x, y = y, label = num)) +
        geom_text() +
        theme_void() +
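
The layout step itself is not visible in the preview. One way a 'tree' arrangement could be described is with a layout matrix, where each cell holds the index of the plot that occupies it and a repeated index makes a plot span several cells. The sketch below builds such a matrix in base R for a hypothetical seven-plot tree (one root, two children, four leaves); the shape of the tree used in the gist for its 17 plots is not known.

```r
## Hypothetical 3-level tree: 1 plot, then 2, then 4.
## Each row is one level; plots on upper levels span several columns.
lay <- rbind(rep(1, 4),              # root spans all four columns
             rep(2:3, each = 2),     # two children, two columns each
             4:7)                    # four leaves, one column each
lay
```

A matrix like this could then be passed as the `layout_matrix` argument of `gridExtra::grid.arrange()` together with a list of plots in `grobs`.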
grimbough / hyperslab_selection_benchmark.R
Last active November 13, 2018 22:16
Testing performance of H5Sselect_index in the rhdf5 package
library(rhdf5)
hslab_bm <- function(m = 10000, n = 10000) {
    ## TENxBrainData
    fid <- H5Fopen('/media/Storage/Work/ExperimentHub/1040')
    did <- H5Dopen(h5loc = fid, 'counts')
    sid <- H5Dget_space(did)
    ## select first 100 rows and a random (ordered) sample of columns
    index <- list(1:100,
grimbough / IF015_example.R
Last active November 20, 2018 17:40
Modified example of code provided in https://support.bioconductor.org/p/115224/ to identify error thrown by EBImage
WD <- tempdir()
setwd(WD)
library(EBImage)
library(tidyverse)
## download example zip file and unpack
download.file('https://www.dropbox.com/s/40e5tfohgdm9wlo/IF015_plate002_test.zip?dl=1',
destfile = 'IF015_plate002.zip', mode = "wb")
unzip('IF015_plate002.zip')
grimbough / benchmark_index2hyperslab.R
Created January 22, 2019 16:21
Benchmarking improvements in rhdf5 index to hyperslab selection
BiocManager::install("grimbough/rhdf5", ref = "f06ab6f",
update = FALSE, ask = TRUE, INSTALL_opts = c('--no-lock'))
suppressPackageStartupMessages(library(rhdf5))
suppressPackageStartupMessages(library(microbenchmark))
suppressPackageStartupMessages(library(ggplot2))
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(stringr))
h5file <- "/tmpdata/msmith/ExperimentHub/2230"
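
For context on what "index to hyperslab" conversion involves: a sorted vector of indices can be collapsed into contiguous blocks, each described by a start position and a count, and each block can then be selected as one hyperslab. The sketch below shows that idea in base R. `index_to_blocks` is a hypothetical function written for illustration; it is not the rhdf5 implementation and makes no claim about how rhdf5 performs the conversion.

```r
## Hypothetical helper: collapse a strictly increasing index vector into
## contiguous (start, count) blocks.
index_to_blocks <- function(idx) {
    stopifnot(length(idx) > 0, !is.unsorted(idx, strictly = TRUE))
    ## positions after which a run of consecutive indices ends
    breaks <- c(0, which(diff(idx) != 1), length(idx))
    data.frame(start = idx[breaks[-length(breaks)] + 1],
               count = diff(breaks))
}

blocks <- index_to_blocks(c(1, 2, 3, 7, 8, 10))
blocks
```

Here the six indices collapse into three blocks, so three selections would be needed instead of six. The fewer and longer the runs, the fewer selections are required.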
grimbough / example.Rmd
Created June 3, 2019 14:48
Reusing code chunks from an existing Rmarkdown document
This code chunk will extract all code blocks from `example.Rmd` and write them to a temporary file, preserving chunk options. It will then read them and make those chunks available in the current R environment when the presentation is knitted.
```{r include, eval=TRUE, echo=FALSE, include = FALSE}
tmp_script <- tempfile()
knitr::purl("example.Rmd", output=tmp_script, quiet = TRUE)
knitr::read_chunk(tmp_script)
```
We can then include a chunk from `example.Rmd` like so:
grimbough / h5_combine_vs_select_hslab.c
Created March 5, 2021 14:05
Comparing H5Sselect_hyperslab and H5Scombine_hyperslab
/*
* Confirming hyperslab issue remains when combining two selections
* using H5Scombine_select
*/
#include <hdf5.h>
#include <stdlib.h>
#define FILE "subset.h5"
#define DATASETNAME "IntArray"