@hckiang
Last active December 2, 2022 16:24

FAQ: Bioinformatics Lab 4

Q1: What to do if Bioconductor refuses to install simpleaffy?

Because installing the simpleaffy package from Bioconductor no longer seems to work out of the box, you may try one of the following solutions:

  1. If you are on Linux, you could try downloading the source package and installing it manually. This should work on Mac OS X and Windows as well, but I have not tested it.
  2. Otherwise, you can use the affy package instead.

Option 1: Install simpleaffy from package source

First, download the package source from this page. If you are on Windows, do not download the "Windows Binary"; you want the "package source". Then run the following in the R console:

install.packages('/PATH/TO/YOUR/simpleaffy_2.50.0.tar.gz', type='source')

Of course, if it complains that some dependencies are missing, you need to install them and try again. Installing the dependencies the "normal" way should be fine, i.e., with install.packages('some_dependency') for CRAN packages and with

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install()   # Install Bioconductor's base packages
BiocManager::install('some_dependency')

for Bioconductor packages.
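
To avoid repeating the install-and-retry loop, you could first check which dependencies are missing. The following is only a minimal sketch: missing_packages is a hypothetical helper (not part of BiocManager or simpleaffy), and the package names passed to it are illustrative. The authoritative dependency list is in the DESCRIPTION file inside the source tarball.

```r
# Hypothetical helper: return the names in `pkgs` that are not installed
# in any library on the current library path.
missing_packages <- function(pkgs) {
    setdiff(pkgs, rownames(installed.packages()))
}

# Illustrative dependency list (an assumption; check DESCRIPTION for the real one)
needed <- missing_packages(c("affy", "genefilter", "gcrma"))
print(needed)

# Anything left in `needed` could then be installed in one call, e.g.
# if (length(needed) > 0) BiocManager::install(needed)
```

BiocManager::install accepts a character vector, so several missing packages can be installed in one call.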

Option 2: Using the affy package instead

If you cannot install simpleaffy yourself you could use affy directly to read in the .CEL files. In fact, simpleaffy::read.affy is just a thin wrapper around affy::ReadAffy.

Using the facilities provided by the affy package, we could define our own read.affy as follows:

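library('affy')

# Read the .CEL files listed in a covdesc file and attach the remaining
# covdesc columns as phenotype data.
read.affy <- function(covdesc = "covdesc", path = ".", ...)
{
    # Sample annotation; the row names are the .CEL file names
    samples <- read.AnnotatedDataFrame(paste(path, covdesc, sep = "/"),
        sep = "")
    files.to.read <- rownames(pData(samples))
    files.to.read <- paste(path, files.to.read, sep = "/")
    # Extra arguments are passed on to affy::ReadAffy
    eset <- ReadAffy(filenames = files.to.read, ...)
    # Combine ReadAffy's own phenotype data with the covdesc columns
    newPhenoData <- cbind(pData(eset),
        pData(samples)[rownames(pData(eset)), ])
    colnames(newPhenoData) <- c(colnames(pData(eset)), colnames(pData(samples)))
    # tmp is not used below; kept as in the original source
    tmp <- as.list(colnames(newPhenoData))
    names(tmp) <- colnames(newPhenoData)
    newPhenoData <- as(newPhenoData, "AnnotatedDataFrame")
    phenoData(eset) <- newPhenoData
    return(eset)
}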

The above code is taken from simpleaffy version 2.50.0, which is distributed under GPL (>= 2).
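
To illustrate how read.affy turns the covdesc file into a list of files to read, here is a small base-R sketch that needs neither affy nor real .CEL files. It assumes a whitespace-separated covdesc whose header line omits the file-name column, and it uses read.table as a stand-in for read.AnnotatedDataFrame. The file names and the treatment column are made up for the example.

```r
# Write a hypothetical covdesc file: the header names only the phenotype
# column, and each following row starts with a .CEL file name.
data_dir <- tempdir()
covdesc_path <- file.path(data_dir, "covdesc")
writeLines(c("treatment",
             "sample1.CEL control",
             "sample2.CEL treated"),
           covdesc_path)

# The header has one field fewer than the data rows, so read.table
# uses the first column as row names.
samples <- read.table(covdesc_path, header = TRUE, sep = "",
                      stringsAsFactors = FALSE)

# Same idea as files.to.read in read.affy: row names become file paths
files_to_read <- file.path(data_dir, rownames(samples))
print(samples)
print(basename(files_to_read))
```

With real data, placing the .CEL files and the covdesc file in one directory and calling read.affy(path = "that/directory") should follow the same steps.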

Q2: What to do if I see a pthread-related error message?

On many computers, depending on how your R was compiled, what your ~/.R/Makevars looks like, and where your libblas.so came from, you may see an error message like this

ERROR; return code from pthread_create() is 22

when you run gcrma::gcrma. This is very likely not a bug in the gcrma package but rather a problem in OpenBLAS and GNU's libc (1)(2).

The easiest way to get rid of this is perhaps to just disable multithreading by reinstalling the preprocessCore package:

BiocManager::install("preprocessCore", configure.args="--disable-threading", force = TRUE)

Alternatively, very ambitious students may try to reconfigure and recompile R so that both R and all packages are linked against another BLAS implementation or a different libc. I cannot give much advice on this, as OS-specific issues can be very complicated. If you choose to try this, note that OpenBLAS is NOT thread-safe unless some environment variables are set correctly (but these environment variables won't solve the pthread_create() bug).
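
Before attempting a rebuild, it may help to check which BLAS and LAPACK libraries your R session is actually using. The sketch below relies on extSoftVersion() and La_library(), which are available in R 3.4.0 and later. The variable names are made up for the example, and the reported value may be empty or a generic path on some platforms, so treat the result as a hint rather than proof.

```r
# Paths (or descriptions) of the BLAS and LAPACK libraries in use
blas_info <- c(
    BLAS   = unname(extSoftVersion()["BLAS"]),
    LAPACK = La_library()
)
print(blas_info)

# Rough heuristic: does either path mention OpenBLAS?
uses_openblas <- any(grepl("openblas", blas_info, ignore.case = TRUE))
print(uses_openblas)
```

sessionInfo() prints the same BLAS and LAPACK information together with the loaded package versions.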
