hckiang/FAQ: Lab 4.md

## FAQ: Lab 4.md

      
    FAQ: Bioinformatics Lab 4

Q1: What should I do if Bioconductor refuses to install simpleaffy?

Because installing the simpleaffy package from Bioconductor
no longer seems to work out of the box, you may try one of the
following solutions:

If you are on Linux, you could try downloading the source
package and installing it manually. This should work on macOS
and Windows as well, but I have not tested it.
Otherwise, you can use the affy package instead.

Option 1: Install simpleaffy from package source

First, download the package source from this page. If you are on Windows, don't download the "Windows Binary"; you want the "package source". Then run the following in the R console:
install.packages('/PATH/TO/YOUR/simpleaffy_2.50.0.tar.gz', repos = NULL, type = 'source')

Of course, if it complains that some dependencies are missing,
you need to install them and try again. Installing the dependencies
the "normal" way should be fine, i.e., with
install.packages('some_dependency') for CRAN packages and with
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install()   # Install Bioconductor's base packages
BiocManager::install('some_dependency')

for Bioconductor packages.
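To see in advance which dependencies are still missing, something like the following sketch may help. The dependency list here is an assumption based on what simpleaffy 2.50.0 is believed to declare; the DESCRIPTION file inside the tarball is the authoritative source. The helper name missing_packages is made up for this example.

```r
# Assumed list of non-base dependencies of simpleaffy; check the DESCRIPTION
# file in the source tarball for the authoritative list.
deps <- c("BiocGenerics", "Biobase", "affy", "genefilter", "gcrma")

# Hypothetical helper: return the packages in 'pkgs' that are not installed
# in any library known to this R session.
missing_packages <- function(pkgs) {
    setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(deps)
if (length(to_install) > 0) {
    message("Still missing: ", paste(to_install, collapse = ", "))
} else {
    message("All listed dependencies are installed.")
}
```

Anything reported as missing can then be installed with install.packages or BiocManager::install as described above.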

Option 2: Use the affy package instead

If you cannot install simpleaffy yourself, you could use affy
directly to read in the .CEL files. In fact,
simpleaffy::read.affy is just a thin wrapper around
affy::ReadAffy.
Using the facilities provided by the affy package, we can define
our own read.affy as follows:
library('affy')

# Read the .CEL files listed in a whitespace-delimited sample sheet
# ('covdesc') whose first column holds the file names.
read.affy <- function (covdesc = "covdesc", path = ".", ...)
{
    # Sample annotations; the row names are the .CEL file names
    samples <- read.AnnotatedDataFrame(paste(path, covdesc, sep = "/"),
        sep = "")
    files.to.read <- rownames(pData(samples))
    files.to.read <- paste(path, files.to.read, sep = "/")
    # Extra arguments in ... are passed on to affy::ReadAffy
    eset <- ReadAffy(filenames = files.to.read, ...)
    # Append the annotation columns to the phenotype data of the result
    newPhenoData <- cbind(pData(eset), pData(samples)[rownames(pData(eset)),
        ])
    colnames(newPhenoData) <- c(colnames(pData(eset)), colnames(pData(samples)))
    tmp <- as.list(colnames(newPhenoData))
    names(tmp) <- colnames(newPhenoData)
    newPhenoData <- as(newPhenoData, "AnnotatedDataFrame")
    phenoData(eset) <- newPhenoData
    return(eset)
}

The code above is taken from simpleaffy version 2.50.0, which is distributed
under GPL (>= 2).
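As a rough illustration of the input this function expects, the sketch below builds a tiny sample sheet and reads it back with base R only. The layout (a header line naming the covariates, then one line per .CEL file with the file name first) is inferred from how the function above reads the file; the file names, the treatment column, and the temporary directory are all made up for the example.

```r
# Hypothetical data directory and sample sheet; names are for illustration.
data_dir <- tempfile("lab4_")
dir.create(data_dir)
writeLines(c("treatment",
             "sample1.CEL control",
             "sample2.CEL control",
             "sample3.CEL treated"),
           file.path(data_dir, "covdesc"))

# Base-R equivalent of the first step of read.affy: whitespace-delimited,
# with the first column used as row names.
samples <- read.table(file.path(data_dir, "covdesc"), header = TRUE,
                      row.names = 1, sep = "")

# These are the paths that would be handed to affy::ReadAffy.
files_to_read <- paste(data_dir, rownames(samples), sep = "/")
print(samples)

# With real .CEL files in data_dir, the call would look like:
# eset <- read.affy(covdesc = "covdesc", path = data_dir)
```

The actual covdesc file for the lab may contain different or additional columns; only the first column (the file names) matters for locating the .CEL files.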

Q2: What should I do if I see a pthread-related error message?

On many computers, depending on how your R was compiled, what your
~/.R/Makevars looks like, and where your libblas.so came from,
you may see an error message that looks something like this:
ERROR; return code from pthread_create() is 22

when you run gcrma::gcrma. This is very likely not a bug in
the gcrma package but rather a problem in OpenBLAS and GNU's libc
(1)(2).
The easiest way to get rid of this error is perhaps to disable
multithreading by reinstalling the preprocessCore package:
BiocManager::install("preprocessCore", configure.args="--disable-threading", force = TRUE)

Alternatively, very ambitious students may try to recompile and
reconfigure R so that both R and all packages are linked against another
BLAS implementation or a different libc. I cannot give much advice on this, as
OS-specific issues can be very complicated. If you choose to try this,
note that OpenBLAS is NOT thread-safe unless some environment variables
are set correctly
(but these environment variables won't solve the pthread_create() bug).
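Before going down that road, it may help to check which BLAS your R session is actually linked against. The sketch below assumes R 3.4.0 or later, where extSoftVersion() reports the BLAS library and La_library() reports the LAPACK library; the "openblas" substring test is only a heuristic, since the library path does not always reveal the implementation.

```r
# Path of the BLAS shared library used by this R session
# (may be an empty string if R cannot determine it).
blas_path <- extSoftVersion()[["BLAS"]]

# Path of the LAPACK library in use.
lapack_path <- La_library()

cat("BLAS:  ", blas_path, "\n")
cat("LAPACK:", lapack_path, "\n")

# Heuristic only: a path containing 'openblas' suggests OpenBLAS is in use.
uses_openblas <- grepl("openblas", blas_path, ignore.case = TRUE)
cat("Looks like OpenBLAS:", uses_openblas, "\n")
```

The same information also appears near the top of the output of sessionInfo().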