andreagrioni/limma_camera_dev.Rmd

## limma_camera_dev.Rmd

## Limma::Camera pipeline

This block of code will run camera: Competitive Gene Set Test Accounting for Inter-gene Correlation.

See reference here:

Wu, D, and Smyth, GK (2012). Camera: a competitive gene set test accounting for inter-gene correlation. Nucleic Acids Research 40, e133.
http://nar.oxfordjournals.org/content/40/17/e133

set `gs_cat` of interest.

```{r}
# select msigndb categories
target_cat <- c("H", "C5")
```

convert `msign_polish` table to list of gene sets with `SeqId` as value.

```{r, eval=FALSE}
# subset gs_name and SeqId from main table
msign_polish %>%
  dplyr::filter(
    gs_cat %in% target_cat
  ) %>%
  dplyr::select(
    dplyr::all_of(
      c("gs_name", "SeqId")
  )) %>%
  dplyr::distinct() -> term2SeqId
```

limma::camera needs the expression set used for the differential expression analysis.

```{r}
somapp_output$somapp_expr_input -> somapp_exprs
```

convert `term2SeqId` to named list of gene sets and SeqIds.

```{r, eval=FALSE}
# create a named list of gene sets from `term2SeqId` df
placeholder <- split(term2SeqId, term2SeqId$gs_name)
# simplify placeholder to SeqIds only
camera_map <- lapply(placeholder, function(x) x$SeqId)
# map camera_map SeqIds to Expression Set row index;
# basically, you replace SeqId with its corresponding
# rowname index
limma::ids2indices(
  camera_map,
  Biobase::exprs(somapp_exprs) %>% rownames
  ) -> SeqId2Index
```

## Run `limma::camera`

define `limma::camera` function for gene set enrichment analysis with intergene correlation.

```{r}
#' camera.run
#' run gene set enrichment analysis
#' with limma::cameraPR.
#' see publication https://doi.org/10.1093/nar/gks461
#' @param limma.results output of limma topTable
#' @param gene_sets a named list of gene sets and SeqIds
#' @param inter.gene.cor a default value for intergene correlation (default 0.01)
camera.run <- function(
    limma.results,
    gene_sets,
    inter.gene.cor=0.01
    ) {
  if (all(is.na(limma.results$t))) {
    return(NULL)
  }
  # run camera pre-rank
  # analysis
  limma::cameraPR(
    limma.results$t,
    gene_sets,
    inter.gene.cor=inter.gene.cor) -> gs_enrich
  # calculate qvalue
  placeholder <- try(qvalue::qvalue(gs_enrich$FDR))
  # add qvalue col to gs_enrich
  if (any(class(placeholder) == "try-error")) {
    gs_enrich$qvalue <- NA
  } else {
    gs_enrich$qvalue <- placeholder$qvalues
  }
  return(gs_enrich)
}
```

TODO check how to break genesets into groups rather than run all in one.


```{r, eval=FALSE}

degs_tmp <- split(limma_result, limma_result$coeff)

# iterate camera.run on limma outputs
lapply(
  degs_tmp,
  camera.run,
  gene_sets=SeqId2Index
  ) -> camera.out
```

	## Limma::Camera pipeline

	This block of code will run camera: Competitive Gene Set Test Accounting for Inter-gene Correlation.

	See reference here:

	Wu, D, and Smyth, GK (2012). Camera: a competitive gene set test accounting for inter-gene correlation. Nucleic Acids Research 40, e133.
	http://nar.oxfordjournals.org/content/40/17/e133

	set `gs_cat` of interest.

	```{r}
	# select msigndb categories
	target_cat <- c("H", "C5")
	```

	convert `msign_polish` table to list of gene sets with `SeqId` as value.

	```{r, eval=FALSE}
	# subset gs_name and SeqId from main table
	msign_polish %>%
	dplyr::filter(
	gs_cat %in% target_cat
	) %>%
	dplyr::select(
	dplyr::all_of(
	c("gs_name", "SeqId")
	)) %>%
	dplyr::distinct() -> term2SeqId
	```

	limma::camera needs the expression set used for the differential expression analysis.

	```{r}
	somapp_output$somapp_expr_input -> somapp_exprs
	```

	convert `term2SeqId` to named list of gene sets and SeqIds.

	```{r, eval=FALSE}
	# create a named list of gene sets from `term2SeqId` df
	placeholder <- split(term2SeqId, term2SeqId$gs_name)
	# simplify placeholder to SeqIds only
	camera_map <- lapply(placeholder, function(x) x$SeqId)
	# map camera_map SeqIds to Expression Set row index;
	# basically, you replace SeqId with its corresponding
	# rowname index
	limma::ids2indices(
	camera_map,
	Biobase::exprs(somapp_exprs) %>% rownames
	) -> SeqId2Index
	```

	## Run `limma::camera`

	define `limma::camera` function for gene set enrichment analysis with intergene correlation.

	```{r}
	#' camera.run
	#' run gene set enrichment analysis
	#' with limma::cameraPR.
	#' see publication https://doi.org/10.1093/nar/gks461
	#' @param limma.results output of limma topTable
	#' @param gene_sets a named list of gene sets and SeqIds
	#' @param inter.gene.cor a default value for intergene correlation (default 0.01)
	camera.run <- function(
	limma.results,
	gene_sets,
	inter.gene.cor=0.01
	) {
	if (all(is.na(limma.results$t))) {
	return(NULL)
	}
	# run camera pre-rank
	# analysis
	limma::cameraPR(
	limma.results$t,
	gene_sets,
	inter.gene.cor=inter.gene.cor) -> gs_enrich
	# calculate qvalue
	placeholder <- try(qvalue::qvalue(gs_enrich$FDR))
	# add qvalue col to gs_enrich
	if (any(class(placeholder) == "try-error")) {
	gs_enrich$qvalue <- NA
	} else {
	gs_enrich$qvalue <- placeholder$qvalues
	}
	return(gs_enrich)
	}
	```

	TODO check how to break genesets into groups rather than run all in one.


	```{r, eval=FALSE}

	degs_tmp <- split(limma_result, limma_result$coeff)

	# iterate camera.run on limma outputs
	lapply(
	degs_tmp,
	camera.run,
	gene_sets=SeqId2Index
	) -> camera.out
	```