Skip to content

Instantly share code, notes, and snippets.

View kieranrcampbell's full-sized avatar

Kieran R Campbell kieranrcampbell

View GitHub Profile
@kieranrcampbell
kieranrcampbell / reproduce-tscan.R
Last active May 20, 2016 14:06
TSCAN applied to small subset of monocle data
library(TSCAN)
set.seed(1)
load("X.rda")
dim(X)
## [1] 228 5
ts <- exprmclust(t(X))
// [[Rcpp::export]]
NumericMatrix sample_tau_pg(NumericMatrix beta,
double a_beta, double b_beta) {
int P = beta.nrow();
int G = beta.ncol();
NumericMatrix tau_pg(P, G);
for(int p = 0; p < P; p++) {
for(int g = 0; g < G; g++) {
@kieranrcampbell
kieranrcampbell / calc_tpr_fpr_fdr.R
Created March 16, 2017 15:39
Calculate true positive rate, false positive rate & false discovery rate from contingency table in R
## Suppose we have a contigency table tbl formed by the table(...) command in R, with
## a logical vector of discoveries as the first argument and a logical vector of
## the ground truth as the second, e.g. tbl <- table(discoveries, ground_truth), then
## this function calculates the true positive rate, false positive rate and false discovery rate
## as per the wikipedia definition at https://en.wikipedia.org/wiki/Sensitivity_and_specificity
calculate_statistics <- function(tbl) {
P <- sum(tbl[,2])
N <- sum(tbl[,1])
TP <- tbl[2,2]
from edward.models import RandomVariable
import tensorflow as tf
from tensorflow.contrib.distributions import Distribution
class Weibull(RandomVariable, Distribution):
"""Weibull distribution
The Weibull distribution is defined over the non-negative real numbers.
@kieranrcampbell
kieranrcampbell / go_analysis.R
Created May 3, 2017 14:07
Gene ontology enrichment analysis with Goseq
# A function that takes a set of genes and a background list and goes through the
# steps in goseq to output the enriched list
library(goseq)
#' Run all steps for a goseq analysis
#'
#' @param genes The genes "differentially expressed" or in the condition
#' @param all_genes The background gene set
#' @param genome The genome of genes and all_genes, e.g. mm10 or hg19
@kieranrcampbell
kieranrcampbell / snakefile_for_rmarkdown
Created May 12, 2017 08:55
Snakemake to build Rmarkdown files
rule exploratory_analysis:
input:
"data/tec_sceset_qc.rds", "analysis/02_exploratory_analysis.html"
output:
"data/tec_sceset_clusters.rds",
"analysis/02_exploratory_analysis.Rmd",
"data/deseq2_results.csv"
shell:
"Rscript -e \"rmarkdown::render('analysis/02_exploratory_analysis.Rmd')\""
@kieranrcampbell
kieranrcampbell / goplot.R
Created May 15, 2017 09:06
Plot the p-values of a GO analysis
library(stringr)
library(ggplot)
library(dplyr)
## Assuming "go" has columns "term" and "q_value"
go_top <- head(go, n = 10)
go_top$term <- str_to_title(go_top$term)
terms_sorted <- arrange(go_top, desc(q_value)) %>% .$term
go_top$term <- factor(go_top$term, levels = terms_sorted)
## You can show that the estimate of fg is wrong *only*
## when m_beta = 1 and m_c = 1 and xx = 1 and that in
## such a case it appears to overcount by 2, implying term 11 in the derivation is wrong
m_alpha <- 0
s_alpha <- 1
m_beta <- 1
s_beta <- 1
m_t <- 0
s_t <- 1
@kieranrcampbell
kieranrcampbell / overdispersion-analysis.R
Created June 15, 2017 09:55
Brief example of scran's method for "Accounting for technical noise in single-cell RNA-seq experiments"
library(stringr)
library(ggplot2)
library(scran)
library(scater)
library(dplyr)
# Step 1: do the overdispersion analysis
sc2 <- sce[rowMeans(round(exprs(sce)) > 0) > 0, ] # Keep only genes that are expressed
is_ercc <- grepl("NA_ERCC", featureNames(sc2)) # Your ERCCs might be named differently to mine!
# Nice colour palette for visualising copy number profiles
# inspired by DLP paper
cnv_cols <- c("0" = "#deebf7",
"1" = "#9ecae1",
"2" = "grey80",
"3" = "#fdae6b",
"4" = "#e6550d")
# Factors for chromosomes
chr_levels <- c(as.character(1:23), "X", "Y")