
@arraytools
arraytools / EMgenes.R
Created May 3, 2022
Epithelial and mesenchymal gene signatures from Tan et al. 2014, 'Table S1B. Generic EMT signature for cell line'. See also https://bioconductor.org/packages/release/bioc/vignettes/singscore/inst/doc/singscore.html#22_Sample_scoring_with_a_reduced_number_of_measurements
View EMgenes.R
library(tidyr) # provides drop_na() and re-exports the %>% pipe
x <- readxl::read_excel("~/Downloads/EMset.xlsx")
x %>% drop_na() %>% write.table("~/Downloads/EMset.txt", quote=F, row.names = F)
x2 <- read.table("~/Downloads/EMset.txt")
dim(x2) # 218 x 2
epi <- x2[1:170, 2]
mes <- x2[171:218, 2]
x3 <- data.frame(symbol=x2[,2], epimes=c(rep("epi", 170), rep("mes", 48)))
x3[, 1] <- gsub(substr(x3[1,1], 1, 1), "", x3[,1]) # remove every occurrence of the first character of the first symbol (apparently a whitespace-like character?)
write.table(x3, file="~/Downloads/EMgenes.txt", quote=F, row.names = F, sep="\t")
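The two-column table written above (symbol, epimes) could be split into separate epithelial and mesenchymal vectors before being passed to a gene-set scoring tool. A minimal base-R sketch, assuming a data frame shaped like x3; the toy symbols are illustrative placeholders and are not read from Table S1B, and the names toy, sets, epi_set and mes_set are hypothetical:

```r
# Toy stand-in for x3; the symbols are illustrative only.
toy <- data.frame(symbol = c(" CDH1", " EPCAM", " VIM", " ZEB1"),
                  epimes = c("epi", "epi", "mes", "mes"),
                  stringsAsFactors = FALSE)

# trimws() strips ordinary leading/trailing whitespace (space, tab, newline).
# A non-breaking space would need a different `whitespace` pattern.
toy$symbol <- trimws(toy$symbol)

# split() returns a named list with one character vector per label.
sets <- split(toy$symbol, toy$epimes)
epi_set <- sets[["epi"]]
mes_set <- sets[["mes"]]
```

Splitting on the label column avoids hard-coding the row ranges 1:170 and 171:218, so the result stays correct if rows are reordered.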
arraytools / interObj.R
Created Jan 26, 2022
Demonstration of interaction model
View interObj.R
df <-
structure(list(muc6 = c(0.65991920232772827, 12.234284400939941,
7.5393352508544922, 13.921844482421875, 4.8508410453796387, 0,
6.7801151275634766, 5.5127406120300293, 1.030299186706543, 3.5845346450805664,
0.98191821575164795, 0, 0, 1.052558422088623, 0, 0, 2.5294754505157471,
0.59700602293014526, 0, 0, 7.0485143661499023, 0, 0, 0, 1.1008850336074829,
5.777961254119873, 6.1289887428283691, 0, 4.9814505577087402,
0, 5.4703717231750488, 8.0720577239990234, 0), tumortype = c("Carcinoma",
"sarcoma", "Carcinoma", "Carcinoma", "sarcoma", "sarcoma", "Carcinoma",
"Carcinoma", "sarcoma", "sarcoma", "Carcinoma", "Carcinoma",
arraytools / glmnet_surv_code
Last active Dec 26, 2020
Survival data used for glmnet. n=78, p=101.
View glmnet_surv_code
library(glmnet)
x <- dget(url("https://gist.githubusercontent.com/arraytools/238e812555d69cb0213adaf99353c25f/raw/911bd4c6810fff6483521808cd1dfa1833891eb7/glmnet_surv_x"))
y <- dget(url("https://gist.githubusercontent.com/arraytools/238e812555d69cb0213adaf99353c25f/raw/911bd4c6810fff6483521808cd1dfa1833891eb7/glmnet_surv_y"))
lambda <- dget(url("https://gist.githubusercontent.com/arraytools/238e812555d69cb0213adaf99353c25f/raw/911bd4c6810fff6483521808cd1dfa1833891eb7/glmnet_surv_lambda"))
cvfit <- cv.glmnet(x, y, family = "cox", nfolds=10)
fit <- glmnet(x, y, family = "cox", lambda = lambda)
coef.cv <- coef(cvfit, s = lambda)
coef.fit <- coef(fit)
length(coef.cv[coef.cv != 0]) # 31
arraytools / embed.c
Created Nov 24, 2020
An example from Bioconductor workshop
View embed.c
#include <Rembedded.h>
#include <Rdefines.h>
static void doSplinesExample();
int
main(int argc, char *argv[])
{
Rf_initEmbeddedR(argc, argv);
doSplinesExample();
Rf_endEmbeddedR(0);
arraytools / MASS_vs_mvtnorm.R
Created Nov 24, 2020
R generate multivariate normal
View MASS_vs_mvtnorm.R
set.seed(1234)
junk <- biospear::simdata(n=500, p=500, q.main = 10, q.inter = 10,
prob.tt = .5, m0=1, alpha.tt= -.5,
beta.main= -.5, beta.inter= -.5, b.corr = .7, b.corr.by=25,
wei.shape = 1, recr=3, fu=2, timefactor=1)
## Method 1: MASS::mvrnorm()
## This is what simdata() uses. It gives different numbers on different operating systems.
##
library(MASS)
set.seed(1234)
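As a point of comparison with the package routes named in the file title, here is a base-R sketch that draws multivariate normal samples through a Cholesky factor. This is a swapped-in technique shown only to make the transformation explicit; it is not claimed to be what simdata(), MASS::mvrnorm() or mvtnorm does. The function name rmvn_chol and the example Sigma are hypothetical:

```r
# Draw n samples from N(mu, Sigma) using the Cholesky factor of Sigma.
rmvn_chol <- function(n, mu, Sigma) {
  p <- length(mu)
  R <- chol(Sigma)                    # upper triangular, t(R) %*% R equals Sigma
  Z <- matrix(rnorm(n * p), n, p)     # independent standard normal draws
  sweep(Z %*% R, 2, mu, "+")          # rows of Z %*% R have covariance Sigma
}

set.seed(1234)
Sigma <- matrix(c(1, 0.7, 0.7, 1), 2, 2)
X <- rmvn_chol(5000, mu = c(0, 0), Sigma = Sigma)
dim(X)
round(cov(X), 2)                      # should be close to Sigma
```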
arraytools / randomData.R
Created Aug 27, 2020
test glmnet with random data
View randomData.R
library(glmnet)
# Binary data
n = 1000
p = 100
nzc = trunc(p/10)
for(i in 1:100) {
cat(i, " ")
if (i %% 10 == 0) cat("\n")
x = matrix(rnorm(n * p), n, p)
arraytools / readsum.R
Last active Mar 29, 2020
Three programming languages to compute the sum of a numerical matrix (with column header)
View readsum.R
x <- read.delim("GX_datab31.txt")
dim(x)
# [1] 1579 463
sum(x)
sum(x[1, ])
options(digits = 20)
sum(x)
# [1] 12589441.434223400429
sum(x[1,])
# [1] 4787.087760000000344
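The same read-and-sum steps can be tried without the original data file. A small sketch, assuming a tab-delimited file with a header row; the temporary file and its values are made up for illustration, and print(..., digits = ) is used as a local alternative to changing options(digits = ) globally:

```r
# Write a tiny tab-delimited matrix with a column header to a temp file.
tmp <- tempfile(fileext = ".txt")
writeLines(c("a\tb\tc",
             "1.5\t2\t3",
             "4\t5\t6.25"), tmp)

m <- read.delim(tmp)        # header = TRUE and sep = "\t" are the defaults
dim(m)                      # 2 rows, 3 columns

total <- sum(m)             # sum over all cells of an all-numeric data frame
first_row <- sum(m[1, ])    # sum over the first data row

print(total, digits = 17)   # more digits for this call only
unlink(tmp)
```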
arraytools / TPM_rsem_tximport_DESeq2.R
Last active Dec 11, 2019
import TPM for gene level analysis in DESeq2
View TPM_rsem_tximport_DESeq2.R
# This is a note about importing RSEM-generated files for the DESeq2 package.
# As described in the tximport vignette, the method below uses the gene-level estimated counts from the quantification tool and additionally uses the transcript-level abundance estimates to calculate a gene-level offset that corrects for changes in the average transcript length across samples.
# This approach corrects for potential changes in gene length across samples (e.g. from differential isoform usage).
# References:
# 1. http://bioconductor.org/packages/release/bioc/vignettes/tximport/inst/doc/tximport.html#rsem
# 2. https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#differential-expression-analysis
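The import described in the note can be sketched as a small helper. This assumes the tximport and DESeq2 packages are installed, that files is a named vector of paths to RSEM gene-level result files, and that samples is a data frame with a condition column. The helper name import_rsem_for_deseq2 and its arguments are hypothetical and are not taken from the gist:

```r
# Hypothetical helper: import RSEM gene-level results and build a DESeq2 dataset.
import_rsem_for_deseq2 <- function(files, samples, design = ~ condition) {
  # Gene-level input and gene-level output, so no tx2gene table is needed.
  txi <- tximport::tximport(files, type = "rsem", txIn = FALSE, txOut = FALSE)

  # Zero effective lengths are not accepted when building the dataset,
  # so they are set to 1 here.
  txi$length[txi$length == 0] <- 1

  # The average transcript lengths in txi are carried along as an offset.
  DESeq2::DESeqDataSetFromTximport(txi, colData = samples, design = design)
}
```

The returned object can then be passed to DESeq2::DESeq() for the differential expression analysis covered in the second reference.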
arraytools / Rconsole
Last active Oct 1, 2019
Rconsole for Windows OS with black background. The font size is increased to 14.
View Rconsole
# Optional parameters for the console and the pager
# The system-wide copy is in rwxxxx/etc.
# A user copy can be installed in `R_USER'.
## Style
# This can be `yes' (for MDI) or `no' (for SDI).
MDI = yes
# the next two are only relevant for MDI
toolbar = yes
statusbar = no
View silentnight.R
# https://aschinchon.wordpress.com/2014/03/13/the-lonely-acacia-is-rocked-by-the-wind-of-the-african-night/
depth <- 9
angle <- 30 #Between branches division
L <- 0.90 #Decreasing rate of branches by depth
nstars <- 300 #Number of stars to draw
mstars <- matrix(runif(2*nstars), ncol=2)
branches <- rbind(c(1,0,0,abs(jitter(0)),1,jitter(5, amount = 5)), data.frame())
colnames(branches) <- c("depth", "x1", "y1", "x2", "y2", "inertia")
for(i in 1:depth)
{