#Identify working design model among numerous covariates
# Returns design that can be used directly by DESeq2
#Usage:
# select.model(covariates,main,region)
# covariates: character vector of covariates in the metadata matrix (column names of the metadata matrix)
# main: main factor formula (can include covariates)
# region: brain region (temporarily hard-coded for the immediate analysis)
#Example:
# select.model(c("Antidepressant","Alcool","History.of.Abuse","Cause.of.death","PMI"),
#              "RIN+Age.+Gender+Phenotype+Gender:Phenotype",
generate.rrho<-function(pval.data,logfc.data,list,outdir){
  max.scale<-list()
  for(i in seq_len(nrow(list))){
    #signed significance score: -log10(p-value) carrying the direction of the fold change
    list1<-cbind(rownames(pval.data),-log10(pval.data[,as.character(list[i,1])])*sign(logfc.data[,as.character(list[i,1])]))
    list2<-cbind(rownames(pval.data),-log10(pval.data[,as.character(list[i,2])])*sign(logfc.data[,as.character(list[i,2])]))
    print(head(list1))
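The ranking score built above multiplies -log10 of the p-value by the sign of the log fold change, so strongly up-regulated genes rank high and strongly down-regulated genes rank low. A minimal Python sketch of that transform (array values are illustrative, not from the original data):

```python
import numpy as np

def signed_score(pvals, logfc):
    """Rank-ready score: -log10(p) signed by fold-change direction."""
    return -np.log10(pvals) * np.sign(logfc)

pvals = np.array([0.01, 0.5, 0.001])
logfc = np.array([1.2, -0.3, -2.0])
scores = signed_score(pvals, logfc)
```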
analyze.proteomic.data<-function(intensities,meta_data,condA,condB){
  #parse conditions to prepare for a t-test with replicates
  colnames(intensities)<-meta_data
  #anchor the condition names so that only exact column-name matches are selected
  condA.regex<-paste("^",condA,"$",sep="")
  condB.regex<-paste("^",condB,"$",sep="")
  condA.indices<-grep(condA.regex,colnames(intensities))
  condB.indices<-grep(condB.regex,colnames(intensities))
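The `^...$` anchoring above matters: without it, `grep("case", ...)` would also pick up columns like `case_extra`. A small Python sketch of the same exact-match selection (column labels are illustrative):

```python
import re

# column labels assigned from the metadata (illustrative values)
columns = ["case", "case", "control", "control", "case_extra"]

def condition_indices(cond, colnames):
    """Exact-match selection, mirroring grep('^cond$', ...) in the R code."""
    pattern = re.compile("^" + re.escape(cond) + "$")
    return [i for i, c in enumerate(colnames) if pattern.match(c)]

case_idx = condition_indices("case", columns)      # excludes "case_extra"
control_idx = condition_indices("control", columns)
```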
#!/bin/bash
#Print the first attribute of GTF column 9 together with column 10, deduplicate, and sort numerically by the second field
filename=$1
awk -F "\t" '{split($9,a,";");print a[1], $10}' "$filename" | cut -d ' ' -f2,3 | sort -u | sort -k2 -n
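The awk step splits the tab-delimited GTF attribute column (column 9) on `;` and keeps only the first attribute. The same step sketched in Python, with an illustrative GTF line:

```python
def first_attribute(gtf_line):
    """Split tab-delimited column 9 on ';' and return the first attribute, as the awk step does."""
    fields = gtf_line.rstrip("\n").split("\t")
    attributes = fields[8].split(";")
    return attributes[0]

line = ("chr1\thavana\tgene\t11869\t14409\t.\t+\t.\t"
        "gene_id \"ENSG00000223972\"; gene_name \"DDX11L1\";")
attr = first_attribute(line)
```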
library(biomaRt)
library(stringr)
#######get counts #########
setwd("/path/to/count/directory")
data = list.files(pattern = 'htseq_counts.txt$') #detect count files based on file name
count_list = lapply(data,read.table,header=F,sep="\t",row.names=1) #read files in batch, save to list
counts<-do.call(cbind, count_list) #create count data frame
colnames(counts)<-data #set column names
colnames(counts)<-str_replace(colnames(counts),fixed(".htseq_counts.txt"),"") #remove filename suffix; fixed() stops the "." matching any character
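The suffix-stripping step is regex-based, so an unescaped `.` would match any character. The same pitfall and fix sketched in Python (file names are illustrative):

```python
import re

files = ["sample1.htseq_counts.txt", "sample2.htseq_counts.txt"]

# Escaping the suffix (cf. fixed() in stringr) removes exactly the literal
# ".htseq_counts.txt"; unescaped, the "." would be a regex wildcard.
suffix = ".htseq_counts.txt"
sample_names = [re.sub(re.escape(suffix) + "$", "", f) for f in files]
```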
"""GC_calc.py""" | |
import sys | |
from pyspark import SparkContext | |
import re | |
#turns Fasta file into a list of sequences (for current understanding of pyspark SparkContext input) | |
fastaFile = sys.argv[1] |
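The script's goal is GC content over FASTA sequences; the per-sequence computation itself is small enough to sketch without Spark (the function name here is ours, not from the script):

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

gc = gc_content("ATGCGC")  # 2 G + 2 C out of 6 bases
```

In the Spark version, this function would be mapped over an RDD of sequences.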
def FASTA(filename):
    try:
        f = open(filename)  # file() is Python 2 only; open() works in both
    except IOError:
        print("The file, %s, does not exist" % filename)
        return
    sequences = {}
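The snippet cuts off after initializing the dict; a complete Python 3 sketch of a parser that fills such a dict, assuming the standard FASTA layout (`>` headers, possibly wrapped sequence lines):

```python
import io

def parse_fasta(handle):
    """Read FASTA records into {header: sequence}, concatenating wrapped lines."""
    sequences = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            sequences[name] = ""
        elif name is not None:
            sequences[name] += line
    return sequences

records = parse_fasta(io.StringIO(">seq1\nATGC\nGGTA\n>seq2\nTTTT\n"))
```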
def featureCoverage(reference: RDD[Feature], reads: RDD[AlignmentRecord], bins: Int): RDD[(String, Iterable[Double])] = {
  val getBinsForward = for {
    feature <- reference
    bin = bins
    //width of each bin: feature length divided into `bin` pieces, rounded up
    interval = ((feature.getEnd.toDouble - feature.getStart.toDouble) / bin).ceil.toInt
    strand = feature.getFeatureType.toString
    start = feature.getStart.toInt
    end = feature.getEnd.toInt
    refName = feature.getContig.getContigName.toString
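The `interval` computation divides each feature into a fixed number of bins, rounding the bin width up so the last bin still reaches the feature end. A Python sketch with hypothetical coordinates:

```python
import math

def bin_width(start, end, bins):
    """Width of each bin when a feature of length end-start is split into `bins` bins, rounded up."""
    return math.ceil((end - start) / bins)

width = bin_width(100, 1075, 10)  # length 975 over 10 bins -> 98 bp per bin
```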
evaluate.covariates<-function(x,pc.percents,continuous,categorical){
  covariate.contribution<-function(x,continuous,categorical){
    #asinh transform continuous covariates
    asinh.continuous <- lapply(continuous,asinh)
    asinh.continuous <- as.data.frame(do.call(cbind,asinh.continuous))
    #discretize categorical covariates to perform lm
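The helper asinh-transforms the continuous covariates before regressing them against expression principal components; asinh behaves like a log for large values but is defined at zero and for negatives. A numpy sketch of the transform and a simple one-covariate R² (all names and data here are illustrative, not from the original function):

```python
import numpy as np

rng = np.random.default_rng(0)
covariate = rng.normal(size=50)
pc = 2.0 * covariate + rng.normal(scale=0.1, size=50)  # a PC driven by the covariate

# asinh transform: log-like for large |x|, well-defined at 0 and for negatives
transformed = np.arcsinh(covariate)

# variance in the PC explained by the covariate via a least-squares fit
slope, intercept = np.polyfit(covariate, pc, 1)
residuals = pc - (slope * covariate + intercept)
r_squared = 1 - residuals.var() / pc.var()
```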
import scala.collection.mutable.ArrayBuffer
def binFromRange(start: Int, end: Int): ArrayBuffer[Int] = {
  //UCSC hierarchical binning scheme: offsets for each level, finest (128 kb bins) first
  val bin_offsets = Array(512+64+8+1, 64+8+1, 8+1, 0)
  val binFirstShift = 17 //finest bin spans 2^17 = 128 kb
  val binNextShift = 3   //each coarser level is 8x wider
  var startBin = start >> binFirstShift
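Completing the logic the snippet begins: the standard UCSC scheme walks from the finest level upward and returns the smallest bin that fully contains the interval. A Python sketch of that algorithm, using the same offsets and shifts:

```python
def bin_from_range(start, end):
    """Smallest UCSC bin fully containing [start, end), standard 0-512 Mb scheme."""
    bin_offsets = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 0]
    bin_first_shift = 17  # finest bin is 128 kb
    bin_next_shift = 3    # each level is 8x coarser
    start_bin = start >> bin_first_shift
    end_bin = (end - 1) >> bin_first_shift
    for offset in bin_offsets:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= bin_next_shift
        end_bin >>= bin_next_shift
    raise ValueError("start %d, end %d out of range" % (start, end))

b = bin_from_range(0, 1000)  # small interval at chromosome start -> finest level
```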