Skip to content

Instantly share code, notes, and snippets.

@sashaphanes
sashaphanes / fasta.maker.pl
Last active August 29, 2015 13:57
takes a list of protein accession numbers with SNP information from STDIN, trims the SNPs, and generates a .fasta containing amino acid sequences for each accession
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
# pipe STDIN to a loop that
# a) removes the strings after each accession number
# b) puts the remaining accession numbers from each line into an array
# $IN = \*STDIN;
#!/usr/bin/env perl
use warnings;
use strict;
##############################################################
#This program takes 2 inputs:
##############################################################
setwd("C:/Users/Sasha/Google Drive/LNGstuff/glut analyses")
data = read.csv("all.genesLabelled2.csv")
library(psych)
data$structure = NULL
#sample data format (after NULL):
# GRIA1 GRIA2 GRIA3 GRIA4 GRIK1 GRIK2 GRIK3 GRIK4
#1 4.200910 13.198395 4.531563 4.283639 2.323050 1.680551 0.591786 1.653584
#2 1.193841 4.902818 1.869044 1.779108 1.620001 0.839528 0.138910 1.132033
@sashaphanes
sashaphanes / extract.data.cmd
Created August 13, 2013 13:27
extract specific gene expression data from the ABA RNA-Seq BrainSpan Matrix
$ perl extract.data.pl GENESYMBL 2 |perl -ne '{chomp; $row++; @cols = split (/\t/, $_); $colNo = 0; foreach $data (@cols) {$colNo++; $h{$colNo}{$row} = $data; } if (eof) { foreach $colNo (sort {$a<=>$b} keys %h) { $newRow =""; foreach $row (sort {$a<=>$b} keys %{$h{$colNo}} ) {$data = $h{$colNo}{$row}; $newRow .="$data\t"; } $newRow =~ s/\t$//; print "$newRow\n"; } } }' |sed 's/"//g' >gene.data/GENESYMBL.txt; paste sample.info.txt gene.data/GENESYMBL.txt >gene.data/GENESYMBL.data; rm gene.data/GENESYMBL.txt; cut -f4 gene.data/GENESYMBL.data >gene.data/GENESYMBL.val;
@sashaphanes
sashaphanes / extract.data.pl
Last active May 6, 2019 15:50
A simple example:$ perl extract.data.pl MAOA 1# this will attempt to find the gene MAOA in the genes_rows_metadata.csv and return the gene info. This is a good first step to ensure that the gene exists in the database.$ perl extract.data.pl MAOA 2# this will attempt to find the gene MAOA in the gene_matrix and return the RPKM values. The header …
# sample of genes_rows_metadata.csv:
# $ cat genes_rows_metadata.csv | head -n5
# row_num,gene_id,ensembl_gene_symbol,gene_symbol,entrez_id
# 1,66982,"ENSG00000000003","TSPAN6",7105
# 2,66983,"ENSG00000000005","TNMD",64102
# 3,66984,"ENSG00000000419","DPM1",8813
# 4,66985,"ENSG00000000457","SCYL3",57147
#!/usr/bin/perl
> ggplot(ABhumanRat.data.2, aes(x=MAOAxprs, y=MAOBxprs, group=species,color=region)) + theme(axis.text.x = element_text(angle=90, face="bold", colour="black")) + geom_point(size=5) + ggtitle("MAOA vs. MAOB expression in human & rat")
Warning message:
Removed 49 rows containing missing values (geom_point).
> head(ABhumanRat.data.2)
region MAOAxprs MAOBxprs gene species
1 07. amygdala 7.590958 7.755932 MAOB rat
2 07. amygdala 7.329797 7.965554 MAOB rat
3 07. amygdala 7.292817 7.896363 MAOB rat
4 07. amygdala 7.556804 8.077286 MAOB rat
5 07. amygdala 7.300975 8.093814 MAOB rat
@sashaphanes
sashaphanes / scatterplotPlusLineggplot2.R
Last active December 18, 2015 23:49
scatterplotPlusLineggplot2.R
#head(ABhumanRat.data)
# region expression gene
#1 07. amygdala 7.755932 ratB
#2 07. amygdala 7.965554 ratB
#3 07. amygdala 7.896363 ratB
#4 07. amygdala 8.077286 ratB
#5 07. amygdala 8.093814 ratB
#6 07. amygdala 8.185065 ratB
ggplot(ABhumanRat.data, aes(x=region, y=expression, group=gene,color=gene))
@sashaphanes
sashaphanes / html.table.parser.pl
Last active December 18, 2015 01:09
Parse STDIN from bash program "links" HTML download, remove all tags except tables
#!/usr/bin/perl -ws
use HTML::Scrubber;
use HTML::Entities qw(decode_entities);
use Text::Unidecode qw(unidecode);
my $HTMLinput = do {local $/; <STDIN>};
my $scrubber = HTML::Scrubber->new( allow => [ qw[ table tr td ] ] );
#print $scrubber->scrub($HTMLinput);
setwd("C:/Users/Sasha/Google Drive/LNGstuff/Human")
maoa = read.csv("mergedAtransposed.csv")
maob = read.csv("mergedBtransposed.csv")
test.result = mapply(t.test, maoa, maob)
p.values = stack(mapply(function(x, y) t.test(x,y)$p.value, maoa, maob))
matrix.maoa = as.matrix(maoa)
matrix.maob = as.matrix(maob)
fun = Vectorize(function(i,j) t.test(matrix.maoa[,i],matrix.maob[,j])$p.value)
res = outer(1:27,1:27,FUN = "fun")
image(1:27,1:27,res,axes=FALSE,xlab="MAOA",ylab="MAOB")
colnames(res) <- colnames(maoa)
res <-as.data.frame(res)
res$group <- colnames(maoa)
library(reshape2)
res <- melt(res,id="group")
library(ggplot2)
p <- ggplot(res, aes(x=group, y=variable)) +
geom_tile(aes(fill = value), colour = "yellow") +