#I found this interesting enough to share; more at http://code.google.com/p/bio-js/
#This is my attempt at a bioinformatics framework in JavaScript. More and more bioinformatics interfaces are going live online, so bioinformatics is entering a kind of cloud-computing era. For example, many sites let a user paste FASTA sequences into a web form. Tasks like parsing FASTA files should be easy and object-oriented. I chose the existing PrototypeJS framework to simplify the work and to keep the project compatible with most operating systems and browsers.
#Example
<html>
<head>
<title>Bio-JS Test</title>
<script src='lib/prototype.js' type='text/javascript'></script>
#!/usr/bin/perl
# Correlation
# Didier Gonze
# Updated: 28/4/2004
##########################################################################################
&ReadArguments;
# Variables:
# $NQuery - the number of query sequences
# $QueryHeader{$i} - the header line for query $i
# $QueryLength{$i} - the length of the query
# $Database{$i} - the database searched
# $DbSequences{$i} - the number of sequences in the database
# $DbLength{$i} - the number of residues in the database
# $Lambda{$i} - lambda factor
# $Kterm{$i} - K term
# $Information{$i} - expected information content of the alignment
#You can parse GenBank files with BioRuby the standard way, but there's a hidden problem. If the file ends with blank lines, i.e. there are empty lines after the GenBank terminator (two forward slashes, //), BioRuby reads these as additional, empty records. You can work around this by trimming the blank lines before handing the file to the parser.
puts "Parsing seqs ..."
Bio::FlatFile.auto("foo.genbank").each_entry { |gb|
puts "Sequence '#{gb.to_biosequence.entry_id}'"
}
puts "Finished."
which will print the id of every sequence in the file. However, if the file ends with blank lines, i.e. there are empty lines after the GenBank terminator (two forward slashes, which the wiki markup doesn't like), BioRuby reads these as additional, empty records:
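The workaround mentioned above, trimming the blank lines before parsing, could be sketched in plain Ruby roughly as follows. This is only an illustration under stated assumptions: `trim_trailing_blank_lines` and `split_genbank_records` are hypothetical helpers, not part of BioRuby, and the sample text is made up.

```ruby
# Hypothetical helper (not a BioRuby method): drop all whitespace after the
# last record terminator so a parser never sees trailing blank lines.
def trim_trailing_blank_lines(text)
  text.sub(/\s+\z/, "") + "\n"
end

# Hypothetical helper: split flat-file text on the "//" terminator line and
# discard chunks that contain only whitespace (the "empty records").
def split_genbank_records(text)
  text.split(%r{^//[ \t]*\r?\n?}).map(&:strip).reject(&:empty?)
end

# Made-up sample: two records followed by stray blank lines.
sample = "LOCUS       AB000001\n//\nLOCUS       AB000002\n//\n\n\n"

puts split_genbank_records(sample).length   # prints 2
puts trim_trailing_blank_lines(sample).inspect
```

The trimmed text could then be written back to a file and passed to `Bio::FlatFile.auto` as in the snippet above.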
#An automation of a tedious task I have to do often: coloring the nodes on a phylogeny. This script takes a Dendroscope tree file and a "color description" file, a simple CSV file with taxa labels and a corresponding color. The color may be either an RGB triplet or a scalar value which will be mapped to a palette. Usage is:
#color-dendro.rb [options] CLRFILE TREEFILE1 [...]
#where the options are:
#-h, --help Display this screen
#-m, --default-color STR The default color nodes will be given
#--map-to-colors The coloring instructions give a float value which will be mapped to a color
#--save STR
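As a rough illustration of what "mapping a scalar to a palette" can mean, here is a minimal sketch that interpolates linearly between two colors. The script's actual palette and value range are not shown here, so the endpoint colors, the default `[0, 1]` range, and the name `scalar_to_rgb` are all assumptions made for the example.

```ruby
# Hypothetical palette endpoints (blue to red); the real palette is unknown.
LOW_RGB  = [0, 0, 255].freeze
HIGH_RGB = [255, 0, 0].freeze

# Map a scalar in [min, max] to an RGB triplet by linear interpolation.
# Values outside the range are clamped to the nearest endpoint.
def scalar_to_rgb(value, min = 0.0, max = 1.0)
  return LOW_RGB.dup if max == min
  t = ((value - min) / (max - min).to_f).clamp(0.0, 1.0)
  LOW_RGB.zip(HIGH_RGB).map { |lo, hi| (lo + (hi - lo) * t).round }
end

p scalar_to_rgb(0.0)
p scalar_to_rgb(1.0)
```

A real palette would more likely have several stops, but the same normalize-then-interpolate step applies between each neighboring pair.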
#A sample script using the ETE package to detect shifts at internal nodes, i.e. nodes where a significant change in enrichment values (or any other per-leaf value) happens
#An example (using the scipy Python module to perform a K-S test):
from scipy import stats
from ete2 import Tree
newick = "((((A, B)edge1, C)edge2, ((D, E)edge3, F)edge4)edge5, (((G, H)edge6, I)edge7, ((J, K)edge8, L)edge9)edge10)RootEdge;"
/*
to compile:
$gmcs multifasta-parser.cs -out:multifasta-parser
to run:
$mono multifasta-parser [/path/multifasta-file]
*/
using System;
using System.IO;
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Convert biosequences from one format to another.
Usage is: convbioseq [options] FORMAT INFILES ...
Options:
--version show program's version number and exit
#!/usr/bin/env ruby
# download sequences from db by id
### IMPORTS
require 'bio'
require 'ostruct'
require 'timeout'
require 'pp'
require 'test/unit/assertions'
# For demonstration purposes, let's create a very simple alignment, where
# everything agrees except the last sequence, which leads with a differing
# character and ends with a gap:
require 'bio'
aln = Bio::Alignment.new(['acgt', 'acgt', 'acgt', 'ccg-'])
# consensus_iupac produces a "true" consensus sequence across all members.
# If sequences differ, the consensus sequence has an ambiguous character
# that sums these differences:
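To make the idea of an ambiguity code that "sums" the differences concrete, here is a minimal plain-Ruby sketch of a column-wise IUPAC consensus. It is not BioRuby's implementation: `iupac_consensus_sketch` is a hypothetical name, and ignoring gaps within a column is an assumption of this sketch that may differ from how `consensus_iupac` treats gaps.

```ruby
# Standard IUPAC nucleotide ambiguity codes, keyed by the sorted set of bases.
IUPAC_CODES = {
  "a" => "a", "c" => "c", "g" => "g", "t" => "t",
  "ag" => "r", "ct" => "y", "cg" => "s", "at" => "w",
  "gt" => "k", "ac" => "m",
  "cgt" => "b", "agt" => "d", "act" => "h", "acg" => "v",
  "acgt" => "n"
}.freeze

# Hypothetical helper: build a consensus one column at a time.
# Assumption: gaps ("-") are ignored unless the whole column is gaps.
def iupac_consensus_sketch(seqs)
  width = seqs.map(&:length).max
  (0...width).map do |i|
    bases = seqs.map { |s| s[i] }.compact.reject { |c| c == "-" }.uniq.sort
    next "-" if bases.empty?            # all-gap column stays a gap
    IUPAC_CODES.fetch(bases.join, "n")  # unrecognized mixes fall back to "n"
  end.join
end

puts iupac_consensus_sketch(%w[acgt acgt acgt ccg-])   # prints mcgt
```

In the first column the members disagree between `a` and `c`, so the sketch emits `m`, the IUPAC code for "a or c".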