Radhouane Aniba radaniba

## parsefasta.html
#I found this interesting enough to share; more at http://code.google.com/p/bio-js/

#This is my attempt at a bioinformatics framework in JavaScript. I've noticed more and more bioinformatics interfaces going live online, and so we are in a kind of cloud-computing era in bioinformatics. For example, there are many sites out there which let a user paste FASTA sequences into a web form. Things like parsing FASTA files should be easy to do and should be object oriented. I have decided on using the existing framework PrototypeJS to facilitate the process and to ensure that the project is compatible with most operating systems and browsers.
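To make the "easy and object oriented" idea concrete, here is a minimal sketch in Python rather than JavaScript. It is not the Bio-JS API: `FastaRecord` and `parse_fasta` are hypothetical names chosen for illustration, and the pasted text stands in for the contents of a web form.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class FastaRecord:
    """One FASTA entry: the header line (without '>') and its sequence."""
    header: str
    sequence: str

    @property
    def identifier(self) -> str:
        # By convention the identifier is the first whitespace-delimited token.
        return self.header.split()[0] if self.header.strip() else ""


def parse_fasta(lines: Iterable[str]) -> Iterator[FastaRecord]:
    """Yield FastaRecord objects from an iterable of text lines."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield FastaRecord(header, "".join(chunks))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield FastaRecord(header, "".join(chunks))


# Hypothetical text pasted into a form.
pasted = """>seq1 first example
ACGT
ACGT
>seq2
TTGA
"""
records = list(parse_fasta(pasted.splitlines()))
for rec in records:
    print(rec.identifier, len(rec.sequence))
```

Keeping each entry as an object means later steps (validation, reverse complement, export) can hang off the record instead of re-parsing strings.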

#Example

<html>
<head>
  <title>Bio-JS Test</title>
  <script src='lib/prototype.js' type='text/javascript'></script>

## pearsoncorrelation.pl
#!/usr/bin/perl

# Correlation
# Didier Gonze
# Updated: 28/4/2004

##########################################################################################


&ReadArguments;

## parseblast-anotherscript.pl
# Variables:
#  $NQuery - the number of query sequences
#    $QueryHeader{$i} - the header line for query $i
#    $QueryLength{$i} - the length of the query
#    $Database{$i} - the database searched
#    $DbSequences{$i} - the number of sequences in the database
#    $DbLength{$i} - the number of residues in the database
#    $Lambda{$i} - lambda factor
#    $Kterm{$i} - K term
#    $Information{$i} - expected information content of the alignment

## parsegb.rb
#You can parse GenBank files with BioRuby the standard way, but there's a hidden problem. If there are empty lines at the end of the file, after the GenBank terminator (two forward slashes, //), BioRuby reads these as additional, empty records. However, you can work around this by trimming the blank lines before handing the file to the parser.


require 'bio'

puts "Parsing seqs ..."
Bio::FlatFile.auto("foo.genbank").each_entry { |gb|
  puts "Sequence '#{gb.to_biosequence.entry_id}'"
}
puts "Finished."

which will print the id of every sequence in the file. However, if the file ends with blank lines after the GenBank terminator (two forward slashes, //), BioRuby reads these as additional, empty records.
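The effect of trailing blank lines, and the trimming workaround, can be sketched without BioRuby. The Python below is only an illustration: `naive_records` is a hypothetical stand-in for a reader that starts a new record whenever lines remain, not BioRuby's actual implementation, and the two-line "records" are placeholders rather than real GenBank entries.

```python
def trim_trailing_blank_lines(text):
    """Remove all whitespace after the last non-blank line, keeping one newline."""
    return text.rstrip() + "\n"


def naive_records(text):
    """Split text into records that end at a '//' line.

    Any lines left over after the last terminator become one more record,
    even if they are all blank. This mimics the problem described above.
    """
    records, current = [], []
    for line in text.splitlines():
        current.append(line)
        if line.strip() == "//":
            records.append(current)
            current = []
    if current:
        records.append(current)
    return records


# Placeholder flat file: two records followed by two blank lines.
sample = "LOCUS A\n//\nLOCUS B\n//\n\n\n"

print(len(naive_records(sample)))
print(len(naive_records(trim_trailing_blank_lines(sample))))
```

In Ruby the same trimming could be done on the file contents before they reach the parser; how the trimmed text is then passed to BioRuby is left open here.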

## colornodes.rb
#An automation of a tedious task I have to do often: coloring the nodes on a phylogeny. This script takes a Dendroscope tree file and a "color description" file, a simple CSV file with taxa labels and a corresponding color. The color may be either an RGB triplet or a scalar value which will be mapped to a palette. Usage is:

#color-dendro.rb [options] CLRFILE TREEFILE1 [...]

#where the options are:

#-h, --help Display this screen
#-m, --default-color STR The default color nodes will be given
#--map-to-colors The coloring instructions give a float value which will be mapped to a color
#--save STR
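As an illustration of the color description file, here is a small Python sketch. The exact layout is an assumption: rows are taken to be either `label,r,g,b` or, in the `--map-to-colors` case, `label,value` with the value in the range 0 to 1. The blue-to-red gradient and the names `read_color_file` and `scalar_to_rgb` are hypothetical and are not taken from the script itself.

```python
import csv
import io


def scalar_to_rgb(value, low=(0, 0, 255), high=(255, 0, 0)):
    """Linearly interpolate between two RGB colors; value is clamped to [0, 1]."""
    t = min(max(float(value), 0.0), 1.0)
    return tuple(round(a + (b - a) * t) for a, b in zip(low, high))


def read_color_file(handle, map_to_colors=False):
    """Return {taxon_label: (r, g, b)} from a CSV color description.

    Assumed layout: 'label,r,g,b', or 'label,value' when map_to_colors is True.
    """
    colors = {}
    for row in csv.reader(handle):
        if not row or not row[0].strip():
            continue  # skip empty rows
        label = row[0].strip()
        fields = [field.strip() for field in row[1:]]
        if map_to_colors:
            colors[label] = scalar_to_rgb(fields[0])
        else:
            colors[label] = tuple(int(field) for field in fields[:3])
    return colors


# Hypothetical inputs for the two modes.
rgb_file = io.StringIO("taxonA,255,0,0\ntaxonB,0,128,0\n")
scalar_file = io.StringIO("taxonA,0.0\ntaxonB,1.0\ntaxonC,0.5\n")

print(read_color_file(rgb_file))
print(read_color_file(scalar_file, map_to_colors=True))
```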

## parsetree.py
#A sample script using the ETE package to detect internal nodes where a significant shift in enrichment values (or any other per-leaf value) occurs


#An example (using the scipy python module to perform a K-S test):

from scipy import stats
from ete2 import Tree

newick = "((((A, B)edge1, C)edge2, ((D, E)edge3, F)edge4)edge5, (((G, H)edge6, I)edge7, ((J, K)edge8, L)edge9)edge10)RootEdge;"
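The idea can be sketched without `scipy` or `ete2`, so that it runs on the standard library alone. This is not the author's script: the tree is a hand-written nested tuple rather than a parsed Newick string, the enrichment values are invented, and only the two-sample K-S statistic D is computed (no p-value; `scipy.stats.ks_2samp` would provide one). For every internal node, the values of the leaves below it are compared with the values of all other leaves.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between the
    empirical cumulative distribution functions of the two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def leaves(node):
    """Return the leaf names under a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return [node]
    return [name for child in node for name in leaves(child)]


def internal_nodes(node):
    """Yield every internal node (tuple) in the tree, root first."""
    if isinstance(node, tuple):
        yield node
        for child in node:
            yield from internal_nodes(child)


# Hypothetical topology and enrichment values, for illustration only.
tree = ((("A", "B"), "C"), (("D", "E"), "F"))
values = {"A": 0.9, "B": 0.8, "C": 0.85, "D": 0.1, "E": 0.2, "F": 0.15}

all_leaves = leaves(tree)
shifts = []
for node in internal_nodes(tree):
    inside = leaves(node)
    outside = [name for name in all_leaves if name not in inside]
    if not outside:
        continue  # the root has nothing to compare against
    d = ks_statistic([values[n] for n in inside], [values[n] for n in outside])
    shifts.append((d, inside))

# Largest shifts first.
for d, inside in sorted(shifts, reverse=True):
    print(f"{d:.2f}", ",".join(inside))
```

With only a handful of leaves per clade the statistic is coarse; on a real tree a significance threshold on the p-value would decide which nodes are reported.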

## multifasta.cs
/*
to compile:
$ gmcs multifasta-parser.cs -out:multifasta-parser

to run:
$ mono multifasta-parser [/path/multifasta-file]
*/

using System;
using System.IO;

## convertformat.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Convert biosequences from one format to another.

Usage is: convbioseq [options] FORMAT INFILES ...

Options:

--version show program's version number and exit

## getseqbyid.rb
#!/usr/bin/env ruby
# download sequences from db by id

### IMPORTS

require 'bio'
require 'ostruct'
require 'timeout'
require 'pp'
require 'test/unit/assertions'

## alignment.rb
# For demonstration purposes, let's create a very simple alignment, where
# everything agrees except the last sequence, which leads with a differing
# character and ends with a gap:

require 'bio'
aln = Bio::Alignment.new(['acgt', 'acgt', 'acgt', 'ccg-'])

# consensus_iupac produces a "true" consensus sequence across all members.
# If sequences differ, the consensus sequence has an ambiguous character
# that sums these differences:
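To show what an IUPAC consensus means column by column, here is a small Python sketch applied to the same four sequences. It is not BioRuby's implementation: `iupac_consensus` is a hypothetical helper, and its gap handling (gaps are ignored unless the whole column is gaps) is an assumption that may differ from what `consensus_iupac` returns.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases covered.
IUPAC_CODES = {
    frozenset("a"): "a", frozenset("c"): "c",
    frozenset("g"): "g", frozenset("t"): "t",
    frozenset("ag"): "r", frozenset("ct"): "y",
    frozenset("cg"): "s", frozenset("at"): "w",
    frozenset("gt"): "k", frozenset("ac"): "m",
    frozenset("cgt"): "b", frozenset("agt"): "d",
    frozenset("act"): "h", frozenset("acg"): "v",
    frozenset("acgt"): "n",
}


def iupac_consensus(sequences, gap="-"):
    """Return one IUPAC code per alignment column.

    Assumption: gaps are ignored when collecting the bases of a column;
    a column made only of gaps yields the gap character, and a column with
    unrecognized characters yields 'n'.
    """
    consensus = []
    for column in zip(*sequences):
        bases = frozenset(ch.lower() for ch in column if ch != gap)
        fallback = gap if not bases else "n"
        consensus.append(IUPAC_CODES.get(bases, fallback))
    return "".join(consensus)


print(iupac_consensus(["acgt", "acgt", "acgt", "ccg-"]))
```

The first column holds both `a` and `c`, so it collapses to the ambiguity code `m`; the remaining columns agree once the gap is set aside.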