@wwood
Created July 23, 2011 07:51
Prototype for getting the species prefixes for Ensembl identifiers for all species, using the production database
# This script is released under a GPL version 3 license or newer. Copyright by the author Ben Woodcroft.
require 'rubygems'
require 'ensembl'
require 'pp'
module Ensembl
  module Production
    # Connects to the ensembl_production_<release> database.
    class DBConnection < Ensembl::DBRegistry::Base
      self.abstract_class = true
      self.pluralize_table_names = false

      def self.connect(release = Ensembl::ENSEMBL_RELEASE, args = {})
        database = "ensembl_production_#{release}"
        self.generic_connect(nil, nil, release, {:database => database})
      end
    end # Production::DBConnection

    # Maps to the species table of the production database.
    class Species < DBConnection
      set_primary_key 'species_id'
    end
  end # Production
  class SpeciesPrefixExtractor
    # Class-level cache, filled on first use. Initialised here rather than in
    # #initialize so that creating a second extractor does not discard it.
    @@species_info = nil

    # Returns the SpeciesInformation for the given species db_name
    def [](db_name)
      Ensembl::SpeciesPrefixExtractor.cache
      @@species_info[db_name]
    end

    # Returns the whole hash of db_name => SpeciesInformation
    def infos
      Ensembl::SpeciesPrefixExtractor.cache
      @@species_info
    end

    # Populates the cache from the databases, unless that has already been done
    def self.cache
      return nil unless @@species_info.nil?

      Ensembl::Production::DBConnection.connect

      # A hash of species db_names to SpeciesInformation objects
      @@species_info = {}

      # Get the db_name and common_name of each species.
      # NOTE: :limit => 2 restricts this prototype to two species;
      # remove it to process all species.
      Ensembl::Production::Species.all(:limit => 2).each do |sp|
        info = SpeciesInformation.new
        info.db_name = sp.db_name
        info.common_name = sp.common_name
        @@species_info[sp.db_name] = info
      end

      # Get the prefix of each species, which is stored in the meta table
      # of that species' own core database
      @@species_info.each do |db_name, info|
        $stderr.puts db_name
        Ensembl::Core::DBConnection.connect(db_name)
        prefix = Ensembl::Core::Meta.first(:conditions => ['meta_key = ?', 'species.stable_id_prefix'])
        info.prefix = prefix.meta_value unless prefix.nil?
      end
      nil
    end

    class SpeciesInformation
      attr_accessor :common_name, :db_name, :prefix
    end
  end
end # Ensembl
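
ex = Ensembl::SpeciesPrefixExtractor.new

# Map each stable ID prefix to [common_name, db_name],
# skipping species that have no prefix recorded
hash = {}
ex.infos.each do |db_name, info|
  unless info.prefix.nil?
    hash[info.prefix] = [info.common_name, db_name]
  end
end
pp hash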
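
The script above prints a hash of stable ID prefix to [common_name, db_name]. One possible use of such a table is working out which species an Ensembl stable ID belongs to. The following is a minimal sketch of that idea, not part of the original script: `SAMPLE_PREFIXES` and `species_for_stable_id` are hypothetical names, and the two table entries are illustrative values chosen for the example, not output captured from the production database.

```ruby
# Illustrative table in the same shape as the hash printed by the script:
# prefix => [common_name, db_name]. These entries are assumptions for the
# example, not real output of the script.
SAMPLE_PREFIXES = {
  'ENS'    => ['human', 'homo_sapiens'],
  'ENSMUS' => ['mouse', 'mus_musculus']
}

# Hypothetical helper: returns [common_name, db_name] for the longest prefix
# that the stable ID starts with, or nil if no prefix matches.
def species_for_stable_id(stable_id, prefixes)
  matching = prefixes.keys.select { |prefix| stable_id.start_with?(prefix) }
  best = matching.max_by { |prefix| prefix.length }
  best.nil? ? nil : prefixes[best]
end

p species_for_stable_id('ENSMUSG00000017167', SAMPLE_PREFIXES)
p species_for_stable_id('ENSG00000139618', SAMPLE_PREFIXES)
```

Longest-prefix matching is used because one prefix can be the start of another (here 'ENS' and 'ENSMUS'). The lookup is only as good as the table: if a species is missing, for example because the query is limited to a couple of species, its IDs may be attributed to a species with a shorter prefix.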