@cjfields
Created September 2, 2010 16:16
use strict;
use warnings;
use Bio::DB::EUtilities;
my (%taxa, @taxa);
my (%names, %idmap);
# These are protein IDs; nucleotide IDs should (probably) work as well
# if you change -dbfrom to 'nucleotide'.
my @ids = qw(1621261 89318838 68536103 20807972 730439);
my $factory = Bio::DB::EUtilities->new(
    -eutil          => 'elink',
    -db             => 'taxonomy',
    -email          => 'cjfields@bioperl.org',
    -dbfrom         => 'protein',
    -correspondence => 1,
    -id             => \@ids,
);
# Iterate through the LinkSet objects; with -correspondence => 1 each
# LinkSet maps one submitted ID to its linked taxonomy ID(s).
while (my $ds = $factory->next_LinkSet) {
    $taxa{ ($ds->get_submitted_ids)[0] } = ($ds->get_ids)[0];
}
# Take all values rather than a hash slice over @ids: if one of the
# IDs in the original list is missing from %taxa, a slice would put
# an undef into the array.
@taxa = values %taxa;
$factory->reset_parameters(
    -email => 'cjfields@bioperl.org',
    -eutil => 'esummary',
    -db    => 'taxonomy',
    -id    => \@taxa,
);
# Map each taxonomy ID to its scientific name.
while (my $ds = $factory->next_DocSum) {
    $names{ ($ds->get_contents_by_name('TaxId'))[0] } =
        ($ds->get_contents_by_name('ScientificName'))[0];
}
foreach my $id (@ids) {
    # Not every submitted ID is guaranteed to have a taxonomy link,
    # so check with exists before looking up the name.
    $idmap{$id} = $names{ $taxa{$id} } if exists $taxa{$id};
}
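# Report the resulting ID-to-name map. This is a minimal sketch: the
# tab-separated layout and the 'not found' placeholder are assumptions,
# not part of the original script.
foreach my $id (@ids) {
    my $name = defined $idmap{$id} ? $idmap{$id} : 'not found';
    print "$id\t$name\n";
}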