@vjrj
Last active December 13, 2022 08:13
A node parse-transform utility to remove authors from scientificNames in GBIF Backbone to use in ALA
//
// DEPRECATED
// Use: https://github.com/living-atlases/gbif-taxonomy-for-la
//
const parse = require("csv-parse/lib/es5");
const transform = require('stream-transform');
const fs = require('fs');

const readStream = fs.createReadStream("./gbif-backbone/Taxon.tsv.orig");

// Taxon.tsv is tab-separated and unquoted
const parser = parse({
  quote: null,
  delimiter: '\t'
});

let atFirstLine = true;

const transformer = transform(function(record) {
  if (atFirstLine) {
    // the first line is the header: pass it through unchanged
    atFirstLine = false;
  } else {
    // columns used: 5 = scientific name, 6 = author, 7 = canonical name
    const hasAuthor = record[6].length > 0;
    const hasCanonicalName = record[7].length > 0;
    if (hasAuthor) {
      if (hasCanonicalName) {
        // the canonical name is the scientific name without the author
        record[5] = record[7];
      } else {
        // no canonicalName so we try to remove author from sciName if it's there
        const pos = record[5].lastIndexOf(" " + record[6]);
        if (pos !== -1) {
          record[5] = record[5].substring(0, pos);
        }
      }
    }
  }
  return record.join('\t') + '\n';
});

readStream.on('open', function () {
  readStream.pipe(parser).pipe(transformer).pipe(process.stdout);
});

readStream.on('error', function(err) {
  // report the failure (e.g. a missing input file) and exit with a non-zero status
  console.error(err);
  process.exitCode = 1;
});
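
The renaming rule applied by the transformer can be pulled out as a pure function, which makes it easy to try on single names. This is only a sketch: `stripAuthor` is a hypothetical helper that is not part of the gist, and the sample names and author strings below are illustrative inputs, not rows copied from Taxon.tsv.

```javascript
// Hypothetical helper (not in the original gist): the same rule as the
// transformer, written as a pure function so it can be checked in isolation.
function stripAuthor(sciName, author, canonicalName) {
  if (!author) return sciName;              // no author recorded: leave the name alone
  if (canonicalName) return canonicalName;  // prefer the author-free canonical name
  // no canonical name: cut the trailing " <author>" if the name contains it
  const pos = sciName.lastIndexOf(' ' + author);
  return pos !== -1 ? sciName.substring(0, pos) : sciName;
}

// Illustrative inputs (assumed values, for demonstration only)
const fullName = 'Methanobrevibacter ruminantium (Smith & Hungate, 1958) Balch & Wolfe, 1981';
const reorderedAuthor = 'Balch & Wolfe, 1981 (Smith & Hungate, 1958)';

console.log(stripAuthor('Puma concolor (Linnaeus, 1771)', '(Linnaeus, 1771)', ''));
console.log(stripAuthor(fullName, reorderedAuthor, 'Methanobrevibacter ruminantium'));
console.log(stripAuthor(fullName, reorderedAuthor, ''));
```

The last call shows why the canonical name is tried first: when the author string does not appear verbatim at the end of the scientific name (here its two parts are in a different order), the suffix search finds nothing and the name is returned unchanged.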

vjrj commented Jan 20, 2020

@rpfigueira lucky you! So I have to reindex my personal errors index ;-)


vjrj commented Jun 18, 2020

Back to this. I updated my gist following @rpfigueira's comment.
Before:

nameindexer --testSearch "Methanobrevibacter ruminantium"
(...)
Search for name: Methanobrevibacter ruminantium
ID: 1000111
GUID: 1000111
Classification: "Balch & Wolfe, 1981 (Smith & Hungate, 1958)",Archaea,Euryarchaeota,Methanobacteria,Methanobacteriales,Methanobacteriaceae,Methanobrevibacter
Scientific name: Methanobrevibacter ruminantium (Smith & Hungate, 1958) Balch & Wolfe, 1981
Authorship: Balch & Wolfe, 1981 (Smith & Hungate, 1958)
Rank: SPECIES
Synonym: null
Match type: exactMatch

and now:

nameindexer --testSearch "Methanobrevibacter ruminantium"
(...)
Search for name: Methanobrevibacter ruminantium
ID: 1000111
GUID: 1000111
Classification: "Balch & Wolfe, 1981 (Smith & Hungate, 1958)",Archaea,Euryarchaeota,Methanobacteria,Methanobacteriales,Methanobacteriaceae,Methanobrevibacter
Scientific name: Methanobrevibacter ruminantium
Authorship: Balch & Wolfe, 1981 (Smith & Hungate, 1958)
Rank: SPECIES
Synonym: null
Match type: exactMatch

Thanks!


vjrj commented Nov 24, 2022

This is the issue reported by Tim: gbif/checklistbank#100
