
{
"updated": "2015-08-28T23:58:18.165753+00:00",
"target": [
{
"source": "http://www.gbif.org/occurrence/912442645",
"selector": [
{
"conformsTo": "https://tools.ietf.org/html/rfc3236",
"type": "FragmentSelector",
"value": "occurrence"
== Taxonomy tester - taxa with same basionym
This test asks whether there are multiple taxa with the same basionym. Logically this should not occur.
This example uses the legume genera Poissonia and Coursetia.
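The logic of this test can be sketched outside the graph database. The following is a minimal Python sketch, not the gist's Cypher query: it assumes each taxon is available as a (name, basionym) pair, and the function name and the placeholder names are hypothetical.

```python
from collections import defaultdict

def taxa_sharing_basionym(taxa):
    """Return {basionym: [taxon names]} for basionyms used by more than one taxon."""
    by_basionym = defaultdict(list)
    for name, basionym in taxa:
        by_basionym[basionym].append(name)
    # Keep only basionyms that are claimed by two or more taxa.
    return {b: names for b, names in by_basionym.items() if len(names) > 1}

# Hypothetical placeholder records (taxon name, basionym); not real GBIF data.
taxa = [
    ("Aus bus", "Xus bus"),
    ("Cus bus", "Xus bus"),
    ("Aus dus", "Aus dus"),
]
print(taxa_sharing_basionym(taxa))
```

Any basionym returned here points at taxa that probably need to be merged or re-examined.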
//hide
//setup
//output
== Taxonomy tester - anachronistic species and genus names
This test compares the dates of species names and the genus name. If a name is a new combination then the author name is enclosed in parentheses. If a species is placed in a genus that was published after the species was described, then logically the name is a new combination. However, in some cases GBIF has species names that violate this rule.
This example uses the butterfly genus Heliopyrgus.
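As a rough illustration of the rule, the check reduces to comparing two years and looking for parentheses around the authorship. This Python sketch assumes the years have already been parsed out of the name strings; the function name and sample values are hypothetical.

```python
def is_anachronistic(authorship, species_year, genus_year):
    """Flag a species name older than its genus whose authorship lacks parentheses."""
    authorship = authorship.strip()
    in_parentheses = authorship.startswith("(") and authorship.endswith(")")
    # A species described before its genus was published must be a new
    # combination, so its authorship should be in parentheses.
    return species_year < genus_year and not in_parentheses

# Hypothetical values for illustration; not taken from GBIF.
print(is_anachronistic("Smith, 1850", 1850, 1900))    # older than genus, no parentheses
print(is_anachronistic("(Smith, 1850)", 1850, 1900))  # marked as a new combination
```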
//hide
//setup
//output
[source,cypher]
== Taxonomy tester - species and subspecies with same name
This test compares the epithets of species and subspecies in the same genus. If a subspecies with a given epithet exists in a genus, no other species in that genus should carry the same epithet. The one exception is the parent species of a nominate subspecies, which shares its epithet by definition.
This example uses the butterfly genus Heliopyrgus.
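A minimal Python sketch of this comparison follows. It assumes all names belong to one genus, that species are given as a set of epithets, and that subspecies are given as (parent species epithet, subspecies epithet) pairs; these structures and the placeholder epithets are hypothetical.

```python
def conflicting_epithets(species_epithets, subspecies):
    """Return subspecies whose epithet duplicates a different species in the genus."""
    conflicts = []
    for parent, epithet in subspecies:
        # A nominate subspecies repeats its parent's epithet, which is allowed.
        if epithet in species_epithets and epithet != parent:
            conflicts.append((parent, epithet))
    return conflicts

# Hypothetical placeholder epithets within a single genus.
species_epithets = {"bus", "cus"}
subspecies = [
    ("bus", "bus"),  # nominate subspecies, not a conflict
    ("bus", "cus"),  # same epithet as the separate species "cus"
    ("cus", "dus"),  # epithet not used by any species
]
print(conflicting_epithets(species_epithets, subspecies))
```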
//hide
//setup
//output
[source,cypher]
== Taxonomy tester - genus names should match species name
This test compares the generic name of a species with that of the genus it is placed in. These should be the same, but often aren't, signalling a problem, such as a homonym, synonym, or spelling mistake.
This example uses the butterfly genus Forsterinaria Gray, 1973 (see http://bionames.org/search/Forsterinaria ), which is a replacement name for the genus Haywardina Forster, 1964 (see http://biostor.org/reference/77525 ). GBIF has Forsterinaria ( http://bionames.org/taxa/gbif/3257628 ), but its species names are labelled with Haywardina, which GBIF also has in its classification ( http://www.gbif.org/species/1894033 ).
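The comparison itself is simple string matching on the first word of each species name. The Python sketch below assumes species names are plain strings; the function name is hypothetical and the epithet "exampleus" is a placeholder, not a real species.

```python
def mismatched_species(genus, species_names):
    """Return species names whose first word differs from the genus they sit in."""
    # partition() returns the text before the first space, or the whole
    # string when there is no space, so empty names do not raise.
    return [name for name in species_names if name.partition(" ")[0] != genus]

# Placeholder epithet; only the genus names come from the example above.
children = ["Haywardina exampleus", "Forsterinaria exampleus"]
print(mismatched_species("Forsterinaria", children))
```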
//hide
//setup
//output
[source,cypher]
== Taxonomy tester - same species name in different genera
This gist loads a graph of the GBIF classification for bats of the family Molossidae, and tests for possible duplicated species. It is essentially a Neo4j version of the "papaya plot" ( http://iphylo.blogspot.co.uk/2013/08/cluster-maps-papaya-plots-and-trouble.html ).
Note that this is a very simple test. We should also test subspecies names, and standardise epithets to avoid missing matches caused by the differing gender of genus names.
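In outline, the test groups binomials by species epithet and reports any epithet that turns up in more than one genus. The Python sketch below illustrates that idea only; it does no gender standardisation, and the function name and placeholder names are hypothetical.

```python
from collections import defaultdict

def epithets_in_multiple_genera(binomials):
    """Return {epithet: [genera]} for epithets that occur in more than one genus."""
    genera_by_epithet = defaultdict(set)
    for name in binomials:
        parts = name.split()
        if len(parts) < 2:
            continue  # skip anything that is not at least "Genus epithet"
        genus, epithet = parts[0], parts[1]
        genera_by_epithet[epithet].add(genus)
    return {e: sorted(g) for e, g in genera_by_epithet.items() if len(g) > 1}

# Hypothetical placeholder names; not real Molossidae records.
names = ["Aus bus", "Cus bus", "Aus dus"]
print(epithets_in_multiple_genera(names))
```

Each hit is only a candidate duplicate: the same epithet can legitimately be used for unrelated species in different genera.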
//hide
//setup
//output
[source,cypher]
@rdmpage
rdmpage / simple.sql
Created May 18, 2015 10:37
Simple ION dump
SELECT
IFNULL(id, ""),
IFNULL(cluster_id, ""),
IFNULL(nameComplete, ""),
IFNULL(taxonAuthor, ""),
IFNULL(`group`, ""),
IFNULL(publication, "")
INTO OUTFILE "/tmp/markus.tsv"
FIELDS TERMINATED BY "\t" ENCLOSED BY "" ESCAPED BY ""
LINES TERMINATED BY "\n"
rdmpage / bold.txt
Created March 25, 2015 10:55
BOLD phylogeny from API
BOLD has a (hidden) API to retrieve a Newick tree for a BIN. For example, the page http://www.boldsystems.org/index.php/Public_BarcodeCluster?clusteruri=BOLD:AAD8242 displays an SVG tree. Under the hood the tree is fetched with this call: http://www.boldsystems.org/index.php/Public_Ajax_BINTree?clusterguid=BOLD:AAD8242&clusteruri=BOLD:AAD8242&clustername=BIN38242&clusterid=1738567
It's not clear where the "clusterid" comes from, but without it the API call fails.
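For reference, the call above can be assembled from its four query parameters. This Python sketch only builds the URL and makes no request; the function name is hypothetical, and the parameter values are the ones shown above. Since the API is undocumented, whether it still responds this way is an assumption.

```python
from urllib.parse import urlencode

BASE = "http://www.boldsystems.org/index.php/Public_Ajax_BINTree"

def bin_tree_url(cluster_uri, cluster_name, cluster_id):
    """Build the BINTree URL; clusterguid and clusteruri carry the same BIN URI."""
    params = {
        "clusterguid": cluster_uri,
        "clusteruri": cluster_uri,
        "clustername": cluster_name,
        "clusterid": cluster_id,  # origin unclear, but required by the call
    }
    # safe=":" keeps the colon in "BOLD:AAD8242" unescaped, as in the original URL.
    return BASE + "?" + urlencode(params, safe=":")

print(bin_tree_url("BOLD:AAD8242", "BIN38242", 1738567))
```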
rdmpage / traits.html
Last active August 29, 2015 14:17
CouchDB-like view of EOL Trait data
<html>
<head>
<title>CouchDB-like view of EOL Trait data</title>
<meta charset="UTF-8"/>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.2/jquery.min.js"></script>
</head>
<body>
<h1>CouchDB-like view of EOL Trait data</h1>
<p>API call like this: http://eol.org/api/traits/26374</p>
<div id="output"></div>
{
"created_at": "Thu Mar 12 21:17:00 +0000 2015",
"id": 576129791759384600,
"id_str": "576129791759384576",
"text": "#myGBIF (via web client)",
"source": "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>",
"truncated": false,
"in_reply_to_status_id": null,
"in_reply_to_status_id_str": null,
"in_reply_to_user_id": null,