@nickynicolson
Created November 11, 2020 20:58
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Reconciling author names against IPNI\n",
"\n",
"Request from [@BortEdwards](https://twitter.com/@BortEdwards): \n",
"\n",
"> I have 6 large plant databases. Want to standardize author names before harmonizing taxa across them. Thought easiest way would be to reduce all author names to their abbreviations 1st - but relies on a look-up list...\n",
"\n",
"Kew provides [pykew](https://github.com/RBGKew/pykew), a Python library for querying the APIs of its nomenclatural and taxonomic databases."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Install pykew and pandas (the `%%capture` cell magic hides the pip output)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"%%capture\n",
"! pip install pykew\n",
"! pip install pandas"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Import pykew, build a query, and execute it."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import pykew.ipni as ipni\n",
"from pykew.ipni_terms import Author\n",
"\n",
"# Search terms supported for the Author type:\n",
"# forename\n",
"# full_name\n",
"# standard_form\n",
"# surname\n",
"\n",
"query = { Author.surname: 'Blackwell' }\n",
"res = ipni.search(query)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Results are returned as an iterator, so you can get the result count without retrieving every result."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"6\n"
]
}
],
"source": [
"print(res.size())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Each result is a dictionary. The following fields are available for each result:\n",
"\n",
"- alternativeNames\n",
"- dates\n",
"- forename\n",
"- fqId\n",
"- id\n",
"- recordType\n",
"- source\n",
"- standardForm\n",
"- surname\n",
"- suppressed\n",
"- taxonGroups\n",
"- url\n",
"- version\n",
"- summary\n",
"- hasBhlLink"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Load the results into a pandas DataFrame and display them as a table."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>alternativeAbbreviations</th>\n",
" <th>alternativeNames</th>\n",
" <th>dates</th>\n",
" <th>examples</th>\n",
" <th>forename</th>\n",
" <th>fqId</th>\n",
" <th>id</th>\n",
" <th>isoCountries</th>\n",
" <th>notes</th>\n",
" <th>recordType</th>\n",
" <th>...</th>\n",
" <th>surname</th>\n",
" <th>suppressed</th>\n",
" <th>taxonGroups</th>\n",
" <th>url</th>\n",
" <th>version</th>\n",
" <th>summary</th>\n",
" <th>hasBhlLink</th>\n",
" <th>bhlPageLink</th>\n",
" <th>source</th>\n",
" <th>comments</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>0</td>\n",
" <td>Blackw. From TL2;Blackw. From Meikle</td>\n",
" <td></td>\n",
" <td>1707-1758</td>\n",
" <td>Amomum verum Blackw. in Herb. Blackwell. 4: t....</td>\n",
" <td>Elizabeth</td>\n",
" <td>urn:lsid:ipni.org:authors:830-1</td>\n",
" <td>830-1</td>\n",
" <td>United Kingdom</td>\n",
" <td>First female author of a plant name.</td>\n",
" <td>author</td>\n",
" <td>...</td>\n",
" <td>Blackwell</td>\n",
" <td>False</td>\n",
" <td>Spermatophytes</td>\n",
" <td>/a/830-1</td>\n",
" <td>1.1.1.2</td>\n",
" <td>Blackwell, Elizabeth (1707-1758)</td>\n",
" <td>True</td>\n",
" <td>http://www.biodiversitylibrary.org/openurl?ctx...</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>1</td>\n",
" <td>K.Blackw. From Meikle</td>\n",
" <td></td>\n",
" <td>fl. 1975</td>\n",
" <td>J. Elisha Mit. Soc. 90(4): 137. 1975</td>\n",
" <td>Kay P.</td>\n",
" <td>urn:lsid:ipni.org:authors:831-1</td>\n",
" <td>831-1</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>author</td>\n",
" <td>...</td>\n",
" <td>Blackwell</td>\n",
" <td>False</td>\n",
" <td>Spermatophytes</td>\n",
" <td>/a/831-1</td>\n",
" <td>1.1</td>\n",
" <td>Blackwell, Kay P. (fl. 1975)</td>\n",
" <td>False</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>2</td>\n",
" <td>NaN</td>\n",
" <td></td>\n",
" <td>1940-</td>\n",
" <td>NaN</td>\n",
" <td>Meredith</td>\n",
" <td>urn:lsid:ipni.org:authors:16189-1</td>\n",
" <td>16189-1</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>author</td>\n",
" <td>...</td>\n",
" <td>Blackwell</td>\n",
" <td>False</td>\n",
" <td>Mycology</td>\n",
" <td>/a/16189-1</td>\n",
" <td>1.1</td>\n",
" <td>Blackwell, Meredith (1940-)</td>\n",
" <td>False</td>\n",
" <td>NaN</td>\n",
" <td>CMI</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3</td>\n",
" <td>W.Blackw. From Meikle</td>\n",
" <td></td>\n",
" <td>1939-</td>\n",
" <td>Asclepias prostrata W.H.Blackw., Southw. Natur...</td>\n",
" <td>Will Hoyle</td>\n",
" <td>urn:lsid:ipni.org:authors:832-1</td>\n",
" <td>832-1</td>\n",
" <td>United States</td>\n",
" <td>NaN</td>\n",
" <td>author</td>\n",
" <td>...</td>\n",
" <td>Blackwell</td>\n",
" <td>False</td>\n",
" <td>Mycology, Algae, Spermatophytes</td>\n",
" <td>/a/832-1</td>\n",
" <td>1.1.2.1</td>\n",
" <td>Blackwell, Will Hoyle (1939-)</td>\n",
" <td>False</td>\n",
" <td>NaN</td>\n",
" <td>From Correll &amp; Johnson 'Manual of the Vascular...</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4</td>\n",
" <td>NaN</td>\n",
" <td>Cox, P.J.B.</td>\n",
" <td>1954-</td>\n",
" <td>Rudbeckia texana (Perdue) P.B. Cox &amp; L.E. Urba...</td>\n",
" <td>Patricia Blackwell</td>\n",
" <td>urn:lsid:ipni.org:authors:14655-1</td>\n",
" <td>14655-1</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>author</td>\n",
" <td>...</td>\n",
" <td>Cox</td>\n",
" <td>False</td>\n",
" <td>Spermatophytes</td>\n",
" <td>/a/14655-1</td>\n",
" <td>1.2</td>\n",
" <td>Cox, Patricia Blackwell (1954-)</td>\n",
" <td>False</td>\n",
" <td>NaN</td>\n",
" <td>D.H. Kent 1990. Date and full name from 'R.D.T...</td>\n",
" <td>P.J.B.Cox after plant names in Castanea 59(4):...</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5</td>\n",
" <td>F.Forbes From TL2;F.B.Forbes From Meikle</td>\n",
" <td></td>\n",
" <td>1839-1908</td>\n",
" <td>NaN</td>\n",
" <td>Francis Blackwell</td>\n",
" <td>urn:lsid:ipni.org:authors:2814-1</td>\n",
" <td>2814-1</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>author</td>\n",
" <td>...</td>\n",
" <td>Forbes</td>\n",
" <td>False</td>\n",
" <td>Spermatophytes</td>\n",
" <td>/a/2814-1</td>\n",
" <td>1.1</td>\n",
" <td>Forbes, Francis Blackwell (1839-1908)</td>\n",
" <td>False</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>6 rows × 21 columns</p>\n",
"</div>"
],
"text/plain": [
" alternativeAbbreviations alternativeNames dates \\\n",
"0 Blackw. From TL2;Blackw. From Meikle 1707-1758 \n",
"1 K.Blackw. From Meikle fl. 1975 \n",
"2 NaN 1940- \n",
"3 W.Blackw. From Meikle 1939- \n",
"4 NaN Cox, P.J.B. 1954- \n",
"5 F.Forbes From TL2;F.B.Forbes From Meikle 1839-1908 \n",
"\n",
" examples forename \\\n",
"0 Amomum verum Blackw. in Herb. Blackwell. 4: t.... Elizabeth \n",
"1 J. Elisha Mit. Soc. 90(4): 137. 1975 Kay P. \n",
"2 NaN Meredith \n",
"3 Asclepias prostrata W.H.Blackw., Southw. Natur... Will Hoyle \n",
"4 Rudbeckia texana (Perdue) P.B. Cox & L.E. Urba... Patricia Blackwell \n",
"5 NaN Francis Blackwell \n",
"\n",
" fqId id isoCountries \\\n",
"0 urn:lsid:ipni.org:authors:830-1 830-1 United Kingdom \n",
"1 urn:lsid:ipni.org:authors:831-1 831-1 NaN \n",
"2 urn:lsid:ipni.org:authors:16189-1 16189-1 NaN \n",
"3 urn:lsid:ipni.org:authors:832-1 832-1 United States \n",
"4 urn:lsid:ipni.org:authors:14655-1 14655-1 NaN \n",
"5 urn:lsid:ipni.org:authors:2814-1 2814-1 NaN \n",
"\n",
" notes recordType ... surname suppressed \\\n",
"0 First female author of a plant name. author ... Blackwell False \n",
"1 NaN author ... Blackwell False \n",
"2 NaN author ... Blackwell False \n",
"3 NaN author ... Blackwell False \n",
"4 NaN author ... Cox False \n",
"5 NaN author ... Forbes False \n",
"\n",
" taxonGroups url version \\\n",
"0 Spermatophytes /a/830-1 1.1.1.2 \n",
"1 Spermatophytes /a/831-1 1.1 \n",
"2 Mycology /a/16189-1 1.1 \n",
"3 Mycology, Algae, Spermatophytes /a/832-1 1.1.2.1 \n",
"4 Spermatophytes /a/14655-1 1.2 \n",
"5 Spermatophytes /a/2814-1 1.1 \n",
"\n",
" summary hasBhlLink \\\n",
"0 Blackwell, Elizabeth (1707-1758) True \n",
"1 Blackwell, Kay P. (fl. 1975) False \n",
"2 Blackwell, Meredith (1940-) False \n",
"3 Blackwell, Will Hoyle (1939-) False \n",
"4 Cox, Patricia Blackwell (1954-) False \n",
"5 Forbes, Francis Blackwell (1839-1908) False \n",
"\n",
" bhlPageLink \\\n",
"0 http://www.biodiversitylibrary.org/openurl?ctx... \n",
"1 NaN \n",
"2 NaN \n",
"3 NaN \n",
"4 NaN \n",
"5 NaN \n",
"\n",
" source \\\n",
"0 NaN \n",
"1 NaN \n",
"2 CMI \n",
"3 From Correll & Johnson 'Manual of the Vascular... \n",
"4 D.H. Kent 1990. Date and full name from 'R.D.T... \n",
"5 NaN \n",
"\n",
" comments \n",
"0 NaN \n",
"1 NaN \n",
"2 NaN \n",
"3 NaN \n",
"4 P.J.B.Cox after plant names in Castanea 59(4):... \n",
"5 NaN \n",
"\n",
"[6 rows x 21 columns]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas as pd\n",
"df = pd.DataFrame.from_dict(res)\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Links to other datasources\n",
"\n",
"### Biodiversity Heritage Library\n",
"An author record can include a link to the Biodiversity Heritage Library, pointing to the author's biographical statement in [Taxonomic Literature 2](https://www.biodiversitylibrary.org/bibliography/48631) when one is available (see the `hasBhlLink` and `bhlPageLink` fields).\n",
"\n",
"Example: Marjan Raciborski (https://www.ipni.org/a/8071-1) is listed in [TL-2 1:529](https://www.biodiversitylibrary.org/page/33190000).\n",
"\n",
"### Wikidata\n",
"Many authors are represented in Wikidata, which has a property for the IPNI author identifier ([property P586](https://www.wikidata.org/wiki/Property:P586)). As of 2020-11-11, 53,131 Wikidata items include an IPNI author ID.\n",
"\n",
"These can be retrieved by running a [SPARQL query](https://w.wiki/m7R) against the Wikidata Query Service."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
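
To tie the notebook back to the original request (reducing author names to their abbreviations), the result dictionaries could be turned into a lookup from full name to standard form. The following is a minimal sketch, not part of the notebook or of pykew: the helper names `normalise` and `build_abbreviation_lookup` are hypothetical, and the `standardForm` values in the sample records are illustrative stand-ins inferred from the `examples` column above rather than copied from API output.

```python
from typing import Dict, Iterable, Mapping


def normalise(name: str) -> str:
    """Lower-case a name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def build_abbreviation_lookup(records: Iterable[Mapping]) -> Dict[str, str]:
    """Map 'forename surname' to the record's standardForm.

    Records flagged as suppressed, or lacking a standard form, are skipped.
    """
    lookup: Dict[str, str] = {}
    for rec in records:
        if rec.get("suppressed") or not rec.get("standardForm"):
            continue
        key = normalise(f"{rec.get('forename', '')} {rec.get('surname', '')}")
        lookup[key] = rec["standardForm"]
    return lookup


# Illustrative records shaped like the search results shown in the notebook.
# The standardForm values are assumptions made for this example.
sample_records = [
    {"forename": "Elizabeth", "surname": "Blackwell",
     "standardForm": "Blackw.", "suppressed": False},
    {"forename": "Will Hoyle", "surname": "Blackwell",
     "standardForm": "W.H.Blackw.", "suppressed": False},
]

lookup = build_abbreviation_lookup(sample_records)
print(lookup[normalise("Elizabeth  Blackwell")])  # Blackw.
```

Since the notebook states that the search results iterate as dictionaries, something like `build_abbreviation_lookup(ipni.search(query))` should work against live results, though names shared by several authors would overwrite one another and need disambiguating (for example by `dates` or `taxonGroups`).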