@epifanio
Created November 14, 2019 12:46
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Custom names Part 3: Provisional names\n",
"Provisional names are names awaiting confirmation from an expert. They are usually coded with \"cf\", but they must be recoded so that they are not confused with uncertain identifications."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"import pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from fuzzyutil import matchinglist"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"from fuzzyutil import tidy"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"df = pd.read_csv('names_v2.csv', encoding='latin1')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## List of candidate names\n",
"Names that contain \"cf\" and whose `Status` is still `False`"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# Unresolved names containing 'cf'; na=False treats missing names as non-matches\n",
"unconfirmed = df.loc[df['Taxonomy'].str.contains('cf', na=False) & (df['Status'] == False), 'Taxonomy']"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1252"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(unconfirmed)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## List of provisional names"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"await_confirm = [\"Grantia compressa TBC\"]"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[['Grantia compressa cf', 'Grantia compressa TBC', 83]]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"corrector = matchinglist(unconfirmed, await_confirm, scorelimit=80, method='token_sort')\n",
"corrector"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* Replace each matched name with its provisional form and set its status to OK"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"# Use .loc with a boolean mask to avoid chained assignment (SettingWithCopyWarning)\n",
"for old_name, new_name, _score in corrector:\n",
"    mask = df['Taxonomy'] == old_name\n",
"    df.loc[mask, 'To_name'] = new_name\n",
"    df.loc[mask, 'Status'] = True"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"df.to_csv('names_v3.csv', index=False, encoding='latin1')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* Find Tethya citrina and replace with Craniella cranium\n",
"* Filograna implexa cf in R1017 and R1001 can be corrected to Filograna implexa\n",
"* Sort out Ophiuroidea etc."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
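
The notebook relies on `matchinglist` from the local `fuzzyutil` module, whose source is not shown here. As a rough illustration of what a token-sort fuzzy match with a score limit could look like, here is a minimal sketch built on the standard library's `difflib`. The names `token_sort_score` and `matching_list_sketch` are hypothetical, and the scoring is an assumption, not the actual `fuzzyutil` implementation.

```python
from difflib import SequenceMatcher


def token_sort_score(a, b):
    """Similarity (0-100) of two names after sorting their lowercased tokens.

    Assumption: this imitates a 'token_sort' style comparison; the real
    fuzzyutil scoring may differ.
    """
    norm_a = " ".join(sorted(a.lower().split()))
    norm_b = " ".join(sorted(b.lower().split()))
    return round(100 * SequenceMatcher(None, norm_a, norm_b).ratio())


def matching_list_sketch(candidates, targets, scorelimit=80):
    """Pair each candidate with its best-scoring target.

    Only pairs scoring at or above scorelimit are kept, each returned as
    [candidate, target, score], mirroring the shape of `corrector` above.
    """
    matches = []
    for candidate in candidates:
        best = max(targets, key=lambda t: token_sort_score(candidate, t))
        score = token_sort_score(candidate, best)
        if score >= scorelimit:
            matches.append([candidate, best, score])
    return matches


# Hypothetical demo data modelled on the names used in the notebook.
unconfirmed_demo = ["Grantia compressa cf", "Filograna implexa cf"]
await_confirm_demo = ["Grantia compressa TBC"]
corrector_demo = matching_list_sketch(unconfirmed_demo, await_confirm_demo)
```

Sorting the tokens before comparing makes the score insensitive to word order, so a trailing "cf" and a trailing "TBC" are compared as extra tokens rather than shifting the whole string. Names with no close provisional counterpart fall below the limit and are left untouched.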