@tonyfast
Created August 19, 2021 00:26
using templates to generate alt text for figures in matplotlib
{
"cells": [
{
"cell_type": "markdown",
"id": "f1dd85ab-e176-4915-bf8a-79adf54f942d",
"metadata": {},
"source": [
"# generating alt text for an image from a dataframe.\n",
"\n",
"to share generated alt text we need to understand that both the plots and alt text are projections of a dataframe;\n",
"one in pure form and the other in pure typography. to generate an example scenario we need:\n",
"1. a dataframe\n",
"2. a plot of the dataframe\n",
"3. formatted text derived from the dataframe\n",
"\n",
"warning: the alt text in this example can be improved; please help me. i wanted to demonstrate an end-to-end workflow for generated alt text.\n",
"\n",
"consult with [chartability](https://chartability.fizz.studio/ \"a methodology for ensuring that data visualizations, systems, and interfaces are accessible\")."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "83b2c83c-ff70-4c71-a8cd-7f3bd2a4a1b7",
"metadata": {},
"outputs": [],
"source": [
" %matplotlib agg\n",
" import pandas, IPython.display as display, io, jinja2, base64"
]
},
{
"cell_type": "markdown",
"id": "312d441d-0c83-4c2b-9c75-23b833d34f2b",
"metadata": {},
"source": [
"create some sample data `df`"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "60e47590-5c96-4575-b5e5-433dfd6ae9a8",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/tonyfast/miniforge3/lib/python3.9/site-packages/pandas/util/__init__.py:15: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.\n",
" import pandas.util.testing\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>DeWbsEAqVl</th>\n",
" <th>u7Shqke0mg</th>\n",
" <th>DNIa9Ydxjh</th>\n",
" <th>JhBydInTBG</th>\n",
" <th>QzFCquTMgL</th>\n",
" <th>iiTLaCjbvK</th>\n",
" <th>eTJR8I4inM</th>\n",
" <th>VeRukIzX7j</th>\n",
" <th>3t3PjjOxRt</th>\n",
" <th>7uNCtGznIg</th>\n",
" <th>...</th>\n",
" <th>ZWDCUmkC9H</th>\n",
" <th>SKxnmjbznW</th>\n",
" <th>OJZQgDldhm</th>\n",
" <th>d8jXEUjVX1</th>\n",
" <th>zZwmXOGn2R</th>\n",
" <th>itPeesfNjT</th>\n",
" <th>nCUaPaX4hN</th>\n",
" <th>7XYLZeOtEn</th>\n",
" <th>7eun8Zwowd</th>\n",
" <th>bvKwo0nVBO</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>A</th>\n",
" <td>-0.513266</td>\n",
" <td>-0.569885</td>\n",
" <td>1.046707</td>\n",
" <td>-0.040230</td>\n",
" <td>0.252344</td>\n",
" <td>-0.869195</td>\n",
" <td>0.400589</td>\n",
" <td>1.186798</td>\n",
" <td>1.121367</td>\n",
" <td>0.379662</td>\n",
" <td>...</td>\n",
" <td>-0.295613</td>\n",
" <td>-0.130283</td>\n",
" <td>1.396725</td>\n",
" <td>1.436834</td>\n",
" <td>-0.376181</td>\n",
" <td>0.083873</td>\n",
" <td>0.432758</td>\n",
" <td>0.569672</td>\n",
" <td>1.730417</td>\n",
" <td>-0.608284</td>\n",
" </tr>\n",
" <tr>\n",
" <th>B</th>\n",
" <td>-0.053815</td>\n",
" <td>1.635761</td>\n",
" <td>0.236135</td>\n",
" <td>0.674369</td>\n",
" <td>0.299866</td>\n",
" <td>-0.743593</td>\n",
" <td>1.655680</td>\n",
" <td>-0.240112</td>\n",
" <td>0.203308</td>\n",
" <td>1.455075</td>\n",
" <td>...</td>\n",
" <td>0.017701</td>\n",
" <td>-0.122268</td>\n",
" <td>-0.969117</td>\n",
" <td>-0.305535</td>\n",
" <td>-1.006477</td>\n",
" <td>-0.807684</td>\n",
" <td>-0.169746</td>\n",
" <td>1.277646</td>\n",
" <td>0.098450</td>\n",
" <td>-0.430554</td>\n",
" </tr>\n",
" <tr>\n",
" <th>C</th>\n",
" <td>-1.979477</td>\n",
" <td>0.762999</td>\n",
" <td>0.931630</td>\n",
" <td>-0.867674</td>\n",
" <td>1.146874</td>\n",
" <td>-0.828406</td>\n",
" <td>1.091406</td>\n",
" <td>-0.166953</td>\n",
" <td>-0.571468</td>\n",
" <td>2.372057</td>\n",
" <td>...</td>\n",
" <td>1.664442</td>\n",
" <td>0.899626</td>\n",
" <td>-1.266544</td>\n",
" <td>1.674116</td>\n",
" <td>-1.782052</td>\n",
" <td>-1.735131</td>\n",
" <td>0.320543</td>\n",
" <td>-0.457179</td>\n",
" <td>1.483902</td>\n",
" <td>0.120748</td>\n",
" </tr>\n",
" <tr>\n",
" <th>D</th>\n",
" <td>0.234598</td>\n",
" <td>-0.114717</td>\n",
" <td>-0.136272</td>\n",
" <td>-1.135210</td>\n",
" <td>0.606916</td>\n",
" <td>-0.102985</td>\n",
" <td>-0.614704</td>\n",
" <td>-1.935681</td>\n",
" <td>-0.640178</td>\n",
" <td>-0.397213</td>\n",
" <td>...</td>\n",
" <td>0.186925</td>\n",
" <td>-2.806571</td>\n",
" <td>0.806676</td>\n",
" <td>-0.019412</td>\n",
" <td>-0.824382</td>\n",
" <td>-1.037335</td>\n",
" <td>-1.542543</td>\n",
" <td>1.022760</td>\n",
" <td>-0.300296</td>\n",
" <td>0.441676</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>4 rows × 30 columns</p>\n",
"</div>"
],
"text/plain": [
" DeWbsEAqVl u7Shqke0mg DNIa9Ydxjh JhBydInTBG QzFCquTMgL iiTLaCjbvK \\\n",
"A -0.513266 -0.569885 1.046707 -0.040230 0.252344 -0.869195 \n",
"B -0.053815 1.635761 0.236135 0.674369 0.299866 -0.743593 \n",
"C -1.979477 0.762999 0.931630 -0.867674 1.146874 -0.828406 \n",
"D 0.234598 -0.114717 -0.136272 -1.135210 0.606916 -0.102985 \n",
"\n",
" eTJR8I4inM VeRukIzX7j 3t3PjjOxRt 7uNCtGznIg ... ZWDCUmkC9H \\\n",
"A 0.400589 1.186798 1.121367 0.379662 ... -0.295613 \n",
"B 1.655680 -0.240112 0.203308 1.455075 ... 0.017701 \n",
"C 1.091406 -0.166953 -0.571468 2.372057 ... 1.664442 \n",
"D -0.614704 -1.935681 -0.640178 -0.397213 ... 0.186925 \n",
"\n",
" SKxnmjbznW OJZQgDldhm d8jXEUjVX1 zZwmXOGn2R itPeesfNjT nCUaPaX4hN \\\n",
"A -0.130283 1.396725 1.436834 -0.376181 0.083873 0.432758 \n",
"B -0.122268 -0.969117 -0.305535 -1.006477 -0.807684 -0.169746 \n",
"C 0.899626 -1.266544 1.674116 -1.782052 -1.735131 0.320543 \n",
"D -2.806571 0.806676 -0.019412 -0.824382 -1.037335 -1.542543 \n",
"\n",
" 7XYLZeOtEn 7eun8Zwowd bvKwo0nVBO \n",
"A 0.569672 1.730417 -0.608284 \n",
"B 1.277646 0.098450 -0.430554 \n",
"C -0.457179 1.483902 0.120748 \n",
"D 1.022760 -0.300296 0.441676 \n",
"\n",
"[4 rows x 30 columns]"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
" df = pandas.util.testing.makeDataFrame(); df.T"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "e0e6f2dd-cdb3-4846-ba2c-40a35488044c",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>count</th>\n",
" <th>mean</th>\n",
" <th>std</th>\n",
" <th>min</th>\n",
" <th>25%</th>\n",
" <th>50%</th>\n",
" <th>75%</th>\n",
" <th>max</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>A</th>\n",
" <td>30.0</td>\n",
" <td>0.309370</td>\n",
" <td>0.864548</td>\n",
" <td>-0.891069</td>\n",
" <td>-0.356039</td>\n",
" <td>0.113096</td>\n",
" <td>1.033456</td>\n",
" <td>2.519034</td>\n",
" </tr>\n",
" <tr>\n",
" <th>B</th>\n",
" <td>30.0</td>\n",
" <td>0.119557</td>\n",
" <td>0.923898</td>\n",
" <td>-1.570313</td>\n",
" <td>-0.563011</td>\n",
" <td>0.055682</td>\n",
" <td>0.754481</td>\n",
" <td>1.752108</td>\n",
" </tr>\n",
" <tr>\n",
" <th>C</th>\n",
" <td>30.0</td>\n",
" <td>-0.007826</td>\n",
" <td>1.213352</td>\n",
" <td>-1.979477</td>\n",
" <td>-0.872796</td>\n",
" <td>0.052330</td>\n",
" <td>0.923629</td>\n",
" <td>2.372057</td>\n",
" </tr>\n",
" <tr>\n",
" <th>D</th>\n",
" <td>30.0</td>\n",
" <td>-0.505479</td>\n",
" <td>0.884101</td>\n",
" <td>-2.806571</td>\n",
" <td>-1.074367</td>\n",
" <td>-0.393693</td>\n",
" <td>0.135341</td>\n",
" <td>1.022760</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" count mean std min 25% 50% 75% max\n",
"A 30.0 0.309370 0.864548 -0.891069 -0.356039 0.113096 1.033456 2.519034\n",
"B 30.0 0.119557 0.923898 -1.570313 -0.563011 0.055682 0.754481 1.752108\n",
"C 30.0 -0.007826 1.213352 -1.979477 -0.872796 0.052330 0.923629 2.372057\n",
"D 30.0 -0.505479 0.884101 -2.806571 -1.074367 -0.393693 0.135341 1.022760"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
" statistics = df.describe(); statistics.T"
]
},
{
"cell_type": "markdown",
"id": "0237333b-0413-463a-a7c7-a58154480714",
"metadata": {},
"source": [
"## about the data"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "ccbf698d-72bf-47a2-8ba0-45794305af51",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"`df` is a dataframe with 30 rows and 4 columns with the names A, B, C, D"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
" display.Markdown(F\"`df` is a dataframe with {len(df)} rows and {len(df.columns)} columns with the names {', '.join(df.columns)}\")"
]
},
{
"cell_type": "markdown",
"id": "3257f61c-ef66-49d7-9104-149f64e8d5f5",
"metadata": {},
"source": [
"## capturing the `matplotlib` figure\n",
"\n",
"using `io.BytesIO` as an in-memory buffer, we save the figure and encode the image as a base64 [data uri](https://en.wikipedia.org/wiki/Data_URI_scheme \"data uri scheme wiki\"). this first example uses a `boxplot`."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "6ae4fc03-834c-47ae-b8f2-93f97167e08d",
"metadata": {},
"outputs": [],
"source": [
" data = io.BytesIO()\n",
" df.plot.box().figure.savefig(data)\n",
" image = F\"data:image/png;base64,{base64.b64encode(data.getvalue()).decode()}\""
]
},
{
"cell_type": "markdown",
"id": "ee8406b6-f208-4922-b761-fdfde259c231",
"metadata": {},
"source": [
"write a formatted string describing the `boxplot` figure."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "1b5365d4-26bc-422b-9191-05575869c615",
"metadata": {},
"outputs": [],
"source": [
" alt = F\"\"\"A box plot showing the columns with names {\", \".join(df.columns)}. The averages for each column are: {\n",
" \", \".join(f'{k} is {v:.2f}' for k, v in statistics.loc['mean'].items())\n",
" }.\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "dfa2735f-318e-4ce4-90aa-e0a404d0349c",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"![]( \"A box plot showing the columns with names A, B, C, D. The averages for each column are: A is 0.31, B is 0.12, C is -0.01, D is -0.51.\")"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
" display.Markdown(F\"\"\"![]({image} \"{alt}\")\"\"\")"
]
},
{
"cell_type": "markdown",
"id": "8f944b44-4d06-4c7b-b171-0f52d311c536",
"metadata": {},
"source": [
"other figures will need different alt text, and `jinja2` is a strong candidate for templating. we put these concepts together in another figure below."
]
},
{
"cell_type": "markdown",
"id": "4717a172-2e75-4e64-a690-d28968abe2e0",
"metadata": {},
"source": [
"`capture` the figure as a data uri"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "b6dc4f34-4e74-4c61-9406-b2a254d8bd77",
"metadata": {},
"outputs": [],
"source": [
" def capture(figure):\n",
"     buffer = io.BytesIO()\n",
"     figure.savefig(buffer)\n",
"     return F\"data:image/png;base64,{base64.b64encode(buffer.getvalue()).decode()}\""
]
},
{
"cell_type": "markdown",
"id": "a43ae6c0-374c-408a-ad21-28491c44c94b",
"metadata": {},
"source": [
"use a template for the alt text to make an `accessible` figure."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "13f6f5f5-4c94-4ed9-bb9d-8450f205d7c1",
"metadata": {},
"outputs": [],
"source": [
" def accessible(figure, template, **kwargs):\n",
"     return display.Markdown(F\"\"\"![]({capture(figure)} \"{template.render({**globals(), **kwargs})}\")\"\"\")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "d94fae2f-747b-4b31-952a-0d44e14cbfc6",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"![]( \"A scatter plot comparing 30 points of A vs B.\")"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
" accessible(df.plot.scatter(*\"AB\").figure, jinja2.Template(\"A scatter plot comparing {{len(df)}} points of A vs B.\"), len=len)"
]
},
{
"cell_type": "markdown",
"id": "f0ec843b-194a-485b-bc6e-81101b464749",
"metadata": {},
"source": [
"These alt text examples are probably incomplete, but the goal was to demonstrate templating alt text from generated data with `pandas` and `matplotlib`."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}