@lewisacidic
Last active August 29, 2015 14:22
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Example of proposed pandas rich rendering improvements"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from IPython.display import Image\n",
"from IPython.display import HTML\n",
"from IPython.display import display\n",
"import pandas as pd\n",
"import pandas.core.common as com"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Function that provides rich repr objects:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def flag(country_code):\n",
"    # Build an IPython Image that points at the country's flag by URL.\n",
"    return Image(width=200, url='http://images.nationmaster.com/images/flags/{}-lgflag.gif'.format(country_code))"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"a = flag('br')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"####Object's rich display:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<img src=\"http://images.nationmaster.com/images/flags/br-lgflag.gif\" width=\"200\"/>"
],
"text/plain": [
"<IPython.core.display.Image object>"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a ## or display(a)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is a useful feature of IPython notebooks :)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"####Str, repr and pandas prettyprint value:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"('<IPython.core.display.Image object>',\n",
" '<IPython.core.display.Image object>',\n",
" u'<IPython.core.display.Image object>')"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"repr(a), str(a), pd.core.common.pprint_thing(a)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"These string implementations are not informative when printed."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"####Result of implemented `_repr_html_` :"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"u'<img src=\"http://images.nationmaster.com/images/flags/br-lgflag.gif\" width=\"200\"/>'"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a._repr_html_()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is a string value that is interpretable..."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"###Pandas behaviour at the moment:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"countries = ['us', 'uk', 'br', 'nz']"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>0</th>\n",
" <th>1</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>us</td>\n",
" <td>&lt;IPython.core.display.Image object&gt;</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>uk</td>\n",
" <td>&lt;IPython.core.display.Image object&gt;</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>br</td>\n",
" <td>&lt;IPython.core.display.Image object&gt;</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>nz</td>\n",
" <td>&lt;IPython.core.display.Image object&gt;</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" 0 1\n",
"0 us <IPython.core.display.Image object>\n",
"1 uk <IPython.core.display.Image object>\n",
"2 br <IPython.core.display.Image object>\n",
"3 nz <IPython.core.display.Image object>"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.DataFrame([(c, flag(c)) for c in countries]); df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Pandas DataFrame's `_repr_html_` uses the string values (via panda's pretty print), not the rich value in the table."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>0</th>\n",
" <th>1</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>us</td>\n",
" <td>&lt;IPython.core.display.Image object&gt;</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>uk</td>\n",
" <td>&lt;IPython.core.display.Image object&gt;</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>br</td>\n",
" <td>&lt;IPython.core.display.Image object&gt;</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>nz</td>\n",
" <td>&lt;IPython.core.display.Image object&gt;</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>\n"
]
}
],
"source": [
"print(df._repr_html_())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If an object's HTML were used **by default** in place of the escaped pretty-printed string, pandas DataFrames would be richer and easier to work with during analysis, and especially when preparing a presentable document. Something like this:\n"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>0</th>\n",
" <th>1</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>us</td>\n",
" <td><img src=\"http://images.nationmaster.com/images/flags/us-lgflag.gif\" width=\"200\"/></td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>uk</td>\n",
" <td><img src=\"http://images.nationmaster.com/images/flags/uk-lgflag.gif\" width=\"200\"/></td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>br</td>\n",
" <td><img src=\"http://images.nationmaster.com/images/flags/br-lgflag.gif\" width=\"200\"/></td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>nz</td>\n",
" <td><img src=\"http://images.nationmaster.com/images/flags/nz-lgflag.gif\" width=\"200\"/></td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"HTML(\"\"\"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>0</th>\n",
" <th>1</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>us</td>\n",
" <td><img src=\"http://images.nationmaster.com/images/flags/us-lgflag.gif\" width=\"200\"/></td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>uk</td>\n",
" <td><img src=\"http://images.nationmaster.com/images/flags/uk-lgflag.gif\" width=\"200\"/></td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>br</td>\n",
" <td><img src=\"http://images.nationmaster.com/images/flags/br-lgflag.gif\" width=\"200\"/></td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>nz</td>\n",
" <td><img src=\"http://images.nationmaster.com/images/flags/nz-lgflag.gif\" width=\"200\"/></td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>\"\"\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Problems"
]
},
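{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Security\n",
"Currently pandas escapes each cell's string before writing it into the table. With the proposed change, an object's HTML would be inserted unescaped."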
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"class AnnoyingObject(object):\n",
" def _repr_html_(self):\n",
" return \"<script>alert('Could be much worse :)')</script>\""
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>0</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>&lt;__main__.AnnoyingObject object at 0x110f56fd0&gt;</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" 0\n",
"0 <__main__.AnnoyingObject object at 0x110f56fd0>"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pd.DataFrame([AnnoyingObject()])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With the proposed change, the script above would be inserted into the table unescaped, which is a problem. However, this is arguably an issue for the notebook rather than for pandas, since the notebook does the same thing when the object is displayed on its own."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Layout\n",
"\n",
"The rendered HTML should be limited in size. Without a limit, full-size images make the table obnoxiously large:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": false,
"scrolled": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>0</th>\n",
" <th>1</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>us</td>\n",
" <td ><img src=\"http://images.nationmaster.com/images/flags/us-lgflag.gif\" /></td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>uk</td>\n",
" <td><img src=\"http://images.nationmaster.com/images/flags/uk-lgflag.gif\" /></td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>br</td>\n",
" <td><img src=\"http://images.nationmaster.com/images/flags/br-lgflag.gif\" /></td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>nz</td>\n",
" <td><img src=\"http://images.nationmaster.com/images/flags/nz-lgflag.gif\" /></td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"HTML(\"\"\"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>0</th>\n",
" <th>1</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>us</td>\n",
" <td ><img src=\"http://images.nationmaster.com/images/flags/us-lgflag.gif\" /></td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>uk</td>\n",
" <td><img src=\"http://images.nationmaster.com/images/flags/uk-lgflag.gif\" /></td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>br</td>\n",
" <td><img src=\"http://images.nationmaster.com/images/flags/br-lgflag.gif\" /></td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>nz</td>\n",
" <td><img src=\"http://images.nationmaster.com/images/flags/nz-lgflag.gif\" /></td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>\"\"\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"One option is to set a `width` (or `max-width`) on every `td`. Something like this:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>0</th>\n",
" <th>1</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>us</td>\n",
" <td style=\"max-width:200px\"><img src=\"http://images.nationmaster.com/images/flags/us-lgflag.gif\" /></td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>uk</td>\n",
" <td style=\"max-width:200px\"><img src=\"http://images.nationmaster.com/images/flags/uk-lgflag.gif\" /></td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>br</td>\n",
" <td style=\"max-width:200px\"><img src=\"http://images.nationmaster.com/images/flags/br-lgflag.gif\" /></td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>nz</td>\n",
" <td style=\"max-width:200px\"><img src=\"http://images.nationmaster.com/images/flags/nz-lgflag.gif\" /></td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"HTML(\"\"\"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>0</th>\n",
" <th>1</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>us</td>\n",
" <td style=\"max-width:200px\"><img src=\"http://images.nationmaster.com/images/flags/us-lgflag.gif\" /></td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>uk</td>\n",
" <td style=\"max-width:200px\"><img src=\"http://images.nationmaster.com/images/flags/uk-lgflag.gif\" /></td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>br</td>\n",
" <td style=\"max-width:200px\"><img src=\"http://images.nationmaster.com/images/flags/br-lgflag.gif\" /></td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>nz</td>\n",
" <td style=\"max-width:200px\"><img src=\"http://images.nationmaster.com/images/flags/nz-lgflag.gif\" /></td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>\"\"\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Perhaps pandas should even ship a CSS class that is loaded whenever pandas is imported in the notebook?"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.10"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
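
The proposal in the notebook above can be sketched without touching pandas. What follows is a minimal illustration, not pandas code: `cell_html`, `rich_table` and the `Flag` class are hypothetical names introduced here, and `Flag` only stands in for `IPython.display.Image`. It assumes Python 3 (it uses `html.escape`), whereas the notebook ran on Python 2. Each cell uses the object's `_repr_html_` when one exists and falls back to the escaped string otherwise, and every `td` carries the `max-width` suggested in the Layout section.

```python
import html


def cell_html(value, max_width_px=200):
    """Render one table cell as a <td> element.

    Uses the object's own _repr_html_ when it provides one (the proposed
    behaviour); otherwise falls back to the escaped string form, which is
    roughly what the notebook shows pandas doing today.
    """
    rich = getattr(value, "_repr_html_", None)
    # _repr_html_ may legitimately return None, so treat that as "no rich form".
    body = rich() if callable(rich) else None
    if body is None:
        body = html.escape(str(value))
    # Note: a rich body is inserted unescaped; this is the security trade-off.
    return '<td style="max-width:{}px">{}</td>'.format(max_width_px, body)


def rich_table(rows, max_width_px=200):
    """Build a minimal HTML table from an iterable of row tuples."""
    lines = ['<table border="1" class="dataframe">', "  <tbody>"]
    for i, row in enumerate(rows):
        cells = "".join(cell_html(v, max_width_px) for v in row)
        lines.append("    <tr><th>{}</th>{}</tr>".format(i, cells))
    lines += ["  </tbody>", "</table>"]
    return "\n".join(lines)


class Flag(object):
    """Hypothetical stand-in for IPython.display.Image(url=...)."""

    def __init__(self, country_code):
        self.url = (
            "http://images.nationmaster.com/images/flags/"
            "{}-lgflag.gif".format(country_code)
        )

    def _repr_html_(self):
        return '<img src="{}" width="200"/>'.format(html.escape(self.url))


table = rich_table([(c, Flag(c)) for c in ["us", "uk", "br", "nz"]])
```

In a notebook, passing `table` to `HTML` from `IPython.display` should render the result. Plain strings such as the country codes go through `html.escape`, so only objects that opt in via `_repr_html_` bypass escaping.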