@drorata
Created December 9, 2014 13:54
Print mixed dtypes in columns of DataFrame
{
"metadata": {
"name": "",
"signature": "sha256:63382758aa4c42ca13e1a47f0deebe1617a89f35f5fcc00ce9f6b2e24efa2e9e"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"import pandas as pd"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 1
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Consider the following simple DataFrame"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"df = pd.DataFrame([[1,2],[3,4]])"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 2
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It has the following types:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"df.dtypes"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 3,
"text": [
"0 int64\n",
"1 int64\n",
"dtype: object"
]
}
],
"prompt_number": 3
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"However, if we append another row that contains a float:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"df2 = pd.concat([df, pd.DataFrame([[10.,11]])])\n",
"df2"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>0</th>\n",
" <th>1</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> 1</td>\n",
" <td> 2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> 3</td>\n",
" <td> 4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> 10</td>\n",
" <td> 11</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 4,
"text": [
" 0 1\n",
"0 1 2\n",
"1 3 4\n",
"0 10 11"
]
}
],
"prompt_number": 4
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then we see that the dtype of the first column has been upcast to float64:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"df2.dtypes"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 5,
"text": [
"0 float64\n",
"1 int64\n",
"dtype: object"
]
}
],
"prompt_number": 5
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is not so bad... at the momment. But if we change the DataFrame further:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"df3 = pd.concat([df, pd.DataFrame([[10.1,11]])])\n",
"df3"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>0</th>\n",
" <th>1</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> 1.0</td>\n",
" <td> 2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> 3.0</td>\n",
" <td> 4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> 10.1</td>\n",
" <td> 11</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 6,
"text": [
" 0 1\n",
"0 1.0 2\n",
"1 3.0 4\n",
"0 10.1 11"
]
}
],
"prompt_number": 6
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is already annoying. The integers (that were casted into floats) are now printed as floats. A workaround is to define a printing function:"
]
},
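{
"cell_type": "code",
"collapsed": false,
"input": [
"def printFloat(x):\n",
"    # Print whole-valued floats without the trailing '.0'; NaN and inf fall through to str().\n",
"    if np.isfinite(x) and np.modf(x)[0] == 0:\n",
"        return str(int(x))\n",
"    else:\n",
"        return str(x)"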
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 7
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"and pass it to Pandas:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"pd.options.display.float_format = printFloat"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 8
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, printing df3 yields:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"df3"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>0</th>\n",
" <th>1</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> 1</td>\n",
" <td> 2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> 3</td>\n",
" <td> 4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>10.1</td>\n",
" <td> 11</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 9,
"text": [
" 0 1\n",
"0 1 2\n",
"1 3 4\n",
"0 10.1 11"
]
}
],
"prompt_number": 9
}
],
"metadata": {}
}
]
}
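
The notebook above sets `pd.options.display.float_format` globally, so every DataFrame printed afterwards in the session is affected. A minimal sketch of a scoped alternative follows, assuming the same `df3` as in the notebook. The helper name `format_float` and the variables `scoped` and `unscoped` are hypothetical and exist only for illustration; `pd.option_context` is the pandas context manager for temporarily setting options.

```python
import numpy as np
import pandas as pd


def format_float(x):
    """Render whole-valued finite floats without a decimal part (hypothetical helper)."""
    # NaN and +/-inf cannot be converted to int, so they fall back to str().
    if np.isfinite(x) and float(x).is_integer():
        return str(int(x))
    return str(x)


# Same data as df3 in the notebook: column 0 is upcast to float64.
df = pd.DataFrame([[1, 2], [3, 4]])
df3 = pd.concat([df, pd.DataFrame([[10.1, 11]])])

# The formatter applies only inside the with-block; the global option is untouched.
with pd.option_context("display.float_format", format_float):
    scoped = df3.to_string()

# Outside the block, pandas uses its default float rendering again.
unscoped = df3.to_string()

print(scoped)
print(unscoped)
```

Only the rendered text changes here. The column's dtype is still float64 in both cases, so this affects display and not the stored values.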