Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save flamingbear/4cfac24c80fe34a67474 to your computer and use it in GitHub Desktop.
Save flamingbear/4cfac24c80fe34a67474 to your computer and use it in GitHub Desktop.
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"Pandas ordering/grouping/slicing/? MultiIndex columns\n",
"\n",
"I've managed to group my data correctly, but I would like to switch the\n",
"fastest varying index when I'm displaying my dataframe.\n",
"\n",
"Currently, I've got month varying fastest, but I would like it to be the slowest.\n",
"\n",
"Basically, I'm trying to reorder some columns so that each value is under the month, rather than each much under the value.\n",
"\n",
"Here's a sample data frame that I'm trying to work with.\n"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import pandas as pd\n",
"from StringIO import StringIO"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"TESTDATA=StringIO(\"\"\"2000, '1Jan', 45.1, 13.442, 13\n",
"2000, 'Feb', 46.1, 14.94, 17\n",
"2000, 'Mar', 25.1, 15.02, 14\n",
"2001, '1Jan', 85., 13.38, 12\n",
"2001, 'Feb', 16., 14.81, 15\n",
"2001, 'Mar', 49., 15.14, 17\n",
"2002, '1Jan', 90., 13.59, 15\n",
"2002, 'Feb', 33., 15.13, 22\n",
"2002, 'Mar', 82., 14.88, 10\n",
"2003, '1Jan', 47., 13.64, 17\n",
"2003, 'Feb', 34., 14.83, 16\n",
"2003, 'Mar', 78., 15.27, 22\"\"\")"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr>\n",
" <th></th>\n",
" <th colspan=\"3\" halign=\"left\">weight</th>\n",
" <th colspan=\"3\" halign=\"left\">extent</th>\n",
" <th colspan=\"3\" halign=\"left\">rank</th>\n",
" </tr>\n",
" <tr>\n",
" <th>month</th>\n",
" <th>'1Jan'</th>\n",
" <th>'Feb'</th>\n",
" <th>'Mar'</th>\n",
" <th>'1Jan'</th>\n",
" <th>'Feb'</th>\n",
" <th>'Mar'</th>\n",
" <th>'1Jan'</th>\n",
" <th>'Feb'</th>\n",
" <th>'Mar'</th>\n",
" </tr>\n",
" <tr>\n",
" <th>year</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>2000</th>\n",
" <td>45.1</td>\n",
" <td>46.1</td>\n",
" <td>25.1</td>\n",
" <td>13.442</td>\n",
" <td>14.94</td>\n",
" <td>15.02</td>\n",
" <td>13</td>\n",
" <td>17</td>\n",
" <td>14</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2001</th>\n",
" <td>85.0</td>\n",
" <td>16.0</td>\n",
" <td>49.0</td>\n",
" <td>13.380</td>\n",
" <td>14.81</td>\n",
" <td>15.14</td>\n",
" <td>12</td>\n",
" <td>15</td>\n",
" <td>17</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2002</th>\n",
" <td>90.0</td>\n",
" <td>33.0</td>\n",
" <td>82.0</td>\n",
" <td>13.590</td>\n",
" <td>15.13</td>\n",
" <td>14.88</td>\n",
" <td>15</td>\n",
" <td>22</td>\n",
" <td>10</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2003</th>\n",
" <td>47.0</td>\n",
" <td>34.0</td>\n",
" <td>78.0</td>\n",
" <td>13.640</td>\n",
" <td>14.83</td>\n",
" <td>15.27</td>\n",
" <td>17</td>\n",
" <td>16</td>\n",
" <td>22</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" weight extent rank \n",
"month '1Jan' 'Feb' 'Mar' '1Jan' 'Feb' 'Mar' '1Jan' 'Feb' 'Mar'\n",
"year \n",
"2000 45.1 46.1 25.1 13.442 14.94 15.02 13 17 14\n",
"2001 85.0 16.0 49.0 13.380 14.81 15.14 12 15 17\n",
"2002 90.0 33.0 82.0 13.590 15.13 14.88 15 22 10\n",
"2003 47.0 34.0 78.0 13.640 14.83 15.27 17 16 22"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.read_csv(TESTDATA, header=None, names=['year', 'month', 'weight', 'extent', 'rank']).set_index(['year', 'month']).unstack('month')\n",
"df"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"MultiIndex(levels=[[u'weight', u'extent', u'rank'], [u' '1Jan'', u' 'Feb'', u' 'Mar'']],\n",
" labels=[[0, 0, 0, 1, 1, 1, 2, 2, 2], [0, 1, 2, 0, 1, 2, 0, 1, 2]],\n",
" names=[None, u'month'])"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.columns"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"That's nice, but I really want 'months' as the top level\n",
"\n",
"Here's where I get lost, I've played with a few things and I can't get what I'm looking for.\n",
"I know the index would look like this after the fact:\n",
"```\n",
"pd.MultiIndex(levels=[ ['1Jan', 'Feb', 'Mar'], [u'weight', u'extent', u'rank']],\n",
" labels=[ [0, 0, 0, 1, 1, 1, 2, 2, 2],[0, 1, 2, 0, 1, 2, 0, 1, 2]],\n",
" names=[None, u'month'])\n",
"```\n",
"And I know that just setting the index doesn't work:"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr>\n",
" <th>month</th>\n",
" <th colspan=\"3\" halign=\"left\">1Jan</th>\n",
" <th colspan=\"3\" halign=\"left\">Feb</th>\n",
" <th colspan=\"3\" halign=\"left\">Mar</th>\n",
" </tr>\n",
" <tr>\n",
" <th></th>\n",
" <th>weight</th>\n",
" <th>extent</th>\n",
" <th>rank</th>\n",
" <th>weight</th>\n",
" <th>extent</th>\n",
" <th>rank</th>\n",
" <th>weight</th>\n",
" <th>extent</th>\n",
" <th>rank</th>\n",
" </tr>\n",
" <tr>\n",
" <th>year</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>2000</th>\n",
" <td>45.1</td>\n",
" <td>46.1</td>\n",
" <td>25.1</td>\n",
" <td>13.442</td>\n",
" <td>14.94</td>\n",
" <td>15.02</td>\n",
" <td>13</td>\n",
" <td>17</td>\n",
" <td>14</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2001</th>\n",
" <td>85.0</td>\n",
" <td>16.0</td>\n",
" <td>49.0</td>\n",
" <td>13.380</td>\n",
" <td>14.81</td>\n",
" <td>15.14</td>\n",
" <td>12</td>\n",
" <td>15</td>\n",
" <td>17</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2002</th>\n",
" <td>90.0</td>\n",
" <td>33.0</td>\n",
" <td>82.0</td>\n",
" <td>13.590</td>\n",
" <td>15.13</td>\n",
" <td>14.88</td>\n",
" <td>15</td>\n",
" <td>22</td>\n",
" <td>10</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2003</th>\n",
" <td>47.0</td>\n",
" <td>34.0</td>\n",
" <td>78.0</td>\n",
" <td>13.640</td>\n",
" <td>14.83</td>\n",
" <td>15.27</td>\n",
" <td>17</td>\n",
" <td>16</td>\n",
" <td>22</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"month 1Jan Feb Mar \n",
" weight extent rank weight extent rank weight extent rank\n",
"year \n",
"2000 45.1 46.1 25.1 13.442 14.94 15.02 13 17 14\n",
"2001 85.0 16.0 49.0 13.380 14.81 15.14 12 15 17\n",
"2002 90.0 33.0 82.0 13.590 15.13 14.88 15 22 10\n",
"2003 47.0 34.0 78.0 13.640 14.83 15.27 17 16 22"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.columns = pd.MultiIndex(levels=[ ['1Jan', 'Feb', 'Mar'], [u'weight', u'extent', u'rank']],\n",
" labels=[ [0, 0, 0, 1, 1, 1, 2, 2, 2],[0, 1, 2, 0, 1, 2, 0, 1, 2]],\n",
" names=[u'month', None])\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"###I'm looking for the magic that makes my index like the desired one above, but also moved my columns.\n",
"\n",
"Any help with answers or even search keywords would be apprecated.\n",
"\n",
"```\n",
"month 1Jan Feb Mar \n",
" weight extent rank weight extent rank weight extent rank\n",
"year \n",
"2000 45 13.442 46 25 14.94 15.02 13 17 14\n",
"2001 85 13.380 16 49 14.81 15.14 12 15 17\n",
"2002 90 13.590 33 82 15.13 14.88 15 22 10\n",
"2003 47 13.640 34 78 14.83 15.27 17 16 22\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.9"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment