Created
November 7, 2021 20:39
-
-
Save vaclavdekanovsky/f7b0b49904fde2aa97473216fa31c725 to your computer and use it in GitHub Desktop.
Comparing 2 version of one dataset having one key and two value columns in pandas. The differences are styled by background color.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Find a change - One key, two values\n", | |
"Find the differences in two version of the same file having 1 key and two value columns." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"import pandas as pd\n", | |
"import numpy as np" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"df_old = pd.DataFrame({\"key\":[\"A\",\"B\",\"C\"], \"height\":[1,2,3], \"width\":[4,5,6]})" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"df_new = pd.DataFrame({\"key\":[\"A\",\"B\",\"D\"], \"height\":[-1,2,7],\"width\":[4,6,9]})" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"You want to compare all the columns that are not keys" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"['height', 'width']" | |
] | |
}, | |
"execution_count": 4, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"# merging/joining key\n", | |
"key_columns = [\"key\"]\n", | |
"\n", | |
"# All columns in the df\n", | |
"columns_to_compare = list(df_old.columns)\n", | |
"\n", | |
"# remove key column to have only the rest\n", | |
"[columns_to_compare.remove(i) for i in key_columns]\n", | |
"\n", | |
"# check the values\n", | |
"columns_to_compare" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"To see the comparison, we join the `df_old` to the new version `df_new`. For each of the values we create a column `change_column_name`. It will have the value either \n", | |
"* `same` - if the key exists and the value is the same\n", | |
"* `different` - if the key exists and the column differs\n", | |
"* `removed` - if the key doesn't exist anymore (it was only in the old version)\n", | |
"* `added` - if the key appears only in the new version" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"result = df_old.merge(df_new, on=key_columns, how=\"outer\", suffixes=[\"_old\",\"_new\"])\n", | |
"for v in columns_to_compare:\n", | |
" result[\"change_\" + v] = np.where(result[v + \"_new\"]==result[v + \"_old\"],\"same\",\n", | |
" np.where(result[v + \"_new\"].isna(),\"removed\",\n", | |
" np.where(result[v + \"_old\"].isna(),\"added\",\"different\")))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"['change_height', 'change_width']" | |
] | |
}, | |
"execution_count": 6, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"# we have added new column with the change information\n", | |
"change_columns = [\"change_\" + i for i in columns_to_compare]\n", | |
"change_columns" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 7, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<div>\n", | |
"<style scoped>\n", | |
" .dataframe tbody tr th:only-of-type {\n", | |
" vertical-align: middle;\n", | |
" }\n", | |
"\n", | |
" .dataframe tbody tr th {\n", | |
" vertical-align: top;\n", | |
" }\n", | |
"\n", | |
" .dataframe thead th {\n", | |
" text-align: right;\n", | |
" }\n", | |
"</style>\n", | |
"<table border=\"1\" class=\"dataframe\">\n", | |
" <thead>\n", | |
" <tr style=\"text-align: right;\">\n", | |
" <th></th>\n", | |
" <th>key</th>\n", | |
" <th>height_old</th>\n", | |
" <th>width_old</th>\n", | |
" <th>height_new</th>\n", | |
" <th>width_new</th>\n", | |
" <th>change_height</th>\n", | |
" <th>change_width</th>\n", | |
" </tr>\n", | |
" </thead>\n", | |
" <tbody>\n", | |
" <tr>\n", | |
" <th>0</th>\n", | |
" <td>A</td>\n", | |
" <td>1.0</td>\n", | |
" <td>4.0</td>\n", | |
" <td>-1.0</td>\n", | |
" <td>4.0</td>\n", | |
" <td>different</td>\n", | |
" <td>same</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>1</th>\n", | |
" <td>B</td>\n", | |
" <td>2.0</td>\n", | |
" <td>5.0</td>\n", | |
" <td>2.0</td>\n", | |
" <td>6.0</td>\n", | |
" <td>same</td>\n", | |
" <td>different</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>2</th>\n", | |
" <td>C</td>\n", | |
" <td>3.0</td>\n", | |
" <td>6.0</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>removed</td>\n", | |
" <td>removed</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th>3</th>\n", | |
" <td>D</td>\n", | |
" <td>NaN</td>\n", | |
" <td>NaN</td>\n", | |
" <td>7.0</td>\n", | |
" <td>9.0</td>\n", | |
" <td>added</td>\n", | |
" <td>added</td>\n", | |
" </tr>\n", | |
" </tbody>\n", | |
"</table>\n", | |
"</div>" | |
], | |
"text/plain": [ | |
" key height_old width_old height_new width_new change_height change_width\n", | |
"0 A 1.0 4.0 -1.0 4.0 different same\n", | |
"1 B 2.0 5.0 2.0 6.0 same different\n", | |
"2 C 3.0 6.0 NaN NaN removed removed\n", | |
"3 D NaN NaN 7.0 9.0 added added" | |
] | |
}, | |
"execution_count": 7, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"result" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Style the results\n", | |
"Let's color the changes to highlight them. We can start by applying color to the `change_` columns based on their values." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 8, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"cell_bg_colors = {\n", | |
" 'different': 'yellow', \n", | |
" 'removed': '#FFA500',\n", | |
" 'added': '#00CC33',\n", | |
"}\n", | |
"\n", | |
"def color_background(cell):\n", | |
" for value, color in cell_bg_colors.items():\n", | |
" if value in cell:\n", | |
" return \"background-color: {}\".format(color)\n", | |
" return \"\" # default: do nothing" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 9, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<style type=\"text/css\" >\n", | |
"#T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow0_col5,#T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow1_col6{\n", | |
" background-color: yellow;\n", | |
" }#T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow2_col5,#T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow2_col6{\n", | |
" background-color: #FFA500;\n", | |
" }#T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow3_col5,#T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow3_col6{\n", | |
" background-color: #00CC33;\n", | |
" }</style><table id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15a\" ><thead> <tr> <th class=\"blank level0\" ></th> <th class=\"col_heading level0 col0\" >key</th> <th class=\"col_heading level0 col1\" >height_old</th> <th class=\"col_heading level0 col2\" >width_old</th> <th class=\"col_heading level0 col3\" >height_new</th> <th class=\"col_heading level0 col4\" >width_new</th> <th class=\"col_heading level0 col5\" >change_height</th> <th class=\"col_heading level0 col6\" >change_width</th> </tr></thead><tbody>\n", | |
" <tr>\n", | |
" <th id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15alevel0_row0\" class=\"row_heading level0 row0\" >0</th>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow0_col0\" class=\"data row0 col0\" >A</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow0_col1\" class=\"data row0 col1\" >1.000000</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow0_col2\" class=\"data row0 col2\" >4.000000</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow0_col3\" class=\"data row0 col3\" >-1.000000</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow0_col4\" class=\"data row0 col4\" >4.000000</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow0_col5\" class=\"data row0 col5\" >different</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow0_col6\" class=\"data row0 col6\" >same</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15alevel0_row1\" class=\"row_heading level0 row1\" >1</th>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow1_col0\" class=\"data row1 col0\" >B</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow1_col1\" class=\"data row1 col1\" >2.000000</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow1_col2\" class=\"data row1 col2\" >5.000000</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow1_col3\" class=\"data row1 col3\" >2.000000</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow1_col4\" class=\"data row1 col4\" >6.000000</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow1_col5\" class=\"data row1 col5\" >same</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow1_col6\" class=\"data row1 col6\" >different</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15alevel0_row2\" class=\"row_heading level0 row2\" >2</th>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow2_col0\" class=\"data row2 col0\" >C</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow2_col1\" class=\"data row2 col1\" >3.000000</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow2_col2\" class=\"data row2 col2\" >6.000000</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow2_col3\" class=\"data row2 col3\" >nan</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow2_col4\" class=\"data row2 col4\" >nan</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow2_col5\" class=\"data row2 col5\" >removed</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow2_col6\" class=\"data row2 col6\" >removed</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15alevel0_row3\" class=\"row_heading level0 row3\" >3</th>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow3_col0\" class=\"data row3 col0\" >D</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow3_col1\" class=\"data row3 col1\" >nan</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow3_col2\" class=\"data row3 col2\" >nan</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow3_col3\" class=\"data row3 col3\" >7.000000</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow3_col4\" class=\"data row3 col4\" >9.000000</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow3_col5\" class=\"data row3 col5\" >added</td>\n", | |
" <td id=\"T_26aa36fd_400a_11ec_b6b0_0c7a15d0f15arow3_col6\" class=\"data row3 col6\" >added</td>\n", | |
" </tr>\n", | |
" </tbody></table>" | |
], | |
"text/plain": [ | |
"<pandas.io.formats.style.Styler at 0x1e0f019b490>" | |
] | |
}, | |
"execution_count": 9, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"highlighted = result.style.applymap(color_background, subset=change_columns)\n", | |
"highlighted" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Maybe you prefer to color the changed values themselves. Highlight the values using brute force. We do it row by row using `.apply(highlight_function, axis=1)` where `axis=1` mean row by row. " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 10, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"def highlight(row):\n", | |
" # first we create a list containing as many '' as there are columns in the row\n", | |
" ret = [\"\" for _ in row.index]\n", | |
" \n", | |
" # then we apply various formatting depending on the change column\n", | |
" if row[\"change_height\"] == \"different\":\n", | |
" ret[row.index.get_loc(\"height_old\")] = \"background-color: yellow\"\n", | |
" ret[row.index.get_loc(\"height_new\")] = \"background-color: yellow\"\n", | |
" ret[row.index.get_loc(\"change_height\")] = \"background-color: yellow\"\n", | |
" if row[\"change_width\"] == \"different\":\n", | |
" ret[row.index.get_loc(\"width_old\")] = \"background-color: yellow\"\n", | |
" ret[row.index.get_loc(\"width_new\")] = \"background-color: yellow\"\n", | |
" ret[row.index.get_loc(\"change_width\")] = \"background-color: yellow\"\n", | |
" if row[\"change_height\"] == \"removed\":\n", | |
" ret[row.index.get_loc(\"height_old\")] = \"background-color: #FFA500\"\n", | |
" ret[row.index.get_loc(\"width_old\")] = \"background-color: #FFA500\"\n", | |
" ret[row.index.get_loc(\"change_height\")] = \"background-color: #FFA500\"\n", | |
" ret[row.index.get_loc(\"change_width\")] = \"background-color: #FFA500\"\n", | |
" if row[\"change_width\"] == \"added\":\n", | |
" ret[row.index.get_loc(\"height_new\")] = \"background-color: #00CC33\"\n", | |
" ret[row.index.get_loc(\"width_new\")] = \"background-color: #00CC33\"\n", | |
" ret[row.index.get_loc(\"change_height\")] = \"background-color: #00CC33\"\n", | |
" ret[row.index.get_loc(\"change_width\")] = \"background-color: #00CC33\"\n", | |
" return ret" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 11, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<style type=\"text/css\" >\n", | |
"#T_26b42227_400a_11ec_9065_0c7a15d0f15arow0_col1,#T_26b42227_400a_11ec_9065_0c7a15d0f15arow0_col3,#T_26b42227_400a_11ec_9065_0c7a15d0f15arow0_col5,#T_26b42227_400a_11ec_9065_0c7a15d0f15arow1_col2,#T_26b42227_400a_11ec_9065_0c7a15d0f15arow1_col4,#T_26b42227_400a_11ec_9065_0c7a15d0f15arow1_col6{\n", | |
" background-color: yellow;\n", | |
" }#T_26b42227_400a_11ec_9065_0c7a15d0f15arow2_col1,#T_26b42227_400a_11ec_9065_0c7a15d0f15arow2_col2,#T_26b42227_400a_11ec_9065_0c7a15d0f15arow2_col5,#T_26b42227_400a_11ec_9065_0c7a15d0f15arow2_col6{\n", | |
" background-color: #FFA500;\n", | |
" }#T_26b42227_400a_11ec_9065_0c7a15d0f15arow3_col3,#T_26b42227_400a_11ec_9065_0c7a15d0f15arow3_col4,#T_26b42227_400a_11ec_9065_0c7a15d0f15arow3_col5,#T_26b42227_400a_11ec_9065_0c7a15d0f15arow3_col6{\n", | |
" background-color: #00CC33;\n", | |
" }</style><table id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15a\" ><thead> <tr> <th class=\"blank level0\" ></th> <th class=\"col_heading level0 col0\" >key</th> <th class=\"col_heading level0 col1\" >height_old</th> <th class=\"col_heading level0 col2\" >width_old</th> <th class=\"col_heading level0 col3\" >height_new</th> <th class=\"col_heading level0 col4\" >width_new</th> <th class=\"col_heading level0 col5\" >change_height</th> <th class=\"col_heading level0 col6\" >change_width</th> </tr></thead><tbody>\n", | |
" <tr>\n", | |
" <th id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15alevel0_row0\" class=\"row_heading level0 row0\" >0</th>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow0_col0\" class=\"data row0 col0\" >A</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow0_col1\" class=\"data row0 col1\" >1.000000</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow0_col2\" class=\"data row0 col2\" >4.000000</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow0_col3\" class=\"data row0 col3\" >-1.000000</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow0_col4\" class=\"data row0 col4\" >4.000000</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow0_col5\" class=\"data row0 col5\" >different</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow0_col6\" class=\"data row0 col6\" >same</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15alevel0_row1\" class=\"row_heading level0 row1\" >1</th>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow1_col0\" class=\"data row1 col0\" >B</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow1_col1\" class=\"data row1 col1\" >2.000000</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow1_col2\" class=\"data row1 col2\" >5.000000</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow1_col3\" class=\"data row1 col3\" >2.000000</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow1_col4\" class=\"data row1 col4\" >6.000000</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow1_col5\" class=\"data row1 col5\" >same</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow1_col6\" class=\"data row1 col6\" >different</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15alevel0_row2\" class=\"row_heading level0 row2\" >2</th>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow2_col0\" class=\"data row2 col0\" >C</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow2_col1\" class=\"data row2 col1\" >3.000000</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow2_col2\" class=\"data row2 col2\" >6.000000</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow2_col3\" class=\"data row2 col3\" >nan</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow2_col4\" class=\"data row2 col4\" >nan</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow2_col5\" class=\"data row2 col5\" >removed</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow2_col6\" class=\"data row2 col6\" >removed</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15alevel0_row3\" class=\"row_heading level0 row3\" >3</th>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow3_col0\" class=\"data row3 col0\" >D</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow3_col1\" class=\"data row3 col1\" >nan</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow3_col2\" class=\"data row3 col2\" >nan</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow3_col3\" class=\"data row3 col3\" >7.000000</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow3_col4\" class=\"data row3 col4\" >9.000000</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow3_col5\" class=\"data row3 col5\" >added</td>\n", | |
" <td id=\"T_26b42227_400a_11ec_9065_0c7a15d0f15arow3_col6\" class=\"data row3 col6\" >added</td>\n", | |
" </tr>\n", | |
" </tbody></table>" | |
], | |
"text/plain": [ | |
"<pandas.io.formats.style.Styler at 0x1e0f123fdf0>" | |
] | |
}, | |
"execution_count": 11, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"result.style.apply(highlight, axis=1)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Let's do it a bit clever way, so that you don't have to rename everytime you compare a file with different columns. \n", | |
"Let's do it a clever clever way:\n", | |
"1. Color differences" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 12, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"def highlight(row):\n", | |
" \n", | |
" colors = {\n", | |
" \"different\": \"yellow\"\n", | |
" }\n", | |
" ret = [\"\" for _ in row.index]\n", | |
" for c in change_columns:\n", | |
" related_value = c.split(\"_\")[1] # e.g. change_height --> height\n", | |
" if row[c] == \"different\":\n", | |
" ret[row.index.get_loc(f\"{related_value}_old\")] = f\"background-color: {colors['different']}\"\n", | |
" ret[row.index.get_loc(f\"{related_value}_new\")] = f\"background-color: {colors['different']}\"\n", | |
" ret[row.index.get_loc(c)] = f\"background-color: {colors['different']}\"\n", | |
" if row[c] == \"removed\":\n", | |
" pass\n", | |
" if row[c] == \"added\":\n", | |
" pass\n", | |
" return ret" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 13, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<style type=\"text/css\" >\n", | |
"#T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow0_col1,#T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow0_col3,#T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow0_col5,#T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow1_col2,#T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow1_col4,#T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow1_col6{\n", | |
" background-color: yellow;\n", | |
" }</style><table id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15a\" ><thead> <tr> <th class=\"blank level0\" ></th> <th class=\"col_heading level0 col0\" >key</th> <th class=\"col_heading level0 col1\" >height_old</th> <th class=\"col_heading level0 col2\" >width_old</th> <th class=\"col_heading level0 col3\" >height_new</th> <th class=\"col_heading level0 col4\" >width_new</th> <th class=\"col_heading level0 col5\" >change_height</th> <th class=\"col_heading level0 col6\" >change_width</th> </tr></thead><tbody>\n", | |
" <tr>\n", | |
" <th id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15alevel0_row0\" class=\"row_heading level0 row0\" >0</th>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow0_col0\" class=\"data row0 col0\" >A</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow0_col1\" class=\"data row0 col1\" >1.000000</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow0_col2\" class=\"data row0 col2\" >4.000000</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow0_col3\" class=\"data row0 col3\" >-1.000000</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow0_col4\" class=\"data row0 col4\" >4.000000</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow0_col5\" class=\"data row0 col5\" >different</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow0_col6\" class=\"data row0 col6\" >same</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15alevel0_row1\" class=\"row_heading level0 row1\" >1</th>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow1_col0\" class=\"data row1 col0\" >B</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow1_col1\" class=\"data row1 col1\" >2.000000</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow1_col2\" class=\"data row1 col2\" >5.000000</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow1_col3\" class=\"data row1 col3\" >2.000000</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow1_col4\" class=\"data row1 col4\" >6.000000</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow1_col5\" class=\"data row1 col5\" >same</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow1_col6\" class=\"data row1 col6\" >different</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15alevel0_row2\" class=\"row_heading level0 row2\" >2</th>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow2_col0\" class=\"data row2 col0\" >C</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow2_col1\" class=\"data row2 col1\" >3.000000</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow2_col2\" class=\"data row2 col2\" >6.000000</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow2_col3\" class=\"data row2 col3\" >nan</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow2_col4\" class=\"data row2 col4\" >nan</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow2_col5\" class=\"data row2 col5\" >removed</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow2_col6\" class=\"data row2 col6\" >removed</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15alevel0_row3\" class=\"row_heading level0 row3\" >3</th>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow3_col0\" class=\"data row3 col0\" >D</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow3_col1\" class=\"data row3 col1\" >nan</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow3_col2\" class=\"data row3 col2\" >nan</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow3_col3\" class=\"data row3 col3\" >7.000000</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow3_col4\" class=\"data row3 col4\" >9.000000</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow3_col5\" class=\"data row3 col5\" >added</td>\n", | |
" <td id=\"T_26beaa49_400a_11ec_a4d0_0c7a15d0f15arow3_col6\" class=\"data row3 col6\" >added</td>\n", | |
" </tr>\n", | |
" </tbody></table>" | |
], | |
"text/plain": [ | |
"<pandas.io.formats.style.Styler at 0x1e0f1230b20>" | |
] | |
}, | |
"execution_count": 13, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"result.style.apply(highlight, axis=1)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Next, let's deal with removed and added\n", | |
"\n", | |
"2. Color removed rows - \n", | |
"3. Color added rows - " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 14, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"def highlight(row):\n", | |
" \"\"\"Iterate the dataframe rowwise and color columns in the rows based on conditions\"\"\"\n", | |
" \n", | |
" # define the colors\n", | |
" colors = {\n", | |
" \"different\": \"yellow\",\n", | |
" \"removed\": \"red\",\n", | |
" \"added\": \"#00CC33\"\n", | |
" }\n", | |
" \n", | |
" # create a list of '' for each column in the row\n", | |
" ret = [\"\" for _ in row.index]\n", | |
" \n", | |
" # iterate through the change column and depending on the values assign background-colors\n", | |
" for c in change_columns:\n", | |
" # based on the name of the change column, find the name of the value columns\n", | |
" # e.g. change_height --> height\n", | |
" related_value = c.split(\"_\")[1] \n", | |
" if row[c] == \"different\":\n", | |
" ret[row.index.get_loc(f\"{related_value}_old\")] = f\"background-color: {colors['different']}\"\n", | |
" ret[row.index.get_loc(f\"{related_value}_new\")] = f\"background-color: {colors['different']}\"\n", | |
" ret[row.index.get_loc(c)] = f\"background-color: {colors['different']}\"\n", | |
" # if the key was removed, we highlight all the _old columns and all the change_ columns\n", | |
" if row[c] == \"removed\":\n", | |
" ret[row.index.get_loc(f\"{related_value}_old\")] = f\"background-color: {colors['removed']}\"\n", | |
" ret[row.index.get_loc(c)] = f\"background-color: {colors['removed']}\"\n", | |
" # if the key was added, we highlight all the _new columns and all the change_ columns\n", | |
" if row[c] == \"added\":\n", | |
" ret[row.index.get_loc(f\"{related_value}_new\")] = f\"background-color: {colors['added']}\"\n", | |
" ret[row.index.get_loc(c)] = f\"background-color: {colors['added']}\"\n", | |
" return ret" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 15, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<style type=\"text/css\" >\n", | |
"#T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow0_col1,#T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow0_col3,#T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow0_col5,#T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow1_col2,#T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow1_col4,#T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow1_col6{\n", | |
" background-color: yellow;\n", | |
" }#T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow2_col1,#T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow2_col2,#T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow2_col5,#T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow2_col6{\n", | |
" background-color: red;\n", | |
" }#T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow3_col3,#T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow3_col4,#T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow3_col5,#T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow3_col6{\n", | |
" background-color: #00CC33;\n", | |
" }</style><table id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15a\" ><thead> <tr> <th class=\"blank level0\" ></th> <th class=\"col_heading level0 col0\" >key</th> <th class=\"col_heading level0 col1\" >height_old</th> <th class=\"col_heading level0 col2\" >width_old</th> <th class=\"col_heading level0 col3\" >height_new</th> <th class=\"col_heading level0 col4\" >width_new</th> <th class=\"col_heading level0 col5\" >change_height</th> <th class=\"col_heading level0 col6\" >change_width</th> </tr></thead><tbody>\n", | |
" <tr>\n", | |
" <th id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15alevel0_row0\" class=\"row_heading level0 row0\" >0</th>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow0_col0\" class=\"data row0 col0\" >A</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow0_col1\" class=\"data row0 col1\" >1.000000</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow0_col2\" class=\"data row0 col2\" >4.000000</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow0_col3\" class=\"data row0 col3\" >-1.000000</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow0_col4\" class=\"data row0 col4\" >4.000000</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow0_col5\" class=\"data row0 col5\" >different</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow0_col6\" class=\"data row0 col6\" >same</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15alevel0_row1\" class=\"row_heading level0 row1\" >1</th>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow1_col0\" class=\"data row1 col0\" >B</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow1_col1\" class=\"data row1 col1\" >2.000000</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow1_col2\" class=\"data row1 col2\" >5.000000</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow1_col3\" class=\"data row1 col3\" >2.000000</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow1_col4\" class=\"data row1 col4\" >6.000000</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow1_col5\" class=\"data row1 col5\" >same</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow1_col6\" class=\"data row1 col6\" >different</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15alevel0_row2\" class=\"row_heading level0 row2\" >2</th>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow2_col0\" class=\"data row2 col0\" >C</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow2_col1\" class=\"data row2 col1\" >3.000000</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow2_col2\" class=\"data row2 col2\" >6.000000</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow2_col3\" class=\"data row2 col3\" >nan</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow2_col4\" class=\"data row2 col4\" >nan</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow2_col5\" class=\"data row2 col5\" >removed</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow2_col6\" class=\"data row2 col6\" >removed</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15alevel0_row3\" class=\"row_heading level0 row3\" >3</th>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow3_col0\" class=\"data row3 col0\" >D</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow3_col1\" class=\"data row3 col1\" >nan</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow3_col2\" class=\"data row3 col2\" >nan</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow3_col3\" class=\"data row3 col3\" >7.000000</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow3_col4\" class=\"data row3 col4\" >9.000000</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow3_col5\" class=\"data row3 col5\" >added</td>\n", | |
" <td id=\"T_26c70de5_400a_11ec_91a6_0c7a15d0f15arow3_col6\" class=\"data row3 col6\" >added</td>\n", | |
" </tr>\n", | |
" </tbody></table>" | |
], | |
"text/plain": [ | |
"<pandas.io.formats.style.Styler at 0x1e0f123ff70>" | |
] | |
}, | |
"execution_count": 15, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"result.style.apply(highlight, axis=1)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Reorder _old and _new columns\n", | |
"Reorder the columns to show old and new columns next to one another.You can reorder by applying the order to the dataframe, e.g. `result[[\"key\",\"height_old\",\"height_new\",...]]`. We can use a function to do it programatically. " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 16, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<style type=\"text/css\" >\n", | |
"#T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow0_col1,#T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow0_col2,#T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow0_col5,#T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow1_col3,#T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow1_col4,#T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow1_col6{\n", | |
" background-color: yellow;\n", | |
" }#T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow2_col1,#T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow2_col3,#T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow2_col5,#T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow2_col6{\n", | |
" background-color: red;\n", | |
" }#T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow3_col2,#T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow3_col4,#T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow3_col5,#T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow3_col6{\n", | |
" background-color: #00CC33;\n", | |
" }</style><table id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15a\" ><thead> <tr> <th class=\"blank level0\" ></th> <th class=\"col_heading level0 col0\" >key</th> <th class=\"col_heading level0 col1\" >height_old</th> <th class=\"col_heading level0 col2\" >height_new</th> <th class=\"col_heading level0 col3\" >width_old</th> <th class=\"col_heading level0 col4\" >width_new</th> <th class=\"col_heading level0 col5\" >change_height</th> <th class=\"col_heading level0 col6\" >change_width</th> </tr></thead><tbody>\n", | |
" <tr>\n", | |
" <th id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15alevel0_row0\" class=\"row_heading level0 row0\" >0</th>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow0_col0\" class=\"data row0 col0\" >A</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow0_col1\" class=\"data row0 col1\" >1.000000</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow0_col2\" class=\"data row0 col2\" >-1.000000</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow0_col3\" class=\"data row0 col3\" >4.000000</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow0_col4\" class=\"data row0 col4\" >4.000000</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow0_col5\" class=\"data row0 col5\" >different</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow0_col6\" class=\"data row0 col6\" >same</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15alevel0_row1\" class=\"row_heading level0 row1\" >1</th>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow1_col0\" class=\"data row1 col0\" >B</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow1_col1\" class=\"data row1 col1\" >2.000000</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow1_col2\" class=\"data row1 col2\" >2.000000</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow1_col3\" class=\"data row1 col3\" >5.000000</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow1_col4\" class=\"data row1 col4\" >6.000000</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow1_col5\" class=\"data row1 col5\" >same</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow1_col6\" class=\"data row1 col6\" >different</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15alevel0_row2\" class=\"row_heading level0 row2\" >2</th>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow2_col0\" class=\"data row2 col0\" >C</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow2_col1\" class=\"data row2 col1\" >3.000000</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow2_col2\" class=\"data row2 col2\" >nan</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow2_col3\" class=\"data row2 col3\" >6.000000</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow2_col4\" class=\"data row2 col4\" >nan</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow2_col5\" class=\"data row2 col5\" >removed</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow2_col6\" class=\"data row2 col6\" >removed</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15alevel0_row3\" class=\"row_heading level0 row3\" >3</th>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow3_col0\" class=\"data row3 col0\" >D</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow3_col1\" class=\"data row3 col1\" >nan</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow3_col2\" class=\"data row3 col2\" >7.000000</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow3_col3\" class=\"data row3 col3\" >nan</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow3_col4\" class=\"data row3 col4\" >9.000000</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow3_col5\" class=\"data row3 col5\" >added</td>\n", | |
" <td id=\"T_26cd76b5_400a_11ec_82fd_0c7a15d0f15arow3_col6\" class=\"data row3 col6\" >added</td>\n", | |
" </tr>\n", | |
" </tbody></table>" | |
], | |
"text/plain": [ | |
"<pandas.io.formats.style.Styler at 0x1e0f123fc70>" | |
] | |
}, | |
"execution_count": 16, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"# You might rather want to order the columns so that the values are next to each other. \n", | |
"import itertools\n", | |
"def f(name:str):\n", | |
" return name.split(\"_\")[1]\n", | |
"\n", | |
"column_order = key_columns + list(itertools.chain(*[(f(c)+\"_old\",f(c)+\"_new\") for c in change_columns])) + change_columns\n", | |
"result[column_order].style.apply(highlight, axis=1)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Another styling option, is to `.apply(highlight_function, axis=None)`. In this case the `highlight_function` must return a dataframe having the same size as the original df, but each cell contains the style. " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 17, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/html": [ | |
"<style type=\"text/css\" >\n", | |
"#T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow0_col1,#T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow0_col3,#T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow1_col2,#T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow1_col4{\n", | |
" background-color: yellow;\n", | |
" }#T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow2_col1,#T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow2_col2{\n", | |
" background-color: red;\n", | |
" }#T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow3_col3,#T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow3_col4{\n", | |
" background-color: #00CC33;\n", | |
" }</style><table id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15a\" ><thead> <tr> <th class=\"blank level0\" ></th> <th class=\"col_heading level0 col0\" >key</th> <th class=\"col_heading level0 col1\" >height_old</th> <th class=\"col_heading level0 col2\" >width_old</th> <th class=\"col_heading level0 col3\" >height_new</th> <th class=\"col_heading level0 col4\" >width_new</th> <th class=\"col_heading level0 col5\" >change_height</th> <th class=\"col_heading level0 col6\" >change_width</th> </tr></thead><tbody>\n", | |
" <tr>\n", | |
" <th id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15alevel0_row0\" class=\"row_heading level0 row0\" >0</th>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow0_col0\" class=\"data row0 col0\" >A</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow0_col1\" class=\"data row0 col1\" >1.000000</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow0_col2\" class=\"data row0 col2\" >4.000000</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow0_col3\" class=\"data row0 col3\" >-1.000000</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow0_col4\" class=\"data row0 col4\" >4.000000</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow0_col5\" class=\"data row0 col5\" >different</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow0_col6\" class=\"data row0 col6\" >same</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15alevel0_row1\" class=\"row_heading level0 row1\" >1</th>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow1_col0\" class=\"data row1 col0\" >B</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow1_col1\" class=\"data row1 col1\" >2.000000</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow1_col2\" class=\"data row1 col2\" >5.000000</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow1_col3\" class=\"data row1 col3\" >2.000000</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow1_col4\" class=\"data row1 col4\" >6.000000</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow1_col5\" class=\"data row1 col5\" >same</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow1_col6\" class=\"data row1 col6\" >different</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15alevel0_row2\" class=\"row_heading level0 row2\" >2</th>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow2_col0\" class=\"data row2 col0\" >C</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow2_col1\" class=\"data row2 col1\" >3.000000</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow2_col2\" class=\"data row2 col2\" >6.000000</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow2_col3\" class=\"data row2 col3\" >nan</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow2_col4\" class=\"data row2 col4\" >nan</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow2_col5\" class=\"data row2 col5\" >removed</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow2_col6\" class=\"data row2 col6\" >removed</td>\n", | |
" </tr>\n", | |
" <tr>\n", | |
" <th id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15alevel0_row3\" class=\"row_heading level0 row3\" >3</th>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow3_col0\" class=\"data row3 col0\" >D</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow3_col1\" class=\"data row3 col1\" >nan</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow3_col2\" class=\"data row3 col2\" >nan</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow3_col3\" class=\"data row3 col3\" >7.000000</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow3_col4\" class=\"data row3 col4\" >9.000000</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow3_col5\" class=\"data row3 col5\" >added</td>\n", | |
" <td id=\"T_26d42d48_400a_11ec_bda4_0c7a15d0f15arow3_col6\" class=\"data row3 col6\" >added</td>\n", | |
" </tr>\n", | |
" </tbody></table>" | |
], | |
"text/plain": [ | |
"<pandas.io.formats.style.Styler at 0x1e0f139a370>" | |
] | |
}, | |
"execution_count": 17, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"def highlight(df):\n", | |
" style_base = \"background-color: \"\n", | |
" \n", | |
" # define the colors\n", | |
" colors = {\n", | |
" \"different\": \"yellow\",\n", | |
" \"removed\": \"red\",\n", | |
" \"added\": \"#00CC33\"\n", | |
" }\n", | |
" \n", | |
" #DataFrame with same index and columns names as original filled empty strings\n", | |
" df1 = pd.DataFrame('', index=df.index, columns=df.columns)\n", | |
" \n", | |
" #compare columns\n", | |
" for c in change_columns:\n", | |
" related_value = c.split(\"_\")[1]\n", | |
" \n", | |
" #modify values of df1 column by boolean mask\n", | |
" mask = df[f\"change_{related_value}\"] == \"different\"\n", | |
" for ver in [\"_old\",\"_new\"]:\n", | |
" df1.loc[mask, f\"{related_value}{ver}\"] = style_base + colors[\"different\"]\n", | |
" \n", | |
" mask2 = df[f\"change_{related_value}\"] == \"removed\"\n", | |
" df1.loc[mask2, f\"{related_value}_old\"] = style_base + colors[\"removed\"]\n", | |
"\n", | |
" mask3 = df[f\"change_{related_value}\"] == \"added\"\n", | |
" df1.loc[mask3, f\"{related_value}_new\"] = style_base + colors[\"added\"]\n", | |
" \n", | |
" return df1\n", | |
"\n", | |
"#column_order = key_column + list(itertools.chain(*[(f(c)+\"_old\",f(c)+\"_new\") for c in change_columns]))\n", | |
"\n", | |
"result.style.apply(highlight, axis=None)" | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.8.5" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 4 | |
} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment