Skip to content

Instantly share code, notes, and snippets.

@DmitryUlyanov
Created October 2, 2017 13:22
Show Gist options
  • Save DmitryUlyanov/358a36930e78c33449d0f8caaeee7026 to your computer and use it in GitHub Desktop.
Save DmitryUlyanov/358a36930e78c33449d0f8caaeee7026 to your computer and use it in GitHub Desktop.
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Weighted median"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the video we have discussed that for MAPE metric the best constant prediction is [weighted median](https://en.wikipedia.org/wiki/Weighted_median) with weights\n",
"\n",
"$$w_i = \\frac{\\sum_{j=1}^N \\frac{1}{x_j}}{x_i}$$\n",
"\n",
"for each object $x_i$.\n",
"\n",
"This notebook exlpains how to compute weighted median. Let's generate some data first, and then find it's weighted median."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([63, 52, 70, 17, 76])"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"N = 5\n",
"x = np.random.randint(low=1, high=100, size=N)\n",
"x"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1) Compute *normalized* weights:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 0.01587302, 0.01923077, 0.01428571, 0.05882353, 0.01315789])"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"inv_x = 1.0/x\n",
"inv_x"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 0.13078104, 0.15844626, 0.11770294, 0.48465916, 0.1084106 ])"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"w = inv_x/sum(inv_x)\n",
"w"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"2) Now sort the normalized weights. We will use `argsort` (and not just `sort`) since we will need indices later."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"array([ 0.1084106 , 0.11770294, 0.13078104, 0.15844626, 0.48465916])"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"idxs = np.argsort(w)\n",
"sorted_w = w[idxs]\n",
"sorted_w"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"3) Compute [cumulitive sum](https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.cumsum.html) of sorted weights"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAD8CAYAAACMwORRAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAHc5JREFUeJzt3Xl4lfWd9/H3NyEsYV8ChISwyRZ2CLjjUlAWKVZcqF2m\ndKH41GekLqO2Vjt1npnaTqv2GafoWKvOTEUiSgkCWjuKuxIkCWEPi2ZlCZBAQshyfvNHDprGQE7g\nJPdZPq/r4rpyzrnl/vC7wseb+3zzO+acQ0REIkuM1wFERCT4VO4iIhFI5S4iEoFU7iIiEUjlLiIS\ngVTuIiIRSOUuIhKBVO4iIhFI5S4iEoHaeXXiPn36uMGDB3t1ehGRsLRp06bDzrmE5o7zrNwHDx5M\nZmamV6cXEQlLZvZpIMfptoyISARSuYuIRCCVu4hIBFK5i4hEIJW7iEgEarbczewZMztoZrlneN3M\n7HdmlmdmOWY2OfgxRUSkJQK5cn8WmHWW12cDw/2/FgO/P/9YIiJyPpotd+fc28CRsxwyH3je1fsQ\n6GFmicEKKCISSR5/Yze5hWWtfp5g3HNPAvIbPC7wP/clZrbYzDLNLPPQoUNBOLWISPhYt6WYR9/Y\nxfrcklY/V5u+oeqce8o5l+acS0tIaPanZ0VEIkZJWRX3vbyF8cnduWPG8FY/XzDKvRAY2OBxsv85\nEREBfD7HXelZVNf6eOyWicTFtv51dTDOsBr4tn9q5iKgzDlXHITfV0QkIjzz3j7eyyvlwXmpDE3o\n0ibnbHbjMDN7AbgS6GNmBcBDQByAc24ZsBaYA+QBlcCi1gorIhJuthWV86v1O5mZ2o+FUwc2/x8E\nSbPl7pz7ejOvO+BHQUskIhIhqmrquGP5ZrrHx/HIgvGYWZud27Mtf0VEIt0v1+1g98ETPPfdafTq\n3L5Nz63tB0REWsGbOw/y7Pv7WXTpYK4Y0fbTgSp3EZEgO3ziFPek5zCyX1funTXKkwy6LSMiEkTO\nOe5bmUP5yRr+83vT6BgX60kOXbmLiATRnz7+jDe2H+Te2aMYndjNsxwqdxGRIMk7eIKH12zj8uF9\nWHTJYE+zqNxFRIKgutbH0hc30ykuln+9aQIxMW039tgU3XMXEQmCR9/YRW5hOcu+OYV+3Tp6HUdX\n7iIi5+uDPaUs27CHhVMHMmtsf6/jACp3EZHzUlZZw10rshjUK56fXZfqdZzP6baMiMg5cs7xwJ9z\nOXD8FCtvu4TOHUKnUnXlLiJyjlZlFZKRXcSPZwxn4sAeXsf5Gyp3EZFzkH+kkp+t2srUwT257coL\nvI7zJSp3EZEWqq3z8eMXszDgtzdPJNbjscemhM4NIhGRMPH7t/aQ+elRHr1lAgN7xXsdp0m6chcR\naYGs/GM89tfdfHXCAK6fmOR1nDNSuYuIBKjiVC1Ll2+mf7eOPHz92Db98I2W0m0ZEZEA/SJjG58e\nqeSFH1xE905xXsc5K125i4gEYH1uMS9m5nPbFcO4aGhvr+M0S+UuItKMkrIq7nt5C+OSurN0xgiv\n4wRE5S4ichY+n+Pu9GxO1fh4bOFE2rcLj9oMj5QiIh555r19vJt3mAeuG82whC5exwmYyl1E5Ay2\nF5fzq/U7mTG6H7dOS/E6Touo3EVEmlBVU8cdyzfTPT6ORxaMC+mxx6ZoFFJEpAm/XLeDXQdO8Oyi\nqfTu0sHrOC2mK3cRkUY27DrEs+/v5zuXDObKkX29jnNOVO4iIg2UnjjF3enZjOjXhftmj/I6zjnT\nbRkRET/nHPeu3EJZZQ3Pf3caHeNivY50znTlLiLi98LH+byx/QD/MGskoxO7eR3nvKjcRUSAPYdO\n8PCabVx2QR++e+kQr+OcN5W7iES96lofS5dn0SEuht/cPIGYEPzwjZbSPXcRiXqPvbGLLYVlLPvm\nZPp16+h1nKDQlbuIRLWP9pby+w17uCVtILPGJnodJ2hU7iIStcpO1nDnimwG9YrnwXmpXscJKt2W\nEZGo9bNVuZSUV/HSkovp3CGy6jCgK3czm2VmO80sz8zua+L17maWYWbZZrbVzBYFP6qISPCs2lzI\n6uwi7vjKcCal9PQ6TtA1W+5mFgs8AcwGUoGvm1njf7/8CNjmnJsAXAn8xszaBzmriEhQ5B+p5Ger\nckkb1JP/c+Uwr+O0ikCu3KcBec65vc65amA5ML/RMQ7oavXbpnUBjgC1QU0qIhIEdT7HnSuycMCj\nt0ykXWxkvvUYyJ8qCchv8LjA/1xD/waMBoqALcAdzjlfUBKKiATR79/KY+P+o/xi/hgG9or3Ok6r\nCdb/sq4FsoABwETg38zsSz+7a2aLzSzTzDIPHToUpFOLiAQmO/8Yj72xm3kTBvC1SY2vUSNLIOVe\nCAxs8DjZ/1xDi4CXXb08YB/wpe3UnHNPOefSnHNpCQkJ55pZRKTFKk7VsvTFLPp27cA/XT827D58\no6UCKfeNwHAzG+J/k3QhsLrRMZ8BXwEws37ASGBvMIOKiJyPh9dsY39pBb+9ZSLdO8V5HafVNTvY\n6ZyrNbPbgdeAWOAZ59xWM1vif30Z8DDwrJltAQy41zl3uBVzi4gEbH1uCcs35rPkimFcNLS313Ha\nREBT+865tcDaRs8ta/B1EXBNcKOJiJy/A+VV3P9yDmOTunHnzBFex2kzkTkDJCIC+HyOu9OzOVlT\nx2O3TKJ9u+ipvOj5k4pI1Pnj+/t5Z/dhHpibygV9u3gdp02p3EUkIu0oKeeR9TuYMbov37gwxes4\nbU7lLiIRp6qmjjteyKJbxzgeWTA+4scemxJZ26CJiACPrN/BzgPH+eOiqfTu0sHrOJ7QlbuIRJS3\ndx3ij+/t5+8uHsRVI/t6HcczKncRiRhHKqq5Kz2b4X27cP+c0V7H8ZRuy4hIRHDOce/KHMoqa3hu\n0TQ6xsV6HclTunIXkYiwfGM+f9l2gHuuHUnqgC/tWxh1VO4iEvb2HjrBLzK2cekFvfneZUO8jhMS\nVO4iEtZq6nz8+MUsOsTF8JubJhITE31jj03RPXcRCWuPv7Gb7IIyfv+NyfTv3tHrOCFDV+4iErY+\n3neEJ97K46Ypycwel+h1nJCicheRsFReVcOPX8wipVc8D311jNdxQo5uy4hIWHpwVS4l5VW8tORi\nunRQlTWmK3cRCTt/zipkVVYRf3/1cCal9PQ6TkhSuYtIWCk4WskDr+QyZVBPfnTVMK/jhCyVu4iE\njTqf484Xs3HAozdPpF2sKuxMdKNKRMLGsg17+Hj/EX5z0wRSesd7HSek6X97IhIWcgqO8ehfdjF3\nfCI3TE7yOk7IU7mLSMirrK5l6fIsErp24J+vHxeVH77RUrotIyIh7+E129lXWsGfvn8R3ePjvI4T\nFnTlLiIh7fWtJbzw8Wcsnj6Ui4f19jpO2FC5i0jIOlhexb0rcxgzoBt3zRzpdZywonIXkZDk8znu\nfimHkzV1PL5wIu3bqa5aQqslIiHpuQ/28/auQ/x0bioX9O3qdZywo3IXkZCzo6Scf1m3g6+M6ss3\nL0zxOk5YUrmLSEipqqlj6fIsunVsxyM3jtfY4znSKKSIhJRfv7aTHSXH+eN3ptKnSwev44QtXbmL\nSMh4Z/ch/vDuPr598SCuGtXX6zhhTeUuIiHhSEU1d63I5oK+XfjJnNFexwl7KncR8ZxzjvtfzuFo\nZTWPL5xIx7hYryOFPZW7iHhuRWY+r209wD3XjmTMgO5ex4kIKncR8dS+wxX8Y8Y2LhnWm+9fNtTr\nOBFD5S4inqmp87F0+WbiYmP4zc0TiInR2GOwBFTuZjbLzHaaWZ6Z3XeGY640sywz22pmG4IbU0Qi\n0e/+upvsgjL+5YZxJHbv5HWciNLsnLuZxQJPADOBAmCjma12zm1rcEwP4N+BWc65z8xMM0wiclYb\n9x/hiTfzuHFKMnPGJXodJ+IEcuU+Dchzzu11zlUDy4H5jY65FXjZOfcZgHPuYHBjikgkKa+qYeny\nLJJ7xvPzr47xOk5ECqTck4D8Bo8L/M81NALoaWZvmdkmM/t2sAKKSOR56M9bKSmv4tFbJtKlg35Q\nvjUEa1XbAVOArwCdgA/M7EPn3K6GB5nZYmAxQEqKNgMSiUars4t4ZXMhS2cMZ8qgnl7HiViBXLkX\nAgMbPE72P9dQAfCac67COXcYeBuY0Pg3cs495ZxLc86lJSQknGtmEQlThcdO8tNXtjA5pQe3X3WB\n13EiWiDlvhEYbmZDzKw9sBBY3eiYPwOXmVk7M4sHLgS2BzeqiISzOp/jxy9m4fM5HrtlEu1iNYnd\nmpq9LeOcqzWz24HXgFjgGefcVjNb4n99mXNuu5mtB3IAH/C0cy63NYOLSHh58u09fLzvCP960wRS\nesd7HSfiBXTP3Tm3Fljb6LlljR7/Gvh18KKJSKTYUlDGb1/fxdxxiSyY3HgeQ1qD/l0kIq2qsrqW\nO5Zvpk+XDvy/r43Vh2+0Ec0giUir+qdXt7OvtIL//v6F9Ihv73WcqKErdxFpNX/ZdoA/ffQZiy8f\nyiXD+ngdJ6qo3EWkVRw8XsW9K3NITezGndeM8DpO1FG5i0jQOee4Jz2HilO1/O7rE+nQTh++0dZU\n7iISdM+9v58Nuw7xwNzRXNC3q9dxopLKXUSCateB4/zzuh1cPaov37xokNdxopbKXUSC5lRtHX//\nwma6dmjHIwvGa+zRQxqFFJGg+fX6newoOc4z30kjoWsHr+NENV25i0hQvLv7ME+/u49vXTSIq0f1\n8zpO1FO5i8h5O1pRzV3pWQxL6MxP5oz2Oo6g2zIicp6cc9z/8haOVFTzh7+bSqf2GnsMBbpyF5Hz\nkp5ZwPqtJdx9zUjGJnX3Oo746cpdRM7JZ6WVPP3uXpZvzOfiob35weVDvY4kDajcRaRFthSU8eTb\ne1i7pZh2MTF8bVIS98waSUyMxh5DicpdRJrlnOOd3Yd58u09vJdXStcO7Vg8fRiLLh1Mv24dvY4n\nTVC5i8gZ1db5eHVLMU9u2Mu24nL6devA/bNH8fULU+jWMc7reHIWKncR+ZLK6lpWbMzn6Xf3UXD0\nJBf07cKvbhzP/IkDtAlYmFC5i8jnSk+c4rkPPuX5D/ZzrLKGtEE9+fm8MVw9qq/uqYcZlbuI8Flp\nJf/xzl5WZOZzqtbHzNR+/HD6UNIG9/I6mpwjlbtIFGs4+RIbY9wwKZkfTB/KBX27eB1NzpPKXSTK\nOOd4N+8wyzZ8Mfnyg+lD+e6lQzT5EkFU7iJRovHkS9+umnyJZCp3kQjXePJlWEJnTb5EAZW7SIQq\nPXGK5/2TL0f9ky8PzRvDVzT5EhVU7iIR5vSeLysy86mq0eRLtFK5i0SI3MIylm1oPPkyRB9QHaVU\n7iJh7PTky5Mb9vJu3mFNvsjnVO4iYUiTL9IclbtIGKmsriU9s4D/eGfvF5MvC8Yzf5ImX+RvqdxF\nwsCRimqee3+/Jl8kYCp3kRDWePJlxuh+LLlCky/SPJW7SAjKLSzjybf38mpOEbExxtcmJbF4+lBN\nvkjAVO4iIUKTLxJMKncRjzU1+XLf7FHcqskXOQ8qdxGPaPJFWlNA5W5ms4DHgVjgaefcL89w3FTg\nA2Chc+6loKUUiSCNJ1+maPJFWkGz5W5mscATwEygANhoZqudc9uaOO4R4PXWCCoS7vKPfPFpR5p8\nkdYWyJX7NCDPObcXwMyWA/OBbY2O+7/ASmBqUBOKhDlNvogXAin3JCC/weMC4MKGB5hZEvA14CrO\nUu5mthhYDJCSktLSrCJh40yTL4suGUL/7pp8kdYXrDdUHwPudc75zM58z9A59xTwFEBaWpoL0rlF\nQkZtnY+1uSU8uWEPW4s0+SLeCaTcC4GBDR4n+59rKA1Y7i/2PsAcM6t1zq0KSkqREHeyuo4Vmfma\nfJGQEUi5bwSGm9kQ6kt9IXBrwwOcc0NOf21mzwJrVOwSDY5UVPP8B/t57v36yZfJKT148LpUZozu\np8kX8VSz5e6cqzWz24HXqB+FfMY5t9XMlvhfX9bKGUVCTv6RSp5+Zy8vavJFQlRA99ydc2uBtY2e\na7LUnXPfOf9YIqGp8eTL9ROT+OEVmnyR0KOfUBVphnOO9/JKefLtPbyz+zBdOrTjB5cPZdGlmnyR\n0KVyFzkDTb5IOFO5izTSePJlaEJnHlkwjusnJWnyRcKGyl3Eb1tROemb8nllcyHHNPkiYU7lLlHt\nWGU1q7OLWJGZT25hOe1jY5iZ2o/vXDqYqZp8kTCmcpeoU+er3xogPTOf17ceoLrOx5gB3fj5vFTm\nT0yiZ+f2XkcUOW8qd4kan5ZWkJ5ZwMpPCiguq6JHfBy3XpjCTWnJjBnQ3et4IkGlcpeIVlldy9ot\nJaRn5vPRviPEGEwfkcADc1OZkdpXb5BKxFK5S8RxzvHJZ0dZsbGANTlFVFTXMbh3PPdcO5IFk5M1\nmy5RQeUuEeNgeRUrPykkfVM+ew9VEN8+lrnjErl56kDSBvXkbDuWikQalbuEtepaH/+z4wArMgvY\nsOsQdT7H1ME9WXLFMOaOS6RzB32LS3TSd76Epe3F5aRnFrAqq5AjFdX069aBH04fyo1Tkhma0MXr\neCKeU7lL2CirrOHP2YWkZxawpbCMuFjjmtT+3JiWzPThCcTqB41EPqdyl5BW53O8l3eY9E0FvLa1\nhOpaH6MTu/GQfya9l2bSRZqkcpeQ9GlpBS9tKmDlpgKKyqro3imOW6elcOOUZMYmaSZdpDkqdwkZ\nldW1rNtSwgr/TLoZTB+ewE81ky7SYip38VT9TPox0jPzWZNTzIlTtZ/PpN8wOYnE7p28jigSllTu\n4omD5VW8vLmQFZlfzKTPGZfIzWkDmTpYM+ki50vlLm2mfib9IOmZ+bzln0lPG9STJQuGMWd8Il00\nky4SNPrbJK1uR0n9TPorm7+YSV88fSg3aSZdpNWo3KVVlFXWsDq7kPRNBeQU1M+kz0ztx01TBnL5\n8D60i43xOqJIRFO5S9D4fI739hxmReYXM+mj+nfVTLqIB1Tuct4+K63kpU35rPykkMJjJ+neKY6v\nTx3ITWkDGTOgm94cFfGAyl3OycnqOtblFrMiM58P99bPpF8+PIH754xixuh+dIzTTLqIl1TuErDT\nM+kvbconI7t+Jn1Q73juvmYEN0xOZkAPzaSLhAqVuzTr4PEqXvmkfiZ9z6EKOsWdnklPZtqQXrrt\nIhKCVO7SpNMz6S9tyufNnV/MpD+yYChzxw/QTLpIiNPfUPkbO0uOsyIzn1WbCymtqKZv1/qZ9Bun\nJDNMM+kiYUPlLpSdrGF1dhHpmfmfz6TPGN2Pm9M0ky4SrlTuUcrnc7y/p5QVmfmsbzCT/uB1qVw/\nSTPpIuFO5R5l8o9Uku7fJ/30TPrCqQO5WTPpIhFF5R4FTs+kp2cW8MHeUszgsgv6cN/sUcxM1Uy6\nSCRSuUeomjof7+YdJiO7iNe3HuDEqVpSesVz18wRLJiimXSRSKdyjyB1PsdH+0rJyC5mXW4xxypr\n6NqxHbPH9mfBlGSmDe5FjD5EWiQqqNzDnM/n2Jx/lIzsYl7dUsyh46eIbx/LzNR+zBs/gMtH9NHH\n04lEoYDK3cxmAY8DscDTzrlfNnr9G8C9gAHHgducc9lBzip+zjlyC8vJyCni1ZxiCo+dpH27GK4e\n2Zd5EwZw9ai+dGqvQheJZs2Wu5nFAk8AM4ECYKOZrXbObWtw2D7gCufcUTObDTwFXNgagaPZrgPH\nycguIiO7iP2llbSLMaaPSODua0cwY3Q/unaM8zqiiISIQK7cpwF5zrm9AGa2HJgPfF7uzrn3Gxz/\nIZAczJDRbN/hCtZkF5GRU8SuAyeIMbhkWB9uu3IY147pT494zaOLyJcFUu5JQH6DxwWc/ar8e8C6\n8wkV7QqPneTVnCIysovZUlgGwLTBvfjF/DHMHptIQtcOHicUkVAX1DdUzewq6sv9sjO8vhhYDJCS\nkhLMU4e9g8erWJtTTEZOMZs+PQrAhOTuPDB3NHPGJWp0UURaJJByLwQGNnic7H/ub5jZeOBpYLZz\nrrSp38g59xT19+NJS0tzLU4bYY5WVLMut4Q1OUV8uLcUn4NR/btyz7UjuW58IoN6d/Y6ooiEqUDK\nfSMw3MyGUF/qC4FbGx5gZinAy8C3nHO7gp4ygpRX1fCXrQfIyCni3d2HqfU5hvbpzO1XD2fe+ESG\n9+vqdUQRiQDNlrtzrtbMbgdeo34U8hnn3FYzW+J/fRnwINAb+Hf/3iS1zrm01osdXiqra/nr9oNk\nZBfx1q5DVNf6SOrRie9fPpR5ExJJTdSeLiISXOacN3dH0tLSXGZmpifnbgtVNXVs2HWIjOwi/rr9\nICdr6ujbtQNzxycyb8IAJg3soUIXkRYzs02BXDzrJ1SDqKbOx3t5h8nILub1rSUcP1VLr87tuWFy\nEvMmDGDq4F7E6sf/RaQNqNzPU8P9XNbnFnPUv5/LrLH9uW7CAC4Z1ps4fdiFiLQxlfs5ONN+LjNG\n92PehAFM134uIuIxlXuAnHNsLSonI7uINdrPRURCnMq9Gaf3c1mTU8y+wxWf7+dy1zUjmJmq/VxE\nJDSp3Juw/3AFa/w//r/zwHFiDC4e1psfTh/KrLHaz0VEQp/K3a+p/VymDu6p/VxEJCxFdbmf3s9l\nTU4xmQ32c/npnNHMHa/9XEQkfEVduR+tqGb91hIysrWfi4hErqgo96b2cxmi/VxEJIJFbLmf3s9l\nTU4Rb+78Yj+X710+hHnjBzBmgPZzEZHIFVHlfqq2jg07D5GRU8wb2w58vp/LNy5M0X4uIhJVwr7c\n/2Y/l20lHK+qpWd8HDdMTuK68QOYNkT7uYhI9AnLcq/zOT7ed4SMnCLWbfliP5drx/RnnvZzEREJ\nv3L/6/YD3PfyFg4dP0WnuFhmpmo/FxGRxsKu3JN7xjM5pcfn+7nEtw+7P4KISKsLu2Yc2b8rT35L\nH/IkInI2ujEtIhKBVO4iIhFI5S4iEoFU7iIiEUjlLiISgVTuIiIRSOUuIhKBVO4iIhHInHPenNjs\nEPDpOf7nfYDDQYwTLKGaC0I3m3K1jHK1TCTmGuScS2juIM/K/XyYWaZzLuR+TDVUc0HoZlOullGu\nlonmXLotIyISgVTuIiIRKFzL/SmvA5xBqOaC0M2mXC2jXC0TtbnC8p67iIicXbheuYuIyFmEdLmb\n2Swz22lmeWZ2XxOvm5n9zv96jplNDpFcV5pZmZll+X892Ea5njGzg2aWe4bXvVqv5nK1+XqZ2UAz\ne9PMtpnZVjO7o4lj2ny9AszlxXp1NLOPzSzbn+sfmzjGi/UKJJcnfx/95441s81mtqaJ11p3vZxz\nIfkLiAX2AEOB9kA2kNromDnAOsCAi4CPQiTXlcAaD9ZsOjAZyD3D622+XgHmavP1AhKByf6vuwK7\nQuT7K5BcXqyXAV38X8cBHwEXhcB6BZLLk7+P/nPfCfypqfO39nqF8pX7NCDPObfXOVcNLAfmNzpm\nPvC8q/ch0MPMEkMglyecc28DR85yiBfrFUiuNuecK3bOfeL/+jiwHUhqdFibr1eAudqcfw1O+B/G\n+X81fsPOi/UKJJcnzCwZmAs8fYZDWnW9Qrnck4D8Bo8L+PI3eSDHeJEL4BL/P7XWmdmYVs4UKC/W\nK1CerZeZDQYmUX/V15Cn63WWXODBevlvMWQBB4G/OOdCYr0CyAXefH89BvwD4DvD6626XqFc7uHs\nEyDFOTce+P/AKo/zhDrP1svMugArgaXOufK2Om9zmsnlyXo55+qccxOBZGCamY1ti/M2J4Bcbb5e\nZnYdcNA5t6m1z3UmoVzuhcDABo+T/c+19Jg2z+WcKz/9T0Xn3Fogzsz6tHKuQHixXs3yar3MLI76\nAv1v59zLTRziyXo1l8vr7y/n3DHgTWBWo5c8/f46Uy6P1utS4Ktmtp/6W7dXm9l/NTqmVdcrlMt9\nIzDczIaYWXtgIbC60TGrgW/733W+CChzzhV7ncvM+puZ+b+eRv06l7ZyrkB4sV7N8mK9/Of7A7Dd\nOffbMxzW5usVSC6P1ivBzHr4v+4EzAR2NDrMi/VqNpcX6+Wcu985l+ycG0x9R/yPc+6bjQ5r1fVq\nF6zfKNicc7VmdjvwGvUTKs8457aa2RL/68uAtdS/45wHVAKLQiTXjcBtZlYLnAQWOv/b463JzF6g\nfjKgj5kVAA9R/waTZ+sVYC4v1utS4FvAFv/9WoCfACkNcnmxXoHk8mK9EoHnzCyW+nJc4Zxb4/Xf\nxwBzefL3sSltuV76CVURkQgUyrdlRETkHKncRUQikMpdRCQCqdxFRCKQyl1EJAKp3EVEIpDKXUQk\nAqncRUQi0P8C1AoCZ0IGAjwAAAAASUVORK5CYII=\n",
"text/plain": [
"<matplotlib.figure.Figure at 0x7f4204e390f0>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"sorted_w_cumsum: [ 0.1084106 0.22611354 0.35689458 0.51534084 1. ]\n"
]
}
],
"source": [
"sorted_w_cumsum = np.cumsum(sorted_w)\n",
"plt.plot(sorted_w_cumsum); plt.show()\n",
"print ('sorted_w_cumsum: ', sorted_w_cumsum)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"4) Now find the index when cumsum hits 0.5:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"idx = np.where(sorted_w_cumsum>0.5)[0][0]\n",
"idx"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"5) Finally, your answer is sample at that position:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"52"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pos = idxs[idx]\n",
"x[pos]"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Data: [63 52 70 17 76]\n",
"Sorted data: [17 52 63 70 76]\n",
"Weighted median: 52, Median: 63\n"
]
}
],
"source": [
"print('Data: ', x)\n",
"print('Sorted data: ', np.sort(x))\n",
"print('Weighted median: %d, Median: %d' %(x[pos], np.median(x)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Thats it! "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If the procedure looks surprising for you, try to do steps 2--5 assuming the weights are $w_i=\\frac{1}{N}$. That way you will find a simple median (not weighted) of the data. "
]
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python [conda env:py3]",
"language": "python",
"name": "conda-env-py3-py"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment