Instantly share code, notes, and snippets.

# maxkapur/SchoolLocation.ipynb

Last active April 13, 2022 09:00
Star You must be signed in to star a gist
SchoolLocation
Display the source blob
Display the rendered blob
Raw
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
 { "cells": [ { "cell_type": "markdown", "id": "3c51999f-c422-4923-948f-df379d47293f", "metadata": { "tags": [] }, "source": [ "# The school location problem\n", "\n", "Max Kapur | [maxkapur.com](https://www.maxkapur.com) | Apr. 13, 2022\n", "\n", "Video supplement: https://youtu.be/bjreiQRHerU\n", "\n", "## Introduction\n", "\n", "The school location problem is a kind of [facility location problem](https://en.wikipedia.org/wiki/Facility_location_problem). The city needs to build $n$ schools to serve $m$ families. Each family's location (in Euclidean space) is given by $p_j$, and their children will attend the school *closest* to home, so the distance they travel to school is \n", "$$\\min_{i = 1 \\dots n}\\left\\{\\lVert x_i - p_j \\rVert_q \\right\\}.$$\n", "Here $x_i$ is the location of school $i$ and $\\lVert \\cdot \\rVert_q$ is some $p$-norm. In this demonstration, we will use the 2-norm or Euclidean distance, but the 1-norm or taxicab distance may be more appropriate in an urban context. \n", "\n", "Our goal is to choose the $x_i$ that *minimize,* in some sense, the distance between families and schools. Since we already assume that families attend the nearest school, this means we are dealing with some kind of \"minimin\" problem. \"Minimin\" and \"maximax\" problems tend to be more computationally challenging than \"minimax\" or \"maximin\" problems because of the [convexity/concavity](https://en.wikipedia.org/wiki/Convex_function) of the max/min operators. Often, the most effective way to formulate a minimin problem is by adding binary variables that identify the minimum. \n", "\n", "In the school location problem, we will define the [social cost](https://en.wikipedia.org/wiki/Social_cost) as the *longest* distance that any student travels to school. The reason is that in a public planning problem such as this, it is important to prevent jealousy among the families. By minimizing the longest distance, we ensure that no single family has a longer commute than everyone else. (To see why this is true, suppose that some plan does give a single family the longest commute. Then their commute defines the social cost for this plan, and by nudging their school infinitesimally closer to them, we can decrease the social cost; therefore this plan is not optimal.) The plan that minimizes the average distance, for example, typically does not have this property.\n", "\n", "## Problem formulation\n", "\n", "Incorporating the social cost function above, we can now write the school location problem. It has a delightful \"minimaximin\" form:\n", "$$\\text{minimize}\\quad \\max_{j = 1 \\dots m} \\Bigl\\{ \\min_{i = 1 \\dots n}\\left\\{\\lVert x_i - p_j \\rVert_2 \\right\\} \\Bigr\\}$$\n", "To get this into a form that our solver can handle, we first let the variable $t$ stand in for the maximal distance:\n", "\\begin{align}\n", "\\text{minimize}\\quad & t \\\\\n", "\\text{subject to}\\quad &t \\geq\\min_{i = 1 \\dots n}\\left\\{\\lVert x_i - p_j \\rVert_2 \\right\\}, &j = 1 \\dots m\n", "\\end{align}\n", "Since $t$ is greater than or equal to all of the inputs to the maximum, it is greater than or equal to the value of the maximum itself. And since this is a minimization problem, $t$ will be decreased \"automatically\" until it equals the value of the value of the maximum.\n", "\n", "On the other hand, because this is *not* a maximization problem, the same trick doesn't work for getting rid of the minimum. Instead, let's introduce a binary variable $s_{ij}$ that equals one if family $j$'s children attend school $i$, and zero otherwise. Now we can write the problem as follows.\n", "\\begin{align}\n", "\\text{minimize}\\quad & t \\\\\n", "\\text{subject to}\\quad &t \\geq \\lVert x_i - p_j \\rVert_2 - M (1 - s_{ij}), & i = 1\\dots n,~j = 1 \\dots m \\\\\n", "& \\sum_{i=1}^n s_{ij} \\geq 1, & j = 1\\dots m \\\\\n", "& s_{ij} \\in \\{0, 1\\}, & i = 1\\dots n,~j = 1 \\dots m\n", "\\end{align}\n", "Here $M$ is a large, positive constant. If $s_{ij} = 0$, then the first constraint is vacuous; if $s_{ij} = 1$, the first constraint activates as required. The second constraint simply says that everyone must attend at least one school. By the minimization argument, it is easy to see that this constraint will hold with equality \"automatically.\"\n", "\n", "We will use the [Splitting Conic Solver](https://github.com/cvxgrp/scs) (SCS) to solve the relaxations of this problem. The first constraint in the problem above is already convex, but the solver will process it more efficiently if we input it in the following conic form:\n", "\\begin{align}\n", "\\text{minimize}\\quad & t \\\\\n", "\\text{subject to}\\quad &\n", "\\begin{bmatrix}d_{ij} \\\\ x_i - p_j \\end{bmatrix} \\in C_2, & i = 1\\dots n,~j = 1 \\dots m \\\\\n", "& t \\geq d_{ij} - M (1 - s_{ij}), & i = 1\\dots n,~j = 1 \\dots m \\\\\n", "& \\sum_{i=1}^n s_{ij} \\geq 1, & j = 1\\dots m \\\\\n", "& s_{ij} \\in \\{0, 1\\}, & i = 1\\dots n,~j = 1 \\dots m\n", "\\end{align}\n", "Here $C_2 = \\bigl\\{(h, y) \\in \\mathbb{R}^3 : h \\geq \\lVert y \\rVert_2 \\bigr\\}$ is known as the *second-order cone.* \n", "\n", "## Solving the problem in Julia\n", "\n", "We will use Julia and the [JuMP](https://jump.dev) modeling language to express this problem as a second-order conic program (SOCP). We will use SCS to solve the relaxations, and the [Juniper](https://lanl-ansi.github.io/Juniper.jl/stable/) package to handle the binary variables." ] }, { "cell_type": "code", "execution_count": 1, "id": "8ce1635b-ee26-47be-afc3-52591d9ff67f", "metadata": {}, "outputs": [], "source": [ "using Plots\n", "using JuMP\n", "using Juniper, SCS\n", "using LinearAlgebra\n", "using Test" ] }, { "cell_type": "markdown", "id": "f472e472-4ad2-45f5-adb7-8df1cc95d4d1", "metadata": {}, "source": [ "Let's generate a fake instance of the problem with $m = 20$ families and $n = 3$ schools. The plot shows the location of each household." ] }, { "cell_type": "code", "execution_count": 2, "id": "suspended-anxiety", "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m = 20 # number of families\n", "n = 3 # number of schools\n", "ndims = 2 # number of Euclidean dimensions in which points are embedded\n", "\n", "# Generate points approximately on the unit circle\n", "ps = map(1:m) do _\n", " p = randn(ndims)\n", " normalize!(p)\n", " p += 0.1*randn(ndims)\n", " return p\n", "end\n", "\n", "pl = plot(size=(600,600), xlim=(-1.5, 1.5), ylim=(-1.5, 1.5))\n", "scatter!(pl, [tuple(p...) for p in ps], m=:square, c=:black, msc=:auto, label=\"families\")" ] }, { "cell_type": "markdown", "id": "066a8a68-2803-41a0-bf1c-7d1046260764", "metadata": {}, "source": [ "Now we can start building the model in JuMP. \n", "\n", "To ensure that the model is numerically stable, it is important to choose a value of $M$ that is [as small as possible](\n", "https://or.stackexchange.com/questions/236/why-is-it-important-to-choose-big-m-carefully-and-what-are-the-consequences-of-d?noredirect=1&lq=1) without breaking the logic of the constraints. In our case, it is sufficient to have $M \\geq \\lVert x_i - p_j \\rVert_2$ for all $i$ and $j$. It is obvious that, in an optimal plan, the distance between any school and any household will not exceed the largest distance between any two households. So, we can set \n", "$$M = \\max_{i \\neq k} \\bigl\\{\\lVert p_i - p_k \\rVert_2\\bigr\\}.$$\n", "In the code below, I use the 1-norm here to make it easy for the reader to try using the 1-norm as the distance criterion. This is a conservative choice since the 1-norm is guaranteed to be larger than the 2-norm. " ] }, { "cell_type": "code", "execution_count": 3, "id": "9a68d918-3441-49d2-b260-b99b8a8571eb", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3.1267062299149817" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Set SCS as the solver for the relaxation problems. \n", "nl_solver = optimizer_with_attributes(SCS.Optimizer, \"verbose\"=>0)\n", "\n", "# Wrap SCS in Juniper to enable integrality constraints.\n", "mdl = Model(\n", " optimizer_with_attributes(\n", " Juniper.Optimizer,\n", " \"nl_solver\"=>nl_solver,\n", " \"atol\"=>1e-3,\n", " )\n", ")\n", "\n", "set_silent(mdl)\n", "\n", "# Set the value of big M, which needs to be an upper bound on d[i, j]. \n", "M = maximum(norm(ps[i] - ps[k], 1) for i in 1:m for k in i+1:m)" ] }, { "cell_type": "markdown", "id": "6b95cbe1-30bf-4e71-b55b-1293ed996a41", "metadata": {}, "source": [ "Now we declare our variables and constraints. Notice how easy it is to enter the conic constraint using JuMP's MOI.SecondOrderCone() notation." ] }, { "cell_type": "code", "execution_count": 4, "id": "688a8d40-5498-452a-aeb8-371a85c53316", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "A JuMP Model\n", "Minimization problem with:\n", "Variables: 127\n", "Objective function type: VariableRef\n", "AffExpr-in-MathOptInterface.GreaterThan{Float64}: 80 constraints\n", "Vector{AffExpr}-in-MathOptInterface.SecondOrderCone: 60 constraints\n", "VariableRef-in-MathOptInterface.ZeroOne: 60 constraints\n", "Model mode: AUTOMATIC\n", "CachingOptimizer state: EMPTY_OPTIMIZER\n", "Solver name: Juniper\n", "Names registered in the model: d, s, t, x" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@variable(mdl, t)\n", "@variable(mdl, d[1:n, 1:m])\n", "@variable(mdl, s[1:n, 1:m], Bin)\n", "@variable(mdl, x[1:ndims, 1:n])\n", "\n", "for j in 1:m\n", " for i in 1:n \n", " @constraint(mdl, [d[i, j]; x[:, i] .- ps[j]] in MOI.SecondOrderCone(ndims+1)) \n", " \n", " # To use one norm (taxicab distance) instead\n", " # @constraint(mdl, [d[i, j]; x[:, i] .- ps[j]] in MOI.NormOneCone(ndims+1)) \n", "\n", " # s[i, j] = 1 ⟹ d[i, j] ≤ t\n", " @constraint(mdl, t ≥ d[i, j] - M * (1 - s[i, j]))\n", " end\n", " @constraint(mdl, sum(s[i, j] for i in 1:n) ≥ 1)\n", "end\n", "\n", "@objective(mdl, Min, t)\n", "\n", "mdl" ] }, { "cell_type": "markdown", "id": "0ce3b60a-3846-4ebf-b44c-3d751867a93f", "metadata": {}, "source": [ "Here is where we actually solve the model. Depending on your computer, this cell may take several minutes to run. You can reduce the number or families and/or schools if you are impatient." ] }, { "cell_type": "code", "execution_count": 5, "id": "456ac181-46f4-49e4-80b8-65996a286aea", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "┌ Warning: Only almost solved\n", "└ @ Juniper /home/max/.julia/packages/Juniper/OUSNz/src/BnBTree.jl:89\n" ] }, { "data": { "text/plain": [ "* Solver : Juniper\n", "\n", "* Status\n", " Termination status : LOCALLY_SOLVED\n", " Primal status : FEASIBLE_POINT\n", " Dual status : FEASIBLE_POINT\n", " Message from the solver:\n", " \"LOCALLY_SOLVED\"\n", "\n", "* Candidate solution\n", " Objective value : 0.7316442405728293\n", " Objective bound : 0.7316156410382292\n", "\n", "* Work counters\n", " Solve time (sec) : 89.21828\n" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "optimize!(mdl)\n", "solution_summary(mdl)" ] }, { "cell_type": "markdown", "id": "9c95b17b-253b-413b-9aed-95fd5e7fba98", "metadata": {}, "source": [ "Read out the solution and verify its correctness. Each entry of xs gives the coordinates of one school, and we store the index of the $j$th family's nearest school as attend_school[j]. We verify that this is correct in two ways:\n", "\n", "- First, we check the d[i, j]-values and ensure that each attend_school[j] is the minimum of the corresponding column. (Note that the d[i, j] values only give the correct Euclidean distance to the *closest* school. If s[i, j] == 0, it is not necessarily true that d[i, j] == norm(xs[i] - ps[j]), because the corresponding big-$M$ constraint is not necessarily binding.)\n", "- Next, we check that the 1 in each column of s occurs at the attend_school[j]th position. " ] }, { "cell_type": "code", "execution_count": 6, "id": "ac4fa1d0-429c-4249-94de-2c63e9b4823f", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\u001b[32m\u001b[1mTest Passed\u001b[22m\u001b[39m\n", " Expression: attend_school == map(argmin, eachcol(value.(d))) == map(argmax, eachcol(value.(s)))\n", " Evaluated: [2, 2, 2, 2, 1, 2, 2, 1, 3, 1, 2, 2, 1, 3, 1, 3, 3, 2, 3, 2] == [2, 2, 2, 2, 1, 2, 2, 1, 3, 1, 2, 2, 1, 3, 1, 3, 3, 2, 3, 2] == [2, 2, 2, 2, 1, 2, 2, 1, 3, 1, 2, 2, 1, 3, 1, 3, 3, 2, 3, 2]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "xs = [value.(x[:, i]) for i in 1:n]\n", "\n", "attend_school = map(ps) do p\n", " argmin(i -> norm(p - xs[i]), 1:n)\n", "end\n", "\n", "@test attend_school == map(argmin, eachcol(value.(d))) == map(argmax, eachcol(value.(s)))" ] }, { "cell_type": "markdown", "id": "2bc6aceb-8a86-4e17-b718-f4d67797dd32", "metadata": {}, "source": [ "Finally, we make a plot showing the school locations and each family's local school. " ] }, { "cell_type": "code", "execution_count": 7, "id": "progressive-suspension", "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pl = plot(size=(600,600), xlim=(-1.5, 1.5), ylim=(-1.5, 1.5))\n", "\n", "for (j, p) in enumerate(ps)\n", " plot!(\n", " pl,\n", " [tuple(p...), tuple(xs[attend_school[j]]...)],\n", " label= j==1 ? \"local school\" : nothing,\n", " c = \"navy\",\n", " ls = :dash\n", " )\n", "end\n", "\n", "scatter!(pl, [tuple(p...) for p in ps], m=:square, c=:black, msc=:auto, label=\"families\")\n", "scatter!(pl, [tuple(x...) for x in xs], c=:crimson, msc=:auto, ms=7, label=\"schools\")\n", "\n", "pl" ] }, { "cell_type": "markdown", "id": "1ae4cdf6-d220-4868-abd9-3feae229eae1", "metadata": {}, "source": [ "## Evaluating the model\n", "\n", "The output above suggests that this model finds intuitive \"neighborhoods\" in the input data and produces a school for each. In this regard, the problem is very similar to [$k$-means clustering](https://en.wikipedia.org/wiki/K-means_clustering). However, the $k$-means clustering solution minimizes the *sum* (equivalently, average) Euclidean distance between families and schools, and therefore differs from our model. For example, if there is an outlier student who lives very far from the others, our model will place a school halfway between her family and the population center, whereas the $k$-means clustering solution will not be as drastically swayed. This means that the school location problem is not very *robust* to outliers. Such is the cost of fairness.\n", "\n", "Because of its integrality constraints, the school location problem is very difficult to solve. This is another commonality with $k$-means clustering, which is known to be NP-hard. \n", "\n", "## Comparison with $k$-means\n", "\n", "Let's try using a heuristic algorithm for the $k$-means clustering problem (which, confusingly, is often called the \"$k$-means algorithm\") to produce an alternative solution and see how it differs from that above. " ] }, { "cell_type": "code", "execution_count": 8, "id": "rational-sensitivity", "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "using Clustering\n", "\n", "res = kmeans(hcat(ps...), n)\n", "xs_kmeans = map(Vector{Float64}, eachcol(res.centers))\n", "\n", "attend_school_kmeans = attend_school = map(ps) do p\n", " argmin(i -> norm(p - xs_kmeans[i]), 1:n)\n", "end\n", "\n", "pl = plot(size=(600,600), xlim=(-1.5, 1.5), ylim=(-1.5, 1.5))\n", "\n", "for (j, p) in enumerate(ps)\n", " plot!(\n", " pl,\n", " [tuple(p...), tuple(xs_kmeans[attend_school_kmeans[j]]...)],\n", " label= j==1 ? \"local school\" : nothing,\n", " c = \"navy\",\n", " ls = :dash\n", " )\n", "end\n", "\n", "scatter!(pl, [tuple(p...) for p in ps], m=:square, c=:black, msc=:auto, label=\"families\")\n", "scatter!(pl, [tuple(x...) for x in xs_kmeans], c=:crimson, msc=:auto, ms=7, label=\"schools (k means)\")\n", "\n", "pl" ] }, { "cell_type": "markdown", "id": "5431e1e9-1ea8-4788-a2a3-ab7c450a15d8", "metadata": {}, "source": [ "As expected, this solution results in some students who have long commutes in service of the greater good. " ] } ], "metadata": { "kernelspec": { "display_name": "Julia 1.7.1", "language": "julia", "name": "julia-1.7" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.7.1" } }, "nbformat": 4, "nbformat_minor": 5 }