Skip to content

Instantly share code, notes, and snippets.

@andersbjorkland
Created April 10, 2024 16:08
Show Gist options
  • Save andersbjorkland/3ee7c4dc426fc0d0ab358bb6158080f1 to your computer and use it in GitHub Desktop.
Save andersbjorkland/3ee7c4dc426fc0d0ab358bb6158080f1 to your computer and use it in GitHub Desktop.
Traveling Salesman Problem with Genetic Algorithm in Elixir/Livebook

Travelling Salesman

Mix.install([
  {:vega_lite, "~> 0.1.8"},
  {:kino_vega_lite, "~> 0.1.11"},
  {:jason, "~> 1.4"},
  {:kino_explorer, "~> 0.1.11"}
])

Background

The Travelling Salesman Problem (TSP) is a classic optimization problem. The "Salesman" needs to visit each city and return to the starting point by travelling the least distance possible. The solution is given by the order in which each city is visited in.

There are multiple ways to approach a solution. Some of these are:

  • Brute force
    By calculating the distance for each possible solution and then use the optimal route.
    The number of possible solutions are (n-1)!/2. For n=18 this would be about 1.8 * 10^14. It is about 200 times more combinations than the number of blood cells in a human body (26 trillion).
  • Nearest Neighbor
    This will see the Salesman travell to whichever city is closest that they have not visited before. The result will probably not be the shortest total route, but might be good enough, depending on circumstance.
  • Genetic Algorithm
    Simulate evolutionary processes to arrive at a route that has the best fitness. This will iterate through multiple generations to arrive at a solution that might be the best fit, but might also be a "local maxima".

In this article we will solve, or approach a solution to, the TSP by using a Genetic Algorithm (GA). We will use all the common components of a GA and visualize its solution.

Genetic Algorithms in Summary
"Genetic Algorithms (GA) are algorithms inspired by the evolutionary process found in nature. Sometimes these are called Stochastic Algorithms as they make use of randomness to find an optimal solution, but with features such a "natural selection" to remove less suitable solutions while generating new possible ones.", abjork.land 2024

Helper Modules

Before we delve into the details of the Travelling Salesman Genetic Algorithm, we will define two helper modules. These are not directly involved in the GA but will help with illustrating the GA:s performance.

The first module, GeometryRender, is used to draw the coordinates of each city and draws a path between them, as stipulated by a chromosome.

The second module, DataAggregator, is used to collect data from the GA in a concurrent fashion. Its loop/1 will continue to listen to messages and aggregate them. The receive/1 allows to read the current available aggregate. This allows for collecting data on how the GA perform over generations.

defmodule GeometryRender do
  def cities_and_route_plot(cities, route) do
    """
    <svg width='400' height='240'>
      <defs>
        <linearGradient id="gradient" x1="0%" y1="0%" x2="100%" y2="0%">
          <stop offset="0%"   stop-color="#DDB7E8"/>
          <stop offset="100%" stop-color="#3F297F"/>
        </linearGradient>
      </defs>
      #{render_cities(cities)}
      #{render_route(cities, route)}
    </svg>
    """
  end

  #  #{render_route(cities, route)}

  def render_cities(cities) do
    Enum.map(cities, fn city ->
      "<circle cx='#{city.x * 10}' cy='#{city.y * 10}' r='3' fill='#4291c2' />"
    end)
    |> Enum.join()
  end

  def render_route(cities, route) do
    first_city = Enum.at(cities, Enum.at(route, 0))

    points =
      Enum.map(route, fn city_index ->
        city = Enum.at(cities, city_index)
        "#{city.x * 10},#{city.y * 10}"
      end)
      |> Kernel.++(["#{first_city.x * 10},#{first_city.y * 10}"])
      |> Enum.join(" ")

    render_route = "<path d='M #{points}' stroke='#4eace6' stroke-width='1' fill='none' />"

    render_labels =
      Enum.with_index(route, fn city_index, index ->
        city = Enum.at(cities, city_index)

        "<text x='#{city.x * 10 + 8}' y='#{city.y * 10 + 4}' fill='black'>#{index + 1}</text>"
      end)
      |> Enum.join(" ")

    render_route <> render_labels
  end

  def get_styles() do
    """
      .path-wrapper {
        display: flex;
      }

      .center-path {
        display: flex;
        flex-direction: column;
        margin: auto;
      }

      path {
        stroke : url(#gradient);
        stroke-width: 1;
      }

      circle {
        stroke: #603DC0;
        stroke-width: 1;
        fill: none;
      }

      table {
        border-collapse: collapse; 
        width: 200px; /* Adjust width as needed */
        font-size: 0.8rem;
      }

      th, td {
        border: 1px solid #ddd;
        padding: 8px;
        text-overflow: ellipsis;
        white-space: nowrap;
      }
    """
  end
end
# Gather data from each loop 
defmodule DataAggregator do
  def loop(agg) do
    receive do
      {:continue, message} ->
        data = [message | agg]
        loop(data)

      {:read, pid} ->
        send(pid, {:read_receiver, agg})
        loop(agg)

      :done ->
        Process.exit(self(), :normal)
        IO.puts("Done!")
        nil

      _ ->
        {:error}
    end
  end

  def receiver(pid) do
    send(pid, {:read, self()})

    receive do
      {:read_receiver, data} -> Enum.reverse(data)
    end
  end
end

TSP Genetic Algorithm

The Traveller Salesman Problem (TSP) requires a list of cities for the Salesman to visit. The order in which they visit them will be important; this is the route. The shorter the route, the better!

We will employ a Genetic Algorithm (GA) to approach a solution. Meaning, it might not be the best possible solution, but it will approach a decent enough solution.

Using GA we will represent each possible solution as a chromosome. A chromosome will be a list of the cities in the order that they will be visited. In the beginning we will initialize a population of randomly generated chromosomes, meaning they will be far from an optimal solution as the path may criss-cross multiple times.

This population will be passed to an evolutionary loop that we will set out to stop after n number of generations (another way would be to use a sliding window to determine if the progress is tapering off).

The evolutionary loop will do the following:

  1. Select n number of parents. The selection is based on probability where chromosomes that has a shorter route are more likely to be selected - but other chromosomes still have a chance.
    This is important as this gives a way out for the algorithm to lock in too early on a route that may be suboptimal.
  2. Each pair of parents will produce two offsprings via the procceses crossover and mutation.
  3. Prune as many chromosomes from the population as there are new offsprings and add the offsprings to the new population.
    The population size will remain static with this approach
  4. Loop the evolution over until loop-limit has been reached.

We will explore each step in detail before we go into coding it out in Elixir.

Selecting Parents

As in life, we can't choose who are our parents, but in GA we can influence who will be able to produce children. We can control this process very tightly. We can select only the most fit chromosomes to be parents. This will most likely see us improve upon the route optimization pretty fast. The backside of this is that this might lead us to lock us in to one specific route without allowing other options that might branch into better solutions - also known as being trapped in a local maxima.

The alternative is that we will allow all chromosomes of the population the chance of being selected as a parent. But with that said, the chance should not be uniform. Rather we will give the chromosomes that have a shorter route a higher chance for reproduction. This will combine the strength of promoting overall shorter distance in the population, while still maintaining diversity to avoid the problem of lock-in.

defmodule TSP do
  # Data: a list of cities with coordinates
  @cities [
    %{x: 10, y: 20},
    %{x: 5, y: 15},
    %{x: 12, y: 1},
    %{x: 20, y: 10},
    %{x: 7, y: 14},
    %{x: 15, y: 3},
    %{x: 15, y: 12},
    %{x: 15, y: 10},
    %{x: 20, y: 8},
    %{x: 2, y: 14},
    %{x: 3, y: 11},
    %{x: 14, y: 7},
    %{x: 5, y: 11},
    %{x: 5, y: 23},
    %{x: 20, y: 23},
    %{x: 7, y: 23},
    %{x: 16, y: 12},
    %{x: 25, y: 12}
  ]

  def initialize_population(population_size) do
    chromosome_template = Enum.to_list(0..(Enum.count(@cities) - 1))

    Enum.map(1..population_size, fn _ -> Enum.shuffle(chromosome_template) end)
  end

  # Calculate the distance between two cities 
  def distance(%{x: x1, y: y1}, %{x: x2, y: y2}) do
    # Using 2D Euclidean distance
    dx = abs(x1 - x2)
    dy = abs(y1 - y2)

    (dx ** 2 + dy ** 2)
    |> :math.sqrt()
  end

  def distances([first, second | []]) do
    city1 = Enum.at(@cities, first)
    city2 = Enum.at(@cities, second)

    distance(city1, city2)
  end

  def distances([first, second | tail]) do
    city1 = Enum.at(@cities, first)
    city2 = Enum.at(@cities, second)

    distance(city1, city2) + distances([second | tail])
  end

  def fitness_function(chromosome) do
    last_to_first_elements = Enum.take(chromosome, 1) ++ Enum.take(chromosome, -1)

    there = distances(chromosome)
    and_back = distances(last_to_first_elements)

    there + and_back
  end

  def general_inverse_fitness(population_data) do
    Enum.reduce(population_data, 0, fn data, acc -> acc + 1 / data.distance end)
  end

  def select_parents(population_data, num_parents \\ 2) do
    general_inverse_fitness = general_inverse_fitness(population_data)

    # Select n parents
    Enum.map(1..num_parents, fn _ ->
      random_value = :rand.uniform() * general_inverse_fitness
      select_chromosome(population_data, random_value, 0)
    end)
  end

  defp select_chromosome(population_data, target_fitness, acc_fitness) do
    [chromosome_data | tail] = population_data

    # The greater value from the distance function the less the new distance will be, 
    # so we promote the shorter routes before the longer routes
    new_fitness = acc_fitness + 1 / chromosome_data.distance

    if new_fitness >= target_fitness do
      chromosome_data
    else
      select_chromosome(tail, target_fitness, new_fitness)
    end
  end

  def crossover(chromosome_1, chromosome_2) do
    # 1. We select the parts of chromosome_1 randomly (the order is important)
    #    It can be [_, _, B, C, A, _]
    #    which means first selection will be [B, C, A], leaving D, E, F 

    cut_point_1 = :rand.uniform(Enum.count(chromosome_1) - 1)
    cut_point_2 = :rand.uniform(Enum.count(chromosome_1) - 1)
    [start_point, end_point] = [cut_point_1, cut_point_2] |> Enum.sort()
    selection = Enum.slice(chromosome_1, start_point..end_point)

    # 2. We want to get the relationships that is present in chromosome_2

    #    It can be [B, C, D, F, E, A]
    #    Be mindful that we do not want to select either A, B, or C again. 
    #    This means that we should filter out every used gene: [D, F, E]
    #    and start filling the missing pieces with the genes that are left.

    available_genes = Enum.filter(chromosome_2, fn gene -> !Enum.member?(selection, gene) end)
    {prefix, suffix} = Enum.split(available_genes, start_point)

    # 3. And then we assemble the chromosome from the different parts, 
    #    combining chromosome_1 and chromosome_2 genes

    prefix ++ selection ++ suffix
  end

  def mutation(chromosome) do
    chromosome_length = Enum.count(chromosome)
    portion_of_mutation = 0.1
    least_num_of_mutations = 1

    num_of_mutations =
      round(chromosome_length * portion_of_mutation)
      |> max(least_num_of_mutations)

    # For each number of mutations
    # - Take random element
    # - Insert at random position

    mutating_genes = Enum.take_random(chromosome, num_of_mutations)

    template = Enum.filter(chromosome, fn gene -> !Enum.member?(mutating_genes, gene) end)

    Enum.reduce(mutating_genes, template, fn mutating_gene, acc ->
      random_pos = (Enum.count(acc) - 1) |> :rand.uniform()
      List.insert_at(acc, random_pos, mutating_gene)
    end)
  end

  def prune(sorted_population_data, prune_count) do
    preserve_count = Enum.count(sorted_population_data) - prune_count

    Enum.take(sorted_population_data, preserve_count)
  end

  def evolve(
        sorted_population,
        generation_n \\ 0,
        memo \\ %{num_parents: 10, generation_limit: 1000, aggregator_pid: false}
      )

  def evolve(
        sorted_population_data,
        generation_n,
        %{generation_limit: generation_limit, aggregator_pid: aggregator_pid}
      )
      when generation_n == generation_limit do
    [chromosome_data] = Enum.take(sorted_population_data, 1)

    total_fitness = general_inverse_fitness(sorted_population_data)

    result = %{
      distance: chromosome_data.distance,
      chromosome: chromosome_data.chromosome,
      generation: generation_n,
      population_fitness: total_fitness
    }

    case aggregator_pid do
      x when is_pid(x) -> send(aggregator_pid, {:continue, result})
      _ -> nil
    end

    result
  end

  def evolve(sorted_population_data, generation_n, memo) do
    %{num_parents: num_parents, aggregator_pid: aggregator_pid} = memo

    parents = select_parents(sorted_population_data, num_parents)

    offsprings =
      parents
      |> Enum.chunk_every(2)
      |> Enum.flat_map(fn [parent_1, parent_2] ->
        [
          TSP.crossover(parent_1.chromosome, parent_2.chromosome) |> TSP.mutation(),
          TSP.crossover(parent_2.chromosome, parent_1.chromosome) |> TSP.mutation()
        ]
      end)
      |> Enum.map(fn chromosome ->
        %{chromosome: chromosome, distance: fitness_function(chromosome)}
      end)

    new_population_data =
      (prune(sorted_population_data, num_parents) ++ offsprings)
      |> Enum.sort_by(
        fn chromosome_data -> chromosome_data.distance end,
        :asc
      )

    # determines how often we collect loop data (for later visualization)
    agg_update_rate = 1

    case rem(generation_n, agg_update_rate) do
      x when x == 0 ->
        case aggregator_pid do
          x when is_pid(x) ->
            [elite] = Enum.take(new_population_data, 1)
            total_fitness = general_inverse_fitness(new_population_data)

            send(
              aggregator_pid,
              {
                :continue,
                %{
                  distance: elite.distance,
                  chromosome: elite.chromosome,
                  generation: generation_n,
                  population_fitness: total_fitness
                }
              }
            )

          _ ->
            nil
        end

      _ ->
        nil
    end

    evolve(new_population_data, generation_n + 1, memo)
  end

  def run(
        aggregator_pid,
        population_size \\ 100,
        generation_limit \\ 250,
        reproduction_rate \\ 0.3
      ) do
    population = initialize_population(population_size)

    chromosome_datas =
      population
      |> Enum.map(fn chromosome ->
        %{chromosome: chromosome, distance: fitness_function(chromosome)}
      end)

    sorted_population =
      Enum.sort_by(
        chromosome_datas,
        fn %{distance: distance} ->
          distance
        end,
        :asc
      )

    num_parents = num_parents(population_size, reproduction_rate)

    evolve(
      sorted_population,
      0,
      %{
        num_parents: num_parents,
        generation_limit: generation_limit,
        aggregator_pid: aggregator_pid
      }
    )
  end

  defp num_parents(population_size, proportion) do
    preliminary_num_parents = round(population_size * proportion) |> max(2)

    case rem(preliminary_num_parents, 2) do
      x when x == 0 -> preliminary_num_parents
      _ -> min(preliminary_num_parents - 1, 2)
    end
  end

  def get_cities do
    @cities
  end
end
aggregator_pid =
  spawn(fn ->
    DataAggregator.loop([])
  end)

result = TSP.run(aggregator_pid, 100, 1000, 0.3)

%{
  chromosome: best_fit,
  distance: best_distance,
  generation: best_generation,
  population_fitness: best_population_fitness
} = result

fitness_curve = DataAggregator.receiver(aggregator_pid)
send(aggregator_pid, :done)

result
distance =
  VegaLite.new()
  |> VegaLite.data_from_values(fitness_curve, only: ["generation", "distance"])
  |> VegaLite.mark(:line)
  |> VegaLite.encode_field(:x, "generation", type: :quantitative)
  |> VegaLite.encode_field(:y, "distance", type: :quantitative, scale: %{domain: [80, 180]})

total_fitness =
  VegaLite.new()
  |> VegaLite.data_from_values(fitness_curve, only: ["generation", "population_fitness"])
  |> VegaLite.mark(:line)
  |> VegaLite.encode_field(:x, "generation", type: :quantitative)
  |> VegaLite.encode_field(:y, "population_fitness", type: :quantitative)

row_1 = Kino.Layout.grid([distance, total_fitness], columns: 2)

path_data =
  fitness_curve
  |> Enum.map(fn element ->
    route = element.chromosome

    first_city = Enum.at(TSP.get_cities(), Enum.at(route, 0))

    points =
      route
      |> Enum.map(fn city_index ->
        city = Enum.at(TSP.get_cities(), city_index)
        "#{city.x * 10},#{city.y * 10}"
      end)
      |> Kernel.++(["#{first_city.x * 10},#{first_city.y * 10}"])
      |> Enum.join(" ")

    "M #{points}"
  end)
  |> Jason.encode!()

meta_data =
  fitness_curve
  |> Jason.encode!()

plot = GeometryRender.cities_and_route_plot(TSP.get_cities(), best_fit)

formatted_distance = best_distance |> Float.round(4) |> Float.to_string()

best_path_html =
  Kino.HTML.new("""
    <style>
      #{GeometryRender.get_styles()}
    </style>
    <h4>Final Route</h4>
    #{plot}
    <table>
      <thead>
        <tr>
          <th>Generation</th>
          <th>Distance</th>
          <th>Pop. Fitness</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <td>#{best_generation}</td>
          <td>#{best_distance |> Float.round(2)}</td>
          <td>#{best_population_fitness |> Float.round(4)}</td>
        </tr>
      </tbody>
    </table>
  """)

generations_html =
  Kino.HTML.new("""
  <style>
    #{GeometryRender.get_styles()}
  </style>
  <div class="path-wrapper">
    <div class="center-path">
      <h4>Best Route per Generation</h4>
      <svg width='400' height='240'>
        <defs>
          <linearGradient id="gradient" x1="0%" y1="0%" x2="100%" y2="0%">
            <stop offset="0%"   stop-color="#DDB7E8"/>
            <stop offset="100%" stop-color="#3F297F"/>
          </linearGradient>
        </defs>
        <path id='route-path' d='' stroke='#4eace6' stroke-width='2' fill='none' />
        #{GeometryRender.render_cities(TSP.get_cities())}

      </svg>
      <table>
        <thead>
          <tr>
            <th>Generation</th>
            <th>Distance</th>
            <th>Pop. Fitness</th>
          </tr>
        </thead>
        <tbody>
          <tr>
            <td id="path-generation">0</td>
            <td id="path-distance">0</td>
            <td id="pop-fitness">0</td>
          </tr>
        </tbody>
      </table>
    <div>
  </div>
  <script>
    const metaData = #{meta_data}
    const pathData = #{path_data};
    const pathElement = document.getElementById('route-path');
    const nGenerationElement = document.getElementById('path-generation');
    const distanceElement = document.getElementById('path-distance');
    const popFitnessElement = document.getElementById('pop-fitness');

    let index = 0;

    function updatePath() {
      if (index < pathData.length) {
        pathElement.setAttribute('d', pathData[index]);
        nGenerationElement.innerHTML = metaData[index]["generation"];
        distanceElement.innerHTML = metaData[index]["distance"].toFixed(2)
        popFitnessElement.innerHTML = metaData[index]["population_fitness"].toFixed(4)
        index++;
      }
    }

    setInterval(updatePath, 200);
  </script>
  """)

row_2 = Kino.Layout.grid([best_path_html, generations_html], columns: 2)
Kino.Layout.grid([row_1, row_2], columns: 1)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment