OSRM North America route server on EC2

Overview

The OSRM route server is an extremely useful tool for getting the driving distance/time through multiple locations. The route server requires data that has to be downloaded and processed before it can serve routes.

Processing OSRM data for a large region like North America can be a real challenge due to the memory and disk size requirements. It's also very time consuming: if you cut and try from scratch, you will repeatedly run into resource limits and fail after hours of processing.

The following are summary notes from trying this, with eventual success.

Docker on AWS EC2

Since most people don't have a machine with a huge amount of memory sitting around, doing this on AWS EC2 is a natural choice. Using the Docker image as shown below is the easiest approach, but on AWS you need an instance with at least 64GB of memory (I used m4.4xlarge). Even with 64GB, things get tight, so you should also set up some swap space.

Building tools and processing

Alternatively, you can build osrm-backend and run it natively (i.e. not in Docker). Unfortunately, this is even more of a hassle on EC2 because OSRM requires many tools and libraries to be installed, including a newer version of GCC than the default.
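
For reference, a minimal sketch of the native build, assuming a recent GCC, CMake, and the Boost/Lua/TBB dependencies are already installed (package names vary by distro):

# Native build sketch -- installing the dependencies is the hard part and is omitted here
git clone https://github.com/Project-OSRM/osrm-backend.git
cd osrm-backend
mkdir -p build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
cmake --build .
sudo cmake --build . --target install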

In comparison, building osrm-backend and processing the OSRM route data was very straightforward on a MacPro (6 cores + 32GB memory). You get the latest toolchain, and macOS manages memory demand very well. But then again, you can't run a MacPro in an AWS VPC.

If you get a bad_alloc error, the Docker container blows up, or your machine starts to swap heavily and goes non-responsive... you just don't have enough memory. You are better off overallocating from the start, because peak memory demand happens well into the process.
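
A simple way to keep an eye on memory and swap while processing runs (plain procps, nothing OSRM-specific):

# Refresh memory and swap usage every 10 seconds in a second terminal
watch -n 10 free -h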

Disk space needed

The North America osm.pbf is ~9GB. The output data adds another ~47GB, for a total of ~56GB just to house the data. Then you need ~10GB for the OS and a few more GB for the swap file. Overallocating here is also a good idea, because you run out of disk space just when you think you are about done.
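
A quick sanity check before kicking off processing, to confirm the volume actually has that much headroom:

# Show free space on the volume holding the OSM data
df -h .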

Steps to process on EC2

Set up Docker

# (stand up m4.4xlarge instance first)
sudo yum update -y
sudo yum install docker -y
sudo service docker start
sudo usermod -a -G docker ec2-user

Log out, then log back in so the docker group membership takes effect.

Make sure Docker is up and memory is available:

docker info

Add swap space

# Add 10GB swap space 
sudo /bin/dd if=/dev/zero of=/var/swapfile bs=1M count=10240
sudo chmod 600 /var/swapfile      # restrict permissions before enabling swap
sudo /sbin/mkswap /var/swapfile
sudo /sbin/swapon /var/swapfile
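
Note that swapon alone does not survive a reboot; if the instance might restart mid-run, a standard /etc/fstab entry keeps the swap file active:

# Make the swap file persistent across reboots
echo '/var/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab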

Fetch and Process North America OSM data

# Fetch data. This is ~9GB. Appreciate Geofabrik for making it this easy.
wget http://download.geofabrik.de/north-america-latest.osm.pbf

# Process with Docker. partition + customize builds the MLD (multi-level Dijkstra)
# dataset, which matches the --algorithm mld flag used to serve it below.
docker run -t -v $(pwd):/data osrm/osrm-backend osrm-extract -p /opt/car.lua /data/north-america-latest.osm.pbf
docker run -t -v $(pwd):/data osrm/osrm-backend osrm-partition /data/north-america-latest.osrm
docker run -t -v $(pwd):/data osrm/osrm-backend osrm-customize /data/north-america-latest.osrm

Running OSRM routing server

The OSRM routing server uses about 24GB of memory when serving North American routes.

docker run -t -i -p 5000:5000 -v $(pwd):/data osrm/osrm-backend osrm-routed --algorithm mld /data/north-america-latest.osrm
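
Once it's up, a quick smoke test against the HTTP route API (coordinates are lon,lat; the pair here is just an arbitrary example, roughly San Francisco to Los Angeles):

# Request a driving route between two points
curl 'http://localhost:5000/route/v1/driving/-122.42,37.77;-118.24,34.05?overview=false'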

Routing Performance

A simple benchmark, fetching 100 or so trip routes:

  • EC2 m4.4xlarge instance (16 vCPU / 64GB) - 11m25.716s
  • MacPro (6 cores / 32GB) - 4m1.637s
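
The exact benchmark harness isn't shown; a minimal sketch of timing a batch of requests, assuming a hypothetical coords.txt with one "lon,lat;lon,lat" pair per line, might look like:

# Time one route request per coordinate pair in coords.txt
time while read pair; do
  curl -s "http://localhost:5000/route/v1/driving/${pair}?overview=false" > /dev/null
done < coords.txt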