Skip to content

Instantly share code, notes, and snippets.

@whyvez
Last active July 22, 2020 19:56
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save whyvez/83185c863da949bf3be1bd6805500e5d to your computer and use it in GitHub Desktop.
Save whyvez/83185c863da949bf3be1bd6805500e5d to your computer and use it in GitHub Desktop.
OSM Planet PBF to PostgreSQL loader on GCP

osm-loader

Small utility that loads OSM planet data into PosgresSQL.

Utility does the following:

  • Creates GCP instance with 4 SSD (1.5TB)
  • Bootstraps instance with dependencies (src/install.sh)
  • Copies src/load.sh to instance.

Usage:

$ ./osm-loader.sh
    Creating instance...
    Waiting for instance to become ready...
    Connection to 35.231.43.211 port 22 [tcp/ssh] succeeded!
    Instance is ready!
    Copying load.sh script to instance...
    load.sh                                                                                                                                                                                                                     100%  404     3.3KB/s   00:00

    Usage:
        SSH into instance:
        $ gcloud compute ssh osm-loader

        Check bootstrap progress:
        $ sudo journalctl -u google-startup-scripts.service

        Run load.sh in background...
        $ nohup nohup bash load.sh &

        Check load.sh progress...
        $ tail -f nohup.out

Cleanup:

$ gcloud compute instances delete osm-loader
#! /bin/bash
apt upgrade
apt update
apt install mdadm \
postgresql \
postgresql-contrib \
postgresql-10-postgis-2.4 \
postgresql-10-postgis-scripts \
osm2pgsql \
wget -y
echo 'export PATH="/usr/lib/postgresql/10/bin:$PATH"' >> ~/.bashrc
mdadm --create /dev/md0 --level=0 --raid-devices=4 \
/dev/nvme0n1 /dev/nvme0n2 /dev/nvme0n3 /dev/nvme0n4
mkfs.ext4 -F /dev/md0
mkdir -p /mnt/disks/ssd
mount /dev/md0 /mnt/disks/ssd
chmod a+w /mnt/disks/ssd
OLD_DATA_AREA="/var/lib/postgresql"
DATA_AREA=/mnt/disks/ssd
systemctl stop postgresql
rsync -av "$OLD_DATA_AREA" "$DATA_AREA"
mv "$OLD_DATA_AREA"/10/main "$OLD_DATA_AREA"/10/main.bak
cat << EOF >> "/etc/postgresql/10/main/postgresql.conf"
data_directory = '/mnt/disks/ssd/postgresql/10/main'
EOF
cat << EOF > "${DATA_AREA}/postgresql/10/main/osm2pgsql.conf"
shared_buffers = 8MB
maintenance_work_mem = 4096MB
work_mem = 1MB
fsync = off
autovacuum = off
checkpoint_segments = 60
random_page_cost = 1.1
EOF
cat << EOF >> "${DATA_AREA}/postgresql/10/main/postgresql.conf"
include osm2pgsql.conf
EOF
systemctl start postgresql
#!/usr/bin/env bash
export PGDATABASE="osm_planet"
wget -O /mnt/disks/ssd/planet-latest.osm.pbf https://ftp.osuosl.org/pub/openstreetmap/pbf/planet-latest.osm.pbf
sudo su - postgres -c "createuser -s -d -w $(whoami)"
createdb "$PGDATABASE"
psql -c 'CREATE EXTENSION postgis;'
osm2pgsql -c -d "$PGDATABASE" --slim -C 30000 --flat-nodes /mnt/disks/ssd/nodes.cache /mnt/disks/ssd/planet-latest.osm.pbf
#!/usr/bin/env bash
create_instance() {
gcloud compute instances create osm-loader \
--machine-type n1-standard-8 \
--image-project ubuntu-os-cloud \
--image-family ubuntu-1804-lts \
--local-ssd interface=nvme \
--local-ssd interface=nvme \
--local-ssd interface=nvme \
--local-ssd interface=nvme \
--zone=us-east1-b \
--metadata-from-file startup-script=src/install.sh
}
echo "Creating instance..."
create_instance
echo "Waiting for instance to become ready..."
IP=$(gcloud compute instances list | awk '/osm-loader/ {print $5}')
nc -w 1 -z "$IP" 22
echo "Instance is ready!"
echo "Copying load.sh script to instance..."
gcloud compute scp src/load.sh osm-loader:~/ --zone=us-east1-b
cat << 'EOM'
Usage:
SSH into instance:
$ gcloud compute ssh osm-loader
Check bootstrap progress:
$ sudo journalctl -u google-startup-scripts.service
Run load.sh in background...
$ nohup nohup bash load.sh &
Check load.sh progress...
$ tail -f nohup.out
EOM
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment