A little CLI workflow for visualizing OSM tile traffic data

This is how to make animations like this (alternate), inspired by Paul Norman’s. This is a write-up of a one-off 45-minute project, so it’s rough around the edges and probably has a few typos; feel free to point them out. It’s mostly command-line work, using tools like GNU parallel and ImageMagick convert; it’s slow and wastes a lot of filesystem space compared to a more monolithic approach in (say) Python, but it’s very flexible.

1. Get data

I use curl globs, for example:

mkdir xzs
cd xzs
curl -O 'https://planet.openstreetmap.org/tile_logs/tiles-2022-[01-12]-[01-31].txt.xz'
cd ..
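(Optionally, you can sanity-check the downloads now: some of those URLs will 404, and curl will happily save the HTML error page under the .xz name. Since xz -t rejects anything that isn’t a valid .xz stream, it flags the bad files immediately – though we’ll also catch them later, see Calendar trouble below.)

xz -t xzs/*.xz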

2. Draw daily images

Please see this gist for my drawing script. The important thing here is that it will read from stdin and write raw sums to a named TIFF (although you might prefer to write something that works completely differently, of course).
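If you don’t want to chase that link, here is a minimal sketch of a script with the same interface. It assumes each log line looks like z/x/y count and settles on a zoom-10 (1024×1024) output grid – both are assumptions for illustration, not claims about the real script:

# draw_osm_day.py (a minimal sketch, not the real script).
# Reads "z/x/y count" lines from stdin ("-") or a named file,
# sums counts onto a zoom-10 grid, writes a float32 TIFF.
import sys
import numpy as np
import rasterio

ZOOM = 10  # output grid zoom (assumed); 2**10 = 1024 px per side
SIDE = 2 ** ZOOM

def main(inpath, outpath):
    img = np.zeros((SIDE, SIDE), dtype=np.float32)
    src = sys.stdin if inpath == "-" else open(inpath)
    for line in src:
        try:
            zxy, count = line.split()
            z, x, y = (int(n) for n in zxy.split("/"))
        except ValueError:
            continue  # skip malformed lines
        if z < ZOOM:
            continue  # low-zoom tiles span many pixels; punt for simplicity
        # collapse higher-zoom tiles onto the zoom-10 grid
        img[y >> (z - ZOOM), x >> (z - ZOOM)] += float(count)
    with rasterio.open(outpath, "w", driver="GTiff", height=SIDE,
                       width=SIDE, count=1, dtype="float32") as dst:
        dst.write(img, 1)

if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2])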

We start by making a directory to hold the images. Do this on a fast drive with a reasonable amount of space – I buy low-end external SSDs for this kind of scratch work.

mkdir images

Now we draw the images. This is the most time-consuming part of the process. A word on parallel: it’s an outstanding tool for replacing shell loops, and not nearly as confusing as it looks at first. But it does have a learning curve, and I’m not going to try to teach it here. I’ll just recommend it if it seems like it might be useful to you.

parallel -j-1 'xzcat {} | python draw_osm_day.py - images/{/.}.tiff' ::: xzs/*

(When run, this will give errors for a few files; we’ll fix these later – see Calendar trouble below.)

The -j-1 flag to parallel says to use one fewer job slot than the number of cores available. The ::: at the end says that we’re using xzs/* (expanded by the shell and therefore globbed in the usual way) as inputs to the template string. The string will expand to a command that runs xzcat to produce a plaintext log, which it pipes through the drawing script, asking for an output file whose name is the input file’s stripped of its enclosing directory (the / in {/.}) and its final suffix (the .). So for an input named xzs/day1.xz, this creates an output called images/day1.tiff.
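Concretely, for the first of the real files from step 1, the templated job is:

xzcat xzs/tiles-2022-01-01.txt.xz | python draw_osm_day.py - images/tiles-2022-01-01.txt.tiff

(Note that {/.} only strips the final .xz, so the outputs keep a .txt in their names; we’ll tidy that up in step 4.)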

(N.b., my first try at this decoded the xz files directly in python, but that was slow even when following the project’s example usage. Perhaps it’s because I was keeping the files on a network drive that made seeks slow, but I didn’t investigate. In any case, switching to xzcat was a 25× speedup.)

3. Color images

Now we have a lot of single-band float32 images with absolute values, but we want a lot of RGB images colored in a pleasantly interpretable way. I started by looking at typical values of the images (with rio info --tell-me-more) and saw that they range up to the tens of millions. I used the Inferno color ramp from here, plus hard black and white as endpoints and a scaling by factors of 8. The factors of 8 are completely empirical, meaning I tried a couple things (factors of 2 and 10) and 8 was the first one that looked okay. A table of value, R, G, B looks like this:

20971520 255 255 255
2621440 252 255 164
327680 250 193 39
40960 245 125 21
5120 212 72 66
640 159 42 99
80 101 21 110
10 40 11 84
0 0 0 0

We put those in a file called ramp.txt.

Incidentally, we start at 10 because the public data dumps seem to filter out tiles with fewer than 10 views in the day (pretty reasonably, I imagine, for space and privacy reasons).
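For what it’s worth, a ramp file with this shape is easy to regenerate programmatically. Here’s a sketch using matplotlib’s inferno colormap – an assumed stand-in for the sampled ramp above, so the numbers will be close but not identical:

from matplotlib import colormaps

inferno = colormaps["inferno"]
with open("ramp.txt", "w") as f:
    f.write(f"{10 * 8**7} 255 255 255\n")  # hard white endpoint
    for k in range(6, -1, -1):  # stops at 10 * 8**k, from 2621440 down to 10
        r, g, b, _ = inferno((k + 1) / 7, bytes=True)
        f.write(f"{10 * 8**k} {r} {g} {b}\n")
    f.write("0 0 0 0\n")  # hard black at zero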

Now we use parallel to template a loop again, and gdaldem (which is very fast) to actually apply the color ramp:

mkdir coloredin
parallel gdaldem color-relief -co compress=lzw {} ramp.txt coloredin/{/} ::: images/*
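Here {/} keeps the whole basename; the first job, for example, comes out as:

gdaldem color-relief -co compress=lzw images/tiles-2022-01-01.txt.tiff ramp.txt coloredin/tiles-2022-01-01.txt.tiff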

Calendar trouble

Why add compression? Because for days that 404, like 2022-02-31, curl saved an HTML file containing the error text. The first time I did one of these animations, those ended up as black frames. Very embarrassing. With compression, you can sort the coloredin folder by size and simply delete everything that’s obviously empty and contains only format headers and such.
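For example (the 10k cutoff here is a guess – eyeball your own file sizes first):

ls -lS coloredin/ | tail
find coloredin/ -name '*.tiff' -size -10k -delete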

4. Add captions

Now we want to draw a little date in the corner of each image so we can figure out when events are happening. We’re also going to clean up the filenames, which right now all have the prefix tiles- and the suffix .txt.tiff when all we want is .tiff (not that it particularly matters). And we’ll convert to PNG, which is more widely portable than TIFF.

mkdir pngs
parallel convert coloredin/tiles-{}.txt.tiff -font "B612-Regular" -pointsize 20 -fill white -gravity southwest -annotate +5+3 {} pngs/{}.png ::: 2022-{01..12}-{01..31}

At the end of the parallel command, after the :::, brace expansion creates every valid date in the range we care about, plus some invalid ones that we know will fail. Now, simplifying, the templated command looks like this:

convert input.tiff -drawing-stuff ... output.png

The inputs are selected by, and the outputs named by, those date strings. Naturally you will want to make sure that the drawing options use a font you actually have, place the caption somewhere you like, and so on.
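For 2022-03-14, say, the full command comes out as:

convert coloredin/tiles-2022-03-14.txt.tiff -font "B612-Regular" -pointsize 20 -fill white -gravity southwest -annotate +5+3 2022-03-14 pngs/2022-03-14.png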

5. Make the movie

Now we have a lot of colored-in images in pngs that sort correctly. Here’s the ffmpeg command I used:

ffmpeg -r 15 -f image2 -pattern_type glob -i "pngs/*.png" -vcodec libx264 -tune grain -crf 15 -pix_fmt yuv420p -movflags +faststart osm.mp4

Those flags say to

  • make the movie 15 fps,
  • use the image i/o module, applying a shell-style glob to get the images in the pngs directory as ordered frames,
  • encode with libx264, which has a reasonable mix of quality and support,
  • tune the codec to perform well on grainy images (intended for film grain, but presumably good for the flickering details here),
  • use a quality factor of 15, which seems to be reasonable (lower is better),
  • do some chroma subsampling, since the color ramp should be interpretable on luma only, and
  • add a format option that I vaguely remember increases compatibility.

I am not one of the 5 humans who understand ffmpeg best practices for web videos at any one time, so don’t take this too seriously.


— 🦈 Fin 🦈 —
