Meteorological agencies [1] need to cache time-enabled WMS services, but MapProxy currently disables caching when dimensions are set. This RFC proposes support for caching different dimension values, at the cost of requiring explicit configuration and limiting flexibility in the choice of cache backend and directory structure.
The Norwegian Meteorological Institute, through Trond Michelsen, has already created an implementation in mapproxy/mapproxy#377 and has been using it in production since last year.
Feedback from @olt in the PR recommended the following:
- Not make cache.dimensions a global option
- Verify handling of configurations with caches that have different dimensions (or no dimensions at all)
- Provide extensive tests and documentation
Ideally, the following cache and seed configuration should be enough to get a local file system based cache:
services:
  demo:
  wms:
    md:
      title: Meteo Example
caches:
  my_cache:
    sources: [my_source]
    grids: [GLOBAL_GEODETIC]
sources:
  my_source:
    type: wms
    req:
      url: http://example.org/geomet/?
      layers: ETA_TT
    forward_req_params: ['time', 'elevation']
layers:
  - name: ETA_TT
    title: ETA_TT - Global temperature
    sources: [my_cache]
    dimensions:
      time:
        values:
          - "2020-01-31T16:10:00:00Z"
          - "2020-01-31T16:11:00:00Z"
          - "2020-01-31T16:12:00:00Z"
          - "2020-01-31T16:13:00:00Z"
          - "2020-01-31T16:14:00:00Z"
          - "2020-01-31T16:15:00:00Z"
          - "2020-01-30T16:16:00.000Z"
        default: "2020-01-30T16:00:00.000Z"
      elevation:
        values:
          - "10"
          - "100"
          - "1000"
        default: "100"
seeds:
  myseed1:
    caches: [my_cache]
    grids: [GLOBAL_GEODETIC]
    dimensions:
      time: true
      elevation: true
    levels:
      from: 2
      to: 3
This effort will build on PR #377, mainly by providing testing tools and by improving unit tests and documentation.
A new key, derived from the dimension values, will be added to the tile cache directory structure. Below is an example of how it would work:
>>> dimensions_part(['reference-time', 'time'], {"time": "2016-11-24T18:00Z", "reference-time": "2016-11-24T00:00Z"})
'2016-11-24T00:00Z/2016-11-24T18:00Z'
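One way to read the example above is that the values are joined in the order given by the list of dimension names. The following is only a minimal sketch of that reading, not the code from PR #377; the handling of missing dimensions (raising an error) is an assumption made for illustration.

```python
def dimensions_part(dimension_names, values):
    """Build the cache path fragment for one tile from its dimension values.

    Sketch only: `dimension_names` fixes the order of the path components,
    `values` maps each dimension name to the requested value.
    """
    # Assumption: a request that lacks a configured dimension is an error.
    # A real implementation might fall back to the configured default.
    missing = [name for name in dimension_names if name not in values]
    if missing:
        raise ValueError("missing dimension values: %s" % ", ".join(missing))
    # One path component per dimension, in the configured order.
    return "/".join(str(values[name]) for name in dimension_names)
```

Keeping the order tied to the configured list, rather than to the order of the request parameters, means two requests with the same values always map to the same cache location.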
This can leave a single layer with a very large number of cache files (in particular when many dimensions are used) and can make seeding extremely time-consuming. This will be documented prominently, and runtime warnings can be added if needed when that number exceeds a threshold.
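To illustrate the growth, the sketch below enumerates every combination of dimension values and warns when the count passes a threshold. The function names and the default threshold of 1000 are hypothetical and chosen only for this example; they are not part of MapProxy.

```python
import itertools
import math
import warnings


def dimension_combinations(dimensions):
    """Yield one dict per combination of dimension values.

    `dimensions` maps a dimension name to its list of configured values.
    """
    names = sorted(dimensions)  # fixed order so the output is reproducible
    for combo in itertools.product(*(dimensions[name] for name in names)):
        yield dict(zip(names, combo))


def check_combination_count(dimensions, threshold=1000):
    """Return the number of combinations, warning if it exceeds `threshold`."""
    count = math.prod(len(values) for values in dimensions.values())
    if count > threshold:
        warnings.warn(
            "%d dimension combinations configured; every tile is cached "
            "once per combination" % count
        )
    return count
```

With the example configuration above (seven time values and three elevation values) this gives 21 combinations, so each seeded tile would be requested and stored 21 times.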
Any dimension will be supported. For the TIME dimension, values must follow one of these two formats:
- TIME=<timestamp>, e.g. TIME=2020-01-31
- TIME=<timestamp_start>/<timestamp_end>, e.g. TIME=2020-01-31/2020-02-01
The format follows the ISO 8601:1988(E) “extended” format. For the complete list of currently supported patterns, refer to MapServer's WMS Time support documentation: https://www.mapserver.org/ogc/wms_time.html
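The two TIME forms listed above could be told apart as in the sketch below. It uses only the Python standard library so that it stays self-contained, which is an assumption of this example and not necessarily the parser the implementation will use; the function names are hypothetical.

```python
from datetime import datetime


def parse_timestamp(text):
    """Parse one ISO 8601 extended-format timestamp."""
    # Older Python versions reject a trailing "Z" in fromisoformat(),
    # so rewrite it as an explicit UTC offset first.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


def parse_time_param(value):
    """Return (start, end) for a TIME value; end is None for a single timestamp."""
    parts = value.split("/")
    if len(parts) == 1:
        # TIME=<timestamp>
        return parse_timestamp(parts[0]), None
    if len(parts) == 2:
        # TIME=<timestamp_start>/<timestamp_end>
        return parse_timestamp(parts[0]), parse_timestamp(parts[1])
    # Anything else is outside the two formats this sketch covers.
    raise ValueError("unsupported TIME value: %r" % value)
```

Both helpers raise `ValueError` on malformed input, so a caller can reject the request before it reaches the cache.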
In MapProxy, a conditional will be included, using dateutil.parser.parse, that will read the
- mapproxy/cache/*.py
- mapproxy/config/loader.py
- mapproxy/service/templates/demo/wms_demo.html (optional)
- mapproxy/service/wms.py
- mapproxy/request/base.py
- mapproxy/seed/*
- The seed tool will iterate over every dimension set in the configuration and over all of its potential values, but it will not accept a smaller subset as a command line option.
- Integration tests will be run against MapServer's implementation (as that is the one we have access to).
- Unit tests will be created to cover the newly added code paths.
- Documentation will be created for the new options in cache and seed configuration.
- Since caching time requests can introduce subtle bugs, the implementation will be tested against access logs from a high-traffic production workload. The images coming from the cache and from the original server will be compared for differences using each image's histogram and a parameterized tolerance level. A pre-flight utility will be created that compares results from the original server and the local MapProxy instance, to help test for edge conditions and potential configuration problems. This tool can live outside MapProxy, but it will be linked in the documentation to show how to run the test against public dimension-enabled WMS servers. If potential users have a way to verify that caching works against their usage patterns, they can alert the MapProxy project before potential errors reach production.
- The idea is to create one configuration that works and submit it as a PR. It is therefore possible that only one directory layout and backend will come out of this effort. An incompatible configuration will present the user with an error and advise disabling caching by moving to the DummyCache, as is currently done. Once we have a working path, other contributors can step in to expand the feature set and remove the limitations.
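The histogram comparison described for the pre-flight utility could look roughly like the sketch below. The difference metric, the function names and the default tolerance are all assumptions made for illustration; the inputs are plain lists of per-bin pixel counts, such as those returned by Pillow's `Image.histogram()`.

```python
def histogram_difference(hist_a, hist_b):
    """Return a difference in [0, 1] between two image histograms.

    0 means the normalized histograms are identical, 1 means they do not
    overlap at all. This metric is one possible choice, not a fixed design.
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    total_a, total_b = sum(hist_a), sum(hist_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("histograms must not be empty")
    # Normalize each histogram to fractions so image size does not matter,
    # then take half the summed absolute difference per bin.
    return 0.5 * sum(
        abs(a / total_a - b / total_b) for a, b in zip(hist_a, hist_b)
    )


def images_match(hist_a, hist_b, tolerance=0.01):
    """True when the histograms differ by no more than `tolerance`."""
    return histogram_difference(hist_a, hist_b) <= tolerance
```

A histogram check is insensitive to where pixels are located, so it catches wrong or stale tiles that differ in content but would not detect two images with the same colors arranged differently.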
This document is a gist and will evolve based on feedback from the mailing list and the pull request.
[1] In particular, the Norwegian Meteorological Institute and the Meteorological Service of Canada.