@tmcw
Created June 5, 2014 13:51

Ah, GeoPackage. Comments on the current spec.

Base

  • Hooray, C extensions were removed. Hope that idea is gone for good.
  • Spatial Reference Systems: Ah yes. This is a broken system, based on hoping that no new SRSs ever get invented after now, and in the meantime sticking it all together with a big CSV file that nobody wants to think about. Anyway, this is a bad idea but a ubiquitous bad idea, so I won't blame that on GPKG.
  • 1.1.3.1.1. Table Definition: Exposes the problem of projecting everything: if you don't have proj4 handy, you won't even be able to get the extent of a data file. This radically de-simplifies simple metadata.

Core Geometry Model

Okay, so maybe we aren't discussing this, but: why? Like, SQLite has its own geospatial geometry encoding, Spatialite has a needlessly incompatible version of that, and now it's time to invent a third?

  • Core Geometry Model includes unusual primitives like MultiCurve

A 1-dimensional geometry is a geometry that has a length, but no area.

That sounds like a 2D line.

  • To implement this new geometry encoding, you'll need ISO/IEC 13249-3:2011. Cost of specification just increased to $225 USD
  • Geometry supports Z values (yay) but doesn't specify what they mean: what are the units?

Tiles

Unfortunately, no applicable existing consensus, national or international specifications have standardized practices in this domain.

Except for the custom used by every major web map service. I wish these intros weren't so flip about prior art: saying "no consensus, national, or international specifications" is kind of a cop-out: most useful specs aren't.

  • 2.2.3. Zoom Levels: Looks like the zooms-that-aren't-powers-of-two-or-even-constant-between-numbers thing survived the draft process. This will be implemented by <1%.
  • 2.4.2. Metadata Table: That is a lot of standards and a lot of XML.
  • 3.1.3. RTree Spatial Indexes: because, from above, we aren't using SQLite's built in geometry encoding, so we can't use SQLite's built in R*Tree module
@pepijnve

pepijnve commented Jun 5, 2014

Reading between the lines, it looks like what you would have preferred is a least-common-denominator approach that requires all data to be in WGS 84 for vector data and Web Mercator for imagery. That's fine for certain types of maps, but other user communities have different needs. Anyway, on to the specifics.

Base

C extensions

You still need those for certain usages (e.g., when non-standard SQL functions are used in triggers). That's inherent to SQLite; not something specific to geopackage. See SQLite application-defined functions for details.

SRS

You can put any SRS in here as long as you can provide its definition as WKT. It doesn't have to be an EPSG code. (id: 1, auth: luciad, code:42, wkt:...) would be fine as well. Providing the SRS definition is one of the reasons this table exists.
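A minimal sketch of what registering such a non-EPSG entry could look like, using Python's built-in `sqlite3` module. The column layout is `gpkg_spatial_ref_sys` as I understand it from the spec; the SRS name and the WKT string are placeholders (the actual definition is elided above and is not filled in here), and `srs_lookup` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column layout of gpkg_spatial_ref_sys as I understand it from the spec.
conn.execute("""
    CREATE TABLE gpkg_spatial_ref_sys (
        srs_name TEXT NOT NULL,
        srs_id INTEGER NOT NULL PRIMARY KEY,
        organization TEXT NOT NULL,
        organization_coordsys_id INTEGER NOT NULL,
        definition TEXT NOT NULL,
        description TEXT
    )
""")

# Placeholder only: a real file would carry the full WKT definition here.
PLACEHOLDER_WKT = "<WKT definition of the custom SRS goes here>"

# The (id: 1, auth: luciad, code: 42) entry from the comment above.
conn.execute(
    "INSERT INTO gpkg_spatial_ref_sys VALUES (?, ?, ?, ?, ?, ?)",
    ("custom srs (hypothetical name)", 1, "luciad", 42, PLACEHOLDER_WKT, None),
)


def srs_lookup(conn, srs_id):
    """Return (organization, code, definition) for srs_id, or None if unknown."""
    return conn.execute(
        "SELECT organization, organization_coordsys_id, definition "
        "FROM gpkg_spatial_ref_sys WHERE srs_id = ?",
        (srs_id,),
    ).fetchone()
```

A reader that cannot interpret the WKT can still see which authority and code an `srs_id` refers to; a reader that can interpret it does not need an external registry at all.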

Bounding box

I guess this could have been restricted to some fixed SRS (e.g. EPSG:4326). That would have made it easier for clients that can't interpret WKT to display at least a bounding box.

Geometry Model

Geometry Encoding

SQLite does not have a geometry encoding. Not sure where you're getting that from. Spatialite encodes geometry as <header><custom wkb dialect><footer>. Geopackage ended up going for <header><iso/ogc wkb> so that at least the bulk of the geometry blob is compatible. The header is needed to enable efficient implementation of certain SQL functions like st_srid, st_minx, etc.
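To make the `<header><iso/ogc wkb>` layout concrete, here is a rough sketch of splitting a blob into its header fields and the trailing WKB. The byte layout (a `GP` magic, a version byte, a flags byte, a 4-byte SRS id, then an optional envelope) is my reading of the spec, so treat the offsets and the envelope size table as assumptions to verify; `split_gpkg_blob` is a hypothetical name.

```python
import struct

# Envelope size in bytes, keyed by the envelope indicator stored in the
# flags byte (assumed: 0 = none, 1 = xy, 2 = xyz, 3 = xym, 4 = xyzm).
ENVELOPE_SIZES = {0: 0, 1: 32, 2: 48, 3: 48, 4: 64}


def split_gpkg_blob(blob):
    """Return (srs_id, envelope, wkb) for a GeoPackage-style geometry blob."""
    if blob[:2] != b"GP":
        raise ValueError("not a GeoPackage geometry blob")
    flags = blob[3]  # blob[2] is the version byte, ignored here
    byte_order = "<" if flags & 0x01 else ">"  # bit 0: header byte order
    envelope_code = (flags >> 1) & 0x07        # bits 1-3: envelope indicator
    if envelope_code not in ENVELOPE_SIZES:
        raise ValueError("invalid envelope indicator")
    (srs_id,) = struct.unpack_from(byte_order + "i", blob, 4)
    env_size = ENVELOPE_SIZES[envelope_code]
    # The envelope is a run of doubles: minx, maxx, miny, maxy, then z and/or m.
    envelope = struct.unpack_from(byte_order + "%dd" % (env_size // 8), blob, 8)
    wkb = blob[8 + env_size:]  # everything after the header is plain WKB
    return srs_id, envelope, wkb


# Build a sample blob by hand: little-endian header, xy envelope, WKB point (1, 2).
point_wkb = struct.pack("<BIdd", 1, 1, 1.0, 2.0)
sample = (
    b"GP"
    + bytes([0, 0b00000011])
    + struct.pack("<i", 4326)
    + struct.pack("<4d", 1.0, 1.0, 2.0, 2.0)
    + point_wkb
)
srs_id, envelope, wkb = split_gpkg_blob(sample)
```

The returned `wkb` can be handed to any existing WKB parser unchanged, and functions like st_srid or st_minx only need the first few header bytes rather than a full geometry parse.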

Dimensions

That text is mostly taken from other ISO specs. It's talking about the topological dimensions of the geometry, not the coordinate dimensions. I agree that this is probably not the most accessible way to describe these things. It's important when defining spatial predicates. See DE-9IM for instance.

Units

The same applies to X and Y. What are the units of those in a projected coordinate system? Meters, feet, cm? This is specified in the SRS definition. If you have Z coordinates then you should be using an SRS that defines that as well. EPSG:4979 for instance specifies three axes and that the height axis has unit meter.

ISO spec cost

You're right that it's unfortunate that we defer to a for-pay spec for something critical. I'm not sure if it would be allowed to inline the WKB and WKT definitions as appendixes to the geopackage spec copyright wise. I'll try to find out.

Tiles

No consensus etc

That is rather superfluous indeed. GeoPackage does the common '0,0 is upper left' but that detail is kind of buried in the text. I'll pass this on to the spec editor.

Non factor two scaling

Yes, they're still in there, but the default is factor two. There are legitimate use cases for these. If you're combining existing maps that were designed for a certain scale level, then these don't always fit nicely into a factor two tile matrix. Resampling the data is undesirable, so the scale levels become variable.
It might be the case that it's implemented by less than 1%, but it's something our users ask for and need. In other words, I can understand that from the mapbox perspective this seems silly but there is actually demand for things like this. Because it's more of a specialist use case feature it's been made an extension so that implementers can choose to not support it. You check gpkg_extensions and if it says 'gpkg_zoom_other' applies to the tiles table then you tell your user you don't support this kind of data. The fact that simpler clients exist shouldn't preclude more advanced usages imo.
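A sketch of that check, assuming the `gpkg_extensions` column layout as I understand it from the spec. The tiles table name `odd_tiles`, the definition string, and the helper `uses_extension` are made up for illustration.

```python
import sqlite3


def uses_extension(conn, table_name, extension_name):
    """Return True if gpkg_extensions lists extension_name for table_name."""
    # gpkg_extensions may be absent when a file uses no extensions at all.
    has_table = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = 'gpkg_extensions'"
    ).fetchone()
    if has_table is None:
        return False
    row = conn.execute(
        "SELECT 1 FROM gpkg_extensions "
        "WHERE table_name = ? AND extension_name = ? LIMIT 1",
        (table_name, extension_name),
    ).fetchone()
    return row is not None


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE gpkg_extensions (
        table_name TEXT,
        column_name TEXT,
        extension_name TEXT NOT NULL,
        definition TEXT NOT NULL,
        scope TEXT NOT NULL
    )
""")
# Hypothetical tiles table that declares non-factor-two zoom levels.
conn.execute(
    "INSERT INTO gpkg_extensions VALUES (?, ?, ?, ?, ?)",
    ("odd_tiles", None, "gpkg_zoom_other", "placeholder definition", "read-write"),
)

if uses_extension(conn, "odd_tiles", "gpkg_zoom_other"):
    print("odd_tiles uses non-factor-two zoom levels; not supported by this client")
```

A simple client can run this once per tiles table and refuse the table up front instead of rendering it incorrectly.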

Metadata

Not sure what your point is here. For better or worse ISO 19115, the most common metadata model in use as far as I know, has a single standardized encoding which is XML-based (ISO 19139). Nobody's saying you have to use this, nor that the metadata must be XML. We're just pointing to existing metadata standards that people might want to use. If you want to come up with a JSON encoding of ISO 19115 metadata you're free to do so and put it in the metadata table. You just won't be able to expect anyone else to be able to interpret this correctly.

RTree

We are using the standard SQLite rtree extension. The spec even says so and points to the SQLite documentation. RTrees in SQLite are not automatically updated indices; you need to either update them manually on every insert, update, or delete on the data table, or add triggers to the data table that do this for you.
The geopackage specification is prescribing how the rtree extension should be used in order to guarantee interoperability between applications. This consists of naming conventions for the rtree index tables and triggers and the definitions of the triggers. Without those you would have no reliable way of finding the rtree index or checking if the triggers are correct.
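A simplified sketch of the trigger approach, assuming an SQLite build with the R*Tree module enabled. The spec's real triggers compute the bounds from the geometry blob through SQL functions such as st_minx; to stay self-contained, this version keeps the bounds in plain columns on a hypothetical `features` table. The `rtree_<table>_<column>` naming follows the convention as I recall it from the spec.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data table; the bounds live in plain columns here.
    CREATE TABLE features (
        id INTEGER PRIMARY KEY,
        geom BLOB,
        minx REAL, maxx REAL, miny REAL, maxy REAL
    );

    -- The index is a separate virtual table that SQLite does not maintain for us.
    CREATE VIRTUAL TABLE rtree_features_geom USING rtree(id, minx, maxx, miny, maxy);

    CREATE TRIGGER rtree_features_geom_insert AFTER INSERT ON features
    BEGIN
        INSERT INTO rtree_features_geom
        VALUES (NEW.id, NEW.minx, NEW.maxx, NEW.miny, NEW.maxy);
    END;

    CREATE TRIGGER rtree_features_geom_update AFTER UPDATE ON features
    BEGIN
        DELETE FROM rtree_features_geom WHERE id = OLD.id;
        INSERT INTO rtree_features_geom
        VALUES (NEW.id, NEW.minx, NEW.maxx, NEW.miny, NEW.maxy);
    END;

    CREATE TRIGGER rtree_features_geom_delete AFTER DELETE ON features
    BEGIN
        DELETE FROM rtree_features_geom WHERE id = OLD.id;
    END;
""")


def ids_intersecting(conn, minx, maxx, miny, maxy):
    """Return ids whose indexed bounding box intersects the query box."""
    rows = conn.execute(
        "SELECT id FROM rtree_features_geom "
        "WHERE maxx >= ? AND minx <= ? AND maxy >= ? AND miny <= ? ORDER BY id",
        (minx, maxx, miny, maxy),
    )
    return [row[0] for row in rows]


# Writes go to the data table only; the triggers keep the index in step.
conn.execute(
    "INSERT INTO features (id, minx, maxx, miny, maxy) VALUES (1, 0, 10, 0, 10)"
)
conn.execute(
    "INSERT INTO features (id, minx, maxx, miny, maxy) VALUES (2, 20, 30, 20, 30)"
)
```

Because the index table and trigger names are predictable, another application opening the same file can find the index, query it, and check that the triggers are in place before trusting it.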

@tmcw

tmcw commented Jun 5, 2014

Reading between the lines it looks like what you would have preferred is a least-common-denominator approach that requires all data to be in wgs-84 for vector data and web mercator for imagery.

I'd say that's an oversimplistic reading.

Basically, my problem is that GeoPackage is primarily a container format, like MBTiles. So, it contains things: and in this case, they are vector and raster features. You have two choices: you can either contain existing things, like MBTiles contains XYZ tiles, or you can invent new things. GeoPackage did the latter - it baked up a new way to encode geometries, and a new tile spec - and I don't think anyone spent nearly enough time thinking about these things. Instead of solving one problem well, it's creating two new problems and solving three problems poorly.

If there was already a great way to encode geometries (one that didn't cost two hundred bucks to access, didn't have multiple incompatible variants, was compact, and had all the wonderful bells and whistles), and if there was already a tile system that did projections right and also somehow supported non-power-of-two zoom levels, then sure: I would support a container format that wrapped those well-thought-out, battle-tested strategies.

SQLite does not have a geometry encoding. Not sure where you're getting that from.

See the GDAL ogr driver, and mapnik's support, and also the part in the private mailing list where I demonstrated the two.


For those at home who don't know the backstory, the stuff about non-power-of-two zoom levels might seem weird. Why such focus? Why that feature? To clarify: the main sponsors of GeoPackage are military, and non-power-of-two zoom levels are useful mainly for data "coming off the bird" in non-multilevel form.

This mismatch between the funders' intent and the public face of a standard is one of the things that bothers me: why should a general format have vestigial features of a military use case?

@pepijnve

pepijnve commented Jun 5, 2014

Fair enough on the over simplistic bit :)

The intention of the things in GeoPackage was actually to not invent anything new (to the extent possible). Features is reusing the work from SQL/MM and SF/SQL. Tiles is reusing the work from WMTS. You can argue whether those are good starting points or not of course, but that's the rationale behind it.

I fully agree that WKB is not the greatest geometry encoding imaginable, but it's the only thing out there that's already standardized. We could have opted to come up with something better but decided not to. The header bit was made to be a prefix for OGC WKB. Just to be clear, that's the same 'raw OGC WKB' that the mapnik docs mention and which OGR supports. SQL/MM is actually the place where this is normatively specified. The idea behind using WKB unmodified is that you can skip over the header and use an existing WKB parser to read the geometry. That's exactly what Justin Deoliveira did in geotools for instance. I did the same in our codebase fwiw. I would expect Paul Ramsey did the same for the OGR geopackage driver.
The other alternatives out there in the context of SQLite are FDO RFC16 and spatialite. Neither is defined by SQLite itself. They use the same strategy of encoding a geometry as a blob in SQLite as geopackage does. We ended up not using spatialite's encoding since it modifies certain bytes of standard WKB (without any tangible benefit imo). The FDO RFC16 is something completely custom that isn't in widespread use as far as I know.

There are non-military use cases for non-power-two tilesets btw. See http://www.ngi.be/NL/NL1-19-1.shtm for instance. We wanted to be able to use geopackage to cache whatever you can serve via wmts. The only thing that was dropped was variable top-left corner; partially based on a comment from Even Rouault.
I don't think the 'coming off the bird' use case is actually feasible anymore. Putting a single 20kx20k image as a single tile in a tile matrix is a bad idea and definitely not the intended use case anymore. I think (hope) we reached consensus on this quite some time ago. This was one of the arguments to drop NITF from the spec as well. It just doesn't make sense to have something with the complexity of NITF in a format that's trying to cache RGB(A) imagery tiles.

@isc-rsingh

@pepijnve in your first comment it looks like some words under the "Geometry Encoding" header got lost.

@pepijnve

pepijnve commented Jun 5, 2014

@rajsingh fixed. markdown didn't like the angle brackets.
