@zeux

zeux/codec-versioning.md

Last active Jan 27, 2020
Writing down various options to extend meshoptimizer encoding API with format versioning

Problem

meshoptimizer currently ships with index and vertex codecs with the API along these lines:

size_t encode(destination, destination_capacity, source, source_size)
void decode(destination, destination_size, source, source_size)

The next version of meshoptimizer needs to make backwards incompatible changes to the encoding format to allow for better compression. The format has a version byte, so the decode signature doesn't need to change. However, how would encode know which version to produce?

It's tempting to always encode the new format. However, this can lead to issues with version upgrades: the meshoptimizer upgrade has to happen first for the code that runs decoding (so that it supports the new format), and only then for the code that runs encoding (so that it starts using the new format). When the two pieces of code have different release cadences, it becomes easy to accidentally encode the new format before all clients can decode it.
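
The failure mode can be sketched with a toy format: one version byte followed by the raw payload. Everything here is an assumption made for illustration, not meshoptimizer's actual layout or API: the names `decodeOld` and `encodeNew` are hypothetical, and the decoder returns an error code where the signature above returns void.

```cpp
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical "old" decoder that only understands format version 0.
// Returns 0 on success, -1 on an unsupported version, -2 on a size mismatch.
int decodeOld(unsigned char* destination, size_t destination_size,
              const unsigned char* source, size_t source_size)
{
    assert(destination && source);
    if (source_size < 1 || source[0] != 0)
        return -1; // data written by a newer encoder ends up here
    if (source_size - 1 != destination_size)
        return -2;
    memcpy(destination, source + 1, destination_size);
    return 0;
}

// Hypothetical "new" encoder that always writes the latest version (1 here).
// Returns the number of bytes written, or 0 if the destination is too small.
size_t encodeNew(unsigned char* destination, size_t destination_capacity,
                 const unsigned char* source, size_t source_size)
{
    assert(destination && source);
    if (destination_capacity < source_size + 1)
        return 0;
    destination[0] = 1; // version byte
    memcpy(destination + 1, source, source_size);
    return source_size + 1;
}
```

If the encoding side upgrades first, every buffer produced by `encodeNew` is rejected by clients that still ship `decodeOld`.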

Solution 1: explicit encoding version

One solution is to change encode signature to force the user to pick the version:

size_t encode(destination, destination_capacity, source, source_size, int version)

This sidesteps the deployment issues since the calling code is now in control. However, this makes the API less intuitive - assuming the version changes very rarely, this requires all users to use an extra parameter.

What should this parameter be set to? If we want to avoid users having to know which version is the latest, we need some sort of MESHOPTIMIZER_INDEX_CODEC_LATEST_VERSION macro. However, this again means that during source upgrades the code will by default silently adopt the new format - and the cost of this is having each user deal with an extra argument.

This could be slightly prettier with an enum argument, although the names of the enum variants would probably be "IndexCodec2" et al.
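
A minimal sketch of the enum variant, assuming the same toy "version byte plus raw payload" format. The enum and its variant names are hypothetical and only follow the naming guessed at above:

```cpp
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical enum; not part of meshoptimizer.h.
enum IndexCodecVersion
{
    IndexCodec0 = 0,
    IndexCodec1 = 1
};

// Toy encoder: writes the requested version byte, then the payload unchanged.
// Returns the number of bytes written, or 0 if the destination is too small.
size_t encode(unsigned char* destination, size_t destination_capacity,
              const unsigned char* source, size_t source_size,
              IndexCodecVersion version)
{
    assert(version == IndexCodec0 || version == IndexCodec1);
    if (destination_capacity < source_size + 1)
        return 0;
    destination[0] = (unsigned char)version;
    memcpy(destination + 1, source, source_size);
    return source_size + 1;
}
```

Every call site now spells out the format, e.g. `encode(dst, cap, src, size, IndexCodec0)`, so a library upgrade alone cannot change the output.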

Solution 2: separate function for version-aware encoding

We can keep the use of the encoding functions simple if we introduce two variants:

size_t encode(destination, destination_capacity, source, source_size)
size_t encodeVersion(destination, destination_capacity, source, source_size, int version)

Where encode uses the latest version, but encodeVersion uses the specified version. That way users who care about stability of the encoded format can use encodeVersion, and users who don't can automatically gain benefits of the new format when it ships. However, if an application already used encode there's no way for it to know that it might use a format that isn't supported by old decoders...
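
As a rough sketch under the same toy-format assumption, `encode` can be a thin wrapper over `encodeVersion`. The constant `kLatestVersion` is hypothetical:

```cpp
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical constant; it gets bumped whenever the library ships a new format.
static const int kLatestVersion = 1;

// Version-aware entry point: the caller pins the format explicitly.
size_t encodeVersion(unsigned char* destination, size_t destination_capacity,
                     const unsigned char* source, size_t source_size, int version)
{
    assert(version >= 0 && version <= kLatestVersion);
    if (destination_capacity < source_size + 1)
        return 0;
    destination[0] = (unsigned char)version;
    memcpy(destination + 1, source, source_size);
    return source_size + 1;
}

// Simple entry point: always follows the latest format.
size_t encode(unsigned char* destination, size_t destination_capacity,
              const unsigned char* source, size_t source_size)
{
    return encodeVersion(destination, destination_capacity, source, source_size, kLatestVersion);
}
```

Callers of `encode` see their output change when `kLatestVersion` changes, which is exactly the drawback described above.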

Solution 3: all format upgrades require new functions

If we want a solution that is maximally robust we can have separate encoding functions for format upgrades. That way, when the application is written, it can use the latest function available; library changes don't change the application behavior:

size_t encode(destination, destination_capacity, source, source_size)
size_t encode2(destination, destination_capacity, source, source_size)
size_t encode3(destination, destination_capacity, source, source_size)
... etc

This is somewhat similar to solution 1, but it remains backwards compatible even if the original version of the API didn't include a version, and it allows adding extra arguments to tune encoding for later format versions. It also means that FFI interfaces don't need to use enums/#defines from meshoptimizer.h, although this is probably not a big deal in practice?
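
A minimal sketch of this layout, again assuming the toy "version byte plus raw payload" format. The shared helper `writeWithVersion` is hypothetical:

```cpp
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical shared helper: version byte first, then the payload unchanged.
static size_t writeWithVersion(unsigned char* destination, size_t destination_capacity,
                               const unsigned char* source, size_t source_size,
                               unsigned char version)
{
    assert(destination && source);
    if (destination_capacity < source_size + 1)
        return 0;
    destination[0] = version;
    memcpy(destination + 1, source, source_size);
    return source_size + 1;
}

// Original entry point: keeps writing format version 0 across library upgrades.
size_t encode(unsigned char* destination, size_t destination_capacity,
              const unsigned char* source, size_t source_size)
{
    return writeWithVersion(destination, destination_capacity, source, source_size, 0);
}

// Added together with the format bump: always writes format version 1.
size_t encode2(unsigned char* destination, size_t destination_capacity,
               const unsigned char* source, size_t source_size)
{
    return writeWithVersion(destination, destination_capacity, source, source_size, 1);
}
```

An application opts into the new format only by changing the function it calls.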

However, meshoptimizer right now is at version 0; since we need to bump the format, and the new format is superior to the old format, this will lead to a situation where encode should not be used in practice. Since it's the most obvious name to reach for, why should future users pay for past legacy?

Solution 4: global encoding version

One way to decouple the interface from the version specification is to provide a separate way to set the encoding version globally. The default version can slowly increase following updates to the formats, but applications that need a specific version can set it once at startup.

size_t encode(destination, destination_capacity, source, source_size)
void setEncodeVersion(int version)

This means that users who care about version stability can call the version setter; by default the version can either always be the latest codec or trail it, increasing slowly as described above. This doesn't let us easily add extra encoding parameters, but keeps the simple use cases separate from more complex use cases on the API level.
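
A minimal sketch of the global setter, under the same toy-format assumption. The variable `gEncodeVersion` and its default value are hypothetical, and the sketch ignores thread safety, so it assumes the setter is called once at startup before any encoding happens:

```cpp
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical global state; the default here is the latest format (1).
static int gEncodeVersion = 1;

// Pins the format used by every subsequent encode call.
void setEncodeVersion(int version)
{
    assert(version == 0 || version == 1);
    gEncodeVersion = version;
}

// Same signature as before; the version comes from the global setting.
size_t encode(unsigned char* destination, size_t destination_capacity,
              const unsigned char* source, size_t source_size)
{
    if (destination_capacity < source_size + 1)
        return 0;
    destination[0] = (unsigned char)gEncodeVersion;
    memcpy(destination + 1, source, source_size);
    return source_size + 1;
}
```

Applications that don't care never touch `setEncodeVersion`; applications with older decoders in the field call `setEncodeVersion(0)` once and keep their existing `encode` calls unchanged.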

???

I'm really unsure which option is best here.

On one hand, I'd like to keep the library interface really simple. Version parameters don't seem like something users should care about by default. Global version setting is a way out but feels gross / like a hack.

On the other hand, any application that uses the default version will eventually face a situation where a library update bumps the version up - is it reasonable to protect the users against this at the cost of an extra API parameter?

I'm currently leaning towards option 4 because it seems like the "new encoder, old decoder" situation is comparatively rare.
