@HeroicKatora
Last active February 9, 2021 20:31
Image issue priority

A hindsight of 2020

The last year was mainly one of stability. Working up to 0.23, we fixed many outstanding stability risks, and it remains one of our longest-serving major versions. Part of this stability work was the ability to upgrade decoder dependencies behind the scenes. This allowed you to silently benefit from the improved gif decoder, as detailed above, and from similar fixes across other decoders; it will also enable similar updates for png soon.

This is directly reflected in dependencies and downloads. While previously there was a long tail, dependencies and downloads are now predominantly for recent versions, which are quicker, more secure, and more correct. (If you're still using 0.18 or 0.22, you should really consider updating.) We've also more than doubled our total downloads over the last year!

All of this didn't stop progress. The png and tiff decoders have seen a significant increase in feature coverage, and png and gif are now quite competitive with their C counterparts. The jpeg-decoder crate has fixed quite a lot of outstanding bugs in the decoding process, bindings for avif decoders and encoders were newly added, and, not to forget, image operations have been improved and made copyless.

There are still many outstanding issues and unimplemented improvements. From the point of view of a maintainer, some of the most pressing ones are listed in the next sections, sorted roughly by topic.

Fallbacks and Error logs

The decoders are strict in many ways. There exist a few fallbacks here and there, but not enough. The justification for this strictness is that being too lenient in what is accepted leads to an ossification of formats, where particular non-standard behavior by the most popular encoders is de facto standardized without appearing in any specification. For many older formats, however, that ship has long sailed. We should thus accept the popular interpretation (e.g. ImageMagick's) of such extensions.

The issue goes a bit further than this. It would be great to have a mode where such details and nit-picks are remarked upon but decoding continues. At the same time, we do not want to dump them to stderr.
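One possible shape for such a mode is a collector that the decoder reports to, which either turns a remark into an error or stores it for the caller to inspect later. The following is only a sketch: `Warning`, `Strictness`, `Diagnostics`, `DecodeError`, and `check_bit_depth` are hypothetical names that do not exist in image or its decoders, and the fallback to a bit depth of 8 is a made-up rule for illustration.

```rust
/// A non-fatal deviation from a format specification (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Warning {
    /// A field carried a value that the specification does not allow.
    NonStandardValue { field: &'static str, found: u32 },
}

/// How the decoder reacts to deviations (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Strictness {
    Strict,
    Lenient,
}

/// The error produced when a deviation is hit in strict mode.
#[derive(Debug, PartialEq, Eq)]
pub struct DecodeError(pub Warning);

/// Collects remarks in memory instead of printing them to stderr.
pub struct Diagnostics {
    mode: Strictness,
    warnings: Vec<Warning>,
}

impl Diagnostics {
    pub fn new(mode: Strictness) -> Self {
        Diagnostics { mode, warnings: Vec::new() }
    }

    /// Report a deviation: an error in strict mode, a stored warning otherwise.
    pub fn remark(&mut self, warning: Warning) -> Result<(), DecodeError> {
        match self.mode {
            Strictness::Strict => Err(DecodeError(warning)),
            Strictness::Lenient => {
                self.warnings.push(warning);
                Ok(())
            }
        }
    }

    /// Everything that was remarked upon while decoding continued.
    pub fn warnings(&self) -> &[Warning] {
        &self.warnings
    }
}

/// Toy stand-in for one decoding step: validate a bit depth from a header.
/// In lenient mode an unknown depth is remarked upon and replaced by 8
/// (an assumption made for this example only).
pub fn check_bit_depth(depth: u8, diag: &mut Diagnostics) -> Result<u8, DecodeError> {
    match depth {
        1 | 2 | 4 | 8 | 16 => Ok(depth),
        other => {
            diag.remark(Warning::NonStandardValue {
                field: "bit depth",
                found: u32::from(other),
            })?;
            Ok(8)
        }
    }
}
```

A caller would construct `Diagnostics::new(Strictness::Lenient)`, run the decoder, and then decide what to do with `warnings()`: log them, show them to the user, or ignore them.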

Related issues:

Supplementary information

The image library currently does not expose a consistent interface to supplementary information such as color spaces, comments, copyright, orientation, and other parts of EXIF. The work ranges from the concrete treatment of extension chunks in the decoder libraries all the way to design work for integrating them into the ImageDecoder trait and the Reader.
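To make the design question a bit more concrete, here is a rough sketch of what a common container for such information could look like. None of these names (`Metadata`, `Orientation`, `SupplementaryInfo`) exist in image; they are assumptions for illustration. The only external fact used is that the EXIF orientation tag takes the values 1 through 8.

```rust
/// The eight values of the EXIF orientation tag (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Orientation {
    Normal,
    FlipHorizontal,
    Rotate180,
    FlipVertical,
    Transpose,
    Rotate90,
    Transverse,
    Rotate270,
}

impl Orientation {
    /// Map the raw EXIF tag value (1 through 8) to a variant.
    pub fn from_exif(value: u16) -> Option<Self> {
        match value {
            1 => Some(Orientation::Normal),
            2 => Some(Orientation::FlipHorizontal),
            3 => Some(Orientation::Rotate180),
            4 => Some(Orientation::FlipVertical),
            5 => Some(Orientation::Transpose),
            6 => Some(Orientation::Rotate90),
            7 => Some(Orientation::Transverse),
            8 => Some(Orientation::Rotate270),
            _ => None,
        }
    }

    /// Whether applying the orientation exchanges width and height.
    pub fn swaps_dimensions(self) -> bool {
        match self {
            Orientation::Transpose
            | Orientation::Rotate90
            | Orientation::Transverse
            | Orientation::Rotate270 => true,
            _ => false,
        }
    }
}

/// Format-independent supplementary information (hypothetical type).
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct Metadata {
    pub orientation: Option<Orientation>,
    pub icc_profile: Option<Vec<u8>>,
    pub comments: Vec<String>,
    pub copyright: Option<String>,
}

/// Hypothetical extension point that a decoder could implement.
/// The default implementation reports that nothing is known.
pub trait SupplementaryInfo {
    fn metadata(&self) -> Metadata {
        Metadata::default()
    }
}
```

The default method is the interesting design choice here: decoders that know nothing about supplementary information would keep compiling, and each format could opt in at its own pace.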

Limits during decoding

Loading an image from a remote server, only to find out that it decompresses to several gigabytes of memory and grinds your system to a halt or crashes your program, isn't great. We'd like to ensure that all decoders have memory and/or runtime limits that they check and abide by. This also helps with fuzzing, as it allows controlling for such resource use explicitly. Note that most of the core libraries (png, gif, tiff) have their own form of limits, but notably jpeg-decoder does not. Also, there is no common interface in image to control them or to restrict decoding to those parsers that can enforce them.
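As an illustration of the kind of check meant here, the sketch below validates the dimensions declared in a header before any pixel buffer is allocated. `Limits` and `LimitError` are hypothetical names chosen for this example and do not describe an existing interface in image or the decoder crates.

```rust
/// Limits a caller could hand to a decoder (hypothetical type).
#[derive(Debug, Clone, Copy)]
pub struct Limits {
    pub max_width: u32,
    pub max_height: u32,
    pub max_alloc_bytes: u64,
}

/// Why a declared image size was rejected (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum LimitError {
    DimensionsTooLarge,
    AllocationTooLarge { requested: u64, allowed: u64 },
}

impl Limits {
    /// Check the size declared in the header before allocating anything.
    /// Returns the number of bytes the decoded image would occupy.
    pub fn check(&self, width: u32, height: u32, bytes_per_pixel: u8) -> Result<u64, LimitError> {
        if width > self.max_width || height > self.max_height {
            return Err(LimitError::DimensionsTooLarge);
        }

        // Checked arithmetic: a hostile header must not be able to
        // overflow the size computation and slip under the limit.
        let requested = u64::from(width)
            .checked_mul(u64::from(height))
            .and_then(|pixels| pixels.checked_mul(u64::from(bytes_per_pixel)))
            .ok_or(LimitError::AllocationTooLarge {
                requested: u64::MAX,
                allowed: self.max_alloc_bytes,
            })?;

        if requested > self.max_alloc_bytes {
            return Err(LimitError::AllocationTooLarge {
                requested,
                allowed: self.max_alloc_bytes,
            });
        }
        Ok(requested)
    }
}
```

With a 64 MiB allocation limit, a 1920x1080 RGBA image (about 8 MB) passes, while a 16384x16384 RGBA image (1 GiB) is rejected before a single byte is allocated. A runtime limit would need a separate mechanism and is not covered by this sketch.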

Code style and Documentation

Code changes and gets old, sometimes quite a bit and very quickly. This can manifest as the coding style of one part of the library being inconsistent with the rest, leading new contributors to unwittingly base their contributions on outdated styles. This is a bit of a problem for both parties: it makes reviews harder and doesn't teach the right values to potentially new programmers. Clippy isn't always helpful here, as it neglects both API stability and our MSRV commitment (1.34 at the moment, potentially changing this year).

This also concerns the documentation itself. Have you ever been puzzled by an interface, only to find the correct use some hours later? Answers to these questions rarely end up as issues or, even more rarely, as PRs. However, consider that nobody understands your struggle better than you do. It's really easy to slip into a mindset where you accept bad ergonomics as quirks, and in many cases a single additional sentence of documentation would help tremendously.

There is no dedicated issue tracking for most of these issues (for reasons stated above) but there are a couple:

Image Buffers

Lastly, we'd like to raise a bit of awareness of an experimental image buffer library, in the hope of gathering a few new eyes, use cases, and contributors. The ImageBuffer is showing a few cracks in its design, but replacing it directly is also not quite feasible. It has chosen one layout and encodes this within its own type. Both of these aspects have proven to be too inflexible.

Using Vec<T> for representing pixels is not very efficient if one wants to support operations that change the sample type. This is because it relies on the standard library's use of allocators, and the memory allocated for a vector is tied to the exact layout of the sample type. There is a concept library to work around this restriction by storing everything as a highly aligned byte buffer, which makes it unnecessary to track the sample type as a type parameter of the buffer type itself. It's a work in progress found here:

https://github.com/image-rs/canvas
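The general idea of a highly aligned byte buffer can be sketched in a few lines. This is not the code of the linked library, only a minimal illustration under our own assumptions: `ByteBuffer` and `Chunk` are hypothetical names, and the alignment of 16 bytes is an arbitrary choice that is large enough for all primitive sample types used here.

```rust
use std::mem;
use std::slice;

/// One 16-byte block of storage with 16-byte alignment.
#[derive(Clone, Copy)]
#[repr(C, align(16))]
struct Chunk([u8; 16]);

/// A buffer whose allocation does not depend on the sample type
/// (hypothetical type for illustration).
pub struct ByteBuffer {
    chunks: Vec<Chunk>,
    /// Number of bytes in use, at most `chunks.len() * 16`.
    len: usize,
}

impl ByteBuffer {
    /// Allocate `len` zeroed bytes, rounded up to whole chunks internally.
    pub fn zeroed(len: usize) -> Self {
        let count = (len + 15) / 16;
        ByteBuffer { chunks: vec![Chunk([0; 16]); count], len }
    }

    pub fn as_bytes(&self) -> &[u8] {
        // SAFETY: `Chunk` consists of 16 initialized bytes without padding,
        // and `len` never exceeds the allocated size.
        unsafe { slice::from_raw_parts(self.chunks.as_ptr() as *const u8, self.len) }
    }

    /// View the storage as `u16` samples; leftover bytes are ignored.
    pub fn as_u16(&self) -> &[u16] {
        // SAFETY: the storage is 16-byte aligned and fully initialized,
        // and every bit pattern is a valid `u16`.
        unsafe {
            slice::from_raw_parts(
                self.chunks.as_ptr() as *const u16,
                self.len / mem::size_of::<u16>(),
            )
        }
    }

    /// View the same storage as mutable `f32` samples.
    pub fn as_f32_mut(&mut self) -> &mut [f32] {
        // SAFETY: as above; every bit pattern is a valid `f32`.
        unsafe {
            slice::from_raw_parts_mut(
                self.chunks.as_mut_ptr() as *mut f32,
                self.len / mem::size_of::<f32>(),
            )
        }
    }
}
```

Because the allocation is always made in terms of `Chunk`, the same memory can be reused for `u16` and `f32` samples without reallocating, which a `Vec<u16>` cannot be turned into a `Vec<f32>` to achieve. The buffer type itself carries no sample type parameter.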
