@kartikkhullar
Last active August 29, 2020 12:55
GSOC 2020: Writing a FLIF Decoder and Encoder for FFmpeg

Free Lossless Image Format (FLIF) is the precursor to FUIF and JPEG XL. It claims better compression (smaller files) than conventional lossless image formats such as PNG and lossless JPEG 2000. Development of FLIF and FUIF has stopped in favour of JPEG XL.

The task was to write an encoder and decoder for this format for FFmpeg.

Authors

This project was undertaken by two people, which is apparently unusual for GSOC:

Both had submitted independent proposals for the same project. The public repository used for communication during development is available here.

Project Status as of 23rd August 2020

$ > git diff --stat HEAD~4
 Changelog                      |    3 +-
 configure                      |    1 +
 doc/general.texi               |    2 +
 libavcodec/Makefile            |    3 +
 libavcodec/allcodecs.c         |    1 +
 libavcodec/codec_desc.c        |    7 +
 libavcodec/codec_id.h          |    1 +
 libavcodec/flif16.c            |  204 +++
 libavcodec/flif16.h            |  287 ++++
 libavcodec/flif16_parser.c     |  193 +++
 libavcodec/flif16_rangecoder.c |  800 +++++++++++
 libavcodec/flif16_rangecoder.h |  400 ++++++
 libavcodec/flif16_transform.c  | 2895 ++++++++++++++++++++++++++++++++++++++++
 libavcodec/flif16_transform.h  |  124 ++
 libavcodec/flif16dec.c         | 1764 ++++++++++++++++++++++++
 libavcodec/parsers.c           |    1 +
 libavcodec/version.h           |    2 +-
 libavformat/Makefile           |    1 +
 libavformat/allformats.c       |    1 +
 libavformat/flifdec.c          |  431 ++++++
 libavformat/version.h          |    2 +-
 21 files changed, 7120 insertions(+), 3 deletions(-)
  • The decoder is complete and decodes all tested FLIF files, with slightly better performance and better memory efficiency than the reference decoder.
  • However, the decoder has not yet been merged into FFmpeg's upstream repository.
  • The encoder is now being written. It will not be completed within the GSOC period.

List of Public Submissions of Source Code

This is a list of links to the iterations of the source code posted to the development channels of FFmpeg. The latest iteration is version 7. Faulty patches are not shown.

v3:
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/267742.html

v5:
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/267920.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/267921.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/267922.html

v6:
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268473.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268474.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268475.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268476.html

v7:
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268837.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268840.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268842.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268838.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268839.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268841.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268843.html

Author Comments

Regarding FLIF

  • FLIF's reference specification is inadequate for writing a working codec for the format. Most of our understanding of how the format works and how its data is laid out was derived from the reference codec.
  • FLIF's reference codec is written in C++. From our observation, it is poorly written, both cosmetically and logically. Owing to our inexperience, what we have written cannot yet be considered very elegant, but we believe our code is, at the very least, more organised and logically sound than the reference codec.
  • The disorganised state of the reference source code also slowed down our workflow.

Regarding FFmpeg

Developing for FFmpeg requires reading through a lot of the existing code base and referring to existing components of the same type as the one being developed, which I think is common to, and required for, any software project.

However, I think there could be a few more pointers on what a novice developer has to do, such as a warning to expect reading through a lot of source code, and a list of the files to refer to when developing a given kind of component.

In the initial stages of the project, I was confused about what to do and where to start. Understanding what a component even does took up a large part of the time spent.

Regarding Us

Anamitra has made his own comments here. Please check them out.

Many of the above problems could perhaps have been avoided, saving us time, if we had gone through the FFmpeg manpage and the rest of the (non-API) documentation thoroughly.

I (Kartik) had no experience with FFmpeg prior to GSoC. When the organizations were announced, I saw FFmpeg among them and found this project; since I had studied some image compression algorithms and colorspaces before, I decided to go with it. Because of my lack of experience, I spent some time getting used to FFmpeg's coding conventions and the way most of its code works. The first 15 days were spent revising the same code again and again because I did not know how most things work in FFmpeg. After initial help from the mentors and my project co-author, I was able to write all the intermediate transforms in the decoding process myself.

It took me some time to understand how the reference codec handles transforms and the corresponding ranges subsystem. FLIF has a distinctive way of compressing image data using ranges, which vary from transform to transform as encoding or decoding proceeds. The ranges themselves are stored recursively, in a complex manner. The final range is used by most other components of the codec, such as pixel prediction and pixel data encoding/decoding. It holds all the required data from the ranges of previous transforms, which are consulted when the final range cannot answer a query by itself. All of this is handled by the ranges subsystem through generic function calls. Because of this complexity, writing the first few transforms took nearly a month: the subsystem had to be robust enough to handle all combinations of transforms.
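The delegation between ranges can be illustrated with a minimal sketch. It is not the code from the submitted patches: the `Ranges` struct, the helper names and the "one plane per layer" simplification are all assumptions made for illustration. Each layer answers the queries it knows about and forwards the rest to the layer it wraps, through function pointers.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PLANES 4

/* Hypothetical ranges layer: every transform wraps the ranges of the
 * previous one and is queried through the same function pointers. */
typedef struct Ranges Ranges;

struct Ranges {
    int (*min)(const Ranges *r, int plane);
    int (*max)(const Ranges *r, int plane);
    const Ranges *prev;   /* ranges before this transform; NULL for the base */
    int lo[MAX_PLANES];
    int hi[MAX_PLANES];
    int plane;            /* the plane a transform layer overrides */
};

/* Base layer: bounds are stored directly for every plane. */
static int static_min(const Ranges *r, int p) { return r->lo[p]; }
static int static_max(const Ranges *r, int p) { return r->hi[p]; }

/* Transform layer: knows one plane and delegates all other queries. */
static int layer_min(const Ranges *r, int p)
{
    return p == r->plane ? r->lo[p] : r->prev->min(r->prev, p);
}

static int layer_max(const Ranges *r, int p)
{
    return p == r->plane ? r->hi[p] : r->prev->max(r->prev, p);
}

static Ranges ranges_base(int planes, int lo, int hi)
{
    Ranges r = { static_min, static_max, NULL, {0}, {0}, -1 };
    assert(planes > 0 && planes <= MAX_PLANES);
    for (int i = 0; i < planes; i++) {
        r.lo[i] = lo;
        r.hi[i] = hi;
    }
    return r;
}

static Ranges ranges_wrap(const Ranges *prev, int plane, int lo, int hi)
{
    Ranges r = { layer_min, layer_max, prev, {0}, {0}, plane };
    assert(prev && plane >= 0 && plane < MAX_PLANES);
    r.lo[plane] = lo;
    r.hi[plane] = hi;
    return r;
}
```

A caller only ever holds the outermost layer, for example `top.min(&top, 2)`; whether the answer comes from that layer or from one several transforms back is invisible to it.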

Another challenge was writing our own data structures, such as linked lists and sets, for the transform routines. The reference codec is written in C++ and uses the STL for all such needs. Most of the time, using arrays instead of vectors did the job, but in some places, such as the ColorBuckets transform, where values are inserted and removed anywhere in the list, I had to create lists supporting those operations. Those spots were repeatedly a source of failures and slowdowns until the operations were finally optimized with suggestions from the FFmpeg community.
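As a rough idea of what replacing an STL container looks like in C, here is a small sorted singly linked list with set semantics. This is only a sketch with hypothetical names (`Node`, `list_insert`, `list_remove`, `list_free`); it is not the list used in the decoder and does not include the optimizations mentioned above.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    int value;
    struct Node *next;
} Node;

/* Insert value keeping ascending order; duplicates are ignored.
 * Returns 0 on success (or if already present), -1 if malloc fails. */
static int list_insert(Node **head, int value)
{
    Node **link = head;
    Node *n;

    assert(head);
    while (*link && (*link)->value < value)
        link = &(*link)->next;
    if (*link && (*link)->value == value)
        return 0;
    n = malloc(sizeof(*n));
    if (!n)
        return -1;
    n->value = value;
    n->next  = *link;
    *link    = n;
    return 0;
}

/* Remove value from anywhere in the list.
 * Returns 1 if it was removed, 0 if it was not found. */
static int list_remove(Node **head, int value)
{
    Node **link = head;
    Node *dead;

    assert(head);
    while (*link && (*link)->value < value)
        link = &(*link)->next;
    if (!*link || (*link)->value != value)
        return 0;
    dead  = *link;
    *link = dead->next;
    free(dead);
    return 1;
}

/* Free every node and leave the list empty. */
static void list_free(Node **head)
{
    assert(head);
    while (*head) {
        Node *n = *head;
        *head = n->next;
        free(n);
    }
}
```

Walking with a pointer to the link (`Node **`) rather than to the node avoids special-casing the head. Both operations are still linear in the list length, which is the kind of cost that becomes noticeable when they run once per pixel value.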

In conclusion, we think there is a need for slightly better developer documentation, with pointers for new developers and a description of what to expect when developing for the code base.

Talking with people other than ourselves never really took center stage during the project. The size and nature of the project had a hand in this, and also meant that we could not send out patches frequently. Most of what we were able to do came from reading the source code of FLIF and FFmpeg.

Regarding the Project

Anamitra has made his own comments here. Please check them out.

  • As can be inferred from the previous sections, a significant part of the time spent on the project went into deducing from the reference decoder how the format is actually decoded.
  • Understanding the Transforms & Ranges subsystem was also crucial to completing the project, since it was a lot more complex than it initially looked. Without the time spent in the first month of the coding period on understanding the subsystem, it would have been difficult to translate all the transform and ranges routines from the reference codec to the FFmpeg codebase.

Technical Comments

One of the main decisions made while writing the decoder was to structure it as a state machine, so that it can save its state between intermittent packets. This was done because FLIF does not provide a simple method to determine the end of the file bitstream, since the entirety of the image data is entropy coded. Additionally, the data of all frames is interleaved, which makes it impossible to split the data into frames without first reading the whole of the image data.
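The general shape of such a resumable decoder can be sketched as follows. This is a toy example under stated assumptions, not the decoder from the patches: the `Decoder` struct, its states and the two-field layout (a 4-byte magic followed by one big-endian base-128 varint) are hypothetical and far simpler than a real FLIF stream. The point is that all progress lives in the context struct, so input may be cut at any byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum State { ST_MAGIC, ST_VARINT, ST_DONE, ST_ERROR };

typedef struct Decoder {
    enum State state;
    int magic_pos;    /* number of magic bytes matched so far */
    uint32_t value;   /* varint bits accumulated so far */
} Decoder;

static void decoder_init(Decoder *d)
{
    assert(d);
    d->state     = ST_MAGIC;
    d->magic_pos = 0;
    d->value     = 0;
}

/* Feed one packet and return the number of bytes consumed.
 * Nothing is kept on the stack between calls: the next packet
 * resumes exactly where this one ran out of data. */
static size_t decoder_feed(Decoder *d, const uint8_t *buf, size_t size)
{
    static const uint8_t magic[4] = { 'F', 'L', 'I', 'F' };
    size_t i = 0;

    assert(d && (buf || !size));
    while (i < size) {
        switch (d->state) {
        case ST_MAGIC:
            if (buf[i++] != magic[d->magic_pos]) {
                d->state = ST_ERROR;
                break;
            }
            if (++d->magic_pos == 4)
                d->state = ST_VARINT;
            break;
        case ST_VARINT:
            /* 7 payload bits per byte; a set high bit means "more follows" */
            d->value = (d->value << 7) | (buf[i] & 0x7F);
            if (!(buf[i++] & 0x80))
                d->state = ST_DONE;
            break;
        default:      /* ST_DONE or ST_ERROR: stop consuming input */
            return i;
        }
    }
    return i;
}
```

The caller checks `d->state` after each call: anything other than `ST_DONE` or `ST_ERROR` simply means "send more data", which is what lets the decoder cope with a bitstream whose end cannot be located in advance.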

The reference decoder allocates frame data for duplicate frames and copies frame data into them. We could not find a reason for doing this. So, at the very least in this respect, our decoder is more memory efficient.
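One way to avoid the copy, sketched here with hypothetical names (`Frame`, `same_as`, `frames_alloc`, `frame_pixels`) and not necessarily the way our decoder does it, is to let a duplicate frame record the index of the frame it repeats instead of owning a buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct Frame {
    uint8_t *data;   /* owned pixel buffer; NULL for a duplicate frame */
    int same_as;     /* index of the earlier frame this one repeats, or -1 */
} Frame;

/* Allocate buffers only for frames that carry their own pixels.
 * same_as[i] must be -1 or an index smaller than i.
 * Returns 0 on success, -1 on allocation failure. */
static int frames_alloc(Frame *frames, const int *same_as, int count,
                        size_t frame_size)
{
    assert(frames && same_as && count >= 0);
    for (int i = 0; i < count; i++) {
        assert(same_as[i] < i);
        frames[i].same_as = same_as[i];
        frames[i].data    = NULL;
    }
    for (int i = 0; i < count; i++) {
        if (frames[i].same_as >= 0)
            continue;                 /* duplicate: no storage needed */
        frames[i].data = calloc(1, frame_size);
        if (!frames[i].data)
            return -1;
    }
    return 0;
}

/* Resolve a frame to the buffer that actually holds its pixels. */
static const uint8_t *frame_pixels(const Frame *frames, int i)
{
    while (frames[i].same_as >= 0)
        i = frames[i].same_as;
    return frames[i].data;
}

static void frames_free(Frame *frames, int count)
{
    for (int i = 0; i < count; i++) {
        free(frames[i].data);
        frames[i].data = NULL;
    }
}
```

With this layout an animation in which many frames repeat an earlier one costs one buffer per distinct frame rather than one per frame.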

The reference decoder crashes on certain files while trying to decode eXmp metadata. This does not happen with our decoder.

We may eventually rewrite the FLIF specification, such that the components are much clearer.

Benefits of the Project

  • JPEG XL uses an entropy coding method similar to MANIAC, and probably shares many other similar components as well. If work on implementing JPEG XL in FFmpeg begins sometime in the future, this codec can serve as a base or reference for it.
  • Although very improbable, it may lead to somewhat wider adoption of FLIF.

Future Work

  1. The encoder, which is being written.
  2. Make the decoder stop at a certain zoomlevel, interpolate the data and produce a lower quality image for an interlaced FLIF image.

Additional Points

  • The original paper on range coders by G. N. N. Martin was re-typeset in LaTeX by us and is available here.
  • This file is mirrored here.