kartikkhullar/GSoC2020-FFmpeg-FLIF.md

## GSoC2020-FFmpeg-FLIF.md

      
GSOC 2020: Writing a FLIF Decoder and Encoder for FFmpeg

Free Lossless Image Format (FLIF) is the precursor to
JPEG XL and FUIF.
It claims better compression (smaller files) than
conventional lossless image formats such as PNG and lossless JPEG 2000.
Development of FLIF and FUIF has stopped in favour of JPEG XL.
The task was to write an encoder and a decoder for this format for FFmpeg.

Authors

This project was undertaken by two people, which is apparently unusual for GSOC:

Anamitra Ghorui
Kartik K Khullar

Both had submitted independent proposals for the same project.
The public repository used for communication during development is available
here.

Project Status as of 23rd August 2020

$ > git diff --stat HEAD~4
 Changelog                      |    3 +-
 configure                      |    1 +
 doc/general.texi               |    2 +
 libavcodec/Makefile            |    3 +
 libavcodec/allcodecs.c         |    1 +
 libavcodec/codec_desc.c        |    7 +
 libavcodec/codec_id.h          |    1 +
 libavcodec/flif16.c            |  204 +++
 libavcodec/flif16.h            |  287 ++++
 libavcodec/flif16_parser.c     |  193 +++
 libavcodec/flif16_rangecoder.c |  800 +++++++++++
 libavcodec/flif16_rangecoder.h |  400 ++++++
 libavcodec/flif16_transform.c  | 2895 ++++++++++++++++++++++++++++++++++++++++
 libavcodec/flif16_transform.h  |  124 ++
 libavcodec/flif16dec.c         | 1764 ++++++++++++++++++++++++
 libavcodec/parsers.c           |    1 +
 libavcodec/version.h           |    2 +-
 libavformat/Makefile           |    1 +
 libavformat/allformats.c       |    1 +
 libavformat/flifdec.c          |  431 ++++++
 libavformat/version.h          |    2 +-
 21 files changed, 7120 insertions(+), 3 deletions(-)


The decoder is complete and can decode all tested FLIF files, with slightly
better performance and better memory efficiency than
the reference decoder.
However, the decoder has not yet been merged into FFmpeg's upstream repository.
The encoder is now being written. It will not be completed within the GSoC
period.

List of Public Submissions of Source Code

This is a list of links to the iterations of the source code posted to
FFmpeg's development channels. The latest iteration is version 7.
Faulty patches are not shown.

v3:
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/267742.html

v5:
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/267920.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/267921.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/267922.html

v6:
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268473.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268474.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268475.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268476.html

v7:
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268837.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268840.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268842.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268838.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268839.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268841.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268843.html

Author Comments

Regarding FLIF


FLIF's Reference Specification is inadequate for
actually writing a working codec for the format. Most of our understanding of
how the format works and how its data is laid out was derived from the
reference codec while writing our own.
FLIF's reference codec is written in C++. From personal observation, the whole
codec is poorly written, both cosmetically and logically. While, owing
to our inexperience, we have not written something that can be considered very
elegant as of now, we believe that our code is, at the very least, more
organised and logically sound than the reference codec.
The disorganised manner in which the reference source code was written also
impeded our workflow.

Regarding FFmpeg

Developing for FFmpeg requires the interested party to read through a lot of the
existing code base and to refer to existing components of the same type as the
one being developed, which I think is common to and required for any
software project.
However, I think there could be a few more pointers on what a novice developer
has to do, such as being told to expect to read through a lot of source code,
and which files to refer to when developing a component.
In the initial stages of this project, I was confused about
what to do and where to start. Understanding what a component even does took up
a large part of the time spent.

Regarding Us

Anamitra has made his own comments here. Please check them out.
Many of the above problems could perhaps have been avoided, saving us time,
if we had gone through the FFmpeg man page and the rest of the (non-API)
documentation thoroughly.
I (Kartik) had no experience with FFmpeg prior to GSoC. When the participating
organizations were announced, I saw FFmpeg among them and found
this project. As I had studied some image compression algorithms and
colorspaces before, I decided to go with it. Because of my lack of
experience with FFmpeg, I spent some time getting used to the
organization's coding conventions and to how most FFmpeg code works.
The first 15 days were spent revising the same code again and again, owing to
my lack of knowledge of how most things work in FFmpeg. But after the initial help
from the mentors and my project co-author, I was able to write all the intermediate
transforms in the decoding process myself.
It took me some time to understand how the reference codec handles
transforms and their corresponding ranges subsystem. FLIF has a unique way of
compressing image data using ranges, which vary from transform to transform
as the image data is encoded or decoded. The ranges themselves are stored
recursively, in a complex manner. The final range is used in most of the other
components of the codec, such as pixel prediction and pixel data encoding/decoding.
The final range also has access to all the required data from the previous
transforms' ranges, which is consulted when the final range cannot answer a query by itself.
All this is handled by the ranges subsystem through generic function calls. Because of
the complex nature of the transform and ranges subsystem, it took nearly a month to
write the first few transforms: the subsystem had to be made robust enough to
handle all combinations of transforms.
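
To illustrate the idea of a final range that falls back on the ranges of earlier transforms, here is a minimal sketch in C. It is not the code from our patches or from the reference codec: the names `RangesSketch`, `ranges_min` and `ranges_max`, the fixed count of three planes, and the `has_own` flag are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_PLANES 3 /* hypothetical: three colour planes */

/* One link in the chain of ranges; every name here is illustrative only. */
typedef struct RangesSketch {
    const struct RangesSketch *prev; /* ranges of the previous transform, or NULL */
    int has_own[NUM_PLANES];         /* 1 if this link knows the plane's bounds */
    int min[NUM_PLANES];
    int max[NUM_PLANES];
} RangesSketch;

/* Walk back through the chain until some link can answer the query. */
static int ranges_min(const RangesSketch *r, int plane)
{
    assert(plane >= 0 && plane < NUM_PLANES);
    for (; r; r = r->prev)
        if (r->has_own[plane])
            return r->min[plane];
    return 0; /* no link knew the answer: fall back to a default */
}

static int ranges_max(const RangesSketch *r, int plane)
{
    assert(plane >= 0 && plane < NUM_PLANES);
    for (; r; r = r->prev)
        if (r->has_own[plane])
            return r->max[plane];
    return 0; /* no link knew the answer: fall back to a default */
}
```

A caller only ever holds the last link and queries it, for example `ranges_max(&last, 1)`. If that link has no bounds of its own for plane 1, the answer comes from the nearest earlier link that does.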
Another challenge I faced was writing my own data structures, such as linked lists and
sets, for the transforms' routines. The reference codec is written in C++ and
uses the STL for all such needs. Most of the time, using arrays instead of vectors
did the job, but in some places, such as the ColorBuckets transform, where values
are inserted and removed anywhere in the list, I had
to create linked lists for those operations. Those places were repeatedly a source of
failures and slowdowns until the operations were finally optimized with
suggestions from the FFmpeg community.
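
As an illustration of the kind of structure that replaces an STL container here, the following is a minimal sketch of a sorted singly linked list of integers with insertion and removal at any position. It is an assumption-laden example, not the code from our patches: the names `ValueNode`, `list_insert`, `list_remove` and `list_free` are made up, and the choice to keep values sorted and unique is ours for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative node type; not taken from the actual decoder. */
typedef struct ValueNode {
    int value;
    struct ValueNode *next;
} ValueNode;

/* Insert value in ascending order, ignoring duplicates.
 * Returns 0 on success and -1 if allocation fails. */
static int list_insert(ValueNode **head, int value)
{
    ValueNode **link = head;
    ValueNode *node;

    assert(head);
    /* Advance the link pointer to the first node that is not smaller. */
    while (*link && (*link)->value < value)
        link = &(*link)->next;
    if (*link && (*link)->value == value)
        return 0;
    node = malloc(sizeof(*node));
    if (!node)
        return -1;
    node->value = value;
    node->next = *link;
    *link = node;
    return 0;
}

/* Remove value if present. Returns 1 if a node was removed, else 0. */
static int list_remove(ValueNode **head, int value)
{
    ValueNode **link = head;
    ValueNode *dead;

    assert(head);
    while (*link && (*link)->value < value)
        link = &(*link)->next;
    if (!*link || (*link)->value != value)
        return 0;
    dead = *link;
    *link = dead->next;
    free(dead);
    return 1;
}

/* Free every node and reset the head to NULL. */
static void list_free(ValueNode **head)
{
    assert(head);
    while (*head) {
        ValueNode *next = (*head)->next;
        free(*head);
        *head = next;
    }
}
```

Using a pointer to the link (`ValueNode **`) rather than a pointer to the node avoids special-casing the head of the list, which is one of the places where hand-written list code tends to go wrong.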
In conclusion, we think there is a need for slightly better developer
documentation: pointers for new developers and a description of what
to expect when developing for the code base.
Talking with people other than ourselves never really took centre stage
during the project. The size and nature of the project had a hand in this, as they
kept us from sending out patches frequently. Most of what we were
able to do came from reading through the source code of FLIF and FFmpeg.

Regarding the Project

Anamitra has made his own comments here. Please check them out.

Time spent on the project, as can be inferred from the previous sections,
consisted to a significant extent of deducing how the format is actually decoded
from the reference decoder.
Understanding the transforms and ranges subsystem was also crucial to completing
the project, since it was a lot more complex than it initially looked.
Without the time spent in the first month of the coding period on understanding the
subsystem, it would have been difficult to translate all the transform and ranges
routines from the reference codec to the FFmpeg codebase.

Technical Comments

One of the main decisions made while writing the decoder was to structure it
as a state machine, so that it can save its state between intermittent
packets. This was done because FLIF does not provide a simple method to
determine the end of the file bitstream, since the entirety of the image data
is entropy coded. Additionally, the data of all frames is interleaved,
which makes it impossible to split the data into frames without reading the
whole of the image data first.
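
The state machine idea can be sketched as follows. This is a toy example, not our decoder: it parses a made-up header (the four magic bytes `FLIF` followed by one variable-length integer) that is not claimed to match the real FLIF layout, and the names `SketchDecoder`, `SketchState` and `sketch_feed` are assumptions for illustration. The point is only that all progress lives in the context struct, so a packet boundary may fall anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoding stages for a toy header. */
typedef enum { ST_MAGIC, ST_WIDTH, ST_DONE, ST_ERROR } SketchState;

/* Everything needed to resume after a packet ends is stored here. */
typedef struct {
    SketchState state;
    size_t      matched; /* magic bytes matched so far */
    uint32_t    width;   /* variable-length integer accumulated so far */
} SketchDecoder;

/* Feed one packet. Returns the number of bytes consumed from buf. */
static size_t sketch_feed(SketchDecoder *d, const uint8_t *buf, size_t size)
{
    static const uint8_t magic[4] = { 'F', 'L', 'I', 'F' };
    size_t pos = 0;

    assert(d && (buf || size == 0));
    while (pos < size) {
        switch (d->state) {
        case ST_MAGIC:
            if (buf[pos++] != magic[d->matched]) {
                d->state = ST_ERROR;
                break;
            }
            if (++d->matched == sizeof(magic))
                d->state = ST_WIDTH;
            break;
        case ST_WIDTH: {
            /* 7 payload bits per byte; the top bit means "more follows". */
            uint8_t b = buf[pos++];
            d->width = (d->width << 7) | (b & 0x7F);
            if (!(b & 0x80))
                d->state = ST_DONE;
            break;
        }
        case ST_DONE:
        case ST_ERROR:
            return pos; /* nothing more to consume in this sketch */
        }
    }
    return pos;
}
```

A caller initialises the context once, for example `SketchDecoder d = { ST_MAGIC, 0, 0 };`, and then calls `sketch_feed` for each packet as it arrives. The header may be split across packets at any byte, because no parsing progress is held in local variables between calls.
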
The reference decoder allocates frame data for duplicate frames and copies
frame data into them. We could not find a reason for doing this. So,
at the very least, in this aspect, our decoder is more memory efficient.
The reference decoder crashes on certain files while trying to decode eXmp
metadata. This does not happen with our decoder.
We may eventually rewrite the FLIF specification so that its components are
much clearer.

Benefits of the Project


JPEG XL uses an entropy coding method similar to MANIAC, and probably shares
several other components as well. If work on implementing JPEG XL in
FFmpeg is started sometime in the future, this codec can be used as a base or
reference for it.
It is very improbable, but the project may lead to slightly wider adoption of
FLIF.

Future Work


Finish the encoder, which is currently being written.
Make the decoder able to stop at a given zoomlevel, interpolate the data and
produce a lower-quality image from an interlaced FLIF image.

Additional Points


The original paper on range coders by G. N. N. Martin was re-typeset in LaTeX by
us and is available here.
This file is mirrored here.