daujerrine/gsoc-2020-ffmpeg-flif.md

GSOC 2020: Writing a FLIF Decoder and Encoder for FFmpeg

Free Lossless Image Format, or FLIF, is the precursor to
JPEG XL and FUIF.
It claims better compression, that is, smaller files, than
conventional lossless image formats such as PNG and lossless JPEG 2000.
Development of FLIF and FUIF has stopped in favour of JPEG XL.
The task was to write an encoder and decoder for this format for FFmpeg.
Authors

This project was undertaken by two people, which is apparently unusual for GSOC:

Anamitra Ghorui
Kartik K Khullar

Both had submitted independent proposals for the same project.
The public repository being used for communication during development is available
here.
Project Status as of 23rd August 2020

$ > git diff --stat HEAD~4
 Changelog                      |    3 +-
 configure                      |    1 +
 doc/general.texi               |    2 +
 libavcodec/Makefile            |    3 +
 libavcodec/allcodecs.c         |    1 +
 libavcodec/codec_desc.c        |    7 +
 libavcodec/codec_id.h          |    1 +
 libavcodec/flif16.c            |  204 +++
 libavcodec/flif16.h            |  287 ++++
 libavcodec/flif16_parser.c     |  193 +++
 libavcodec/flif16_rangecoder.c |  800 +++++++++++
 libavcodec/flif16_rangecoder.h |  400 ++++++
 libavcodec/flif16_transform.c  | 2895 ++++++++++++++++++++++++++++++++++++++++
 libavcodec/flif16_transform.h  |  124 ++
 libavcodec/flif16dec.c         | 1764 ++++++++++++++++++++++++
 libavcodec/parsers.c           |    1 +
 libavcodec/version.h           |    2 +-
 libavformat/Makefile           |    1 +
 libavformat/allformats.c       |    1 +
 libavformat/flifdec.c          |  431 ++++++
 libavformat/version.h          |    2 +-
 21 files changed, 7120 insertions(+), 3 deletions(-)


The decoder is complete and can decode all tested FLIF files with slightly
better performance and better memory efficiency than
the reference decoder.
The decoder has, however, not yet been merged into FFmpeg's upstream repository.
The encoder is now being written. It will not be completed within the GSOC
period.

List of Public Submissions of Source Code

This is a list of links to the iterations of the source code posted to the
development channels of FFmpeg. The latest iteration is version 7.
(Faulty patches are not shown.)
v3:
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/267742.html

v5:
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/267920.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/267921.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/267922.html

v6:
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268473.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268474.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268475.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268476.html

v7:
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268837.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268840.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268842.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268838.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268839.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268841.html
http://ffmpeg.org/pipermail/ffmpeg-devel/2020-August/268843.html

Author Comments

Regarding FLIF


FLIF's Reference Specification is inadequate for
actually writing a working codec for the format. Most of our understanding of
how the format works and how its data is laid out was derived from the
reference codec while writing our own.
FLIF's reference codec is written in C++. From personal observation, the whole
codec is poorly written, both cosmetically and logically. Although our
inexperience means that what we have written cannot yet be considered very
elegant, we believe that our code is, at the very least, more
organised and logically sound than the reference codec.
The disorganised manner in which the reference source code was written also
slowed our work.

Regarding FFmpeg

Developing for FFmpeg requires the interested party to read through a lot of the
existing code base and to refer to existing components of the same type as the
one to be developed, which I think is common and required for any
software project.
However, I think there could be a few more pointers on what a novice developer
has to do, such as being told to expect to read through a lot of source code,
and which files to refer to when developing a component.
In the initial stages of working on this project, I was at a loss as to
what to do and where to start. Understanding what a component even does took up
a large part of the time spent.
For example, the description of libavformat in FFmpeg's documentation says that it
deals with format demuxers, muxers and I/O, which means that, say for a video file,
the audio and video streams are separated (demuxed) and passed to decoders, or
joined together (muxed) to create the file. This is the primary purpose of libavformat,
but its usage of the term (de)muxer is broader than that.
A demuxer in libavformat is also responsible for probing the format,
reading the format's header, and, of course, deciding what is sent to the decoder
as a packet. All of this could have been inferred from the phrase 'I/O' in the
description, but I initially did not make the connection. This all makes sense
once one has a clear distinction between container formats (e.g.
all video files with sound, such as WebM) and encoding formats (the format of
an audio or a video stream), which I did have an intuitive sense of but
could not articulate. A lot of the confusion came from the meaning of the term
demuxer here, and whether it applies only to container formats or to encoding
formats as well. This brings us to the next section:
Regarding Us

Kartik has made his own comments here. Please check them out.
Maybe a lot of the above problems could have been avoided completely, saving us
time, if we had gone through the FFmpeg manpage thoroughly, as well
as the rest of the (non-API) documentation.
My (Anamitra) experience with FFmpeg before the project was limited to clipping
videos, extracting individual frames from GIFs, converting between image formats
and reducing the file sizes of videos, all of which I assume are the most common use
cases of FFmpeg. Therefore I had little more than a superficial
understanding of FFmpeg.
However, after working on the project for about half a month to a month, a lot of the
on-site documentation now seems obvious to me. This was after having
to go through the source code of the relevant files again and again, and after
writing a working template of the decoder by hand. The FFmpeg codec HOWTO
on MultimediaWiki does give useful pointers, but it could be updated to provide a
better description of how to write codecs for still image formats and GIF-like
formats, which was the main hurdle to understanding in the context of our project,
and in general a better anatomy of FFmpeg itself (which the manpages do provide,
but not very directly, and not in terms of the internal API). I
have started to write a document about this for my own reference.
In conclusion, we think there is a need for slightly better developer
documentation, with pointers for new developers and a description of what
to expect when developing for the code base. However, I think the
hurdles I encountered while developing for FFmpeg (and still encounter) are
necessary for any developer to pass through in order to learn two important things:

How to discover things by oneself.
How to deduce, derive and figure out things by either testing or simply
reading through the code.

Talking with others (other than ourselves) never really took center stage
during the project (the size and nature of the project had a hand in this, and
for the same reason we could not send out patches frequently). Most of what we
achieved came from reading through the source code of FLIF and FFmpeg.
Regarding the Project

Kartik has made his own comments here. Please check them out.

Time spent on the project, as can be inferred from the previous sections,
consisted in large part of deducing how the format is actually encoded
from the reference decoder.
A large chunk of time was lost in understanding and translating the code of the
interlaced decoder. The reference decoder implements it in a very convoluted
manner, orders of magnitude more convoluted than the non-interlaced
decoder. A few redundant functions that were not even used in decoding were
present. This was during July.

Technical Comments

One of the main decisions made while writing the decoder was to structure it
as a state machine, so that it can save its state between intermittent
packets. This was done because FLIF does not provide a simple method to
determine the end of the file bitstream, since the entirety of the image data
is entropy coded. Additionally, the whole frame data is interleaved,
which makes it impossible to split the data into frames without reading the
whole of the image data first.
The reference decoder allocates frame data for duplicate frames and copies
frame data into them. We could not find a reason for doing this. So,
at the very least, in this aspect, our decoder is more memory efficient.
The reference decoder crashes on certain files while trying to decode eXmp
metadata. This does not happen with our decoder.
We may eventually rewrite the FLIF specification so that the components are
much clearer.
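
The state machine approach mentioned in the first point can be illustrated with a minimal sketch. It is not taken from our decoder and does not use FFmpeg's API or the FLIF bitstream: the format is a toy one invented here (a 4-byte magic, a 1-byte value count, then that many variable-length integers), and every name in it (ToyDecoder, toy_decode, the TOY_* codes) is hypothetical. What it shares with the real problem is that the length of the data in bytes is not known in advance, so the decoder has to keep its state in a context struct and resume wherever the previous packet ended.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return codes of the toy decoder (hypothetical names). */
enum { TOY_ERROR = -1, TOY_NEED_MORE = 0, TOY_DONE = 1 };

/* Decoding phases. The current phase survives between packets. */
enum toy_state { ST_MAGIC = 0, ST_COUNT, ST_VALUES, ST_DONE };

typedef struct ToyDecoder {
    enum toy_state state;
    unsigned magic_pos;  /* magic bytes matched so far */
    unsigned remaining;  /* values still to be decoded */
    uint32_t partial;    /* bits of the value currently being assembled */
    uint64_t sum;        /* stand-in for the decoded output */
} ToyDecoder;

/* Zeroing the context also sets state to ST_MAGIC. */
static void toy_init(ToyDecoder *d)
{
    memset(d, 0, sizeof(*d));
}

/*
 * Consume one packet. Returns TOY_NEED_MORE when the packet runs out
 * before the stream ends; everything needed to resume is kept in *d.
 */
static int toy_decode(ToyDecoder *d, const uint8_t *buf, size_t size)
{
    static const uint8_t magic[4] = { 'T', 'O', 'Y', '1' };
    size_t i = 0;

    assert(d && (buf || size == 0));

    while (d->state != ST_DONE) {
        uint8_t b;

        if (i == size)
            return TOY_NEED_MORE;
        b = buf[i++];

        switch (d->state) {
        case ST_MAGIC:
            if (b != magic[d->magic_pos])
                return TOY_ERROR;
            if (++d->magic_pos == 4)
                d->state = ST_COUNT;
            break;
        case ST_COUNT:
            d->remaining = b;
            d->state = b ? ST_VALUES : ST_DONE;
            break;
        case ST_VALUES:
            /* 7 data bits per byte; a set top bit means more bytes follow. */
            d->partial = (d->partial << 7) | (b & 0x7F);
            if (!(b & 0x80)) {
                d->sum += d->partial;
                d->partial = 0;
                if (--d->remaining == 0)
                    d->state = ST_DONE;
            }
            break;
        default:
            return TOY_ERROR;
        }
    }
    return TOY_DONE;
}
```

A caller would invoke toy_init once and then pass each incoming packet to toy_decode until it stops returning TOY_NEED_MORE. Packet boundaries may fall anywhere, including in the middle of a value, because the partially assembled value lives in the context rather than in local variables.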
Benefits of the Project


JPEG XL uses an entropy coding method similar to MANIAC, and probably shares a
lot of other similar components as well. If work on implementing JPEG XL in
FFmpeg is initiated sometime in the future, this codec can be used as a base or
reference for doing it.
Although very improbable, the project may lead to slightly wider adoption of
FLIF.

Future Work


The encoder, which is being written.
As mentioned before, I (Anamitra) have started to write a document about the
anatomy of FFmpeg. This is mostly for my own reference and learning, but I
hope it will be of help to others.
Making the decoder stop at a certain zoomlevel, interpolate the data and
produce a lower-quality image for an interlaced FLIF image.

Additional Points


The original paper on range coders by G. N. N. Martin was retypeset in LaTeX by
us and is available here.
This file is mirrored here.