@cfra
Last active July 12, 2022 21:21
Converting RGB PNGs to H.264 with Correct Colors

Colors in Video

Disclaimer: I don't have a deep understanding of this topic, so take everything I have written up here with a grain of salt. This also is all from the perspective of 8 bit SDR video. HDR video is probably a whole other can of worms.

YCbCr and Chroma Subsampling

This describes my observations about converting RGB images to video with libx264 while trying to keep the colors matching.

Video is usually represented not in RGB, but in YCbCr, which uses a luminance (luma) channel plus two color-difference channels: a blue-difference channel (Cb) and a red-difference channel (Cr).

On one hand, this has historic reasons: Color was later added to existing black and white TVs.

On the other hand, it also resembles the way that our eyes perceive visual information: We have a high resolution when it comes to seeing differences in brightness, but for colors, the resolution is lower.

This aspect of our perception is taken advantage of in chroma subsampling: While luminance is transmitted separately for each pixel, the color information is only sent for every other pixel or sometimes even only once per four pixels. Using e.g. 4:2:0 chroma subsampling at 8 bits per sample, only 12 bits per pixel are needed compared to the 24 bits which are needed when full images (4:4:4) are transmitted, while the perceived quality is still quite high.

Incidentally, 4:2:0 chroma subsampling is commonly used for standard video applications.
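To make the arithmetic behind those numbers concrete, here is a small Python sketch. It assumes the usual J:a:b notation, read as: in a block J pixels wide and 2 rows high, each chroma channel has `a` samples in the first row and `b` further samples in the second row. The function name `bits_per_pixel` is made up for this illustration.

```python
from fractions import Fraction

def bits_per_pixel(j: int, a: int, b: int, bit_depth: int = 8) -> Fraction:
    """Average bits per pixel for J:a:b subsampling, counted over a J x 2 pixel block."""
    pixels = 2 * j                    # J pixels wide, 2 rows high
    luma_samples = pixels             # one luma sample for every pixel
    chroma_samples = 2 * (a + b)      # Cb and Cr each contribute a + b samples
    return Fraction((luma_samples + chroma_samples) * bit_depth, pixels)

for name, scheme in {"4:4:4": (4, 4, 4), "4:2:2": (4, 2, 2), "4:2:0": (4, 2, 0)}.items():
    print(name, bits_per_pixel(*scheme), "bits per pixel")
```

For 4:2:0 this works out to 12 bits per pixel and for 4:4:4 to 24, matching the figures above.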

Color spaces

To display YCbCr information on a screen, it usually needs to be converted back to RGB. As display technologies have evolved over time, there are different standards which are used to do this.

Most of these are ITU recommendations. For HD-TV, BT.709 is used. For UHD it is BT.2020.

For SD-TV, I am a bit uncertain: there seems to be BT.470, which exists both as BT.470bg and BT.470m, and there are also various flavors of BT.601. This might be related to different standards being used in PAL and NTSC.
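As far as the YCbCr conversion itself is concerned, what differs between these standards is a pair of luma coefficients, usually called Kr and Kb. The Python sketch below is only an illustration on normalized full-range values, not what any particular encoder does internally. The coefficients are the ones published in BT.601, BT.709 and BT.2020, and BT.470bg shares the BT.601 values. The names `LUMA_COEFFS` and `rgb_to_ycbcr` are made up for this example.

```python
# Luma coefficients (Kr, Kb) per standard; Kg follows as 1 - Kr - Kb.
LUMA_COEFFS = {
    "bt601": (0.299, 0.114),      # same matrix coefficients as BT.470bg
    "bt709": (0.2126, 0.0722),
    "bt2020": (0.2627, 0.0593),
}

def rgb_to_ycbcr(r: float, g: float, b: float, standard: str):
    """Convert gamma-encoded RGB in [0, 1] to Y in [0, 1] and Cb, Cr in [-0.5, 0.5]."""
    kr, kb = LUMA_COEFFS[standard]
    kg = 1.0 - kr - kb
    y = kr * r + kg * g + kb * b
    cb = (b - y) / (2.0 * (1.0 - kb))   # scaled so that pure blue gives Cb = 0.5
    cr = (r - y) / (2.0 * (1.0 - kr))   # scaled so that pure red gives Cr = 0.5
    return y, cb, cr

# The same pure green pixel gets different YCbCr values under each standard.
for standard in LUMA_COEFFS:
    y, cb, cr = rgb_to_ycbcr(0.0, 1.0, 0.0, standard)
    print(f"{standard}: Y={y:.4f} Cb={cb:.4f} Cr={cr:.4f}")
```

Because the same RGB color maps to different YCbCr triplets, a decoder has to apply the inverse of the matrix that was used for encoding to get the original color back.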

Standardized media formats like DVD, Blu-ray, or the DVB flavours each have a color space that players expect. For example, encoding for a Blu-ray with BT.470bg will probably lead to incorrect results when it is played in a Blu-ray player.

On a PC, players will often guess the color space from the resolution of the material if it is not specified. This might lead to incorrect results, so it seems sensible to provide the information about the used color space in the meta-data of the generated file.

Where this is stored varies between codecs. Some store it directly in the bitstream, some only in the container.

In addition to the color space (the matrix used for the YCbCr conversion), the standards named also define color primaries and transfer characteristics. Those should also be set in the meta-data of the file.

I have not put further research into when and why one would use different settings for colorspace, transfer characteristics and color primaries and have usually set them all to the same standard I was using.

Color ranges

To accommodate overshoot in analog filters and to allow embedding of control information, video has traditionally only used luminance values between 16 and 235, where 8 bits would theoretically allow for values between 0 and 255.

Even now in the digital age, 8-bit video still usually uses this so-called video range between 16 and 235.

In contrast, when storing still images, e.g. with JPEG, usually the full 8-bit range from 0 to 255 is used.

Which one is used can and should also be stored in the meta-information of the file.

You can encode a video with full levels between 0 and 255. This will provide slightly better quality and less banding if played back correctly. But if the player doesn't support it, or you are missing the meta-information, your blacks will probably be too dark and your whites will be all blown out because they are too bright.
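The following Python sketch illustrates the two ranges for 8-bit luma and what a range mismatch does to the extremes. It is a simplified model with hypothetical function names and it only covers luma (chroma uses 16 to 240 in video range). Real players may round or dither differently.

```python
def full_to_limited(y_full: int) -> int:
    """Map 8-bit full-range luma (0..255) to video range (16..235)."""
    return round(16 + y_full * 219 / 255)

def limited_to_full(y_limited: int) -> int:
    """Expand video-range luma (16..235) to 0..255, clipping anything outside."""
    expanded = round((y_limited - 16) * 255 / 219)
    return max(0, min(255, expanded))

# Correct handling: black and white land on the ends of the video range.
print(full_to_limited(0), full_to_limited(255))

# Mismatch: full-range values are stored untouched, but the player assumes
# video range and expands them. Dark and bright details collapse to 0 and 255.
for stored in (0, 10, 16, 128, 235, 245, 255):
    print(stored, "->", limited_to_full(stored))
```

In the mismatch case every stored value below 16 ends up as pure black and every value above 235 as pure white, which is the crushed-blacks and blown-out-whites effect described above.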

Putting it together

Now, to convert my PNG to a video, I did the following:

# Don't do this
ffmpeg -loop 1 -i color-bars.png -c:v libx264 -b:v 5M -t 15 \
    -pix_fmt yuv420p \
    out.mp4

My color bars were 1080p and this resulted in wrong colors. What happened?

It turns out that ffmpeg, when converting to YCbCr (for some reason, they call it yuv), is apparently using a transformation matrix that follows BT.470bg. As the content was 1080p though and there was no meta-data defining the color space, my player was assuming the colors should be transformed according to BT.709, leading to incorrect results.

Indeed, if I use the same command line, but scale down my color-bars to PAL widescreen, colors show correctly, because the player assumes BT.470bg.

There are two problems here: I did not provide any information about the colorspace in the meta-data, and the assumed/common colorspace for this type of file is different from the one that was used.
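The effect of that mismatch can be reproduced numerically. The Python sketch below is a simplified model on normalized full-range values, with made-up helper names `encode` and `decode`. It converts pure green to YCbCr with the BT.601 coefficients (which BT.470bg shares) and converts it back once with the matching coefficients and once with BT.709, the way a player guessing from the resolution would.

```python
COEFFS = {"bt601": (0.299, 0.114), "bt709": (0.2126, 0.0722)}  # (Kr, Kb)

def encode(rgb, standard):
    """RGB in [0, 1] to (Y, Cb, Cr) using the given standard's coefficients."""
    r, g, b = rgb
    kr, kb = COEFFS[standard]
    y = kr * r + (1 - kr - kb) * g + kb * b
    return y, (b - y) / (2 * (1 - kb)), (r - y) / (2 * (1 - kr))

def decode(ycbcr, standard):
    """(Y, Cb, Cr) back to RGB, clipped to [0, 1] as a display would do."""
    y, cb, cr = ycbcr
    kr, kb = COEFFS[standard]
    r = y + 2 * (1 - kr) * cr
    b = y + 2 * (1 - kb) * cb
    g = (y - kr * r - kb * b) / (1 - kr - kb)
    return tuple(min(1.0, max(0.0, c)) for c in (r, g, b))

green = (0.0, 1.0, 0.0)
matched = decode(encode(green, "bt601"), "bt601")      # player uses the right matrix
mismatched = decode(encode(green, "bt601"), "bt709")   # player guesses BT.709

print("matched:   ", [round(c * 255) for c in matched])
print("mismatched:", [round(c * 255) for c in mismatched])
```

With matching coefficients the round trip returns pure green. With the mismatch, the green channel comes out noticeably darker, which is the kind of color shift visible on the color bars.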

So how can this be resolved?

One approach is marking the file with the appropriate meta-data:

ffmpeg -loop 1 -i color-bars.png -c:v libx264 -b:v 5M -t 15 \
    -pix_fmt yuv420p \
    -colorspace bt470bg \
    -color_trc gamma28 \
    -color_primaries bt470bg \
    -color_range mpeg \
    out.mp4

This tells the player that the file is using BT.470bg. (For some reason, the transfer characteristics for BT.470bg are named gamma28.) Also, for good measure, it tells the player we are using video range (-color_range mpeg).

The file generated this way plays back with correct colors on my player (mpv).
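To check that the tags really ended up in the file, independent of how a particular player renders it, the stream can be inspected with ffprobe, which ships with ffmpeg. The Python sketch below wraps that call. The helper names are hypothetical, and it assumes ffprobe is on the PATH and reports the stream fields color_space, color_transfer, color_primaries and color_range in its JSON output.

```python
import json
import subprocess

COLOR_FIELDS = ("color_space", "color_transfer", "color_primaries", "color_range")

def extract_color_tags(ffprobe_json: str) -> dict:
    """Pick the color-related fields of the first listed stream from ffprobe JSON."""
    streams = json.loads(ffprobe_json).get("streams", [])
    if not streams:
        return {}
    # Fields that ffprobe did not report are returned as None.
    return {field: streams[0].get(field) for field in COLOR_FIELDS}

def probe_color_tags(path: str) -> dict:
    """Run ffprobe on the first video stream of `path` and return its color tags."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=" + ",".join(COLOR_FIELDS),
         "-of", "json", path],
        capture_output=True, text=True, check=True,
    )
    return extract_color_tags(result.stdout)
```

Calling `probe_color_tags("out.mp4")` after encoding should then show whether the four options from the command line made it into the stream.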

If it were intended for a Blu-ray though, it might still be problematic, as BT.470bg is not common for HD content.

I am also not sure if all players on PCs really look at the meta-information, or if there are some which will always guess by the resolution.

So converting to BT.709 seems like a better approach.

That alone already solves the problem, because the guessed color space matches the one we are using, but let's still include the meta-data for good measure:

ffmpeg -loop 1 -i color-bars.png -c:v libx264 -b:v 5M -y -t 15 \
    -pix_fmt yuv420p \
    -vf "colormatrix=bt470bg:bt709" \
    -colorspace bt709 \
    -color_trc bt709 \
    -color_primaries bt709 \
    -color_range mpeg \
    out.mp4