danilodequeiroz/10-Bit H.264

## 10-Bit H.264

For all those who haven’t heard of it already, here’s a quick rundown about the
newest trend in making our encodes unplayable on even more systems: So-called
high-bit-depth H.264. So, why another format, and what makes this stuff
different from what you know already?

First off: What is bit depth?
In short, bit depth is the level of precision that’s available for storing color
information. The encodes you’re used to have a precision of 8 bits (256 levels)
per color channel. There are usually three color channels, so that makes a bit
depth of 24 bits per pixel, which is also the most commonly used bit depth of
modern desktop PCs. Now, you can use a higher bit depth for video encoding, and
x264 currently allows up to 10 bits per channel (1024 levels and 30
bits per pixel), and of course that allows for much higher precision.
But: Most graphics cards and display devices don’t allow more than 24 bits per
pixel.
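
To make the numbers above concrete, here is a tiny Python sketch of the
arithmetic. The function names are made up for illustration, and it ignores
chroma subsampling just like the simplified description above does.

```python
def levels(bits_per_channel: int) -> int:
    """Number of distinct values a single color channel can store."""
    return 2 ** bits_per_channel


def bits_per_pixel(bits_per_channel: int, channels: int = 3) -> int:
    """Total bits per pixel, assuming every channel is stored at full resolution."""
    return bits_per_channel * channels


for bits in (6, 8, 10):
    print(bits, levels(bits), bits_per_pixel(bits))
# 6 64 18
# 8 256 24
# 10 1024 30
```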

This makes higher bit depth sound pretty pointless, so why are we doing this?
Here’s a bit of side info: Most LCD displays (TN panels to be precise) can only
represent a bit depth of 6 bits per channel (a mere 64 levels). This would look
pretty awful under normal circumstances, so these displays use a little trick
called “dithering” to simulate a bit depth of 8 bits per channel. In simplified
terms, this means that the panel’s controller quickly alternates between the
nearest colors in a dynamic pattern. When done correctly, this creates the
illusion of a higher color accuracy than what the panel is actually capable of
displaying.
The exact same trick can be used to display high-bit-depth encodes.
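
As a rough illustration of that alternating trick, the sketch below fakes an
8-bit level on a 6-bit panel by switching between the two nearest 6-bit levels
over four refreshes. This is a toy model with hypothetical function names; real
panel controllers use more elaborate patterns than this.

```python
def temporal_dither(target_8bit, frames=4):
    """Approximate an 8-bit level by alternating between the two nearest
    6-bit levels. One 6-bit step covers four 8-bit steps."""
    low, remainder = divmod(target_8bit, 4)
    high = min(low + 1, 63)  # the top 6-bit level has no upper neighbor
    # Show the higher level in `remainder` out of every 4 frames.
    return [high if (i % 4) < remainder else low for i in range(frames)]


def perceived_8bit(sequence):
    """Time-averaged level, scaled back to the 8-bit range."""
    return sum(sequence) / len(sequence) * 4


frames = temporal_dither(130)
print(frames)                  # [33, 33, 32, 32]
print(perceived_8bit(frames))  # 130.0
print((130 // 4) * 4)          # 128, what plain truncation would show
```

Averaged over time, the eye sees something close to level 130 even though the
panel only ever shows 32 or 33.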

But by that logic, couldn’t we just encode with 8 bits and hardcode that
dithering?
Of course that’s possible, and in fact we’re already doing this to prevent
so-called banding (http://en.wikipedia.org/wiki/Colour_banding).
But this also has a big drawback: The bitrate required to keep the dithering
intact is disproportionately high.

This brings us to the real advantage of higher bit depths: We can save bandwidth
even if the source only uses 8 bits per channel.
That’s right: Not only do we no longer need to hardcode any dithering, but higher
bit depth also means higher error tolerance. Losing one bit of information in
an 8-bit color space is equivalent to losing three bits in a 10-bit color space
(either way, 7 bits of precision remain), so the encoder has more room for error
and the same quality can be achieved with less bitrate. Want an example?
One of my first tests was encoding episode 13 of Shakugan no Shana from a DVD
source, with dithering added to prevent banding. I used the exact same input and
settings for both encodes.
The video track of the 8-bit encode is 275 MiB, while the 10-bit encode is no
more than 152 MiB and doesn’t look worse at all -- in fact, it even looks better
than the much larger 8-bit encode.
Now, if I hadn’t hardcoded the dithering for the 10-bit encode and instead
passed a high-bit-depth picture to x264, it would’ve resulted in even better
perceived quality and an even smaller file size!
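
If you want to check the arithmetic behind the error tolerance claim and the
test encode, here is a small Python sketch. The helper name is hypothetical;
the file sizes are the ones quoted above.

```python
def step_after_loss(bit_depth: int, bits_lost: int) -> float:
    """Smallest step that survives after losing the lowest `bits_lost` bits,
    expressed as a fraction of the full value range."""
    return 2 ** bits_lost / 2 ** bit_depth


# One lost bit at 8-bit costs as much precision as three lost bits at 10-bit.
print(step_after_loss(8, 1))   # 0.0078125 (1/128)
print(step_after_loss(10, 3))  # 0.0078125 (1/128)

# Size reduction in the test encode: 275 MiB (8-bit) vs. 152 MiB (10-bit).
saving = 1 - 152 / 275
print(f"{saving:.1%}")         # 44.7%
```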

That’s terrific, but there has to be a catch to this, right?
Unfortunately, yes. Software support is currently lacking in a lot of places,
but it’s being worked on. Decoders that don’t support higher bit depths don’t
simply fail to decode anything, but decode wrong information, which leads to
really annoying artifacts: http://screenshots.srsfckn.biz/10bit-decodefail.png
Also note that none of the available hardware-accelerated decoders (VDPAU, DXVA,
CUVID, etc.) support this.

Currently, you have the following options for playing such content:
    1. MPlayer2 (cross-platform, Windows builds at http://mplayer2.srsfckn.biz)
       You might want to use SMPlayer as GUI (http://smplayer.srsfckn.biz)
    2. VLC (cross-platform, use the nightly builds at
            http://nightlies.videolan.org/build/win32/last/)
       It’s not as bad as it used to be, seriously.
    3. CCCP Beta (http://www.cccp-project.net/beta/)
       Note that this is currently a CCCP exclusive feature, so you will not get
       this by simply installing the most recent ffdshow-tryouts.

And what does this all mean for my precious fansubs?
It means that we’re doing dual encodes until compatible software is more readily
available (i.e. CCCP supports it in a release build), but it also implies the
following:
    1. much smaller encodes with the same or better perceived quality
    2. slightly smaller but better looking encodes
    3. same file size but much better quality, right up to transparency
        (http://en.wikipedia.org/wiki/Transparency_(data_compression))

So, things can only get better! I’ll keep you posted.

============================JEEB’s Rant=================================

Just a quickie on current 10bit H.264 support:

- ffmpeg/libav have had it for months now (made by irock, and they now have asm
optimizations by Jumpyshoes)
- mplayer(2) has had support for some time now (these builds recommended, can be
used with smplayer if you need a front-end)
- VLC will have it in their next release (you can test with nightlies from here)
- Lord patched it into FFDShow-tryouts (and I undumbed its swscale usage flags
so that RGB output wouldn’t look like crap). It should work fine’ish, although
we are still scratching off some rough edges. For example, we seem to have
stumbled onto a bug in VSFilter: its internal color conversions are not as
correct as they could be. Of course, whether or not the effects of this bug are
visible to people is a whole separate affair. Regardless, we’re working on it.

What is this whole “10bit” affair?
Bit depths higher than 8bit are part of the H.264 standard, but until now they were
mostly used in the “professional” zone. It’s not really anything new, and there actually
was at least one DirectShow decoder for it available on the internet before libavcodec
got one (trivia: MainConcept’s broadcast decoder). It just wasn’t picked up by the
media companies for the masses: with Blu-ray, the choice was to stick with 8bit and
just hit the source with immense amounts of bitrate. As a result, no open source
developer had put it on his or her TODO list until irock added 10bit encoding
routines to x264 during last year’s GSoC program.

Contrary to what probably comes to mind first when you think about “higher bit
depth in color”, its biggest merit for most people is not the ability to keep
10bit things 10bit (most people have pretty much no way of getting such content
in the first place), nor the fact that you could use hyper special rendering
straight onto a 30bit display or whatever. It’s compression.

Even if your source is originally 8bit, encoding it in 10bit will make the output
suffer less from various compression artifacts. (This applies to lossy compression,
of course. With lossless compression the “redundant” data would actually start biting
us, although the output wouldn’t be identical to the 8bit source in that case
either.) In layman’s terms, this means that lossy compression will be more efficient
in leaving things pretty, leading to smaller files looking better in the end
(Ateme’s PDF on this).

Not to mention that even if one converts the 10bit picture into an 8bit one to make
it easier to deal with (for such stuff as playback etc.), the difference is usually
minuscule (after all, we are in the same 4:2:0 colorspace), or it might even look
better, as some ways of conversion use dithering in the process.
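
To give an idea of what such a dithered conversion can look like, here is a
minimal Python sketch that compares plain truncation with a simple 1-D error
diffusion. This is an illustrative stand-in with made-up function names, not
what swscale or any particular player actually does.

```python
def to_8bit_truncate(samples_10bit):
    """Drop the two lowest bits of every 10-bit sample."""
    return [s >> 2 for s in samples_10bit]


def to_8bit_dither(samples_10bit):
    """Round to 8 bits and carry each sample's rounding error over to the
    next sample, so the average level is preserved."""
    out = []
    error = 0
    for s in samples_10bit:
        wanted = s + error
        q = max(0, min(255, (wanted + 2) // 4))  # round to nearest, then clip
        out.append(q)
        # Keep the carried error bounded even when q had to be clipped.
        error = max(-2, min(2, wanted - 4 * q))
    return out


flat = [514] * 8               # 10-bit level that sits between 8-bit 128 and 129
print(to_8bit_truncate(flat))  # [128, 128, 128, 128, 128, 128, 128, 128]
print(to_8bit_dither(flat))    # [129, 128, 129, 128, 129, 128, 129, 128]
```

The dithered row averages out to 128.5, which matches the original 10bit level,
while truncation flattens everything to 128.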