Converting audio to AAC with Fraunhofer FDK AAC (libfdk_aac) in FFmpeg

Check if you have an FFmpeg build supporting libfdk_aac

Run:

ffmpeg -hide_banner -h encoder=libfdk_aac

If you have an FFmpeg version that does not include libfdk_aac, you will see this:

Codec 'libfdk_aac' is not recognized by FFmpeg.

If you have a build that includes libfdk_aac you will see this:

Encoder libfdk_aac [Fraunhofer FDK AAC]:
    General capabilities: delay small 
    Threading capabilities: none
    Supported sample rates: 96000 88200 64000 48000 44100 32000 24000 22050 16000 12000 11025 8000
    Supported sample formats: s16
    Supported channel layouts: mono stereo 3.0 4.0 5.0 5.1 7.1(wide) 7.1
libfdk_aac AVOptions:
  -afterburner       <int>        E...A...... Afterburner (improved quality) (from 0 to 1) (default 1)
  -eld_sbr           <int>        E...A...... Enable SBR for ELD (for SBR in other configurations, use the -profile parameter) (from 0 to 1) (default 0)
  -eld_v2            <int>        E...A...... Enable ELDv2 (LD-MPS extension for ELD stereo signals) (from 0 to 1) (default 0)
  -signaling         <int>        E...A...... SBR/PS signaling style (from -1 to 2) (default default)
     default         -1           E...A...... Choose signaling implicitly (explicit hierarchical by default, implicit if global header is disabled)
     implicit        0            E...A...... Implicit backwards compatible signaling
     explicit_sbr    1            E...A...... Explicit SBR, implicit PS signaling
     explicit_hierarchical 2            E...A...... Explicit hierarchical signaling
  -latm              <int>        E...A...... Output LATM/LOAS encapsulated data (from 0 to 1) (default 0)
  -header_period     <int>        E...A...... StreamMuxConfig and PCE repetition period (in frames) (from 0 to 65535) (default 0)
  -vbr               <int>        E...A...... VBR mode (1-5) (from 0 to 5) (default 0)
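The same check can be scripted. The following is a minimal sketch, not part of the original gist: `has_libfdk_aac` is a hypothetical helper name, and it simply looks for the `Encoder libfdk_aac` header line shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given help text describes libfdk_aac,
# i.e. it contains a line starting with "Encoder libfdk_aac".
has_libfdk_aac() {
  printf '%s\n' "$1" | grep -q '^Encoder libfdk_aac'
}

# Usage (requires ffmpeg on PATH). 2>&1 also captures the
# "not recognized" message, in case it is written to stderr:
#   help_text=$(ffmpeg -hide_banner -h encoder=libfdk_aac 2>&1)
#   if has_libfdk_aac "$help_text"; then
#     echo "libfdk_aac available"
#   else
#     echo "libfdk_aac missing"
#   fi
```

Passing the help text as an argument keeps the helper independent of how ffmpeg is invoked on a given system.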

How to get an FFmpeg build with libfdk_aac

FFmpeg supports two AAC-LC encoders (aac and libfdk_aac) and one HE-AAC (v1/v2) encoder (libfdk_aac). The license of libfdk_aac is not compatible with the GPL, so binaries that combine it with GPL-licensed code cannot be distributed. The encoder has therefore been designated "non-free", and you cannot download a pre-built ffmpeg that supports it. The solution is to compile ffmpeg yourself.
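As a rough sketch of what compiling it yourself involves on a Unix-like system: FFmpeg's configure script has the switches `--enable-libfdk-aac` and `--enable-nonfree`. The function names below are hypothetical, and the sketch assumes that the FFmpeg source tree is the current directory and that the fdk-aac library and headers are already installed. Other configure options you may need are not shown.

```shell
#!/bin/sh
# Sketch only: assumes a Unix-like shell, the FFmpeg source tree as the
# current directory, and the fdk-aac library plus headers already installed.

# Hypothetical helper: print the configure switches relevant to libfdk_aac.
ffmpeg_configure_flags() {
  printf '%s\n' "--enable-libfdk-aac --enable-nonfree"
}

# Hypothetical helper: configure and build. Not called automatically.
build_ffmpeg() {
  # $(...) is left unquoted on purpose so the flags split into two arguments.
  ./configure $(ffmpeg_configure_flags) && make
}
```

Calling `build_ffmpeg` from inside the source tree runs the configure step and then the build. The resulting binary is for personal use only, for the licensing reason given above.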

My way of building a custom FFmpeg

I set up a clean install of Windows 10 in a VM and run media-autobuild_suite there: https://github.com/m-ab-s/media-autobuild_suite

My go-to preset for highest quality regardless of file size

ffmpeg -i input.wav -ac 2 -c:a libfdk_aac -cutoff 20000 -afterburner 1 -vbr 0 output.m4a

-ac 2 Downmix to a stereo track

-c:a libfdk_aac Use Fraunhofer FDK AAC (libfdk_aac).

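-cutoff 20000 Set the low-pass filter cutoff to 20 kHz, the maximum available. Without this option libfdk_aac applies its own default cutoff, around 14 kHz here, although recent versions use about 19 kHz (see the comments below).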

-afterburner 1 Afterburner is "a type of analysis by synthesis algorithm which increases the audio quality but also the required processing power." Fraunhofer recommends always activating this feature. 1 = on, 0 = off.

-vbr 0 Setting -vbr to 0 (the default) disables VBR (variable bitrate), so libfdk_aac encodes at a constant bitrate (CBR). The bitrate is then taken from -b:a, or from the encoder's default when -b:a is not given, as in this command. It is not automatically the maximum. For the highest VBR quality use -vbr 5, and for high-bitrate CBR set -b:a explicitly. Both increase the file size.
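For comparison, here is a small sketch of how the two rate-control styles are usually selected: -vbr with a mode from 1 to 5 (the range shown in the encoder help above), or -b:a with an explicit constant bitrate. `fdk_command` is a hypothetical helper that only prints the command line, so nothing is encoded until you run the printed command yourself. The 256k bitrate is an arbitrary example value, not a recommendation from the gist.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) an ffmpeg command for libfdk_aac.
#   $1 = "vbr" or "cbr"
#   $2 = VBR mode (1-5) for "vbr", or a bitrate such as 256k for "cbr"
#   $3 = input file (assumed to contain no spaces in this sketch)
fdk_command() {
  mode=$1
  value=$2
  input=$3
  output="${input%.*}.m4a"   # replace the input extension with .m4a
  case "$mode" in
    vbr) rate_args="-vbr $value" ;;
    cbr) rate_args="-b:a $value" ;;
    *)   echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  printf 'ffmpeg -i %s -c:a libfdk_aac %s %s\n' "$input" "$rate_args" "$output"
}

# Usage:
#   fdk_command vbr 5 input.wav     # highest VBR mode
#   fdk_command cbr 256k input.wav  # constant bitrate of 256 kbit/s
```

Printing the command first makes it easy to review the options before committing to a long batch of encodes.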

@ddelange

Thanks for sparring with me :) Wonder if we can find the source of the discrepancy!

https://http.cat/417

@joshbarrass

When using the latest version of ffmpeg/libfdk, I see the same ~19kHz cutoff.

Digging into libfdk's source code and assuming I'm understanding correctly, this diff seems to show that the cutoff was changed in 2020 (my version of ffmpeg must be older than I thought!) to 19293Hz in order to improve audio quality. Page 17 of this document explains why, and the answer agrees with my assumption earlier in the thread: the higher the cutoff, the more frequencies you have to represent with the same amount of data, giving you lower quality overall for the sake of saving near-imperceptible frequencies. The cutoffs are chosen based on listening tests to maximise perceived quality. As with the ffmpeg documentation, they also strongly recommend keeping the default cutoffs for this reason. If they've changed these cutoffs, they are presumably working from more recent listening tests that show a better result.

The more you know :)

@ddelange

What a rabbit hole :) awesome find!

@ScribbleGhost
Author

I haven't had the time to comment on any of this, but I am glad to see this is relevant to you guys. For me, I am not that interested in which AAC encoder produces better quality than the other. All lossy encoders produce low quality. If I want quality I go for FLAC. But that's just me 🐵

@AdventurerRussia
Copy link

image
can you tell me where it's better?
