arnabanimesh/Simple SVT-AV1 Beginner Guide Part 1.md

## Simple SVT-AV1 Beginner Guide Part 1.md

      
Since we're dealing with simpler AV1 encoding, we'll be eschewing aomenc-av1: it requires 2-pass encoding to take full advantage of it, and I use it externally since I have access to a special build.
Now, let's get on with the simple guide.
You'll first need to be reasonably competent with command-line builds, or use a recent, up-to-date ffmpeg GUI with support for SVT-AV1.
As such, I would recommend getting a master git ffmpeg build for the operating system of your choice right here:
https://github.com/BtbN/FFmpeg-Builds
Make sure to update periodically if possible to pick up speed and quality improvements in the encoder.
Something like this should get you started:

For live-action:
ffmpeg -i input.mkv -c:v libsvtav1 -crf 20 -preset 6 -g 240 -svtav1-params tune=0:enable-overlays=1:scd=1:scm=0 -pix_fmt yuv420p10le -c:a copy output.mkv

For anime:
ffmpeg -i input.mkv -c:v libsvtav1 -crf 20 -preset 6 -g 240 -svtav1-params tune=0:enable-overlays=1:scd=1 -pix_fmt yuv420p10le -c:a copy output.mkv

Basic explanations for the choice of settings


Preset 6 should be a good starting point for quality/speed.
If you want to bias more towards quality, you can try 3-5.
If you want to bias more towards speed, you can try 6-8.
If you need realtime or much faster than realtime encoding, you can go from 8-12.
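
One way to find your own quality/speed trade-off is to encode the same short clip at a few presets and compare encode time, file size and how it looks. The following is only a dry-run sketch: it prints the ffmpeg commands instead of running them. The function name `preset_sweep`, the 30-second sample length (`-t 30`) and the output file names are placeholders of mine, not something from the commands above.

```sh
#!/bin/sh
# Dry run: print one ffmpeg command per preset so the same clip can be
# encoded at several speed/quality points and compared afterwards.
# preset_sweep and the output naming are hypothetical, made up for this guide.
preset_sweep() {
    input=$1
    shift
    for preset in "$@"; do
        # -t 30 limits the test encode to the first 30 seconds of the input,
        # -an drops the audio since only the video is being compared.
        echo "ffmpeg -i $input -t 30 -c:v libsvtav1 -crf 20 -preset $preset -g 240 -pix_fmt yuv420p10le -an test_preset_$preset.mkv"
    done
}

# A quality-leaning, a balanced and a speed-leaning candidate.
preset_sweep input.mkv 4 6 8
```

Copy the printed commands into your terminal to actually run the encodes, then compare the resulting files.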


I always recommend using 10-bit (-pix_fmt yuv420p10le) no matter what. It provides a good quality boost in all cases, particularly in darker shades and noisy content.
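
To confirm that an encode really came out as 10-bit, you can read the pixel format back with ffprobe. This is a small sketch, assuming ffprobe is installed alongside ffmpeg and that the file is called output.mkv as in the commands above. `is_10bit` is a hypothetical helper name, not part of ffmpeg.

```sh
#!/bin/sh
# is_10bit: succeed if a pixel format name describes 10-bit data,
# e.g. yuv420p10le. Hypothetical helper, not part of ffmpeg.
is_10bit() {
    case $1 in
        *p10le|*p10be) return 0 ;;
        *) return 1 ;;
    esac
}

# Only query the file if ffprobe and the encode are actually present.
if command -v ffprobe >/dev/null 2>&1 && [ -f output.mkv ]; then
    # Print just the pixel format of the first video stream.
    pix_fmt=$(ffprobe -v error -select_streams v:0 \
        -show_entries stream=pix_fmt \
        -of default=noprint_wrappers=1:nokey=1 output.mkv)
    if is_10bit "$pix_fmt"; then
        echo "output.mkv is 10-bit ($pix_fmt)"
    else
        echo "output.mkv is not 10-bit ($pix_fmt)"
    fi
fi
```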


tune=0 is a more psycho-visually driven encoder tune: the alternative reference frame temporal filtering window (how many frames) and filtering strength are reduced, CDEF and LR become noise-based (lots of noise = they get disabled), sharper encoding modes are favored, scenecut detection becomes more psy-biased, and higher-bitrate performance is prioritized.


enable-overlays=1 activates overlay frames. These essentially fill in the detail that is missing when alt-ref frames are used for prediction, by storing the higher-frequency information at a reduced data rate.


scd=1 biases the encoder's scene detection to insert more intra refreshes if needed, and to reduce temporal dependencies around scene changes.
Unlike in older SVT-AV1 versions (2021 and older), this does not affect the keyframe insertion logic, which is still performed as normal.


scm=X sets the screen content mode decision making. 2 is the default: automatic detection based on the preset.
1 turns it on fully, greatly boosting detection and the use of screen content tools, and 0 turns it off.
0 is best for live-action, 2 is best overall, and 1 is good if you know you'll only be encoding screen content (extremely simple animated shows like Peppa Pig, and computer slides).
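
The scm advice above can be folded into a tiny helper that picks the -svtav1-params string from the kind of content. This is just a sketch: `svt_params` and the content-type labels are names I made up for illustration, and the rest of the parameter string is the one from the example commands.

```sh
#!/bin/sh
# svt_params: print an -svtav1-params string for a given content type.
# The function and the content-type labels are hypothetical.
svt_params() {
    base="tune=0:enable-overlays=1:scd=1"
    case $1 in
        live-action) echo "$base:scm=0" ;;  # screen content tools off
        screen)      echo "$base:scm=1" ;;  # screen content tools fully on
        *)           echo "$base" ;;        # scm left at its default of 2 (auto)
    esac
}

for kind in live-action anime screen; do
    echo "$kind: -svtav1-params $(svt_params "$kind")"
done
```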


NOTE: When encoding at higher bitrates than this, you might want to try disabling temporal filtering for alt-ref frames (--enable-tf 0 for SVT-AV1 itself, or enable-tf=0 inside -svtav1-params for ffmpeg), which reduces the temporal filtering of alt-refs to the absolute minimum (or disables it entirely). This increases the potential for fidelity at high bitrates, but it has some adverse side effects if you don't lower the quantization enough: over-filtering alt-refs can cause a quality drop even with overlay frame compensation, while under-filtering means you'll either spend more bitrate (acceptable) or just get a drop in quality. Test it out and see what you get.
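
As a starting point for that test, here is a sketch that prints two otherwise identical commands, one with temporal filtering left at its default and one with enable-tf=0, so you can encode both and compare. The CRF of 14 is an arbitrary example of a "higher bitrate" setting, and `tf_test_cmd` and the output file names are placeholders of mine.

```sh
#!/bin/sh
# tf_test_cmd: print an ffmpeg command with alt-ref temporal filtering
# either left at its default ("on") or disabled ("off").
# Hypothetical helper; crf 14 is only an example of a higher-bitrate target.
tf_test_cmd() {
    params="tune=0:enable-overlays=1:scd=1"
    if [ "$1" = "off" ]; then
        params="$params:enable-tf=0"
    fi
    echo "ffmpeg -i input.mkv -c:v libsvtav1 -crf 14 -preset 6 -g 240 -svtav1-params $params -pix_fmt yuv420p10le -c:a copy tf_$1.mkv"
}

tf_test_cmd on
tf_test_cmd off
```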
This guide could be a lot more complex and complete, but that'll be all for now.