How to split sounds for use in Second Life.
- Gwyneth Llewelyn (20220608)
Your mileage may vary.
ffmpeg can only encode MP3 through an external library (libmp3lame), and Second Life expects WAV uploads anyway. Therefore, we’ll first convert the file to WAV.
ffmpeg -i Sugar\ Plum\ Dark\ Mix.mp3 sugar-plum-dark-mix.wav
If you wish, you can check the file with ffprobe beforehand to see its average bitrate. In our example it is 320 kb/s, so we do:
ffmpeg -i Sugar\ Plum\ Dark\ Mix.mp3 -b:a 320k -minrate 256k sugar-plum-dark-mix.wav
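The ffprobe check mentioned above is not spelled out, so here is a minimal sketch of one way to do it. The function names audio_bitrate_kbps and bps_to_kbps are made up for this example; the ffprobe options are standard ones, but the exact figure reported depends on the file.

```shell
#!/bin/sh
# Convert bits per second to kb/s, rounded to the nearest integer.
bps_to_kbps() {
    echo $(( ($1 + 500) / 1000 ))
}

# Print the overall bitrate of a media file in kb/s (hypothetical helper).
# ffprobe reports the container-level bit_rate in bits per second.
audio_bitrate_kbps() {
    bps=$(ffprobe -v error -show_entries format=bit_rate \
        -of default=noprint_wrappers=1:nokey=1 "$1") || return 1
    bps_to_kbps "$bps"
}

# Usage (needs ffprobe and the file to exist):
#   audio_bitrate_kbps "Sugar Plum Dark Mix.mp3"
```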
First, determine the average and peak loudness:
ffmpeg -i sugar-plum-dark-mix.wav -filter:a volumedetect -f null /dev/null
This should show something like:
[Parsed_volumedetect_0 @ 0x7fcb8400e1c0] mean_volume: -24.5 dB
[Parsed_volumedetect_0 @ 0x7fcb8400e1c0] max_volume: -0.9 dB
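If you want those two numbers in a script rather than reading them off the log, something like the following sketch may help. The function name volumedetect_values is invented for this example, and it assumes the log lines look like the ones shown above (value in the second-to-last field).

```shell
#!/bin/sh
# Read ffmpeg's log on stdin and print "mean max" (both in dB).
# Assumes lines ending in "mean_volume: X dB" and "max_volume: Y dB".
volumedetect_values() {
    awk '/mean_volume:/ { mean = $(NF-1) }
         /max_volume:/  { max  = $(NF-1) }
         END            { print mean, max }'
}

# Usage (ffmpeg writes its log to stderr, hence the 2>&1):
#   ffmpeg -i sugar-plum-dark-mix.wav -filter:a volumedetect -f null /dev/null 2>&1 | volumedetect_values
```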
Better still, measure it with the loudnorm filter:
ffmpeg -i sugar-plum-dark-mix.wav -af loudnorm=I=-16:TP=-1.5:LRA=11:print_format=json -f null -
which results in:
{
"input_i" : "-23.21",
"input_tp" : "-0.87",
"input_lra" : "12.50",
"input_thresh" : "-33.81",
"output_i" : "-14.47",
"output_tp" : "-1.50",
"output_lra" : "5.40",
"output_thresh" : "-24.73",
"normalization_type" : "dynamic",
"target_offset" : "-1.53"
}
Note that mean_volume roughly corresponds to input_i, while max_volume comes close to input_tp. In the above scenario, a target of -16 LUFS is clearly not enough to bring this up to acceptable levels, so I’ve tried to go even higher than that:
ffmpeg -i sugar-plum-dark-mix.wav -af loudnorm=I=-8:TP=0:LRA=12.5:print_format=json -f null -
Radical!
{
"input_i" : "-23.21",
"input_tp" : "-0.87",
"input_lra" : "12.50",
"input_thresh" : "-33.81",
"output_i" : "-11.23",
"output_tp" : "+0.00",
"output_lra" : "5.40",
"output_thresh" : "-21.44",
"normalization_type" : "dynamic",
"target_offset" : "3.23"
}
What we need now is a second pass, feeding those measured values into the call to the loudnorm filter:
ffmpeg -i sugar-plum-dark-mix.wav -af loudnorm=I=-8:TP=0:LRA=12.5:measured_I=-23.21:measured_TP=-0.87:measured_LRA=12.50:measured_thresh=-33.81:offset=3.23:linear=true:print_format=summary -ar 44100 sugar-plum-dark-mix-louder.wav
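The -ar 44100 option may not be strictly necessary, but loudnorm upsamples to 192 kHz internally when it normalizes dynamically, so setting the output sample rate explicitly is the safe choice.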
Result:
Input Integrated: -23.2 LUFS
Input True Peak: -0.9 dBTP
Input LRA: 12.5 LU
Input Threshold: -33.8 LUFS
Output Integrated: -14.7 LUFS
Output True Peak: +0.0 dBTP
Output LRA: 8.4 LU
Output Threshold: -25.0 LUFS
Normalization Type: Dynamic
Target Offset: +6.7 LU
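Copying the measured values by hand from the first pass into the second is error-prone, so here is a rough sketch of how the filter string could be assembled from a saved first-pass report. The helper names json_field and second_pass_filter and the file name report.txt are assumptions made for this example; the sed pattern assumes the one-key-per-line JSON layout shown earlier.

```shell
#!/bin/sh
# Extract the quoted value of one key from a saved loudnorm report.
# $1 = key name, $2 = file holding ffmpeg's log with the JSON block
json_field() {
    sed -n "s/^[[:space:]]*\"$1\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p" "$2"
}

# Build the second-pass filter string from the first-pass measurements.
# $1 = target I, $2 = target TP, $3 = target LRA, $4 = report file
second_pass_filter() {
    printf 'loudnorm=I=%s:TP=%s:LRA=%s:measured_I=%s:measured_TP=%s:measured_LRA=%s:measured_thresh=%s:offset=%s:linear=true:print_format=summary\n' \
        "$1" "$2" "$3" \
        "$(json_field input_i "$4")" \
        "$(json_field input_tp "$4")" \
        "$(json_field input_lra "$4")" \
        "$(json_field input_thresh "$4")" \
        "$(json_field target_offset "$4")"
}

# Usage (ffmpeg prints the JSON report on stderr):
#   ffmpeg -i in.wav -af loudnorm=I=-8:TP=0:LRA=12.5:print_format=json -f null - 2> report.txt
#   ffmpeg -i in.wav -af "$(second_pass_filter -8 0 12.5 report.txt)" -ar 44100 out.wav
```

Other log lines in the report file are simply ignored, since only lines of the form "key" : "value" match.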
If there is garbage at the end, trim it:
ffmpeg -ss 00:00:00 -t 00:02:00.00 -i sugar-plum-dark-mix-louder.wav sugar-plum-dark-mix-louder-trimmed.wav
To convert the result to mono, this should do the trick:
ffmpeg -i sugar-plum-dark-mix-louder-trimmed.wav -ac 1 sugar-plum-dark-mix-mono.wav
Note: the bitrate should now be half that of the stereo file.
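The halving follows directly from how uncompressed PCM is sized: sample rate times bits per sample times channel count. A small sketch of the arithmetic, assuming 16-bit samples at 44100 Hz (ffmpeg's default WAV encoding is 16-bit PCM); pcm_bitrate is a made-up helper name.

```shell
#!/bin/sh
# Bitrate of uncompressed PCM in bits per second.
# $1 = sample rate (Hz), $2 = bits per sample, $3 = number of channels
pcm_bitrate() {
    echo $(( $1 * $2 * $3 ))
}

# Stereo versus mono at 44100 Hz, 16 bits per sample:
#   pcm_bitrate 44100 16 2
#   pcm_bitrate 44100 16 1
```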
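Finally, split the mono file into segments of at most 9.9 seconds each (ffmpeg will not create the split directory for you, so make it first):
mkdir -p split
ffmpeg -i sugar-plum-dark-mix-mono.wav -codec:a copy -vn -f segment -segment_time 9.9 -segment_start_number 1 split/%02d-sugar-plum-dark-mix.wav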
Note that the %02d template parameter uses two digits, so that segments are properly ordered when using ls -la.
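To check beforehand whether two digits are enough, you can estimate how many segments a file will produce. This is only a sketch: segment_count is a hypothetical helper, and the real count may differ slightly because the segmenter cuts on packet boundaries.

```shell
#!/bin/sh
# Estimate the number of segments: duration divided by segment length, rounded up.
# $1 = total duration in seconds, $2 = segment length in seconds
segment_count() {
    awk -v d="$1" -v s="$2" 'BEGIN {
        n = int(d / s)          # whole segments
        if (n * s < d) n++      # one more for the remainder
        print n
    }'
}

# Usage, for the two-minute trimmed file from above:
#   segment_count 120 9.9
```

With %02d and numbering starting at 1, up to 99 segments sort correctly; for longer material, %03d would be the safer template.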