@aularon
Last active June 7, 2024 14:44
Split an m4b into its chapters. No recoding is done, just splitting
#!/bin/bash
# Description: Split an m4b into its chapters. No recoding is done, just splitting
# Usage: m4b_split.sh $input_file $output_dir/
# Requires: ffmpeg, jq
# Author: Hasan Arous
# License: MIT
in="$1"
out="$2"
splits=""
while read start end title; do
splits="$splits -c copy -ss $start -to $end $out/$title.m4b"
done <<<$(ffprobe -i "$in" -print_format json -show_chapters \
| jq -r '.chapters[] | .start_time + " " + .end_time + " " + (.tags.title | sub(" "; "_"))')
ffmpeg -i "$in" $splits
@mattura

mattura commented Nov 2, 2020

Thanks for this!
My shell's 'read' doesn't split the chapter values as required, so I had to add a delimiter:

while read '-d|' start end title; do
splits="$splits -c copy -ss $start -to $end $out/$title.m4b"
done <<<$(ffprobe -i "$in" -print_format json -show_chapters \
| jq -r '.chapters[] | .start_time + " " + .end_time + " " + (.tags.title | sub(" "; "_")) + "|"')
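
A possible explanation, offered as an assumption since the shell and its version are not stated: bash releases before 4.4 word-split an unquoted $(...) inside a here-string, which collapses the newlines into spaces so that read sees a single long line. Feeding the loop through process substitution is one way to sidestep this without a custom delimiter. The sketch below uses made-up chapter lines (the hypothetical list_chapters function) in place of the ffprobe | jq pipeline.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the `ffprobe ... | jq -r ...` pipeline:
# prints one "start end title" line per chapter.
list_chapters() {
  printf '%s\n' \
    "0.000000 10.500000 Intro" \
    "10.500000 42.000000 Chapter_1"
}

# Count how many chapter lines the loop actually sees.
count_chapters() {
  local n=0 start end title
  # Process substitution keeps each chapter on its own line,
  # so `read` runs once per chapter.
  while read -r start end title; do
    n=$((n + 1))
  done < <(list_chapters)
  echo "$n"
}

count_chapters
```

If the loop sees every chapter, the printed count matches the number of chapter lines.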

@tripodsan

FYI, if you also want to speed up the audio, you need to adjust the start and end markers as well.

in="$1"
out="$2"
splits=""
while read -d'|' start end title; do
  splits="$splits -map 0:a -c:a libmp3lame -filter:a atempo=1.25 -ss $start -to $end $out/$title.mp3"
done <<<$(ffprobe -i "$in" -print_format json -show_chapters \
  | jq -r '.chapters[] | ((.start_time| tonumber)/1.25 |tostring) + " " + ((.end_time| tonumber)/1.25 |tostring) + " " + (.tags.title | sub(" "; "_")) + "|"')

ffmpeg -i "$in" $splits 

(Note: I also had to specify -map 0:a so that ffmpeg isn't confused by the subtitle stream; otherwise it didn't show progress and didn't terminate.)
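
To make the arithmetic concrete: with atempo=1.25 the output plays 1.25 times faster, so a chapter spanning 100 s to 250 s in the source lands at 80 s to 200 s in the sped-up output. Here is a small sketch of that scaling, using awk rather than jq purely for illustration (the function name scale_time is hypothetical):

```bash
#!/usr/bin/env bash
# scale_time SECONDS TEMPO
# Maps a timestamp in the source audio to the matching timestamp
# in the sped-up output by dividing by the tempo factor.
scale_time() {
  LC_ALL=C awk -v t="$1" -v tempo="$2" 'BEGIN { printf "%.6f\n", t / tempo }'
}

scale_time 100 1.25   # chapter start in the sped-up output
scale_time 250 1.25   # chapter end in the sped-up output
```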

@BorysekOndrej

BorysekOndrej commented Mar 21, 2022

Note that if a chapter name contains more than one space, sub(" "; "_") is not sufficient because it replaces only the first space; gsub(" "; "_") needs to be used instead. The full snippet from tripodsan (with the speed set to 1.00 by default) would look like:

in="$1"
out="$2"
splits=""
while read -d'|' start end title; do
  splits="$splits -map 0:a -c:a libmp3lame -filter:a atempo=1.00 -ss $start -to $end $out/$title.mp3"
done <<<$(ffprobe -i "$in" -print_format json -show_chapters \
  | jq -r '.chapters[] | ((.start_time| tonumber)/1.00 |tostring) + " " + ((.end_time| tonumber)/1.00 |tostring) + " " + (.tags.title | gsub(" "; "_")) + "|"')
ffmpeg -i "$in" $splits

And if you don't need speed control, then you can remove the -filter:a atempo=1.00 filter for a 5-10x speedup.
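
The sub versus gsub distinction has a direct analogue in bash parameter expansion: ${var/ /_} replaces only the first match, while ${var// /_} replaces all of them. Here is a minimal sketch, in case you prefer to sanitize titles on the shell side rather than in jq (both function names are made up for illustration):

```bash
#!/usr/bin/env bash
# Replace only the first space, like jq's sub(" "; "_").
replace_first_space() {
  local title=$1
  printf '%s\n' "${title/ /_}"
}

# Replace every space, like jq's gsub(" "; "_").
replace_all_spaces() {
  local title=$1
  printf '%s\n' "${title// /_}"
}

replace_first_space "Chapter 1 The Beginning"   # prints: Chapter_1 The Beginning
replace_all_spaces  "Chapter 1 The Beginning"   # prints: Chapter_1_The_Beginning
```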

@michaellammers

Love this, but can't seem to figure out how to get this working with spaces, in situations when:

  • the output folder contains spaces
  • the original filename is added to the output name (for example $out/$filename$title.mp3) and that filename contains spaces

Thanks!

@NilsIrl

NilsIrl commented Jul 21, 2022

ffmpeg -i "$in" $(ffprobe -i "$in" -show_chapters -print_format json | jq -r '[.chapters[] | "-c copy -ss " + .start_time + " -to " + .end_time + " " + (.tags.title + ".m4b" | gsub(" "; "_"))] | join(" ")')

One-liner

@lefth

lefth commented Sep 17, 2023

@michaellammers This version handles spaces and chapters that are not alphabetically sorted, but it requires zsh to run. I chose zsh because it handles spaces in variable values and arrays without extra quoting, while bash is tricky. Note that this does not and cannot speed up the audio, because it makes a lossless copy (as in the original version of the gist). Non-m4b formats are also supported, as is metadata that has the wrong ending timestamp for the last chapter. Also, the ffprobe commands used above don't work for some files; they need something like -v 0 to avoid outputting a lot of junk.

Note that this technique works for chapterized files, but it will make bad sounding splits if the chapters are random and sometimes fall in the middle of a word.

#!/bin/zsh
# Note: this is only good for CHAPTERIZED files. If the breaks come in the middle of a word,
# the result will sound bad.
#
# Note that other containers are also supported!
if [[ -z $1 || -z $2 ]]; then
    echo "Usage: $0 <in file> <out directory>" >&2
    exit 1
fi

in=$1
extension=$1:e
out=$2
mkdir -p $out
metadata_source=${OVERRIDE_METADATA_SOURCE:-$in}

chapters_str=$(ffprobe -i $metadata_source -print_format json -show_chapters -v 0 | \
    jq -r '.chapters[] | .start_time + " " + .end_time + " " + (.tags.title | sub(" "; "_"))')
chapters_arr=("${(@f)chapters_str}")
chapter_count=$#chapters_arr
# Prefix width, like 3 for "001", "002"...
chapter_width=$#chapter_count
n=0
splits=()
# Skip the end because the last chapter should not have the -to flag, in case
# the metadata is wrong. (It was in one case I saw)
for line in $chapters_arr[1,-2]; do
    ((n++))
    echo $line | read start end title
    splits+=(-c copy -c:a copy -map 0:a -ss $start -to $end "$out/${(l:$chapter_width::0:)n} - $title.$extension")
done
# Count the last chapter too, so its prefix does not repeat the previous one.
((n++))
echo $chapters_arr[-1] | read start end title
splits+=(-c copy -c:a copy -map 0:a -ss $start "$out/${(l:$chapter_width::0:)n} - $title.$extension")

ffmpeg -i $in $splits
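
For readers who would rather stay in bash, the same idea can be sketched with a bash array: every ffmpeg argument is stored as a separate array element, so spaces in the output directory or in a title survive intact. Because read assigns the rest of each line to its last variable, titles do not need their spaces replaced. This is a rough sketch under assumptions, not a tested drop-in replacement: build_splits and the sample chapter lines are hypothetical, and the ffmpeg call appears only as a comment.

```bash
#!/usr/bin/env bash
# build_splits OUT_DIR EXTENSION CHAPTER_COUNT
# Reads "start end title" lines on stdin and fills the global array
# `splits` with one group of ffmpeg output options per chapter.
build_splits() {
  local out=$1 ext=$2 count=$3
  local width=${#count}   # digits needed for the prefix, e.g. 2 for 12 chapters
  local n=0 start end title prefix
  splits=()
  while read -r start end title; do
    n=$((n + 1))
    printf -v prefix '%0*d' "$width" "$n"   # zero-padded chapter number
    # Each option is its own array element, so the path may contain spaces.
    splits+=(-map 0:a -c copy -ss "$start" -to "$end" "$out/$prefix - $title.$ext")
  done
}

# Sample data standing in for the ffprobe | jq output.
build_splits "My Audiobooks/out dir" m4b 2 < <(printf '%s\n' \
  "0.000000 10.500000 Opening Credits" \
  "10.500000 42.000000 Chapter 1")

printf '%s\n' "${splits[@]}"
# Real use (not run here): ffmpeg -i "$in" "${splits[@]}"
```

Quoting the expansion as "${splits[@]}" is what keeps each path a single argument when it is finally handed to ffmpeg.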

@michaellammers

Great, thanks! 🙏

@michaellammers This version handles spaces and chapters that are not alphabetically sorted, but it requires zsh to run. I chose zsh because it handles spaces in variable values and arrays without extra quoting, while bash is tricky. Note that this does not and cannot speed up the audio, because it makes a lossless copy (as in the original version of the gist):

#!/bin/zsh
if [[ -z $1 || -z $2 ]]; then
    echo "Usage: $0 <in file> <out directory>" >&2
    exit 1
fi

in=$1
out=$2
mkdir -p $out

chapters_str=$(ffprobe -i $in -print_format json -show_chapters | \
    jq -r '.chapters[] | .start_time + " " + .end_time + " " + (.tags.title | sub(" "; "_"))')
# Prefix width, like 3 for "001", "002"...
width=$(($(echo $chapters_str | wc -l | wc -c) - 1))
n=0
splits=()
echo $chapters_str | \
while read start end title; do
    ((n++))
    splits+=(-c copy -c:a copy -map 0:a -ss $start -to $end "$out/${(l:$width::0:)n} - $title.m4b")
done

ffmpeg -i $in $splits
