@dcondrey
Created April 20, 2016 08:34
Use ffmpeg to split file by chapters. Python version and bash version
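#!/bin/bash
# Author: http://crunchbang.org/forums/viewtopic.php?id=38748#p414992
# m4bronto
# ffmpeg prints each chapter as a line such as:
#     Chapter #0:0: start 0.000000, end 1290.013333
# which `read` splits into the fields:
#     first   _    _     start     _   end
while [ $# -gt 0 ]; do
    # ffmpeg writes the file's metadata to stderr, so capture it in a temp file.
    ffmpeg -i "$1" 2> tmp.txt
    while read -r first _ _ start _ end; do
        if [[ $first = Chapter ]]; then
            read -r             # discard line with Metadata:
            read -r _ _ chapter # "title : <name>" line; keep the name
            # ${start%?} strips the trailing comma from the start time.
            # The bitrate needs the "k" suffix: a bare 128 means 128 bit/s.
            ffmpeg -vsync 2 -i "$1" -ss "${start%?}" -to "$end" -vn -ar 44100 -ac 2 -b:a 128k -f mp3 "$chapter.mp3" </dev/null
        fi
    done <tmp.txt
    rm tmp.txt
    shift
done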
#!/usr/bin/env python
import os
import re
import subprocess as sp
from optparse import OptionParser


def parseChapters(filename):
    """Return a list of {name, start, end} dicts, one per chapter in filename."""
    chapters = []
    command = ["ffmpeg", "-i", filename]
    output = ""
    try:
        # ffmpeg requires an output file and so it errors
        # when it does not get one so we need to capture stderr,
        # not stdout.
        output = sp.check_output(command, stderr=sp.STDOUT, universal_newlines=True)
    except sp.CalledProcessError as e:
        # Expected here: the metadata we want is in the error output.
        output = e.output

    for line in output.splitlines():
        m = re.match(r".*Chapter #(\d+:\d+): start (\d+\.\d+), end (\d+\.\d+).*", line)
        if m is not None:
            chapters.append({"name": m.group(1), "start": m.group(2), "end": m.group(3)})
    return chapters


def getChapters():
    """Parse the command line and attach input/output file names to each chapter."""
    parser = OptionParser(usage="usage: %prog -f FILE", version="%prog 1.0")
    parser.add_option("-f", "--file", dest="infile", help="Input File", metavar="FILE")
    (options, args) = parser.parse_args()
    if not options.infile:
        parser.error('Filename required')
    chapters = parseChapters(options.infile)
    fbase, fext = os.path.splitext(options.infile)
    for chap in chapters:
        print("start:" + chap['start'])
        chap['outfile'] = fbase + "-ch-" + chap['name'] + fext
        chap['origfile'] = options.infile
        print(chap['outfile'])
    return chapters


def convertChapters(chapters):
    """Copy each chapter's streams into its own output file."""
    for chap in chapters:
        print("start:" + chap['start'])
        print(chap)
        command = [
            "ffmpeg", '-i', chap['origfile'],
            '-vcodec', 'copy',
            '-acodec', 'copy',
            '-ss', chap['start'],
            '-to', chap['end'],
            chap['outfile']]
        try:
            # An output file is given this time, so a non-zero exit
            # is a real failure and is reported to the caller.
            sp.check_output(command, stderr=sp.STDOUT, universal_newlines=True)
        except sp.CalledProcessError as e:
            raise RuntimeError("command '{}' return with error (code {}): {}".format(e.cmd, e.returncode, e.output))


if __name__ == '__main__':
    chapters = getChapters()
    convertChapters(chapters)
# 1. ffmpeg: print the file's metadata (it goes to stderr, hence 2>&1).
# 2. grep:   keep only the "Chapter #N:M: start ..., end ..." lines.
# 3. sed:    rewrite each line into an ffmpeg argument list with explicit
#            start and end timecodes for that chapter.
# 4. xargs:  run ffmpeg once per chapter, 11 arguments at a time.
ffmpeg -i "$SOURCE.$EXT" 2>&1 |
  grep 'Chapter #' |
  sed -E "s/ *Chapter #([0-9]+:[0-9]+): start ([0-9]+\.[0-9]+), end ([0-9]+\.[0-9]+)/-i \"$SOURCE.$EXT\" -vcodec copy -acodec copy -ss \2 -to \3 \"$SOURCE-\1.$EXT\"/" |
  xargs -n 11 ffmpeg
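
The pipeline above hands ffmpeg exactly 11 arguments per chapter. As a minimal sketch of the same idea in Python, the function below turns the `Chapter #...` lines of ffmpeg's output into one 11-element argument list per chapter. The name `build_ffmpeg_args` and the sample text are illustrative assumptions, not part of the gist, and nothing is executed.

```python
import re

# Matches lines such as "    Chapter #0:0: start 0.000000, end 1290.013333".
CHAPTER_RE = re.compile(r"Chapter #(\d+:\d+): start (\d+\.\d+), end (\d+\.\d+)")


def build_ffmpeg_args(ffmpeg_output, source, ext):
    """Return one ffmpeg argument list per chapter found in ffmpeg_output.

    build_ffmpeg_args is a hypothetical helper for illustration only.
    """
    infile = "{}.{}".format(source, ext)
    arg_lists = []
    for line in ffmpeg_output.splitlines():
        m = CHAPTER_RE.search(line)
        if m is None:
            continue  # skips e.g. "title : Chapter 1" metadata lines
        name, start, end = m.groups()
        arg_lists.append([
            "-i", infile,
            "-vcodec", "copy",
            "-acodec", "copy",
            "-ss", start,
            "-to", end,
            "{}-{}.{}".format(source, name, ext),
        ])
    return arg_lists


# Hand-written sample in the shape ffmpeg prints; not real output.
sample = """\
    Chapter #0:0: start 0.000000, end 693.418000
      Metadata:
        title           : Chapter 1
    Chapter #0:1: start 693.418000, end 3589.619000
"""
for args in build_ffmpeg_args(sample, "ThePlanets_ep7", "mp4"):
    print(len(args), " ".join(args))
```

Each list could be passed to `subprocess` as `["ffmpeg"] + args`, which avoids the shell quoting that the sed expression has to handle by hand.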
@shahinism

It's also possible to get the chapters as JSON from ffprobe, which is a little cleaner in Python:

    def get_chapters(self):
        chapters = []

        cmd = 'ffprobe -i {} -print_format json -show_chapters -loglevel error'
        ffmpeg = delegator.run(cmd.format(self.file_path))

        return json.loads(ffmpeg.out).get('chapters', [])

idea source
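
The snippet above is a method fragment that depends on the third-party `delegator` package. A self-contained sketch of the same ffprobe approach using only the standard library might look like the following. The function names `parse_chapters_json` and `probe_chapters` and the sample JSON are assumptions made for illustration; the ffprobe flags are the ones from the comment above.

```python
import json
import subprocess


def parse_chapters_json(text):
    """Extract (title, start_time, end_time) tuples from ffprobe's JSON."""
    chapters = []
    for chap in json.loads(text).get("chapters", []):
        title = chap.get("tags", {}).get("title", "")
        chapters.append((title, chap["start_time"], chap["end_time"]))
    return chapters


def probe_chapters(path):
    """Run ffprobe on path (ffprobe must be on PATH) and parse its output."""
    command = ["ffprobe", "-i", path, "-print_format", "json",
               "-show_chapters", "-loglevel", "error"]
    text = subprocess.check_output(command, universal_newlines=True)
    return parse_chapters_json(text)


# Hand-written sample in the shape ffprobe emits; not real output.
sample = ('{"chapters": [{"id": 0, "start_time": "0.000000", '
          '"end_time": "693.418000", "tags": {"title": "Chapter 1"}}]}')
print(parse_chapters_json(sample))
```

Passing the command as a list rather than a formatted string means file names containing spaces need no extra quoting.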

@davidmcgettigan

Super stuff.

Would be nice to have the output formatted as follows:

0001_InputFileName.mp3
0002_InputFileName.mp3
0003_InputFileName.mp3
0004_InputFileName.mp3
00nn_InputFileName.mp3

As this format is more likely to play in the correct order on most car / home hi-fi systems.
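
A minimal sketch of that naming scheme in Python, assuming a 1-based chapter index and an `.mp3` output extension; `numbered_name` is a hypothetical helper, not something the scripts above provide:

```python
import os


def numbered_name(index, infile, width=4, ext=".mp3"):
    """Build a name like 0001_InputFileName.mp3 from a chapter index."""
    base = os.path.splitext(os.path.basename(infile))[0]
    # zfill pads with leading zeros so names sort in chapter order.
    return "{}_{}{}".format(str(index).zfill(width), base, ext)


for i in range(1, 4):
    print(numbered_name(i, "InputFileName.m4b"))
```

The zero padding is what keeps chapter 10 from sorting before chapter 2 on players that order files by name.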

@captn3m0

captn3m0 commented Sep 7, 2019

Forked version with prefixed filenames (as @hotswap requested) and support for a chapters.txt file, passed as the second argument, from which it reads the list of all chapters.

https://github.com/captn3m0/Scripts/blob/master/split-audio-by-chapters

@davidmcgettigan

davidmcgettigan commented Sep 7, 2019 via email

@JavaShipped

Sorry for my ignorance here: how do I use the Python script? Do I just double-click to run it, or do I have to use some PowerShell command?

@davidmcgettigan

The only version I have used is the ffmpegchapters-explicit.sh version, and I run it on a Mac from the command line like this:

./ffmpegchapters-explicit.sh input.mp4

And you will end up with the following in the same directory:

001 - Chapter 1.mp3
002 - Chapter 2.mp3
003 - Chapter 3.mp3
004 - Chapter 4.mp3
005 - Chapter 5.mp3
006 - Chapter 6.mp3
007 - Chapter 7.mp3
008 - Chapter 8.mp3
009 - Chapter 9.mp3
010 - Chapter 10.mp3

@JavaShipped

Yeah, I thought so. I ran it and got a syntax error on line 17:

SyntaxError: invalid syntax
PS <path>\New folder> python ffmpegchapters.py video.mp4
  File "ffmpegchapters.py", line 17
    except CalledProcessError, e:
                             ^
SyntaxError: invalid syntax
<path>\New folder>
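
That traceback is what Python 3 reports for the Python 2-only spelling `except CalledProcessError, e:`; Python 3 accepts only the `as` form. A minimal sketch of the same capture-on-failure pattern in syntax that works on both, using a stand-in child process instead of ffmpeg (`run_and_capture` is a hypothetical helper name):

```python
import subprocess
import sys


def run_and_capture(command):
    """Return (returncode, combined stdout+stderr) without raising on failure."""
    try:
        output = subprocess.check_output(
            command, stderr=subprocess.STDOUT, universal_newlines=True)
        return 0, output
    except subprocess.CalledProcessError as e:  # Python 2-only form: "except X, e:"
        return e.returncode, e.output


# Stand-in for ffmpeg: a child process that writes to stderr and exits non-zero.
code, text = run_and_capture(
    [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"])
print(code, text)
```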

@JavaShipped

So I actually stopped being dumb for a second and just used .\ffmpegchapters.py Video.mp4.

A cmd box came up and disappeared instantly, with no output.

@davidmcgettigan

OK, that's the Python version. I just ran it like this:

./ffmpegchapters.py -f ThePlanets_ep7.mp4
start:0.000000
ThePlanets_ep7-ch-0:0.mp4
start:693.418000
ThePlanets_ep7-ch-0:1.mp4
start:3589.619000
ThePlanets_ep7-ch-0:2.mp4
start:5494.700000
ThePlanets_ep7-ch-0:3.mp4
start:8418.162000
ThePlanets_ep7-ch-0:4.mp4
start:11473.305000
ThePlanets_ep7-ch-0:5.mp4
start:14312.826000
ThePlanets_ep7-ch-0:6.mp4
start:16547.747000
ThePlanets_ep7-ch-0:7.mp4
start:19777.736000
ThePlanets_ep7-ch-0:8.mp4
start:22807.591000
ThePlanets_ep7-ch-0:9.mp4
start:24972.295000
ThePlanets_ep7-ch-0:10.mp4
start:0.000000
{'start': '0.000000', 'origfile': 'ThePlanets_ep7.mp4', 'end': '693.418000', 'name': '0:0', 'outfile': 'ThePlanets_ep7-ch-0:0.mp4'}
start:693.418000
{'start': '693.418000', 'origfile': 'ThePlanets_ep7.mp4', 'end': '3589.619000', 'name': '0:1', 'outfile': 'ThePlanets_ep7-ch-0:1.mp4'}
start:3589.619000
{'start': '3589.619000', 'origfile': 'ThePlanets_ep7.mp4', 'end': '5494.700000', 'name': '0:2', 'outfile': 'ThePlanets_ep7-ch-0:2.mp4'}
start:5494.700000
{'start': '5494.700000', 'origfile': 'ThePlanets_ep7.mp4', 'end': '8418.162000', 'name': '0:3', 'outfile': 'ThePlanets_ep7-ch-0:3.mp4'}
start:8418.162000
{'start': '8418.162000', 'origfile': 'ThePlanets_ep7.mp4', 'end': '11473.305000', 'name': '0:4', 'outfile': 'ThePlanets_ep7-ch-0:4.mp4'}
start:11473.305000
{'start': '11473.305000', 'origfile': 'ThePlanets_ep7.mp4', 'end': '14312.826000', 'name': '0:5', 'outfile': 'ThePlanets_ep7-ch-0:5.mp4'}
start:14312.826000
{'start': '14312.826000', 'origfile': 'ThePlanets_ep7.mp4', 'end': '16547.747000', 'name': '0:6', 'outfile': 'ThePlanets_ep7-ch-0:6.mp4'}
start:16547.747000
{'start': '16547.747000', 'origfile': 'ThePlanets_ep7.mp4', 'end': '19777.736000', 'name': '0:7', 'outfile': 'ThePlanets_ep7-ch-0:7.mp4'}
start:19777.736000
{'start': '19777.736000', 'origfile': 'ThePlanets_ep7.mp4', 'end': '22807.591000', 'name': '0:8', 'outfile': 'ThePlanets_ep7-ch-0:8.mp4'}
start:22807.591000
{'start': '22807.591000', 'origfile': 'ThePlanets_ep7.mp4', 'end': '24972.295000', 'name': '0:9', 'outfile': 'ThePlanets_ep7-ch-0:9.mp4'}
start:24972.295000
{'start': '24972.295000', 'origfile': 'ThePlanets_ep7.mp4', 'end': '27792.513000', 'name': '0:10', 'outfile': 'ThePlanets_ep7-ch-0:10.mp4'}
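
The output names above contain a colon (for example `ThePlanets_ep7-ch-0:10.mp4`), a character that Windows does not allow in file names. One possible way around that, sketched under the assumption that changing the naming scheme is acceptable, is to replace the colon before building the name. `safe_outfile` is a hypothetical helper, not part of the script:

```python
import os


def safe_outfile(infile, chapter_name):
    """Build the per-chapter output name with ':' replaced by '-'."""
    fbase, fext = os.path.splitext(infile)
    return fbase + "-ch-" + chapter_name.replace(":", "-") + fext


print(safe_outfile("ThePlanets_ep7.mp4", "0:10"))
```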

@JavaShipped

(I really appreciate the help on a year-old thread that I totally revived from the dead, btw!)

Same thing happens, it launches a cmd window and instantly closes it, no output. Really odd.

@JavaShipped

Is it possible that my older version of Python (3.7.5) is causing this error?

@ElusiveZatchmo

I'm getting the same problem, but my Python is the latest version.

@rschader

rschader commented Feb 7, 2022

I'm attempting to use the ffmpegchapters-explicit.sh script to extract the chapters, but they alternate between Video and Advertisement, and it fails when it gets to the subsequent chapters that are also named Video and Advertisement. How can I modify it so the chapter names also have the chapter number in them? Thanks!
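
One way to avoid the clash is to prefix each title with its position, so repeated titles still produce distinct file names. The sketch below shows only the naming step, in Python rather than in the shell script itself. `numbered_titles` is a hypothetical helper, and the `NNN - Title.mp3` pattern is borrowed from the listing earlier in this thread:

```python
def numbered_titles(titles, ext=".mp3"):
    """Prefix each chapter title with a zero-padded, 1-based chapter number."""
    # At least three digits, more if there are 1000+ chapters.
    width = max(3, len(str(len(titles))))
    return ["{} - {}{}".format(str(i).zfill(width), title, ext)
            for i, title in enumerate(titles, start=1)]


for name in numbered_titles(["Video", "Advertisement", "Video", "Advertisement"]):
    print(name)
```

In the shell script, the equivalent would be a counter incremented per chapter and formatted into the output name.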

@IlyasYOY

Hello! Thank you for your work.

I adapted it so I can extract the chapter data and use it as YouTube chapter marks!

https://github.com/IlyasYOY/ffmpeg-video-chapters-parser
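
YouTube reads chapter marks from timestamp lines in the video description, starting at 0:00. As a rough sketch of that conversion, not taken from the linked repository, the functions below format chapter start times (in seconds, as ffmpeg prints them) into such lines. The names `youtube_timestamp` and `youtube_lines` are illustrative assumptions:

```python
def youtube_timestamp(seconds):
    """Format a start time in seconds as M:SS or H:MM:SS."""
    total = int(float(seconds))  # drop fractional seconds
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    if hours:
        return "{}:{:02d}:{:02d}".format(hours, minutes, secs)
    return "{}:{:02d}".format(minutes, secs)


def youtube_lines(chapters):
    """Turn (start_seconds, title) pairs into description lines."""
    return ["{} {}".format(youtube_timestamp(start), title)
            for start, title in chapters]


for line in youtube_lines([("0.000000", "Intro"), ("693.418000", "Part 1")]):
    print(line)
```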
