@bancek
Last active September 3, 2023 15:50
CUE splitter using ffmpeg (to mp3)
# Reads a CUE sheet and prints one ffmpeg command per track (MP3 output).
# Works on Python 2 and 3. Pipe the output to a shell to run the commands.
cue_file = 'file.cue'

d = open(cue_file).read().splitlines()

general = {}
tracks = []
current_file = None

for line in d:
    # Match on the stripped line so the sheet's indentation width does not matter.
    s = line.strip()
    if s.startswith('REM GENRE '):
        general['genre'] = ' '.join(s.split(' ')[2:])
    if s.startswith('REM DATE '):
        general['date'] = ' '.join(s.split(' ')[2:])
    if s.startswith('FILE '):
        current_file = ' '.join(s.split(' ')[1:-1]).replace('"', '')
    if s.startswith('TRACK '):
        track = general.copy()
        track['track'] = int(s.split(' ')[1], 10)
        tracks.append(track)
    # PERFORMER and TITLE are album-level before the first TRACK, track-level after it.
    if s.startswith('PERFORMER '):
        value = ' '.join(s.split(' ')[1:]).replace('"', '')
        if tracks:
            tracks[-1]['artist'] = value
        else:
            general['artist'] = value
    if s.startswith('TITLE '):
        value = ' '.join(s.split(' ')[1:]).replace('"', '')
        if tracks:
            tracks[-1]['title'] = value
        else:
            general['album'] = value
    if s.startswith('INDEX 01 '):
        # MM:SS:FF, where FF counts CD frames (75 per second, not hundredths).
        t = [int(x) for x in s.split(' ')[2].split(':')]
        tracks[-1]['start'] = 60 * t[0] + t[1] + t[2] / 75.0

# Every track except the last runs until the start of the next one.
for i in range(len(tracks)):
    if i != len(tracks) - 1:
        tracks[i]['duration'] = tracks[i + 1]['start'] - tracks[i]['start']

for track in tracks:
    metadata = {
        'artist': track['artist'],
        'title': track['title'],
        'album': track['album'],
        'track': str(track['track']) + '/' + str(len(tracks))
    }
    if 'genre' in track:
        metadata['genre'] = track['genre']
    if 'date' in track:
        metadata['date'] = track['date']

    cmd = 'ffmpeg'
    cmd += ' -i "%s"' % current_file
    # Plain seconds with a fractional part keep sub-second precision.
    cmd += ' -ss %.3f' % track['start']
    if 'duration' in track:
        cmd += ' -t %.3f' % track['duration']
    # The bitrate is an output option, so it goes after the input.
    cmd += ' -b:a 320k'
    cmd += ' ' + ' '.join('-metadata %s="%s"' % (k, v) for (k, v) in metadata.items())
    cmd += ' "%.2d - %s - %s.mp3"' % (track['track'], track['artist'], track['title'])

    print(cmd)
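
The script builds each command as a shell string with hand-placed double quotes, so a title or artist containing a double quote would produce a broken command. A minimal sketch of safer quoting follows, assuming a POSIX shell; the helper names `metadata_args` and `output_name` are hypothetical and not part of the gist.

```python
import shlex


def metadata_args(metadata):
    """Return '-metadata key=value' pairs, each value quoted for a POSIX shell."""
    return ' '.join(
        '-metadata %s' % shlex.quote('%s=%s' % (key, value))
        for key, value in sorted(metadata.items())
    )


def output_name(track_number, artist, title, ext='mp3'):
    """Return the quoted output filename in the gist's 'NN - artist - title' style."""
    name = '%.2d - %s - %s.%s' % (track_number, artist, title, ext)
    return shlex.quote(name)


print(metadata_args({'artist': 'Some Band', 'title': 'Say "Hi"'}))
print(output_name(3, 'Some Band', "It's Live"))
```

`shlex.quote` is available in Python 3.3 and later and only targets POSIX shells. It also does nothing about characters such as `/` that are invalid in filenames.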
@holesocks

holesocks commented Feb 23, 2021

This code does a pretty good job. Thanks.
However, getting the cuts right to the millisecond needs some more work!
The problem is that standard cue file index points are specified in MM:SS:FF format, where FF are frames,
and ffmpeg wants fractions of a second to make the cuts.
Also, if we want to avoid re-encoding, which is sensible, ffmpeg has to cut at frame boundaries. It is cautious about this, so it adds a couple of frames to ensure nothing is excluded (typically .026 secs a go for MP3).

If the cue file was designed for CD rather than an MP3 file, which is usual, then each FF is 1/75 sec, so the calculation to get ms from FF is easy, but the problem with ffmpeg remains.
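
As a rough illustration of that calculation, here is a minimal sketch that converts a CUE index to seconds, assuming CD-style frames of 1/75 sec. The function name `cue_index_to_seconds` is made up for this example.

```python
CD_FRAMES_PER_SECOND = 75  # assumption: the CUE sheet describes a CD image


def cue_index_to_seconds(index):
    """Convert a CUE 'MM:SS:FF' index string to seconds as a float."""
    minutes, seconds, frames = (int(part) for part in index.split(':'))
    return 60 * minutes + seconds + frames / float(CD_FRAMES_PER_SECOND)


# 3 minutes, 25 seconds and 37 frames into the image.
print('%.3f' % cue_index_to_seconds('03:25:37'))
```

The resulting value can be passed to ffmpeg's `-ss` and `-t` options as plain seconds with a fractional part.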

If you want to get this spot on, the frame size in ms will need to be calculated (the typical MP3 (Layer III, version 1) has 1152 samples per frame and the sample rate is commonly 44100 Hz), and all valid audio frames will have to be read and written one by one up to the desired duration.
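
The frame arithmetic can be sketched as follows, using the figures quoted above (1152 samples per frame at 44100 Hz). The function names are hypothetical, and this only computes frame boundaries; it does not read or write MP3 frames.

```python
SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III
SAMPLE_RATE_HZ = 44100    # common CD-derived sample rate


def mp3_frame_ms(samples_per_frame=SAMPLES_PER_FRAME, sample_rate_hz=SAMPLE_RATE_HZ):
    """Duration of one MP3 frame in milliseconds."""
    return 1000.0 * samples_per_frame / sample_rate_hz


def snap_to_frame(seconds, samples_per_frame=SAMPLES_PER_FRAME, sample_rate_hz=SAMPLE_RATE_HZ):
    """Round a cut point down to a frame boundary; return (frame index, seconds)."""
    frame_s = samples_per_frame / float(sample_rate_hz)
    index = int(seconds // frame_s)
    return index, index * frame_s


print('%.2f ms per frame' % mp3_frame_ms())
print(snap_to_frame(1.0))
```

At these settings a frame lasts about 26 ms, which matches the ".026 secs" figure mentioned earlier, so a cut that is forced onto a frame boundary can be off by up to one frame.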

Alternatively, mp3DirectCut (free, Windows) will read a cue file and split the audio without re-encoding. It works at the frame level, but I have never checked exactly how accurate it is. There may be better tools. I'd love to know.

@OscarL

OscarL commented Mar 3, 2021

Alright, following @holesocks' advice (thanks!), I've forked this gist (see here) and made the following changes:

  • Fixed the location of the bitrate parameter.
  • Added support for both Python 2 and 3.
  • Fixed track duration so it neither cuts tracks short nor starts them early (for the usual case of CD images as .flac files, at least).

I've kept the changes to a minimum, so it's easy to compare to the original (and anyone can use it as a base).

I'll probably write an over-engineered version (calling ffmpeg directly, flac-to-flac splits, selectable output format, error checking, etc.) just to exercise my rusty fingers a bit.

@holesocks

Thanks!
ffmpeg will work out what output to produce going by the filename extension. Your program could split AAC and WAV files too (I don't know about FLAC) with very few changes. Just an idea!
Generally, MP3s are just not designed to be cut at the frame level: for one thing, data can overflow from one frame to the next. ffmpeg probably tidies up the ends as best it can to avoid audible imperfections, but at the expense of a little precision.
According to the hydrogenaud.io specialists, pcutmp3 is the best tool for cutting MP3s, as it deals with overflow and gapless playback. It is a Java program, and it is unclear whether it is still supported, so I didn't test it.
That's me done - cheerio.

@OscarL

OscarL commented Mar 10, 2021

@holesocks: that's the idea! The script I have in progress is called "cue_splitter.py", and it lets you select format/codec/bitrate/etc., although personally I will only use it for .flac to .flac splitting (particularly due to your comments regarding frame-level splitting).

Using ffmpeg you can split without re-encoding, but there's a bug in ffmpeg: the split files all end up with the right size but the wrong duration recorded in them (which tends to confuse some media players). I just resort to "flac 2 flac" with the default compression level (fast enough even on my old CPU), and the files work OK.
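
The two approaches discussed in this thread, letting the output extension pick the format and optionally skipping re-encoding, can be sketched as one command builder. The function `build_split_command` and its parameters are hypothetical; `-i`, `-ss`, `-t` and `-c copy` are standard ffmpeg options. This is not the code from "cue_splitter.py".

```python
def build_split_command(source, start, duration, output, copy=False):
    """Build an ffmpeg argument list that cuts one track out of `source`.

    `start` and `duration` are in seconds; pass duration=None for the last
    track. ffmpeg infers the output format from the extension of `output`.
    With copy=True the audio stream is copied instead of re-encoded.
    """
    cmd = ['ffmpeg', '-i', source, '-ss', '%.3f' % start]
    if duration is not None:
        cmd += ['-t', '%.3f' % duration]
    if copy:
        cmd += ['-c', 'copy']
    cmd.append(output)
    return cmd


# Re-encode FLAC to FLAC, as preferred in the comment above.
print(build_split_command('image.flac', 205.493, 187.2, '02 - Track.flac'))
# Stream copy: faster, but see the caveats about cut precision and durations.
print(build_split_command('image.flac', 392.693, None, '03 - Track.flac', copy=True))
```

Returning a list rather than a string means the result can be handed to `subprocess.call` without any shell quoting.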

I've intentionally kept this gist as close to the original as possible (while fixing the most glaring errors), because maybe other fellows can do as I did and use it to practice their programming on a simple but concrete project.

Thanks for your feedback, and greetings from Argentina! :-)


ghost commented May 7, 2021

Thank you 🙏

@jeanslack

I made this FFmpeg-based command-line utility: https://github.com/jeanslack/FFcuesplitter. It has some interesting options and is flexible enough for most needs, and the results seem accurate.
