@frabad
Forked from bancek/cue_to_mp3.py
Last active January 16, 2023 20:57
CUE splitter using ffmpeg
#!/usr/bin/env python
import os
import subprocess
import sys

if len(sys.argv) != 2:
    sys.exit("Usage:\n\t%s <%s>" % (sys.argv[0], 'input_cuesheet'))

cuesheet_fname = sys.argv[1]
global_metadata = {}
media_fname = ""
tracks = []

if not os.path.exists(cuesheet_fname):
    sys.exit("Cuesheet '%s' was not found." % cuesheet_fname)

# Parse the cue sheet. Album-level fields start at column 0; the prefixes for
# TRACK and its fields assume the conventional indentation of two spaces for
# TRACK and four spaces for TITLE, PERFORMER and INDEX.
with open(cuesheet_fname) as f:
    for line in f.readlines():
        line = line.strip('\n')
        if line.startswith('REM GENRE '):
            global_metadata['genre'] = ' '.join(line.split(' ')[2:])
        if line.startswith('REM DATE '):
            global_metadata['date'] = ' '.join(line.split(' ')[2:])
        if line.startswith('PERFORMER '):
            global_metadata['artist'] = ' '.join(line.split(' ')[1:]).replace('"', '')
        if line.startswith('TITLE '):
            global_metadata['album'] = ' '.join(line.split(' ')[1:]).replace('"', '')
        if line.startswith('FILE '):
            # Drop the FILE keyword and the trailing file type (e.g. WAVE).
            media_fname = ' '.join(line.split(' ')[1:-1]).replace('"', '')
        if line.startswith('  TRACK '):
            # Each track starts from a copy of the album-level metadata.
            track = global_metadata.copy()
            track['track'] = int(line.strip().split(' ')[1], 10)
            tracks.append(track)
        if line.startswith('    TITLE '):
            tracks[-1]['title'] = ' '.join(line.strip().split(' ')[1:]).replace('"', '')
        if line.startswith('    PERFORMER '):
            tracks[-1]['artist'] = ' '.join(line.strip().split(' ')[1:]).replace('"', '')
        if line.startswith('    INDEX 01 '):
            # INDEX times are MM:SS:FF, where FF counts frames of 1/75 second.
            t = list(map(int, ' '.join(line.strip().split(' ')[2:]).replace('"', '').split(':')))
            tracks[-1]['start'] = 60 * t[0] + t[1] + t[2] / 75.0

if not os.path.exists(media_fname):
    sys.exit("Media file '%s' referenced in cue sheet could not be found." % media_fname)

# Every track except the last one ends where the next one starts.
for i in range(len(tracks)):
    if i != len(tracks) - 1:
        tracks[i]['duration'] = tracks[i + 1]['start'] - tracks[i]['start']

for track in tracks:
    track_metadata = {
        'artist': track['artist'],
        'title': track['title'],
        'album': track['album'],
        'track': str(track['track']) + '/' + str(len(tracks))
    }
    if 'genre' in track:
        track_metadata['genre'] = track['genre']
    if 'date' in track:
        track_metadata['date'] = track['date']

    # Build the ffmpeg command: copy the stream without re-encoding and cut
    # at HH:MM:SS boundaries (fractions of a second are truncated).
    cmd, args = ['ffmpeg'], []
    args.extend(['-i', media_fname])
    args.extend(['-c', 'copy'])
    args.extend(['-ss', '%.2d:%.2d:%.2d' % (track['start'] / 60 / 60, track['start'] / 60 % 60, int(track['start'] % 60))])
    if 'duration' in track:
        # The last track has no duration and runs to the end of the file.
        args.extend(['-t', '%.2d:%.2d:%.2d' % (track['duration'] / 60 / 60, track['duration'] / 60 % 60, int(track['duration'] % 60))])
    for (k, v) in list(track_metadata.items()):
        args.extend(['-metadata', '%s=%s' % (k, v)])

    # Output name: "NN - artist - title" with the extension of the input file.
    args.append('%.2d - %s - %s%s' % (track['track'], track['artist'], track['title'], os.path.splitext(media_fname)[1]))
    cmd.extend(args)
    subprocess.call(cmd)
@frabad
Author

frabad commented May 20, 2017

Changes over the forked script

  • the syntax is now compatible with Python 3 (the script runs under both 2.7 and 3.5), although the code still deserves a rewrite
  • basic console-based user interface: the filename of the input cuesheet must be given on the command line
  • the tracks are no longer re-encoded; the output media files use the same codec as the input (hence the new name of the script)
  • the ffmpeg commands are actually executed instead of only being displayed

@scotia70

Hi, thanks for the script. I found I needed to add ".rstrip()" at the end of the lines that extract text. Otherwise a few of my CUEs left the artist with a '\r' at the end, which caused all sorts of problems in the output files.
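
A minimal sketch of that fix, assuming a hypothetical helper named `field_value` (it is not part of the script) that stands in for the repeated `' '.join(line.split(' ')[n:]).replace('"', '')` expressions and strips trailing whitespace, including the carriage return left over from CRLF line endings:

```python
def field_value(line, skip=1):
    """Return the text after the first `skip` words of a cue sheet line,
    with quotes and surrounding whitespace (carriage return included) removed."""
    words = line.strip().split(' ')
    return ' '.join(words[skip:]).replace('"', '').strip()


# A line from a CRLF-terminated cue sheet still ends in '\r'
# after the script's line.strip('\n').
print(repr(field_value('PERFORMER "Some Artist"\r')))   # 'Some Artist'
print(repr(field_value('REM GENRE Jazz\r', skip=2)))    # 'Jazz'
```

Because the helper strips the whole line first, it also works for the indented track-level TITLE and PERFORMER lines.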

@DL444

DL444 commented Jul 31, 2022

Line 18:

with open(cuesheet_fname) as f:

It might be better to let users choose the file encoding via an argument, or to add some auto-detection. Some of my cuesheets use non-UTF encodings such as JIS, and reading them raised decoding exceptions.
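
One possible shape for this, as a sketch only: a hypothetical `read_cuesheet` function (not in the script) that uses the encoding supplied by the caller if there is one, and otherwise tries a short list of candidates in order. The candidate list is an assumption and would need adjusting to the cuesheets at hand; trial decoding is a heuristic, not real detection.

```python
def read_cuesheet(path, encoding=None,
                  candidates=('utf-8-sig', 'shift_jis', 'cp1252')):
    """Return the cue sheet text as a string.

    If `encoding` is given it is used as is; otherwise each candidate is
    tried in order and the first one that decodes without error wins.
    """
    with open(path, 'rb') as f:
        raw = f.read()
    if encoding is not None:
        return raw.decode(encoding)
    for candidate in candidates:
        try:
            return raw.decode(candidate)
        except UnicodeDecodeError:
            continue
    # Last resort: keep going and mark the undecodable bytes.
    return raw.decode('utf-8', errors='replace')


# Usage in the script would replace the open()/readlines() pair, e.g.:
#     for line in read_cuesheet(cuesheet_fname).splitlines():
```

The order of the candidates matters: a permissive single-byte encoding such as cp1252 accepts almost any input, so it belongs at the end of the list.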

Line 20:

if line.startswith('FILE '):
    media_fname = ' '.join(line.split(' ')[1:-1]).replace('"', '')

You might want to consider adding the containing directory of the cuesheet to form a path rather than using the media filename directly. People may run this script from a different working directory than the one containing the cuesheet and the media file. In that case, the script wouldn't be able to find the media file with only its name. Something like this should be fine:

if line.startswith('FILE '):
    media_fname = ' '.join(line.split(' ')[1:-1]).replace('"', '')
    media_fname = os.path.join(os.path.dirname(cuesheet_fname), media_fname)

A command line argument can also be added to allow users to choose where their media files are. But I would say assuming the cuesheet and the media sit in the same directory should be okay, since that seems to be a convention for most media players.
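
Both ideas can be combined. The sketch below assumes a hypothetical `--media-dir` option and a hypothetical `resolve_media_path` helper, neither of which exists in the script; when the option is omitted, the media file is looked up next to the cuesheet.

```python
import argparse
import os


def resolve_media_path(cuesheet_fname, media_fname, media_dir=None):
    """Join the media filename onto media_dir if given, otherwise onto
    the directory that contains the cue sheet."""
    if media_dir is None:
        media_dir = os.path.dirname(cuesheet_fname)
    return os.path.join(media_dir, media_fname)


def parse_args(argv=None):
    parser = argparse.ArgumentParser(
        description='Split a media file according to a cue sheet.')
    parser.add_argument('cuesheet', help='path to the input cue sheet')
    # Hypothetical option, shown for illustration only.
    parser.add_argument('--media-dir', default=None,
                        help='directory that holds the media file '
                             '(default: the directory of the cue sheet)')
    return parser.parse_args(argv)


args = parse_args(['/music/album/album.cue'])
print(resolve_media_path(args.cuesheet, 'album.flac', args.media_dir))
```

When the cuesheet is given as a bare filename, `os.path.dirname` returns an empty string and `os.path.join` leaves the media filename unchanged, so running the script from inside the album directory keeps working.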

Line 69:

args.append('%.2d - %s - %s%s' % (track['track'], track['artist'], track['title'], os.path.splitext(media_fname)[1]))
  • The previous path suggestion also applies here.
  • Some tracks may not have a performer attribute so the script will crash here.
  • Joining the track ID, performer, and title directly to form a filename means that the resulting name can be invalid on some operating systems or filesystems (for example, a name containing : on Windows). After some searching, it seems that Python does not provide a convenient way to escape such names. I might try to come up with some platform-dependent regexes to handle these cases without introducing any third-party dependencies.

I ended up changing as follows to address the first two points:

if 'artist' in track:
    output_track_fname = '%.2d - %s - %s%s' % (track['track'], track['artist'], track['title'], os.path.splitext(media_fname)[1])
else:
    output_track_fname = '%.2d - %s%s' % (track['track'], track['title'], os.path.splitext(media_fname)[1])
output_track_path = os.path.join(os.path.dirname(cuesheet_fname), output_track_fname)
args.append(output_track_path)
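
For the third point (invalid filename characters), here is a rough sketch that uses a single conservative pattern on every platform rather than platform-dependent regexes. The name `sanitize_filename` is hypothetical. The character set is the one Windows rejects, which also covers the `/` separator on POSIX systems; reserved Windows device names such as CON or NUL are not handled.

```python
import re

# Characters that Windows rejects in file names, plus ASCII control characters.
_INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_filename(name, replacement='_'):
    """Replace characters that are unsafe in file names and drop the
    trailing dots and spaces that Windows does not allow."""
    cleaned = _INVALID_CHARS.sub(replacement, name)
    cleaned = cleaned.rstrip(' .')
    # Never return an empty name.
    return cleaned or replacement


print(sanitize_filename('AC/DC: Live?'))   # AC_DC_ Live_
```

It would be applied to the formatted name before the directory is joined on, for example `sanitize_filename(output_track_fname)`, so that the path separators added by `os.path.join` are left alone.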

By the way, ffmpeg seems to have a bug, open for seven years and tracked at https://trac.ffmpeg.org/ticket/4905, that produces split FLAC files with wrong durations. I have yet to find a way around it.
