Chirp Audiobook Download Script

This script eases the process of downloading the audio files from Chirp Audiobooks. It uses the browser's console to collect a list of URLs, then produces a list of wget commands to download them.

Tested with Firefox + Terminal on macOS, and Firefox + PowerShell on Windows 10.

As an aside, I want to give a shout out to Libro.fm for providing a simple download button for each purchase. Then you don't need a script like this!

Instructions

  1. Find the book in your Chirp Library.
    • If you've already listened to it, you may need to move it back from your Archive.
  2. Click the book to open Chirp's web player.
  3. Open the browser's Web Developer Tools.
  4. Copy-paste the script.js contents into the console and press [enter].
  5. Initiate the script:
    • If the book is already at the start, click Play (▶).
    • If the book is on any other track, open the Chapters menu (top left) and select the first Track.
  6. Wait while the script advances through each track; it's saving the URLs in the background.
    • It may say "There was an error loading your audiobook, please reload the page." under the Play button. Ignore this.
    • It may also show a number of URLs in red in the console, each followed by a warning. Ignore these as well.
  7. When it reaches the final track, the script will show a list of commands on the screen in a white box.
    • Click once to highlight the complete list.
    • Copy-paste it to a command line (Terminal, PowerShell, etc.) and press [enter] to execute it.
      • Some command lines will begin executing immediately; however, you still need to press [enter] to execute the final command.
    • (The commands are also printed to the browser console, but I've found that it can sometimes collapse longer lists, making it difficult to copy exactly what you want.)
  8. Once the commands finish, you should have a new folder with a cover image and each of the tracks as .m4a files.
    • On macOS, type open . and press [enter] to view the files.
    • On Windows, type explorer . and press [enter] to view the files.
  9. Check the file size of each track:
    • If any are 0 bytes, the download URL may have expired.
      • In that case, go through the process again, but in step 7, first paste the commands into a text editor and delete everything except the commands that download the 0-byte files.
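
The size check in step 9 can also be scripted. The following is a minimal Python sketch and not part of the gist: the helper name find_empty_tracks is hypothetical, and it assumes the tracks were saved with the .m4a extension as described in step 8.

```python
from pathlib import Path


def find_empty_tracks(folder):
    """Return the sorted names of .m4a files in `folder` that are 0 bytes long."""
    return sorted(
        path.name
        for path in Path(folder).glob("*.m4a")
        if path.stat().st_size == 0
    )
```

Calling find_empty_tracks(".") from inside the book folder returns the file names whose wget commands need to be run again; an empty list means every track has content.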

Merging m4a files

After completing the steps above you will have an audio file for each chapter. If you prefer to have a single m4a or m4b file for each title you can use main.py to create these. The script embeds the cover image and chapter markers into the resulting file.

ONLY TESTED ON WINDOWS

Prerequisites: Python 3 and ffmpeg (including ffprobe) must be installed and available on your PATH; main.py calls both.

Usage:

  1. Drag the folder containing your collection of m4a files onto the Python script. On the command line, use python main.py foldername instead.
  2. Wait. The script will build some intermediate files and log its progress. ffmpeg needs to repack the files into a single stream; this can take 10 minutes.
  3. The script will create a file called {foldername}.m4a.
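
To illustrate how the chapter markers can be derived from the individual tracks, here is a minimal sketch of the ffmetadata [CHAPTER] blocks that main.py appends to its metadata file. The function name chapter_blocks and its (title, seconds) input are assumptions made for illustration; main.py reads the titles and durations with ffprobe instead.

```python
def chapter_blocks(tracks):
    """Build ffmetadata [CHAPTER] blocks from (title, duration_in_seconds) pairs."""
    blocks = []
    start = 0  # microseconds, to match TIMEBASE=1/1000000
    for title, seconds in tracks:
        end = start + round(seconds * 1_000_000)
        blocks.append(
            "[CHAPTER]\n"
            "TIMEBASE=1/1000000\n"
            f"START={start}\n"
            f"END={end}\n"
            f"title={title}\n"
        )
        # Like main.py, begin the next chapter one microsecond after this one ends.
        start = end + 1
    return "".join(blocks)
```

For example, chapter_blocks([("One", 1.5), ("Two", 2.0)]) yields two blocks, the second of which starts at 1500001 microseconds.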

Enjoy!

# main.py
import os
import re
import subprocess
import sys


def parse_book_info(title):
    """
    Parses a book title string and returns a dictionary with title, author, and narrator.

    Args:
        title: The book title string, in the form
            "<title> - Written by <author> - Narrated by <narrator>".
    Returns:
        A dictionary with keys "title", "author", and "narrator", containing the extracted information.
    Raises:
        ValueError: If the string does not match the expected form.
    """
    match = re.match(r"^(.*)- Written by (.*) - Narrated by (.*)$", title)
    if not match:
        raise ValueError(f"Could not parse book information from title: '{title}'")
    return {
        "title": match.group(1).strip(),
        "author": match.group(2) or None,
        "narrator": match.group(3) or None,
    }


title = ""  # Book folder name; set in the __main__ block below.


def update_title_with_album(filepath):
    """
    Replaces the value of the "title=" line in an ffmetadata file with the value
    of its "album=" line. If either line is missing, appends title, author, and
    artist tags parsed from the folder name instead.

    Args:
        filepath: The path to the file to modify.
    Raises:
        FileNotFoundError: If the file cannot be found.
        ValueError: If the fallback is needed and the folder name cannot be parsed.
    """
    with open(filepath, "r") as file:
        lines = file.readlines()
    # Find the line indexes
    title_line_index = None
    album_line_index = None
    for i, line in enumerate(lines):
        if line.startswith("title="):
            title_line_index = i
        elif line.startswith("album="):
            album_line_index = i
    if title_line_index is None or album_line_index is None:
        print(f"File '{filepath}' does not contain both required lines ('title=' and 'album=')")
        folderMeta = parse_book_info(title)
        lines.append(f"title={folderMeta['title']}\n")
        lines.append(f"author={folderMeta['author']}\n")
        lines.append(f"artist={folderMeta['author']}; {folderMeta['narrator']}\n")
        lines.append(f"album_artist=Narrated by {folderMeta['narrator']}\n")
    else:
        # The album value keeps its trailing newline
        album_value = lines[album_line_index][len("album="):]
        lines[title_line_index] = f"title={album_value}"
    # Save the changes to the file
    with open(filepath, "w") as file:
        file.writelines(lines)


def get_chapter_title(filepath):
    """Returns the title tag of an audio file, falling back to its file name."""
    # Passing a list avoids shell quoting problems with spaces in paths
    command = ["ffprobe", "-show_entries", "format_tags=title", "-v", "quiet", filepath]
    out = subprocess.run(command, capture_output=True, cwd=os.getcwd())
    for line in out.stdout.decode().splitlines():
        match = re.match(r"^TAG:title=(.*)$", line)
        if match:
            return match.group(1)
    # No title tag: use the part of the file name after the last "- "
    name = os.path.splitext(os.path.basename(filepath))[0]
    return name.rpartition("- ")[2]


def make_chapters_metadata(list_audio_files: list):
    print("Making metadata source file")
    chapters = {}
    for count, single_audio_file in enumerate(list_audio_files, start=1):
        file_path = os.path.join(folder, single_audio_file)
        command = ["ffprobe", "-v", "quiet", "-of", "csv=p=0",
                   "-show_entries", "format=duration", file_path]
        out = subprocess.run(command, capture_output=True, cwd=os.getcwd())
        # ffprobe reports the duration in seconds; convert to microseconds
        duration_in_microseconds = round(float(out.stdout.decode().strip()) * 1_000_000)
        chapters[f"{count:04d}"] = {
            "duration": duration_in_microseconds,
            "title": get_chapter_title(file_path),
        }
    # Each chapter starts one microsecond after the previous one ends
    chapters["0001"]["start"] = 0
    for n in range(1, len(chapters)):
        chapter = f"{n:04d}"
        next_chapter = f"{n + 1:04d}"
        chapters[chapter]["end"] = chapters[chapter]["start"] + chapters[chapter]["duration"]
        chapters[next_chapter]["start"] = chapters[chapter]["end"] + 1
    last_chapter = f"{len(chapters):04d}"
    chapters[last_chapter]["end"] = chapters[last_chapter]["start"] + chapters[last_chapter]["duration"]
    # Export the first track's tags as the base of the combined metadata file
    metadatafile = os.path.join(folder, "combined.metadata.txt")
    command = ["ffmpeg", "-y", "-loglevel", "error",
               "-i", os.path.join(folder, list_audio_files[0]),
               "-f", "ffmetadata", metadatafile]
    subprocess.run(command, capture_output=True, cwd=os.getcwd())
    update_title_with_album(metadatafile)
    with open(metadatafile, "a+") as m:
        for chapter in chapters:
            ch_meta = """
[CHAPTER]
TIMEBASE=1/1000000
START={}
END={}
title={}
""".format(chapters[chapter]["start"], chapters[chapter]["end"], chapters[chapter]["title"])
            m.write(ch_meta)
            print(ch_meta)


def concatenate_all_to_one_with_chapters():
    filename = f"{title}.m4a"
    print(f"Concatenating chapters to {filename}")
    metadatafile = os.path.join(folder, "combined.metadata.txt")
    cover = os.path.join(folder, "cover.jpg")
    intermediate = os.path.join(folder, "i.m4a")
    # Join the tracks and apply the combined metadata (input 1)
    subprocess.run(["ffmpeg", "-hide_banner", "-y", "-f", "concat", "-safe", "0",
                    "-i", "list_audio_files.txt", "-i", metadatafile,
                    "-map_metadata", "1", intermediate])
    # Attach the cover image without re-encoding
    subprocess.run(["ffmpeg", "-i", intermediate, "-i", cover, "-c", "copy",
                    "-disposition:v", "attached_pic", filename])
    # The path must not be wrapped in quotes here, or the file is never found
    os.remove(intermediate)


if __name__ == '__main__':
    print(sys.argv)
    folder = sys.argv[1].replace('"', '')
    # normpath drops a trailing separator so the folder name is not empty
    title = os.path.basename(os.path.normpath(folder))
    print(title)
    list_audio_files = sorted(f for f in os.listdir(folder) if f.endswith(".m4a"))
    # ffmpeg's concat demuxer reads the track list from this file
    with open("list_audio_files.txt", "w") as f:
        for filename_audio_file in list_audio_files:
            f.write(f"file '{os.path.join(folder, filename_audio_file)}'\n")
    make_chapters_metadata(list_audio_files)
    concatenate_all_to_one_with_chapters()

// script.js
const $ = document.querySelector.bind(document);

// Make a string safe to use in a file or folder name.
function filename(name) {
  return name.replaceAll('&', 'and').replaceAll(':', ' -').replaceAll(/[^a-z0-9 ._-]+/ig, '');
}

const title = filename($('h1.book-title').textContent);
const credits = [].slice.call(document.querySelectorAll('.credit'))
  .map(n => filename(n.textContent))
  .join(' - ');
const dirname = `${title} - ${credits}`;
const commands = [
  `mkdir "${dirname}"`,
  `cd "${dirname}"`,
  `wget -O "cover.jpg" "${$('.cover-image').src}"`
];
const tracks = [];
let count = 0;

// Record the URL of the track that is currently selected in the player.
function addUrl(url) {
  count += 1;
  const chapter = filename($('div.chapter').textContent);
  tracks.push({
    count,
    chapter,
    url
  });
}

// Build one wget command per track and show the full list in a textarea.
function showCommands() {
  const padSize = tracks.length.toString().length;
  tracks.forEach(({count, chapter, url}) => {
    let trackNum = count.toString().padStart(padSize, "0");
    commands.push(`wget -O "${title} - ${trackNum} - ${chapter}.m4a" "${url}"`);
  });
  commands.push(`cd ..`);
  console.log(commands.join('\n'));
  const div = document.createElement('div');
  div.innerHTML = '<div style="position: absolute; top: 100px; left: 100px; z-index: 100000; background: white; padding: 10px;"><p>Copy these commands to PowerShell/Terminal/etc:</p><textarea id="dl-commands" style="min-height:20em; min-width:30em"></textarea></div>';
  document.body.appendChild(div);
  const textarea = document.querySelector('#dl-commands');
  textarea.value = commands.join('\n');
  textarea.onfocus = function(){this.select()};
}

// Advance to the next track, or finish when the last track is reached.
function next() {
  const btn = $('button.next-chapter');
  if (btn.disabled) {
    showCommands();
  } else {
    btn.click();
  }
}

// Intercept assignments to audio.src: record each URL the player assigns
// instead of loading it, then move on to the next track.
const audio = $('audio');
Object.defineProperty(audio, "src", {
  get() {
    return '';
  },
  set(url) {
    setTimeout(() => {
      addUrl(url);
      next();
    }, 500);
  },
});