@ivan
Last active June 14, 2023 19:54
Download a podcast episode from anchor.fm
#!/usr/bin/env bash
# Download a podcast episode from anchor.fm
#
# Usage:
# grab-anchor-episode "https://anchor.fm/emerge/episodes/Robert-MacNaughton---Learnings-from-the-Life-and-Death-of-the-Integral-Center-e31val" # (m4a example)
# grab-anchor-episode "https://anchor.fm/free-chapel/episodes/Are-You-Still-In-Love-With-Praise--Pastor-Jentezen-Franklin-e19u4i8" # (mp3 example)
#
# anchor.fm serves a list of m4a or mp3 files that need to be concatenated with ffmpeg.
#
# For debugging, uncomment:
# set -o verbose
set -eu -o pipefail
url=$1
# Extract the JSON object that the episode page assigns to window.__STATE__.
json=$(curl -sL "$url" | grep -P -o 'window.__STATE__ = .*' | cut -d ' ' -f 3- | sed -r 's/;$//g')
ymd=$(echo -E "$json" | jq -r '.episodePreview.publishOn' | cut -d 'T' -f 1)
# Use the extension of the first enclosure URL, falling back to m4a.
extension=$( (echo -E "$json" | jq -r '.[].episodeEnclosureUrl' | grep -F --max-count=1 :// | grep -oP '\.[0-9a-z]+$' | cut -d . -f 2) || echo m4a)
output_basename=$ymd-$(basename -- "$url").$extension
if [[ -f "$output_basename" ]]; then
    echo "$output_basename already exists; skipping download"
    exit
fi
# Download the segments into a temporary directory.
temp_dir="$(mktemp -d)"
cd "$temp_dir"
audio_urls=$(echo -E "$json" | jq -r '.station.audios|map(.audioUrl)|.[]')
for i in $audio_urls; do
    output_file=$(basename -- "$i")
    wget "$i" -O "$output_file"
    echo "file '$output_file'" >> .copy_list
done
# Concatenate the segments without re-encoding, then move the result back.
ffmpeg -f concat -safe 0 -i .copy_list -c copy "$output_basename"
cd -
mv "$temp_dir/$output_basename" ./
rm -rf "$temp_dir"
@ivan
Author

ivan commented Jan 31, 2019

Requires jq, wget and ffmpeg.
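
A minimal sketch of a dependency check that could sit near the top of the script. The helper name `missing_commands` is hypothetical and not part of the gist; curl is included because the script also calls it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each named command that is not found on PATH.
missing_commands() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

# Report anything the download script needs that is not installed.
missing=$(missing_commands curl jq wget ffmpeg)
if [[ -n "$missing" ]]; then
  echo "Missing required commands:" $missing >&2
fi
```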

@amelhorcarol

thank u so much!

@juliohenriquerocha

Thanks, man!

@chris-short

On a Mac, brew install grep and use ggrep in the script.
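
One way to avoid editing every grep call by hand is to pick the binary once, as in this sketch. The variable name `GREP` is an assumption for illustration; the gist itself calls grep directly.

```shell
#!/usr/bin/env bash
# Prefer GNU grep installed by Homebrew (ggrep) when present; otherwise use grep.
if command -v ggrep >/dev/null 2>&1; then
  GREP=ggrep
else
  GREP=grep
fi

# The script's grep calls would then go through "$GREP", for example:
#   json=$(curl -sL "$url" | "$GREP" -P -o 'window.__STATE__ = .*' | ...)
echo "using: $GREP"
```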

@shillshocked

shillshocked commented Nov 11, 2021

Recommend you add in:

extension="$(jq '.[].episodeEnclosureUrl' <<< "$json" | grep "://" | sed 's|.*\.\(.*\)\"|\1|g')"

if [[ ! "$extension" ]]; then
	extension="mp4"
fi

after ymd and replace the subsequent line with:

output_basename=$ymd-$(basename -- "$url")."$extension"

This will allow for file extension detection in the event that there is a different filetype (like mp3). The script wasn't working for me in that case. It defaults to .mp4 here, but you could change the default.

This works for the podcast I was downloading: https://anchor.fm/free-chapel
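
The same idea, extension detection with a fallback, can also be sketched with bash parameter expansion alone. The helper name `extension_of` and the example.com URLs are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the file extension of a URL, or a default
# when the last path component has no extension.
extension_of() {
  local url=$1 default=$2
  url=${url%%\?*}            # drop any query string
  local name=${url##*/}      # keep only the last path component
  if [[ "$name" == *.* ]]; then
    echo "${name##*.}"
  else
    echo "$default"
  fi
}

extension_of "https://example.com/audio/episode-1.mp3" mp4   # prints mp3
extension_of "https://example.com/audio/episode-1" mp4       # prints mp4
```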

@ivan
Author

ivan commented Nov 11, 2021

@shillshocked thanks! I just fixed the script to support mp3 based on your use of jq. Please let me know if there are issues.

@Nikubik

Nikubik commented Nov 24, 2021

Hello, thank you for sharing this.
It does interest me, but I don't know how to use it. I downloaded the script, and I'm on Windows.
If I type grab-anchor-episode "https.." it says it is not a command.
Can you help me?

@ivan
Author

ivan commented Nov 24, 2021

@Nikubik Hi, I tested this only on Linux, though someone above reports success on macOS when the system grep is replaced.

On Windows, you might be able to make this work in Cygwin, though I have not tested it. Install jq and wget through Cygwin; install ffmpeg by other means, e.g. Chocolatey.

You might need someone's assistance if you are not familiar with the command line. wget the raw gist URL and then remember to chmod +x the downloaded script. Rename with mv to grab-anchor-episode and use ./grab-anchor-episode "URL".

If you find an easier way to download things on anchor.fm, let us know here. Also note that anchor.fm content will typically be syndicated to podcasting platforms that serve proper, contiguous audio files, from where it may be easier to download them.

@Nikubik

Nikubik commented Nov 24, 2021

Oh, thank you for your complete answer. I'll look into that tomorrow.

@hutber

hutber commented Dec 24, 2021

I can't think why it's complaining, but I get: .copy_list: No such file or directory

@ivan
Author

ivan commented Dec 24, 2021

If there is no .copy_list, the issue is that it did not find any audio_urls.

You can add some debug prints e.g. echo -E $json before audio_urls=$(echo -E $json | jq -r '.station.audios|map(.audioUrl)|.[]') if you would like to investigate the JSON.

Which URL causes that?
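
A guard along these lines would turn the confusing .copy_list error into a direct message. This is only a sketch; the function name `require_audio_urls` is hypothetical and the example.com URL is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical guard: fail with a readable message when no audio URLs were found.
require_audio_urls() {
  local audio_urls=$1
  if [[ -z "$audio_urls" ]]; then
    echo "No audio URLs found in the page JSON" >&2
    return 1
  fi
}

# In the script this would follow the audio_urls=... line:
#   require_audio_urls "$audio_urls" || exit 1
require_audio_urls "https://example.com/a.m4a" && echo "ok"
```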

@solomonrb

solomonrb commented Dec 26, 2021

I have a two-step hybrid solution on Mac that I think is a bit easier. It only requires ggrep (via brew install grep).

From the terminal, insert your anchor URL into the following command and run it: curl -sL "<insert-url-here>" | ggrep -P -o 'window.__STATE__ = .*' | cut -d ' ' -f 3- | sed -r 's/;$//g'

Cmd-F the printed output for "episodeEnclosureUrl":, and copy the string that follows it (e.g. "https: ...")

Replace any \u002F in that string with /, and paste the resultant url into your web browser. Then click the 3 dots and click download!
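
The \u002F replacement in the last step can be done in the shell rather than by hand. A minimal sketch, with a hypothetical helper name and a placeholder URL (jq -r decodes these escapes on its own, which is why the main script does not need this step):

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn JSON-escaped slashes (\u002F) back into "/".
unescape_slashes() {
  local s=$1
  printf '%s\n' "${s//\\u002F//}"
}

unescape_slashes 'https:\u002F\u002Fexample.com\u002Fepisode.mp3'
# prints https://example.com/episode.mp3
```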

@ivan
Author

ivan commented Dec 27, 2021

The whole point of the script is to deal with anchor.fm's multi-file serving: for many podcasts, anchor publishes audio as multiple files that need to be concatenated. I believe the segments are split up as they were originally edited using their software.
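
For readers unfamiliar with the concatenation step, here is a small sketch of how a concat list for ffmpeg is built from segment files. The helper name `write_concat_list` and the segment file names are made up for illustration; the script itself appends to .copy_list inside its download loop.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write an ffmpeg concat list for the given segment files.
write_concat_list() {
  local list_file=$1
  shift
  : > "$list_file"                      # start with an empty list
  local segment
  for segment in "$@"; do
    echo "file '$segment'" >> "$list_file"
  done
}

work_dir=$(mktemp -d)
write_concat_list "$work_dir/list.txt" part-0.m4a part-1.m4a
cat "$work_dir/list.txt"
# The segments would then be joined without re-encoding:
#   ffmpeg -f concat -safe 0 -i "$work_dir/list.txt" -c copy episode.m4a
```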

@solomonrb

Great point. I guess my solution is only useful for single-file podcasts from anchor.fm

@Potatrix

Potatrix commented Jan 6, 2022

Thank you so much for this!

In my case, I'm trying to download all episodes of a certain podcast. After poking around this one for a bit, I found that the JSON returned by the curl request contains URLs for other episodes. (Maybe all of them? It seems like it did in my case.)

For those looking to do the same, here is a helper script that works with this one:

./grab-all-anchor-episodes.sh

#!/bin/bash

# Downloads all(?) episodes from a podcaster

# Usage:
# ./grab-all-anchor-episodes.sh "https://anchor.fm/emerge/episodes/Robert-MacNaughton---Learnings-from-the-Life-and-Death-of-the-Integral-Center-e31val"
#
# Must be run from same directory as ./grab-anchor-episode.sh

# URL from an episode seems to contain information about other episodes too
# writes JSON to file in /tmp and iterates through each 'shareLinkPath' and writes to urlList
#
# Runs ./grab-anchor-episode.sh for each URL in list
#
#

url=$1

echo "$url"

json=$(curl -sL "$url" | grep -P -o 'window.__STATE__ = .*' | cut -d ' ' -f 3- | sed -r 's/;$//g')

echo -E "$json" > /tmp/json

# The heredoc delimiter is quoted so the shell leaves the Python code untouched.
python3 - <<'END'
import json

with open("/tmp/json", "r") as data:
    state = json.load(data)

# Write one episode URL per line, overwriting any list left by an earlier run.
with open("/tmp/urlList", "w") as out:
    for episode in state['episodePreview']['episodes']:
        out.write("https://anchor.fm%s\n" % episode['shareLinkPath'])
END

urlList=$(cat /tmp/urlList)

for url in $urlList
do
        ./grab-anchor-episode.sh "$url"
done

#cleanup
rm /tmp/json
rm /tmp/urlList

@ivan
Author

ivan commented Jan 6, 2022

@Potatrix Did the script actually produce incorrect audio files? If it needs to be fixed, it would really help to have the URL for testing.

@viocar

viocar commented Feb 16, 2022

This seems to have stopped working at some point. I have the latest version, and anything I try to download, even the provided examples, just results in .copy_list: No such file or directory.

The command I used: bash grab-anchor-episode.sh "https://anchor.fm/emerge/episodes/Robert-MacNaughton---Learnings-fr om-the-Life-and-Death-of-the-Integral-Center-e31val"

@viocar

viocar commented Feb 18, 2022

With the help of a friend, I've managed to modify the script so that it works again (at least for my purposes). I've forked it here: https://gist.github.com/viocar/a6b6a0f485b3f400b8bcb0f8334b454d

@Potatrix

@ivan the script downloaded the audio files fine. Sometimes it didn't convert to mp3, but that wasn't really an issue for me. I had a task to download all of the recordings for an anchor podcast I manage and needed a quick way to download all of them, which is why I made the modification.

@viocar I notice a space in your URL but I assume it wasn't like this when you tried to run the script?

@viocar

viocar commented Feb 28, 2022

No, I tried several URLs that I copied directly from my browser. I'm not sure why there's a space in my post.

@bshankarpandey

Could anyone please suggest script code for tracking anchor.fm podcast audio in Tag Manager tools?

@MaxMussi

Where can I change the output location? Sorry, I am new to Linux.
