@aminnj
Last active October 18, 2023 13:52
Download reddit-hosted videos/audio
import requests
import os
# change this url to the post's url
post_url = "https://www.reddit.com/r/holdmycatnip/comments/7vyada/hmc_so_i_can_drink_this_air_real_quick/"
# use UA headers to prevent 429 error
headers = {
    'User-Agent': 'My User Agent 1.0',
    'From': 'testyouremail@domain.com'
}
url = post_url + ".json"
data = requests.get(url, headers=headers).json()
media_data = data[0]["data"]["children"][0]["data"]["media"]
video_url = media_data["reddit_video"]["fallback_url"]
audio_url = video_url.split("DASH_")[0] + "audio"
print(video_url, audio_url)
# curl both audio and video separately
os.system("curl -o video.mp4 {}".format(video_url))
os.system("curl -o audio.wav {}".format(audio_url))
# mux them
os.system("ffmpeg -i video.mp4 -i audio.wav -c:v copy -c:a aac -strict experimental output.mp4")
Xoma163 commented Aug 29, 2021

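Hi, it's not working correctly. The audio filename changes dynamically.
You can get it from dash_url:
media_data['reddit_video']['dash_url']
Parse the XML it points to. There are two versions of the manifest, so the lookup is one of these (bs4 here being the parsed BeautifulSoup object):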

  1. filename = bs4.find("adaptationset", {'contenttype': 'audio'}).find('representation').find('baseurl').text
  2. filename = bs4.find("representation", {'id': 'AUDIO-1'}).find('baseurl').text
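A minimal sketch of the same two lookups, assuming the manifest text has already been downloaded from dash_url. It swaps BeautifulSoup for the standard library's xml.etree.ElementTree so it runs without extra packages. The function names and the sample manifest are made up for illustration, and the real manifest may carry more attributes than shown.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    # "{urn:mpeg:dash:schema:mpd:2011}BaseURL" -> "BaseURL"
    return tag.rsplit("}", 1)[-1]


def find_audio_filename(mpd_xml):
    """Return the audio BaseURL text from a DASH manifest, or None."""
    root = ET.fromstring(mpd_xml)
    for aset in root.iter():
        if local_name(aset.tag) != "AdaptationSet":
            continue
        # Version 1: the adaptation set is marked contentType="audio".
        set_is_audio = aset.get("contentType") == "audio"
        for rep in aset.iter():
            if local_name(rep.tag) != "Representation":
                continue
            # Version 2: the representation has id="AUDIO-1".
            rep_is_audio = rep.get("id") == "AUDIO-1"
            if not (set_is_audio or rep_is_audio):
                continue
            for child in rep:
                if local_name(child.tag) == "BaseURL" and child.text:
                    return child.text.strip()
    return None


# Hypothetical manifest, trimmed to the elements the lookup needs.
SAMPLE_MPD = """
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet contentType="video">
      <Representation id="VIDEO-1"><BaseURL>DASH_720.mp4</BaseURL></Representation>
    </AdaptationSet>
    <AdaptationSet contentType="audio">
      <Representation id="AUDIO-1"><BaseURL>DASH_audio.mp4</BaseURL></Representation>
    </AdaptationSet>
  </Period>
</MPD>
"""

if __name__ == "__main__":
    print(find_audio_filename(SAMPLE_MPD))
```

The returned filename would replace the hard-coded "audio" suffix in the original script: video_url.split("DASH_")[0] + filename.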

The second problem: if the post is a crosspost from another subreddit, its own media field can be empty, so take the media from the parent post:

data = data[0]["data"]["children"][0]["data"]
media_data = data["media"]
if not media_data:
    data = data['crosspost_parent_list'][0]
    media_data = data["media"]

and then work with media_data as before.
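The fallback above can be wrapped in a small helper that also tolerates a missing crosspost_parent_list. This is a sketch only: extract_media is a hypothetical name, and the dictionaries in the example are hand-made stand-ins for the JSON that Reddit returns, not real responses.

```python
def extract_media(listing):
    """Return the post's media dict, falling back to the crosspost parent.

    `listing` is the parsed JSON from the post's .json URL.
    Returns None when neither the post nor its parent has media.
    """
    post = listing[0]["data"]["children"][0]["data"]
    media = post.get("media")
    if not media:
        # Crossposts keep the original post under crosspost_parent_list.
        parents = post.get("crosspost_parent_list") or []
        if parents:
            media = parents[0].get("media")
    return media or None


if __name__ == "__main__":
    # Hand-made example: a crosspost whose own "media" field is empty.
    fake_listing = [{"data": {"children": [{"data": {
        "media": None,
        "crosspost_parent_list": [
            {"media": {"reddit_video": {"fallback_url": "https://example.invalid/DASH_720.mp4"}}}
        ],
    }}]}}]
    print(extract_media(fake_listing)["reddit_video"]["fallback_url"])
```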

Third problem: the audio file's extension. It may be mp4, or it may be wav (what the script assumes by default). You need to check it (see the filename from the first problem).

Fourth problem: the post may have no audio track. Check for the audio filename in the dash_url manifest. If there is no filename, there is no audio track.
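One way to handle the last two points together is to derive the local audio name from the manifest filename and to skip muxing when there is no audio. The sketch below only builds the names and the ffmpeg argument list. Both helper names are hypothetical, and the ".wav" default simply mirrors the original script.

```python
import os


def audio_local_path(audio_filename, default_ext=".wav"):
    # Keep the extension from the manifest filename; fall back to the
    # original script's .wav when the filename has none.
    ext = os.path.splitext(audio_filename)[1] or default_ext
    return "audio" + ext


def build_ffmpeg_command(video_path, audio_path=None, output_path="output.mp4"):
    if audio_path is None:
        # No audio track: just copy the video stream into the output file.
        return ["ffmpeg", "-i", video_path, "-c", "copy", output_path]
    # Audio present: copy the video stream and encode the audio as AAC.
    return ["ffmpeg", "-i", video_path, "-i", audio_path,
            "-c:v", "copy", "-c:a", "aac", output_path]


if __name__ == "__main__":
    print(build_ffmpeg_command("video.mp4", audio_local_path("DASH_audio.mp4")))
    print(build_ffmpeg_command("video.mp4"))
```

The resulting list can be passed to subprocess.run(cmd, check=True), which avoids shell quoting issues that os.system has with URLs and filenames.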
