@boredzo
Forked from nriley/extract.py
Created June 27, 2013 06:53
import os
import re
import sys

import requests

# Matches the SD video link for each WWDC 2013 session.
RE_SD_VIDEO = re.compile(
    r'<a href="(http://devstreaming.apple.com/videos/wwdc/2013/[^"]*-SD.mov)')
# Matches the subtitle segment names listed in the subtitle playlist.
RE_WEBVTT = re.compile(r'fileSequence[0-9]+\.webvtt')

# stdin: dump of https://developer.apple.com/wwdc/videos/
for l in sys.stdin:
    m = RE_SD_VIDEO.search(l)
    if not m:
        continue
    video_url = m.group(1)
    video_dir = video_url[:video_url.rindex('/')]
    session = video_dir[video_dir.rindex('/') + 1:]
    prog_index = requests.get(video_dir + '/subtitles/eng/prog_index.m3u8')
    # Reuse the directory if a previous run already created it.
    if not os.path.isdir(session):
        os.mkdir(session)
    webvtt_names = RE_WEBVTT.findall(prog_index.text)
    for webvtt_name in webvtt_names:
        webvtt = requests.get(video_dir + '/subtitles/eng/' + webvtt_name)
        # Write the raw bytes so non-ASCII subtitles survive, and close the file.
        with open(os.path.join(session, webvtt_name), 'wb') as f:
            f.write(webvtt.content)
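The script reads the HTML of the videos page from stdin, so, assuming the page has been saved to a file (the name `videos.html` is hypothetical), it would be run as `python extract.py < videos.html`. The URL handling can be tried without any network access. The following is a minimal sketch of that step only: `subtitle_location` is a helper name introduced here for illustration, and the sample line uses a made-up session path, not a real WWDC URL.

```python
import re

RE_SD_VIDEO = re.compile(
    r'<a href="(http://devstreaming.apple.com/videos/wwdc/2013/[^"]*-SD.mov)')


def subtitle_location(line):
    """Return (session, playlist_url) for a line of HTML, or None if it has no SD link."""
    m = RE_SD_VIDEO.search(line)
    if not m:
        return None
    video_url = m.group(1)
    # Strip the file name to get the session's directory URL.
    video_dir = video_url[:video_url.rindex('/')]
    # The last path component names the session (and the output directory).
    session = video_dir[video_dir.rindex('/') + 1:]
    return session, video_dir + '/subtitles/eng/prog_index.m3u8'


# Hypothetical sample line; the path components are made up for illustration.
sample = ('<li><a href="http://devstreaming.apple.com/videos/wwdc/2013/'
          'example-session/example-SD.mov">SD</a></li>')
print(subtitle_location(sample))
```

Each session's subtitle segments then end up in a directory named after that last path component, created in the current working directory.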
@dongkeyworld

Hello! My name is Donghwi Kim.
I need the subtitles for the WWDC 2013 session videos,
but I don't know Python.
Could you please explain how to use this script?
Thank you!
