import os, re, sys
import requests

# Matches the SD video link of each WWDC 2013 session.
RE_SD_VIDEO = re.compile(
    r'<a href="(http://devstreaming.apple.com/videos/wwdc/2013/[^"]*-SD.mov)')
# Matches the subtitle segment names listed in the playlist.
RE_WEBVTT = re.compile(r'fileSequence[0-9]+\.webvtt')

# stdin: dump of https://developer.apple.com/wwdc/videos/
for l in sys.stdin:
    m = RE_SD_VIDEO.search(l)
    if not m:
        continue
    video_url = m.group(1)
    print("downloading subtitle of video [" + video_url + "]")
    video_dir = video_url[:video_url.rindex('/')]
    session = video_dir[video_dir.rindex('/') + 1:]
    prog_index = requests.get(video_dir + '/subtitles/eng/prog_index.m3u8')
    # Reuse the session directory if a previous run already created it.
    if not os.path.isdir(session):
        os.mkdir(session)
    webvtt_names = RE_WEBVTT.findall(prog_index.text)
    for webvtt_name in webvtt_names:
        webvtt = requests.get(video_dir + '/subtitles/eng/' + webvtt_name)
        # Write raw bytes so non-ASCII subtitle text is saved unchanged.
        with open(os.path.join(session, webvtt_name), 'wb') as f:
            f.write(webvtt.content)
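To make the URL handling in the script more concrete, here is a small sketch that applies the same regex and slicing to a single line of HTML without touching the network. The helper name `subtitle_locations` and the sample URL (including its `abc123/100` path segments) are invented for illustration and are not real session data.

```python
import re

RE_SD_VIDEO = re.compile(
    r'<a href="(http://devstreaming.apple.com/videos/wwdc/2013/[^"]*-SD.mov)')

def subtitle_locations(html_line):
    """Return (session, playlist_url) for a line with an SD video link, else None."""
    m = RE_SD_VIDEO.search(html_line)
    if not m:
        return None
    video_url = m.group(1)
    # Everything before the last '/' is the directory that holds the video.
    video_dir = video_url[:video_url.rindex('/')]
    # The last path component of that directory is used as the session name.
    session = video_dir[video_dir.rindex('/') + 1:]
    return session, video_dir + '/subtitles/eng/prog_index.m3u8'

# Hypothetical input line; the path segments are made up.
sample = ('<a href="http://devstreaming.apple.com/videos/wwdc/2013/'
          'abc123/100/100-SD.mov">SD</a>')
print(subtitle_locations(sample))
```

Lines that do not contain an SD link return `None`, which mirrors the `continue` in the loop above.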
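I keep getting the following error:
Traceback (most recent call last):
File "extract.py", line 1, in <module>
import requests
ImportError: No module named requests
@shede333 It looks like the requests module isn't installed.
I'm on a Mac with the bundled Python. Isn't this module included by default?
Could you help me the rest of the way? How do I get that module so it can be imported?
Are there Chinese subtitles that can be grabbed for the WWDC 2012 and earlier videos?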
This Gist is forked from @nriley's extract.py.
Thanks @nriley
How to use:
Save a dump of https://developer.apple.com/wwdc/videos/ to ~/wwdc_video.html
python extract.py < ~/wwdc_video.html
Wait for the downloads to complete.
After the script finishes, combine all the webvtt files into a single file with wwdc_combine_webvtt.rb
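For readers who would rather stay in Python for the combine step, the following is a rough sketch of one way to do it. It is not a port of wwdc_combine_webvtt.rb, whose behavior is not described here. It assumes each segment starts with a `WEBVTT` header block terminated by a blank line, and it does not remove cues that are repeated across segment boundaries. All function names are hypothetical.

```python
import os
import re

RE_SEQ = re.compile(r'fileSequence([0-9]+)\.webvtt$')

def sequence_number(name):
    """Numeric part of a fileSequenceN.webvtt name, so that 10 sorts after 9."""
    return int(RE_SEQ.search(name).group(1))

def strip_header(segment_text):
    """Drop the leading WEBVTT header block (everything up to the first blank line)."""
    text = segment_text.replace('\r\n', '\n')
    if text.startswith('WEBVTT'):
        head, sep, body = text.partition('\n\n')
        return body if sep else ''
    return text

def combine_segments(segment_texts):
    """Join the segment bodies under a single WEBVTT header."""
    bodies = [strip_header(t).strip('\n') for t in segment_texts]
    return 'WEBVTT\n\n' + '\n\n'.join(b for b in bodies if b) + '\n'

def combine_session(session_dir):
    """Read every segment in a session directory in numeric order and combine them."""
    names = sorted((n for n in os.listdir(session_dir) if RE_SEQ.search(n)),
                   key=sequence_number)
    texts = []
    for name in names:
        with open(os.path.join(session_dir, name)) as f:
            texts.append(f.read())
    return combine_segments(texts)
```

Sorting by the numeric suffix matters because a plain string sort would place `fileSequence10.webvtt` before `fileSequence2.webvtt`.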