@su27
Created September 17, 2013 06:08
A tool to fetch subtitles from edx.org
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
The edx.org provides video for download, but the subtitle is
only available when watching video online.
This script can help to convert the online subtitle to local srt file.
Usage:
1. Open your Chrome inspector(or other debug tool)
2. Visit the video page, find the subtitle url(you can search for "subs")
3. run "python edx-sub.py [the url] > your_srt.srt
"""
from __future__ import print_function

import sys
from json import loads

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2


def main():
    # Download the subtitle JSON: parallel 'start', 'end' and 'text' lists.
    f = urlopen(sys.argv[1])
    json_subs = loads(f.read().decode('utf-8'))
    subs = zip(json_subs['start'], json_subs['end'], json_subs['text'])
    # SRT cue numbers conventionally start at 1.
    for i, s in enumerate(subs, 1):
        print("%s\n%s --> %s\n%s\n" % (
            i, conv_time(s[0]), conv_time(s[1]), s[2]))


def conv_time(num):
    # Treat num as milliseconds and format it as HH:MM:SS,mmm.
    seconds, ms = divmod(num, 1000)
    minutes, ss = divmod(seconds, 60)
    hh, mm = divmod(minutes, 60)
    return "%.2d:%.2d:%.2d,%.3d" % (hh, mm, ss, ms)


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("usage: 1. Use inspector to find the subtitle url, like https://courses.edx.org/123.sjson")
        print("       2. python %s [subs url] > sub.srt" % sys.argv[0])
        sys.exit()
    main()
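
To check the conversion without touching the network, the same logic can be run on an in-memory payload. The following is a minimal sketch, not part of the original script: `to_srt` is a hypothetical helper name, and the `sample` dictionary is made up for illustration. It only assumes what the script itself relies on, namely parallel `start`, `end` and `text` lists with times in milliseconds.

```python
def conv_time(num):
    # Same conversion as in the script: milliseconds -> HH:MM:SS,mmm.
    seconds, ms = divmod(num, 1000)
    minutes, ss = divmod(seconds, 60)
    hh, mm = divmod(minutes, 60)
    return "%.2d:%.2d:%.2d,%.3d" % (hh, mm, ss, ms)


def to_srt(json_subs):
    # Hypothetical helper: build the SRT text from an already-parsed payload.
    cues = zip(json_subs['start'], json_subs['end'], json_subs['text'])
    blocks = []
    for i, (start, end, text) in enumerate(cues, 1):
        blocks.append("%d\n%s --> %s\n%s\n" % (
            i, conv_time(start), conv_time(end), text))
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


# Made-up sample payload, for illustration only.
sample = {
    "start": [0, 3723004],
    "end": [2500, 3725500],
    "text": ["Hello.", "Welcome back."],
}

srt_text = to_srt(sample)
print(srt_text)
```

Here 3723004 ms splits into 1 hour, 2 minutes, 3 seconds and 4 milliseconds, so the second cue starts at `01:02:03,004`.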