@nonsleepr
Last active October 25, 2015 09:09
FutureLearn Video downloader
#!/bin/bash
#
# Usage:
# > futurelearn_dl.sh login@email.com password course-name week-id
# where *login@email.com* and *password* are your credentials,
# *course-name* is the course name from the URL
# and *week-id* is the week ID from the URL.
#
# E.g. to download all videos from the page https://www.futurelearn.com/courses/corpus-linguistics/todo/238
# execute the following command:
# > futurelearn_dl.sh login@email.com password corpus-linguistics 238
#
email=$1
password=$2
course=$3
weekid=$4
# URL suffix appended to the vzaar download link (HD variant)
HD=/hd
# Fetch the sign-in page (saving cookies) and extract the authenticity token
authToken=$(curl -s -L -c cookies.txt 'https://www.futurelearn.com/sign-in' | \
  grep -Po '(?<=authenticity_token" value=")[^"]+')
# Download the video embedded on the step page given as $1
function dlvid {
  local vzid vzurl
  vzid=$(curl -s -b cookies.txt "$1" | grep -Po '(?<=video-)[0-9]+')
  vzurl="https://view.vzaar.com/${vzid}/download${HD}"
  curl -O -J -L "$vzurl"
}
# Sign in: POST the credentials and the auth token (curl URL-encodes each field)
curl -X POST -s -L -e 'https://www.futurelearn.com/sign-in' -c cookies.txt -b cookies.txt \
  --data-urlencode "email=$email" \
  --data-urlencode "password=$password" \
  --data-urlencode "authenticity_token=$authToken" 'https://www.futurelearn.com/sign-in' > /dev/null
# Fetch the week's to-do page, extract the video step links and download each video
curl -s -L -b cookies.txt "https://www.futurelearn.com/courses/${course}/todo/${weekid}" | \
  grep -B8 'headline.*video' | grep -o '/courses[^"]*' | \
  while read -r line; do
    url="https://www.futurelearn.com${line}/progress"
    dlvid "$url"
  done
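
The token extraction in the script depends on 'grep -P', a GNU grep extension that BSD/macOS grep does not provide. Below is a minimal sketch of a more portable variant using sed. It assumes the token appears on a single line as authenticity_token" value="...", which is what the original regex implies. 'extract_token' is a hypothetical helper name, not part of the gist.

```shell
# Hypothetical helper: read HTML on stdin and print the value of the
# authenticity_token input (first matching line only).
# Assumes the name and value attributes are adjacent on one line,
# as the original grep pattern expects.
extract_token() {
  sed -n 's/.*authenticity_token" value="\([^"]*\)".*/\1/p' | head -n 1
}
```

In the script it could be fed from curl in place of the grep call, e.g. authToken=$(curl -s -L -c cookies.txt 'https://www.futurelearn.com/sign-in' | extract_token).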
@mjbright

Are you still using this script?
I tried it today for the first time.

It wasn't working for me. It looks like the 'headline.*video' and 'vzid' code doesn't work with the current pages, at least for the course "talk-the-talk" I was trying to download.

I created an updated version which allowed me to pull down videos and text for all weeks of a course.
It's pretty horrible code though - I could clean it up and make it available if anyone's interested.

Anyway, thanks for this starting script, it means I can really do some FutureLearn courses now ....

@mjjimenez

@mjbright I would really appreciate it if you could post a link to your script to download futurelearn courses. You don't need to clean it up, I just need it to work. Thanks!

@nonsleepr
Author

@mjbright, @mjjimenez It appears GitHub doesn't have notifications for Gist comments.
I created this script as a one-off and used it once or twice after that.
FutureLearn has changed the site layout since then and added links to download videos. I've updated my script; it should be more stable now (with regard to auth at least).

@MarwanShehata

Hello, I tried to use this script on Windows using Cygwin, but I couldn't get it to work.

Python version is 3.4.3
Windows 10 64bit

@deepakjois

Updated this script a little bit, because it wasn’t working for me.

I also added aria2 support to enable me to resume downloads (and skip over completed downloads) if things got interrupted midway.
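
For readers unfamiliar with aria2, here is a minimal sketch of what a resumable download call can look like. It is not necessarily how the linked gist does it. 'fetch_resumable' is a hypothetical helper name. The only aria2c option used is '--continue', which asks aria2c to continue a partially downloaded file.

```shell
# Hypothetical helper: download one URL with aria2c, resuming a partial
# file if one is already present in the current directory.
fetch_resumable() {
  aria2c --continue=true "$1"
}
```

It would be called with a direct download URL, e.g. fetch_resumable "$vzurl" inside the original 'dlvid' function in place of the 'curl -O -J -L' line.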

https://gist.github.com/deepakjois/e2e75d3eabae3dc71253

@mjbright

@thelostelite It looks like you just don't have curl installed. You need to rerun the Cygwin setup.exe and select the curl package (in the 'Net' category).
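
A quick way to confirm this from the Cygwin shell is to ask the shell whether it can find curl. This is a minimal sketch; 'require_cmd' is a hypothetical helper name, and 'command -v' is the standard POSIX way to look up a command.

```shell
# Hypothetical helper: succeed if the named command is on PATH,
# otherwise print a message to stderr and return non-zero.
require_cmd() {
  command -v "$1" >/dev/null 2>&1 || {
    echo "missing required command: $1" >&2
    return 1
  }
}
```

Running require_cmd curl before the rest of the script would make a missing curl fail early with a clear message.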

@mjbright

OK, I couldn't put this off any longer ...

I scrapped the bash script (it still couldn't log in), although it is still there in old commits of the repo:
https://github.com/mjbright/futurelearn-dl

and we now have a Python3 version.

Current status is that I'm successfully obtaining mp4 and pdf downloadable urls.
By the time you read this it should be doing basic downloads ... and then I have to do something else on this Sunday ...

I hope this helps people.

I won't have much time to update this before December, but hope to evolve it.

The repo is here:
https://github.com/mjbright/futurelearn-dl

The biggest todo items once downloading is implemented are:

  • fix the "occasional" Unicode errors (tricky)
  • add proper command-line arguments
  • handle a week at a time
  • don't repeat downloads

@mjbright

OK, I've published something useable (for me ... YMMV).

It downloads most mp4 and pdf files for a course.

It can download just one week, and it avoids downloading files which already exist
(it skips the download if the destination file exists, so be careful if you move or rename files).
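
The skip-if-exists behaviour described here can be sketched in a few lines of shell. This is only an illustration in the gist's language, not code from the futurelearn-dl repo (which is Python 3). 'download_if_missing' is a hypothetical helper name.

```shell
# Hypothetical helper: run a download command only if the target file is absent.
# Usage: download_if_missing DEST_FILE COMMAND [ARGS...]
download_if_missing() {
  local dest=$1
  shift
  if [ -e "$dest" ]; then
    # Only existence is checked, so a moved or renamed file is fetched again.
    echo "skip: $dest already exists"
    return 0
  fi
  "$@"
}
```

Example call: download_if_missing week1.mp4 curl -L -o week1.mp4 "$url". Because the check only looks at the file name, a truncated file left by an interrupted download would also be skipped.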

There are still some unhandled Unicode errors, and proper command-line argument handling is still needed.

I'll stop spamming here now.
Follow the repo if you're interested.
https://github.com/mjbright/futurelearn-dl

NOTE: I won't have much time to look at issues until December, but please file issues anyway.

Reports of functionality problems, or just comments on bad style, are more than welcome.
