@janmoesen
Created Sep 24, 2012
Download Sporza videos based on web page URL
#!/bin/bash
# Print a message on stderr so that stdout stays clean.
function log {
    echo "$@" 1>&2;
}

for page_url; do
    log "Now processing page: $page_url";

    # Assemble the playlist URL. It is stored in two JavaScript properties in
    # the HTML page.
    playlist_url='';
    html="$(wget -qO- "$page_url")";
    while read -r part; do
        playlist_url="$playlist_url/$part";
    done < <(awk -F= '/^vars.*iphone/ { gsub(/['\'';]/, ""); print $2; }' <<< "$html");
    playlist_url="${playlist_url#/}/playlist.m3u8";
    log "Playlist URL as in the HTML page: $playlist_url";
    base_url="${playlist_url%/*}";

    # The page title becomes the prefix of every output file.
    read -r title < <(awk -F= '/^vars.*statProgram/ { gsub(/['\'';]/, ""); print $2; }' <<< "$html");
    title="${title//_/-}";

    # Clear the playlist files.
    remote_playlist_output_filename="$title.remote.m3u8";
    local_playlist_output_filename="$title.local.m3u8";
    log "Saving local playlist to $local_playlist_output_filename";
    printf '' >| "$remote_playlist_output_filename";
    printf '' >| "$local_playlist_output_filename";

    # Getting the playlist returns the same URL but with a session ID appended.
    playlist_url="$base_url/$(
        wget --timeout=1 -qO- "$playlist_url" |
            awk '/^[^#]/ && ! /audioonly/ { print; exit; }'
    )";
    log "Playlist URL with the session ID: $playlist_url";

    # Now, convert the relative filenames in the playlist to absolute URLs.
    wget --timeout=1 -qO- "$playlist_url" | while read -r line; do
        media_url="$base_url/$line"
        if [ "${line:0:1}" = "#" ]; then
            # Playlist tags and comments are copied to both playlists as-is.
            echo "$line" >> "$remote_playlist_output_filename";
            echo "$line" >> "$local_playlist_output_filename";
        else
            echo "$media_url" >> "$remote_playlist_output_filename";

            # Download the URL, too.
            log "Downloading media item: $media_url";
            media_filename="$title-${line%\?*}";
            wget --timeout=1 -q -O "$media_filename" "$media_url" 1>&2;
            echo "$media_filename" >> "$local_playlist_output_filename";
        fi;
    done;

    ls -halF "$local_playlist_output_filename";
done;
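
The playlist URL extraction can be tried without touching the network. The following is a minimal sketch that runs the same awk expression against a made-up HTML fragment. The property names `iphoneServer` and `iphonePath`, the example.com URL, and the function name `extract_playlist_url` are assumptions for illustration; the real Sporza markup may differ.

```shell
#!/bin/bash
# Hypothetical HTML fragment; the real page markup is an assumption.
sample_html="<script>
vars['iphoneServer'] = 'http://example.com';
vars['iphonePath'] = 'videos/race';
vars['statProgram'] = 'some_race_title';
</script>"

# Join the value of every matching "iphone" property with slashes and
# append the playlist filename, mirroring the loop in the script above.
extract_playlist_url() {
    local html="$1" url='' part
    while read -r part; do
        url="$url/$part"
    done < <(awk -F= '/^vars.*iphone/ { gsub(/['\'';]/, ""); print $2; }' <<< "$html")
    printf '%s\n' "${url#/}/playlist.m3u8"
}

extract_playlist_url "$sample_html"
```

Because awk splits each line on `=`, a value that itself contains an equals sign would be cut short; the sketch inherits that limitation from the original expression.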

@janmoesen commented Sep 24, 2012

I missed the UCI Road World Championships road race because we were insulating the bedroom-to-be, and the Sporza site (and likely other VRT sites) was (and still is) having a lot of capacity issues with video. The desktop sites use the Real Time Streaming Protocol, but the iPhone version uses plain HTTP, which seemed to be having fewer issues.

This script downloads the MPEG-4 video files for the iPhone so you can watch them on your desktop.
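
The core of the script is the playlist rewrite: tag lines (starting with `#`) are passed through, and every other line is treated as a filename relative to the playlist's base URL. A minimal sketch of that step in isolation, reading the playlist from stdin instead of from wget; the function name `absolutize_playlist` and the sample playlist lines are hypothetical:

```shell
#!/bin/bash
# Rewrite a playlist read from stdin: lines starting with "#" pass through
# unchanged, every other line is prefixed with the base URL.
absolutize_playlist() {
    local base_url="$1" line
    while IFS= read -r line; do
        if [ "${line:0:1}" = "#" ]; then
            printf '%s\n' "$line"
        else
            printf '%s\n' "$base_url/$line"
        fi
    done
}

# Hypothetical playlist content and base URL, for illustration only.
printf '%s\n' '#EXTM3U' '#EXTINF:10,' 'segment-1.ts?session=abc' |
    absolutize_playlist 'http://example.com/videos/race'
```

Keeping the rewrite in a function that reads stdin makes it possible to check the output on a saved playlist before any downloading happens.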

Usage: ./sporza.sh 'http://www.sporza.be/cm/sporza/videozone/MG_sportnieuws/MG_wielrennen/1.1437937'

This will create a file called gilbert-wint-in-valkenburg-en-is-wereldkampioen-id-1-1437937.local.m3u8, and several similarly-named video files.
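
The file names come from two parameter expansions in the script: underscores in the page title are replaced with hyphens, and the query string (the session ID) is stripped from each playlist entry. A small sketch of just that naming logic; the helper names `normalize_title` and `media_filename` and the sample values are hypothetical:

```shell
#!/bin/bash
# Replace every underscore in the title with a hyphen.
normalize_title() {
    local title="$1"
    printf '%s\n' "${title//_/-}"
}

# Build the local filename for one playlist entry: prefix it with the
# title and drop everything from the "?" onwards.
media_filename() {
    local title="$1" line="$2"
    printf '%s\n' "$title-${line%\?*}"
}

title="$(normalize_title 'some_race_title')"        # hypothetical title
echo "$title.local.m3u8"
media_filename "$title" 'segment-1.ts?session=abc'  # hypothetical entry
```

With these sample values the sketch prints `some-race-title.local.m3u8` and `some-race-title-segment-1.ts`.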