@SteveBenner
Created May 17, 2015 16:48
Web scraper that downloads archived episodes of 'Trance Around The World' podcast
#!/usr/bin/env ruby
#
# This is a web scraper which downloads mp3 files from the TATW web archives.
#
# Usage: run this script with the first and last episode numbers as arguments, e.g. 430 450
#
# Dependencies: requires 'aria2' to be installed (uses the CLI tool 'aria2c')
#
require 'pathname'
require 'colorize' # Provides the String color methods (.blue, .red, ...) used in the output below
PLAYER_URL = 'http://www.trancearoundtheworld.com/player/play.php?id='
DOWNLOAD_URL = 'http://static.trancearoundtheworld.com/archives'
LOCAL_DOWNLOAD_DEST = '~/Downloads'
CMD = 'aria2c'
Dir.chdir Pathname(LOCAL_DOWNLOAD_DEST).expand_path
# Download a range of TATW episodes using the Aria2 CLI
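unless ARGV[0] && ARGV[1]
  abort 'ERROR: Supply an episode range (first and last episode numbers) as parameters to the script, e.g. 430 450'
end
# Parse the arguments as base-10 integers so the range iterates numerically
SETS = Range.new Integer(ARGV[0], 10), Integer(ARGV[1], 10)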
puts "Attempting to download #{SETS.count} TATW episodes!"
# Episodes take about 3-5 minutes each to download with standard cable
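SETS.each do |ep|
  # `system` returns true only when aria2c exits with status 0; backticks return
  # the command's output as a String, which is always truthy in Ruby
  if system(CMD, "#{DOWNLOAD_URL}/TATW#{ep}.mp3") && File.exist?("#{Dir.pwd}/TATW#{ep}.mp3")
    puts "...Episode #{ep.to_s.blue} was successfully downloaded!"
  else
    puts '...Failed '.red + 'to download episode '.yellow + ep.to_s.light_red
  end
end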
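
The argument handling and URL construction in the script above can be pulled into small functions that are easy to check without touching the network. This is only a sketch: `ARCHIVE_URL`, `episode_range`, and `episode_urls` are hypothetical names that do not appear in the script, and only the URL pattern (`.../archives/TATW<number>.mp3`) is taken from it. Whether those URLs still resolve is not something this code verifies.

```ruby
# Assumption: same base URL as DOWNLOAD_URL in the script, under a different
# (hypothetical) constant name to avoid redefining the original.
ARCHIVE_URL = 'http://static.trancearoundtheworld.com/archives'

# Turn two command-line strings into an ascending Range of Integers.
# Integer(str, 10) raises ArgumentError for non-numeric input such as "abc".
def episode_range(first, last)
  a = Integer(first, 10)
  b = Integer(last, 10)
  raise ArgumentError, 'episode numbers must be positive' unless a.positive? && b.positive?

  a <= b ? (a..b) : (b..a)
end

# Build the mp3 URL for every episode in the range.
def episode_urls(range)
  range.map { |ep| "#{ARCHIVE_URL}/TATW#{ep}.mp3" }
end
```

Something like `episode_urls(episode_range(ARGV[0], ARGV[1]))` would then yield the list of URLs to hand to `aria2c`, one per episode.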
@doctorwho42

Hey Steve, I was wondering if you have a TATW archive I could download from you. I was in the process of writing a script like this when I found the TATW archive missing, either because the web address changed or because Above & Beyond are hiding their TATW archives now that they have moved on to their new show, Group Therapy.