@cameroncooke
Created August 22, 2018 21:32
Downloads videos from nsscreencast (requires paid account)
# Downloads ALL the videos on NSScreencast.
# Usage: `EMAIL=<your email> PASSWORD=<your password> ruby <this script>`
require 'httparty'
require 'nokogiri'
require 'pry'
require "mechanize"
require "parallel"
class Scraper
  def initialize
    # Log in once; Mechanize keeps the session cookie for the later requests.
    mechanize = Mechanize.new
    mechanize.post("https://www.nsscreencast.com/user_sessions", {"email" => ENV["EMAIL"], "password" => ENV["PASSWORD"]})
    # Treat every response as a file download instead of parsing it as a page.
    mechanize.pluggable_parser.default = Mechanize::Download

    total_pages = 16 # hard-coded number of episode listing pages
    failed_downloads = Array.new

    1.upto(total_pages) do |i|
      doc = HTTParty.get("https://nsscreencast.com/episodes?page=#{i}#episodes")
      parse_page = Nokogiri::HTML(doc.body)
      links = parse_page.css("div.episode").css("div.episode_thumbnail").css("a")

      # Parallel.each(links) do |link|
      links.each do |link|
        # "/episodes/<id>-<title>" becomes "<id>-<title>"
        path = link.attributes['href'].value.sub('/episodes/', '')
        video_url = "https://nsscreencast.com/episodes/#{path}.mp4"
        file_name = "#{path}.mp4"

        if File.file?(file_name)
          puts "[EXISTS] #{file_name} has already been downloaded for video at URL:\n#{video_url}\n\n"
        else
          puts "[DOWNLOADING] #{video_url}\n"
          begin
            video = mechanize.get(video_url)
          rescue Mechanize::ResponseCodeError => e
            # Record the failure and carry on with the next episode.
            puts "[FAILED] #{video_url} (HTTP #{e.response_code})\n\n"
            failed_downloads.push(video_url)
          else
            video.save(file_name)
            puts "Saved #{video_url}\n\n"
          end
        end
      end
    end

    puts("\n\nThe following downloads failed: \n[\n#{failed_downloads.join("\n")}\n]")
  end
end

_scraper = Scraper.new
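
The script builds each download URL and local file name from the episode link's `href`. As a rough sketch of that step in isolation, the helper below does the same derivation without any network access, so it can be checked on its own. `episode_download_info`, `EPISODE_PREFIX`, and the sample href are hypothetical names chosen for illustration and are not part of the original gist. The sketch also assumes that hrefs which do not start with `/episodes/` should be skipped, which the original script does not do.

```ruby
# Hypothetical helper, not part of the original script.
EPISODE_PREFIX = '/episodes/'.freeze

# Turns an episode href such as "/episodes/<id>-<title>" into the pieces
# the scraper needs. Returns nil for hrefs that are not episode links
# (an assumption; the original script does not filter them).
def episode_download_info(href, base_url: 'https://nsscreencast.com')
  return nil unless href.start_with?(EPISODE_PREFIX)

  slug = href.sub(EPISODE_PREFIX, '')
  return nil if slug.empty?

  {
    slug: slug,
    episode_id: slug[/\A\d+/], # leading digits, or nil when there are none
    video_url: "#{base_url}/episodes/#{slug}.mp4",
    file_name: "#{slug}.mp4"
  }
end

# Example with a made-up href:
info = episode_download_info('/episodes/352-example-title')
puts info[:video_url]
puts info[:file_name]
```

Keeping this logic in a plain function would make it possible to test the URL handling without logging in or downloading anything.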