@majie1993
Forked from cameroncooke/nsscreencast_downloader.rb
Last active August 22, 2020 05:04
Downloads videos from nsscreencast (requires paid account)
# This script downloads ALL the videos on NSScreencast.
# Usage: `EMAIL=<your email> PASSWORD=<your password> ruby nsscreencast_downloader.rb`
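# Library names passed to require are case-sensitive on most file systems.
require "httparty"
require "nokogiri"
require "mechanize"
require "parallel" # only needed if the commented-out Parallel.each line is enabled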
class Scraper
  def initialize
    # Log in with Mechanize so the session is reused for every video request.
    agent = Mechanize.new
    page = agent.get("https://nsscreencast.com/login")
    form = page.forms[1] # the login form is the second form on the page
    form.email = ENV["EMAIL"]
    form.password = ENV["PASSWORD"]
    agent.submit(form)

    # Handle content types without a dedicated parser (such as the video files)
    # as Mechanize::Download objects, which can be saved to disk.
    agent.pluggable_parser.default = Mechanize::Download

    total_pages = 20 # hard-coded number of listing pages to scan
    failed_downloads = []

    1.upto(total_pages) do |i|
      response = HTTParty.get("https://nsscreencast.com/episodes?page=#{i}#episodes")
      parse_page = Nokogiri::HTML(response.body)
      links = parse_page.css("div.episode").css("div.episode_thumbnail").css("a")

      # Parallel.each(links) do |link|
      links.each do |link|
        path = link.attributes["href"].value.sub("/episodes/", "")
        video_url = "https://nsscreencast.com/episodes/#{path}.mp4"
        file_name = "#{path}.mp4"

        if File.file?(file_name)
          puts "[EXISTS] #{file_name} has already been downloaded for video at URL:\n#{video_url}\n\n"
        else
          puts "[DOWNLOADING] #{video_url}\n"
          begin
            video = agent.get(video_url)
          rescue Mechanize::ResponseCodeError => e
            # Record the failure and move on, so the summary below is still printed.
            failed_downloads.push(video_url)
            puts e
          else
            video.save(file_name)
            puts "Saved #{video_url}\n\n"
          end
        end
      end
    end

    puts("\n\nThe following downloads failed: \n[\n#{failed_downloads.join("\n")}\n]")
  end
end

Scraper.new
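
The path handling inside the download loop could be pulled out into a small pure function, which makes it easy to check without logging in or touching the network. The following is a minimal sketch, not part of the original gist: `episode_target`, `EpisodeTarget`, and `BASE_URL` are hypothetical names, and the example href is made up. Only the `/episodes/<path>.mp4` URL pattern is taken from the script above.

```ruby
# Hypothetical helper (not in the original gist): turns an episode link's href
# into the values the download loop needs.
BASE_URL = "https://nsscreencast.com"

EpisodeTarget = Struct.new(:episode_id, :video_url, :file_name)

def episode_target(href)
  # "/episodes/123-some-title" -> "123-some-title"
  path = href.sub(%r{\A/episodes/}, "")
  # Leading digits are treated as the episode number; nil if the path has none.
  episode_id = path[/\A\d+/]
  EpisodeTarget.new(episode_id, "#{BASE_URL}/episodes/#{path}.mp4", "#{path}.mp4")
end

# Usage with a made-up href:
target = episode_target("/episodes/123-some-title")
puts target.video_url
puts target.file_name
```

Keeping this logic free of Mechanize and Nokogiri means the URL and file-name rules can be changed in one place if the site's link format differs from what the script assumes.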