fp more pronporn scrapper
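#!/usr/bin/env ruby
require 'nokogiri'
require 'open-uri'
require 'net/http'
require 'uri'
require 'fileutils'

# Last path segment of a URL path, used as the local file name
def file_name path
  File.basename path
end

# Make results dir
dir = File.join Dir.pwd, "results"
FileUtils.mkdir_p dir

# Request URL
puts "Page URL:"
page_url = gets.chomp
# URI.open comes from open-uri; plain Kernel#open no longer fetches URLs on Ruby 3+
page = Nokogiri::HTML(URI.open(page_url))

# Collect image URLs: skip <img> tags without src, resolve relative paths,
# and keep only http(s) targets
links = page.css( "img" ).map { |img| img['src'] }.compact
uris  = links.map { |src| URI.join page_url, src }.select { |uri| uri.is_a? URI::HTTP }
hosts = uris.group_by { |uri| [uri.hostname, uri.port, uri.scheme] }

# Reuse one connection per host; the block form closes it when done
hosts.each do |(hostname, port, scheme), host_uris|
  Net::HTTP.start(hostname, port, use_ssl: scheme == "https") do |http|
    host_uris.each do |uri|
      name = file_name uri.path
      next if name.empty? || name == "/"
      puts "Downloading '#{name}'"
      # request_uri keeps the query string, which uri.path would drop
      resp = http.get uri.request_uri
      File.open(File.join(dir, name), "wb") do |file|
        file.write resp.body
      end
    end
  end
end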
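
Image `src` attributes are often relative (`thumbs/a.png`), root-relative (`/img/b.jpg`), or protocol-relative (`//cdn.example.net/c.gif`), so they usually need to be resolved against the page URL before they can be fetched. The following is a minimal sketch of that step using `URI.join` from the standard library. The helper name `resolve_src` and the example URLs are assumptions made for illustration, not part of the original script.

```ruby
require 'uri'

# Hypothetical helper: resolve an <img> src against the page URL.
# Returns nil for values that cannot be fetched over HTTP(S),
# such as data: URIs or strings that are not valid URIs.
def resolve_src(page_url, src)
  uri = URI.join(page_url, src)
  uri.is_a?(URI::HTTP) ? uri : nil
rescue URI::InvalidURIError
  nil
end

page_url = "https://example.com/gallery/index.html"
puts resolve_src(page_url, "thumbs/a.png")             # => https://example.com/gallery/thumbs/a.png
puts resolve_src(page_url, "/img/b.jpg")               # => https://example.com/img/b.jpg
puts resolve_src(page_url, "//cdn.example.net/c.gif")  # => https://cdn.example.net/c.gif
```

Returning `nil` for unusable values lets the caller drop them with `compact` rather than wrapping every download in its own error handling.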