@benmac3
Created September 25, 2012 05:54
Generic content transform for URL
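require 'httparty'

# Fetches a URL and hands the response to an optional block for transformation.
class TransformHttpResponse
  include HTTParty

  # With a block, yields the response and returns nil; without one, prints the body.
  def scrape(url)
    response = self.class.get(url)
    if block_given?
      yield response
      return
    else
      puts response
    end
  end
end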
➜ Documents irb
1.9.3-p194 :001 > load 'transform_http_response.rb'
=> true
1.9.3-p194 :002 > thr = TransformHttpResponse.new
=> #<TransformHttpResponse:0x007ff24b9322f0>
1.9.3-p194 :003 > thr.scrape('http://www.youtube.com/robots.txt')
# robots.txt file for YouTube
# Created in the distant future (the year 2000) after
# the robotic uprising of the mid 90's which wiped out all humans.
User-agent: Mediapartners-Google*
Disallow:
User-agent: *
Disallow: /bulletin
Disallow: /comment
Disallow: /forgot
Disallow: /get_video
Disallow: /get_video_info
Disallow: /login
Disallow: /results
Disallow: /signup
Disallow: /t/terms
Disallow: /t/privacy
Disallow: /verify_age
Disallow: /videos
Disallow: /watch_ajax
Disallow: /watch_popup
Disallow: /watch_queue_ajax
=> nil
1.9.3-p194 :004 > thr.scrape('http://www.youtube.com/robots.txt') {|response| puts response.split("\n").select { |line| line =~ /^#/ }.map { |line| line[2..-1] }.join("\n")}
robots.txt file for YouTube
Created in the distant future (the year 2000) after
the robotic uprising of the mid 90's which wiped out all humans.
=> nil
1.9.3-p194 :005 > thr.scrape('http://www.youtube.com/robots.txt') {|response| puts response.split("\n").select { |line| line =~ /^Disallow/ }.map { |line| line[10..-1] }.join("\n")}
/bulletin
/comment
/forgot
/get_video
/get_video_info
/login
/results
/signup
/t/terms
/t/privacy
/verify_age
/videos
/watch_ajax
/watch_popup
/watch_queue_ajax
=> nil
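The two blocks above slice each line at a fixed offset (`line[2..-1]`, `line[10..-1]`), which depends on the exact spacing after `#` or `Disallow:`. As a minimal sketch of an alternative, a regex capture can extract whatever follows the directive name. The helper `directive_values` and the `SAMPLE` text are hypothetical, not part of the original gist, and the sample is a shortened copy of the output shown earlier so the snippet runs without a network call.

```ruby
# Hypothetical helper (not in the original gist): returns the values that
# follow a given robots.txt directive, using a regex capture instead of a
# fixed character offset. Lines with an empty value are skipped.
def directive_values(body, directive)
  pattern = /^#{Regexp.escape(directive)}:[ \t]*(.*)$/i
  body.to_s.each_line
      .map { |line| line.chomp[pattern, 1] } # capture group 1, or nil if no match
      .compact
      .reject(&:empty?)
end

# Shortened sample taken from the session output above.
SAMPLE = <<~ROBOTS
  # robots.txt file for YouTube
  User-agent: Mediapartners-Google*
  Disallow:
  User-agent: *
  Disallow: /bulletin
  Disallow: /comment
ROBOTS

puts directive_values(SAMPLE, 'Disallow')
```

Assuming the class above is loaded, the same helper could be passed to `scrape` as `thr.scrape(url) { |response| puts directive_values(response, 'Disallow') }`, since `to_s` is called on whatever the block receives.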