@sethladd
Created August 29, 2011 05:08
Pulling images from Post Secret
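#!/usr/bin/ruby
# Download the images on the Post Secret front page and build a dated HTML index of them.
require 'rubygems'
require 'hpricot'
require 'open-uri'
require 'fileutils' # replaces 'ftools', which was removed in Ruby 1.9

# Kernel#open stopped accepting URLs in Ruby 3.0; on Ruby 2.5+ use URI.open instead.
doc = Hpricot(open("http://postsecret.blogspot.com/"))
images = []

# Blogger-hosted photos carry an id containing BLOGGER_PHOTO_ID.
(doc/"//img").select { |img| img.attributes['id'] =~ /BLOGGER_PHOTO_ID/ }.each do |img|
  img_source = open(img.attributes['src'])
  # Mirror the remote path (everything after blogger.com/) under images/.
  img_location = 'images/' + img.attributes['src'].sub(/^.*blogger\.com\//, '')
  FileUtils.mkdir_p(File.dirname(img_location))
  # Binary mode keeps the image bytes intact on every platform.
  File.open(img_location, 'wb') { |f| f.write(img_source.read) }
  images << img_location
end

# Write one <div><img></div> per downloaded image to YYYY-MM-DD.html.
html_filename = Time.now.strftime("%Y-%m-%d") + ".html"
File.open(html_filename, "w") do |f|
  images.each do |img|
    f.write("<div><img src=\"#{img}\" /></div>\n")
  end
end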
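
The least obvious step in the script is how an image URL becomes a local file path. The sketch below isolates that step using only the standard library. The helper name local_image_path and the sample URL are made up for illustration and are not part of the original gist.

```ruby
# Hypothetical helper (not in the original script): compute the local path
# the script would write a downloaded image to.
def local_image_path(src, root = 'images')
  # Drop everything up to and including "blogger.com/", keep the rest.
  File.join(root, src.sub(/^.*blogger\.com\//, ''))
end

# Made-up URL, for illustration only.
sample = 'http://example.blogger.com/photos/2011/secret.jpg'
puts local_image_path(sample)
```

If the URL does not contain blogger.com/, sub leaves it unchanged, so the whole URL ends up under images/. A sturdier version might fall back to the URL's basename in that case.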