@rlisowski
Created May 27, 2011 11:28
Get image URLs from an HTTP page
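require 'open-uri'
require 'hpricot'

# Returns the image URLs found at +url+: the URL itself if it points directly
# at an image, otherwise the absolute src of every <img> on the page.
def get_images(url)
  # \A and \z anchor the whole string (^ and $ only match at line boundaries).
  return [url] if url =~ /\A.*\.(jpg|jpeg|png)\z/i

  uri = URI.parse(url)
  # Settings is the application's own config object holding the proxy details.
  proxy = "http://#{Settings.proxy.address}:#{Settings.proxy.port}"
  doc = Hpricot(uri.open(proxy: proxy))
  doc.search("img").map do |e|
    src = e[:src].to_s.strip
    next if src.empty? # skip <img> tags without a usable src
    # URI.join handles root-relative, page-relative and protocol-relative paths.
    URI.join(uri, src).to_s
  end.compact
end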
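An img src can be root-relative, page-relative, protocol-relative, or already absolute. The following is a minimal sketch of resolving each form against the page URL with the standard library's URI.join. The helper name resolve_src and the example.com URLs are made up for illustration and are not part of the gist.

```ruby
require 'uri'

# Hypothetical helper: resolve an <img> src against the URL of the page
# it was found on, returning an absolute URL string.
def resolve_src(page_url, src)
  URI.join(page_url, src).to_s
end

page = "http://example.com/blog/post.html"

resolve_src(page, "/img/a.png")                  # => "http://example.com/img/a.png"
resolve_src(page, "b.png")                       # => "http://example.com/blog/b.png"
resolve_src(page, "//cdn.example.com/c.png")     # => "http://cdn.example.com/c.png"
resolve_src(page, "https://other.example/d.png") # => "https://other.example/d.png"
```

URI.join raises URI::InvalidURIError for a src it cannot parse (one containing spaces, for example), so a caller that scrapes arbitrary pages may want to rescue that error and skip the image.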