@adamico
Last active December 23, 2015 06:09
Nokogiri blog parsing for http://blog.shopittome.com/
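require 'open-uri'
require 'nokogiri'

# Fetch and parse the page. URI.open is used because Kernel#open
# no longer accepts URLs as of Ruby 3.0.
doc = Nokogiri::HTML(URI.open("http://www.threescompany.com/"))

# Titles and post bodies are matched by position in the document.
titles = doc.css("#content_inner h2")
containers = doc.css("#content_inner .format_text")

articles = []
titles.each_with_index do |title, i|
  # Select the images in this post body, dropping the last match.
  images = containers[i].css("div > img, a > img")[0..-2]
  images = images.map { |image| image["src"] }
  articles << { :title => title.content, :images => images }
end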
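
The script above pairs each title with the container at the same index, so it raises a NoMethodError if the page has more titles than containers. Below is a minimal sketch of the same pairing step on plain arrays, with no network access or Nokogiri. The helper name `build_articles` and the sample data are hypothetical and not part of the original gist. It uses `zip` so that a title without a matching container gets an empty image list instead of an error.

```ruby
# Hypothetical helper (not in the original gist): pairs each title with the
# image URLs at the same position and drops the last URL of each list,
# mirroring the [0..-2] slice used in the script.
def build_articles(titles, image_lists)
  titles.zip(image_lists).map do |title, urls|
    # zip pads with nil when there are fewer image lists than titles.
    urls ||= []
    { :title => title, :images => urls[0..-2] }
  end
end

# Made-up sample data standing in for the parsed page.
sample_titles = ["First post", "Second post", "Third post"]
sample_images = [
  ["a.jpg", "b.jpg", "c.jpg"],
  ["d.jpg"]
]

articles = build_articles(sample_titles, sample_images)
articles.each { |article| puts "#{article[:title]}: #{article[:images].inspect}" }
```

With real Nokogiri results, the same idea would apply by converting the node sets with `to_a` before zipping. Whether a missing container should be skipped or reported is a design choice the original script leaves open.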