@ssaunier
Forked from Papillard/timeout_pub_scraper.rb
Last active November 8, 2017 17:15
TimeOut best pubs scraper
require "open-uri"
require "nokogiri"

url = "https://www.timeout.com/london/bars-and-pubs/the-100-best-bars-and-pubs-in-london"
# URI.open is required on Ruby 3.0+, where Kernel#open no longer accepts URLs
html_file = URI.open(url)
doc = Nokogiri::HTML(html_file)

# Print the image URL, name, and address of the first 12 listed bars
doc.search(".feature-item").take(12).each do |bar|
  p bar.search("img")[0].attr("src")            # image_url
  p bar.search("h3 a")[0].text                  # name
  p bar.search(".listings_flag")[0].text.strip  # address
  puts "*" * 50
end