@ahmdrefat
Created June 28, 2012 08:28
Scraping newspaper names from kiosko
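require 'open-uri'
require 'nokogiri'

BASE_URL = 'http://en.kiosko.net'

# Map each country name in the site menu to the absolute URL of its page.
# URI.open is used because Kernel#open no longer opens URLs as of Ruby 3.0.
home_page = Nokogiri::HTML(URI.open("#{BASE_URL}/"))
countries = {}
home_page.css('#menu a').each do |a|
  countries[a.content] = BASE_URL + a['href']
  puts countries[a.content]
end

# Collect the newspaper names (the alt text of each front-page image)
# per country. Each country maps to an array, so every name is kept
# instead of only the last one found.
countries_newspapers = Hash.new { |hash, key| hash[key] = [] }
countries.each do |country, url|
  country_page = Nokogiri::HTML(URI.open(url))
  country_page.css('.line li a img').each do |img|
    name = img['alt']
    puts name
    countries_newspapers[country] << name
  end
end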
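
The second loop collects newspaper names per country. As a minimal offline sketch of that grouping step, the snippet below builds a hash from country to an array of names, so that every newspaper is kept. The helper `group_by_country` and the sample pairs are hypothetical and stand in for the scraped pages. They are not part of the gist and are not real output from the site.

```ruby
# Hypothetical helper: group [country, newspaper] pairs into
# { country => [newspaper, ...] }, keeping every name per country.
def group_by_country(pairs)
  grouped = Hash.new { |hash, key| hash[key] = [] }
  pairs.each { |country, newspaper| grouped[country] << newspaper }
  grouped
end

# Sample data standing in for scraped results (assumed, not fetched).
sample = [
  ['Spain', 'El Pais'],
  ['Spain', 'El Mundo'],
  ['France', 'Le Monde']
]

# Print one line per country with its comma-separated newspaper names.
group_by_country(sample).each do |country, names|
  puts "#{country}: #{names.join(', ')}"
end
```

Using a default block on the hash avoids checking whether a country key already exists before appending to it.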