@abinoam
Created Sep 5, 2012
Sybren Kooistra's question - http://www.ruby-forum.com/topic/4405257
require 'nokogiri'
require 'open-uri'
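
# Download every document linked from one search-results page.
def get_search_result_links(n_page)
  links = n_page.css('.linker-kolom li a')
  puts "** There were #{links.length} links found"
  links.each do |link|
    href = link['href']
    next if href.nil? # skip anchors without a target
    inner_url = 'https://zoek.officielebekendmakingen.nl' + href
    # Drop the query string and directory part so the name is a valid local filename.
    basename = File.basename(href.split('?')[0])
    puts "\n\n\nFetching page at #{basename}"
    # URI.open replaces Kernel#open for URLs, which Ruby 3.0 no longer supports.
    datalezer = URI.open(inner_url).read
    lokalenieuwefilenaam = basename + '.html'
    # The block form closes the file even if the write raises.
    File.open(lokalenieuwefilenaam, 'w') { |lokalenieuwefile| lokalenieuwefile.write(datalezer) }
  end
end
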
INITIAL_URL = 'https://zoek.officielebekendmakingen.nl/zoeken/resultaat/?zkt=Uitgebreid&pst=ParlementaireDocumenten'
# URI.open replaces Kernel#open for URLs, which Ruby 3.0 no longer supports.
initial_page = Nokogiri::HTML(URI.open(INITIAL_URL))
pagination_links = initial_page.css('.paginering.beneden a')
# The second-to-last pagination link is taken to hold the highest page number.
last_page_link = pagination_links[-2]
last_page_number = last_page_link.text.to_i
# Starts at page 5 as written; use 1 to fetch every results page.
(5..last_page_number).each do |page_num|
  puts "\n\n\n***** Getting page #{page_num}"
  results_page_url = "#{INITIAL_URL}&_page=#{page_num}"
  results_page = Nokogiri::HTML(URI.open(results_page_url))
  get_search_result_links(results_page)
end
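
The script names each saved file after the link's href, which may contain directory separators or a query string. As a minimal sketch only, a helper along these lines could turn an href into a flat local filename. The name `local_filename_for`, its fallback to `index`, and the example href are assumptions for illustration, not part of the original script or the site.

```ruby
# Hypothetical helper (not in the original script): derive a flat,
# filesystem-safe local filename from a link's href.
def local_filename_for(href, extension = '.html')
  path = href.split(/[?#]/).first.to_s   # drop any query string or fragment
  name = File.basename(path, '.*')       # last path segment, without its extension
  name = name.gsub(/[^\w.-]/, '_')       # replace spaces and other unusual characters
  name = 'index' if name.empty?          # fallback for an empty href
  name + extension
end

# Made-up href, used only to show the call.
puts local_filename_for('/kst-33400-VI-2.html?zoekcriteria=abc')
```

Inside `get_search_result_links`, the result could be assigned to `lokalenieuwefilenaam` before the file is opened for writing.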