@auient
Created May 6, 2015
Scraping script for the "Delicious Tea Shop Map" (おいしい紅茶の店マップ)
#!/usr/bin/env ruby
# The library name is lowercase; "Nokogiri" fails on case-sensitive file systems.
require "nokogiri"
require "open-uri"
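
# Yields one array per shop found on a list page:
# [shop_name, post_code, address, access, telephone, website]
def parse_list(url)
  # URI.open replaces the bare open(url), which no longer accepts URLs on Ruby 3.0+.
  doc = Nokogiri::HTML(URI.open(url))
  doc.xpath('//div[@class = "shop"]').each do |node|
    shop_name = node.xpath('h3').text
    # Remove spaces, then split on the remaining whitespace (line breaks).
    left = node.xpath('p[@class = "left"]').text
    left = left.gsub(/ +/, '').split
    post_code = left[0]
    address = left[1]
    access = left[2]
    right = node.xpath('p[@class = "right"]').text
    right = right.gsub(/ +/, '').split
    telephone = right[0]
    website = right[1]
    yield [shop_name, post_code, address, access, telephone, website]
  end
end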
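
# Yields the name and relative link of each area listed on the index page.
def parse_index(url)
  doc = Nokogiri::HTML(URI.open(url))
  doc.xpath('//div[@id = "areamap"]/a').each do |node|
    name = node.text.strip
    # The link target is the single-quoted part of the href attribute.
    href = node.attr('href').match(/'(.*)'/)[1]
    # puts name + "::" + href
    yield name, href
  end
end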

index_url = 'http://www.tea-a.gr.jp/shop/'

# run: print one comma-separated line per shop, prefixed with its area name
parse_index(index_url) do |name, href|
  # puts name+"&"+href
  list_url = index_url + href
  parse_list(list_url) do |result|
    result.unshift name
    puts result.join(",")
  end
end
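
The script prints each row with `join(",")`, so a field that itself contains a comma would shift the columns. A minimal sketch of one alternative, assuming Ruby's `csv` library is available: `CSV.generate_line` quotes such fields. The helper name `to_csv_line` and the sample row are hypothetical and are not part of the original script.

```ruby
require "csv"

# Hypothetical helper (not in the original script): format one row as a CSV
# line, quoting any field that contains a comma, a quote, or a line break.
def to_csv_line(fields)
  CSV.generate_line(fields)
end

# Made-up sample row, for illustration only.
row = ["Tokyo", "Example Tea Room", "100-0001", "1-2-3 Example, Chiyoda"]
puts to_csv_line(row)
```

In the run loop, `puts to_csv_line(result)` could stand in for `puts result.join(",")` if quoted output is wanted.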