@huacnlee
Created August 3, 2010 00:59
#
# Scrape Koubei housing listings with Ruby scrapi
# Author: Huacnlee <huacnlee@gmail.com>
# Blog: http://huacnlee.com
#
# Gem installation:
# sudo gem install scrapi
#
require 'rubygems'
require 'scrapi'

# Scraper for the listing table: each matching <tr> becomes one item.
scraper = Scraper.define do
  array :items

  process "html>body>div.yui-d2f>div#bd.yui-cn-t2>div.yui-main>div.yui-b>table.infoList>tbody>tr",
    :items => Scraper.define {
      # Title text and link both come from the first anchor in the second column.
      process "td:nth-child(2)>p.titleInfo>a:nth-child(1)", :title => :text, :link => "@href"
      process "td:nth-child(3)>p.priceInfo>span.number", :price => :text
      process "td:nth-child(3)>p.fitmentInfo", :fitment => :text
      process "td:nth-child(4)>p.rentStyle", :style => :text
      result :price, :title, :link, :style, :fitment
    }

  result :items
end

uri = URI.parse("http://chengdu.koubei.com/fang/li-rent-all.html")

# Print one block of fields per scraped listing.
scraper.scrape(uri).each do |house|
  puts "Title: #{house.title}"
  puts "Price: #{house.price} yuan/month"
  puts "Type: #{house.style}"
  puts "Decoration: #{house.fitment}"
  puts "Link: #{house.link}"
  puts
end