@pelf
Created March 22, 2012 02:17
Brickset.com scraper
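require 'rubygems'
require 'nokogiri'
require 'open-uri'

class Brickset
  @@db = nil # database / cache (declared but not used yet)

  class Set
    # NOTE: :rating is declared here but never populated by set_property
    attr_accessor :id, :year, :pieces, :rrpp, :rrpd, :rating, :theme, :subtheme

    def initialize(id)
      self.id = id
      # fetch brickset page; URI.open is required because Kernel#open
      # no longer opens URLs as of Ruby 3.0
      html = Nokogiri::HTML(URI.open("http://www.brickset.com/detail/?Set=#{id}-1"))
      # parse brickset details
      details = html.css("#menuPanel .menuPanel .setDetails li")
      details.each do |detail|
        set_property(detail)
      end
    rescue => e
      # the return value of initialize is ignored by Set.new, so on failure
      # the object is still returned, with its properties left unset
      puts e.inspect
    end

    # set a brickset property from the nokogiri html node
    def set_property(li)
      # first child holds the label, last child holds the value
      attribute = li.children.first.content.strip
      value = li.children.last.content.strip
      case attribute
      when "Theme"
        self.theme = value
      when "Subtheme"
        self.subtheme = value
      when "Year released"
        self.year = value
      when "Pieces"
        self.pieces = value
      when "RRP"
        # first number in the string, then the number following a "$";
        # only the integer part is captured, and both are nil on no match
        match = value.match(/(\d+).+\$(\d+)/)
        self.rrpp = match && match[1]
        self.rrpd = match && match[2]
      end
    end
  end
end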
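
The RRP branch of set_property is the only one that transforms its value, so it is the easiest part to check without hitting the network. Below is a minimal sketch of that extraction as a standalone helper. The name parse_rrp and the sample strings are assumptions made for illustration; they are not part of the gist and have not been checked against the actual page markup.

```ruby
# Hypothetical helper (not in the original gist): applies the same regex as
# Set#set_property, but returns the captures instead of assigning attributes.
RRP_PATTERN = /(\d+).+\$(\d+)/

def parse_rrp(value)
  match = RRP_PATTERN.match(value)
  return [nil, nil] unless match
  # match[1] is the first run of digits, match[2] the digits after the "$"
  [match[1], match[2]]
end

# Sample input is made up; the real RRP text on the page may differ.
first, dollars = parse_rrp("9.99 / $12.99")
puts "first=#{first} dollars=#{dollars}"
```

Because the pattern captures only runs of digits, anything after a decimal point is dropped and the results stay strings. Callers that need numeric prices would have to widen the pattern and convert the captures themselves.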