@ktheory
Created May 11, 2011 17:54
Nokogiri end_element demo
xml = %{
<doc>
<key1>value1</key1>
<key2>
Value with
line breaks
</key2>
</doc>
}
require 'nokogiri'
class BadParser < Nokogiri::XML::SAX::Document
  attr_reader :response

  def initialize
    reset
  end

  def reset
    @response = {}
  end

  # Called for every text node, including whitespace between elements.
  def characters(string)
    @value ||= ''
    @value << string
  end

  def start_element(name, attrs = [])
    @value = nil
  end

  def end_element(name)
    # Does not reset @value. This is bad: later #characters calls keep
    # appending to the String that was just stored in @response.
    case name
    when 'key1', 'key2'
      @response[name] = @value
    end
  end
end
class GoodParser < Nokogiri::XML::SAX::Document
  attr_reader :response

  def initialize
    reset
  end

  def reset
    @response = {}
  end

  # Called for every text node, including whitespace between elements.
  def characters(string)
    @value ||= ''
    @value << string
  end

  def start_element(name, attrs = [])
    @value = nil
  end

  def end_element(name)
    case name
    when 'key1', 'key2'
      @response[name] = @value
    end
    # Resets @value. This is good: the next #characters call starts a new
    # String instead of appending to the one stored in @response.
    @value = nil
  end
end
bad = BadParser.new
good = GoodParser.new
Nokogiri::XML::SAX::Parser.new(bad).parse(xml)
Nokogiri::XML::SAX::Parser.new(good).parse(xml)
# The bad parser leaves trailing whitespace on "value1" because #characters
# is also called with the whitespace between </key1> and <key2>.
# Since end_element did not reset @value, @response['key1'] still references
# the same String object, and that whitespace is appended to it in place.
bad.response
# => {"key1"=>"value1\n ", "key2"=>"\nValue with\nline breaks\n\n"}
# A good solution is to reset @value in end_element, so that text arriving
# after the closing tag goes into a new String.
good.response
# => {"key1"=>"value1", "key2"=>"\nValue with\nline breaks\n"}
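The difference between the two parsers comes down to Ruby strings being mutable objects that a Hash stores by reference. The following is a minimal sketch of that mechanism in plain Ruby, without Nokogiri. The local variable names are hypothetical stand-ins for @value and @response, and the final `dup` variant is an assumed alternative, not something the gist itself uses.

```ruby
# Plain-Ruby sketch of the aliasing behind BadParser (no Nokogiri needed).
value = String.new('value1')      # mutable buffer, plays the role of @value
response = { 'key1' => value }    # the Hash stores a reference, not a copy

value << "\n "                    # a later #characters call appends whitespace
# response['key1'] has changed too, because it is the same String object.

# What GoodParser does: drop the reference, then start a fresh buffer.
buffer = String.new('value1')
good_response = { 'key1' => buffer }
buffer = nil                      # corresponds to @value = nil in end_element
buffer ||= String.new             # corresponds to @value ||= '' in #characters
buffer << "\n "                   # whitespace lands in the new String only

# Assumed alternative (not from the gist): store a copy instead of resetting.
source = String.new('value1')
copied_response = { 'key1' => source.dup }
source << "\n "                   # the stored copy is unaffected
```

Resetting in end_element has the advantage that stray whitespace between elements never accumulates at all, whereas the `dup` variant only protects values that have already been stored.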