@keithrbennett
Last active June 8, 2022 04:39
#!/usr/bin/env ruby

require 'nokogiri'

# Parses xml_text with or without the noblanks parse option and prints
# the resulting document tree.
def process_example(message, xml_text, use_noblanks_option)
  puts message
  puts "XML text: #{xml_text.inspect}"
  # The block receives the parse options; calling noblanks on them enables that option.
  doc = Nokogiri::XML(xml_text) { |config| use_noblanks_option ? config.noblanks : config }
  puts 'Resulting XML document:'
  puts doc.inspect
  puts
  puts
end

process_example(
  "Without the noblanks option, the separating whitespace is parsed into Text elements.",
  "<xml>\n <something/>\n</xml>",
  false)

process_example(
  "With the noblanks option, the separating whitespace is *not* parsed into Text elements.",
  "<xml>\n <something/>\n</xml>",
  true)

# The whitespace-only attribute value is kept, but XML attribute-value
# normalization replaces the newline inside it with a space.
process_example(
  "But when the whitespace is an attribute it will be preserved.",
  %Q{<xml><something an_attribute=" \n " /></xml>},
  true)

# Whitespace-only text that is the sole content of an element is kept as a Text node.
process_example(
  "And when the whitespace is the value of an element it will also be preserved.",
  %Q{<xml><something> \n </something></xml>},
  true)

=begin
Output is:
Without the noblanks option, the separating whitespace is parsed into Text elements.
XML text: "<xml>\n <something/>\n</xml>"
Resulting XML document:
#<Nokogiri::XML::Document:0x8c name="document" children=[#<Nokogiri::XML::Element:0x78 name="xml" children=[#<Nokogiri::XML::Text:0x3c "\n ">, #<Nokogiri::XML::Element:0x50 name="something">, #<Nokogiri::XML::Text:0x64 "\n">]>]>
With the noblanks option, the separating whitespace is *not* parsed into Text elements.
XML text: "<xml>\n <something/>\n</xml>"
Resulting XML document:
#<Nokogiri::XML::Document:0xc8 name="document" children=[#<Nokogiri::XML::Element:0xb4 name="xml" children=[#<Nokogiri::XML::Element:0xa0 name="something">]>]>
But when the whitespace is an attribute it will be preserved.
XML text: "<xml><something an_attribute=\" \n \" /></xml>"
Resulting XML document:
#<Nokogiri::XML::Document:0x118 name="document" children=[#<Nokogiri::XML::Element:0x104 name="xml" children=[#<Nokogiri::XML::Element:0xf0 name="something" attributes=[#<Nokogiri::XML::Attr:0xdc name="an_attribute" value=" ">]>]>]>
And when the whitespace is the value of an element it will also be preserved.
XML text: "<xml><something> \n </something></xml>"
Resulting XML document:
#<Nokogiri::XML::Document:0x168 name="document" children=[#<Nokogiri::XML::Element:0x154 name="xml" children=[#<Nokogiri::XML::Element:0x140 name="something" children=[#<Nokogiri::XML::Text:0x12c " \n ">]>]>]>
=end