secret
Last active

  • Download Gist
richardcolley.rb
Ruby
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115
=begin
My submission for RFCN#8
This program, reads stdin, or all files named as arguments, and outputs
to stdout the transformed XML.
Alternatively, the submission can be require'd and the Transformer
class used in other applications.
Transformation rules are (currently) at the end of this file in YAML.
Rules can define name mappings, nesting, and whether they are optional
in the input document. The rule also indicates whether or not to copy
the contents of input elements to output elements or not.
I have created an additional "optional" element <age/> to test the
extensibility of the rules.
=end
 
require 'yaml'
require 'nokogiri'
 
class Transformer
def initialize rules=YAML::load(DATA)
@rules = rules
end
def transform indoc
transform_recursively( @rules, indoc, Nokogiri::XML::Document.new )
end
private
 
def transform_recursively( current_rule_node, current_input_node, current_output_node )
current_rule_node.each do |symbol,tree|
instructions, paths, child_rules = tree
found_nodes = current_input_node.xpath( *paths )
node_not_found = found_nodes.empty?
found_nodes = [ current_input_node ] if node_not_found and instructions.include? :optional
raise "Missing element <#{symbol}>" if node_not_found and not instructions.include? :optional
raise "Too many <#{symbol}> elements" if found_nodes.count > 1 and instructions.include? :singular
found_nodes.each do |found|
new_node = current_output_node.add_child( Nokogiri::XML::Node.new(symbol.to_s, current_output_node) )
new_node.content = found.content if instructions.include? :copy_content and not node_not_found
transform_recursively( child_rules, found, new_node ) if child_rules
end
end
current_output_node
end
end
 
 
 
# simple test run
if __FILE__ == $PROGRAM_NAME
transformer = Transformer.new
for filename in ARGV
begin
open(filename) do |f|
indoc = Nokogiri::XML(f)
outdoc = transformer.transform(indoc)
puts outdoc
end
rescue Exception => e
$stderr.puts e
end
end
end
 
 
=begin
Define the rules used for the transformation. Nested dictionaries
are used to indicate the output structure. Each dictionary key/symbol
is used as the output xml tag name.
 
These could be read from a file, or placed elsewhere, but are here
now for convenience. Currently in YAML format.
 
Format is a recursive structure:
node = { symbol: [ [instructions], [xpath locations], {child nodes} ] }
where the following instructions are currently understood:
:copy_content -> will copy the node content from input to output nodes
:optional -> if we don't need to find this node in the input doc
(note, we still create it in output doc)
:singular -> this element can only appear once
(note: the outer element can only appear once by definition)
=end
__END__
---
:people:
- []
- - //people
- //address_book
- //employees
- :person:
- []
- - person
- contact
- employee
- :name:
- - :optional
- - name
- :first:
- - :copy_content
- - first_name
- first
:last:
- - :copy_content
- - last_name
- last
- surname
:age:
- - :optional
- :copy_content
- - age

Please sign in to comment on this gist.

Something went wrong with that request. Please try again.