Phlip (owner)
Public Clone URL: git://gist.github.com/75525.git
nokogiri_from_xml.rb
require 'nokogiri'
 
=begin
 
from_xml{} and convert{} form a light DSL to convert XML
(such as to_xml() provides) into a matching new or updated
ActiveRecord object model. The DSL provides numerous hooks
and optional callbacks to rename and reprocess custom
XML, allowing it to range from a direct translation
to a complete reinterpretation of the represented objects.
 
Nokogiri::XML::Node#convert takes these arguments:
 
* xpath pointing to nodes, relative to the current node
* [id, field_name]
  - id is the primary key in the input
  - field_name is an optional value to rename the id
* [field_name_2, optional_rename_2]... the subsequent field names
* &block - convert{} optionally calls this for each detected node, with these arguments:
  - node - the current XML::Node, decorated with
    * data - a hash containing your field_names and their string values
  - id - the string value of the primary key in the input
  - field_name_2... - subsequent values (discard them with splat * !)
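
The argument handling above can be tried without any XML at all. The
following is a minimal sketch, not part of the library: normalize_needs
and extract are hypothetical helper names, and a plain Hash stands in
for the node.xpath(name).text lookups that convert{} performs.

```ruby
# Hypothetical helper mirroring the needs.map line inside convert{}:
# a bare name becomes [name, name]; a [name, rename] pair is kept as-is.
def normalize_needs(*needs)
  needs.map { |x| [x, x].flatten[0..1] }
end

# Hypothetical stand-in for convert{}'s per-node work. `source` is a plain
# Hash playing the role of the XML node; a missing field yields '' just as
# the text of an empty node set would.
def extract(source, *needs)
  gots = []
  data = normalize_needs(*needs).inject({}) do |h, (name, key)|
    gots << (h[key] = source.fetch(name.to_s, ''))
    h
  end
  [data, gots]
end

data, gots = extract({ 'id' => '42', 'title' => 'Hello' },
                     [:id, :remote_post_id], :title)
# data => { :remote_post_id => '42', :title => 'Hello' }
# gots => ['42', 'Hello']
```

The data hash is keyed by the renamed field, while gots keeps the
values in argument order for the block.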
 
Use the block to fire subsequent conversions on subrecords.
 
This function shows convert{} reconstituting our familiar
Post, Author, and Tag records:
 
  def reconstitute(xml)
    doc = Nokogiri::XML(xml)
    doc.convert 'posts/post', :id, :title, :body do |node, id, *data|
      post = Post.find_or_initialize_by_id(id)
      post.update_attributes node.data
      node.convert 'tags/tag', :id, :name do |n, id, name|
        tag = Tag.find_or_initialize_by_id(id)
        tag.update_attribute :name, name
        post.tags << tag
      end
      node.convert 'author', :id, :name do |n, id, name|
        author = Author.find_or_initialize_by_id(id)
        author.update_attributes n.data
        post.update_attribute :author, author
      end
    end
  end
 
That assembly still duplicates many lines, so from_xml{} DRYs them up
by packing more information into the input arguments.
 
Nokogiri::XML::Node#from_xml{} takes these arguments:
 
* Model - the ActiveRecord class itself
* [id, field_name]
  - id is the primary key in the input
  - field_name is an optional value to rename the id
  from_xml{} will use find_or_initialize_by_field_name(id)
  to prepare one record for its new data
* [field_name_2, optional_rename_2]... the subsequent field names
  from_xml{} will stuff the string values of field_name_N
  into your attributes, indexed by either field_name_N, or
  optional_rename_N, if provided
* &block - from_xml{} optionally calls this for each detected node, with these arguments:
  - record - the record currently under construction
  - node - the current XML::Node, decorated with
    * data - a hash containing your field_names and their string values
  - id - the string value of the primary key in the input
  - field_name_2... - subsequent values (discard them with splat * !)
 
After optionally calling your &block, from_xml{} saves the current
record. It also returns a flattened array of any found records, so
you can associate them into a containing record.
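
Because from_xml{} returns an array even when only one record matches,
a single-valued association needs one element taken out of it. The
sketch below uses plain strings as stand-ins for records; merge_found
is a hypothetical name that mirrors the flatten.compact step on the
singular and plural lookups.

```ruby
# Hypothetical helper mirroring from_xml{}'s return line: results from
# the singular path and the plural path are merged into one flat array.
def merge_found(singular_hits, plural_hits)
  [singular_hits, plural_hits].flatten.compact
end

tags    = merge_found([], ['ruby', 'xml'])  # found under tags/tag
authors = merge_found(['Phlip'], [])        # found under author
nobody  = merge_found([], [])               # nothing matched

author  = authors.first  # one record, ready for a belongs_to
missing = nobody.first   # nil when nothing matched
```

Array#first behaves the same on Ruby 1.8 and later, which is why this
sketch uses it to unpack the single record.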
 
This is equivalent to the aforementioned reconstitute():
 
  doc.from_xml Post, :id, :title, :body do |post, node, *|
    post.tags = node.from_xml(Tag, :id, :name)
    post.author = node.from_xml(Author, :id, :name).first
    post.save!
  end
 
Use the block to fire subsequent conversions on subrecords. Here
is a more complex declaration. We pretend that we must copy our
records into an auxiliary database that must maintain
extra copies of our primary database's primary keys. This
allows us to incrementally upgrade the auxiliary database's
values.
 
We also pretend that Tags have parent Tags, and we must
rebuild this relationship out-of-band from the normal
associations.
 
  doc.from_xml Post, [:id, :remote_post_id],
                     :title,
                     :body,
                     :some_data,
                     [:more_data, :renamed_as_field] do |post, node, *|
    node.from_xml Tag, [:id, :remote_tag_id], :name do |tag, n, *|
      id = n.xpath('parent_id').text
      tag.parent = Tag.find_by_remote_tag_id(id)
      post.tags = ([tag] + post.tags).uniq
    end
    post.author = node.from_xml(Author, [:id, :remote_id], :name).first
  end
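
The declaration above looks up each tag's parent while the tag itself
is being built, so it appears to rely on parents having been imported
before their children. If that ordering is not guaranteed, one option
is to link parents in a second pass. This is a minimal sketch under
that assumption: the Struct-based Tag, the rows array, and the index
hash are hypothetical stand-ins for the ActiveRecord model, the XML
input, and find_by_remote_tag_id.

```ruby
# Hypothetical in-memory stand-in for the Tag model; not ActiveRecord.
Tag = Struct.new(:remote_tag_id, :name, :parent)

# Rows as they might arrive from the XML, with a child before its parent.
rows = [
  { :id => '2', :name => 'nokogiri', :parent_id => '1' },
  { :id => '1', :name => 'ruby',     :parent_id => ''  },
]

index = {}  # remote_tag_id => Tag, playing the role of find_by_remote_tag_id

# Pass 1: find or initialize every tag by its remote id.
rows.each do |row|
  tag = (index[row[:id]] ||= Tag.new(row[:id]))
  tag.name = row[:name]
end

# Pass 2: every tag now exists, so a parent lookup only misses for roots.
rows.each do |row|
  index[row[:id]].parent = index[row[:parent_id]]
end
```

A root tag has an empty parent_id, which finds nothing in the index
and leaves parent as nil.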
 
=end
 
  class ::Nokogiri::XML::Node
 
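    # Finds or initializes one `model` record per matching node, fills it
    # from node.data, yields it to the optional block, saves it, and
    # returns a flat array of the records.
    def from_xml(model, *needs, &block)
      # Normalize each need to [input_name, output_name].
      needs = needs.map{|x| [x,x].flatten[0..1] }
      singular = model.name.downcase
      plural = singular.pluralize + '/' + singular  # pluralize is from ActiveSupport
 
      add_record = lambda do |n, *data|
        # The finder is named after the (possibly renamed) primary key.
        find_or_init = "find_or_initialize_by_#{needs.first.last}"
        record = model.send find_or_init, data.first
        record.attributes = n.data
        block.call(record, n, *data) if block
        record.save!
        record # map me up!
      end
 
      # Look under both 'post' and 'posts/post', relative to this node.
      return [ convert(singular, *needs, &add_record),
               convert(plural, *needs, &add_record) ].flatten.compact
    end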
    
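    # Visits each node matching tag_name, stores the requested fields in
    # node.data, and calls the optional block with the node and values.
    def convert(tag_name, *needs, &block)
      # Normalize each need to [input_name, output_name].
      needs = needs.map{|x| [x,x].flatten[0..1] }
 
      xpath(tag_name).map do |node|
        gots = []
        
        node.data = needs.inject({}) do |h,(n,g)|
          gots << (h[g] = node.xpath(n.to_s).text)
          h
        end
        
        block.call(node, *gots) if block
      end # note ruby1.9 can drop the gots[] system and just use node.data.values,
    end # (against the objections of certain math zealots!;)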
 
    attr_accessor :data
 
  end