@privatezero
Last active March 25, 2019 20:41
parsing dspace with nokogiri
# xpath to type for xoai
# puts doc.xpath("//record/metadata/metadata/element/*[@name='type']/element/field")[0].content
require 'nokogiri'

doc = File.open("TARGET.XML") { |f| Nokogiri::XML(f) }
# Strip namespaces so elements can be matched by bare name (metadata/dc/type).
doc.remove_namespaces!
single_text = []
publisher_text = []
relation_text = []
doc.xpath("//record").each do |record_element|
  type = record_element.xpath('metadata/dc/type')
  # Keep only records with exactly one type element whose value is 'Text'.
  next unless type.count == 1 && type.text == 'Text'
  single_text << type.text
  # Collect publisher and relation text only when the record has them.
  publisher = record_element.xpath('metadata/dc/publisher')
  publisher_text << publisher.text unless publisher.empty?
  relation = record_element.xpath('metadata/dc/relation')
  relation_text << relation.text unless relation.empty?
end
# Totals: 'Text' records, then those with a publisher, then those with a relation.
puts single_text.count
puts publisher_text.count
puts relation_text.count