XQuery Update data corruption problem
xquery version "3.0";
(:
Goal: Take a TEI document containing <ref> elements that need to be fixed, and fix them with XQuery Update.
Specifically, find the page number reference in the text node immediately following each <ref> element
and move it inside the <ref> element. (I've simplified my data and the query to illustrate.)

Problem: The XQuery Update statement corrupts the 02-sample.xml file: the resulting file has 0 bytes. When I
comment out the XQuery Update statement and uncomment the $test variable in the return expression, I get
the expected results, so I think the logic is sound. Also, when I comment out line 25, the corruption doesn't
occur. But I need that line, which reconstructs the attributes. I'm stumped.

Test environment: Saxon-EE XQuery with oXygen 17.1, with XQuery 3.0 and XQuery Update enabled.
:)
declare namespace tei="";
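(: Recursively rebuild elements, copying their attributes; other nodes pass through unchanged :)
declare function local:reconstruct($nodes as node()*) {
    for $node in $nodes
    return
        typeswitch ($node)
            case element() return
                element { node-name($node) } {
                    $node/@*,
                    local:reconstruct($node/node())
                }
            default return $node
};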
(: Select each <ref> whose next sibling is a text node that starts with a page reference :)
let $doc := doc('02-sample.xml')
let $refs := $doc//tei:ref
    [matches(following-sibling::node()[1][. instance of text()], '^, pp?\.\s+\d+')]
for $ref in $refs
let $following-text := $ref/following-sibling::text()[1]
(: Groups: 1 = ", p. " or ", pp. "; 2 = the page number; 3 = the rest of the text node :)
let $analyze := analyze-string($following-text, '^(, pp?\.\s+)(\d+)(.*)$')
let $new-ref :=
    element { QName('', 'ref') } {
        $ref/@*,
        local:reconstruct($ref/node()),
        string-join($analyze/fn:match/fn:group[@nr = (1, 2)])
    }
let $new-following-text := string-join($analyze/fn:match/fn:group[@nr ge 3])
let $test :=
    (
        <original>{$ref, $following-text}</original>,
        $analyze,
        <new>{$new-ref, $new-following-text}</new>
    )
return
    (
        (: $test :)
        replace node $ref with $new-ref,
        replace node $following-text with $new-following-text
    )
<?xml version="1.0" encoding="UTF-8"?>
<note xmlns="">For text of NSC 164/1, see <ref>
<hi rend="italic">Foreign Relations,</hi> 1952–1954, vol. VII, Part 2</ref>, p. 1914.</note>
<ref xmlns="">
<hi rend="italic">Foreign Relations,</hi> 1952–1954, vol. VII, Part 2</ref>, p. 1914.</original>
<fn:analyze-string-result xmlns:fn="">
<fn:group nr="1">, p. </fn:group>
<fn:group nr="2">1914</fn:group>
<fn:group nr="3">.</fn:group>
<ref xmlns="">
<hi rend="italic">Foreign Relations,</hi> 1952–1954, vol. VII, Part 2, p. 1914</ref>.</new>