@joewiz
Last active June 10, 2016 20:10
XQuery Update data corruption problem http://markmail.org/message/3fzcixmxeh76z6l3
xquery version "3.0";
(:
Goal: Take a TEI document containing <ref> elements that need to be fixed, and fix these with XQuery Update.
Specifically, we find the page number references from the text node immediately following the <ref> element,
and move the page number inside the <ref> element. (I've simplified my data and the query to illustrate.)
Problem: The XQuery Update statement corrupts the 02-sample.xml file: the resulting file has 0 bytes. When I
comment out the XQuery Update statement and uncomment the $test variable in the return expression, I get the
expected results, so I think the logic is sound. Also, when I comment out line 25 (the $node/@*, line in
local:reconstruct), the corruption doesn't occur. But I need that line, which reconstructs the attributes. I'm stumped.
Test environment: Saxon-EE XQuery 9.6.0.7 with oXygen 17.1; with XQuery 3.0 and XQuery Update enabled.
:)
declare namespace tei="http://www.tei-c.org/ns/1.0";
declare function local:reconstruct($nodes as node()*) {
    for $node in $nodes
    return
        typeswitch ($node)
            case element() return
                element
                    { node-name($node) }
                    {
                        $node/@*,
                        local:reconstruct($node/node())
                    }
            default return $node
};
let $doc := doc('02-sample.xml')
let $refs := $doc//tei:ref
    [matches(following-sibling::node()[1][. instance of text()], '^, pp?\.\s+\d+')]
for $ref in $refs
let $following-text := $ref/following-sibling::text()[1]
let $analyze := analyze-string($following-text, '^(, pp?\.\s+)(\d+)(.*)$')
let $new-ref :=
    (
        element
            { QName('http://www.tei-c.org/ns/1.0', 'ref') }
            {
                local:reconstruct($ref/node()),
                string-join($analyze/fn:match/fn:group[@nr = (1, 2)])
            }
    )
let $new-following-text := string-join($analyze/fn:match/fn:group[@nr ge 3])
let $test :=
    <result>
        <original>{$ref, $following-text}</original>
        <analysis>{$analyze}</analysis>
        <new>{$new-ref, $new-following-text}</new>
    </result>
return
    (:
    $test
    :)
    (
        replace node $ref with $new-ref
        ,
        replace node $following-text with $new-following-text
    )
<?xml version="1.0" encoding="UTF-8"?>
<note xmlns="http://www.tei-c.org/ns/1.0">For text of NSC 164/1, see <ref>
<hi rend="italic">Foreign Relations,</hi> 1952–1954, vol. VII, Part 2</ref>, p. 1914.</note>
<result>
<original>
<ref xmlns="http://www.tei-c.org/ns/1.0">
<hi rend="italic">Foreign Relations,</hi> 1952–1954, vol. VII, Part 2</ref>, p. 1914.</original>
<analysis>
<fn:analyze-string-result xmlns:fn="http://www.w3.org/2005/xpath-functions">
<fn:match>
<fn:group nr="1">, p. </fn:group>
<fn:group nr="2">1914</fn:group>
<fn:group nr="3">.</fn:group>
</fn:match>
</fn:analyze-string-result>
</analysis>
<new>
<ref xmlns="http://www.tei-c.org/ns/1.0">
<hi rend="italic">Foreign Relations,</hi> 1952–1954, vol. VII, Part 2, p. 1914</ref>.</new>
</result>
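
One way to narrow this down, offered as a sketch rather than a diagnosis: the same two `replace node` expressions can be run against an in-memory copy with the XQuery Update Facility's copy/modify/return (transform) expression, so that nothing is written back to 02-sample.xml. The query below reuses the names from the query above. It assumes that XQuery Update is enabled in the processor and that 02-sample.xml is resolvable from the query's location. The variable name `$copy` is introduced here for illustration only.

```xquery
xquery version "3.0";

declare namespace tei="http://www.tei-c.org/ns/1.0";

(: Same helper as in the original query: rebuilds elements, keeping attributes. :)
declare function local:reconstruct($nodes as node()*) {
    for $node in $nodes
    return
        typeswitch ($node)
            case element() return
                element
                    { node-name($node) }
                    {
                        $node/@*,
                        local:reconstruct($node/node())
                    }
            default return $node
};

(: Assumption: the updates are applied to a copy, so the file on disk is not touched. :)
copy $copy := doc('02-sample.xml')
modify (
    for $ref in $copy//tei:ref
        [matches(following-sibling::node()[1][. instance of text()], '^, pp?\.\s+\d+')]
    let $following-text := $ref/following-sibling::text()[1]
    let $analyze := analyze-string($following-text, '^(, pp?\.\s+)(\d+)(.*)$')
    let $new-ref :=
        element
            { QName('http://www.tei-c.org/ns/1.0', 'ref') }
            {
                local:reconstruct($ref/node()),
                string-join($analyze/fn:match/fn:group[@nr = (1, 2)])
            }
    let $new-following-text := string-join($analyze/fn:match/fn:group[@nr ge 3])
    return
        (
            replace node $ref with $new-ref,
            replace node $following-text with $new-following-text
        )
)
return $copy
```

If the returned document shows the page reference moved inside the `<ref>` element, the update expressions themselves are probably fine, and the 0-byte file is more likely tied to how the result is written back to disk. If this query fails as well, the problem is more likely in the updates. Either outcome is a hint, not proof.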