@caschwartz
Last active October 5, 2015 09:57
XQuery - Query retrieves all duplicate titles with multiple volumes
xquery version "1.0-ml";
(: 5/23/12 This query retrieves all duplicate titles (with multiple volumes) from MARCXML records in database :)
declare namespace ia = "http://my.server/my.directory";
declare namespace m = "http://www.loc.gov/MARC21/slim";
fn:distinct-values(
  let $docs := xdmp:directory("/ia-xml/v/", "infinity")
  (: Titles that occur in more than one metadata record :)
  let $dupTitles := (
    let $metadata := $docs/ia:doc/ia:metadata
    for $title in fn:distinct-values($metadata/ia:title)
    where fn:count($metadata[ia:title eq $title]) gt 1
    return $title
  )
  for $doc in $docs
  for $dupTitle in $dupTitles
  let $title := $doc/ia:doc/ia:metadata/ia:title
  (: MARC 300$a (physical description, extent) is where volume counts such as "3 v." appear :)
  let $volumeInfo := $doc/ia:doc/ia:metadata/ia:marc/m:record/m:datafield[@tag = "300"]/m:subfield[@code = "a"]
  (: Parentheses are required: "and" binds tighter than "or", so without them
     any record containing "vol." would match regardless of its title :)
  where $title = $dupTitle and (fn:contains($volumeInfo, "v.") or fn:contains($volumeInfo, "vol."))
  order by $title ascending
  return fn:normalize-space($title)
)
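
The where clause in the query above mixes "and" with "or", so how the conditions are grouped decides which records match. A minimal standalone sketch of that rule follows; the three boolean variables are hypothetical stand-ins for "title is a duplicate", "300$a contains v." and "300$a contains vol.", not names from the original query.

```xquery
xquery version "1.0";
(: Hypothetical record: title is NOT a duplicate, but 300$a contains "vol." :)
let $titleMatches := fn:false()
let $hasV := fn:false()
let $hasVol := fn:true()
return (
  (: parsed as ($titleMatches and $hasV) or $hasVol, so the record is kept :)
  $titleMatches and $hasV or $hasVol,
  (: explicit grouping: the title must match, so the record is dropped :)
  $titleMatches and ($hasV or $hasVol)
)
```

The first expression evaluates to true and the second to false, so an ungrouped condition lets non-duplicate titles through whenever the extent field contains "vol.".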