@caschwartz
Last active December 21, 2015 14:59
XQuery - Query to retrieve all exact duplicate title-date values
xquery version "1.0-ml";
(: 8/6/13 Query to retrieve all exact duplicate title-date values (using one of Priscilla Walmsley's functx XQuery functions) :)
declare namespace ia = "http://my.local.namespace";
declare namespace m = "http://www.loc.gov/MARC21/slim";
declare namespace functx = "http://www.functx.com";
declare function functx:non-distinct-values
  ( $seq as xs:anyAtomicType* ) as xs:anyAtomicType* {
  for $val in distinct-values($seq)
  return $val[count($seq[. = $val]) > 1]
} ;
let $items :=
  <items>{
    for $doc in xdmp:directory("/ia-xml/q/", "infinity")
    let $date := $doc/ia:doc/ia:metadata/ia:date
    let $title := $doc/ia:doc/ia:metadata/ia:title
    let $vol := $doc/ia:doc/ia:metadata/ia:volumeInfo
    (: Only consider records that have no volumeInfo element :)
    where fn:not($vol)
    order by $title
    return
      <item>
        <title>{ fn:normalize-space($title) }</title>
        <date>{ fn:normalize-space($date) }</date>
      </item>
  }</items>
(: Returns each title-date combination that appears more than once (i.e., exact duplicate title-date values) :)
return functx:non-distinct-values($items/item)