@wsalesky
Last active August 29, 2015 14:03
Find malformed persName elements in srophe data.
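xquery version "3.0";
declare namespace tei="http://www.tei-c.org/ns/1.0";

(: List persName elements with direct text content whose @source is missing
   or does not match the parent person's bibl whose ptr target ends in 'bibl/5'. :)
<div>
{
    for $person in collection("/db/apps/srophe/data/persons")//tei:persName[child::text()]
    (: Text found directly inside persName, outside any child element. :)
    let $mistake := string-join($person/text(),' ')
    let $uri := string($person/parent::tei:person/@xml:id)
    let $source := $person/@source
    (: Expected @source value: '#' plus the xml:id of the matching bibl, or '' if there is none. :)
    let $bibl :=
        if($person/parent::tei:person/tei:bibl[tei:ptr[ends-with(@target,'bibl/5')]])
        then concat('#',string($person/parent::tei:person/tei:bibl[tei:ptr[ends-with(@target,'bibl/5')]]/@xml:id))
        else ''
    (: Keep names whose @source differs from the expected value or is absent. :)
    where $source != $bibl or $person[not(@source)]
    order by $mistake
    return
        <item file="{$uri}">{$mistake}</item>
}
</div>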
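
One way to check for false positives is to run the same logic against a small inline sample instead of the full collection. The sketch below is only an illustration: the sample person, its xml:id values, and the example.org target are hypothetical and not taken from the srophe data. Under those assumptions, the first persName should be skipped and the other two reported.

```xquery
xquery version "3.0";
declare namespace tei="http://www.tei-c.org/ns/1.0";

(: Hypothetical sample record; ids and target URL are made up for illustration. :)
let $sample :=
    <tei:listPerson>
        <tei:person xml:id="person-1">
            <tei:persName source="#bib1-1">Name with matching source</tei:persName>
            <tei:persName source="#bib1-2">Name with mismatched source</tei:persName>
            <tei:persName>Name without a source</tei:persName>
            <tei:bibl xml:id="bib1-1"><tei:ptr target="http://example.org/bibl/5"/></tei:bibl>
        </tei:person>
    </tei:listPerson>
return
<div>
{
    (: Same selection logic as the query above, applied to $sample. :)
    for $person in $sample//tei:persName[child::text()]
    let $mistake := string-join($person/text(),' ')
    let $uri := string($person/parent::tei:person/@xml:id)
    let $source := $person/@source
    let $bibl :=
        if($person/parent::tei:person/tei:bibl[tei:ptr[ends-with(@target,'bibl/5')]])
        then concat('#',string($person/parent::tei:person/tei:bibl[tei:ptr[ends-with(@target,'bibl/5')]]/@xml:id))
        else ''
    where $source != $bibl or $person[not(@source)]
    order by $mistake
    return
        <item file="{$uri}">{$mistake}</item>
}
</div>
```

Adding further sample persons that mirror suspicious entries from the real results may help narrow down why a given persName is being reported.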
@wsalesky
Author

@davidamichelson - This still returns 231 persName elements. Let me know if the results look correct to you, or if it is returning false positives.

@davidamichelson

Thanks, looks like it works! I've tweaked it and put the query and its results here: https://github.com/srophe/persons/tree/master/working-files
