Last active
March 8, 2018 17:31
Attempts to extract individual snippets from the extracted portion of the Jesse Wills finding aid (FA)
xquery version "3.1";
let $xml := fn:doc("https://raw.githubusercontent.com/HeardLibrary/finding-aids/master/ASpace/Working%20EADs/WillsJesseEly_MSS_0001_working_pieces.xml")
let $date := fn:data($xml//unitdate)
let $title := fn:data($xml//unittitle)
(: This was just something I was playing around with, trying to figure things out. :)
(: let $new-title := fn:string-join($title, $date) :)
(: Join all titles into one string, then split it at each ';' and drop empty pieces. :)
let $string-title := fn:string-join($title, ' ')
let $new-title := fn:tokenize($string-title, ';')[fn:normalize-space(.) != '']
(: Wrap each piece in its own element instead of returning the unsplit string. :)
return
  for $piece in $new-title
  return <new_unittitle>{fn:normalize-space($piece)}</new_unittitle>
(: What I want the code to do:
Take this:
....<unitdate>1958</unitdate>
<unittitle>30 April to Allen Tate; 9 June to Allen Tate; 9 June to . "Bill"; (William Yandell Elliott); F. 1-1.
17 December to "Lib" (Mrs. R.D. Crabtree); F. 3-4.</unittitle>....
and give me something like the following, creating a new tagged piece every time a ';' is encountered:
<new_unittitle>30 April to Allen Tate</new_unittitle>
<new_unittitle>9 June to Allen Tate</new_unittitle>
etc.
Preferably, I could eventually get it to make this:
<c02><did>
<unittitle>To Allen Tate</unittitle>
<unitdate>1958 April 30</unitdate>
</did></c02>
<c02><did>
<unittitle>To Allen Tate</unittitle>
<unitdate>1958 June 9</unitdate>
</did></c02>
etc.
FYI, 'F. #1-#2' means that everything preceding it belongs in Box #1, Folder #2. This info should
go in a container tag for each <c02>, but I may just add it by hand if it's not consistent enough in the original
<unittitle> elements to be extracted easily.
:)
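The end goal described in the comment above (one <c02> per entry, with the day and month moved into <unitdate> and the 'F. #-#' marker turned into container tags) could be approached roughly as follows. This is only a sketch under several assumptions: the function name `local:entries`, the inline `$sample` data (a simplified version of the example in the comment), and the `type="box"` / `type="folder"` attribute values are illustrative choices, not taken from the original file. It also assumes every entry has the shape "day month to recipient" and that each 'F. #-#' marker applies to all entries since the previous marker. Anything that does not fit that shape, such as the stray `(William Yandell Elliott)` piece, is silently skipped and would still need hand editing.

```xquery
xquery version "3.1";

(: Hypothetical sample, simplified from the example in the comment above. :)
declare variable $sample :=
  <c01>
    <did>
      <unitdate>1958</unitdate>
      <unittitle>30 April to Allen Tate; 9 June to Allen Tate; F. 1-1.
17 December to "Lib" (Mrs. R.D. Crabtree); F. 3-4.</unittitle>
    </did>
  </c01>;

(: Hypothetical helper: turn one unittitle string into a sequence of c02 elements. :)
declare function local:entries($title as xs:string, $year as xs:string) as element(c02)* {
  (: Split the title at each 'F. box-folder' marker, keeping the markers. :)
  let $parts := fn:analyze-string($title, 'F\.\s*(\d+)-(\d+)\.?')
  for $run in $parts/fn:non-match
  (: The marker that closes this run of entries, if there is one. :)
  let $marker := $run/following-sibling::*[1][self::fn:match]
  for $piece in fn:tokenize($run, ';')
  let $text := fn:normalize-space($piece)
  (: Keep only pieces shaped like '30 April to Allen Tate'. :)
  where fn:matches($text, '^\d{1,2} \p{L}+ to ')
  let $day := fn:substring-before($text, ' ')
  let $rest := fn:substring-after($text, ' ')
  let $month := fn:substring-before($rest, ' ')
  let $recipient := fn:substring-after($rest, ' to ')
  return
    <c02>
      <did>
        <unittitle>{fn:concat('To ', $recipient)}</unittitle>
        <unitdate>{fn:string-join(($year, $month, $day), ' ')}</unitdate>
        {
          if ($marker) then (
            <container type="box">{fn:string($marker/fn:group[@nr = 1])}</container>,
            <container type="folder">{fn:string($marker/fn:group[@nr = 2])}</container>
          ) else ()
        }
      </did>
    </c02>
};

for $did in $sample//did
return local:entries(fn:string($did/unittitle), fn:string($did/unitdate))
```

Splitting on the folder markers first, rather than on ';' alone, matters because the marker is not followed by a semicolon in the source text, so a plain tokenize would glue 'F. 1-1.' onto the next entry. To try it against the real file, the `$sample` variable could be replaced with the `fn:doc(...)` call from the query above, provided each <did> there holds a single <unittitle> and <unitdate>.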