@CJHArch
CJHArch / dtl_oai_264
Created July 14, 2015 20:07
This XQuery retrieves all records from the namespace-stripped Digitool OAI feed that contain a MARC 264 field.
xquery version "3.0";
<results>
{
for $OAIMarcRecord in /repository/record[metadata/record/datafield[@tag="264"]]
let $callno:= $OAIMarcRecord/metadata/record/datafield[@tag[contains(., "09")]][1]/subfield[@code="a"]
let $title := $OAIMarcRecord/metadata/record/datafield[@tag="245"]
let $publisher := $OAIMarcRecord/metadata/record/datafield[@tag="264"]
let $PID := $OAIMarcRecord/header/identifier/substring-after(., "oai:digital.cjh.org:")
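The selection step in this query can also be sketched outside BaseX. The Python below is a minimal sketch, assuming a namespace-stripped feed shaped like `repository/record/metadata/record/datafield` as in the query; the sample XML, its values, and the names `records_with_264` and `field_text` are illustrative and not part of the gist.

```python
import xml.etree.ElementTree as ET

OAI_PREFIX = "oai:digital.cjh.org:"

# Hypothetical two-record sample in the namespace-stripped shape the query assumes.
SAMPLE = """<repository>
  <record>
    <header><identifier>oai:digital.cjh.org:1001</identifier></header>
    <metadata><record>
      <datafield tag="245"><subfield code="a">A title</subfield></datafield>
      <datafield tag="264"><subfield code="b">A publisher</subfield></datafield>
    </record></metadata>
  </record>
  <record>
    <header><identifier>oai:digital.cjh.org:1002</identifier></header>
    <metadata><record>
      <datafield tag="245"><subfield code="a">Another title</subfield></datafield>
    </record></metadata>
  </record>
</repository>"""


def field_text(field):
    """Join all subfield text of one datafield, collapsing whitespace."""
    return " ".join("".join(field.itertext()).split())


def records_with_264(root):
    """Return (PID, 264 text) pairs for records that carry a 264 datafield."""
    results = []
    for rec in root.findall("record"):
        fields = rec.findall("metadata/record/datafield[@tag='264']")
        if not fields:
            continue  # same effect as the predicate in the for clause
        identifier = rec.findtext("header/identifier", default="")
        # Mirrors substring-after(., "oai:digital.cjh.org:"): empty if the prefix is absent.
        pid = identifier.split(OAI_PREFIX, 1)[1] if OAI_PREFIX in identifier else ""
        results.append((pid, " ".join(field_text(f) for f in fields)))
    return results


hits = records_with_264(ET.fromstring(SAMPLE))
```

Only the first sample record has a 264 field, so only its PID is returned.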
@CJHArch
CJHArch / LBI MARC 856 add
Last active November 6, 2015 19:01
This XSLT adds 856 fields based on the 094 field, in preparation for ingest via the Digitool MARCXML ingest process.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:marc="http://www.loc.gov/MARC21/slim" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/MARC21/slim
http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd" version="2.0">
<xsl:output method="xml"/>
<!-- This stylesheet was written in July 2015 to automatically add 856 fields for the LBI art project. Filenames are derived from the 094 field. KS 20150713 -->
<xsl:variable name="filename">
<xsl:value-of select="translate(marc:collection/marc:record/marc:datafield[@tag='094']/marc:subfield[@code='a']/text(), '.', '-')"/>
</xsl:variable>
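The filename rule in this variable (take 094 subfield a and replace each period with a hyphen) can be checked on its own. This is a minimal Python sketch, assuming a MARCXML collection in the `http://www.loc.gov/MARC21/slim` namespace; the sample 094 value and the function name `filename_from_094` are made up for illustration.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical record; the 094 value is invented for the example.
SAMPLE = """<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:record>
    <marc:datafield tag="094" ind1=" " ind2=" ">
      <marc:subfield code="a">78.1.23</marc:subfield>
    </marc:datafield>
  </marc:record>
</marc:collection>"""


def filename_from_094(collection):
    """Return 094$a of the first matching record with '.' replaced by '-'."""
    value = collection.findtext(
        "marc:record/marc:datafield[@tag='094']/marc:subfield[@code='a']",
        default="",
        namespaces=MARC_NS,
    )
    # Same effect as translate(..., '.', '-') in the stylesheet.
    return value.replace(".", "-")


filename = filename_from_094(ET.fromstring(SAMPLE))
```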
@CJHArch
CJHArch / title_partner
Created April 11, 2015 00:22
From a list of links in XML, extracts information (in this case, title and partner) from XHTML finding aids
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xhtml="http://www.w3.org/1999/xhtml" exclude-result-prefixes="xs" version="2.0">
<xsl:output omit-xml-declaration="yes"/>
<xsl:template match="/">
<xsl:for-each select="root/a">
<xsl:variable name="FA">
<xsl:value-of select="."/>
</xsl:variable>
<xsl:value-of select="normalize-space(document($FA)/xhtml:html/xhtml:head/xhtml:title)"/>
<xsl:text>; </xsl:text>
@CJHArch
CJHArch / Pull655s
Created February 27, 2015 20:25
This XSLT will take a list of PIDs, turn them into file names, and then look at those EAD files and pull out the genreforms.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="2.0">
<xsl:template match="/">
<xsl:for-each select="document('PIDsforEADs.xml')/record/pid">
<record>
<xsl:variable name="PID" select="."></xsl:variable>
@CJHArch
CJHArch / OAIsOutviaPIDList
Created February 9, 2015 22:01
This XQuery looks at a list of PIDs and grabs the associated records out of the OAI feed. There is probably a more efficient way to do this, but it works.
xquery version "3.0";
<results>
{
for $PIDlist in doc('OH_PIDS_XML.xml')/data/pid/text()
let $OAIRecord := repository/record[header/identifier/substring-after(., "oai:digital.cjh.org:") = $PIDlist]
return
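One possibly more efficient approach is to index the feed by PID once and then look up each PID, instead of rescanning every record for each entry in the list. The Python below is a sketch of that idea only, not the gist's method; the sample documents and the names `index_by_pid` and `select_records` are assumptions.

```python
import xml.etree.ElementTree as ET

OAI_PREFIX = "oai:digital.cjh.org:"

# Hypothetical stand-ins for OH_PIDS_XML.xml and the namespace-stripped feed.
PID_LIST = "<data><pid>1002</pid><pid>1003</pid></data>"
FEED = """<repository>
  <record><header><identifier>oai:digital.cjh.org:1001</identifier></header></record>
  <record><header><identifier>oai:digital.cjh.org:1002</identifier></header></record>
  <record><header><identifier>oai:digital.cjh.org:1003</identifier></header></record>
</repository>"""


def index_by_pid(feed_root):
    """Map each PID (identifier minus the OAI prefix) to its record element."""
    index = {}
    for rec in feed_root.findall("record"):
        identifier = rec.findtext("header/identifier", default="")
        if identifier.startswith(OAI_PREFIX):
            # If a PID occurs twice, the later record replaces the earlier one.
            index[identifier[len(OAI_PREFIX):]] = rec
    return index


def select_records(pid_root, feed_root):
    """Return feed records whose PID is listed, in PID-list order."""
    index = index_by_pid(feed_root)  # one pass over the feed
    wanted = [p.text.strip() for p in pid_root.findall("pid") if p.text]
    return [index[pid] for pid in wanted if pid in index]


selected = select_records(ET.fromstring(PID_LIST), ET.fromstring(FEED))
selected_ids = [rec.findtext("header/identifier") for rec in selected]
```

The feed is read once to build the dictionary, so each PID lookup no longer requires scanning the whole feed.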
@CJHArch
CJHArch / gist:3b21aa3c826ef8e4e305
Created February 9, 2015 22:00
XSLT templates to grab various specific records from the OAI feed
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<!--List of PIDS -->
<xsl:template match="/">
<xsl:for-each select="results/record/header/identifier">
<xsl:copy-of select="./text()"/>,
</xsl:for-each>
</xsl:template>
@CJHArch
CJHArch / gist:225cfa99afda9fe73619
Created February 9, 2015 00:47
OAI reporting - proof of concept
#!/bin/bash
# This script will pull the oai feed, assign it a name based on the time and date, pull out requested data, and
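# Compute the date stamp once so the directory name stays the same even if the run crosses midnight
today=$(date +%Y%m%d)
mkdir -p "$today"
cd "$today" || exit 1
echo "Files found in $today"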
# Run pyoaiharvest to get the feed
python /home/kevin/test/pyoaiharvest.py -l http://digital.cjh.org/OAI-PUB -o LBI_periodicals$(date +%Y%m%d).xml -m marc21 -s LBI_periodicals
# sed to remove marc namespace prefix, consider using the sed 'or' to clean up other stuff in one shot
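The preview ends before the sed command itself, so the exact expression is not shown. As a rough illustration of the prefix-stripping step only, here is a Python sketch; the regular expression and the name `strip_marc_prefix` are assumptions, and it leaves any `xmlns:marc` declaration in place.

```python
import re


def strip_marc_prefix(xml_text):
    """Remove the 'marc:' prefix from opening and closing element tags.

    Assumes the prefix is literally 'marc:'; attributes are not touched.
    """
    return re.sub(r"(</?)marc:", r"\1", xml_text)


cleaned = strip_marc_prefix(
    "<marc:record><marc:leader>x</marc:leader></marc:record>"
)
```

A text-level substitution like this is fragile compared with a namespace-aware transform, but it matches the quick clean-up the script comment describes.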
@CJHArch
CJHArch / OHsOutViaPid
Last active August 29, 2015 14:14
XQuery to get a list of records from the OAI feed based on an XML list of PIDs
xquery version "3.0";
<results>
{
for $PIDlist in doc('OH_PIDS_XML.xml')/data/pid/text()
let $OAIRecord := repository/record[header/identifier/substring-after(., "oai:digital.cjh.org:") = $PIDlist]
return
@CJHArch
CJHArch / gist:3f9a5fb0a9d04df10270
Created January 16, 2015 15:56
This XQuery grabs PIDs and access rights metadata records from Digitool digital entities imported into BaseX. Namespaces were NOT stripped, so the namespace declaration on line two is necessary.
xquery version "3.0";
declare namespace xb="http://com/exlibris/digitool/repository/api/xmlbeans";
<data>
{
for $Record in /xb:digital_entity_call
let $PID := $Record/xb:digital_entity/pid
let $arrecord := $Record/xb:digital_entity/mds/md[type[contains (., 'rights_md')]]
return
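Because the namespaces were kept, any lookup has to bind the `xb` prefix to its URI for the prefixed elements, while `pid`, `mds`, `md`, and `type` are addressed without a prefix, as in the query. A minimal Python sketch of the same lookup, assuming that element layout; the sample document, its `type` values, and the name `pids_with_rights` are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"xb": "http://com/exlibris/digitool/repository/api/xmlbeans"}

# Hypothetical digital entity; only the element names come from the query.
SAMPLE = """<xb:digital_entity_call
    xmlns:xb="http://com/exlibris/digitool/repository/api/xmlbeans">
  <xb:digital_entity>
    <pid>1001</pid>
    <mds>
      <md><type>descriptive</type></md>
      <md><type>rights_md</type></md>
    </mds>
  </xb:digital_entity>
</xb:digital_entity_call>"""


def pids_with_rights(call):
    """Return (PID, number of rights metadata records) per digital entity."""
    results = []
    for entity in call.findall("xb:digital_entity", NS):
        pid = entity.findtext("pid", default="")
        # Mirrors md[type[contains(., 'rights_md')]] from the query.
        rights = [
            md for md in entity.findall("mds/md")
            if "rights_md" in (md.findtext("type") or "")
        ]
        results.append((pid, len(rights)))
    return results


summary = pids_with_rights(ET.fromstring(SAMPLE))
```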
@CJHArch
CJHArch / Macro-LBI-5digit
Created January 13, 2015 19:29
Builds a CSV for ingesting a directory of digital files into DigiTool. Run it on a spreadsheet that lists the files in column A and their extensions in column B. Created for LBI collections with 5-digit AR numbers.
Sub Digitool_KDP_based_ingest_LBI_AR5()
'
' This macro takes basic Karen's Directory Printer output and prepares a template for Digitool CSV ingest. 100214 KS, with updates by LL for LBI Creekside collections with five digits after the AR, 2014-11-21
'
'
' RenameSheet Macro
'
'