Metadata Records into HathiTrust
General HathiTrust metadata submission guide https://goo.gl/FCbQBS
Step 1: Create an itemized set of physical items (NUL uses barcodes)
- Create a spreadsheet with the header: Barcode -- format the column as text so that numeric strings do not convert to Scientific Notation.
- Upload record set into ALMA: Admin > Manage Sets > Add set > itemized
- Set content type = Physical items
- Upload file, then Save
- To see an error file: Administration > Manage Jobs > Monitor Jobs
- Confirm that the count of set members matches the count of barcodes on the spreadsheet: Actions > Members
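If the barcodes come from a script or report rather than being typed by hand, the upload file can be written directly as text, which sidesteps the scientific-notation problem. The following is a minimal sketch, not NUL's actual procedure. It assumes a one-column CSV file is acceptable for your set upload; the barcodes shown are made-up placeholders, and `write_barcode_file` is a hypothetical helper name.

```python
import csv
import io


def write_barcode_file(barcodes, out):
    """Write a one-column file with the header 'Barcode'.

    Every value is written as text, so long numeric barcodes are never
    reinterpreted as numbers. Returns the number of barcodes written.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Barcode"])
    count = 0
    for barcode in barcodes:
        text = str(barcode).strip()
        if not text:
            continue  # skip empty cells
        writer.writerow([text])
        count += 1
    return count


# Placeholder barcodes for illustration only.
buffer = io.StringIO()
written = write_barcode_file(["35556012345678", " 35556087654321 ", ""], buffer)
print(written)
print(buffer.getvalue())
```

The returned count is the number to compare against the set member count in Actions > Members.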
Step 2: Export from ALMA using a publishing profile (need correct ALMA permissions to do this)
- ALMA menu > Resources > Publishing Profile > Add Profile
- Profile Details
- Select set name
- Publishing Mode: Full
- FTP – [your local ftp server]
- Physical format: Binary // number of records in file: one
- Data Enrichment
- Add holding information > checked
852 $b $c $h $i
- Add items information > checked
- Include item information >
955 $b (barcode) $v (description) $d (permanent location) $e (call number)
- Run publishing profile > Actions -> Run
Step 3: Check publishing report and download file
- Use FileZilla to download your exported file
- [login to your local ftp server]
- drag the file over to My Documents
- Open with MARCEdit > MARC Tools
- convert file to .mrk format (MARC Breaker function)
Step 4: Fields to remove/add -- Note: this could also be done with a normalization rule
- Open the .mrk file created in MARCEdit
- Remove 9XX fields other than the 955 (948, 949, 938, 994).
- Remove 035 $9; only 035 fields with an OCLC number should remain. Also remove any 019 and 035 $z.
- Required elements:
LDR (000), 001, 008, 035 $a (OCoLC), 040 $c, 245, 300 $a
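For large files, the field removal above can also be scripted against the .mrk text. This is a rough sketch under several assumptions: each field sits on one line starting with `=TAG`, OCLC numbers carry the `(OCoLC)` prefix, and any 035 without that prefix can be dropped whole. It does not strip a `$z` from inside an otherwise valid OCLC 035; that still needs a manual or MARCEdit edit. The function names and sample record are hypothetical.

```python
def keep_field(line):
    """Return True if a .mrk line should stay in the file."""
    if not line.startswith("="):
        return True  # blank separator lines between records
    tag = line[1:4]
    if tag.startswith("9") and tag != "955":
        return False  # local 9XX fields other than the 955
    if tag == "019":
        return False
    if tag == "035" and "(OCoLC)" not in line:
        return False  # keep only 035 fields carrying an OCLC number
    return True


def filter_mrk(text):
    """Drop unwanted fields from .mrk text, leaving other lines untouched."""
    return "\n".join(line for line in text.splitlines() if keep_field(line))


# Made-up record for illustration only.
sample = "\n".join([
    r"=LDR  00000nam a2200000 a 4500",
    r"=001  991234",
    r"=019  \\$a12345",
    r"=035  \\$a(OCoLC)00012345",
    r"=035  \\$9ABC1234",
    r"=245  10$aExample title.",
    r"=949  \\$aLocal note",
    r"=955  \\$b35556012345678$vv.1",
])
cleaned = filter_mrk(sample)
print(cleaned)
```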
Step 5: Record checks using MARCEdit
- Make sure the counts of 000/001/008/035/245/955 are all equal (one bib per item record)
- Check MARCEdit MARC validation report
- Confirm that all records have only one OCLC number
- Validate headings – correct and establish headings as needed
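The field-count comparison in this step can be cross-checked with a short script run over the .mrk file. The sketch below assumes one field per line and that the leader (000) appears as `=LDR` in the .mrk text; `count_tags` and `mismatched_tags` are hypothetical helper names, and the sample records are invented.

```python
from collections import Counter

# The 000 (leader) is counted under the tag "LDR".
REQUIRED_TAGS = ["LDR", "001", "008", "035", "245", "955"]


def count_tags(text):
    """Count how many times each tag occurs in .mrk text."""
    counts = Counter()
    for line in text.splitlines():
        if line.startswith("=") and len(line) >= 4:
            counts[line[1:4]] += 1
    return counts


def mismatched_tags(counts, tags=REQUIRED_TAGS):
    """Return the tags whose count differs from the number of leaders."""
    expected = counts["LDR"]
    return {tag: counts[tag] for tag in tags if counts[tag] != expected}


# Two made-up records; the second one is missing its 955.
sample = "\n".join([
    r"=LDR  00000nam a2200000 a 4500",
    r"=001  991234",
    r"=008  000000s2000\\\\xx\\\\\\\\\\\\\\\\\\\\\\000\0\eng\d",
    r"=035  \\$a(OCoLC)00012345",
    r"=245  10$aFirst example.",
    r"=955  \\$b35556012345678",
    "",
    r"=LDR  00000nam a2200000 a 4500",
    r"=001  991235",
    r"=008  000000s2001\\\\xx\\\\\\\\\\\\\\\\\\\\\\000\0\eng\d",
    r"=035  \\$a(OCoLC)00012346",
    r"=245  10$aSecond example.",
])
counts = count_tags(sample)
problems = mismatched_tags(counts)
print(problems)
```

An empty result means the six counts agree. A count above the leader count for 035 is also a quick hint that some record carries more than one OCLC number.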
Step 6: Convert final file to XML
- In MARCEditor, compile the .mrk file into MARC: File menu > Compile file into MARC. This creates a .mrc file, which is also UTF encoded. Close the file.
- Use MARCEdit tools to convert the final file to MARC21XML
- Click the MARC Tools icon
- Supply the input and output file names and make sure to select MARC->MARC21XML.
Step 7: Uploading to Zephir
[HathiTrust gives you a naming convention]
This naming convention also ensures that your files are not run through your configuration for any Google-scanned materials.
Use CoreFTP to upload file
Upload file to ftps.cdlib.org/submissions
Step 8: Send notification email to CDL
file name=<file name>
file size=<file size in bytes>
record count=<number of records>
notification email=<email address to which you would like your run notification sent>
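The four lines of the notification can be generated from the file itself so that the size in bytes is not mistyped. This is a minimal sketch: `notification_body` is a hypothetical helper, the file name and email address are placeholders rather than the naming convention HathiTrust assigns, and the record count is supplied by you from the checks in Step 5.

```python
import os
import tempfile


def notification_body(path, record_count, email):
    """Build the notification text for a submitted file."""
    size = os.path.getsize(path)  # size in bytes
    return "\n".join([
        f"file name={os.path.basename(path)}",
        f"file size={size}",
        f"record count={record_count}",
        f"notification email={email}",
    ])


# Demonstration with a throwaway file; all values are placeholders.
payload = b"<collection/>"
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example_submission.xml")
    with open(path, "wb") as handle:
        handle.write(payload)
    body = notification_body(path, 0, "you@example.edu")
print(body)
```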
Step 9: Review run reports and error files
Run reports for contributors' files are posted to their FTPS space, in the subdirectory ftps.cdlib.org/runreports.
You are responsible for retrieving your own error files via FTPS from ftps.cdlib.org/errfiles. Error files will remain in the FTPS location for 60 days.
Error files will be provided as MARCXML files using the file naming convention: original_file_name_error.xml
Correcting your records and re-submitting them
After correcting the errors, you may re-submit the records to ftps.cdlib.org/submissions, following the same guidelines you followed for the initial record submission. Important: the loader script applies updates to records based on a change in the date associated with each, so the filename used for the corrected records must include the date of re-submission and NOT the date included in the name of the file as initially submitted.