Metadata Records into HathiTrust
General HathiTrust metadata submission guide https://goo.gl/FCbQBS
Step 1: Create an itemized set of physical items (NUL uses barcodes)
- Create a spreadsheet with the single column header: Barcode. Format the column as text so that numeric barcodes are not converted to scientific notation.
- Upload record set into ALMA: Admin > Manage Sets > Add set > itemized
- Set content type = Physical items
- Upload file, then Save
- To see an error file: Administration > Manage Jobs > Monitor Jobs
- Confirm that the count of set members matches the count of barcodes on the spreadsheet: Actions > Members
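The barcode file can also be produced outside a spreadsheet program, which sidesteps the scientific-notation problem entirely. The following is a minimal sketch, assuming your ALMA instance accepts a plain CSV upload for an itemized set; the function name `barcode_sheet` is hypothetical. It keeps every barcode as a string, drops blanks and duplicates, and returns a count to compare against Actions > Members.

```python
import csv
import io


def barcode_sheet(barcodes):
    """Return (csv_text, count) for a one-column sheet headed 'Barcode'.

    Barcodes are handled as strings throughout, so long numeric values
    are never reinterpreted as numbers.
    """
    cleaned = []
    seen = set()
    for raw in barcodes:
        code = str(raw).strip()
        # Skip empty cells and repeated barcodes.
        if code and code not in seen:
            seen.add(code)
            cleaned.append(code)

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Barcode"])  # header named in Step 1
    for code in cleaned:
        writer.writerow([code])
    return buffer.getvalue(), len(cleaned)
```

Write the returned text to a file for upload, and keep the returned count for the comparison with the set's member count.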
Step 2: Export from ALMA using a publishing profile (need correct ALMA permissions to do this)
- ALMA menu > Resources > Publishing Profile > Add Profile
- Profile Details
- Select set name
- Publishing Mode: Full
- FTP – [your local ftp server]
- Physical format: Binary // number of records in file: one
- Data Enrichment
- Add holding information > checked
852 $b $c $h $i
- Add items information > checked
- Include item information >
955 $b (barcode) $v (description) $d (permanent location) $e (call number)
- Run publishing profile > Actions > Run
Step 3: Check publishing report and download file
- Use FileZilla to download your exported file
- [log in to your local ftp server]
- drag the file over to My Documents
- Open with MARCEdit > MARC Tools
- convert file to .mrk format (MARC Breaker function)
Step 4: Fields to remove/add -- Note: this could also be done with a normalization rule
- Open the .mrk file created in MARCEdit
- Remove all 9XX fields other than the 955 (948, 949, 938, 994).
- Remove 035 $9. ONLY 035 fields with an OCLC number should remain. Also remove any 019 and any 035 $z.
- Required elements:
LDR (000), 001, 008, 035 $a (OCoLC), 040 $c, 245, 300 $a
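The removals above can also be scripted against the .mrk text. The following is a minimal sketch, not a replacement for a normalization rule or for checking the result in MARCEdit. It assumes the usual mnemonic layout (`=TAG`, two spaces, two indicator characters, then `$`-delimited subfields, one field per line), and it reads the OCLC rule as "keep an 035 only if it has a $a beginning with (OCoLC)". The function name `clean_mrk` is hypothetical.

```python
def clean_mrk(text):
    """Apply the Step 4 removals to .mrk (mnemonic) text and return the result."""
    kept_lines = []
    for line in text.splitlines():
        if not line.startswith("="):
            # Blank lines separate records; keep them untouched.
            kept_lines.append(line)
            continue

        tag = line[1:4]
        if tag == "019":
            continue  # no 019 fields
        if tag.startswith("9") and tag != "955":
            continue  # drop 9XX fields other than 955

        if tag == "035":
            # line[6:] is the indicators followed by the subfields.
            indicators, *subfields = line[6:].split("$")
            # Drop $9 and $z subfields.
            subfields = [s for s in subfields if s[:1] not in ("9", "z")]
            # Assumption: an OCLC 035 has a $a starting with "(OCoLC)".
            if not any(s.startswith("a(OCoLC)") for s in subfields):
                continue
            line = "=035  " + "$".join([indicators] + subfields)

        kept_lines.append(line)
    return "\n".join(kept_lines)
```

The sketch does not check for the required elements; those still need to be confirmed separately.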
Step 5: Record checks using MARCEdit
- Make sure the counts of 000/001/008/035/245/955 are all equal (one bib per item record)
- Check MARCEdit MARC validation report
- Confirm that all records have only one OCLC number
- Validate headings – correct and establish headings as needed
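The count and OCLC checks above can be cross-checked with a short script over the .mrk file. The following is a minimal sketch. It assumes that records are separated by blank lines, that the leader appears as `=LDR` in the mnemonic file, and that an OCLC number is an 035 $a beginning with (OCoLC). All function names are hypothetical.

```python
from collections import Counter

# Tags whose counts should all be equal (the leader is "=LDR" in .mrk files).
CHECKED_TAGS = ("LDR", "001", "008", "035", "245", "955")


def split_records(text):
    """Group field lines into records, using blank lines as separators."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            records.append(current)
            current = []
    if current:
        records.append(current)
    return records


def tag_counts(text):
    """Count occurrences of each checked tag across the whole file."""
    counts = Counter(line[1:4] for line in text.splitlines() if line.startswith("="))
    return {tag: counts[tag] for tag in CHECKED_TAGS}


def counts_match(text):
    """True when every checked tag occurs the same number of times."""
    return len(set(tag_counts(text).values())) == 1


def records_without_single_oclc(text):
    """Return 1-based positions of records that do not have exactly one OCLC 035 $a."""
    flagged = []
    for position, record in enumerate(split_records(text), start=1):
        oclc_numbers = sum(
            line.count("$a(OCoLC)") for line in record if line.startswith("=035")
        )
        if oclc_numbers != 1:
            flagged.append(position)
    return flagged
```

A mismatch in `tag_counts` shows which field is missing or repeated. The positions from `records_without_single_oclc` show where to look in the file.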
Step 6: Convert final file to XML
- In MARCEditor, compile the .mrk file into MARC: File menu > Compile file into MARC. This will create a .mrc file, which is also UTF encoded. Close the file.
- Use MARCEdit tools to convert the final file to MARC21XML
- Click the MARC Tools icon
- Supply input and output file names and make sure to select MARC->MARC21XML.
- Execute
Step 7: Uploading to Zephir
- [HathiTrust gives you a naming convention]
- This naming convention also ensures that your files are not run through your configuration for any Google-scanned materials.
- Use CoreFTP to upload file
- Upload file to ftps.cdlib.org/submissions
Step 8: Send notification email to CDL
file name=<file name>
file size=<file size in bytes>
record count=<number of records>
notification email=<email address to which you would like your run notification sent>
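The four lines of the notification can be generated from the final XML file so that the size and record count are not typed by hand. The following is a minimal sketch, assuming the file uses the standard MARCXML namespace (http://www.loc.gov/MARC21/slim). The function names and the example address in the usage note are hypothetical.

```python
import os
import xml.etree.ElementTree as ET

# Fully qualified name of a record element in the standard MARCXML namespace.
MARC_RECORD = "{http://www.loc.gov/MARC21/slim}record"


def count_marcxml_records(path):
    """Count <record> elements without loading the whole file into memory."""
    count = 0
    for _event, element in ET.iterparse(path, events=("end",)):
        if element.tag == MARC_RECORD:
            count += 1
            element.clear()  # release the parsed record's contents
    return count


def notification_body(path, notify_address):
    """Build the four 'key=value' lines listed in Step 8."""
    return "\n".join([
        f"file name={os.path.basename(path)}",
        f"file size={os.path.getsize(path)}",  # size in bytes
        f"record count={count_marcxml_records(path)}",
        f"notification email={notify_address}",
    ])
```

For example, `print(notification_body("your_file.xml", "you@example.edu"))` prints text that can be pasted into the email. The record count is also worth comparing against the barcode count from Step 1.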
Step 9: Review run reports and error files
Run reports for contributors' files are posted to their FTPS space, in the subdirectory ftps.cdlib.org/runreports.
You are responsible for retrieving your own error files via FTPS from ftps.cdlib.org/errfiles. Error files will remain in the FTPS location for 60 days.
Error files will be provided as MARCXML files using the file naming convention: original_file_name_error.xml
Correcting your records and re-submitting them
After correcting the errors, you may re-submit the records to ftps.cdlib.org/submissions, following the same guidelines you followed for initial record submission. Important: Because the loader script relies on a change in the date associated with each record to apply updates, the filename used for the corrected records must include the date of re-submission and NOT the date included in the name of the file as initially submitted.