@jrgriffiniii
Last active August 11, 2020 20:32
DataSpace Dissertation MARC Record Enhancement

Updating MARC Dissertation Records

Firstly, please download the MARC files sent by e-mail.

Then, please use MarcEdit to process these files and verify that they are valid. Using MARC Tools, open the Princeton USMARC.MRC file and execute the MarcBreaker function, saving the output to a filename that references the month and year of the batch (e.g. dataspace_dissertations_june_2019.mrc). Individual records may be isolated in their own files, which should be renamed in the same manner (e.g. dataspace_dissertations_13886597.mrc for Princeton - USMARC - Pub 13886597.MRC). Following this, select Edit Records in order to confirm that the MARC records appear to be valid.
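The naming convention described above can be sketched as two small shell helpers. This is only an illustration: the function names single_record_name and batch_name are hypothetical, and the sketch assumes the e-mailed single-record files always follow the "Princeton - USMARC - Pub <number>.MRC" pattern shown in the example.

```shell
#!/bin/sh
# Hypothetical helper: derive the working filename for a single-record file.
# Assumes the source name looks like "Princeton - USMARC - Pub 13886597.MRC".
single_record_name() {
    # Keep only the digits between "Pub " and the file extension.
    pub_id=$(printf '%s\n' "$1" | sed -n 's/.*Pub \([0-9][0-9]*\)\.[Mm][Rr][Cc]$/\1/p')
    if [ -z "$pub_id" ]; then
        echo "unrecognized file name: $1" >&2
        return 1
    fi
    printf 'dataspace_dissertations_%s.mrc\n' "$pub_id"
}

# Hypothetical helper: build the batch filename from a month and a year.
batch_name() {
    printf 'dataspace_dissertations_%s_%s.mrc\n' "$1" "$2"
}

single_record_name "Princeton - USMARC - Pub 13886597.MRC"
batch_name june 2019
```

Run as-is, this prints the two filenames used in the examples above.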

After opening an SSH tunnel:

ssh -L 1234:dataspace.princeton.edu:22 $USER@$BASTION_HOST

...please copy the files to the server environment:

% scp -P 1234 dataspace_dissertations_june_2019.mrc libvijrg@localhost:/tmp/
% scp -P 1234 dataspace_dissertations_13886597.mrc libvijrg@localhost:/tmp/

Then, as the user dspace, please invoke the following:

cd ~/dspace/dissertations/MARC-records
set BATCH_ID=june_2019
mkdir $BATCH_ID
cd $BATCH_ID
cp /tmp/dataspace_dissertations_june_2019.mrc .
../marc-records-add-arks dataspace_dissertations_june_2019.mrc
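For anyone working in a POSIX shell rather than tcsh, the directory setup and copy steps above could be wrapped roughly as follows. This is a sketch under stated assumptions: prepare_batch is a hypothetical name, its three arguments are illustrative, and it does not run the enhancement script itself.

```shell
#!/bin/sh
# Hypothetical helper mirroring the manual steps: create the batch directory
# and copy the uploaded file into it. The argument layout is an assumption.
prepare_batch() {
    base_dir=$1   # e.g. the MARC-records directory
    batch_id=$2   # e.g. june_2019
    upload=$3     # e.g. the file copied to /tmp/

    # Refuse to continue if the upload is missing or empty.
    if [ ! -s "$upload" ]; then
        echo "missing or empty upload: $upload" >&2
        return 1
    fi
    mkdir -p "$base_dir/$batch_id" || return 1
    cp "$upload" "$base_dir/$batch_id/" || return 1
    # Print the path of the copy so the caller can pass it on.
    printf '%s\n' "$base_dir/$batch_id/$(basename "$upload")"
}

# Demo in a throwaway directory rather than on the server paths.
work=$(mktemp -d)
printf 'placeholder' > "$work/dataspace_dissertations_june_2019.mrc"
prepare_batch "$work/MARC-records" june_2019 "$work/dataspace_dissertations_june_2019.mrc"
rm -rf "$work"
```

The early check on the upload is a design choice of this sketch: it stops before creating a batch directory when the scp step did not deliver a usable file.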
The marc-records-add-arks script:

#!/bin/tcsh -e
# Adds ARKs to a file of MARC dissertation records.
set dir = `pwd`
set clijava = "$HOME/cli-java"

# Take the input file from the first argument, or prompt for it.
if ($#argv != 1) then
    echo "Usage: $0 'absolute-path-to-MARC-record-file'"
    echo -n "enter MARC record file name> "
    set input = $<
else
    set input = $1
endif

# Work on a dated copy in the current directory.
set f = $dir/MARC-`date +%Y.%m.%d`
if ( -f $input ) then
    cp $input $f
else
    echo "Input, '$input', not found, missing/incorrect path?"
    exit 1
endif
echo "copied to $f"

(cd $clijava ; ant compile )

echo "removing line feeds from $f"
tr -d '\012' < $f > $f.new

echo -n "Number of Theses on file: "
strings $f | fgrep 'Princeton University.' | wc -l

set log = $dir/log-`date +%Y.%m.%d`
echo "running MARC record enhancement - this can take a while"
echo "logging to $log"
echo "run the tail command to watch the log file: tail -f $log"
$clijava/runMARCProcessor $f.new $f.mrc >& $log

# Report failure when the log contains no DEPARTMENT lines.
fgrep DEPARTMENT $log
if ($status != 0) then
    echo "FAILURE"
    exit 1
endif
echo "created file with arks: $f.mrc"
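
The script estimates the number of theses by searching for the string 'Princeton University.'. As a cross-check, binary MARC records each end with a record terminator byte (0x1D, octal 035), so counting that byte gives a record count. The sketch below is an illustration only: count_marc_records is a hypothetical helper, and the sample data is fabricated filler rather than real MARC.

```shell
#!/bin/sh
# Count MARC records by counting record terminators (byte 0x1D, octal 035).
count_marc_records() {
    tr -cd '\035' < "$1" | wc -c | tr -d ' '
}

# Throwaway sample: two fake "records", the second split by a line feed.
sample=$(mktemp)
printf 'record one\036\035record\ntwo\036\035' > "$sample"

# The same line-feed removal that the script performs.
tr -d '\012' < "$sample" > "$sample.new"

before=$(count_marc_records "$sample")
after=$(count_marc_records "$sample.new")
echo "records before: $before, after: $after"

# Removing line feeds should never change the number of records.
if [ "$before" -ne "$after" ]; then
    echo "record count changed" >&2
fi
rm -f "$sample" "$sample.new"
```

On a real batch, the same function could be pointed at the MARC-<date>.new and MARC-<date>.mrc files. Whether the enhanced output keeps exactly one record per input record is an assumption worth confirming before relying on the comparison.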