@bibliotechy
Last active March 5, 2019 15:19

MADS is responsible for uploading born-digital objects acquired as part of an archival collection to the Isilon. These usually arrive as hard drives and similar media, and MADS wants complete copies sent to the Isilon as part of the curation process.

The existing project proposal is in Confluence: https://tulibdev.atlassian.net/wiki/spaces/TPI/pages/864321537/Project+Proposal+Hard+Drive+Ingestion+Script

Currently, MADS (Stefanie) mounts the Isilon on her machine, mounts the HD on her Mac via a write blocker, and then runs an rsync command to recursively copy the entire filesystem.
The rsync command was provided to her by Jim Bongiovani. Prior to that, an extensive bash script written by Kate Lynch was in use, but Stefanie has never been able to get it working. I don't believe those old scripts were ever kept in version control anywhere.

The biggest problems with the current command are that

  • it is brittle; e.g. rsync's command-line flags are hard to remember and easy to get wrong
  • it feels dangerous - rsync can have bad effects if small flag mistakes are made
  • it doesn't work on Windows
  • it doesn't even do the bare minimum for digital preservation - checksums on both sides

A project proposal has been submitted to build a replacement that meets the following requirements:

  1. Easier to use. A CLI is OK, but avoid arcane flags and provide reasonable defaults
  2. Generates checksums for all the files intended to be transferred before transfer, and then double-checks the checksums on the destination.
  3. Has actual documentation about how it works, how it can be installed, and what effects it has
  4. Works on Mac and Windows
  5. Is supported by devs / LTS. I take this to mean we actually write tests for it so we can confidently add features in the future
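
To make requirement 2 concrete, here is a rough Go sketch of the "checksum before, copy, checksum after" loop for a single file. It is only an illustration under assumptions, not the proposed tool: the names hashFile, copyVerified and runDemo are made up for this note, SHA-256 is an arbitrary choice of algorithm, and the demo copies a tiny file inside a temp directory rather than touching a real drive or the Isilon.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// hashFile returns the hex-encoded SHA-256 digest of the file at path.
func hashFile(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// copyVerified (hypothetical helper) hashes src before the transfer,
// copies it to dst, re-hashes dst, and fails if the digests differ.
func copyVerified(src, dst string) (string, error) {
	want, err := hashFile(src)
	if err != nil {
		return "", fmt.Errorf("hashing source: %w", err)
	}
	if err := os.MkdirAll(filepath.Dir(dst), 0o755); err != nil {
		return "", err
	}
	in, err := os.Open(src)
	if err != nil {
		return "", err
	}
	defer in.Close()
	out, err := os.Create(dst)
	if err != nil {
		return "", err
	}
	if _, err := io.Copy(out, in); err != nil {
		out.Close()
		return "", err
	}
	// Close explicitly so write errors surface before we re-read the file.
	if err := out.Close(); err != nil {
		return "", err
	}
	got, err := hashFile(dst)
	if err != nil {
		return "", fmt.Errorf("hashing destination: %w", err)
	}
	if got != want {
		return "", fmt.Errorf("checksum mismatch for %s: source %s, destination %s", dst, want, got)
	}
	return want, nil
}

// runDemo copies a small file between two temp subdirectories and
// returns the verified digest.
func runDemo() (string, error) {
	dir, err := os.MkdirTemp("", "copyverified")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	src := filepath.Join(dir, "source", "a.txt")
	if err := os.MkdirAll(filepath.Dir(src), 0o755); err != nil {
		return "", err
	}
	if err := os.WriteFile(src, []byte("abc"), 0o644); err != nil {
		return "", err
	}
	return copyVerified(src, filepath.Join(dir, "dest", "a.txt"))
}

func main() {
	sum, err := runDemo()
	if err != nil {
		fmt.Fprintln(os.Stderr, "transfer failed:", err)
		os.Exit(1)
	}
	fmt.Println("verified sha256:", sum)
}
```

Note that this reads the source twice (once to hash, once to copy). That matches the "checksums before transfer" wording, but a real tool might hash while copying to save a pass over a slow drive.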

It seems like we could use something like BagIt to avoid bikeshedding on how to do the checksumming. It also seems like a great opportunity to write it in a language that compiles to a binary, like Go, to make supporting OSX and Windows easier.
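
For a sense of what BagIt would settle for us: a bag keeps the payload under data/, and a manifest file lists one "checksum path" line per payload file. The Go sketch below hand-rolls just that much, assuming SHA-256 and the BagIt 1.0 layout from RFC 8493. The names writeManifest and runDemo are invented for this note, and a real tool would more likely lean on an existing BagIt implementation than on this.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// hashFile returns the hex-encoded SHA-256 digest of the file at path.
func hashFile(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// writeManifest (hypothetical helper) walks bagDir/data, writes bagit.txt
// and manifest-sha256.txt into bagDir, and returns the manifest text.
func writeManifest(bagDir string) (string, error) {
	var lines []string
	payload := filepath.Join(bagDir, "data")
	err := filepath.WalkDir(payload, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.Type().IsRegular() {
			return nil // directories are descended into; other non-files are skipped
		}
		sum, err := hashFile(path)
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(bagDir, path)
		if err != nil {
			return err
		}
		// Manifest paths are relative to the bag root and use forward slashes,
		// so the same manifest is valid on Mac and Windows.
		lines = append(lines, sum+"  "+filepath.ToSlash(rel))
		return nil
	})
	if err != nil {
		return "", err
	}
	manifest := strings.Join(lines, "\n") + "\n"
	decl := "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n"
	if err := os.WriteFile(filepath.Join(bagDir, "bagit.txt"), []byte(decl), 0o644); err != nil {
		return "", err
	}
	if err := os.WriteFile(filepath.Join(bagDir, "manifest-sha256.txt"), []byte(manifest), 0o644); err != nil {
		return "", err
	}
	return manifest, nil
}

// runDemo builds a tiny throwaway bag in a temp directory.
func runDemo() (string, error) {
	bag, err := os.MkdirTemp("", "bag")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(bag)
	if err := os.MkdirAll(filepath.Join(bag, "data", "sub"), 0o755); err != nil {
		return "", err
	}
	if err := os.WriteFile(filepath.Join(bag, "data", "a.txt"), []byte("abc"), 0o644); err != nil {
		return "", err
	}
	if err := os.WriteFile(filepath.Join(bag, "data", "sub", "empty.txt"), nil, 0o644); err != nil {
		return "", err
	}
	return writeManifest(bag)
}

func main() {
	manifest, err := runDemo()
	if err != nil {
		fmt.Fprintln(os.Stderr, "bagging failed:", err)
		os.Exit(1)
	}
	fmt.Print(manifest)
}
```

The appeal is that the destination check becomes "re-hash everything under data/ and compare against the manifest", and the manifest travels with the files, so it can be re-checked later without the original drive.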
