@ahankinson
Last active June 27, 2018 15:47
OCFL Drafts
ocfl-root ❯ tree
.
├── 0=ocfl_1.0
├── ark:12148
│   └── btv
│       └── 1b844
│           ├── 7298r
│           │   └── btv1b8447298r.tar
│           └── 90444
│               └── btv1b84490444
│                   ├── 0=ocfl_object_1.0
│                   ├── inventory.jsonld
│                   ├── inventory.jsonld.sha512
│                   ├── logs
│                   │   ├── 2015
│                   │   │   └── audit-2015-01-01.txt
│                   │   ├── 2016
│                   │   │   └── audit-2016-01-01.txt
│                   │   ├── 2017
│                   │   │   └── audit-2017-01-01.txt
│                   │   └── 2018
│                   │       └── audit-2018-01-01.txt
│                   ├── v1
│                   │   ├── inventory.jsonld
│                   │   ├── data
│                   │   │   ├── page1
│                   │   │   │   ├── btv1b84490444_page1.jp2
│                   │   │   │   ├── btv1b84490444_page1.tiff
│                   │   │   │   └── btv1b84490444_page1.xmp
│                   │   │   └── page2
│                   │   │       └── btv1b84490444_page2.tiff
│                   │   └── metadata
│                   │       └── mets.xml
│                   ├── v2
│                   │   ├── inventory.jsonld
│                   │   ├── data
│                   │   │   └── page2
│                   │   │       ├── btv1b84490444_page2.jp2
│                   │   │       └── btv1b84490444_page2.xmp
│                   │   └── metadata
│                   │       └── mets.xml
│                   └── v3
│                       ├── inventory.jsonld
│                       ├── data
│                       │   └── page1
│                       │       ├── btv1b84490444_page1_reshoot.jp2
│                       │       ├── btv1b84490444_page1_reshoot.tiff
│                       │       └── btv1b84490444_page1_reshoot.xmp
│                       └── metadata
│                           └── mets.xml
└── logs
    └── 2018
        └── filesystem-audit-2018-01-01.txt

Notes

NamAsTe (Name as Text) files identify the OCFL root and the OCFL object, and declare the version of each:

0=ocfl_1.0
0=ocfl_object_1.0
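
As a rough illustration of how a client might read these tag files, the sketch below splits a file name of the form shown above into a name and a version. The function name `parse_namaste` and the rule "everything after the last underscore is the version" are assumptions made for this example, not something the draft defines.

```python
def parse_namaste(filename):
    """Split a '0=<name>_<version>' tag file name into (name, version).

    Returns None when the name does not follow that pattern.
    Assumption: the version is whatever follows the last underscore.
    """
    if not filename.startswith("0="):
        return None
    name, sep, version = filename[2:].rpartition("_")
    if not sep or not name or not version:
        return None
    return name, version


if __name__ == "__main__":
    for tag in ("0=ocfl_1.0", "0=ocfl_object_1.0"):
        print(tag, "->", parse_namaste(tag))
```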

An inventory.jsonld file is always located in each version directory. The inventory of the most recent version is also copied to the root of the OCFL Object. Because a file cannot contain its own SHA512 hash, the hash of this most recent inventory is kept in an inventory.jsonld.sha512 file in the root of the object.

The SHA512 of the ‘inventory.jsonld’ in the root is recorded in the inventory.jsonld.sha512 file as the hex digest, a space, and the file name:

cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e inventory.jsonld
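
A minimal sketch of producing and checking a line in this format, using Python's standard `hashlib`. The function names `sidecar_line` and `sidecar_matches` are hypothetical, and the single-space separator is taken from the example line above rather than from any formal rule.

```python
import hashlib


def sidecar_line(inventory_bytes, filename="inventory.jsonld"):
    """Return '<sha512 hex digest> <filename>' for the given inventory content."""
    digest = hashlib.sha512(inventory_bytes).hexdigest()
    return f"{digest} {filename}"


def sidecar_matches(inventory_bytes, sidecar_text):
    """Compare the inventory content with the digest recorded in the sidecar text."""
    parts = sidecar_text.split()
    if not parts:
        return False  # an empty sidecar file records no digest
    return hashlib.sha512(inventory_bytes).hexdigest() == parts[0]


if __name__ == "__main__":
    content = b'{"head": "#v1"}'
    line = sidecar_line(content)
    print(line)
    print(sidecar_matches(content, line))
```

The digest in the example line above is the SHA512 of empty input, so it reads as a placeholder rather than the hash of a real inventory.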

The ‘inventory.jsonld’ file contains the information necessary to identify the contents of the object, as well as its versions. It is structured as JSON-LD and features:

  • A pointer to the most recent “version” in the ‘head’ key; this is implemented as a pointer to a local “@id”.
  • A manifest of all files in the object. This will include all files in all versions.
  • A list of the versions.

The JSON-LD shown is not meant to be globally accessible, hence the ‘urn:ark’ value for the root @id. Rather, JSON-LD was chosen as a way of accessing and defining the semantics of the keys within the inventory file, and so that the file could be ingested into a triplestore.

An OCFL client would read the ‘inventory.jsonld’ file, find the ID of the latest version in ‘head’, read that version’s ‘members’ block, and then look up the path of each file in the manifest using its SHA512.
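
The lookup described here could be sketched roughly as follows. It assumes the inventory has already been parsed into a Python dict (the `//` comments in the example at the end of this gist would have to be stripped before `json.loads` accepts it); `resolve_head_files` is a hypothetical name, and the short digests in the demo are placeholders.

```python
def resolve_head_files(inventory):
    """Return the manifest paths of every member of the head version."""
    head_id = inventory["head"]  # e.g. "#v3"
    versions_by_id = {version["id"]: version for version in inventory["versions"]}
    head_version = versions_by_id[head_id]
    manifest = inventory["manifest"]  # maps SHA512 digest -> path
    return [manifest[digest] for digest in head_version["members"]]


if __name__ == "__main__":
    # Placeholder digests ("aaa", "bbb", "ccc") stand in for real SHA512 values.
    inventory = {
        "head": "#v2",
        "manifest": {
            "aaa": "v1/data/page1.tiff",
            "bbb": "v1/metadata/mets.xml",
            "ccc": "v2/metadata/mets.xml",
        },
        "versions": [
            {"id": "#v1", "members": ["aaa", "bbb"]},
            {"id": "#v2", "members": ["aaa", "ccc"]},
        ],
    }
    print(resolve_head_files(inventory))
```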

Versions can also be compared by performing set operations on two ‘members’ blocks to identify the files that have changed.
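
One possible shape for that comparison, assuming each version is a dict with a ‘members’ list of digests as in the inventory example; the function name `diff_versions` and the result keys are made up for this sketch.

```python
def diff_versions(older, newer):
    """Compare two version blocks by treating their 'members' lists as sets."""
    old_members = set(older["members"])
    new_members = set(newer["members"])
    return {
        "added": new_members - old_members,      # digests only in the newer version
        "removed": old_members - new_members,    # digests only in the older version
        "unchanged": old_members & new_members,  # digests present in both
    }


if __name__ == "__main__":
    # Placeholder digests stand in for real SHA512 values.
    v1 = {"id": "#v1", "members": ["aaa", "bbb"]}
    v2 = {"id": "#v2", "members": ["aaa", "ccc"]}
    print(diff_versions(v1, v2))
```

Because members are digests rather than paths, a replaced file shows up as one removed digest and one added digest.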

There are no requirements for the contents or structure of the ‘logs’ directories in either the disk root or the object root. The contents shown are for illustration only.

A tar file is shown to illustrate how tarred objects may be implemented. One could imagine that the expanded tar file would look similar to the object shown.

Use of SHA512 can help identify problems quickly. If a file exists in a version folder but has no SHA512 entry in the manifest, or if the file at a given path in the manifest has a different checksum, an object validator can identify these as problems.
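
A minimal validator along these lines might look like the sketch below. It is an illustration under several assumptions: version directories are taken to be the `v*` directories under the object root, the manifest is a digest-to-path dict as in the example inventory, and the `exempt` parameter is a made-up way to skip files that cannot be listed, such as the head version's own inventory.jsonld (v3/inventory.jsonld in the tree above has no manifest entry).

```python
import hashlib
from pathlib import Path


def sha512_of(path):
    """Hash a file in chunks so large images need not fit in memory."""
    hasher = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def validate_object(object_root, manifest, exempt=frozenset()):
    """Return a list of problem strings; an empty list means no problem found."""
    object_root = Path(object_root)
    problems = []
    # Check 1: every manifest entry exists and still has the recorded digest.
    for digest, rel_path in manifest.items():
        file_path = object_root / rel_path
        if not file_path.is_file():
            problems.append(f"missing: {rel_path}")
        elif sha512_of(file_path) != digest:
            problems.append(f"checksum mismatch: {rel_path}")
    # Check 2: every file in a version directory is listed in the manifest.
    listed_paths = set(manifest.values())
    for version_dir in sorted(object_root.glob("v*")):
        if not version_dir.is_dir():
            continue
        for file_path in sorted(version_dir.rglob("*")):
            if not file_path.is_file():
                continue
            rel_path = file_path.relative_to(object_root).as_posix()
            if rel_path not in listed_paths and rel_path not in exempt:
                problems.append(f"unlisted: {rel_path}")
    return problems
```

A caller would pass the object directory and the parsed ‘manifest’ block, for example `validate_object("btv1b84490444", inventory["manifest"], exempt={"v3/inventory.jsonld"})`.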

{
"@context": "https://ocfl.org/v1.0/",
"id": "urn:ark:12148/btv1b84490444",
"type": "Object",
"head": "#v3",
// This is a list of all files that are in the object. It must be updated
// with files added in new versions.
// Writing the same file to a new location (same checksum, different path)
// is an error, since versions are fixed and cannot be changed. Instead,
// if a file is restored in later versions, it can be referenced in its earlier
// incarnation.
"manifest": {
"a83e3633c880fb4caae55edd60571d930681496572f6ec6910b15a7796a38fcf0fff6aaf4acb8aac0f9a3049e42cf27bca0e2790105705d35d5727e1b9443f44": "v1/data/page1/btv1b84490444_page1.jp2",
"2638e7c5bab6aadd2cb8eec5a48eefe35d25e005b55bd638e8ce222ef7204a790e64613ec63bb66e543c7e6d610391ebc3dd1b019b5ed028d942a1b00a0fb177": "v1/data/page1/btv1b84490444_page1.tiff",
"e8eb20d68e2960cc77cf52976f665c368a5c5aba43af3e8cfa2bf5c56547087fa290f06b49a595cf6557c5d882e0ebbe16e2231771c659dc9d166f5320b40eb7": "v1/data/page1/btv1b84490444_page1.xmp",
"de655a830d1c5d5c5cb6bf333d43f63fabf23a325d1a821c057b9ccf53e754c2e57e99e1753b1bcf071d1e6f9ced0067b7873fe2cb4dcd2d6741c27085e75f4f": "v1/data/page2/btv1b84490444_page2.tiff",
"83fe4cd4e6b711c29f0b9e16b0c7700c4c9946b6c61af261341efda4252d988cb3584d77f16e16138d012e6960c1c23467de4b8e3cfacda802e24920636212f6": "v2/data/page2/btv1b84490444_page2.jp2",
"f3f965c408011fab1b32e31e1536f77ec6b8796945d3ec6414e6bf46320c21b1af05a43f98da161a382c29594c3abb2a4426299bc1340362f9fb1ad71c2100df": "v2/data/page2/btv1b84490444_page2.xmp",
"fd3cfe0321074427d3907a2ff581d91112a831db84d0ef04be5dc0cb0628b01e2bfbd0f6375cc74fe33f2e2b24d22bef195eeb3811c9c04022b0ab1a4bd1c044": "v3/data/page1/btv1b84490444_page1_reshoot.jp2",
"5dc5a8f0926a8d68740274ceb99304c7f2f0640d683f84bfbbfa19f34dc50b3676f4b6973e71426a364defe030c98fa271b89bb34370bedcbd718eba00566397": "v3/data/page1/btv1b84490444_page1_reshoot.tiff",
"c548b878e9babcb24676188715be10a0286173b7dc7c61f9fc9409d4ed98c0404e28913013f5b16fa15aba6b853ac8db54975760dd4663bc4272701453154f81": "v3/data/page1/btv1b84490444_page1_reshoot.xmp",
"2ef53d9842fd6cc5410a42e2df560b65b8d9fcdedb28e99c08f81f98ece35d90232b3530d140b5a0242972092d8de5d7206032dc5e78e45ac9d00e461a85587d": "v1/metadata/mets.xml",
"2b10f07d828d76428d15fab1949e990e11f38663e65466cff2759ca10ab9c6411caf702152239045efb498041a0820d8d4a1aebdd6bbc0d4cb6c5468674741aa": "v2/metadata/mets.xml",
"5cc988a4f29c07229ed321dbf58e00de16ea36ea3c28df9db8f112dfca0ff0760ca4fb626d0a65300cf074107990c3ae704b5a3870f3c6ec7fc027633feb2773": "v3/metadata/mets.xml",
"c2465c96dbbdecfac375997e949a494bf7b4786e7810c61bc994033ffd99c39d2920379f51e2c938012635e5e27cb32d1c98fda558f2aea1d4484b275fab2987": "v1/inventory.jsonld",
"694c9c48d09bf9a1a2cecdcd92b27f3d50b7d8bd28d2bb5d50291402d4230ca73843825281f8ac2057bbc7bd1ea34910a68f3d229415854a7f89d5946e89529f": "v2/inventory.jsonld"
},
"versions": [
{
"type": "Version",
"id": "#v1",
"created": "2014-01-01T12:00:00Z",
"message": "Initial version",
"client": "OCFL Python Library 1.1.0",
"user": {
"name": "Andrew Hankinson",
"email": "andrew.hankinson@bodleian.ox.ac.uk"
},
"members": [
"a83e3633c880fb4caae55edd60571d930681496572f6ec6910b15a7796a38fcf0fff6aaf4acb8aac0f9a3049e42cf27bca0e2790105705d35d5727e1b9443f44", // "v1/data/page1/btv1b84490444_page1.jp2"
"2638e7c5bab6aadd2cb8eec5a48eefe35d25e005b55bd638e8ce222ef7204a790e64613ec63bb66e543c7e6d610391ebc3dd1b019b5ed028d942a1b00a0fb177", // "v1/data/page1/btv1b84490444_page1.tiff"
"e8eb20d68e2960cc77cf52976f665c368a5c5aba43af3e8cfa2bf5c56547087fa290f06b49a595cf6557c5d882e0ebbe16e2231771c659dc9d166f5320b40eb7", // "v1/data/page1/btv1b84490444_page1.xmp"
"de655a830d1c5d5c5cb6bf333d43f63fabf23a325d1a821c057b9ccf53e754c2e57e99e1753b1bcf071d1e6f9ced0067b7873fe2cb4dcd2d6741c27085e75f4f", // "v1/data/page2/btv1b84490444_page2.tiff"
"2ef53d9842fd6cc5410a42e2df560b65b8d9fcdedb28e99c08f81f98ece35d90232b3530d140b5a0242972092d8de5d7206032dc5e78e45ac9d00e461a85587d" // "v1/metadata/mets.xml"
]
},
{
"type": "Version",
"id": "#v2",
"created": "2014-01-01T13:00:00Z",
"message": "Added page 2 JPEG 2000 and XMP",
"client": "OCFL Python Library 1.1.0",
"user": {
"name": "Andrew Hankinson",
"email": "andrew.hankinson@bodleian.ox.ac.uk"
},
"members": [
"a83e3633c880fb4caae55edd60571d930681496572f6ec6910b15a7796a38fcf0fff6aaf4acb8aac0f9a3049e42cf27bca0e2790105705d35d5727e1b9443f44", // "v1/data/page1/btv1b84490444_page1.jp2"
"2638e7c5bab6aadd2cb8eec5a48eefe35d25e005b55bd638e8ce222ef7204a790e64613ec63bb66e543c7e6d610391ebc3dd1b019b5ed028d942a1b00a0fb177", // "v1/data/page1/btv1b84490444_page1.tiff"
"e8eb20d68e2960cc77cf52976f665c368a5c5aba43af3e8cfa2bf5c56547087fa290f06b49a595cf6557c5d882e0ebbe16e2231771c659dc9d166f5320b40eb7", // "v1/data/page1/btv1b84490444_page1.xmp"
"de655a830d1c5d5c5cb6bf333d43f63fabf23a325d1a821c057b9ccf53e754c2e57e99e1753b1bcf071d1e6f9ced0067b7873fe2cb4dcd2d6741c27085e75f4f", // "v1/data/page2/btv1b84490444_page2.tiff"
// new files added in version 2
"2b10f07d828d76428d15fab1949e990e11f38663e65466cff2759ca10ab9c6411caf702152239045efb498041a0820d8d4a1aebdd6bbc0d4cb6c5468674741aa", // "v2/metadata/mets.xml"
"83fe4cd4e6b711c29f0b9e16b0c7700c4c9946b6c61af261341efda4252d988cb3584d77f16e16138d012e6960c1c23467de4b8e3cfacda802e24920636212f6", // "v2/data/page2/btv1b84490444_page2.jp2",
"f3f965c408011fab1b32e31e1536f77ec6b8796945d3ec6414e6bf46320c21b1af05a43f98da161a382c29594c3abb2a4426299bc1340362f9fb1ad71c2100df" // "v2/data/page2/btv1b84490444_page2.xmp",
]
},
{
"type": "Version",
"id": "#v3",
"created": "2018-01-05T14:00:00Z",
"message": "Replaced page 1 with a re-shot version",
"client": "OCFL Ruby Gem 0.9.5",
"user": {
"name": "Andrew Woods",
"email": "awoods@duraspace.org"
},
"members": [
"fd3cfe0321074427d3907a2ff581d91112a831db84d0ef04be5dc0cb0628b01e2bfbd0f6375cc74fe33f2e2b24d22bef195eeb3811c9c04022b0ab1a4bd1c044", // "v3/data/page1/btv1b84490444_page1_reshoot.jp2"
"5dc5a8f0926a8d68740274ceb99304c7f2f0640d683f84bfbbfa19f34dc50b3676f4b6973e71426a364defe030c98fa271b89bb34370bedcbd718eba00566397", // "v3/data/page1/btv1b84490444_page1_reshoot.tiff"
"c548b878e9babcb24676188715be10a0286173b7dc7c61f9fc9409d4ed98c0404e28913013f5b16fa15aba6b853ac8db54975760dd4663bc4272701453154f81", // "v3/data/page1/btv1b84490444_page1_reshoot.xmp"
"de655a830d1c5d5c5cb6bf333d43f63fabf23a325d1a821c057b9ccf53e754c2e57e99e1753b1bcf071d1e6f9ced0067b7873fe2cb4dcd2d6741c27085e75f4f", // "v1/data/page2/btv1b84490444_page2.tiff"
"83fe4cd4e6b711c29f0b9e16b0c7700c4c9946b6c61af261341efda4252d988cb3584d77f16e16138d012e6960c1c23467de4b8e3cfacda802e24920636212f6", // "v2/data/page2/btv1b84490444_page2.jp2"
"f3f965c408011fab1b32e31e1536f77ec6b8796945d3ec6414e6bf46320c21b1af05a43f98da161a382c29594c3abb2a4426299bc1340362f9fb1ad71c2100df", // "v2/data/page2/btv1b84490444_page2.xmp"
"5cc988a4f29c07229ed321dbf58e00de16ea36ea3c28df9db8f112dfca0ff0760ca4fb626d0a65300cf074107990c3ae704b5a3870f3c6ec7fc027633feb2773" // "v3/metadata/mets.xml"
]
}
]
}
@zimeon

zimeon commented May 25, 2018

Need to zap @ from type and id in last block of versions.jsonld

@ahankinson
Author

^^ Done
