@cmharlow
Last active April 24, 2023 20:02
Proposed Metadata Starter Specs to add to metadata-docs/ for Hyrax Codebase

Hyrax Metadata Specifications & Documentation

Goal(s) & Scope of metadata-docs/

This metadata-docs/ directory contains metadata technical documentation and specifications that don't fit in the existing codebase documentation methods, but require management, a review process, and versioning beyond what is captured in the GitHub wiki. In particular, when a Hyrax version is released, the metadata-docs/ travel with the codebase and indicate the current metadata specifications followed in that release. This documentation doesn't indicate recommendations for individual implementations of Hyrax (your metadata documentation will no doubt vary), but aims to clarify what metadata specifications exist in the core Hyrax codebase.

It is not intended to be a space for documentation auto-generated from the code (e.g., it is not attached to a tool like RDoc), nor is it necessarily limited to plain text markdown files (though files in proprietary formats are to be avoided). However, it should hold metadata documentation and specifications that are (or need to be) closely coupled with the codebase itself and its version releases. Other documentation should use the existing GitHub Wiki for the Hyrax Repository.

metadata-docs/ Update Procedure (Proposed)

The update and review procedure for the documentation in metadata-docs/ is proposed as follows:

  1. Discussion: Questions, Requests for Changes, or Issues are discussed using GH Issues on the Hyrax Repository tagged 'metadata' and including the Hyrax-Metadataist Team as reviewers. This triggers Hyrax-Metadataist Team volunteers to set up a timeframe and channel of communication for the requested input/review. They should also clarify the output needed from this discussion, and either manage generating that output or hand it off to a further working group as needed.
  2. Proposal: A Proposal for update of the relevant documentation is created and either attached to a current branch with the code changes that led to the documentation update, or attached to its own branch if not directly attached to code changes (see below for more on this).
  3. PR Review: The branch with the metadata-docs/ changes is part of an issued PR, including the existing review structure of the changes there. This includes checking the documentation / codebase relationship (described below).
  4. Merge: Once accepted, the PR is merged into the core Hyrax codebase master (or appropriate) branch.

Additionally, the Hyrax-Metadataist Team is responsible for an annual full-pass review of metadata-docs/ to identify needed updates, check on stalled discussions and outputs, and address other general maintenance issues.

metadata-docs/ Review & Code Relationship Procedure (Proposed)

For metadata-docs/ to remain relevant and in sync with the code, there needs to be a two-way review process between codebase / code changes and docs or metadata specification changes.

PRs and Release reviews should include a metadata-docs/ review (reviewers for this would be pulled from the Hyrax-Metadataist Team for the metadata specifications). This ensures the metadata-docs/ documentation is relevant to that codebase release or change.

Specific questions or issues about metadata-docs/, or about related work not yet captured there, can be raised using GH Issues on the Hyrax Repository tagged 'metadata' and including the Hyrax-Metadataist Team as reviewers. The output of those discussions, where appropriate, should feed back into the metadata-docs/ Update Procedure, above.

Contact & Help

Contact:

For help with git / GitHub to work with these specifications, we recommend an introduction to the GitHub flow and simple steps for getting started in GitHub.

That said, git (a version control system) and GitHub (a place to host git repositories) are used to coordinate the above metadata procedures and documentation within a community. Our Hyrax community has many folks with git/GitHub experience who are willing to help, as well as plenty of work and discussion paths that don't require git knowledge to participate.

---
title: Hyrax Basic Metadata
author(s): Tom Johnson, Christina Harlow
date updated: 2017-04-12
profile:
  organization: Hyrax Community
  project: Hyrax Metadata Profile
  namespaces:
    dce:    http://purl.org/dc/elements/1.1/
    dct:    http://purl.org/dc/terms/
    ebu:    http://www.ebu.ch/metadata/ontologies/ebucore/ebucore#
    edm:    http://www.europeana.eu/schemas/edm/
    fcr:    "info:fedora/fedora-system:"
    foaf:   http://xmlns.com/foaf/0.1/
    mrel:   http://id.loc.gov/vocabulary/relators/
    pcdm:   http://pcdm.org/models#
    premis: http://www.loc.gov/premis/rdf/v1#
    rdfs:   http://www.w3.org/2000/01/rdf-schema#
    sufia:  http://scholarsphere.psu.edu/ns#
    sweet:  http://sweet.jpl.nasa.gov/2.2/reprDataFormat.owl#
    works:  http://pcdm.org/works#
    xsd:    http://www.w3.org/2001/XMLSchema#

---

Hyrax Basic Metadata

About

This is the core Metadata Application Profile (MAP) for the Hyrax code. It doesn't bind Hyrax implementations to using this MAP; instead, it is meant to serve as a technical specification clarifying the core community-sourced metadata decisions in this codebase, so that the MAP is easier to expand or modify in implementations.

To update your Hyrax Implementation's MAP, check out the documentation to be added here: https://github.com/projecthydra-labs/hyrax/wiki

To take part in community review, revisions, and updates to this MAP, see the management details in the docs/README.md document.

Model

This model is a formalization of the PCDM Application Profile and its Hydra Works Extension as deployed in Hyrax.

[Figures: drawing of model; ordering construction]

Location of ActiveFedora models, unless otherwise indicated: app/models/concerns/hyrax/
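
The predicates in this profile are written as prefixed names (for example, dct:title) that rely on the namespaces declared in the profile header. As a minimal sketch of how such a name resolves to a full IRI, assuming a plain Ruby hash holding a subset of those namespaces: `NAMESPACES` and `expand_predicate` are hypothetical names used for illustration and are not part of Hyrax or ActiveFedora.

```ruby
# Hypothetical helper, not part of Hyrax: expands a prefixed name such as
# "dct:title" into a full IRI. The prefix table is a subset copied from the
# namespaces in the profile header.
NAMESPACES = {
  "dce"  => "http://purl.org/dc/elements/1.1/",
  "dct"  => "http://purl.org/dc/terms/",
  "edm"  => "http://www.europeana.eu/schemas/edm/",
  "foaf" => "http://xmlns.com/foaf/0.1/",
  "rdfs" => "http://www.w3.org/2000/01/rdf-schema#"
}.freeze

def expand_predicate(prefixed_name, namespaces = NAMESPACES)
  # Split on the first colon only, so local names may contain colons.
  prefix, local = prefixed_name.split(":", 2)
  raise ArgumentError, "not a prefixed name: #{prefixed_name}" if local.nil?

  # Fail loudly when a prefix is used without being declared.
  base = namespaces.fetch(prefix) { raise KeyError, "undeclared prefix: #{prefix}" }
  "#{base}#{local}"
end

puts expand_predicate("dct:title")
# => http://purl.org/dc/terms/title
```

Raising on an undeclared prefix, rather than returning the name unchanged, is a design choice in this sketch: it makes a predicate that uses a prefix missing from the namespaces list easy to spot.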

pcdm:Collection

Descriptive

| Field | Predicate | Recommendation | Expected Value (Data Type) | Expected Value (Controlled Source) | Obligation | ActiveFedora Model | Solr Mapping(s) | Related Docs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| title | dct:title | MUST | xsd:string | n/a | {1} | core_metadata.rb | | |

Provenance

| Field | Predicate | Recommendation | Expected Value (Data Type) | Expected Value (Controlled Source) | Obligation | ActiveFedora Model | Solr Mapping(s) | Related Docs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| date modified | dct:modified | MUST | xsd:dateTime | n/a | {1} | core_metadata.rb | | |
| date uploaded | dct:dateSubmitted | MUST | xsd:dateTime | n/a | {1} | core_metadata.rb | | |
| depositor | mrel:dpt | MUST | xsd:string ?? | n/a | {1} | | | |
| import url | sufia:importUrl | MUST | xsd:string | n/a | {1} | | | |
| label | fcr:downloadFilename | MUST | xsd:string | n/a | {1} | | | |
| relative path | sufia:relativePath | MUST | xsd:string | n/a | {1} | | | |

works:Work < pcdm:Object

Descriptive

| Field | Predicate | Recommendation | Expected Value (Data Type) | Expected Value (Controlled Source) | Obligation | ActiveFedora Model | Solr Mapping(s) | Related Docs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| title | dct:title | MUST | xsd:string | n/a | {1} | | | |
| creator | dce:creator | SHOULD | xsd:string | n/a | {0,n} | | | |
| rights statement | edm:rights | SHOULD | dct:RightsStatement (xsd:string?) | n/a | {0,n} | | | |
| date created | dct:created | SHOULD | xsd:date or xsd:dateTime | n/a | {0,n} | | | |
| based near | foaf:basedNear | MAY | xsd:string | n/a | {0,n} | | | |
| citation | dct:bibliographicCitation | MAY | xsd:string | n/a | {0,n} | | | |
| contributor | dce:contributor | MAY | xsd:string | n/a | {0,n} | | | |
| description | dce:description | MAY | xsd:string | n/a | {0,n} | | | |
| identifier | dct:identifier | MAY | xsd:string | n/a | {0,n} | | | |
| keyword | dce:relation | MAY | xsd:string | n/a | {0,n} | | | |
| language | dce:language | MAY | xsd:string | n/a | {0,n} | | | |
| publisher | dce:publisher | MAY | xsd:string | n/a | {0,n} | | | |
| related url | rdfs:seeAlso | MAY | xsd:string or xsd:anyURI | n/a | {0,n} | | | |
| rights | dct:rights | MAY | dct:RightsStatement (xsd:string?) | n/a | {0,n} | | | |
| source | dct:source | MAY | xsd:string | n/a | {0,n} | | | |
| subject | dce:subject | MAY | xsd:string | n/a | {0,n} | | | |

Provenance

| Field | Predicate | Recommendation | Expected Value (Data Type) | Expected Value (Controlled Source) | Obligation | ActiveFedora Model | Solr Mapping(s) | Related Docs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| date modified | dct:modified | MUST | xsd:dateTime | n/a | {1} | core_metadata.rb | | |
| date uploaded | dct:dateSubmitted | MUST | xsd:dateTime | n/a | {1} | core_metadata.rb | | |
| depositor | mrel:dpt | MUST | xsd:string ?? | n/a | {1} | | | |
| import url | sufia:importUrl | MUST | xsd:string | n/a | {1} | | | |
| label | fcr:downloadFilename | MUST | xsd:string | n/a | {1} | | | |
| relative path | sufia:relativePath | MUST | xsd:string | n/a | {1} | | | |

works:FileSet < pcdm:Object

| Field | Predicate | Recommendation | Expected Value (Data Type) | Expected Value (Controlled Source) | Obligation | ActiveFedora Model | Solr Mapping(s) | Related Docs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| title | dct:title | MUST | xsd:string | n/a | {1} | core_metadata.rb | | |
| creator | dce:creator | SHOULD | xsd:string | n/a | {0,n} | | | |
| rights statement | edm:rights | SHOULD | dct:RightsStatement (xsd:string?) | n/a | {0,n} | | | |
| date created | dct:created | SHOULD | xsd:date or xsd:dateTime | n/a | {0,n} | core_metadata.rb | | |
| based near | foaf:basedNear | MAY | xsd:string | n/a | {0,n} | | | |
| citation | dct:bibliographicCitation | MAY | xsd:string | n/a | {0,n} | | | |
| contributor | dce:contributor | MAY | xsd:string | n/a | {0,n} | | | |
| description | dce:description | MAY | xsd:string | n/a | {0,n} | | | |
| identifier | dct:identifier | MAY | xsd:string | n/a | {0,n} | | | |
| keyword | dce:relation | MAY | xsd:string | n/a | {0,n} | | | |
| language | dce:language | MAY | xsd:string | n/a | {0,n} | | | |
| publisher | dce:publisher | MAY | xsd:string | n/a | {0,n} | | | |
| related url | rdfs:seeAlso | MAY | xsd:string or xsd:anyURI | n/a | {0,n} | | | |
| rights | dct:rights | MAY | dct:RightsStatement (xsd:string?) | n/a | {0,n} | | | |
| source | dct:source | MAY | xsd:string | n/a | {0,n} | | | |
| subject | dce:subject | MAY | xsd:string | n/a | {0,n} | | | |

pcdm:File

| Field | Predicate | Recommendation | Expected Value (Data Type) | Expected Value (Controlled Source) | Obligation | ActiveFedora Model | Solr Mapping(s) | Related Docs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| label | rdfs:label | MUST | xsd:string | n/a | {1} | | | |
| file name | ebu:filename | MUST | xsd:string | n/a | {1} | | | |
| file size | ebu:fileSize | MUST | xsd:string ?? | n/a | {1} | | | |
| date created | ebu:dateCreated | MUST | xsd:dateTime ?? | n/a | {1} | | | |
| date modified | ebu:dateModified | MUST | xsd:dateTime ?? | n/a | {1} | | | |
| byte order | sweet:byteOrder | MUST | ?? | n/a | {1} | | | |
| file hash | premis:hasMessageDigest | MUST | xsd:string ?? | n/a | {1} | | | |
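
The Obligation column above uses a compact cardinality notation. The sketch below shows one way to read it, under the assumption (not stated in this profile) that {1} means exactly one value and {0,n} means zero or more values. `obligation_range` and `satisfies_obligation?` are hypothetical helper names, not Hyrax methods.

```ruby
# Hypothetical helpers, not part of Hyrax. Assumes "{1}" means exactly one
# value and "{0,n}" means zero or more values.
def obligation_range(notation)
  match = /\A\{(\d+)(?:,(\d+|n))?\}\z/.match(notation)
  raise ArgumentError, "unrecognized obligation: #{notation}" unless match

  min = match[1].to_i
  max = case match[2]
        when nil then min               # "{1}": upper bound equals lower bound
        when "n" then Float::INFINITY   # "{0,n}": no upper bound
        else match[2].to_i              # "{0,3}": explicit upper bound
        end
  (min..max)
end

# True when the number of supplied values falls within the obligation.
def satisfies_obligation?(values, notation)
  obligation_range(notation).cover?(Array(values).length)
end

puts satisfies_obligation?(["A title"], "{1}")   # => true
puts satisfies_obligation?([], "{1}")            # => false
puts satisfies_obligation?([], "{0,n}")          # => true
```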

Usage

[Fill in expectations, behaviors]

To Do:

  • Change description from DC Elements dce:description to DC Terms dct:description.
  • Use a better term for keyword than dce:relation.
  • Resolve rights. Should this be a dct:RightsStatement entity?
  • Clarify practice around edm and dct rights terms.
  • For based near, the range of foaf:based_near is foaf:SpatialThing. Fix this!
  • Consider fixing related url to use xsd:anyURI.
  • Refactor curation_concerns to use ActiveTriples::Schema for cleaner overrides.
    • Organize terms used for works/filesets; pare down works:FileSet.
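
For the related url item above, one way to tell URI values apart from free-text strings is sketched here using Ruby's standard `uri` library. `absolute_uri?` is a hypothetical name, and requiring an absolute URI is an assumption of this sketch that is stricter than xsd:anyURI, which also permits relative references.

```ruby
require "uri"

# Hypothetical check, not part of Hyrax: true when the value parses as a URI
# with a scheme (for example "http"). This is stricter than xsd:anyURI.
def absolute_uri?(value)
  URI.parse(value).absolute?
rescue URI::InvalidURIError
  # Values such as free text with spaces do not parse as URIs at all.
  false
end

puts absolute_uri?("http://example.org/related")  # => true
puts absolute_uri?("see my other work")           # => false
```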