## spencer-ica-ric-cm-comments.md

      
Comments on Records in Contexts: A Conceptual Model for Archival Description

Consultation Draft v0.1 September 2016

Date: 30 January 2017.

By:   Ross Spencer < all.along.the.watchtower2001 [at] gmail.com >

Background:

I'm a digital preservation expert working at Archives New Zealand. This
short response to the consultation draft is submitted independently of my
organisation.

I have worked previously at The National Archives, UK. I have a keen
interest in supporting archivists and end-users to make full use of the
collections that we are custodians of.

General Comments:

I believe I join the majority of the community in expressing my
gratitude for the opportunity to comment on the consultation draft.

I am in favour of the approach taken by ICA to combine multiple standards
into a single new standard (p1). I think the work done on this draft is
phenomenal and, despite the very specific comments below about features of
the modelling work thus far, this is a standard that I think paves the way
to a bright future for archival description and discovery.

Whatever way the standard evolves, it is one I hope to be using in the near
future.

Specific Comments:


I am in favour of an approach that embraces the techniques of linked
open data (LoD) (p2).


A clearer delineation/description of the differences between RiC-CM
and RiC-O would be beneficial and may help resolve other concerns
noted below, e.g. expansion of controlled-vocabulary terms (p1).


Comments establishing the standard within the LoD/semantic web
ecosystem would be appreciated. The comments should survey the
semantic web landscape and discuss complementary standards that are
recommended by ICA to be used alongside any future RiC model.


The standard is very wide ranging. In its early stages I would
consider it to be too broad. I ask the ICA to consider a more
restricted, more concise version of this standard. Compare other LoD
standards such as Dublin Core (15 elements) and SKOS (32) with
RiC-CM (~800?).


A restricted set could focus on features of the vocabulary that are
absolutely necessary, and can be used by the widest possible audience
to support management and discovery of archives.


A restricted set could be monitored in use, with later iterations built
upon it.


It is noted that the 'relations' described in the paper are
suggestive. When they are 'rounded out' it should also be noted that,
using LoD techniques, 'relations' become 'resources' in their own right
(the predicates in the subject, predicate, object triumvirate). That
makes them something the user will look up for more information. As
such, the data they contain should be as complete as that of the
entities themselves.
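
The point about relations becoming resources can be sketched with plain Python tuples standing in for triples. This is only an illustration under stated assumptions: the `https://example.org/ric/` namespace, the relation name `isAssociatedWith` and both helper functions are hypothetical and are not taken from the draft.

```python
# Hypothetical namespace; nothing here is defined by RiC-CM v0.1.
EX = "https://example.org/ric/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
RDFS_COMMENT = "http://www.w3.org/2000/01/rdf-schema#comment"

triples = [
    # A statement about two entities, using the relation as the predicate.
    (EX + "record/1", EX + "isAssociatedWith", EX + "agent/7"),
    # Statements about the relation itself: here it is the subject.
    (EX + "isAssociatedWith", RDFS_LABEL, "is associated with"),
    (EX + "isAssociatedWith", RDFS_COMMENT, "Hypothetical definition text."),
]


def describe(resource, graph):
    """Return every (predicate, object) pair recorded for a resource."""
    return [(p, o) for s, p, o in graph if s == resource]


def undescribed_predicates(graph):
    """List predicates that are used but never described as subjects."""
    subjects = {s for s, _, _ in graph}
    return sorted({p for _, p, _ in graph if p not in subjects})
```

Calling `describe(EX + "isAssociatedWith", triples)` returns what a user would find on looking up the relation, and `undescribed_predicates` gives a rough measure of how much of a vocabulary still lacks a description.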


Because of this, additional consideration should be given to the
point I raise above, where I call for an initial, more concise
vocabulary to be considered. A large vocabulary carries a maintenance
overhead, and the lack of description in v0.1 may be an indication
that this is already a reality.


A vocabulary that is too broad could dilute discovery: when a wide
variety of terms is spread across a large number of records,
techniques such as faceted search return smaller result sets. (I posit
that a smaller number of properties across a larger number of records
creates larger result sets.)
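
The posit above can be illustrated with a toy calculation. The record counts and terms below are invented for the example and say nothing about real catalogues; `mean_facet_size` is a hypothetical helper.

```python
from collections import Counter


def mean_facet_size(term_per_record):
    """Average number of records returned per facet term."""
    counts = Counter(term_per_record)
    return sum(counts.values()) / len(counts)


# Twelve hypothetical records described with a two-term vocabulary...
narrow = ["letter", "report"] * 6
# ...and the same twelve records described with six overlapping terms.
broad = ["letter", "missive", "note", "report", "paper", "memo"] * 2

print(mean_facet_size(narrow))  # 6.0 records per facet
print(mean_facet_size(broad))   # 2.0 records per facet
```

With the same twelve records, tripling the number of terms cuts the average result set per facet to a third.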


I appreciate the inclusion of an authenticity and integrity note
(RiC-P5). I would like to see this expanded further for digital records
with a separate field, or set of fields, that has a specific data type
of 'checksum', i.e. a field that can be validated as containing a
checksum and nothing else.


A checksum is a mechanism by which a digital file in 'a' digital
repository can be reliably paired with a catalogue entry. An explicit
mechanism for attaching a checksum, or set of checksums, to the
catalogue promotes automation of processes between the two.
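
As a minimal sketch of what a validated 'checksum' data type could enable, the following uses Python's standard `hashlib`. The catalogue entry layout (a dict with `algorithm` and `checksum` keys) and both function names are assumptions made for illustration; the draft defines no such field.

```python
import hashlib
import re

# Expected hex-digest lengths for a few common algorithms.
HEX_LENGTHS = {"md5": 32, "sha1": 40, "sha256": 64, "sha512": 128}


def is_checksum(value, algorithm):
    """True if value looks like a hex digest of the named algorithm."""
    length = HEX_LENGTHS.get(algorithm)
    if length is None:
        return False
    return re.fullmatch(r"[0-9a-fA-F]{%d}" % length, value) is not None


def matches_catalogue(data, catalogue_entry):
    """Compare the digest of some bytes with the catalogued checksum."""
    algorithm = catalogue_entry["algorithm"]
    recorded = catalogue_entry["checksum"]
    if not is_checksum(recorded, algorithm):
        raise ValueError("catalogue value is not a valid checksum")
    return hashlib.new(algorithm, data).hexdigest() == recorded.lower()
```

A free-text integrity note cannot be checked this way. A typed field lets the catalogue reject malformed values on entry and lets a repository confirm that a given file is the one described.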


On that note, I would like to see the LoD concepts committed to more
fully. Where there are facts to be recorded - a checksum being a fact
about a digital record's current state - more rigid properties can be
created and used.


RiC-P39 (Contact Information) is an example of a property that can be
expanded into 'facts': email address, postal address, phone number.
These are all properties that can be validated in some way, and that
users might want to search on. Conditions of use, where licenses could
be searched on, may be another useful example.
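
One way to picture contact information as separate facts is shown below. The class, its field names and the deliberately loose patterns are hypothetical; real validation rules would need to be agreed by the standard or by implementers.

```python
import re
from dataclasses import dataclass

# Loose format checks, for illustration only.
EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
PHONE = re.compile(r"\+?[0-9][0-9 ()-]{5,}")


@dataclass
class ContactInformation:
    """Hypothetical split of a free-text contact note into separate facts."""

    email: str = ""
    phone: str = ""
    postal_address: str = ""

    def invalid_fields(self):
        """Names of populated fields that fail the loose format checks."""
        problems = []
        if self.email and not EMAIL.fullmatch(self.email):
            problems.append("email")
        if self.phone and not PHONE.fullmatch(self.phone):
            problems.append("phone")
        return problems
```

Once the facts are separate fields, each can be validated on entry and indexed for search independently.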


RiC-P6 (Content Type) will be set via a controlled list. Controlled
vocabularies such as this one, where not otherwise specified (e.g. as
in MIME), should be described fully by this standard as resources that
can also be looked up and de-referenced to provide more information.


To promote interoperability, data types should be specified more fully
e.g. preferred/expected number, text, or date formats, plus strategies
to resolve areas of ambiguity, such as for dates, where precision may
not always be possible.
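
One possible strategy for imprecise dates, sketched here as an assumption and not something the draft proposes, is to accept reduced-precision forms (`YYYY`, `YYYY-MM`, `YYYY-MM-DD`) and expand each into an explicit earliest/latest range. The function name `date_range` is hypothetical.

```python
import calendar
from datetime import date


def date_range(value):
    """Expand 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' into (earliest, latest)."""
    parts = [int(p) for p in value.split("-")]
    if len(parts) == 1:
        year = parts[0]
        return date(year, 1, 1), date(year, 12, 31)
    if len(parts) == 2:
        year, month = parts
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day)
    if len(parts) == 3:
        exact = date(*parts)
        return exact, exact
    raise ValueError("unsupported date form: %r" % value)
```

A record dated only "1916" then sorts and filters alongside one dated "1916-04-24", because both resolve to comparable ranges while the original precision is still visible in the stored string.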


RiC-P10 (Encoding Format) is a good example of a field explicitly made
available for digital. Properties such as rdfs:domain may become
important as an ontology develops out of this work. What is the ICA's
chosen approach to managing the lines between paper and digital, where
a property may make sense for one record type but not the other?
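
The question matters because, in RDFS, `rdfs:domain` drives inference and does not reject data: any subject that uses a property is inferred to be an instance of that property's domain. The sketch below applies that one rule to plain tuples. The `ex:` names (`ex:encodingFormat`, `ex:DigitalRecord`, `ex:PaperRecord`) are hypothetical and do not come from the draft.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"

schema = [
    # Hypothetical: encoding format is declared for digital records.
    ("ex:encodingFormat", RDFS_DOMAIN, "ex:DigitalRecord"),
]

data = [
    ("ex:record/1", "ex:encodingFormat", "application/pdf"),
    ("ex:record/2", RDF_TYPE, "ex:PaperRecord"),
    ("ex:record/2", "ex:encodingFormat", "n/a"),
]


def infer_types(schema, data):
    """Apply the RDFS domain rule: a subject of P is typed with P's domain."""
    domains = {}
    for prop, pred, cls in schema:
        if pred == RDFS_DOMAIN:
            domains.setdefault(prop, set()).add(cls)
    inferred = set()
    for subject, predicate, _ in data:
        for cls in domains.get(predicate, ()):
            inferred.add((subject, RDF_TYPE, cls))
    return inferred
```

Here `ex:record/2`, catalogued as a paper record, is also inferred to be a digital record simply because the property was filled in. Whether that outcome is acceptable, or whether such properties should be constrained some other way, is the kind of decision the question above asks the ICA to make.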


Conclusion:

The consultation draft makes adequate attempts to caveat its work in places,
including:

"It is essential that developers of records management and record
description and access systems are part of RiC’s audience. RiC is detailed
and complex, and therefore successful implementation and use will require
the development of methods that will ameliorate the intellectual,
technological, and economic challenge of data creation and maintenance."

It is possible therefore that some of my suggestions above have been
considered to be out of scope currently, or simply not relevant to the
future goals of this standard.

It is also likely that the realistic questions raised will be answered
during the proving stages of Records in Contexts, where I hope to be an
active participant in working with the new standard.

Thank you once again for the opportunity to contribute thus far.