Version: 2020-05-27
This cheatsheet: https://gist.github.com/00e47cf4771dff8566a44529a77aae48.git
Main goal: Simple Tabular Model for Application Profiles (AP-STM)
For human consumption
- For display in tabular format ("as is")
- For conversion into HTML or PDF
For machine processing
- For generating validation schemas in XML Schema, SHACL, or ShEx
Next call
- 2020-04-22 Wed 1600 Berlin time
Key links
- https://github.com/dcmi/dcap/tree/master/prototypes/simple - main deliverables
- http://dublincore.org/groups/application_profiles_ig/ - DCAP-IG home page
- https://github.com/dcmi/dcap - DCAP-IG GitHub repo
- https://github.com/dcmi/dcap/wiki - DCAP-IG GitHub wiki
- https://lists.dublincore.org/pipermail/application-profiles-ig - mailing list
- https://github.com/dcmi/dcap/issues - issue tracker
- https://github.com/dcmi/dcap/blob/master/prototypes/ - prototypes and variants
In scope for DCAP-IG (potential deliverables)
- "Data, instances of"
- Example data! - data in datasets, aka "instance data"
- The need to understand or validate instance data is, of course, the whole point.
- "Simple Tabular Model for Application Profiles (AP-STM), instances of"
- Example profiles! CSV files, based on AP-STM, filled in with specific constraints.
- Templates for validating instance data about books, paintings, languages...
- "Simple Tabular Model for Application Profiles (AP-STM), specification of"
- In effect, a profile for AP-STM instances:
- Uses AP-STM vocabulary
- Adds applicable constraints (eg, cardinality)
- "Formalisms"
- ShEx expression of formalism
- "Model and vocabulary specification for Dublin Core Application Profiles (DCAP)"
- Defines elements of CSV = spreadsheet columns = terminology = vocabulary
- Entity_Shape_ID / Entity_Shape_Label
- Property_ID / Property_Label
- Value_Type / Value_Constraint
- Cardinality / Annotation / Namespace_Prefix / Namespace_URI
- Glossary with definitions of general concepts: Entity Shape, Property, Value, Namespace.
- Definitions - EDIT HERE
- 2020-04-22: Karen will create GitHub issues - watch this space
- Compared with BIBFRAME definitions
- Compared with W3C DXWG definitions
- "AP-STM Value Types, vocabulary specification of"
- Starter set of core value types
- Examples: Literal, Non-literal, Entity Shape Reference, IRI, IRI Stem, Pick List
- Scripts in Python, etc, for converting AP-STM instances into validation schemas (ShEx, etc).
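The conversion scripts mentioned above could look roughly like the following Python sketch, which turns the statements of one entity shape into ShEx compact syntax (ShExC). This is an illustration only, not a DCAP-IG deliverable: the `shape_to_shexc` helper, the value-type mapping, the cardinality notation (`1`, `0..1`, `0..n`, `1..n`), and the treatment of an empty cardinality as "zero or more" are all assumptions. PREFIX declarations and Entity Shape References are left out.

```python
# Assumed mapping from AP-STM value types to ShExC node kinds.
VALUE_TYPE_TO_SHEXC = {
    "Literal": "LITERAL",
    "Non-literal": "NONLITERAL",
    "IRI": "IRI",
}

# Assumed cardinality notation in the CSV cell -> ShExC cardinality suffix.
CARDINALITY_TO_SHEXC = {
    "1": "",       # exactly one is the ShExC default
    "0..1": " ?",
    "0..n": " *",
    "1..n": " +",
}


def shape_to_shexc(shape_id, statements):
    """Render one entity shape as a ShExC shape (hypothetical helper)."""
    lines = []
    for st in statements:
        # Unknown or missing value type: "." matches any value in ShExC.
        node = VALUE_TYPE_TO_SHEXC.get(st.get("Value_Type", ""), ".")
        # Assumption: a missing cardinality means "zero or more".
        card = CARDINALITY_TO_SHEXC.get(st.get("Cardinality", ""), " *")
        lines.append(f"  {st['Property_ID']} {node}{card}")
    body = " ;\n".join(lines)
    return f"<{shape_id}> {{\n{body}\n}}"


# Made-up statements for a "book" shape.
statements = [
    {"Property_ID": "dct:title", "Value_Type": "Literal", "Cardinality": "1"},
    {"Property_ID": "dct:creator", "Value_Type": "IRI", "Cardinality": "0..n"},
]
print(shape_to_shexc("book", statements))
```

Keeping the two mappings in plain dictionaries makes the assumptions easy to see and to change once the value types and cardinality notation are settled.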
BEYOND the scope of DCAP-IG
- "Application Profiles, instances of" (other than simple tabular APs)
- Examples: More complex profiles based on DCAT, BIBFRAME, or RDA.
- AP Instances lie mid-way between data and AP Models:
- Instance Data is matched against AP Instances (eg, for validation).
- AP Instances are based on AP Models.
- "Application Profiles, generic model and vocabulary for"
- DCAP-IG is not trying to invent a comprehensive, generic model for APs.
- Examples of more comprehensive models:
Meetings and calls (reverse-chronological)
- 2020-04-08 https://hackmd.io/9U145qXMQUCnlUyjqO1YFg
- 2020-04-08 https://github.com/dcmi/dcap/blob/master/meetings/2020/2020-04-08.dcap_zoom_call.md
- 2020-03-18 https://github.com/dcmi/dcap/blob/master/meetings/2020/2020-03-18.dcap_zoom_call.md
- 2020-03-03 https://github.com/dcmi/dcap/blob/master/meetings/2020/2020-03-03.dcap_zoom_call.md
- 2020-02-18 https://github.com/dcmi/dcap/blob/master/meetings/2020/2020-02-18.dcap_zoom_call.md
- 2019-09-25 https://github.com/dcmi/dcap/blob/master/meetings/2019/hackDay_DCMI2019.md
- 2019-09-25 https://github.com/dcmi/dcap/blob/master/meetings/2019/2019-09-23_AP_Discussion_DCMI2019.md
- 2019-09-02 https://github.com/dcmi/dcap/blob/master/meetings/2019/september-2-2019.md
- 2019-09-02 https://github.com/dcmi/dcap/blob/master/meetings/2019/2019-09-02.dcap_hackathon.md
- 2019-03-15 https://github.com/dcmi/dcap/blob/master/meetings/2019/2019-03-15.informal_zoom_call.md
Where to declare namespace prefixes
- Nishad: "Have prefixes for each table in a standard format in the same folder as the table"
- Nishad: "URIs can be either QNames with prefixes from prefix.cc or RFC3986 IRIs"
- Apr 2020: https://gist.github.com/nishad/1339d3962002eea3f9282e4ef4b2b09c (Nishad)
- Apr 2020: https://github.com/kcoyle/RDF-AP/blob/master/test.csv (Karen) - two extra columns
- Apr 2020: https://github.com/kcoyle/RDF-AP/blob/master/test2.csv (Karen) - two extra columns
- Apr 2020: https://gist.github.com/tombaker/34fd23f3866cc30e6c9823360fec7764 (Tom)
- Namespace prefix lists
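Wherever the prefixes end up being declared, a processing script needs to expand prefixed names into full IRIs. A minimal Python sketch of that step follows, assuming the prefix table has already been loaded into a dictionary. The `expand` helper and the sample table are hypothetical, and the test for "already a full IRI" is deliberately crude.

```python
# Hypothetical prefix table, e.g. read from a prefix file stored next to
# the profile, or from Namespace_Prefix / Namespace_URI columns.
PREFIXES = {
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand(term, prefixes):
    """Expand a prefixed name to a full IRI; pass anything else through."""
    if "://" in term:                 # crude check: already a full IRI
        return term
    prefix, sep, local = term.partition(":")
    if sep and prefix in prefixes:    # known prefix, so expand it
        return prefixes[prefix] + local
    return term                       # plain label or unknown prefix


print(expand("dct:title", PREFIXES))
print(expand("http://example.org/vocab/pages", PREFIXES))
```

Passing unknown terms through unchanged fits the decision that properties do not have to be URIs.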
Older prototypes
- 2019-11-29 https://github.com/dcmi/dcap/tree/master/prototypes/templateYAML - with statement IDs (Karen)
- 2019-11-28 https://github.com/dcmi/dcap/tree/master/prototypes/bookCase1 (Karen)
- 2019-11-26 https://github.com/dcmi/dcap/tree/master/prototypes/wikidata_painting (Tom) - SWIB19 example
- 2019-09-25 https://github.com/dcmi/dcap/tree/master/prototypes/simpleFromHackathon - DC2019 Hackathon
- 2019-05-20 https://nishad.github.io/yama/spec/latest/ - Nishad's YAMA: Yet Another Metadata Application Profile
ShEx Lite - aka ShExJ-Lite, subset of ShExJ that should work in any ShEx implementation
- https://dcmi.github.io/dcap/shex_lite/micro-spec.html (Aug 2019, Mar 2020)
- https://github.com/dcmi/dcap/blob/master/shex_lite/README.md
- See also: https://shexspec.github.io/ns/shex.html - ShEx vocabulary
- See also: https://github.com/weso/shex-lite - Scala implementation of ShEx Lite
ShExStatements (John Samuel)
- Format for creating entity schemas on Wikidata; first version released 2020-03-23.
- https://github.com/johnsamuelwrites/ShExStatements/blob/master/docs.md
- https://github.com/johnsamuelwrites/ShExStatements/tree/master/examples
- https://github.com/johnsamuelwrites/ShExStatements/tree/master/examples/wikidata
Other related work
- https://github.com/dcmi/dcap/wiki/Related-Projects (occasionally updated)
- https://tools.ietf.org/html/rfc4180 - Common Format and MIME Type for Comma-Separated Values (CSV) Files (2005)
Early requirements and discussion
- 2019-09-03 https://github.com/dcmi/dcap/blob/master/simpleSchema.csv - rather full model, based on DSP
- 2019-08-21 https://github.com/dcmi/dcap/blob/master/schemaList.csv - rather full model, based on DSP
- 2019-04-18 https://github.com/dcmi/dcap/blob/master/requirements.md - Requirements and motivation
- 2019-04-12 https://github.com/dcmi/dcap/blob/master/patterns.md - Patterns for Application Profiles and Constraints
- 2017-07-29 https://github.com/kcoyle/RDF-AP/blob/master/Patterns.md - Patterns
- 2008-03-31 http://www.dublincore.org/specifications/dublin-core/dc-dsp/ - Description Set Profile Constraint Language
- 2008-01-14 http://dublincore.org/documents/singapore-framework/ - Singapore Framework for DCAPs
- 2000-09-24 http://www.ariadne.ac.uk/issue25/app-profiles - Heery and Patel - starting point
Some favorite definitions (so far)
AP concepts
- Entity Shape - a class of things describable by given properties
- Archetype? - but: [information science baggage](https://en.wikipedia.org/wiki/Archetype_(information_science))
- Just Entity? Just Shape? Entity_Type? Entity_Class? Timmy?
- Property - an attribute used to describe an entity
- Value - the specific content of a property
AP-STM elements
- Entity_Shape_ID - the handle for the class of things being described
- Entity_Shape_Label - human-readable text representing the class of things being described
- Property_ID - identifier of a property used to describe the resource
- Property_Label - a human-readable text representing the property
- Value_Type - data type of the value in the instance data for the related property
- Value_Constraint - a further constraint on the value. Examples: pick list, URI stem.
- Cardinality - number of property-value pairs allowed for a given property when describing an entity
- Annotation - free-form comments about the statement
- Namespace_Prefix
- Namespace_URI
Note: If the vocabulary were small enough, it could be "translated" into library- and computer-science terminology.
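To make the element list above concrete, here is a small Python sketch that reads an AP-STM-style CSV and groups its rows by entity shape. Only the column names come from this cheatsheet. The sample rows, the `dct:` and `foaf:` property IDs, and the `read_profile` helper are made up for illustration.

```python
import csv
import io

# Hypothetical sample profile; only the column names come from the cheatsheet.
SAMPLE_CSV = """\
Entity_Shape_ID,Entity_Shape_Label,Property_ID,Property_Label,Value_Type,Value_Constraint,Cardinality,Annotation
book,Book,dct:title,Title,Literal,,1,Title of the book
book,Book,dct:creator,Creator,IRI,,,Link to an author
author,Author,foaf:name,Name,Literal,,1,
"""


def read_profile(text):
    """Group statement rows by Entity_Shape_ID, preserving row order."""
    shapes = {}
    for row in csv.DictReader(io.StringIO(text)):
        shapes.setdefault(row["Entity_Shape_ID"], []).append(row)
    return shapes


profile = read_profile(SAMPLE_CSV)
for shape_id, rows in profile.items():
    print(shape_id, [row["Property_ID"] for row in rows])
```

Each row is one statement (a property plus its constraints), and empty cells simply come back as empty strings.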
Various DCAP-IG resolutions and design decisions
- Aim at "minimalist" profile (minimize the number of columns in spreadsheet).
- Most minimal AP could consist of just a list of properties.
- Properties do not have to be URIs; "label" not required.
- Text fields needed for generating input forms or human-readable documentation.
- Property Label
- Entity Label
- Annotation
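As a sketch of how the text fields could feed human-readable documentation, the Python snippet below renders a list of statements as a Markdown table. The `profile_to_markdown` helper and the sample statements are hypothetical. Falling back to the property ID when no label is given reflects the decision that labels are not required.

```python
def profile_to_markdown(statements):
    """Render statements as a Markdown table (hypothetical helper)."""
    header = "| Property | Label | Note |\n| --- | --- | --- |"
    rows = []
    for st in statements:
        prop = st["Property_ID"]
        label = st.get("Property_Label") or prop   # label is optional
        note = st.get("Annotation", "")
        rows.append(f"| {prop} | {label} | {note} |")
    return "\n".join([header] + rows)


# A very minimal profile: the second statement is just a property.
statements = [
    {"Property_ID": "dct:title", "Property_Label": "Title",
     "Annotation": "Name of the work"},
    {"Property_ID": "dct:creator"},
]
print(profile_to_markdown(statements))
```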
Requirements for application profiles
For human consumption
- Expressible in TXT, Markdown, HTML, PDF, MS Word, Google Docs...
- Serves
- to document community consensus
- to document the structure of a specific dataset
For machine processing
- Expressible in an actionable form (XML Schema, SHACL, ShEx)
- Provides a template or schema for
- creating instance data
- consuming instance data
- displaying data (eg, Web forms)
- validating instance data.
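For the validation use case, a real deployment would generate a ShEx, SHACL, or XML Schema and hand it to an existing validator. Purely to illustrate the idea, the Python sketch below checks one record against the cardinality of each statement. The `min..max` notation (with `n` for unbounded), the reading of an empty cell as "unconstrained", and both helper functions are assumptions.

```python
def parse_cardinality(cell):
    """Parse an assumed 'min..max' cell ('n' = unbounded) into (min, max)."""
    if not cell:
        return 0, None                 # assumption: empty means unconstrained
    low, sep, high = cell.partition("..")
    if not sep:
        return int(low), int(low)      # a single number means exactly that many
    return int(low), (None if high == "n" else int(high))


def check_record(record, statements):
    """Return a list of human-readable cardinality violations."""
    problems = []
    for st in statements:
        prop = st["Property_ID"]
        count = len(record.get(prop, []))
        low, high = parse_cardinality(st.get("Cardinality", ""))
        if count < low or (high is not None and count > high):
            problems.append(f"{prop}: found {count}, expected {st['Cardinality']}")
    return problems


# Made-up profile statements and instance record (property -> list of values).
statements = [
    {"Property_ID": "dct:title", "Cardinality": "1"},
    {"Property_ID": "dct:creator", "Cardinality": "0..n"},
]
record = {"dct:creator": ["http://example.org/a1", "http://example.org/a2"]}
print(check_record(record, statements))
```

The record here is missing its required title, so the check reports a single problem for `dct:title`.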
Write-up...?
- 2019-11-28: https://hackmd.io/ItJ6XFC9RHy9wKjwUj0aYQ