davaya/SpdxElementIds.md

SPDX Element IDs

RFC 3986 defines the syntax of URIs.
When used with linked data, only the scheme and hier-part are significant; the query and fragment are ignored.

    URI         = scheme ":" hier-part [ "?" query ] [ "#" fragment ]  
    unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"  
    reserved    = gen-delims / sub-delims  
    gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"  
    sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"  
                      / "*" / "+" / "," / ";" / "="  
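
As a rough illustration of how these components separate, the sketch below applies the regular expression given in RFC 3986 Appendix B. The helper name `split_uri` and the example URI are illustrative assumptions, not part of any specification.

```python
import re

# Regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri: str) -> dict:
    """Break a URI into its five generic components (hypothetical helper)."""
    m = URI_RE.match(uri)  # every group is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


parts = split_uri("http://example.org/book/book1?lang=en#title")
# hier-part is "//" followed by the authority and the path.
hier_part = f"//{parts['authority']}{parts['path']}"
print(parts["scheme"], hier_part, parts["query"], parts["fragment"])
```

A component that is absent from the URI comes back as `None`, which distinguishes "no query" from an empty query.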

SPARQL defines two forms of IRIs: IRI references and prefixed names.
The mapping between the two is:
A prefixed name is a prefix label and a local part, separated by a colon ":".
A prefixed name is mapped to an IRI by concatenating the IRI associated with the prefix
and the local part. The prefix label or the local part may be empty.

    [136] iri           ::= IRIREF | PrefixedName
    [137] PrefixedName  ::= PNAME_LN | PNAME_NS
    [141] PNAME_LN      ::= PNAME_NS PN_LOCAL

The following fragments are some of the different ways to write the same IRI:

    <http://example.org/book/book1>

    BASE <http://example.org/book/>
    <book1>

    PREFIX book: <http://example.org/book/>
    book:book1
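
The mapping can be sketched in a few lines of Python. This is a simplified illustration, not a SPARQL parser: the `expand` function and the `BASE` and `PREFIXES` constants are hypothetical names, and escape sequences in local parts are not handled.

```python
from urllib.parse import urljoin

BASE = "http://example.org/book/"
PREFIXES = {"book": "http://example.org/book/"}


def expand(term: str, base: str = BASE, prefixes: dict = PREFIXES) -> str:
    """Map an IRI reference or a prefixed name to an absolute IRI."""
    if term.startswith("<") and term.endswith(">"):
        # IRI reference: resolve against BASE (absolute IRIs pass through).
        return urljoin(base, term[1:-1])
    # Prefixed name: concatenate the prefix IRI and the local part.
    label, _, local = term.partition(":")
    return prefixes[label] + local


forms = ["<http://example.org/book/book1>", "<book1>", "book:book1"]
iris = {expand(form) for form in forms}
print(iris)
```

All three written forms collapse to a single IRI in the resulting set.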

SPARQL's definition of a well-defined mapping between IRI and PrefixedName is the key to
understanding how to model and serialize IRIs used as Element IDs:

Derived attributes are attributes that do not exist in the physical data,
but their values are derived from other attributes present in the data.
For example, age can be derived from date_of_birth.

--- Entity-Relationship Modeling

In a class model representing Elements identified by IRIs, the following types can be used
to model the relationship between physical data in SPDX documents and derived attribute values:

ElementId: a globally-unique IRI property of an Element
ElementRef: an IRI reference to an Element, e.g. Document/rootElement, Relationship/to.
Namespace: (PNAME_NS) the BASE IRI or the IRI corresponding to a PREFIX string.
Lui: (Local Unique Identifier = PN_LOCAL) the local part of PrefixedName
Hint: information that is not part of ElementId but may be included in ElementRef to aid readability

For collections of multiple Elements such as Document and ContextualCollection, Lui uniquely identifies
the Element within the collection.
For singleton Elements, Lui MAY be empty/null, in which case the ElementId IRI equals its Namespace.

The productions and functions used to support derived attributes are:

    elementId    ::= a production of namespace and lui that yields an iri
    elementRef   ::= a production of namespace, lui, and hint that yields an iri

    elementId = element_ref_format(namespace, lui)
    elementRef = element_ref_format(namespace, lui, hint)
    namespace, lui, hint = element_ref_parse(iri)
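
A minimal sketch of these two functions follows. The splitting rules are assumptions made for illustration: the hint is taken to start at the first "?" or "#" (delimiter included), and the Lui is taken to be whatever follows the last "/" before the hint. A real profile could choose different delimiters.

```python
from typing import NamedTuple


class ElementRef(NamedTuple):
    namespace: str  # IRI prefix, assumed here to end with "/"
    lui: str        # local unique identifier; may be empty for singletons
    hint: str       # "" or a "?query" / "#fragment" suffix, delimiter included


def element_ref_format(namespace: str, lui: str = "", hint: str = "") -> str:
    """Concatenate the parts into an IRI; with no hint this is the ElementId."""
    return f"{namespace}{lui}{hint}"


def element_ref_parse(iri: str) -> ElementRef:
    """Split an IRI into (namespace, lui, hint) under the assumptions above."""
    positions = [i for i in (iri.find("?"), iri.find("#")) if i != -1]
    cut = min(positions, default=len(iri))
    body, hint = iri[:cut], iri[cut:]
    namespace, slash, lui = body.rpartition("/")
    return ElementRef(namespace + slash, lui, hint)


ref = element_ref_parse("http://example.org/book/book1#title")
element_id = element_ref_format(ref.namespace, ref.lui)
print(ref, element_id)
```

Formatting the parsed parts reproduces the original IRI, and dropping the hint yields the ElementId. A singleton such as `http://example.org/doc/` parses to an empty Lui, so its ElementId equals its Namespace.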

Examples

The SPDX definition of PN_LOCAL MUST be consistent with that of SPARQL, but MAY impose additional
restrictions (such as prohibiting escaped characters) in the interest of efficiency and readability.

Amazon product page

The following IRIs are aliases for the same resource:

[1] https://www.amazon.com/dp/0127999574
[2] https://www.amazon.com/RDF-Database-Systems-Triples-Processing/dp/0127999574
[3] https://www.amazon.com/dp/0127999574#RDF-Database-Systems-Triples-Processing

When treated as strings they would be considered three different resources.  But treating them as IRIs
derived from the same properties identifies the aliasing and allows the single resource to be recognized.

[1] Namespace = https://www.amazon.com/dp/, Lui = 0127999574
[3] Namespace = https://www.amazon.com/dp/, Lui = 0127999574, Hint = RDF-Database-Systems-Triples-Processing

Amazon uses server-side processing to alias pages with and without embedded hints. Unless the
Amazon-specific Lui delimiter "/dp/" is interpreted, IRI [2] is not recognized as an alias for the same resource as the other two:

[2] Namespace = https://www.amazon.com/RDF-Database-Systems-Triples-Processing/dp/, Lui = 0127999574
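
The effect of interpreting the delimiter can be sketched as follows. The function `parse_amazon` and the rule that any path segment before "/dp/" is a hint are assumptions inferred from the three example IRIs, not a description of Amazon's actual server-side logic.

```python
from urllib.parse import urlsplit

DELIMITER = "/dp/"  # Amazon-specific marker that precedes the Lui


def parse_amazon(iri: str) -> tuple:
    """Return (namespace, lui, hint) for a product IRI (hypothetical helper)."""
    parts = urlsplit(iri)
    head, sep, lui = parts.path.rpartition(DELIMITER)
    if not sep:
        raise ValueError(f"no {DELIMITER!r} delimiter in {iri!r}")
    namespace = f"{parts.scheme}://{parts.netloc}{DELIMITER}"
    # A hint may appear as a fragment [3] or as a path segment before /dp/ [2].
    hint = parts.fragment or head.strip("/")
    return namespace, lui, hint


aliases = [
    "https://www.amazon.com/dp/0127999574",
    "https://www.amazon.com/RDF-Database-Systems-Triples-Processing/dp/0127999574",
    "https://www.amazon.com/dp/0127999574#RDF-Database-Systems-Triples-Processing",
]
keys = {parse_amazon(alias)[:2] for alias in aliases}
print(keys)
```

With the delimiter taken into account, all three aliases reduce to one (Namespace, Lui) pair, and the descriptive text in [2] and [3] is carried only as a hint.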

Note that Lui may sometimes be a "component id", since ISBNs, grocery barcode UPCs, etc. identify a specific
component across more than one vendor, application, or namespace.  But Lui is always local to the namespace under
which it appears, since the same value 0127999574 under a different namespace may be entirely unrelated to its
use as an ISBN.

Identifier Conversion Service

Identifiers.org is a SPARQL-based service that enables
on-the-fly integration of life science data. It registers and assigns short names to major
producers of life science data, for example "csd" for the
Cambridge Crystallographic Data Centre.
The producers in turn identify their products by id.

Resource URI: https://identifiers.org/csd:PELNAW
Namespace = https://identifiers.org/, Lui = csd:PELNAW

SPDX does not need to understand multi-level namespace hierarchies, but treating the top level in accordance with
linked data standards ensures that the SPDX standard does not preclude more complex use cases.
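
The distinction between the top level and deeper levels can be illustrated with a short sketch. The function `parse_compact` and the field names `producer` and `product_id` are hypothetical; the second-level split on ":" is something a consumer may apply but is not required to understand.

```python
RESOLVER = "https://identifiers.org/"


def parse_compact(iri: str, namespace: str = RESOLVER) -> dict:
    """Split at the top level, then optionally look inside the Lui."""
    if not iri.startswith(namespace):
        raise ValueError(f"{iri!r} is not under {namespace!r}")
    lui = iri[len(namespace):]
    # Optional second level: producer short name and the producer's own id.
    producer, _, product_id = lui.partition(":")
    return {
        "namespace": namespace,
        "lui": lui,
        "producer": producer,
        "product_id": product_id,
    }


info = parse_compact("https://identifiers.org/csd:PELNAW")
print(info)
```

A tool that stops at `namespace` and `lui` still identifies the resource correctly; the extra fields are only needed by tools that care about the registry's internal structure.
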

SPDX v2

SPDX v2 recognizes IRI structure by defining documentNamespace (e.g., http://spdx.org/spdxdocs/spdx-example-444504E0-4F89-41D3-9A0C-0305E82C330) and SPDXID (e.g., SPDXRef-File, SPDXRef-JenaLib, DocumentRef-spdx-tool-1.2:SPDXRef-ToolsElement, LicenseRef-4).
If SPDXID were an unstructured string type, the namespace would be repeated throughout the document every time an id was defined or
referenced:
http://spdx.org/spdxdocs/spdx-example-444504E0-4F89-41D3-9A0C-0305E82C330/SPDXRef-File  
http://spdx.org/spdxdocs/spdx-example-444504E0-4F89-41D3-9A0C-0305E82C330/SPDXRef-JenaLib  
http://spdx.org/spdxdocs/spdx-example-444504E0-4F89-41D3-9A0C-0305E82C330/DocumentRef-spdx-tool-1.2:SPDXRef-ToolsElement  
http://spdx.org/spdxdocs/spdx-example-444504E0-4F89-41D3-9A0C-0305E82C330/LicenseRef-4  

No XML document, whether or not it represents linked data, treats a URI as a plain string, just as none treats a number
as a string in which "1.0" has a different value than "1.00". The equivalence between IRI and PrefixedName, and the ability
to derive one from the other as defined in SPARQL, must be respected in SPDX v3.

Hints

SPDX v2 includes both unique-identifier and hint information in SPDXID values. But SPARQL, by explicitly excluding
query and fragment when comparing IRIs, provides a standard syntax for hints. SPDX v3 does not need to define
anything else: producing tools can include whatever query and/or fragment values they deem useful in ElementRefs,
knowing that consuming tools ignore them.
The following ElementRefs identify the same Element:
http://spdx.org/spdxdocs/spdx-example-444504E0-4F89-41D3-9A0C-0305E82C330/4  
http://spdx.org/spdxdocs/spdx-example-444504E0-4F89-41D3-9A0C-0305E82C330/4#LicenseRef  
http://spdx.org/spdxdocs/spdx-example-444504E0-4F89-41D3-9A0C-0305E82C330/4?type=License&name=Apache-2.0  

The properties used to derive the IRI lexical value are:
namespace = http://spdx.org/spdxdocs/spdx-example-444504E0-4F89-41D3-9A0C-0305E82C330/  
lui = 4  
hint = "", "#LicenseRef" or "?type=License&name=Apache-2.0" respectively
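
A consuming tool could recover the ElementId and the hint from each ElementRef along the following lines. This is a sketch using Python's standard `urllib.parse`; the helper name `element_id` is an assumption for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

NS = "http://spdx.org/spdxdocs/spdx-example-444504E0-4F89-41D3-9A0C-0305E82C330/"
refs = [
    NS + "4",
    NS + "4#LicenseRef",
    NS + "4?type=License&name=Apache-2.0",
]


def element_id(ref: str) -> str:
    """Drop the query and fragment, keeping scheme, authority, and path."""
    s = urlsplit(ref)
    return urlunsplit((s.scheme, s.netloc, s.path, "", ""))


ids = {element_id(ref) for ref in refs}
# The hint is whatever remains after the ElementId, delimiter included.
hints = [ref[len(element_id(ref)):] for ref in refs]
print(ids, hints)
```

The set `ids` contains a single ElementId, and `hints` holds the three hint values listed above in the same order.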