@VladimirAlexiev
Last active December 24, 2016 02:56
How Not To Do Linked Data


I reviewed the Linked Data of a respected cultural heritage (CH) institution, which will remain unnamed. Below are my findings.

Preparation

riot --formatted=turtle ex1.rdf  1>ex1.ttl

Syntax Errors

  • The file begins
<rdf:RDF xsi:schemaLocation="http://www.w3.org/2000/01/rdf-schema# http://erlangen-crm.org/120111/" >

This causes the error:

[line: 1, col: 102] The prefix "rdf" for element "rdf:RDF" is not bound.

Fix like so:

<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:crm="http://erlangen-crm.org/120111/" >
  • using a dated version of the ontology URL (http://erlangen-crm.org/120111/) is a bad idea: your data can interoperate ONLY with people who use exactly the same version (i.e. nobody)
  • special chars in URLs are not XML-escaped. This causes the error
[line: 6, col: 97] The reference to entity "case" must end with the ';' delimiter

Change & to &amp;. But notice that there is still a badly escaped URL in crm:E18_Physical_Thing: &amp-id=oai%3aculturaitalia-it

  • there is unescaped HTML markup inside a literal, which just won't do: the parser tries to read the <a> element as RDF
<rdf:value>
<a href="viewItem.jsp?language=en&amp;id=oai%3Aculturaitalia.it%3Amuseiditalia-coll_306">View parent resource</a>
</rdf:value>

This causes these errors:

 {W104} Unqualified typed nodes are not allowed. Type treated as a relative URI.
 {W136} Relative URIs are not permitted in RDF: specifically <a>
 {W136} Relative URIs are not permitted in RDF: specifically <href>
 {W102} Unqualified property attributes are not allowed. Property treated as a relative URI.
 {E202} Expecting XML start or end element(s). String data "View parent resource" not allowed

I commented it out.

Semantic Errors

Now that we've fixed the syntactic errors, we can start looking at the RDF data. It tries (but fails) to describe this object, which has nice structured data: http://www.culturaitalia.it/opencms/it/temi/viewItem.jsp?language=it&id=oai%3Aculturaitalia.it%3Amuseiditalia-work_34345

  • uses rdf:value throughout, but CRM doesn't specify the use of this property, and nobody else uses it (use crm:P3_has_note or rdfs:label instead)
  • maybe the weirdest construct is the one below: crm:E62_String does not exist, so just use the literal directly!
crm:P3_has_note  [ a crm:E62_String ;
                   rdf:value  "Testa di Atena con elmo adorno di grigfone"
                 ] ;
<http://culturaitalia.it/resource/place/paslazzo-mazzarotta-cb-molise-italia-inv-45829-09-2011->
        a          crm:E53_Place ;
        rdf:value  "paslazzo Mazzarotta (CB), Molise - Italia, inv. 45829 (09/2011)" .
  • including inventory numbers in places is wrong: that should be a crm:E42_Identifier
  • lots of blank-node types, rather than types from some thesaurus. E.g.:
<http://194.242.241.163/fedora/objects/work:34345/datastreams/MM105015/content>
        a                crm:E36_Visual_Item ;
        crm:P2_has_type  [ a          crm:E55_Type ;
                           rdf:value  "preview"
                           ] .
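
The problems in the list above could be repaired roughly as follows. This is only a sketch: the ex: namespace and the three ex: URIs are made up for illustration, rdfs:label is used in place of rdf:value as suggested above, and the "(09/2011)" part of the place string is left out because the data does not say what it means.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix crm:  <http://erlangen-crm.org/120111/> .
@prefix ex:   <http://example.org/resource/> .   # hypothetical namespace, not the institution's

<http://culturaitalia.it/resource/oai-culturaitalia-it-museiditalia-work_34345>
        a                        crm:E22_Man_Made_Object ;
        # plain literal instead of a crm:E62_String blank node
        crm:P3_has_note          "Testa di Atena con elmo adorno di grigfone" ;
        # the inventory number identifies the object, not the place
        crm:P1_is_identified_by  ex:inventory-45829 ;
        crm:P53_has_former_or_current_location  ex:place-mazzarotta .

ex:inventory-45829
        a           crm:E42_Identifier ;
        rdfs:label  "45829" .

ex:place-mazzarotta
        a           crm:E53_Place ;
        rdfs:label  "paslazzo Mazzarotta (CB), Molise - Italia" .

<http://194.242.241.163/fedora/objects/work:34345/datastreams/MM105015/content>
        a                crm:E36_Visual_Item ;
        # one shared type URI instead of a fresh blank node per image
        crm:P2_has_type  ex:type-preview .

ex:type-preview
        a           crm:E55_Type ;
        rdfs:label  "preview" .
```

With a shared URI for the "preview" type, all preview images can be found with a single triple pattern, which a blank node per image does not allow.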

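Several mistakes in this part: the relative URI was resolved against my local directory (hence file:///C:/my/...), the identifier merely repeats that relative path, and its type "URL" is yet another blank node: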

<file:///C:/my/Onto/culture/CulturaItalia/fedora/objects/work:34345/datastreams/export/content>
        a                        crm:E89_Propositional_Object ;
        crm:P1_is_identified_by  [ a                crm:E42_Identifier ;
                                   rdf:value        "fedora/objects/work:34345/datastreams/export/content" ;
                                   crm:P2_has_type  [ a          crm:E55_Type ;
                                                      rdf:value  "URL"
                                                    ]
                                 ] .
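
One possible repair, as a sketch: publish the resource under an absolute URI and drop the identifier node, since the URI already is the identifier. The http://example.org/ base below is a placeholder; the real, permanent base of the export datastream is not known from the data.

```turtle
@prefix crm: <http://erlangen-crm.org/120111/> .

# example.org is a hypothetical base; substitute the institution's permanent one
<http://example.org/fedora/objects/work:34345/datastreams/export/content>
        a  crm:E89_Propositional_Object .
```

In the RDF/XML source this comes down to writing an absolute URI in rdf:about, or declaring xml:base on rdf:RDF so that relative URIs no longer resolve against whatever directory the file happens to sit in.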

The next one is a totally ugly, non-permanent, and wrong URL (notice the mangled XML entity &amp-id):

crm:P46i_forms_part_of
  <http://culturaitalia.it/resource/thing/-a-href=-viewitem-jsp?language=en&amp-id=oai%3aculturaitalia-it%3amuseiditalia-c-$5$G0BTMMeY$sgniH4PL39nLT4IHvU2jjVILwGGaI.cYDrmDZXWeK9A>
  • and then, there is no useful info about that resource:
<http://culturaitalia.it/resource/thing/-a-href=-viewitem-jsp?language=en&amp-id=oai%3aculturaitalia-it%3amuseiditalia-c-$5$G0BTMMeY$sgniH4PL39nLT4IHvU2jjVILwGGaI.cYDrmDZXWeK9A>
  a crm:E18_Physical_Thing .
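  • "Scheda ICCD NU" is an ICCD catalogue record ("scheda" means card or record; NU appears to be the ICCD record type for numismatic items rather than "number"), so again that's a document (crm:E31_Document), not a crm:E89_Propositional_Object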
<http://culturaitalia.it/resource/object/scheda-iccd-nu-14-00085092>
        a          crm:E89_Propositional_Object ;
        rdf:value  "Scheda ICCD NU: 14-00085092" .
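
The last two points could be addressed along these lines. Again a sketch under assumptions: the collection URI is guessed by applying the pattern of the work URI to the coll_306 id found in the commented-out link, and crm:E31_Document with rdfs:label is my suggestion for the catalogue record. The collection stays a crm:E18_Physical_Thing as in the source; its name is not in the data, so no label is given.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix crm:  <http://erlangen-crm.org/120111/> .

<http://culturaitalia.it/resource/oai-culturaitalia-it-museiditalia-work_34345>
        # clean URI for the parent collection (assumed pattern, see above)
        crm:P46i_forms_part_of      <http://culturaitalia.it/resource/oai-culturaitalia-it-museiditalia-coll_306> ;
        crm:P67i_is_referred_to_by  <http://culturaitalia.it/resource/object/scheda-iccd-nu-14-00085092> .

<http://culturaitalia.it/resource/oai-culturaitalia-it-museiditalia-coll_306>
        a  crm:E18_Physical_Thing .

<http://culturaitalia.it/resource/object/scheda-iccd-nu-14-00085092>
        a           crm:E31_Document ;
        rdfs:label  "Scheda ICCD NU: 14-00085092" .
```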
ex1.rdf (with the syntax fixes above applied)

<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:crm="http://erlangen-crm.org/120111/" >
  <crm:E22_Man_Made_Object rdf:about="http://culturaitalia.it/resource/oai-culturaitalia-it-museiditalia-work_34345" >
    <crm:P1_is_identified_by>
      <crm:E42_Identifier rdf:about="http://www.culturaitalia.it/opencms/viewItem.jsp?language=en&amp;case=&amp;id=oai%3Aculturaitalia.it%3Amuseiditalia-work_34345" >
        <rdf:value>http://www.culturaitalia.it/opencms/viewItem.jsp?language=en&amp;case=&amp;id=oai%3Aculturaitalia.it%3Amuseiditalia-work_34345</rdf:value>
      </crm:E42_Identifier>
    </crm:P1_is_identified_by>
    <crm:P1_is_identified_by>
      <crm:E42_Identifier rdf:about="http://culturaitalia.it/resource/identifier/work_34345" >
        <rdf:value>work_34345</rdf:value>
      </crm:E42_Identifier>
    </crm:P1_is_identified_by>
    <crm:P1_is_identified_by>
      <crm:E35_Title rdf:about="http://culturaitalia.it/resource/title/didracma" >
        <rdf:value>didracma</rdf:value>
        <crm:P139_has_alternative_form>
          <crm:E35_Title rdf:about="http://culturaitalia.it/resource/alternative/moneta" >
            <rdf:value>Moneta</rdf:value>
          </crm:E35_Title>
        </crm:P139_has_alternative_form>
      </crm:E35_Title>
    </crm:P1_is_identified_by>
    <crm:P2_has_type>
      <crm:E55_Type rdf:about="http://culturaitalia.it/pico/thesaurus/4.1#monete_e_medaglie" ></crm:E55_Type>
    </crm:P2_has_type>
    <crm:P3_has_note>
      <crm:E62_String>
        <rdf:value>Testa di Atena con elmo adorno di grigfone</rdf:value>
      </crm:E62_String>
    </crm:P3_has_note>
    <crm:P53_has_former_or_current_location>
      <crm:E53_Place rdf:about="http://culturaitalia.it/resource/place/paslazzo-mazzarotta-cb-molise-italia-inv-45829-09-2011-" >
        <rdf:value>paslazzo Mazzarotta (CB), Molise - Italia, inv. 45829 (09/2011)</rdf:value>
      </crm:E53_Place>
    </crm:P53_has_former_or_current_location>
    <crm:P108i_was_produced_by>
      <crm:E12_Production rdf:about="http://culturaitalia.it/resource/created/iv-sec-a-c-" >
        <crm:P4_has_time_span>
          <crm:E52_Time_Span rdf:about="http://culturaitalia.it/resource/date/iv-sec-a-c-" >
            <rdf:value>IV sec. a.C.</rdf:value>
          </crm:E52_Time_Span>
        </crm:P4_has_time_span>
      </crm:E12_Production>
    </crm:P108i_was_produced_by>
    <crm:P108i_was_produced_by>
      <crm:E12_Production rdf:about="http://culturaitalia.it/resource/created/350-bc-340-bc" >
        <crm:P4_has_time_span>
          <crm:E52_Time_Span rdf:about="http://culturaitalia.it/resource/date/350-bc-340-bc" >
            <rdf:value>350 BC - 340 BC</rdf:value>
          </crm:E52_Time_Span>
        </crm:P4_has_time_span>
      </crm:E12_Production>
    </crm:P108i_was_produced_by>
    <crm:P2_has_type>
      <crm:E55_Type rdf:about="http://culturaitalia.it/resource/type/opere" >
        <rdf:value>Opere</rdf:value>
      </crm:E55_Type>
    </crm:P2_has_type>
    <crm:P2_has_type>
      <crm:E55_Type rdf:about="http://culturaitalia.it/resource/type/moneta" >
        <rdf:value>Moneta</rdf:value>
      </crm:E55_Type>
    </crm:P2_has_type>
    <crm:P2_has_type>
      <crm:E55_Type rdf:about="http://purl.org/dc/dcmitype/PhysicalObject" >
        <rdf:value>PhysicalObject</rdf:value>
      </crm:E55_Type>
    </crm:P2_has_type>
    <crm:P46i_forms_part_of>
      <crm:E18_Physical_Thing rdf:about="http://culturaitalia.it/resource/thing/-a-href=-viewitem-jsp?language=en&amp;amp-id=oai%3aculturaitalia-it%3amuseiditalia-c-$5$G0BTMMeY$sgniH4PL39nLT4IHvU2jjVILwGGaI.cYDrmDZXWeK9A" >
        <!--rdf:value><a href="viewItem.jsp?language=en&amp;id=oai%3Aculturaitalia.it%3Amuseiditalia-coll_306">View parent resource</a></rdf:value-->
      </crm:E18_Physical_Thing>
    </crm:P46i_forms_part_of>
    <crm:P67i_is_referred_to_by>
      <crm:E89_Propositional_Object rdf:about="http://culturaitalia.it/resource/object/scheda-iccd-nu-14-00085092" >
        <rdf:value>Scheda ICCD NU: 14-00085092</rdf:value>
      </crm:E89_Propositional_Object>
    </crm:P67i_is_referred_to_by>
    <crm:P45_consists_of>
      <crm:E57_Material rdf:about="http://culturaitalia.it/resource/material/coniata-in-argento" >
        <rdf:value>coniata in argento</rdf:value>
      </crm:E57_Material>
    </crm:P45_consists_of>
    <crm:P43_has_dimension>
      <crm:E54_Dimension rdf:about="http://culturaitalia.it/resource/dimension/diametro-cm-2-21" >
        <rdf:value>diametro: cm 2.21</rdf:value>
      </crm:E54_Dimension>
    </crm:P43_has_dimension>
    <crm:P43_has_dimension>
      <crm:E54_Dimension rdf:about="http://culturaitalia.it/resource/dimension/peso-cm-7-16-g" >
        <rdf:value>peso: cm 7.16 g</rdf:value>
      </crm:E54_Dimension>
    </crm:P43_has_dimension>
    <crm:P138i_has_representation>
      <crm:E36_Visual_Item rdf:about="http://194.242.241.163/fedora/objects/work:34345/datastreams/MM105015/content" >
        <crm:P2_has_type>
          <crm:E55_Type>
            <rdf:value>preview</rdf:value>
          </crm:E55_Type>
        </crm:P2_has_type>
      </crm:E36_Visual_Item>
    </crm:P138i_has_representation>
    <crm:P67i_is_referred_to_by>
      <crm:E89_Propositional_Object rdf:about="fedora/objects/work:34345/datastreams/export/content" >
        <crm:P1_is_identified_by>
          <crm:E42_Identifier>
            <rdf:value>fedora/objects/work:34345/datastreams/export/content</rdf:value>
            <crm:P2_has_type>
              <crm:E55_Type>
                <rdf:value>URL</rdf:value>
              </crm:E55_Type>
            </crm:P2_has_type>
          </crm:E42_Identifier>
        </crm:P1_is_identified_by>
      </crm:E89_Propositional_Object>
    </crm:P67i_is_referred_to_by>
  </crm:E22_Man_Made_Object>
</rdf:RDF>

ex1.ttl (riot output)

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix crm: <http://erlangen-crm.org/120111/> .

<http://culturaitalia.it/resource/dimension/diametro-cm-2-21>
        a          crm:E54_Dimension ;
        rdf:value  "diametro: cm 2.21" .

<file:///C:/my/Onto/culture/CulturaItalia/fedora/objects/work:34345/datastreams/export/content>
        a                        crm:E89_Propositional_Object ;
        crm:P1_is_identified_by  [ a                crm:E42_Identifier ;
                                   rdf:value        "fedora/objects/work:34345/datastreams/export/content" ;
                                   crm:P2_has_type  [ a          crm:E55_Type ;
                                                      rdf:value  "URL"
                                                    ]
                                 ] .

<http://194.242.241.163/fedora/objects/work:34345/datastreams/MM105015/content>
        a                crm:E36_Visual_Item ;
        crm:P2_has_type  [ a          crm:E55_Type ;
                           rdf:value  "preview"
                         ] .

<http://culturaitalia.it/pico/thesaurus/4.1#monete_e_medaglie>
        a  crm:E55_Type .

<http://culturaitalia.it/resource/alternative/moneta>
        a          crm:E35_Title ;
        rdf:value  "Moneta" .

<http://culturaitalia.it/resource/created/350-bc-340-bc>
        a                     crm:E12_Production ;
        crm:P4_has_time_span  <http://culturaitalia.it/resource/date/350-bc-340-bc> .

<http://culturaitalia.it/resource/created/iv-sec-a-c->
        a                     crm:E12_Production ;
        crm:P4_has_time_span  <http://culturaitalia.it/resource/date/iv-sec-a-c-> .

<http://culturaitalia.it/resource/date/350-bc-340-bc>
        a          crm:E52_Time_Span ;
        rdf:value  "350 BC - 340 BC" .

<http://culturaitalia.it/resource/date/iv-sec-a-c->
        a          crm:E52_Time_Span ;
        rdf:value  "IV sec. a.C." .

<http://culturaitalia.it/resource/dimension/peso-cm-7-16-g>
        a          crm:E54_Dimension ;
        rdf:value  "peso: cm 7.16 g" .

<http://culturaitalia.it/resource/identifier/work_34345>
        a          crm:E42_Identifier ;
        rdf:value  "work_34345" .

<http://culturaitalia.it/resource/material/coniata-in-argento>
        a          crm:E57_Material ;
        rdf:value  "coniata in argento" .

<http://culturaitalia.it/resource/oai-culturaitalia-it-museiditalia-work_34345>
        a  crm:E22_Man_Made_Object ;
        crm:P108i_was_produced_by  <http://culturaitalia.it/resource/created/iv-sec-a-c-> , <http://culturaitalia.it/resource/created/350-bc-340-bc> ;
        crm:P138i_has_representation  <http://194.242.241.163/fedora/objects/work:34345/datastreams/MM105015/content> ;
        crm:P1_is_identified_by  <http://culturaitalia.it/resource/title/didracma> , <http://www.culturaitalia.it/opencms/viewItem.jsp?language=en&case=&id=oai%3Aculturaitalia.it%3Amuseiditalia-work_34345> , <http://culturaitalia.it/resource/identifier/work_34345> ;
        crm:P2_has_type  <http://culturaitalia.it/resource/type/opere> , <http://culturaitalia.it/resource/type/moneta> , <http://purl.org/dc/dcmitype/PhysicalObject> , <http://culturaitalia.it/pico/thesaurus/4.1#monete_e_medaglie> ;
        crm:P3_has_note  [ a          crm:E62_String ;
                           rdf:value  "Testa di Atena con elmo adorno di grigfone"
                         ] ;
        crm:P43_has_dimension  <http://culturaitalia.it/resource/dimension/diametro-cm-2-21> , <http://culturaitalia.it/resource/dimension/peso-cm-7-16-g> ;
        crm:P45_consists_of  <http://culturaitalia.it/resource/material/coniata-in-argento> ;
        crm:P46i_forms_part_of  <http://culturaitalia.it/resource/thing/-a-href=-viewitem-jsp?language=en&amp-id=oai%3aculturaitalia-it%3amuseiditalia-c-$5$G0BTMMeY$sgniH4PL39nLT4IHvU2jjVILwGGaI.cYDrmDZXWeK9A> ;
        crm:P53_has_former_or_current_location
                <http://culturaitalia.it/resource/place/paslazzo-mazzarotta-cb-molise-italia-inv-45829-09-2011-> ;
        crm:P67i_is_referred_to_by  <http://culturaitalia.it/resource/object/scheda-iccd-nu-14-00085092> , <file:///C:/my/Onto/culture/CulturaItalia/fedora/objects/work:34345/datastreams/export/content> .

<http://culturaitalia.it/resource/object/scheda-iccd-nu-14-00085092>
        a          crm:E89_Propositional_Object ;
        rdf:value  "Scheda ICCD NU: 14-00085092" .

<http://culturaitalia.it/resource/place/paslazzo-mazzarotta-cb-molise-italia-inv-45829-09-2011->
        a          crm:E53_Place ;
        rdf:value  "paslazzo Mazzarotta (CB), Molise - Italia, inv. 45829 (09/2011)" .

<http://culturaitalia.it/resource/thing/-a-href=-viewitem-jsp?language=en&amp-id=oai%3aculturaitalia-it%3amuseiditalia-c-$5$G0BTMMeY$sgniH4PL39nLT4IHvU2jjVILwGGaI.cYDrmDZXWeK9A>
        a  crm:E18_Physical_Thing .

<http://culturaitalia.it/resource/title/didracma>
        a          crm:E35_Title ;
        rdf:value  "didracma" ;
        crm:P139_has_alternative_form  <http://culturaitalia.it/resource/alternative/moneta> .

<http://culturaitalia.it/resource/type/moneta>
        a          crm:E55_Type ;
        rdf:value  "Moneta" .

<http://culturaitalia.it/resource/type/opere>
        a          crm:E55_Type ;
        rdf:value  "Opere" .

<http://purl.org/dc/dcmitype/PhysicalObject>
        a          crm:E55_Type ;
        rdf:value  "PhysicalObject" .

<http://www.culturaitalia.it/opencms/viewItem.jsp?language=en&case=&id=oai%3Aculturaitalia.it%3Amuseiditalia-work_34345>
        a          crm:E42_Identifier ;
        rdf:value  "http://www.culturaitalia.it/opencms/viewItem.jsp?language=en&case=&id=oai%3Aculturaitalia.it%3Amuseiditalia-work_34345" .