@justin2004
Last active August 27, 2021 23:01
NYC Open Data -- Data Modeling Exercise

NYC OpenData thoughtfully provides access to many datasets produced by the city government. It offers several export formats including the primary linked open data format: RDF.

There are, however, opportunities to make the RDF representation even more linked open data (LOD) friendly.

Using the following dataset as an example we will explore some RDF modeling options that make the data more LOD friendly.

Below is a description of each of the following files.

  • 2_before.ttl is a single sample from the water quality dataset. Note that NYC OpenData provides the XML serialization of RDF, but we use the Turtle (TTL) serialization here; the two are equivalent.

  • 3_after.ttl is a re-modeled representation of the same data using Wikidata vocabulary.

  • 4_how_can_this_modeling_be_used.rq is an example of the kind of SPARQL query that can be written once the data is modeled in this way.

  • 5_query_result.csv is the result of the SPARQL query.

2_before.ttl

@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dcterm: <http://purl.org/dc/terms/> .
@prefix dsbase: <https://data.cityofnewyork.us/resource/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix socrata: <http://www.socrata.com/rdf/terms#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ods: <http://open-data-standards.github.com/2012/01/open-data-standards#> .
@prefix ds: <https://data.cityofnewyork.us/resource/bkwf-xfky/> .
ds:201500865 rdfs:member dsbase:bkwf-xfky ;
    socrata:rowID "row-cb8u.73zx_8k7k" ;
    ds:coliform_quanti_tray_mpn_100ml "<1" ;
    ds:e_coli_quanti_tray_mpn_100ml "<1" ;
    ds:residual_free_chlorine_mg_l "0.39" ;
    ds:sample_class "Compliance" ;
    ds:sample_date "2015-01-11T00:00:00" ;
    ds:sample_number "201500865" ;
    ds:sample_site "55450" ;
    ds:sample_time "07:42" ;
    ds:turbidity_ntu "0.83" .

3_after.ttl

@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dcterm: <http://purl.org/dc/terms/> .
@prefix dsbase: <https://data.cityofnewyork.us/resource/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix socrata: <http://www.socrata.com/rdf/terms#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ods: <http://open-data-standards.github.com/2012/01/open-data-standards#> .
@prefix ds: <https://data.cityofnewyork.us/resource/bkwf-xfky/> .
@prefix wgs84_pos: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix wd: <http://www.wikidata.org/entity/> .
@prefix wdt: <http://www.wikidata.org/prop/direct/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix wikibase: <http://wikiba.se/ontology#> .
[ a wd:Q485146 ;                                      # sample
  ds:sample_class "Compliance" ;
  ds:sample_number "201500865" ;
  wdt:P580 "2015-01-11T07:42:00"^^xsd:dateTime ;      # start time
  wdt:P625 [ wgs84_pos:lat "40.51949162979514" ;      # coordinate location
             wgs84_pos:long "-74.1970916294256"
           ] ;
  wdt:P186 _:b0                                       # derived from
] .

_:b0 a wd:Q7892 ;                                     # drinking water
  wdt:P527 [ a wd:Q902010 ;                           # has part coliform
             wdt:P1114 [ rdf:value 0 ;                # quantity
                         wikibase:quantityUnit wd:Q21401573
                         # reciprocal cubic metre
                       ]
           ] ;
  wdt:P527 [ a wd:Q25419 ;                            # has part e coli
             wdt:P1114 [ rdf:value 0 ;                # quantity 0
                         wikibase:quantityUnit wd:Q21401573
                         # reciprocal cubic metre
                       ]
           ] ;
  wdt:P527 [ a wd:Q688 ;                              # has part chlorine
             wdt:P2054 [ rdf:value .00039 ;           # density
                         wikibase:quantityUnit wd:Q834105
                         # gram per litre
                       ]
           ] ;
  wdt:P1552 [ a wd:Q898574 ;                          # has quality turbidity
              wdt:P1114 [ rdf:value 0.83 ;            # quantity
                          wikibase:quantityUnit wd:Q1977731
                          # NTU
                        ]
            ] .
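
The step from the raw string literals in 2_before.ttl to the typed values in 3_after.ttl can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, "<1" is mapped to 0 because that is the value 3_after.ttl uses, and the coordinates are left out because they do not appear in 2_before.ttl.

```python
from datetime import datetime
from decimal import Decimal

# Raw literals copied from 2_before.ttl.
raw = {
    "coliform_quanti_tray_mpn_100ml": "<1",
    "e_coli_quanti_tray_mpn_100ml": "<1",
    "residual_free_chlorine_mg_l": "0.39",
    "sample_date": "2015-01-11T00:00:00",
    "sample_time": "07:42",
    "turbidity_ntu": "0.83",
}


def parse_count(text):
    """Turn a count literal into an int; a "<N" reading becomes 0 (assumption
    taken from 3_after.ttl, which records 0 for "<1")."""
    text = text.strip()
    if text.startswith("<"):
        return 0
    return int(text)


def mg_per_l_to_g_per_l(text):
    """Convert a mg/L literal to g/L without floating-point rounding."""
    return Decimal(text) / Decimal(1000)


def combine_date_time(date_text, time_text):
    """Merge the separate date and HH:MM literals into one ISO 8601 timestamp."""
    day = datetime.fromisoformat(date_text).date()
    clock = datetime.strptime(time_text, "%H:%M").time()
    return datetime.combine(day, clock).isoformat()


typed = {
    "coliform": parse_count(raw["coliform_quanti_tray_mpn_100ml"]),
    "e_coli": parse_count(raw["e_coli_quanti_tray_mpn_100ml"]),
    "chlorine_g_per_l": mg_per_l_to_g_per_l(raw["residual_free_chlorine_mg_l"]),
    "start_time": combine_date_time(raw["sample_date"], raw["sample_time"]),
    "turbidity_ntu": Decimal(raw["turbidity_ntu"]),
}
print(typed)
```

Recording "<1" as 0 drops the fact that the reading was below a detection limit; a fuller model might keep that as a separate statement.
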

4_how_can_this_modeling_be_used.rq

# This SPARQL query uses `3_after.ttl` as input and does the following:
# for each sample of drinking water, find its constituent parts and use Wikidata to say what kind of thing each part is.
curl --silent -H 'Accept: text/csv' 'http://127.0.0.1:3030/water/query' \
--data-urlencode 'query=
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX wd: <http://www.wikidata.org/entity/> # Wikibase entity - item or property.
PREFIX wdt: <http://www.wikidata.org/prop/direct/> # Truthy assertions about the data, links entity to value directly.
PREFIX p: <http://www.wikidata.org/prop/> # Links entity to statement
PREFIX ps: <http://www.wikidata.org/prop/statement/> # Links value to statement
PREFIX pq: <http://www.wikidata.org/prop/qualifier/> # Links qualifier to statement node
prefix pqv: <http://www.wikidata.org/prop/qualifier/value/> # Links qualifier deep value to statement node
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX bd: <http://www.bigdata.com/rdf#>
prefix sch: <https://schema.org/>
select ?part_type ?part_super_type ?part_super_typeLabel
WHERE
{
  ?sample a wd:Q485146 ;      # a sample
          wdt:P186 ?source .  # derived from
  ?source a wd:Q7892 .        # a drinking water
  ?source wdt:P527 ?part .    # has part
  ?part a ?part_type .
  service <https://query.wikidata.org/sparql> {
    ?part_type wdt:P31/wdt:P279* ?part_super_type .
    bind(?part_super_typeLabel as ?pstl )
    SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  }
} limit 500'
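
The same request can be prepared from Python's standard library instead of curl. The sketch below is an assumption-laden illustration, not part of the original gist: it targets the same local endpoint URL as the curl call, uses a shortened local-only query (no federated call to Wikidata), and `build_request` is a hypothetical helper name. The request is only built here; sending it needs the local SPARQL server to be running.

```python
import urllib.parse
import urllib.request

# Same local endpoint as in the curl call.
ENDPOINT = "http://127.0.0.1:3030/water/query"

# Shortened variant of the query: it lists the part types only and
# skips the federated SERVICE call to Wikidata (an assumption for brevity).
QUERY = """
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?part_type
WHERE {
  ?sample a wd:Q485146 ;
          wdt:P186 ?source .
  ?source a wd:Q7892 ;
          wdt:P527 ?part .
  ?part a ?part_type .
} LIMIT 500
"""


def build_request(endpoint, query):
    """Build a form-encoded POST that asks for CSV results."""
    body = urllib.parse.urlencode({"query": query}).encode("ascii")
    return urllib.request.Request(endpoint, data=body, headers={"Accept": "text/csv"})


request = build_request(ENDPOINT, QUERY)

# To actually run the query against a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```
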

5_query_result.csv

part_super_typeLabel,part_super_type,part_type
organisms known by a particular common name,http://www.wikidata.org/entity/Q55983715,http://www.wikidata.org/entity/Q902010
living organism class,http://www.wikidata.org/entity/Q21871294,http://www.wikidata.org/entity/Q902010
class,http://www.wikidata.org/entity/Q16889133,http://www.wikidata.org/entity/Q902010
group or class of physical objects,http://www.wikidata.org/entity/Q98119401,http://www.wikidata.org/entity/Q902010
collective entity,http://www.wikidata.org/entity/Q99527517,http://www.wikidata.org/entity/Q902010
entity,http://www.wikidata.org/entity/Q35120,http://www.wikidata.org/entity/Q902010
model organism,http://www.wikidata.org/entity/Q213907,http://www.wikidata.org/entity/Q25419
organism,http://www.wikidata.org/entity/Q7239,http://www.wikidata.org/entity/Q25419
research object,http://www.wikidata.org/entity/Q4330518,http://www.wikidata.org/entity/Q25419
anatomical entity,http://www.wikidata.org/entity/Q27043950,http://www.wikidata.org/entity/Q25419
object,http://www.wikidata.org/entity/Q488383,http://www.wikidata.org/entity/Q25419
entity,http://www.wikidata.org/entity/Q35120,http://www.wikidata.org/entity/Q25419
independent continuant,http://www.wikidata.org/entity/Q53617489,http://www.wikidata.org/entity/Q25419
continuant,http://www.wikidata.org/entity/Q103940464,http://www.wikidata.org/entity/Q25419
individual entity,http://www.wikidata.org/entity/Q23958946,http://www.wikidata.org/entity/Q25419
taxon,http://www.wikidata.org/entity/Q16521,http://www.wikidata.org/entity/Q25419
living organism class,http://www.wikidata.org/entity/Q21871294,http://www.wikidata.org/entity/Q25419
class,http://www.wikidata.org/entity/Q16889133,http://www.wikidata.org/entity/Q25419
group or class of physical objects,http://www.wikidata.org/entity/Q98119401,http://www.wikidata.org/entity/Q25419
collective entity,http://www.wikidata.org/entity/Q99527517,http://www.wikidata.org/entity/Q25419
entity,http://www.wikidata.org/entity/Q35120,http://www.wikidata.org/entity/Q25419
chemical element,http://www.wikidata.org/entity/Q11344,http://www.wikidata.org/entity/Q688
chemical substance,http://www.wikidata.org/entity/Q79529,http://www.wikidata.org/entity/Q688
material substance,http://www.wikidata.org/entity/Q28728771,http://www.wikidata.org/entity/Q688
physical substance,http://www.wikidata.org/entity/Q28732711,http://www.wikidata.org/entity/Q688
chemical entity,http://www.wikidata.org/entity/Q43460564,http://www.wikidata.org/entity/Q688
concrete object,http://www.wikidata.org/entity/Q4406616,http://www.wikidata.org/entity/Q688
spatial entity,http://www.wikidata.org/entity/Q58416391,http://www.wikidata.org/entity/Q688
object,http://www.wikidata.org/entity/Q488383,http://www.wikidata.org/entity/Q688
spatio-temporal entity,http://www.wikidata.org/entity/Q58415929,http://www.wikidata.org/entity/Q688
entity,http://www.wikidata.org/entity/Q35120,http://www.wikidata.org/entity/Q688
essential medicine,http://www.wikidata.org/entity/Q35456,http://www.wikidata.org/entity/Q688
medication,http://www.wikidata.org/entity/Q12140,http://www.wikidata.org/entity/Q688
drug,http://www.wikidata.org/entity/Q8386,http://www.wikidata.org/entity/Q688
pharmaceutical product,http://www.wikidata.org/entity/Q28885102,http://www.wikidata.org/entity/Q688
chemical substance,http://www.wikidata.org/entity/Q79529,http://www.wikidata.org/entity/Q688
xenobiotic,http://www.wikidata.org/entity/Q409205,http://www.wikidata.org/entity/Q688
medicinal product,http://www.wikidata.org/entity/Q86746756,http://www.wikidata.org/entity/Q688
physical substance,http://www.wikidata.org/entity/Q28732711,http://www.wikidata.org/entity/Q688
chemical entity,http://www.wikidata.org/entity/Q43460564,http://www.wikidata.org/entity/Q688
product,http://www.wikidata.org/entity/Q2424752,http://www.wikidata.org/entity/Q688
goods,http://www.wikidata.org/entity/Q28877,http://www.wikidata.org/entity/Q688
goods and services,http://www.wikidata.org/entity/Q2897903,http://www.wikidata.org/entity/Q688
product,http://www.wikidata.org/entity/Q15401930,http://www.wikidata.org/entity/Q688
concrete object,http://www.wikidata.org/entity/Q4406616,http://www.wikidata.org/entity/Q688
spatial entity,http://www.wikidata.org/entity/Q58416391,http://www.wikidata.org/entity/Q688
object,http://www.wikidata.org/entity/Q488383,http://www.wikidata.org/entity/Q688
spatio-temporal entity,http://www.wikidata.org/entity/Q58415929,http://www.wikidata.org/entity/Q688
output,http://www.wikidata.org/entity/Q1150771,http://www.wikidata.org/entity/Q688
artificial entity,http://www.wikidata.org/entity/Q16686448,http://www.wikidata.org/entity/Q688
entity,http://www.wikidata.org/entity/Q35120,http://www.wikidata.org/entity/Q688
result,http://www.wikidata.org/entity/Q2995644,http://www.wikidata.org/entity/Q688
consequence,http://www.wikidata.org/entity/Q733541,http://www.wikidata.org/entity/Q688
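
Because the result has one row per (part type, super type) pair, it is easier to read once grouped by part type. The following is a minimal sketch of that grouping, assuming the result is available as comma-separated text with the header shown in the result; the `SAMPLE` string holds only four of the rows, and `labels_by_part_type` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

# Four rows taken from the query result, written comma-separated.
SAMPLE = """part_super_typeLabel,part_super_type,part_type
living organism class,http://www.wikidata.org/entity/Q21871294,http://www.wikidata.org/entity/Q902010
taxon,http://www.wikidata.org/entity/Q16521,http://www.wikidata.org/entity/Q25419
living organism class,http://www.wikidata.org/entity/Q21871294,http://www.wikidata.org/entity/Q25419
chemical element,http://www.wikidata.org/entity/Q11344,http://www.wikidata.org/entity/Q688
"""


def labels_by_part_type(csv_text):
    """Collect the distinct super-type labels found for each part type."""
    grouped = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row["part_type"]].add(row["part_super_typeLabel"])
    return grouped


grouped = labels_by_part_type(SAMPLE)
for part_type, labels in sorted(grouped.items()):
    print(part_type, sorted(labels))
```

Using a set per part type also collapses the repeated rows that appear when Wikidata reaches the same super type by more than one path.
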