@NTerpo
Last active April 9, 2016 14:33
{
  "language": "en",
  "name": "Paris Data",
  "description": "City of Paris Open Data portal",
  "url": "http://opendata.paris.fr/",
  "linked_portals": ["http://data.gouv.fr", "http://data.iledefrance.fr"],
  "data_language": ["fr"],
  "modified": "2016-03-04T13:44:44+00:00",
  "themes": [
    "Culture, Heritage",
    "Education, Training, Research, Teaching",
    "Environment",
    "Transport, Movements",
    "Spatial Planning, Town Planning, Buildings, Equipment, Housing",
    "Health",
    "Economy, Business, SME, Economic development, Employment",
    "Services, Social",
    "Administration, Government, Public finances, Citizenship",
    "Justice, Safety, Police, Crime",
    "Sports, Leisure",
    "Accommodation, Hospitality Industry"
  ],
  "links": [
    {"url": "http://opendata.paris.fr/explore/download/", "rel": "Catalog CSV"},
    {"url": "http://opendata.paris.fr/api/", "rel": "API v1"},
    {"url": "http://opendata.paris.fr/api/datasets/1.0/search?format=rdf", "rel": "Catalog RDF"}
  ],
  "version": "1.0",
  "number_of_datasets": 176,
  "organization_in_charge_of_the_portal": {
    "name": "City of Paris",
    "url": "http://www.paris.fr/"
  },
  "spatial": {
    "country": "FR",
    "coordinates": [
      48.8567,
      2.3508
    ],
    "locality": "Paris",
    "data_spatial_coverage": "a Geojson with the data coverage"
  },
  "type": "Local Government Open Data Portal",
  "datapackages": [
    "http://opendata.paris.fr/explore/dataset/liste_des_sites_des_hotspots_paris_wifi/datapackage.json",
    "http://opendata.paris.fr/explore/dataset/points-de-vote-du-budget-participatif/datapackage.json",
    "http://opendata.paris.fr/explore/dataset/cinemas-a-paris/datapackage.json"
  ]
}
@NTerpo

NTerpo commented Apr 4, 2016

  • publisher: I totally agree with you, we should use FOAF behind the scenes.
  • data_language: the other language field only describes the dataportal.json document itself, whereas data_language describes the language of the data.
  • type: yes, I also believe there may be a lot of discussion about what to include in the list. But that's the whole point of our approach: let usage decide and give experimentation some time.
  • themeTaxonomy: ok :)
  • linked_portals, links: I do understand your point, but I don't exactly want to design a long-term standard. I would prefer a set of common practices: something that's really easy for people and organizations to implement right now, that gives concrete returns right away, and that will be easy to abandon (because it was easy to implement) the day Linked Data is mainstream, or at least once every issue has a better solution. Both fields may become irrelevant, but at least we will have an idea of what publishers want to link to in real life. If the whole document is pushed as a set of common practices and not as a standard, then when publishers implement it (if they want to, which is not certain, haha) they will make a trade-off between how they think they can best distribute their data and how other data portals are handling that document. For now we have consistent and compliant standards, but we don't have a lot of portals describing their catalogs. We have to understand why, and how we can design something they will really use.

@jpmckinney

The best way to understand why a standard isn't being adopted is to ask the potential adopters (with an unbiased questionnaire, methodology, etc.). That said, I don't think the problem is that DCAT is too hard. I think it's that:

  1. Publishers don't know what standards to adopt. When talking to publishers, this is really the most common reason in my experience.
  2. Publishers don't know how to interpret the standard's documentation. The solution to this is to provide good documentation written for implementers for existing standards. The W3C documentation for DCAT is written for RDF experts; a user-friendly, implementation-focused, jargon-free version of those docs would go a long way towards easing adoption.
  3. Publishers are using third-party software that doesn't provide machine-readable catalog metadata out of the box. The solution to this isn't to introduce some new practice – which will similarly not be adopted by those suppliers. The solution is to convince the major suppliers to implement a common standard (like DCAT).

In short we need:

  1. Awareness building
  2. Better documentation
  3. Vendor adoption

Creating some new format is not a solution for any of those. I really don't believe the problem is, "DCAT is hard." Let's at least validate what the real problems are before investing time and effort into a solution. Does that make sense?

@jpmckinney

Anyway, for better alignment with DCAT please change:

  • name -> title
  • url -> homepage
  • version -> conformsTo and change the value to the URL for the documentation of this format (which should be a versioned web page)
  • organization_in_charge_of_the_portal -> publisher
    • url -> homepage
  • spatial -> make the value an actual GeoJSON feature, so:
{
  "type": "Feature",
  "geometry": {
    "type": ...,
    "coordinates": ...
  },
  "properties": {
    "name": "Paris",
    "country": "FR"
  }
}
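
For illustration, the renames listed above could be applied to the gist's JSON roughly like this. It is only a sketch: the function name `align_with_dcat` and the `spec_url` parameter are hypothetical, the geometry type is assumed to be a `Point` (the comment above leaves it open), and the existing `coordinates` are assumed to be in latitude, longitude order, so they are swapped to the longitude, latitude order that GeoJSON expects.

```python
import copy


def align_with_dcat(portal, spec_url):
    """Return a copy of a dataportal.json dict with the suggested renames applied.

    `spec_url` is a hypothetical URL for the versioned documentation page.
    """
    out = copy.deepcopy(portal)

    # name -> title, url -> homepage
    if "name" in out:
        out["title"] = out.pop("name")
    if "url" in out:
        out["homepage"] = out.pop("url")

    # version -> conformsTo, whose value is the documentation URL
    if "version" in out:
        out.pop("version")
        out["conformsTo"] = spec_url

    # organization_in_charge_of_the_portal -> publisher (with url -> homepage)
    if "organization_in_charge_of_the_portal" in out:
        publisher = out.pop("organization_in_charge_of_the_portal")
        if "url" in publisher:
            publisher["homepage"] = publisher.pop("url")
        out["publisher"] = publisher

    # spatial -> GeoJSON Feature.
    # Assumptions: Point geometry; source coordinates are [lat, lon].
    # Other keys of the old spatial object (e.g. data_spatial_coverage)
    # are not carried over in this sketch.
    spatial = out.get("spatial")
    if isinstance(spatial, dict) and "coordinates" in spatial:
        lat, lon = spatial["coordinates"]
        out["spatial"] = {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {
                "name": spatial.get("locality"),
                "country": spatial.get("country"),
            },
        }
    return out
```

The input dict is left untouched, so the original and the DCAT-aligned version can be compared side by side.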

@ColinMaudry

I've created a draft JSON-LD fork in order to actually enable round tripping with RDF: https://gist.github.com/ColinMaudry/5163ecade149a837aa25694fdd7ac46f. It's still incomplete, but it gives an idea.

And here is how it behaves when the JSON is converted to RDF using the context: http://tinyurl.com/hdza9yp
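
To give a rough idea of what a context does, here is a minimal sketch that attaches a JSON-LD `@context` to the existing keys. The mapping below is an assumption made for illustration (it is not the context from the fork); it only uses well-known Dublin Core, FOAF and DCAT terms, and the helper name `with_context` is hypothetical.

```python
import json

# Assumed key-to-term mapping, for illustration only.
CONTEXT = {
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dcat": "http://www.w3.org/ns/dcat#",
    "name": "dct:title",
    "description": "dct:description",
    "modified": "dct:modified",
    "language": "dct:language",
    # "@type": "@id" tells a JSON-LD processor the value is an IRI, not a string.
    "url": {"@id": "foaf:homepage", "@type": "@id"},
}


def with_context(portal):
    """Return a new dict: the portal description plus a JSON-LD context."""
    doc = {"@context": CONTEXT, "@type": "dcat:Catalog"}
    doc.update(portal)
    return doc


if __name__ == "__main__":
    example = {"name": "Paris Data", "url": "http://opendata.paris.fr/"}
    print(json.dumps(with_context(example), indent=2))
```

Keys that are not listed in the context (for example `number_of_datasets`) would simply be ignored by a JSON-LD processor, which is one reason a complete context matters for round tripping.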

Suggestions if we want to go further in that direction:

  • The type value should either be a keyword that we can resolve to a URI (ex: local-government) or a URI
  • As-is, the themes values cannot be used in a UI in a language other than English. A pity for French data :) Setting up a list of theme URIs would enable multilingual support. As for type, in the JSON, the themes values could either be lower-case keywords in English or URIs
  • I'm not very comfortable with property names in plural form. I assume the plural is a hint that the value is an array.
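
A minimal sketch of the "keyword or URI" idea, assuming a hypothetical base URI (`http://example.org/portal-type/`) and hypothetical helper names `slugify` and `resolve`: values that already look like URIs pass through unchanged, anything else is lower-cased, hyphenated and appended to the base.

```python
import re

# Hypothetical namespace; a real vocabulary would publish its own base URI.
BASE = "http://example.org/portal-type/"


def slugify(label):
    """Lower-case a label and replace runs of non-alphanumerics with '-'."""
    return re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")


def resolve(value, base=BASE):
    """Return value unchanged if it is already a URI, else base + keyword."""
    if value.startswith(("http://", "https://")):
        return value
    return base + slugify(value)
```

With URIs in place, labels in each language could then be attached to the URI rather than stored in the JSON itself.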
