@tommorris
Created March 20, 2010 14:18
Data.gov.uk verification preliminary results
Received: by 10.223.112.7 with HTTP; Sat, 20 Mar 2010 07:17:28 -0700 (PDT)
Date: Sat, 20 Mar 2010 14:17:28 +0000
Subject: Data.gov.uk format verification preliminary results
From: Tom Morris <tom@tommorris.org>
To: uk-government-data-developers@googlegroups.com
Content-Type: text/plain; charset=ISO-8859-1
Sorry it has taken so long, but here are the aggregate results of the
data.gov.uk format verification exercise.
HTML - 252
XML - 5
Word - 4
RTF - 1
OpenOffice - 1
Something odd - 85
JSON - 9
Nothing there! - 190
CSV - 12
Multiple formats - 1211
PDF - 468
RDF - 10
Excel - 408
TOTAL - 2656
Sadly, this is over-optimistic. I've manually checked some of the data
that has been categorised as JSON and RDF. Most of it is not actually
correctly categorised - either people clicked, say, 'RDF' when they
meant to click 'PDF', or they have seen an RSS or Atom feed and
categorised it as RDF.
What this admittedly imperfect dataset suggests is that the vast
majority of the 'data' on data.gov.uk is not machine-readable data
but human-readable documents.
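
As a rough check on the figures above, here is a minimal Python sketch that groups the tallies and computes shares. The counts are copied from this message; the grouping into machine-readable, document-like and unclassifiable is an assumption made for illustration only, and the names (MACHINE_READABLE, UNCLASSIFIABLE, summarise) are hypothetical.

```python
# Counts copied from the aggregate results in this message.
counts = {
    "HTML": 252, "XML": 5, "Word": 4, "RTF": 1, "OpenOffice": 1,
    "Something odd": 85, "JSON": 9, "Nothing there!": 190, "CSV": 12,
    "Multiple formats": 1211, "PDF": 468, "RDF": 10, "Excel": 408,
}

# Assumption: this grouping is illustrative, not part of the original exercise.
MACHINE_READABLE = {"XML", "JSON", "CSV", "RDF"}
# Labels that say nothing about the format actually served.
UNCLASSIFIABLE = {"Multiple formats", "Something odd", "Nothing there!"}


def summarise(counts):
    """Return the total and the per-group sums for the tallies."""
    groups = {"machine_readable": 0, "unclassifiable": 0, "document_like": 0}
    for label, n in counts.items():
        if label in MACHINE_READABLE:
            groups["machine_readable"] += n
        elif label in UNCLASSIFIABLE:
            groups["unclassifiable"] += n
        else:
            # HTML, Word, RTF, OpenOffice, PDF and Excel end up here.
            groups["document_like"] += n
    return sum(counts.values()), groups


total, groups = summarise(counts)
for name, n in groups.items():
    print(f"{name}: {n} ({100 * n / total:.1f}% of all entries)")

# Share among entries tagged with a single, recognisable format.
single = groups["machine_readable"] + groups["document_like"]
print(f"machine-readable share of single-format entries: "
      f"{100 * groups['machine_readable'] / single:.1f}%")
```

Since the manual check found much of the JSON and RDF to be miscategorised, the machine-readable figure from this sketch is best read as an upper bound, and the large 'Multiple formats' group is left unresolved.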
I'll publish the complete dataset later.
--
Tom Morris
<http://tommorris.org/>