Created
March 20, 2010 14:18
Data.gov.uk verification preliminary results
Received: by 10.223.112.7 with HTTP; Sat, 20 Mar 2010 07:17:28 -0700 (PDT)
Date: Sat, 20 Mar 2010 14:17:28 +0000
Subject: Data.gov.uk format verification preliminary results
From: Tom Morris <tom@tommorris.org>
To: uk-government-data-developers@googlegroups.com
Content-Type: text/plain; charset=ISO-8859-1

Sorry it has taken so long, but here are the aggregate results of the
data.gov.uk format verification exercise.

HTML - 252
XML - 5
Word - 4
RTF - 1
OpenOffice - 1
Something odd - 85
JSON - 9
Nothing there! - 190
CSV - 12
Multiple formats - 1211
PDF - 468
RDF - 10
Excel - 408
TOTAL - 2656
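
As a quick check on the tallies above, here is a minimal Python sketch that re-adds the counts and expresses each category as a share of the total. The counts are copied from the list; the names `counts` and `shares` are illustrative and not part of the original exercise.

```python
# Tallies copied from the list above. The names used here are
# illustrative only; they are not from the original exercise.
counts = {
    "HTML": 252,
    "XML": 5,
    "Word": 4,
    "RTF": 1,
    "OpenOffice": 1,
    "Something odd": 85,
    "JSON": 9,
    "Nothing there!": 190,
    "CSV": 12,
    "Multiple formats": 1211,
    "PDF": 468,
    "RDF": 10,
    "Excel": 408,
}


def shares(tallies):
    """Return each category's share of the total, as a percentage."""
    total = sum(tallies.values())
    return {name: 100 * n / total for name, n in tallies.items()}


if __name__ == "__main__":
    # Re-add the counts, then list categories from largest to smallest share.
    print("TOTAL -", sum(counts.values()))
    ranked = sorted(shares(counts).items(), key=lambda kv: kv[1], reverse=True)
    for name, pct in ranked:
        print(f"{name} - {pct:.1f}%")
```

The percentages only describe how entries were categorised by volunteers, so they inherit any miscategorisation in the underlying tallies.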

Sadly, this is over-optimistic. I've manually checked some of the data
that has been categorised as JSON and RDF. Most of it is not actually
correctly categorised - either people clicked, say, 'RDF' when they
meant to click 'PDF', or they have seen an RSS or Atom feed and
categorised it as RDF.

What this admittedly imperfect dataset is basically saying is that the
vast majority of the 'data' on data.gov.uk is not actually
machine-readable data but human-readable documents.

I'll publish the complete dataset later.

--
Tom Morris
<http://tommorris.org/>