By Charles Roper
charles.roper@sxbrc.org.uk
http://sxbrc.org.uk
http://twitter.com/charlesroper
2014-09-11 GIGL, London
- Creative Commons
- The ODI
- The ODI Self-Certification
- Open Knowledge Foundation
- 5 Star Open Data
- GDS Design Principles
- Government Service Design Manual
- Piracy is Progressive Taxation, and Other Thoughts on the Evolution of Online Distribution - Article by Tim O'Reilly from 2002, a year before the iTunes Store was launched. Very prescient and very relevant to some of the challenges we face today with online distribution of data.
- Showing you this map of aggregated bullfrog records would be illegal - Peter Desmet post.
- Analyzing the licenses of all 11,000+ GBIF registered datasets - Peter Desmet post.
- Story behind the online 'Fracking Map'
- The Future of Data Sharing - My article for Adastra, the Sussex biological recording annual review, published January 2014.
- The Economics of Open Data
- How to Make the Business Case for Open Data
- Open Data Business Models
- 7 business models for linked data
- Open Data for Elections
- The ODI on Piketty - a great article on how merely "sharing" data is not enough. To enable confidence in data-use, we need to make it open as per the definition.
- Book: Beyond Transparency — note, available for sale in print and also as a free download. Full text of the book is also on GitHub so that anyone may suggest edits, fork the text, remix, etc.
- Open: How we'll work, learn and live in the future
- [Book: Open Data Now](http://www.opendatanow.com/book-open-data-now/)
- The Data Revolution
- Paper: Realizing Lessons of the Last 20 Years: A Manifesto for Data Provisioning & Aggregation Services for the Digital Humanities (A Position Paper) - an excellent paper that is highly relevant to natural history data. This links to an introductory blog post by Rod Page.
- Deloitte: Stimulating demand for open data in the UK
- Deloitte: Driving growth, ingenuity and innovation
- The ODI guides - lots of useful guides about various aspects of open data
- The Open Data Handbook - A handbook available as HTML or PDF that discusses the legal, social and technical aspects of open data
- OpenGeo Suite
- CartoDB
- CartoDB Vision - "The future of geo isn’t a single app with hundreds of buttons. The future of geo is hundreds apps with a single button." Note, CartoDB is developed by a company called Vizzuality, who specialise in citizen science and environmental data and mapping. I first became aware of them at the eBiosphere conference.
- Mapbox
- Storymaps
- Mode - Collaborative online data upload and analysis
- Ordnance Survey On-demand - This is OS's commercial subscription WMS/WFS service. Much of the data here is open data and freely available, but OS still sells a convenient, easy-to-use, plug-and-play service to save the considerable hassle associated with manual download and update of their various datasets. Look at the pricing page for pricing structure ideas.
- ScapeToad - Cartogram software
- Miso - makes it super-easy and affordable to upload, manage and serve INSPIRE compliant data - £60 per dataset.
- Miso Datapublisher - Specifics on the INSPIRE publishing part of the Miso service.
- Spreadsheet listing datasets eligible for INSPIRE - This shows which datasets are mandatory and which are optional, and where the responsibility for those datasets lies (county, district, unitary).
- Local Government Association (LGA) guidance on INSPIRE - this offers the clearest guidance on what is expected of LAs. The information here is much more approachable than the EU INSPIRE guidance and is much more relevant to LRCs.
- Guidance and other tools that support the implementation of INSPIRE in the UK - Note that in the UK, INSPIRE is often referred to as "UK Location Information Infrastructure" or just "UK Location". For Local Record Centres, the guidance under the "2. Data Sharing Operational Guidance" heading is of relevance, particularly "Part 2 - Licensing and Charging"
- UK Location - The UK Location (aka UK INSPIRE) hub
- Linked Data: Evolving the Web into a Global Data Space
- Linked Open Data: The Essentials (free PDF download)
- Understanding Linked Data by Example
- The vision thing - it's all about the links - This is very relevant to the NBN strategy!
- Some design notes on modelling links between specimens and other kinds of data
- DBpedia - A linked data database
- DBpedia Use Cases - Some DBpedia use cases
- DBpedia Datasets - This includes some useful information that demonstrates some of the possibilities of linked data. For instance, you can query out the following:
- Simple example of querying Linked Data - if you're a web developer, view source.
- W3C Linked Data Cookbook
- Linked Data Basics for Techies
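To give a flavour of what querying linked data looks like, here is a minimal sketch that builds a request URL for DBpedia's public SPARQL endpoint. The example resource (American bullfrog), the query itself and the helper name `build_query_url` are my own illustrative assumptions, not taken from any of the resources above. The sketch only constructs the URL; it does not send the request.

```python
from urllib.parse import urlencode

# DBpedia's public SPARQL endpoint.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"

# Illustrative query (an assumption for this sketch): fetch the English
# label of the DBpedia resource for the American bullfrog.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label WHERE {
  <http://dbpedia.org/resource/American_bullfrog> rdfs:label ?label .
  FILTER (lang(?label) = "en")
}
"""


def build_query_url(query, endpoint=DBPEDIA_ENDPOINT):
    """Return a GET URL asking the endpoint for JSON-formatted results.

    Hypothetical helper for illustration only.
    """
    params = {"query": query, "format": "application/sparql-results+json"}
    return endpoint + "?" + urlencode(params)


url = build_query_url(QUERY)
print(url)
```

Pasting the printed URL into a browser, or fetching it with any HTTP client, should return the results as JSON. The point is that every "thing" has a URL, and the query simply follows links between them.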
Slides available on SpeakerDeck.
"Open means anyone can freely access, use, modify, and share for any purpose (subject, at most, to requirements that preserve provenance and openness)." -- http://opendefinition.org/
- The work must be available under an open license, preferably a well-established standard license like Creative Commons, Open Data Commons or the Open Government Licence.
- Must be easily and cheaply accessible; i.e., via internet download.
- Must be in a standard open format; i.e., easy to open and use.
- May or may not require attribution and/or 'share alike'
- Unlike other recent buzzwords like "big data" and "the cloud", open data is a movement that has a very specific meaning and that's a good thing. It provides a focus and limits marketing spin.
- Peter Desmet and other researchers collectively known as Datafable analysed the licenses of all 11000+ datasets (~416 million records) registered on GBIF.
- Only 10% of those datasets (26% of the occurrences) have any license at all, rendering the rest practically useless.
- However, only 1.4% of all datasets (2% of all occurrences) are published with a standard license.
- The net result is that only a tiny proportion of data on GBIF is practically useful.
- Peter downloaded all 13000+ records of American Bullfrog data from GBIF and wanted to plot them on a map for a blog post he was working on. See here.
- Technically, it's easy to do, but the terms we agree to when using GBIF state we MUST observe the terms of the original provider.
- In the case of the Bullfrogs, that involves carefully reading 65(!) license statements. Of those statements, only 4 are standard CC licenses, leaving a whopping 61 bespoke licenses to inspect.
- After considerable work to investigate the licenses, only 4% of the data may be used in a commercial context. If you're a journalist, have adverts on your blog, or are running a blog on behalf of your business, this 4% is all you're allowed to use.
- 28% of the data may be used in a non-commercial setting.
- The remaining 72% cannot be used without first contacting 52 individual institutions and either seeking clarification or asking permission.
- THIS DOES NOT SCALE
- The result is that people will either ignore the data or ignore the fine print. Either way, this is undesirable.
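Why standard licenses scale and bespoke ones don't can be sketched in a few lines. This is a minimal illustration, not Peter's actual method: the list of standard license identifiers and the sample statements below are made up for the example.

```python
# Hypothetical identifiers for standard licenses (lower-cased for matching).
STANDARD_LICENSES = {"cc0", "cc-by", "cc-by-sa", "cc-by-nc", "odc-by", "odbl", "pddl"}


def partition_licenses(statements):
    """Split license statements into standard ones and bespoke ones.

    A standard license can be recognised by a simple lookup; anything
    else has to be read and interpreted by a person.
    """
    standard, bespoke = [], []
    for statement in statements:
        key = statement.strip().casefold()
        if key in STANDARD_LICENSES:
            standard.append(statement)
        else:
            bespoke.append(statement)
    return standard, bespoke


# Made-up sample statements, for illustration only.
statements = [
    "CC-BY",
    "cc0",
    "Contact the curator before any use",
    "Free for research; ask first otherwise",
]

standard, bespoke = partition_licenses(statements)
print(len(standard), "can be checked by machine;", len(bespoke), "need a human to read them")
```

With standard licenses the check is a lookup that works just as well for 11,000 datasets as for four. Every bespoke statement is a separate reading job.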
Source article from 2002: Piracy is Progressive Taxation, and Other Thoughts on the Evolution of Online Distribution.
That is, the greater challenge is not that people will take our data and use it without permission or payment, but that people do not know our data exists. O'Reilly argues that piracy is a kind of progressive tax: the more exposure data gets and the more need there is for it, the more it will be used illegitimately, but legitimate use goes up too. We should be trying to solve the problem of obscurity rather than limiting use.
Example: at a recent planning conference, lots of the planners had never heard of us, didn't know how to access our data, and wouldn't know what to do with it or how to interpret it if they did. This is an opportunity!
Clients will generally do the right thing in exchange for a fair price and a convenient service. Look at iTunes, Spotify and Netflix vs the illegal file sharing networks.
- O'Reilly is a great example of this. Despite it being incredibly easy to download pirated copies of their books from the likes of it-ebooks.info, their online ebook subscription service - which gives access to their full catalogue - is a huge success. Even more interestingly, among the books they publish online, in full or in part, as web pages or as part of the documentation associated with programming languages, the regular printed editions are some of their most popular. People will pay for convenience, packaging and quality.
- iTunes is now the biggest music retailer in the world.
"Free" is eventually replaced by a higher-quality paid service. Again, look at something like Spotify, Ordnance Survey, BGS.
- This is known as the freemium business model.
- It can increase exposure and enhance reputation.
- It invites greater participation and catalyses further data generation.
- It invites scrutiny.
- It is far better to have data out there and being used than sitting unused in obscurity.
- We should make more of the unique services and skills we can offer.
- Open data will shed light on these services and demonstrate capability.
- We can make use of our own open data - and others' - in services such as planning screening.
- Crucially, we could open a large proportion of our data while keeping the "premium" data back in order to fund operations and reinvestment - freemium.
- Package the data so that it is easy to use and hard to ignore.
- Make the premium stuff exceptionally high quality; add value.
- Require that freely available data is share-alike; clients can 'buy' private use if they don't want to share their work.
- Innovate: there are over 8000 government datasets available on data.gov.uk with a value estimated at £16 billion to the economy. Make use of this data. Become part of that ecosystem.
- A business case guide from The ODI.
At the very least have a position, a policy or a strategy to incorporate open data into your business.
- What can I safely open? Experiment!
- At the start of every project, consider if the output can be open data either in part or in whole.
- Go through The ODI's self-certification process.
- Work with small amounts of data first. Target, trial, repeat.
- Get with the movement! There is kudos attached, recognition, funding pots available, and a common vocabulary, understanding and set of values. It's a way of positively promoting data and its value.