@charlesroper
Last active August 29, 2015 14:01
Biodiversity data is not subject to copyright

Biodiversity data is not subject to copyright?

Charles Roper - Sussex Biodiversity Record Centre - 2014-06-11

There is some debate within the biodiversity informatics community as to whether species observation data is subject to copyright. It is true that data representing facts is not subject to copyright. Exceptions would be images, video and original text, such as commentary. A sufficiently original selection and arrangement of data, such as an atlas or flora, would also be subject to copyright, but the copyright would exist in that arrangement rather than in the data itself (think of notes in a composition, or words in a book - the component parts are not subject to copyright, but their selection and arrangement is). Data that has involved skill and knowledge in its creation and/or verification would be subject to copyright. In short, it's a grey area. I've been doing some research on the matter and have dug up some useful reading. I'd recommend that anyone involved in managing biodiversity data gets acquainted with the topic.

First, here's a good example of the problem:

Here is a well-researched (yet incomplete, in my opinion) guide to copyright and licensing datasets:

That guide is incomplete because it does not cover "database rights", also known as sui generis database rights. Database rights essentially protect the compiler of a database, and the investment they have made in its creation, even where the data itself is not subject to copyright. Database rights are very similar to copyright in practice and can exist irrespective of whether copyright exists. A clear analysis of EU Directive 96/9/EC, which provides background, rationale and case studies, can be found here:

The general feeling among the global biodiversity informatics community, as far as I can make out, is to publish biodiversity data under CC0 (that is, an explicit waiver of copyright), coupled with 'community norms' (example). I haven't as yet found any strong counterarguments. Here are some examples of those favouring CC0:

So CC0 is what GBIF is now suggesting as the way forward, and it is asking for feedback. Everything I have read thus far is partially flawed, inasmuch as it either does not consider, or does not appear aware of, sui generis database rights, or it seems to assume that only the structure of a database is afforded protection and not its content (which is a misunderstanding, to the best of my knowledge). I'm not saying the conclusion of recommending CC0 is necessarily wrong, but sui generis rights need to have been fully considered. It also seems as if the decision is being judged through the needs and drivers of the scientific community, with no consideration for those working with the commercial, government, NGO, non-profit or social sectors.

How do we feel about the idea of publishing our databases, or portions of them, as CC0? Would this kill our business/funding models? What do our steering groups and boards think of this move toward more straightforward, open and legally unencumbered data publishing? How could they support this move? How do we adapt? Is now the time to start pushing for full funding and an abandonment of charging for data (not that we charge for data now)? How will recorders react to the suggestion that their data is not actually subject to copyright? Should we also consider the path of 'community norms'?
