Notes from Documenting History, Loughborough University, 5-6 September 2016
The following text represents my notes rather than precisely what was said on the day and should be taken in that spirit.
On GitHub
Draft Concordat on Open Research Data:
Research Data are quantitative information or qualitative statements collected by researchers in the course of their work by experimentation, observation, interview or other methods. Data may be raw or primary (e.g. direct from measurement or collection) or derived from primary data for subsequent analysis or interpretation (e.g. cleaned up or as an extract from a larger data set). The purpose of open research data is to provide the information necessary to support or validate a research project's observations, findings or outputs. Data may include, for example, statistics, collections of digital images, sound recordings, transcripts of interviews, survey data and fieldwork observations with appropriate annotations.
Digital Curation Centre data management plan: what data will be created and how, how it will be made available, and what restrictions apply. It is mostly common sense but worth articulating: it isn't exciting, but it is needed. Responsibility is tied to public money.
Why not have a plan? Why not plan out how we go about doing our research? Why not be efficient? But besides the carrots (efficiency!) there are sticks: permissions must fit the circumstances in which you want the data to be used, and mandates may be in place.
And yet: be flexible! The landscape will change! Be aware of differences between international jurisdictions depending on who your collaborators are.
Backup rule of thumb: keep 3 copies of the data, on 2 different types of media, 1 of which is geographically separate from the others.
Be realistic and don't overthink it: data management plans are mostly common sense, combined with ensuring you have the relevant expertise supporting the project (either already at your institution or from new hires).
Good resource: DMPOnline
Definition exercise (on what the DPA, CC licences, copyright, data formats, et al. are): our responses
FAIR (Findable, Accessible, Interoperable, and Re-usable) Data Principles: FORCE11 Website
Aims: 1) define what Linked Data is and what problems it solves; 2) explain what you gain; 3) point to some tools to ease the pain!
Tom Heath and Christian Bizer (2011) Linked Data: Evolving the Web into a Global Data Space (1st edition). Synthesis Lectures on the Semantic Web: Theory and Technology, 1:1, 1-136. Morgan & Claypool. dx.doi.org/10.2200/S00334ED1V01Y201102WBE001 http://linkeddatabook.com/editions/1.0/
The web was designed for people to share (and publish) information, not for machines to share information.
Linked data takes data out of silos.
Give all the things a name: make names for concepts and everything else unique (e.g. John Smith, person, city, London) and connect them with statements such as has_name or is_name. Taken together, these statements form a linked data graph. Finally, we also need to make the meaning of things explicit: assign types to things and put them in a hierarchy.
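A minimal sketch of the idea above in plain Python (not a real RDF library): a linked data graph is just a set of subject-predicate-object statements over uniquely named things. The entity names and predicates (has_name, is_a, lives_in) are illustrative assumptions, not from any real vocabulary.

```python
# A toy linked data graph: each statement is a (subject, predicate, object)
# triple, and every thing has a unique name so statements can be merged.
triples = [
    ("person/1", "has_name", "John Smith"),
    ("person/1", "is_a", "Person"),
    ("person/1", "lives_in", "city/london"),
    ("city/london", "has_name", "London"),
    ("city/london", "is_a", "City"),
    # making meaning explicit: a simple type hierarchy
    ("City", "subclass_of", "Place"),
]

# Because names are unique, statements about "city/london" from different
# sources would land on the same node in the graph.
names = {s: o for (s, p, o) in triples if p == "has_name"}
print(names["city/london"])  # London
```

In a real system the names would be URIs and the vocabulary would come from a shared ontology; the merging-by-unique-name behaviour is the same.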
Albert Meroño-Peñuela, Ashkan Ashkpour, Marieke van Erp, Kees Mandemakers, Leen Breure, Andrea Scharnhorst, Stefan Schlobach, Frank van Harmelen, 'Semantic Technologies for Historical Research: A Survey' SWJ (2012) http://www.semantic-web-journal.net/content/semantic-technologies-historical-research-survey
Summary:
- lots of metadata
- focus on time, geography, people
- vocab/terminologies to describe historical things, processes, events (especially when there are variants)
"Ways in which semantic technologies are being used for historical research - @albertmeronyo #dochist pic.twitter.com/iIA2Jdcj2B"
— Anne Welsh (@AnneWelsh) September 5, 2016
"Different resources created using Linked Data - @albertmeronyo #dochist pic.twitter.com/8NsSZTSD8L"
— Anne Welsh (@AnneWelsh) September 5, 2016
Why?
- efficiency
- better described
- easier to find
- provenance
Three different practical problems
- Creating Linked Data
- Publishing Linked Data
- Accessing Linked Data
The main purpose of CLARIAH is to solve these problems without users needing to write, for example, SPARQL queries.
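To see what CLARIAH is sparing users from: at its core, a SPARQL SELECT query is pattern matching over triples, with variables (written ?name) that get bound to matching values. This is a hedged plain-Python sketch of that idea over hypothetical data, not how any real SPARQL engine is implemented.

```python
# Toy matcher mimicking the SPARQL query:
#   SELECT ?person WHERE { ?person is_a Person }
triples = [
    ("person/1", "is_a", "Person"),
    ("person/1", "has_name", "John Smith"),
    ("city/london", "is_a", "City"),
]

def match(pattern, graph):
    """Return one binding dict per triple matching the pattern.

    Terms starting with '?' are variables and match anything;
    other terms must match the triple exactly.
    """
    results = []
    for triple in graph:
        if all(term.startswith("?") or term == part
               for term, part in zip(pattern, triple)):
            results.append({term: part
                            for term, part in zip(pattern, triple)
                            if term.startswith("?")})
    return results

print(match(("?person", "is_a", "Person"), triples))
# [{'?person': 'person/1'}]
```

A real query engine adds joins across multiple patterns, filters, and indexes, but the variable-binding model is the same.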
OpenRefine makes LOD from spreadsheets http://openrefine.org/
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Some sections reworked from James Baker, "Preserving Your Research Data," Programming Historian (30 April 2014), http://programminghistorian.org/lessons/preserving-your-research-data
Exceptions: quotations and embeds to and from external sources