Last active December 19, 2015 11:39
Monday DHOxSS 2013 notes
[live notes, so excuse the errors, omissions and personal perspective]
***This work is licensed under a Creative Commons Attribution 3.0 Unported License.***
#DHOxSS Michael Pidd, What is the value of Digital Humanities
Humanities Research Institute at the University of Sheffield, established in 1992 to support innovative use of technology in humanities research.
Entirely project based and externally funded, with no teaching or faculty support.
Digital Humanities is an essentially practice based activity, not a theoretical one.
Oxford’s practice based approach makes it ideally suited to hosting this workshop.
Skills gained from practice required for DH.
How does the information age relate to the digital humanities? Modern information very similar to the record of our human past: messy.
Humanists are good at analysing this messy data, so are useful to our age. Three examples of why.
Knowledge domains.
DH can bring manuscripts together not only in one place, but in a place where cross-referencing and comparison can occur.
Assembly not basic.
Have to think what is useful for the scholar, which techniques are appropriate, survey the state of the field and proximity to data (here Canterbury Tales digital editions project).
Network analysis, used in humanities to understand relationship information not explicit in the evidence. Type of data work often referred to as web ontologies, though that has got some bad press in recent years.
Preference for Pidd of calling databases, databases. Needs of these sorts of projects challenging: visualising Twitter feeds ‘child’s play by comparison’.
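[A minimal sketch of the kind of network analysis mentioned above: relationships that no single record states explicitly can surface when people who appear in the same record are linked. The records and names below are invented for illustration and are not drawn from any of the projects discussed.]

```python
from collections import Counter
from itertools import combinations


def co_occurrence_edges(records):
    """Count how often each pair of names appears together in a record."""
    edges = Counter()
    for names in records:
        # Sort and de-duplicate so each pair is counted once per record
        # and always stored in the same order.
        for a, b in combinations(sorted(set(names)), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical records: each list holds the people named in one document.
records = [
    ["Alice", "Bob"],
    ["Bob", "Carol"],
    ["Alice", "Bob", "Carol"],
]

edges = co_occurrence_edges(records)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

The edge weights are only co-occurrence counts; deciding whether a link is meaningful remains a scholarly judgement.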
Locating London’s Past. Bringing datasets together, from old maps to street view!
Complexity of doing this: collaboration with Museum of London, so map had to be warped to fit; which King’s Street is being talked about at any one time?; where on the Strand did the crime occur?
To achieve a good data mashup, you need a limited number of data sets which you know well.
Data curation.
Humanities impact through creation of research standards which help organise modern data.
Without standards we encounter problems of longevity.
Hiding poor OCR under a resource causes more problems than benefits, eg John Johnson Collection or British Library newspapers: billions of words, around a third of which are wrong.
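[A rough sketch of how the proportion of wrong words in OCR output could be estimated when a corrected transcription of a sample is available. The two sentences are invented examples, and the function name is hypothetical; this is not how either collection measured its error rate.]

```python
from difflib import SequenceMatcher


def word_error_rate(reference, ocr_output):
    """Share of reference words that the OCR output fails to reproduce."""
    ref_words = reference.split()
    ocr_words = ocr_output.split()
    matcher = SequenceMatcher(a=ref_words, b=ocr_words, autojunk=False)
    # Sum the lengths of all runs of words that match in order.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return 1 - matched / len(ref_words)


# Invented sample: a hand-corrected line and its OCR rendering.
reference = "the quick brown fox jumps over the lazy dog"
ocr_output = "tbe quick brown f0x jumps over the 1azy dog"

print(round(word_error_rate(reference, ocr_output), 2))
```

Checking a small hand-corrected sample like this gives an estimate that can be shown to users rather than hidden beneath the page images.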
Hartlib Papers made obsolete before it was even published because of updates to commercial software it relied on.
Sustainability can be ensured through good data curation, use of open standards, open software.
User testing and data curation for historical resources similar to those required for B&Q or John Lewis websites.
Investment in data during the Old Bailey project ensured its longevity, both as a dataset and as a resource. But much more expensive.
Project management.
Most important aspect of digital humanities. If management is bad, project fails. We learn from failures. Overambition often the key problem: early projects tried to do too much tagging, transcribing.
As a result, other sectors and the economy in general benefit from work done in the digital humanities.
***First reflection***
Why so project/service based? Surely DH is about understanding historical phenomena using digital techniques, not just serving up data in usable formats? [for response see]
Intro to Web Science
Cross over between web science and DH.
Web science: the impact of the web on humans, and making sure changes are pro-human. Most students come from backgrounds outside CS.
William Fyson:
Scholarly communication [do we need something on this within the training programme?
Ideas on disconnect between web world and academic tradition … though how is this relevant to the library? academic physical diminishing? … dispel myths about Gold]
Innovation and Regulation on the Web
People need to be rewarded, how do we go about this on the web?
Maximising innovation. Deregulation can revolutionise, opening up opportunities for innovation.
Control of unwanted behaviour to avoid total freedom.
How/why regulate the web. Why? - required to ensure that there are incentives for innovation (eg that innovations are not stolen).
Reuben Binns: CC
[start with what is licensing, then go into CC > SA as a ‘viral license’ (I like that); problem of NC; conflict of SA and ND: mutually exclusive]
Open data: if you want your data mined and gathered into a larger dataset, a waiver is more appropriate than a license.
Open research data handbook.
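[A small sketch of the CC license logic noted above, assuming the four standard elements BY, SA, NC and ND: SA is 'viral' because adaptations must carry the same license, and SA cannot be combined with ND because ND does not permit sharing adaptations at all. The function names are made up for illustration and this is not legal advice.]

```python
CC_ELEMENTS = {"BY", "SA", "NC", "ND"}


def is_valid_combination(elements):
    """Check whether a set of CC elements forms one of the six standard licenses."""
    elements = set(elements)
    if not elements <= CC_ELEMENTS:
        return False
    if "BY" not in elements:
        # All six current CC licenses require attribution.
        return False
    # ShareAlike governs adaptations; NoDerivatives forbids sharing them.
    return not ("SA" in elements and "ND" in elements)


def allows_shared_derivatives(elements):
    """ND works may not be shared in adapted form."""
    return "ND" not in set(elements)


def derivatives_inherit_license(elements):
    """The 'viral' part: SA adaptations must be shared under the same terms."""
    return "SA" in set(elements)


print(is_valid_combination({"BY", "SA"}))
print(is_valid_combination({"BY", "SA", "ND"}))
```

A waiver such as CC0 sits outside this scheme, which is why it suits data intended for mining and aggregation.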
PM workshop: debating DH
What is ‘data’? Is this a cultural change that has to happen in H (sources > data) for DH to be accepted?
[workshop questions... barriers to using the web for DH? main stakeholders and their motivations? opportunities? … good set of questions for a ‘What is DH?’ workshop]
Data does not equal fact.
Contested definitions...
Research Gate
GitHub: data and structure and collaboration.
Thomas Padilla @thomasgpadilla showed us Viewshare (can take CSV, MODS [exciting!], JSON)