
@padraic7a
Last active August 29, 2015 14:23

# Expert Panel: Shaping our Legacy: Safeguarding the Social and Cultural Record (Hogan)

Chair: Natalie Harrower, Digital Repository of Ireland @natalieharrower

Brid Dooley, Head of RTE Archives @briddooley

Helen Hockx-Yu, Head of Web Archiving, The British Library @hhockx

Owen Conlan, Assistant Prof., School of Computer Science and Statistics, TCD @oconlan

John McDonough, Director, National Archives of Ireland @mcojo

BD: the archive needs to be in every conversation about new technology, so that decisions around metadata, complex records etc. can be shaped. So "that how it operates in production is reflected in how it is in the archive when it gets there"

JMcD: NA deal with analogue content and digital workflows. 5k of boxed paper archives coming in every year. Want to be involved early on in the lifecycle of the records - more midwives than undertakers.

HH: 150m items in the BL. Think about how web archiving can be used to help other areas collect and catalogue better -> they create automatic metadata records and apply these to the documents, objects etc. they collect while web archiving.

Web archiving is based on certain assumptions, some of which have been proven to be incorrect. This is a path that has been travelled to improve usability. Loads of data with a Google box on top doesn't satisfy everyone - historians, for example, want to know how you selected things, what decisions the algorithms forced and so on. It's important to maximise transparency - so, for example, showing the selection of websites that was excised from the corpus.

OC: interesting to consider how we could log things like web history - for example, could we archive how stories break and people become aware of them - the access paths from Twitter hashtag to widespread distributed knowledge of an event.

Questions

Sandra: question around ethics of archiving as performed in job roles.

BD: 'States of Fear'. Archives had a lot of recorded long form interviews. Tricky to make it accessible however, esp if recordings might be used in a way informants didn't expect. Questions are usually around what has been recorded and what can be kept.

JMcD: records might be subject to data protection, but most material is public. Biggest challenge is long term preservation. This is in the analogue world; when more things go online this will change.

HH: issues around collecting illegal content, viral content. They try not to collect the former but do collect the latter, though they limit access to it. Data Protection issues around individuals.

Martha: 2 kinds of data - content and the data around administrative decisions. Are the two influencing each other to the extent that they become undifferentiated? And how does this affect the describers?

JMcD: but it's a messy world. Separating the dancer from the dance - very hard to discretely separate the content from the delivery mechanisms / technology. If the policies are right the tech can develop from there.

BD: Admin data not highly prized. Things like production decisions are highly valued. [why stage like this?]

HH: the web has text and paratext; it's multi-dimensional, and we shouldn't try to flatten that or compress it.

OC: it is a messy world.
