drjwbaker/The Scholarly Use of Web Archives

## The Scholarly Use of Web Archives

David Gauntlett

What is the UK web?
How do you capture a living space?


Helen Hockx-Yu

31TB of compressed data collected.
Intention to do this at least once a year.
Not just collecting the web due to our legal mandate, but because we believe in the value of collecting it.
Scholars have a document-centered view of the web: browsing, discovery.
Use the collected web archive as a test bed for how to work with the total UK web.

David Berry

Bring critical approaches to capturing digital archives.
Web archive is part of second wave of data capture: taking account of the born digital.
Disciplinary boundaries problematised by web archives, capturing them and working with them.
Even the digital is historical. Flash is a dying format. How will we look at those materials as the company support disappears?
Shift from a static idea of an archive to the idea of one constantly in motion.
 > web archive highlights the fact that all archives are in motion.
There is of course stuff you can't capture: the dark web, interconnected nature of web media, DRM content.

***
QA... What is scholarship? Web causes big changes to scholarship. Web archive highlights this 'problem'
***

Niels Brügger

The web archived is not the web used.
Libraries and archives are used to making choices when archiving, but with the web there is no original to go back to.
We need a transnational project as the web is not national.
Analytical tools we are used to are not usually applicable to web archives.
Need tools and skills to analyse web archives [ambitious aim, researchers need a hook of familiarity]


***
How do we actually capture apps?
Is the term web archive holding back archiving the web?
Do we want to preserve the ability to play it or how it was played?
***

Michel Hockx

A literary scholar who has become gradually convinced of the need for web archives.
China: more people read literature online than shop.
The Great Firewall? Western obsession that doesn't mean much to the Chinese.
But Chinese writers are fascinated by the problem of survival of their online work.
Problem of footnotes linking to URLs. By the time it is through peer review and published the links are down.
Once we all knew where the texts were (where the originals were), but no longer.
 > [we don't have an original of the web, even when it is archived]
Some people know their online literature is ephemeral, and want it to be so.
Web ephemerality changes the way people are taught to study literature.
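
[One possible mitigation for the footnote problem noted above, as a minimal sketch: cite a dated archive address alongside the live URL. This assumes the Internet Archive's Wayback Machine, which the speakers did not name; the function name `archived_citation` is hypothetical. The sketch only builds the address, it does not check that a capture exists. The Wayback Machine normally redirects to the capture nearest the requested timestamp, if it holds one.]

```python
from datetime import datetime

# Assumption: the Wayback Machine address pattern
# https://web.archive.org/web/<YYYYMMDDhhmmss>/<original URL>
WAYBACK_PREFIX = "https://web.archive.org/web"


def archived_citation(url: str, accessed: datetime) -> str:
    """Build a dated archive address for a URL cited in a footnote.

    `accessed` is the date and time the author consulted the page.
    No network request is made, so the result is not guaranteed to
    resolve to an actual capture.
    """
    stamp = accessed.strftime("%Y%m%d%H%M%S")  # 14-digit timestamp
    return f"{WAYBACK_PREFIX}/{stamp}/{url}"


if __name__ == "__main__":
    # Hypothetical footnote: a page consulted on 1 May 2013 at noon.
    print(archived_citation("http://example.com/poem", datetime(2013, 5, 1, 12, 0, 0)))
```

[A footnote could then give both the live URL and the archive address, so a reader still has somewhere to go if the live link is down by publication.]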

***
The appearance of 404 error pages is a brilliant way of making a point: the web is different, and we need to capture it.
***

Richard Rogers

His favourite web archive is the papal transition from JPII to Pope Benedict.
Nobody is citing web archives [or are they just not citing the web archive as they are supposed to?]
How do we study the web?
1) study the change in how it looks
2) study particular things - such as types of site, e.g. right-wing extremist - to see change in language.
3) study early blogs.
Demise over time of the Google 'Directory'

***
Should we be sad about the demise of a human-organised feature of Google (the 'Directory') and of the rise of the algorithm?
 > raises questions re the role of the information professional.
 > what researchers lose in this is the process, don't know about the algorithm, lost in the machine.
 > we need to teach our colleagues.
 > information professional: role needs to be recast between the researcher and the tech
 >> though the closedness of the algorithm will cause a tension here.
***


***This work is licensed under a [Creative Commons Attribution 3.0 Unported License](http://creativecommons.org/licenses/by/3.0/).***