'Digital History: a (personal) introduction', Philosophisch-Historische Fakultät, University of Bern, 28 April 2014

###Digital History: a (personal) introduction

Notes from an invited talk I gave at Philosophisch-Historische Fakultät, University of Bern, 28 April 2014

The following text represents my notes rather than precisely what was said on the day and should be taken in that spirit. [EDIT 28/04/14: as it happens, I only loosely followed these notes, spent longer than initially planned demonstrating some of the projects discussed, and allowed more time for the task element. As a result, the final section was summarised considerably.]

Slides: http://www.slideshare.net/drjwbaker/2014-0425-bernslides


####Intro

Background of team, multi-disciplinary team with broad skill set S sense of importance of open S more than resource discovery; situate turn toward digital research within a response to external forces S deluge of data et al S libraries increasingly full of data as much as books S new contexts for scholarship in HSS.

S I'm the historian in the team and when thinking about digital history I often go to a founding centre for answers: the Roy Rosenzweig Center for History and New Media (CHNM) at George Mason University, a centre at the heart of the project of digital history: to use digital stuff to do better history. Or, to quote them:

Digital history is an approach to examining and representing the past that takes advantage of new communication technologies such as computers and the Web. It draws on essential features of the digital realm, such as databases, hypertextualization, and networks, to create and share historical knowledge.

Digital history complements other forms of history—indeed, it draws its strength and methodological rigor from this age-old form of human understanding while using the latest technology.

This emphasis on complementing what historians do through the study and practical application of new media has meant that CHNM has spent a lot of time enabling others as much as doing great historical research. They: S

  • developed and supported the open source citation and research manager Zotero.
  • founded The Humanities And Technology Camp, usually referred to as THATCamp, an unconference movement.
  • and worked on Omeka, a platform for publishing online collections and exhibitions.
  • Finally, their former Director Dan Cohen, historian of Victorian spirituality, left to head up Digital Public Library of America - a remarkable service that connects sources and records of American past in new ways.

####Research

S

This does not mean that digital historians don't do proper historical research.

  • S From the history of crime, especially crime in eighteenth-century England and her colonies, have come Old Bailey Online (now 10 years old), London Lives, Connected Histories, and Locating London's Past. Research has come out the back of this, including a 2014 CUP book from Tim Hitchcock and Bob Shoemaker.
    • CLICK Also innovative research, such as that at King's by Adam Crymble.


  • S Newspapers and periodical research a vibrant area of digital research
    • Viral Texts.
    • Melodee Beals (Sheffield Hallam): tracing the direction of cut-and-paste journalism.
  • S Medievalists: Dirty Books - good example of chief being about more than data; uses digital technology to explore past phenomena.
  • S Social History: Digital Harlem (1915-1930) - maps people, events, addresses and sources across early-20th-century Harlem, available on a public website for query, interrogation and new insights through the patterns revealed.

S If my examples sound a little anglophonic this merely represents what I know best. For digital history is spread across a range of centres across the world and, where it sits within Digital Humanities departments, often has a focus on digital editions of texts.

In response to this blending of digital history and digital humanities there has been much useful thinking about what all this digital history means for history.

S Zaagsma:

"[T]he very phrase ‘digital history’ suggests separateness from, or the existence of, ‘non-digital’ historical practice. This seems highly problematic though. Both the idea that ‘digital history’ constitutes a specific sub-discipline, existing next to other historical sub-disciplines such as cultural, social, political or gender history, as well as the idea that it should essentially be seen as an auxiliary science of history, feed into the myth that historical practice in general can be uncoupled from technological, and thus methodological, developments and that going digital is a choice, which, I cannot emphasise strongly enough, it is not."

I'll come back to this theorising towards the end of my lecture. But for now it is sufficient to note in summary that digital history covers a range of historical subject areas and time periods, all with a healthy splash of theorising thrown in, more on which later.


####Approaches

S But what is perhaps more interesting is not the thematic coverage of digital history but the range of approaches used, many of them transformative and innovative, where the digital and computation move from being tools that answer old questions in new ways to enabling the discovery of new trends, new problems, new questions.

#####Corpus Analysis of texts

S Printed texts can be scanned and turned into digital documents a computer can read. The process by which this is done, known as optical character recognition (or OCR for short), isn't perfect, but once completed opens up texts to much more than search across them.

S We can use simple web tools such as Voyant to count word use, seek patterns, and map concurrences in - and across - texts [CLICK to go to Voyant http://voyant-tools.org/?corpus=1398674180389.9728&stopList=stop.en.taporware.txt ]
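To make 'counting word use' and 'mapping concurrences' concrete, here is a minimal sketch in Python using only the standard library. It is not how Voyant itself works: the stop list, the function names and the sample sentences are all assumptions invented for illustration.

```python
import re
from collections import Counter

# Hypothetical stop list for illustration; real tools ship much longer ones.
STOP_WORDS = {"the", "a", "of", "and", "to", "in", "was", "for", "at"}

def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def word_frequencies(text):
    """Count how often each non-stop word occurs."""
    return Counter(t for t in tokenize(text) if t not in STOP_WORDS)

def cooccurrences(text, keyword, window=3):
    """Count non-stop words appearing within `window` tokens of `keyword`."""
    tokens = tokenize(text)
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != keyword:
            continue
        start = max(0, i - window)
        neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
        counts.update(n for n in neighbours if n not in STOP_WORDS)
    return counts

# Invented sample text, loosely in the style of a trial report.
sample = (
    "The prisoner was indicted for stealing a silver watch. "
    "The watch was found on the prisoner at the watch house."
)
print(word_frequencies(sample).most_common(3))
print(cooccurrences(sample, "watch").most_common(3))
```

The point of the window is that it turns a flat word count into a crude picture of which words keep company with each other, which is the sort of pattern a historian would then go back and read closely.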

S We can ask a computer to analyse a text for 'topics', a process known as Topic Modelling, reading from a distance the sorts of things a corpus of text contains. And we can compare those results to similar tests run on other texts.

In short we can apply the methods of linguists to texts to understand much more than just the linguistic structure of a text.

S That said linguistic analysis can still reap rewards. Research led by Magnus Huber at the University of Giessen looked at the Old Bailey Corpus I described previously. As this text was transcribed by humans (with >99% accuracy) it is ideal for corpus linguistics. What Huber did was separate out the sections marked as 'part of speech', that is where the court scribe stated that a person was speaking in court. He then analysed this corpus of 'spoken' English against a model (based on literature, newspapers, songs and the like) of how we think spoken English was formed in the 18th and 19th centuries. He discovered that the part of speech sections of the Old Bailey corpus closely matched our models. In effect then, he has shown that the quotations in the Old Bailey court records are true reflections of spoken English in the 18th and 19th centuries.

S And this work isn't just happening around English language texts. The ASYMENC project I mentioned earlier brings together a group of scholars - historians, linguists, developers - to develop tools so that we can trace 'viral texts' across linguistic barriers: here, German, French, Flemish, using periodicals printed in Luxembourg (here the relatively small volume of works printed key to testing the tools and methods.)
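As a rough idea of how reprinted passages can be traced, the sketch below compares two texts by the runs of three consecutive words ('shingles') they share. This is a generic same-language technique and an assumption on my part, not the method of any project named here; tracing reuse across languages needs considerably more than this. The function names and sample snippets are invented.

```python
import re

def shingles(text, n=3):
    """Return the set of all runs of n consecutive words in the text."""
    tokens = re.findall(r"\w+", text.lower())
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def reuse_score(a, b, n=3):
    """Jaccard overlap of word shingles: 0.0 (nothing shared) to 1.0 (identical)."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

# Invented snippets standing in for newspaper paragraphs.
original = "A dreadful fire broke out last night in the High Street"
reprint = "We learn that a dreadful fire broke out last night in the town"
unrelated = "The price of corn rose again at the market this week"

print(reuse_score(original, reprint))
print(reuse_score(original, unrelated))
```

A high score flags a pair of passages worth reading side by side; it does not by itself say which paper copied which, for which dates of publication are needed.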

#####Geospatial Analysis

S By turning our texts into data we can do more than use a computer to count and read that text. We can also isolate elements of the texts and manipulate them in ways that go beyond language. As Tim Hitchcock writes:

'If each paragraph in the infinite archive, all the trillions of words, is simply a collection of data, it immediately becomes something that can be tied to a series of other things – to any other bit of data. A name, a date, a selection of words, or a phrase […] defined as a polygon on the surface of the earth. In other words, the texts that form the basis for western history can now be geo-referenced and tied directly to a historical / geographical understanding of spatial distribution, which can in turn be cross analysed with any other series of measures of text – textmining makes text available for embedding within a geographical frame.'

S And that is what researchers at the University of Lancaster are doing. They are taking vast corpora of text, turning them into geospatial entities, and analysing these corpora for the spatial patterns they contain. Their work has revealed patterns of the spread of cholera in the north west of England during the 19th century and will soon be looking at the spatial complexities of how communities respond to natural disaster such as floods, droughts, and extreme cold.
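A toy version of 'turning text into geospatial entities' might look like the sketch below: look up place names from a gazetteer in each document and count the mentions against coordinates. This is only an illustration of the general idea, not the Lancaster team's pipeline; the gazetteer is hypothetical, its coordinates are approximate, and the sample sentences are invented.

```python
import re
from collections import Counter

# Hypothetical mini-gazetteer: place name -> approximate (latitude, longitude).
GAZETTEER = {
    "Lancaster": (54.05, -2.80),
    "Kendal": (54.33, -2.75),
    "Carlisle": (54.89, -2.93),
}

def georeference(documents):
    """Return (place, coordinates, mention count) for each place mentioned."""
    mentions = Counter()
    for doc in documents:
        for place in GAZETTEER:
            pattern = r"\b" + re.escape(place) + r"\b"
            mentions[place] += len(re.findall(pattern, doc))
    return [(place, GAZETTEER[place], n) for place, n in mentions.items() if n]

# Invented sentences for illustration only.
docs = [
    "Cholera was reported at Lancaster and Kendal.",
    "Further cases were recorded at Lancaster.",
]
for place, coords, count in georeference(docs):
    print(place, coords, count)
```

Real place names are far messier than this (spelling variants, several places sharing a name), which is where much of the scholarly work lies.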

#####Modelling

S The Virtual Paul's Cross Project, led by John Wall at North Carolina State University, reconstructs the experience of attending an early modern sermon. Paul's Cross, an outdoor space at the side of medieval St Paul's Cathedral (that is, before the Great Fire of London in 1666), no longer exists, so the team had to model the space. They then populated the virtual space with people and other local noise, such as dogs barking and church bells, and modelled how it would sound from 8 different positions in that virtual space with different crowd sizes.

http://vpcp.chass.ncsu.edu/experience/ EXAMPLES - position 7 with 500 and 5000 people. Now see difference at position 2 (posh seats!)

There is more to this than just modelling a lost past experience. They discovered that the arrangement of the space at Paul's Cross ensured that people could hear an unamplified voice pretty well. They found that if the pace of speech was slow, people could hear even better, as fast speech was lost in the reverberations from the rear buildings. They also realised, during the process of adding a bell tolling every 15 minutes, that clerics such as John Donne would have had to time their sermons around the bell, and that around every 15 minutes - and this is a working hypothesis - their sermons reached a climactic moment - just as the bell at medieval St Paul's Cathedral was set to chime. A remarkable discovery enabled by digital research.

#####Theory

S Digital research, and especially North American Digital Humanities, is not without theory. The work of a literature scholar, Franco Moretti, has perhaps been most influential in digital history - his call that we read texts from a distance influencing scholars such as the historian of Victorian Britain Bob Nicholson who writes: S

'Faced with this mountain of print, we have two choices: to continue subjecting tiny fragments of Victorian culture to close reading, or to supplement this approach by exploring a much larger proportion of the archive through 'distant reading'.'

In essence, Nicholson is calling for a pragmatic blending of close and distant, of reading and machine reading, of using the best of our skills as historians and the maximum a computer can now offer by way of assistance.

Perhaps the best recent theoretical work has come from Rens Bod, Professor of Computational and Digital Humanities, whose book 'A New History of the Humanities' (2013 in English, 2010 for the Dutch speakers among you) uses a careful analysis of the humanities from Antiquity to the present to show that the humanities, and in particular history, are all about pattern matching. And so the pattern matching, number crunching, scientific looking work of much digital history isn't that different when put in the context of the history of history as a discipline - it maybe just seems so due to the distorting effect of the cultural turn, more on which shortly.

One final thought is that these digital projects require the use of digital tools that do a number of things: text analysis, mapping, or image analysis. And it is easy to get hung up on a tool, what it does, what it doesn't do, whether learning it is a good use of time. But it is of course not the tool that is important to digital history but the approaches those tools allow. The projects I've described today might start with tools in mind, with a knowledge of what is and isn't possible, but don't see the machine as the answer, as the only way of doing research. S As the great Annalist historian Emmanuel Le Roy Ladurie put it with regards to early computational humanities:

[...] en histoire, comme ailleurs, ce qui compte, ce n’est pas la machine, mais le problème. La machine n’a d’intérêt que dans la mesure où elle permet d’aborder des questions neuves, originales par les méthodes, les contenus et surtout l’ampleur

In history, as elsewhere, what counts is not the machine, but the problem. The machine is only interesting insofar as it allows [us] to tackle new questions that are original because of their methods, content and especially scale

Emmanuel Le Roy Ladurie, ‘L’historien et l’ordinateur’, Le territoire de l’historien (Paris 1973), 11.

...such thinking key to fitting digital history back into the historical profession at large, something digital history has struggled and is still struggling with...


####Exercise

But before we get onto that, you’ll be pleased to hear we now move onto the bit of the session where I stop talking at you.

DIVIDE UP ROOM

Each group of 5-6 will have... WHAT??? a flipchart, some pens, and some cards.

These cards represent hypothetical tools for digital research and hypothetical digital collections – though in both cases they resemble real things.

On each card I have specified the sorts of properties these tools and collections have, and what I'd like you to do for the next 10 minutes is, in your groups, to look at the cards and come up with a potential research project that might be possible. You'll then have 30 seconds or so to pitch a basic research project.

Obviously you’ll need to work quickly, so I’d recommend keeping things high level and not getting too bogged down in detail – though ideally you’d identify some potential pitfalls.

I urge you to proceed with Ladurie in mind; with ideas of the novelty of questions, content and scale, as opposed to technology in and for itself

[for more details on this exercise, see the British Library Digital Scholarship blog http://britishlibrary.typepad.co.uk/digital-scholarship/2014/01/prototyping-task-for-digital-research-novices.html ]


####Fitting In

I'm going to close by looking at a problem: how has digital history fitted in with the discipline of History?

The easy and sobering answer to the question of how all this digital history has fitted in with the discipline of history is not much. In many ways History has been changed little by a turn towards digital research. The digital turn in society, however, has made more of an impression, if on the instrumental components of the historian's practice more so than on research 'proper'. We know historians Google before they go to libraries, conduct research from their laptop that until only recently they had to travel thousands of miles to undertake, send files to each other without concern for geography or time, photograph archives rather than note down (or memorise) everything they read. But this hasn't fundamentally changed what historians write about, rather just made their lives easier.

Of course, there is a false distinction here. Instrumental components of the historian's practice are part of the research proper. As Steven Jones argues, in laying bare its hands-on doing, DH illuminates how theory-laden that doing is. He writes: S

It should not be assumed that, because DH emphasizes practice and making use of computers, it's therefore naively instrumental or positivist in its assumptions, or that its hands-on doing necessarily precludes theory. Only an impoverished view of theory as pure verbal and written discourse, separate from practice, would produce such an assumption (179)

This resonates with the historical method in the digital age, digital history or no. And so one of the interesting questions digital history raises is what is an historian?

A way of thinking about this is to go to our textbooks, the introductory texts we ask undergraduate historians to read when trying to understand what it is to be a historian. Texts you may have read. And when we do this through the lens of digital history (and I ask you to do this after this lecture) we notice something curious.

S Take, for example, John Tosh's The Pursuit of History, first published in 1984 and now, as of 2010, in its 5th edition.

The indexes to the 1st and 2nd editions (1984 and 1991) each include three entries for 'computers'. In both, a whole chapter is dedicated to 'History by Numbers', the growth of which prior to the 1980s Tosh attributes to two factors: a desire to explain more than the history of great men (which turned historians to different sources, many of which needed counting) and the affordability from the 1960s of computers. As Tosh writes, 'both the kind of data it [the computer] could handle and the operations it could carry out were rapidly diversified'. Here then the computer is a labour-saving device yoked to numerical work and statistics, whose importance Tosh notes with a word of caution:

Statistics may serve to reveal or clarify a particular tendency; but how we interpret that tendency - the significance we attach to it and the causes we adduce for it - is a matter for seasoned historical judgement, in which the historian trained exclusively in quantitative methods would be woefully deficient [197, 1st edition]

For certain flavours of today's data-rich digital history to thrive, such ideas and arguments could do with being a standard part of training for new historians. But they are not. Instead, by the 5th edition of Tosh's work (published in 2010, over a decade after the 4th) changes within the profession are reflected in such a way that global, comparative, cultural, and post-colonial history all feature prominently, but quantitative history is reduced to a mere two and a half pages. So just as digital history was kicking in, just as all the data libraries had spent over a decade creating began to be exploited by historians in novel and unexpected ways, and just a year before the Digital Humanities invited all on the fringes into its 'Big Tent', a key textbook in the field of History relegated quantitative history and the skills associated with it to marginal status and removed from the index all references to 'computers'.

Now there are good reasons for this. Quantitative history was in terminal decline. And of course, as we've seen, digital history is about much more than counting and statistics. But there is a knock-on effect here: that as cultural history was in the ascendancy students were not being introduced to research done at scale (in anglophonic textbooks at least, the picture can look different elsewhere, for example in Germany) which - numbers, statistics or no - is a key motif of digital history. And in doing so these textbooks marginalised within students' education knowledge of debates in the 1960s and 1970s around the relationship between big and small history, macro and micro, 'scientific' and humanistic method.

S This is important because digital history can seem 'scientific': the numbers it produces, the coverage it has, the techniques it uses can seem authoritative, correct, final. But of course, as Ladurie and Tosh both suggest, they are not - like all research digital research is conversational, partial, exploratory. And this is where, perhaps, a closer alignment between digital history and the history of science and technology might be fruitful, for as Historians of Science and Technology have been aware for some time the scientific method is nowhere near as scientific, as authoritative, correct, final, as historians often assume. Indeed as the famous Historian of Science Thomas Kuhn argued in 1961 regarding the role of measurement in science: quantitative work comes from qualitative work and theory, scientific data is always approximate, and scientists adapt their measurements to fit their theory. The point is that the history of science and technology can offer something to digital history: a critical apparatus for thinking through the relationship between quantitative work and established patterns of qualitative research; a parallel to the ‘productive play’ that by necessity surrounds much research that uses data and digital methods to explore historical phenomena; and a narrative digital historians can bounce off to erode the lingering fiction – if not in our minds, then in the minds of others – of 'data driven' research being a category of humanistic enquiry: for Kuhn and others remind us that in both scientific and humanistic research sources do not speak for themselves, they are always approached with qualitative baggage in mind, and the combination to varying degrees of theory and source material, be those sources data or archival material, is what defines scholarship.

For digital history to fit in better, these ideas need to be more central to our collective understanding of what history is, of what historians are.

There are other barriers to fitting in I won't have time to touch on today. Digital history tends to produce disruptive outputs: rather than books or articles, often digital history produces things that are online first, that reject print as an effective means of communication, that are aimed at different audiences, and that are - in a sense - always unfinished, partial, iterative, incomplete. Further, digital historians tend to work fluidly across disciplinary boundaries, appealing to both their own discipline and to the digital humanities, spoiling the monolithic structures some university systems - the German research council comes to mind - rather like.


####Summary

S But I don't want to linger on problems because there are real opportunities in digital history: to fulfil the full potential of our sources, to engage with those outside scholarly communities and thus assert our relevance, to ask new questions of the past, and to lay the groundwork for contemporary history to advance post-1996, into the internet age, as that becomes a domain of Historical research. All this encompasses an intellectual turn towards digital history, if at present a radical strand of History often ignored and under-appreciated. There are precedents here for being optimistic that problems of fitting in will be overcome. Think how scholarship of race, gender and culture first broke away, proclaimed themselves as radical breaks from tradition, annoyed some traditionalists, before their lessons and provocations then became part of and were assimilated into the norms of historical research.

Achieving this assimilation is important because digital history is better research when all historians have the skills to meaningfully critique, review and build on that research, even if they do not use digitally-driven methods in their own research. Moreover, History would suffer were every researcher in the discipline to choose to identify themselves not as historians but as Digital Humanists with a big D and a big H. History is better served by those researchers remaining part of and embedded within ‘traditional’ departmental structures, so long as those structures continue to hold sway. This is not to devalue or discredit the work of Digital Humanists or Digital Humanities departments. This is not to say I don’t value the Digital Humanities community. I do. Rather if we think of the Humanities as an umbrella term for a number of academic research disciplines which explore human phenomena using the best tools, methods or approaches a researcher thinks appropriate for the job, then one of the options a Humanities researcher now has is digitally-driven, or data-driven. This is the reality of being a researcher in the digital age. And if History departments haemorrhage digitally-driven researchers to DH-as-discipline, where would that leave History? In short, in a bad way. For it would be a discipline not using as appropriate the best tools for the job. And down that path leads the spectre of obsolescence.

S

S


####Some admin...

This work is licensed under a Creative Commons Attribution 3.0 Unported License.
