Digital Humanities: A brief introduction to the field
1. Digital Humanities
A brief introduction to the field
Dr Anouk Lang
Department of English
University of Edinburgh
@a_e_lang
Thurs 23 July 2015
6-7.30pm
Queen Mary
University of London
#adpsummer
2. “To infinity & beyond”: Where we’re headed
Working with data: where are the pitfalls?
- structured vs. unstructured
Overview of the field
- historical background and debates
Sample projects and tools
- textual, spatial and network analysis
Resources
- summer schools, workshops, teach-yourself tutorials, Twitter
@a_e_lang | anouk.lang@ed.ac.uk | #adpsummer
3. Working with data as a humanities scholar
Ben Schmidt, “Gendered language in teacher reviews”
http://benschmidt.org/profGender
interdisciplinary serious fun
Randall Munroe, “Correlation”, XKCD, http://xkcd.com/552.
5. What are the limitations?
- data collection
- sampling
What is obscured?
- gender of reviewers
- context
- gender of reviewees
- field size
6. Data:
you’re not just using it but producing it
Facebook’s “emotional contagion” study
http://www.pnas.org/content/111/24/8788.full.pdf
Facebook voting study
www.nature.com/nature/journal/v489/n7415/pdf/nature11421.pdf
Homicide Watch homicidewatch.org
And, obviously, NSA/GCHQ/etc
9. Data:
structured vs. unstructured
information that is organised in some way
vs information that comes without a data model
Schmidt’s dataset: partially structured but also in need of some curation
Data from an API, e.g. Twitter data
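A minimal sketch of the distinction, using invented review snippets (not drawn from Schmidt’s dataset). The same information appears first as free text and then as records with an explicit data model. The field names, the `to_record` helper and the regular expression are illustrative assumptions.

```python
import re

# Unstructured: free text with no explicit data model (invented examples).
unstructured = [
    "Review of a history lecturer (female): brilliant but disorganised.",
    "Review of a physics lecturer (male): funny, a tough marker.",
]

# A hypothetical pattern that imposes a data model on the text above.
PATTERN = re.compile(r"Review of an? (\w+) lecturer \((\w+)\): (.+)")

def to_record(line):
    """Turn one free-text line into a structured record (a dict)."""
    match = PATTERN.match(line)
    if match is None:
        return None  # lines that do not fit the model need manual curation
    field, gender, text = match.groups()
    return {"field": field, "gender": gender, "text": text}

structured = [to_record(line) for line in unstructured]

# Once structured, the data can be queried by field name.
female_reviews = [r["text"] for r in structured if r and r["gender"] == "female"]
print(female_reviews)
```

Lines that fail to match return `None`, which is where the "in need of some curation" step would come in.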
11. Humanities data: often unstructured
Image from Flickr: Jason Weinberger, “Mahler Symphony
5, IV Adagietto [page 15]”, CC BY 2.0 licence.
12. Jad Abumrad and Robert Krulwich, “Vanishing Words”,
RadioLab, www.radiolab.org/story/91960-vanishing-words/.
13. Concordancing software: AntConc (Laurence Anthony)
www.laurenceanthony.net/software/antconc/
Query 1: all instances of look as a simple text string
16. Query 3: all instances of look as a verb (look_VV*)
followed by a preposition (*_II), then sorted 1R, 2R
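What these two queries do can be sketched in a few lines of Python. This is not how AntConc itself is implemented. It assumes a text tagged in word_TAG format, with VV* marking lexical verbs and II marking prepositions as the query syntax above implies; the sample sentence and the function names are invented for illustration.

```python
# Invented sample in word_TAG format (tags are assumptions for illustration).
tagged = (
    "we_PPIS2 look_VV0 at_II maps_NN2 and_CC the_AT look_NN1 of_IO them_PPHO2 "
    "then_RT they_PPHS2 looked_VVD up_RP and_CC you_PPY look_VV0 after_II books_NN2"
)
tokens = [tuple(tok.rsplit("_", 1)) for tok in tagged.split()]

def query_string(tokens, needle="look"):
    """Query 1: every token containing the plain string, whatever its tag."""
    return [i for i, (word, _) in enumerate(tokens) if needle in word.lower()]

def query_verb_prep(tokens, word_form="look"):
    """Query 3: the form 'look' tagged VV*, immediately followed by an II tag."""
    hits = []
    for i in range(len(tokens) - 1):
        word, tag = tokens[i]
        next_tag = tokens[i + 1][1]
        if word.lower() == word_form and tag.startswith("VV") and next_tag == "II":
            hits.append(i)
    return hits

def kwic(tokens, hits, width=2):
    """Keyword-in-context lines sorted by the 1st, then 2nd, word to the right."""
    def right(i, n):
        return tokens[i + n][0].lower() if i + n < len(tokens) else ""
    lines = []
    for i in sorted(hits, key=lambda i: (right(i, 1), right(i, 2))):
        left = " ".join(w for w, _ in tokens[max(0, i - width):i])
        rgt = " ".join(w for w, _ in tokens[i + 1:i + 1 + width])
        lines.append(f"{left:>15} [{tokens[i][0]}] {rgt}")
    return lines

print(len(query_string(tokens)), "hits for the plain string")
for line in kwic(tokens, query_verb_prep(tokens)):
    print(line)
```

The plain-string query also picks up the noun "look" and the form "looked", while the tagged query keeps only the verb followed by a preposition. That difference is the reason for tagging the text before querying it.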
19. Data:
always contingent, never objective
Johanna Drucker & the concept of ‘capta’
what kind of data curation is necessary?
who else has come up with categories/data models?
think about how to capture & structure your data early
20. Overview of the field: Definitional skirmishes
Digital Humanities is a field of study in which scholarly
applications of technology are used to perform analyses and
generate insights that would be difficult or impossible to achieve
without the help of technology.
“digital humanities is more akin to a common methodological
outlook than an investment in any one specific set of texts or
even technologies”. (Matthew Kirschenbaum)
23. Historical antecedents: Humanities Computing
Roberto Busa, IBM & the Index Thomisticus
Livia Canestraro, one of the female punchcard operators for the Index Thomisticus.
CC-BY-NC, license by permission of CIRCSE Research Centre, Università Cattolica del Sacro Cuore, Milan.
Via Melissa Terras, melissaterras.blogspot.co.uk/2013_10_01_archive.html
24. Disciplinary antecedents
• corpus linguistics, computational linguistics & NLP
• GIS (Geographic Information System / Science)
• within History, Cliometrics
• others …
25. Readings giving historical background
• Kirschenbaum, Matthew G. ‘What Is Digital Humanities and
What’s It Doing in English Departments?’ ADE Bulletin 150
(2010): 1–7.
http://mkirschenbaum.files.wordpress.com/2011/01/kirschenbaum_ade150.pdf.
• Liu, Alan. ‘The Meaning of the Digital Humanities’. PMLA
128.2 (2013): 409–423. http://www.jstor.org/stable/23489068.
• Hockey, Susan. ‘The History of Humanities Computing’. In
Susan Schreibman, Ray Siemens and John Unsworth, eds., A
Companion to Digital Humanities (Oxford: Blackwell, 2004).
http://www.digitalhumanities.org/companion/.
26. Some broad debates and tensions in the field
• from outside the field: too empiricist, too positivistic, too
uncritical of the use of computers
• from within the field: not sufficiently
statistically/algorithmically literate, use of black boxes
• too apolitical: where are race, gender, & identity?
• too focused on literature
• “you’re not a real digital humanist unless you can code”
• “more hack, less yack”
27. Examples of projects and tools
Image from In the Forbidden Land: An account of a journey in
Tibet ... With a map and two hundred and fifty illustrations (1898), p.154.
From the British Library’s Flickr collection of images in the public domain
Textual analysis
Mapping
Network analysis
28. Textual analysis: Topic modelling
0 day lydia dear replied felt cried aunt hear uncle charlotte
1 wickham made till evening added world knew married father visit
2 lady man young catherine brother ladies happiness half friends settled
3 make great give hope thought pleasure present general affection conversation
4 time sister mother love feelings ill speak leave meryton life
5 mr darcy bingley miss collins mind london civility convinced feeling
6 mrs bennet family long gardiner morning town found character coming
7 elizabeth jane letter longbourn happy answer kind left kitty reason
8 good friend house lizzy subject sisters father netherfield told home
9 room manner daughter heard sir moment looked woman immediately began
For more on topic modelling, start at Vol. 2 issue 1, Journal of Digital Humanities:
http://journalofdigitalhumanities.org/2-1/dh-contribution-to-topic-modeling/
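For readers curious about how numbered word lists like these are produced, here is a toy sketch of one common approach: latent Dirichlet allocation fitted by collapsed Gibbs sampling. The slides do not say which tool or settings generated the topics shown, so this is not that method. The `lda_gibbs` function, the hyperparameters and the tiny invented documents are all assumptions for illustration.

```python
import random
from collections import Counter

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA; returns word counts per topic."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]        # topic counts per document
    topic_word = [Counter() for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics                       # tokens per topic
    assign = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(z_doc)
    # Repeatedly resample each token's topic given all the other tokens.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return topic_word

# Invented mini-corpus; real use needs far more text and stop-word removal.
docs = [
    "letter ball dance letter ball".split(),
    "dance ball letter dance".split(),
    "estate income marriage estate".split(),
    "income marriage estate income marriage".split(),
]
topics = lda_gibbs(docs, n_topics=2)
for k, counts in enumerate(topics):
    top = [w for w, c in counts.most_common(3) if c > 0]
    print(k, " ".join(top))
```

Each printed line is a topic number followed by its most frequent words, the same shape as the lists on the slide. The topics are only word clusters; interpreting them remains the scholar's job.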
29. Textual analysis: Stylometry
Authorship attribution: “the science of inferring characteristics of the author
from the characteristics of documents written by that author” (Juola 2006).
Deciphering The Dynamiter
thedynamiter.llc.ed.ac.uk
30. Textual analysis: Stylometry
Deciphering
The Dynamiter
thedynamiter.llc.ed.ac.uk
green = Fanny
black = Fanny
black = Robert
orange = Robert
red = authorship uncertain
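One simple way to infer authorship from "the characteristics of documents" is to compare how often each text uses common function words. The sketch below does only that, as a much-reduced relative of established stylometric measures. The slides do not state which method the Dynamiter project used; the word list, the placeholder authors and the sample sentences are invented.

```python
from collections import Counter

# A short, assumed list of function words; real studies use far longer lists.
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "that", "it", "was", "but"]

def profile(text):
    """Relative frequency of each function word in the text."""
    words = text.lower().split()
    counts = Counter(words)
    return [counts[w] / len(words) for w in FUNCTION_WORDS]

def distance(p, q):
    """Mean absolute difference between two profiles (smaller = more alike)."""
    return sum(abs(a - b) for a, b in zip(p, q)) / len(p)

def attribute(disputed, candidates):
    """Rank candidate authors by closeness to the disputed text."""
    target = profile(disputed)
    scores = {name: distance(target, profile(text)) for name, text in candidates.items()}
    return sorted(scores, key=scores.get)

# Invented samples from two placeholder authors (not text from The Dynamiter).
candidates = {
    "Author A": "the house was old and the garden was wild and the gate was shut",
    "Author B": "it seemed to her that a change in the weather meant a change of plan",
}
disputed = "the road was long and the night was cold and the inn was far"

print(attribute(disputed, candidates))
```

Function words are used because authors choose them largely unconsciously and they occur in every text whatever its subject. Samples this short would never support a real attribution.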
35. Network analysis and visualisations
Franco Moretti, “Network Theory, Plot Analysis”, New Left Review 68 (2011): 81.
Also available as a LitLab pamphlet: see litlab.stanford.edu
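A character network of this general kind can be sketched with a plain adjacency structure. The example assumes, purely for illustration, that two characters are linked when they speak to one another. The characters are placeholders and the exchanges are invented, not data from any play or from Moretti’s article.

```python
from collections import defaultdict

# Invented exchanges between placeholder characters.
exchanges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]

def build_graph(edges):
    """Undirected graph stored as node -> set of neighbours."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return graph

def degrees(graph):
    """Number of distinct characters each character is linked to."""
    return {node: len(neighbours) for node, neighbours in graph.items()}

def components(graph, removed=()):
    """Connected groups of characters once the `removed` nodes are ignored."""
    seen, parts = set(removed), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, part = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            part.add(node)
            stack.extend(graph[node] - seen)
        parts.append(part)
    return parts

graph = build_graph(exchanges)
print(degrees(graph))
print(len(components(graph)), "component(s) with everyone present")
print(len(components(graph, removed=("A",))), "component(s) without A")
```

Removing the best-connected node and seeing whether the network falls apart is one way of asking how much a plot depends on a single character.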
38. Further resources
• DHOxSS: DH Summer School at Oxford
• Lancaster Summer Schools
• Further afield: DHSI, HILT, DH@Leipzig
• The Programming Historian
(http://programminghistorian.org)
• MOOCs, e.g. IVMOOC, Coursera, FutureLearn
• Training courses at your institution, e.g. ArcGIS
• Teach-yourself tutorials, e.g. Codecademy
• DH Q&A http://digitalhumanities.org/answers/
39. Matthew Jockers, “Revealing Sentiment and Plot Arcs with the
Syuzhet Package”, blog post, Matthew L. Jockers 2 Feb. 2015.
www.matthewjockers.net/2015/02/02/syuzhet/. Code at
https://github.com/mjockers/syuzhet.
40. Eileen Clancy, “A Fabula of Syuzhet II: Continuing the tale of digital
humanities and sentiment analysis”. Storify of tweets from 24 Mar-10 April 2015.
https://storify.com/clancynewyork/a-fabula-of-syuzhet-ii.
41. until Syuzhet provides filters that don’t cause ringing
artifacts [extra lobes introduced into a graph by an
ideal low-pass filter], it is likely that most foundation
shapes will be inaccurate representations of the
stories’ true plot trajectories. Since the foundation
shape may in places be the opposite of the emotional
trajectory, two foundation shapes may look identical
despite having opposing emotional valences.
Jockers’s claim … may be due more to ringing
artifacts than to an actual similarity between the
emotional structures of the analyzed novels.
Annie Swafford, “Problems with the Syuzhet Package”,
blog post, Anglophile in Academia, 2 March 2015.
annieswafford.wordpress.com/2015/03/02/syuzhet/.
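The ringing effect described in the quotation can be reproduced with a toy example. This is not Syuzhet’s code (Syuzhet is an R package). It is a plain-Python sketch in which an invented "sentiment" series is passed through an ideal low-pass filter; the series, the `keep` cut-off and the function names are all assumptions.

```python
import cmath

def dft(xs):
    """Discrete Fourier transform (direct method, fine for short series)."""
    n = len(xs)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(xs))
            for k in range(n)]

def idft(spectrum):
    """Inverse transform, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(c * cmath.exp(2j * cmath.pi * k * t / n)
                for k, c in enumerate(spectrum)).real / n
            for t in range(n)]

def ideal_low_pass(xs, keep):
    """Zero every frequency component above `keep` (an ideal low-pass filter)."""
    n = len(xs)
    spectrum = dft(xs)
    for k in range(n):
        freq = min(k, n - k)  # frequency index, counting the mirrored half
        if freq > keep:
            spectrum[k] = 0
    return idft(spectrum)

# Invented series: neutral (0) for the first half, positive (+1) for the second.
signal = [0.0] * 32 + [1.0] * 32
smoothed = ideal_low_pass(signal, keep=3)

print("lowest raw value:     ", min(signal))
print("lowest smoothed value:", round(min(smoothed), 3))
```

The raw series never drops below zero, yet the smoothed curve dips into negative territory before the jump and overshoots above one after it. Those extra lobes are the ringing artifacts, and they show how a smoothed shape can locally suggest the opposite of the underlying values.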
42. adapted from Allie Brosh, Hyperbole and a
Half (hyperboleandahalf.blogspot.co.uk)