Demystifying Digital Humanities: Winter 2014 session #1
Jan. 18, 2014
Slides from the January 18th Demystifying Digital Humanities workshop on Exploring Programming in the Humanities, held at the Simpson Center for the Humanities, and taught by Paige Morgan, Sarah Kremen-Hicks, and Brian Gutierrez
6. There will always be new
programs and platforms that
you will want to experiment
with.
7. Working with technology
means periodically starting
from scratch -- a bit like
working with a new time
period or culture; or figuring
out how to teach a new class.
11. Example #1
• find all the statements in quotes ("") from a novel
• count how many words are in each statement
• write all the statements from the novel in a text file
• put the statements in order from fewest words to most
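A minimal Python sketch of the steps in Example #1. The sample sentence, the function names, and the output file name are made up for illustration, and the pattern assumes quotation marks are balanced and never nested, which real novels often violate.

```python
import os
import re
import tempfile


def quoted_statements(text):
    """Return every span enclosed in straight or curly double quotes."""
    # Assumption: quotes are balanced and not nested.
    return re.findall(r'["“]([^"”]+)["”]', text)


def sort_by_word_count(statements):
    """Pair each statement with its word count, fewest words first."""
    counted = [(len(s.split()), s) for s in statements]
    return sorted(counted, key=lambda pair: pair[0])


# Made-up sample text standing in for a whole novel.
sample = ('She said, "Come in." He replied, '
          '"I would rather stay out here tonight." "No," she said.')

ranked = sort_by_word_count(quoted_statements(sample))

# Write one statement per line; the temp directory is just a safe place to write.
out_path = os.path.join(tempfile.gettempdir(), "statements.txt")
with open(out_path, "w", encoding="utf-8") as out:
    for count, statement in ranked:
        out.write(f"{count}\t{statement}\n")
```

Note that the computer counts "No," as a one-word statement, punctuation included. Deciding whether that is what you want is your job, not the program's.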
12. Example #2
• allow a user to type in some information, e.g., "Benedict Cumberbatch"
• compare “Benedict Cumberbatch” to a much larger file
• retrieve any data that matches the information
• print the retrieved information on screen
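One way Example #2 might look in Python. This is a sketch under stated assumptions: the "much larger file" is replaced by three placeholder lines, the function name is hypothetical, and "matching" is taken to mean "the line contains the typed text, ignoring upper and lower case".

```python
def find_matches(query, lines):
    """Return (line_number, line) pairs for lines that contain the query."""
    needle = query.casefold()  # ignore upper/lower case differences
    return [
        (number, line.strip())
        for number, line in enumerate(lines, start=1)
        if needle in line.casefold()
    ]


# Placeholder lines standing in for the larger file;
# in practice they would come from open(path).
larger_file = [
    "A line that mentions Benedict Cumberbatch.\n",
    "A line about someone else.\n",
    "Another line mentioning benedict cumberbatch.\n",
]

query = "Benedict Cumberbatch"  # an interactive script would use input()
for number, line in find_matches(query, larger_file):
    print(f"{number}: {line}")
```

Here the program prints the whole line around each match. That is a choice someone had to make: it could just as easily print only the name, or a count of matches.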
13. Example #3
• "read" two texts -- say, two plays by Seneca
• search for any words that the two plays have in common
• print the words that they have in common on screen
• calculate what percentage of the words in each play are shared
• print that percentage on screen
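A rough Python sketch of Example #3. The two short strings are invented placeholders, not Seneca, and the function names are hypothetical. It also makes one interpretive assumption: "percentage of the words" is read as the percentage of each text's distinct words (its vocabulary), not of its running word count.

```python
import re


def vocabulary(text):
    """Return the set of distinct lowercase words in a text."""
    # Crude tokenizer: any run of letters or digits counts as a word.
    return set(re.findall(r"\w+", text.lower()))


def shared_vocabulary(text_a, text_b):
    """Return the shared words and the share of each vocabulary they make up."""
    vocab_a, vocab_b = vocabulary(text_a), vocabulary(text_b)
    shared = vocab_a & vocab_b  # set intersection: words found in both
    # Assumes neither text is empty (otherwise this divides by zero).
    pct_a = 100 * len(shared) / len(vocab_a)
    pct_b = 100 * len(shared) / len(vocab_b)
    return shared, pct_a, pct_b


# Invented stand-ins for the two plays.
play_a = "the king fears the gods"
play_b = "the gods punish the proud king"

shared, pct_a, pct_b = shared_vocabulary(play_a, play_b)
print(sorted(shared))
print(f"{pct_a:.1f}% of play A's vocabulary is shared")
print(f"{pct_b:.1f}% of play B's vocabulary is shared")
```

For Latin texts, the program would treat "rex" and "regis" as two unrelated words unless you instruct it otherwise.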
14. Example #4
• if the user is located in geographic location
Z, e.g., 45th and University, go to an online
address and retrieve some text
• print that text on the user’s tablet screen
• receive input from the user and respond
15. However...
• In Example #1, the computer is focusing on things
that characters say. But what if you want to isolate
speeches from just one character?
• In Example #2, how does the computer know how
much text to print? Will it just print "Benedict
Cumberbatch" 379 times, because that's how often
it appears in the larger file?
16. These are the areas of
programming where critical
thinking and humanities
skills become vital.
17. The Difference
• Humans are good at differentiating
between material in complex and
sophisticated ways.
• Computers are good at not differentiating
between material unless they’ve been
specifically instructed to do so.
18. Computers work with data.
You work with data, too -- but in most
cases, you'll have to make your data readable
by computer.
19. How to make your data
machine-readable
• Annotate it with markup language
• Organize it in patterns that the computer
can understand
• Add data that is not explicitly readable in
the current format (e.g.,
hardbound/softbound binding;
language: English; date of record creation)
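To illustrate the third point, here is a small sketch of records in which binding, language, and date of record creation are written out as explicit fields. The titles, the field names, and the values are all hypothetical.

```python
import json
from datetime import date

# Hypothetical catalogue records: facts a human would see at a glance
# (the binding, the language) are spelled out as named fields.
records = [
    {
        "title": "Example Title A",
        "binding": "hardbound",
        "language": "English",
        "record_created": date(2014, 1, 18).isoformat(),
    },
    {
        "title": "Example Title B",
        "binding": "softbound",
        "language": "English",
        "record_created": date(2014, 1, 18).isoformat(),
    },
]

# Once the information is explicit, the computer can select on it.
hardbound_titles = [r["title"] for r in records if r["binding"] == "hardbound"]
print(hardbound_titles)

# JSON is one common pattern that other programs can read back in.
print(json.dumps(records[0], indent=2))
```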
20. Depending on the data you
have, and the way you
annotate or structure it,
different things become
possible.
21. Your goal is to make the data As
Simple As Possible -- but not so
simple that it stops being useful.
22. Depending on the data you
work with, the work of
structuring or annotating
becomes more challenging,
but also more useful.
24. Many programming and markup languages have
governing bodies that establish
standards for their use:
•the World Wide Web Consortium (W3C)
(http://www.w3.org/standards/)
•the TEI Technical Council
29. Markup: HTML
Anything can be data -- and markup languages
provide instructions for how computers should
treat that data.
30. Markup: HTML
HTML is used to format text on webpages.
<p> separates text into paragraphs.
<em> marks text as emphasized (browsers usually show it in italics).
These are just a few of the HTML formatting instructions that
you can use.
31. HTML Syntax Rules
•Opening and closing tags: <tag> and </tag>
•Attributes (2nd-level information)
defined using =“”
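To see these rules from the computer's side, the sketch below uses Python's built-in html.parser module to list each opening tag, its attributes, and each closing tag in a one-line HTML fragment. The fragment and the class name TagLister are made up for illustration.

```python
from html.parser import HTMLParser


class TagLister(HTMLParser):
    """Record every opening and closing tag the parser encounters."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        self.events.append(("open", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.events.append(("close", tag, {}))


parser = TagLister()
parser.feed('<p class="intro">Piping down the valleys <em>wild</em></p>')
parser.close()

for event in parser.events:
    print(event)
```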
37. Poetry w/ TEI
<text xmlns="http://www.tei-c.org/ns/1.0" xml:id="d1">
  <body xml:id="d2">
    <div1 type="book" xml:id="d3">
      <head>Songs of Innocence</head>
      <pb n="4"/>
      <div2 type="poem" xml:id="d4">
        <head>Introduction</head>
        <lg type="stanza">
          <l>Piping down the valleys wild, </l>
          <l>Piping songs of pleasant glee, </l>
          <l>On a cloud I saw a child, </l>
          <l>And he laughing said to me: </l>
        </lg>
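A sketch of what this markup makes possible, using Python's built-in xml.etree.ElementTree. The slide shows only an excerpt, so the closing tags at the end of the sample string are added here as an assumption to make it parseable; the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# TEI elements live in a namespace, so searches must name it.
TEI = {"tei": "http://www.tei-c.org/ns/1.0"}

# The excerpt from the slide, with closing tags added (an assumption)
# so that it is a complete, well-formed XML document.
sample = """<text xmlns="http://www.tei-c.org/ns/1.0" xml:id="d1">
<body xml:id="d2">
<div1 type="book" xml:id="d3">
<head>Songs of Innocence</head>
<pb n="4"/>
<div2 type="poem" xml:id="d4">
<head>Introduction</head>
<lg type="stanza">
<l>Piping down the valleys wild, </l>
<l>Piping songs of pleasant glee, </l>
<l>On a cloud I saw a child, </l>
<l>And he laughing said to me: </l>
</lg>
</div2>
</div1>
</body>
</text>"""

root = ET.fromstring(sample)

# Pull out the poem's title and the lines of each stanza.
poem_title = root.find(".//tei:div2/tei:head", TEI).text
stanzas = [
    [line.text.strip() for line in stanza.findall("tei:l", TEI)]
    for stanza in root.iterfind(".//tei:lg[@type='stanza']", TEI)
]

print(poem_title)
for stanza in stanzas:
    print(len(stanza), "lines, beginning:", stanza[0])
```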
39. TEI’s syntax rules are
essentially the same as HTML’s
(TEI is XML, so every tag must be closed),
though your normal
browser can’t work with TEI
the way it works with
HTML.
40. TEI is meant to be a highly
social language -- meaning
that the committee that
maintains its standards wants
it to be something that
anyone can use.
41. In order for TEI to
successfully encode texts, it
has to be adaptable to
individual projects.
42. Anything that you can isolate (and
put in brackets) can (theoretically)
be pulled out and displayed for a
reader.
43. TEI can be used to encode more than just text:
<div type="shot">
  <view>BBC World symbol</view>
  <sp>
    <speaker>Voice Over</speaker>
    <p>Monty Python's Flying Circus tonight comes to you live
    from the Grillomat Snack Bar, Paignton.</p>
  </sp>
</div>
<div type="shot">
  <view>Interior of a nasty snack bar. Customers around, preferably
  real people. Linkman sitting at one of the plastic tables.</view>
  <sp>
    <speaker>Linkman</speaker>
    <p>Hello to you live from the Grillomat Snack Bar.</p>
  </sp>
</div>
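Because each speech is tagged with its speaker, isolating one character's speeches becomes a small task. The sketch below wraps the two shots in a hypothetical <body> element (XML needs a single root) and uses Python's built-in xml.etree.ElementTree; the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The two shots from the slide, wrapped in a <body> root (an assumption).
sample = """<body>
<div type="shot">
<view>BBC World symbol</view>
<sp>
<speaker>Voice Over</speaker>
<p>Monty Python's Flying Circus tonight comes to you live
from the Grillomat Snack Bar, Paignton.</p>
</sp>
</div>
<div type="shot">
<view>Interior of a nasty snack bar. Customers around, preferably
real people. Linkman sitting at one of the plastic tables.</view>
<sp>
<speaker>Linkman</speaker>
<p>Hello to you live from the Grillomat Snack Bar.</p>
</sp>
</div>
</body>"""


def speeches(root):
    """Return (speaker, speech) pairs, with line breaks inside speeches removed."""
    return [
        (sp.findtext("speaker"), " ".join(sp.findtext("p").split()))
        for sp in root.iter("sp")
    ]


def speeches_by(root, name):
    """Return only the speeches of one named speaker."""
    return [text for speaker, text in speeches(root) if speaker == name]


root = ET.fromstring(sample)
print(speeches_by(root, "Linkman"))
```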
44. Or, you could encode all of
Stephenie Meyer’s Twilight
according to its emotional
register.
45. Whether you include or
exclude some aspect of the
text in your markup can be
very important from an
academic perspective.
46. The challenge of creating
good data is one reason that
collaboration is so
important to digital
scholarship.
47. Data Collaboration
• Avoid reinventing the wheel (has the
markup for this text already been done?)
• Consider the labor involved vs. the
outcome (and future use of the data you
create).
49. Study Scenario #1
• You study urban espresso stands: their
hours, brands of coffee, whether or not
they sell pastries, and how far the espresso
stands are from major roadways.
50. Study Scenario #2
• You study female characters in novels
written between 1700 and 1850. Encoding a
whole novel just to study female characters
isn’t practical for you.
52. Structured Data: Example #1 (MySQL)
ID  | Name            | Location                       | Hours                  | Coffee Brand         | Pastries (Y/N) | Distance from Street
008 | Java the Hut    | 56 Farringdon Road, London, UK | 7:00 a.m. - 2:00 p.m.  | Square Mile Roasters | N              | 25 meters
009 | Prufrock Coffee | 18 Shoreditch High Street      | 7:00 a.m. - 10:00 p.m. | Monmouth             | Y              | 10 meters
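A sketch of how a table like this can be queried. The slide names MySQL; this example swaps in SQLite, which ships with Python, so it runs without a database server. The table name, the column names, and the choice to store distance as a number of meters are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a throwaway in-memory database
conn.execute(
    """CREATE TABLE stands (
           id TEXT PRIMARY KEY,
           name TEXT,
           location TEXT,
           hours TEXT,
           coffee_brand TEXT,
           pastries INTEGER,   -- 1 = yes, 0 = no
           distance_m INTEGER  -- distance from street, in meters
       )"""
)
conn.executemany(
    "INSERT INTO stands VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("008", "Java the Hut", "56 Farringdon Road, London, UK",
         "7:00 a.m. - 2:00 p.m.", "Square Mile Roasters", 0, 25),
        ("009", "Prufrock Coffee", "18 Shoreditch High Street",
         "7:00 a.m. - 10:00 p.m.", "Monmouth", 1, 10),
    ],
)

# Which stands sell pastries and sit within 20 meters of the street?
rows = conn.execute(
    "SELECT name, distance_m FROM stands "
    "WHERE pastries = 1 AND distance_m <= 20 "
    "ORDER BY distance_m"
).fetchall()
print(rows)
```

Storing "25 meters" as the number 25 is what lets the computer compare distances; as free text, it could only be matched character by character.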
60. Every project has
data.
Text objects, images, tags, geographical
coordinates, categories, records, creator
metadata, etc.
61. Even if you’re not planning to learn
any programming skills, you are still
working with data.
62. Next time:
Programming on the Whiteboard
February 1st, 9:30, CMU 202
•Cleaning data before you work with it!
•Identifying specific programming tasks
•How access affects your project idea
•Flash project development
•Homework: bring some data to work with.
Please take our quick eval survey!
http://tinyurl.com/dmdh14jan