Demystifying Digital Humanities: Winter 2014 Workshop #2: Programming on the Whiteboard

Paige Morgan, University of Washington
Winter 2014: Session #2
Programming on the Whiteboard
(Paige Morgan, Sarah Kremen-Hicks, Brian Gutierrez)
Previously, at DMDH...
• The work of creating usable data
• Forms that this data might take:
• markup language
• spreadsheets
Workshop #2
• Caveat Curator (challenges of working with
data)
• Programming on the whiteboard, i.e.,
conceptualizing the specific steps that you
need to take to accomplish your goals
Why this focus on data?
• Understanding your data, and your
intended actions, is a key skill for working
with any programming language or
platform.
• This is true whether you are the
programmer or whether you are working
with professional programmers.
Programming languages
are like human
languages in that they
both have phrases,
patterns, and rules.
Programming languages
are unlike human
languages in that they
aren’t for communicating
with people.
They are also unlike human
languages in that every
programming utterance
does something, i.e., causes
an action to occur.
You can get used to
patterns – even
unfamiliar ones.
The shift is in getting
used to thinking in
terms of every single
action.
Our subject matter today is all
actions that you’ll need to
think about before you work
with...
Image: Josh Lee, @wtrsld, via Twitter, January 2014.
Even when you’re just
experimenting, you need to
prep your data.
You may know your dataset
in detail already, from your
research -- but your
computer is concerned with
different levels of detail.
Becoming aware of those levels
of detail is not only helpful for
your project ideas...
...it’s also a useful skill for
working with programming
languages.
(where a stray /> or ; can break your program/website)
Caveat Curator
Data only works if your
computer can read it.
But my data is just text!
(Isn’t that easy?)
(Remember, your computer is
fairly stupid).
Formatted text
is often full of
text your
computer can’t
parse correctly.
The┘re┘sÜlt ís that yoÜr te┘xt
might come┘ oÜt looking
like┘this
whe┘n yoÜ ope┘n it in a
programming e┘nvironme┘nt.
So you need to
convert it to
plain text.
(without any of the fancy details
encoded in MS Word fonts.)
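A minimal sketch of what "convert it to plain text" can involve, assuming Python and a short, made-up sample string. The replacement table and the function names are hypothetical, not from the workshop. The idea is to swap the most common word-processor characters for plain ones, then list whatever non-ASCII characters remain so that a person can decide what to do with them.

```python
import unicodedata

# Hypothetical table: common word-processor characters and plain stand-ins.
REPLACEMENTS = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark / apostrophe
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2013": "-",   # en dash
    "\u2014": "--",  # em dash
    "\u00a0": " ",   # non-breaking space
}

def to_plain_text(text):
    """Replace the characters listed in REPLACEMENTS with plain equivalents."""
    for fancy, plain in REPLACEMENTS.items():
        text = text.replace(fancy, plain)
    return text

def report_non_ascii(text):
    """Return (position, character, Unicode name) for each non-ASCII character."""
    return [
        (position, char, unicodedata.name(char, "UNKNOWN"))
        for position, char in enumerate(text)
        if ord(char) > 127
    ]

# Made-up sample: curly quotes, an en dash, and one accented letter.
sample = "\u201cHalf-rural Sadler\u2019s Wells\u201d \u2013 caf\u00e9"
cleaned = to_plain_text(sample)
print(cleaned)
print(report_non_ascii(cleaned))
```

The accented letter is reported rather than replaced: it is legitimate text, and deciding whether to keep it is a judgment call about your data, not something the computer should guess.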
But even that can produce
unexpected errors.
Maybe you want to work with
sailing data and ports of call:
The ship you’re interested in
leaves the Ivory Coast for St.
Helena...
But when you create your map,
you get this:
The latitude/longitude
coordinate is the significant
datum.
The city name is just the
human-readable component.
Each datum needs to be
unique.
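One way to see the problem, sketched in Python under stated assumptions: the gazetteer rows below are illustrative, the coordinates are approximate, and the slides do not name a departure port, so Abidjan is an assumed example. The function `ambiguous_names` is hypothetical. It flags any place name that points to more than one coordinate pair, which is exactly the case where a name alone cannot serve as the datum.

```python
from collections import defaultdict

# Illustrative rows: (name, latitude, longitude). Coordinates are approximate.
GAZETTEER = [
    ("St. Helena", -15.97, -5.71),   # island in the South Atlantic
    ("St. Helena", 38.51, -122.47),  # town in California
    ("Abidjan", 5.36, -4.01),        # assumed example port in Ivory Coast
]

def ambiguous_names(rows):
    """Return each name that maps to more than one coordinate pair."""
    coordinates_by_name = defaultdict(set)
    for name, latitude, longitude in rows:
        coordinates_by_name[name].add((latitude, longitude))
    return {
        name: sorted(coordinates)
        for name, coordinates in coordinates_by_name.items()
        if len(coordinates) > 1
    }

print(ambiguous_names(GAZETTEER))
```

If a name shows up in this report, the map needs the coordinate pair (or some other unique identifier) as its key, with the name kept only as a label.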
Figuring out what sort of
unique configuration will
work best involves at
least some
experimentation.
To experiment effectively, you’ll
want to keep careful records.
If you develop categories of
information, you’ll want to
keep a record of what each
category means, and what
its limits are.
Cleaning and structuring your
data is a foundation issue that
changes, depending on the
available format of your data.
What if your data is
crowdsourced?
You can require a particular
format for submissions
You can even put
programmatic limits on the
formats available for
submission
But in the end, you’re still going
to need to scrub and/or
format.
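A rough sketch of what programmatic limits on submissions could look like, assuming Python and a hypothetical submission form with three text fields (`place_name`, `date`, `latitude`). None of these field names or rules come from the workshop. The last example also shows why scrubbing is still needed: a record can satisfy every format rule and still be wrong.

```python
import re

# Assumed rule for this sketch: dates are submitted as YYYY-MM-DD.
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")

def validate_submission(record):
    """Return a list of format problems in one record (empty if none found)."""
    problems = []
    if not str(record.get("place_name") or "").strip():
        problems.append("place_name is empty")
    if not DATE_PATTERN.match(str(record.get("date") or "")):
        problems.append("date is not in YYYY-MM-DD format")
    try:
        latitude = float(record.get("latitude"))
        if not -90 <= latitude <= 90:
            problems.append("latitude out of range")
    except (TypeError, ValueError):
        problems.append("latitude is not a number")
    return problems

# Illustrative records only.
good = {"place_name": "Vauxhall", "date": "1805-05-01", "latitude": "51.49"}
bad = {"place_name": " ", "date": "May 1805", "latitude": "north"}
# Passes every format check, yet month 13 and day 45 do not exist.
well_formed_but_wrong = {"place_name": "Vauxhall", "date": "1805-13-45", "latitude": "51.49"}

print(validate_submission(good))
print(validate_submission(bad))
print(validate_submission(well_formed_but_wrong))
```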
This is true even for data
from supposedly reputable
sources, like government or
media organizations.
Example: Doctor Who Villains
dataset
http://tinyurl.com/doctorwhovillains
This step is no fun!
But it’s absolutely necessary.
What does a baby computer
call his father: “data”
Break!
Working with “little data”:
GIS and the Spatial Turn
GIS technology has paved the
way for analyzing qualitative
data associated with cultural
experiences
“A good map is worth a thousand words,
cartographers say, and they are right: because
it produces a thousand words: it raises doubts,
ideas. It poses new questions, and forces you
to look for new answers.”
(Moretti 1998, 3–4)
Literary texts are filled with
subjective spatial data: an
author or character's
articulation of geographically
located dwellings, urban and
rural landscapes, as well as
performance spaces
Project: Mapping William
Wordsworth's Conspicuous
Consumption in The Prelude
(Brian R. Gutierrez)
Objective: to map the visual culture
events referenced in Wordsworth’s
autobiographical poem The Prelude (as
well as the ones not referenced)
Problem to solve: Prove that literary
galleries, specifically John Boydell’s
“Shakespeare Gallery,” shaped the
dramaturgical choices in the only play
written by Wordsworth. He reads
Shakespeare not through a personal copy
of the play, but through the visual and
performative texts of that time
Data: place-names, indirect
references, and all non-
referenced visual cultural
events
Access to data: Project
Gutenberg, digital archive of
British newspapers and
periodicals
What to do with that data?
Map it!!
First data set:
Literary spatial articulations
Wordsworth mentions the following place
names and references:
"Oh wonderous power of words, how sweet they are
/ According to the meaning which they bring-- /
Vauxhall and Ranelagh, I then had heard / Of your green
groves and wilderness of lamps, / Your gorgeous ladies,
fairy cataracts, / And pageant fireworks" (119-125)
"Half-rural Sadler's Wells" (267)
First, I need to know what and
where these places were in
order to identify them as
spatial data
Ex: Vauxhall and Ranelagh
Second, if I'm interested in
visual cultural experiences, I
need to identify what kind of
event occurred there: gallery,
play, etc.
Third, how would I access the data?
Answer: place-names in a book are not
under any copyright.  
However, if I wanted to include sections
from the text when a viewer would click
on that place name then I would have to
think about copyright, but it's on PG, so
that's covered.
Fourth, I would have to locate any indirect
reference to visual cultural phenomena.
Ex: Wordsworth mentions two actresses by
name: Mary Robinson and Sarah Siddons.
Since I cannot map a person, I need to
investigate which plays they were in and at which
theaters during that moment of his life (it's an
autobiography)
Fifth, I need to research what special
events were occurring at other places
he mentions. For that, I look to The
Times (newspapers) and various
periodicals.
Sixth, because I am going to create
a map using ArcGIS, I need to
put my data in an Excel
spreadsheet so that it can be
read by the program.
What is the relationship
between the data?
Analyze the qualitative data
Humanist skill =
DHumanist skill
Programming on the
whiteboard involves looking at
the categories of information,
and thinking about how they
interact.
Categories
• Place names
• Poetic lines
• Genre of visual/cultural event
• Spatial data (latitude/longitude)
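The categories above can be sketched as the columns of a spreadsheet-style file, assuming Python and CSV as a stand-in for the spreadsheet. The column names are hypothetical. The place names and line references come from the Prelude passages quoted earlier; the genre and coordinate cells are left blank on purpose, since filling them in is the research step.

```python
import csv
import io

# Hypothetical column names, one per category.
FIELDNAMES = ["place_name", "poetic_lines", "event_genre", "latitude", "longitude"]

# Place names and line references as quoted in the slides.
rows = [
    {"place_name": "Vauxhall", "poetic_lines": "119-125"},
    {"place_name": "Ranelagh", "poetic_lines": "119-125"},
    {"place_name": "Sadler's Wells", "poetic_lines": "267"},
]

def to_csv(records):
    """Write records as CSV text, leaving missing categories as empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, restval="")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

print(to_csv(rows))
```

Laying the categories out this way makes the gaps visible: every empty cell is a question the project still has to answer.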
Return to the source of original
data—the literary text—to
examine how the author is
describing these phenomena
Why use ArcGIS?
Benefits of ArcGIS
• It allows the overlay of historical maps
• Trainings were available and accessible
(through DHSI and UW courses)
• As a software program, ArcGIS is
established enough to be considered robust
• Available through the UW software suite
Disadvantages of ArcGIS
• Available only for PCs
• Proprietary file format (even if input data is
open-access, the end result is not)
• Available only on an annual subscription
model (and prohibitively expensive for
scholars without campus-granted access)
In Franco Moretti’s Atlas of the
European Novel 1800-1900
(1998), he calls for a “literary
geography,” predicated on the
creation of “readerly maps”
and the use of those maps as
analytical tools.
Caveats?
The pursuit of mapping data
may exclude complex social
spaces (e.g., gendered domestic
environments)
Caveats?
Cartographical representations
should not be divorced from
their primary texts
Project: Visualizing Prosody
(Sarah Kremen-Hicks)
x / |x /|xx / | x / |x /
Sir Walter Vivian all a summer's day
/ x | / x | x / | x / | x /
Gave his broad lawns until the set of sun
Marking up a poem for
metrical scansion is encoding it
with data.
What can a computer do with
that data?
Computers are good at
counting things – like iambs.
Is it possible to predict
deviations from a metrical
norm based on author or lyric
classification?
Will authors show a tendency
for particular types of metrical
substitution?
Prepping the Data
• For proof of concept, start with one author
(Alfred, Lord Tennyson)
• Get Tennyson’s poems from Project
Gutenberg
• Hand-mark representative poems for
prosody
Programming on the Whiteboard
What should the
computer do?
Computer tasks
• Count feet per line
• Recognize | as a foot boundary
• Recognize carriage return as a line boundary
• Supply foot boundaries at beginning/end of
lines
• Count the number of areas contained within
foot boundaries for each line
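The tasks in this list can be sketched in a few lines of Python. This is one possible reading of the whiteboard steps, not the project's actual code, and the function names are hypothetical. It assumes the scansion is typed as plain text with `|` between feet and one line of verse per line, as in the Tennyson example.

```python
def split_feet(scansion):
    """Split one scansion line into feet, treating | as the foot boundary."""
    # The start and end of the line act as implicit foot boundaries.
    feet = [foot.replace(" ", "") for foot in scansion.strip().split("|")]
    # Drop empty pieces left by a leading or trailing | mark.
    return [foot for foot in feet if foot]

def feet_per_line(marked_text):
    """Treat each line break as a line boundary and count the feet per line."""
    return [
        len(split_feet(line))
        for line in marked_text.splitlines()
        if line.strip()
    ]

# Scansion marks for the two Tennyson lines shown in the slides.
marked = "x / |x /|xx / | x / |x /\n/ x | / x | x / | x / | x /"
print(split_feet("x / |x /|xx / | x / |x /"))
print(feet_per_line(marked))
```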
These steps involve recognizing
each metrical foot as a unit that
contains particular
accentual-syllabic data.
x / |x /|xx / | x / |x /
Sir Walter Vivian all a summer's day
Computer tasks, cont’d.
• Identify the most common number of feet
per line
• Supply a report on lines (by number) that
deviate
• Calculate rate of deviation/adherence
• Mode = paradigm
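A sketch of "mode = paradigm" in Python, under the same plain-text assumptions as the scansion example. The function names are hypothetical, and the third scansion line is invented here to give the report something to flag. The most common foot count is taken as the norm, and every line that differs from it is reported by line number.

```python
from collections import Counter

def count_feet(scansion):
    """Count the non-empty segments between | marks."""
    return sum(1 for foot in scansion.split("|") if foot.strip())

def deviation_report(scansion_lines):
    """Return (most common foot count, deviating lines, rate of deviation)."""
    counts = [count_feet(line) for line in scansion_lines]
    mode, _ = Counter(counts).most_common(1)[0]
    deviating = [
        (line_number, count)
        for line_number, count in enumerate(counts, start=1)
        if count != mode
    ]
    return mode, deviating, len(deviating) / len(counts)

lines = [
    "x / |x /|xx / | x / |x /",
    "/ x | / x | x / | x / | x /",
    "x / | x / | x / | x /",  # hypothetical four-foot line, for illustration
]
mode, deviating, rate = deviation_report(lines)
print(mode, deviating, rate)
```

The rate of adherence is simply one minus the rate of deviation.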
After recognizing the foot as a
unit, the computer can calculate
what patterns of data each foot
contains.
Computer tasks, cont’d.
• Identify the most common foot type
• Identify markings within foot boundaries
• Compare markings to foot dictionary to
identify type
These tasks identify each line
as a unit composed of one or
more feet.
x / |x /|xx / | x / |x /
Sir Walter Vivian all a summer's day
(iambic pentameter with third foot anapestic
substitution)
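The foot dictionary can be sketched as a simple lookup table in Python. It assumes `x` marks an unstressed syllable and `/` a stressed one, as in the scansion above, and uses the standard names for two- and three-syllable feet. The dictionary contents and the function name are this sketch's assumptions, not the project's actual dictionary.

```python
# Hypothetical foot dictionary: stress pattern -> conventional foot name.
FOOT_DICTIONARY = {
    "x/": "iamb",
    "/x": "trochee",
    "xx/": "anapest",
    "/xx": "dactyl",
    "//": "spondee",
    "xx": "pyrrhic",
}

def foot_types(scansion):
    """Name each foot in a scansion line; unlisted patterns become 'unknown'."""
    feet = [foot.replace(" ", "") for foot in scansion.split("|")]
    return [FOOT_DICTIONARY.get(foot, "unknown") for foot in feet if foot]

types = foot_types("x / |x /|xx / | x / |x /")
print(types)
```

For the line above, the third entry comes back as an anapest among four iambs, which matches the hand description of the line.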
Still more computing tasks!
• Identify the most common foot type within
a poem
• Supply a report on feet (by line and foot
number) that deviate
• Calculate rate of deviation/adherence
• Mode = paradigm
Just as the feet contain
patterns, the lines contain
patterns that can be analyzed
as well.
Still more computing tasks!
• Report on types of deviations arranged by
most to least common
• Information should include location
(line/foot number), as well as prevalence of
substitution type
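A sketch of such a report in Python, again assuming plain-text scansion with `|` as the foot boundary and a small hypothetical foot dictionary. The most common foot type is treated as the norm; every other foot is recorded with its line and foot number, and the substitution types are ranked from most to least common.

```python
from collections import Counter, defaultdict

# Hypothetical foot dictionary: stress pattern -> conventional foot name.
FOOT_DICTIONARY = {
    "x/": "iamb", "/x": "trochee", "xx/": "anapest",
    "/xx": "dactyl", "//": "spondee", "xx": "pyrrhic",
}

def substitution_report(scansion_lines):
    """Return (norm, [(substitution type, [(line, foot), ...]), ...])."""
    located = []
    for line_number, line in enumerate(scansion_lines, start=1):
        feet = [foot.replace(" ", "") for foot in line.split("|") if foot.strip()]
        for foot_number, foot in enumerate(feet, start=1):
            foot_type = FOOT_DICTIONARY.get(foot, "unknown")
            located.append((line_number, foot_number, foot_type))
    # The most common foot type is taken as the metrical norm.
    norm = Counter(foot_type for _, _, foot_type in located).most_common(1)[0][0]
    places = defaultdict(list)
    for line_number, foot_number, foot_type in located:
        if foot_type != norm:
            places[foot_type].append((line_number, foot_number))
    ranked = sorted(places.items(), key=lambda item: len(item[1]), reverse=True)
    return norm, ranked

# Scansion marks for the two Tennyson lines shown in the slides.
lines = [
    "x / |x /|xx / | x / |x /",
    "/ x | / x | x / | x / | x /",
]
norm, ranked = substitution_report(lines)
print(norm)
print(ranked)
```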
Deviations and their placement
within each line and each poem
should display certain patterns
unique to each author (I hope!)
Current status: I’m investigating
using the Natural Language
Toolkit to tokenize each foot;
and to establish syllables, feet,
and lines as a unique hierarchy.
Applicable Values
• Iterative development
• Failure as valuable
• Collaboration
If you are thinking about your
data, and the tasks that you
need to accomplish, then it’s
easier to determine what sort
of language or platform your
project needs.
There are countless tutorials,
online courses, etc., for almost
any programming language or
platform.
(We’re giving you a cheat sheet,
too; and http://www.dmdh.org is
your friend. So is Google.)
Learning them can be a slow
process, especially at first.
However, knowing what tasks
you’re working towards makes
it easier to understand the
purpose of the introductory
lessons.
It’s also easier to think about
how the first rules you learn
for any language or platform
might affect your goals.
And now, it’s your turn...
For this activity, we
recommend that you pair up,
or form small groups to work
together.
Group Activity
• What do you need to do with your data?
• What units might that data exist in?
• What categories do you need to create?
• What relationships need to exist between
the units and categories?
Spring Workshops!
• Project Ideation and Development
• April 5th and April 26th (advance
registration for DMDH participants at the
end of Winter Quarter)
DMDH content is developed by Paige Morgan,
Sarah Kremen-Hicks, and Brian Gutierrez, with
generous support from the Simpson Center for
the Humanities at the University of Washington.
Content is available under a
Creative Commons Attribution-NonCommercial
3.0 Unported License.
Please contact Paige at paigecm@uw.edu with
questions.
Demystifying Digital Humanities: Winter 2014 Workshop #2: Programming on the Whiteboard

  • 1. Winter 2014: Session #2 Programming on the Whiteboard (Paige Morgan, Sarah Kremen-Hicks, Brian Gutierrez)
  • 2. Previously, at DMDH... • The work of creating usable data • Forms that this data might take: • markup language • spreadsheets
  • 3. Workshop #2 • Caveat Curator (challenges of working with data) • Programming on the whiteboard, i.e., conceptualizing the specific steps that you need to take to accomplish your goals
  • 4. Why this focus on data? • Understanding your data, and your intended actions, is a key skill for working with any programming language or platform. • This is true whether you are the programmer or whether you are working with professional programmers.
  • 5. Programming languages are like human languages in that they both have phrases, patterns, and rules.
  • 6. Programming languages are unlike human languages in that they aren’t for communicating with people.
  • 7. They are also unlike human languages in that every programming utterance does something, i.e., causes an action to occur.
  • 8. You can get used to patterns – even unfamiliar ones.
  • 9. The shift is in getting used to thinking in terms of every single action.
  • 10. Our subject matter today is all actions that you’ll need to think about before you work with...
  • 11. Image: Josh Lee, @wtrsld, via Twitter, January 2014.
  • 12. Even when you’re just experimenting, you need to prep your data.
  • 13. You may know your dataset in detail already, from your research -- but your computer is concerned with different levels of detail.
  • 14. Becoming aware of those levels of detail is not only helpful for your project ideas...
  • 15. ...it’s also a useful skill for working with programming languages. (where a stray /> or ; can break your program/website)
  • 17. Data only works if your computer can read it.
  • 18. But my data is just text! (Isn’t that easy?)
  • 19. (Remember, your computer is fairly stupid).
  • 20. Formatted text is often full of text your computer can’t parse correctly.
  • 21. The┘re┘sÜlt ís that yoÜr te┘xt might come┘ oÜt looking like┘this whe┘n yoÜ ope┘n it in a programming e┘nvironme┘nt.
  • 22. So you need to convert it to plain text. (without any of the fancy details encoded in MS Word fonts.)
  • 23. But even that can produce unexpected errors.
  • 24. Maybe you want to work with sailing data and ports of call:
  • 25. The ship you’re interested in leaves the Ivory Coast for St. Helena...
  • 27. But when you create your map, you get this:
  • 29. The latitude/longitude coordinate is the significant datum.
  • 30. The city name is just the human-readable component.
  • 31. Each datum needs to be unique.
  • 32. Figuring out what sort of unique configuration will work best involves at least some experimentation.
  • 33. To experiment effectively, you’ll want to keep careful records.
  • 34. If you develop categories of information, you’ll want to keep a record of what each category means, and what its limits are.
  • 35. Cleaning and structuring your data is a foundation issue that changes, depending on the available format of your data.
  • 36. What if your data is crowdsourced?
  • 37. You can require a particular format for submissions
  • 38. You can even put programmatic limits on the formats available for submission
  • 39. But in the end, you’re still going to need to scrub and/or format.
  • 40. This is true even for data from supposedly reputable sources, like government or media organizations.
  • 42. This step is no fun!
  • 43. But it’s absolutely necessary.
  • 44. What does a baby computer call his father: “data” Break!
  • 45. Working with “little data”: GIS and the Spatial Turn
  • 46. GIS technology has paved the way for the analyzing qualitative data associated with cultural experiences
  • 47. “A good map is worth a thousand words, cartographers say, and they are right: because it produces a thousand words: it raises doubts, ideas. It poses new questions, and forces you to look for new answers.” (Moretti 1998, 3–4)
  • 48. Literary texts are filled with subjective spatial data: an author or character's articulation of geographically located dwellings, urban and rural landscapes, as well as performance spaces
  • 49. Project: Mapping William Wordsworth's Conspicuous Consumption in The Prelude (Brian R. Gutierrez)
  • 50. Objective: to map the visual culture events referenced in Wordsworth’s autobiographical poem The Prelude (as well as the ones not referenced)
  • 51. Problem to solve: Prove that literary galleries, specifically Joseph Boydell’s “Shakespeare Gallery” shaped the dramaturgical choices in the only play written by Wordsworth. He reads Shakespeare not through a personal copy of the play, but through the visual and performative texts at that time
  • 52. Data: place-names, indirect references, and all non- referenced visual cultural events
  • 53. Access to data: Project Gutenberg, digital archive of British newspapers and periodicals
  • 54. What to do with that data? Map it!!
  • 55. First data set: Literary spatial articulations
  • 56. Wordsworth mentions the following place names and references: "Oh wondrous power of words, how sweet they are / According to the meaning which they bring-- / Vauxhall and Ranelagh, I then had heard / Of your green groves and wilderness of lamps, / Your gorgeous ladies, fairy cataracts, / And pageant fireworks" (119-125); "Half-rural Sadler's Wells" (267)
  • 57. First, I need to know what and where these places were in order to identify them as spatial data. Ex: Vauxhall and Ranelagh
  • 58. Second, if I'm interested in visual cultural experiences, I need to identify what kind of event occurred there: gallery, play, etc.
  • 59. Third, how would I access the data? Answer: place-names in a book are not under any copyright. However, if I wanted to display sections of the text when a viewer clicks on a place name, I would have to think about copyright. The text is on Project Gutenberg, so that's covered.
  • 60. Fourth, I would have to locate any indirect references to visual cultural phenomena. Ex: Wordsworth mentions two actresses by name, Mary Robinson and Sarah Siddons. Since I cannot map a person, I need to investigate which plays they were in, and at which theaters, during that moment of his life (it's an autobiography)
  • 61. Fifth, I need to research what special events were occurring at other places he mentions. For that, I look to The Times (newspapers) and various periodicals.
  • 62. Sixth, because I am going to create a map using ArcGIS, I need to put my data in an Excel spreadsheet so that it can be read by the program.
  • 64. What is the relationship between the data?
  • 65. Analyze the qualitative data. Humanist skill = DHumanist skill
  • 66. Programming on the whiteboard involves looking at the categories of information, and thinking about how they interact.
  • 67. Categories • Place names • Poetic lines • Genre of visual/cultural event • Spatial data (latitude/longitude)
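As a rough sketch of how these four categories could become spreadsheet columns, the Python below writes one row per place reference to CSV text, a format that Excel and many GIS tools can open. The class name `PlaceReference`, the column names, and the genre labels are illustrative assumptions, not part of the original project; latitude and longitude are deliberately left blank, since they still have to be researched.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields
from typing import Optional


@dataclass
class PlaceReference:
    """One spreadsheet row; field names are hypothetical column headers."""
    place_name: str
    poetic_line: str
    event_genre: str                    # illustrative label, to be verified
    latitude: Optional[float] = None    # blank until researched
    longitude: Optional[float] = None   # blank until researched


def to_csv(records):
    """Return the records as CSV text with a header row."""
    buffer = io.StringIO()
    columns = [f.name for f in fields(PlaceReference)]
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))  # None is written as an empty cell
    return buffer.getvalue()


records = [
    PlaceReference("Vauxhall", "Vauxhall and Ranelagh, I then had heard", "pleasure garden"),
    PlaceReference("Ranelagh", "Vauxhall and Ranelagh, I then had heard", "pleasure garden"),
    PlaceReference("Sadler's Wells", "Half-rural Sadler's Wells", "theater"),
]

csv_text = to_csv(records)
print(csv_text)
```

Keeping one row per place name (rather than one row per poetic line) is a design choice: it means a line that mentions two places appears twice, but each row can carry its own coordinates.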
  • 68. Return to the source of original data—the literary text—to examine how the author is describing these phenomena
  • 70. Benefits of ArcGIS • It allows the overlay of historical maps • Trainings were available and accessible (through DHSI and UW courses) • As a software program, ArcGIS is established enough to be considered robust • Available through the UW software suite
  • 71. Disadvantages of ArcGIS • Available only for PCs • Proprietary file format (even if input data is open-access, the end result is not) • Available only on an annual subscription model (and prohibitively expensive for scholars without campus-granted access)
  • 72. In Franco Moretti’s Atlas of the European Novel 1800-1900 (1998), he calls for a “literary geography,” predicated on the creation of “readerly maps” and the use of those maps as analytical tools.
  • 73. Caveats? The pursuit of mapping data may exclude complex social spaces (e.g., gendered domestic environments)
  • 74. Caveats? Cartographical representations should not be divorced from their primary texts
  • 75. Project: Visualizing Prosody (Sarah Kremen-Hicks). Scansion: x / |x /|xx / | x / |x / (Sir Walter Vivian all a summer's day); / x | / x | x / | x / | x / (Gave his broad lawns until the set of sun)
  • 76. Marking up a poem for metrical scansion is encoding it with data. What can a computer do with that data?
  • 77. Computers are good at counting things – like iambs.
  • 78. Is it possible to predict deviations from a metrical norm based on author or lyric classification?
  • 79. Will authors show a tendency for particular types of metrical substitution?
  • 80. Prepping the Data • For proof of concept, start with one author (Alfred, Lord Tennyson) • Get Tennyson’s poems from Project Gutenberg • Hand-mark representative poems for prosody
  • 81. Programming on the Whiteboard What should the computer do?
  • 82. Computer tasks • Count feet per line • Recognize | as a foot boundary • Recognize carriage return as a line boundary • Supply foot boundaries at beginning/end of lines • Count the number of areas contained within foot boundaries for each line
  • 83. These steps involve recognizing each metrical foot as a unit that contains particular accentual-syllabic data. x / |x /|xx / | x / |x / Sir Walter Vivian all a summer's day
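The tasks on this slide can be sketched in a few lines of Python. This is one possible reading of the whiteboard steps, not the project's actual code: the function names `split_feet` and `count_feet` are made up for illustration, and the input is the hand-marked scansion of the two Tennyson lines shown on the project slide.

```python
def split_feet(scansion_line):
    """Split one line of scansion marks into feet, treating | as the boundary."""
    # Stripping outer whitespace and bars makes the start and end of the
    # line act as implicit foot boundaries.
    raw_feet = scansion_line.strip().strip("|").split("|")
    # Remove spaces inside each foot so that "x /" and "x/" compare equal.
    return ["".join(foot.split()) for foot in raw_feet if foot.strip()]


def count_feet(scansion_text):
    """Return the number of feet on each non-empty line (newline = line boundary)."""
    return [
        len(split_feet(line))
        for line in scansion_text.splitlines()
        if line.strip()
    ]


scansion = "x / |x /|xx / | x / |x /\n/ x | / x | x / | x / | x /"
print(split_feet(scansion.splitlines()[0]))  # ['x/', 'x/', 'xx/', 'x/', 'x/']
print(count_feet(scansion))                  # [5, 5]
```

Note how much of the code is about inconsistent spacing in the hand-marked data: this is the "different level of detail" the computer cares about.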
  • 84. Computer tasks, cont’d. • Identify the most common number of feet per line • Supply a report on lines (by number) that deviate • Calculate rate of deviation/adherence • Mode = paradigm
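A minimal sketch of this step, assuming the feet-per-line counts have already been produced: the mode of the counts is treated as the paradigm, and every line that differs is reported by number. The function name `meter_report` and the sample counts are invented for illustration and are not real data from any poem.

```python
from collections import Counter


def meter_report(feet_per_line):
    """Find the most common feet-per-line count and report lines that deviate."""
    # The mode stands in for the metrical paradigm. On a tie, Counter
    # returns the value that was seen first.
    norm, _ = Counter(feet_per_line).most_common(1)[0]
    deviating = [
        (line_no, count)
        for line_no, count in enumerate(feet_per_line, start=1)
        if count != norm
    ]
    return {
        "norm": norm,
        "deviating_lines": deviating,  # (line number, feet on that line)
        "deviation_rate": len(deviating) / len(feet_per_line),
    }


# Made-up counts for a six-line stanza, for demonstration only.
sample_counts = [5, 5, 4, 5, 6, 5]
report = meter_report(sample_counts)
print(report["norm"])             # 5
print(report["deviating_lines"])  # [(3, 4), (5, 6)]
```

The adherence rate is simply one minus the deviation rate.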
  • 85. After recognizing the foot as a unit, the computer can calculate what patterns of data each foot contains.
  • 86. Computer tasks, cont’d. • Identify the most common foot type • Identify markings within foot boundaries • Compare markings to foot dictionary to identify type
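The "foot dictionary" idea can be sketched as a plain Python dictionary that maps a pattern of marks to a foot name. The dictionary below covers only six common feet and is an assumption about what such a lookup might contain; the names `FOOT_DICTIONARY` and `identify_feet` are hypothetical.

```python
# x = unstressed syllable, / = stressed syllable.
FOOT_DICTIONARY = {
    "x/": "iamb",
    "/x": "trochee",
    "xx/": "anapest",
    "/xx": "dactyl",
    "//": "spondee",
    "xx": "pyrrhic",
}


def identify_feet(scansion_line):
    """Name each foot on a line of scansion marks separated by |."""
    feet = []
    for raw_foot in scansion_line.strip().strip("|").split("|"):
        marks = "".join(raw_foot.split())  # drop spaces inside the foot
        if marks:
            # Patterns missing from the dictionary are flagged, not guessed.
            feet.append(FOOT_DICTIONARY.get(marks, "unknown"))
    return feet


print(identify_feet("x / |x /|xx / | x / |x /"))
# ['iamb', 'iamb', 'anapest', 'iamb', 'iamb']
```

Flagging unknown patterns instead of raising an error is a deliberate choice here: hand-marked data will contain surprises, and a report of them is more useful than a crash.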
  • 87. These tasks identify each line as a unit composed of one or more feet. x / |x /|xx / | x / |x / Sir Walter Vivian all a summer's day (iambic pentameter with third foot anapestic substitution)
  • 88. Still more computing tasks! • Identify the most common foot type within a poem • Supply a report on feet (by line and foot number) that deviate • Calculate rate of deviation/adherence • Mode = paradigm
  • 89. Just as the feet contain patterns, the lines contain patterns that can be analyzed as well.
  • 90. Still more computing tasks! • Report on types of deviations arranged by most to least common • Information should include location (line/foot number), as well as prevalence of substitution type
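One way such a report might be assembled, assuming each line has already been reduced to a list of foot names. The two lines below are the Tennyson lines from the project slide, with foot names read off the slide's scansion marks; the function name `substitution_report` and the shape of its return value are illustrative choices, not the project's design.

```python
from collections import Counter

# Foot names for the two hand-marked lines shown on the project slide.
poem = [
    ["iamb", "iamb", "anapest", "iamb", "iamb"],
    ["trochee", "trochee", "iamb", "iamb", "iamb"],
]


def substitution_report(lines):
    """Report the dominant foot type and every foot that departs from it."""
    all_feet = [foot for line in lines for foot in line]
    norm, _ = Counter(all_feet).most_common(1)[0]
    # Location of each deviation: (line number, foot number, foot type).
    deviations = [
        (line_no, foot_no, foot)
        for line_no, line in enumerate(lines, start=1)
        for foot_no, foot in enumerate(line, start=1)
        if foot != norm
    ]
    # Substitution types ordered from most to least common.
    by_type = Counter(foot for _, _, foot in deviations).most_common()
    rate = len(deviations) / len(all_feet)
    return norm, deviations, by_type, rate


norm, deviations, by_type, rate = substitution_report(poem)
print(norm)        # iamb
print(deviations)  # [(1, 3, 'anapest'), (2, 1, 'trochee'), (2, 2, 'trochee')]
print(by_type)     # [('trochee', 2), ('anapest', 1)]
```

Two lines are far too few to say anything about an author; the point is only that location and prevalence fall out of the same pass over the data.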
  • 91. Deviations and their placement within each line and each poem should display certain patterns unique to each author (I hope!)
  • 92. Current status: I’m investigating whether the Natural Language Toolkit can be used to tokenize each foot, and to establish syllables, feet, and lines as a unique hierarchy.
  • 94. If you are thinking about your data, and the tasks that you need to accomplish, then it’s easier to determine what sort of language or platform your project needs.
  • 95. There are countless tutorials, online courses, etc., for almost any programming language or platform. (We’re giving you a cheat sheet, too; and http://www.dmdh.org is your friend. So is Google.)
  • 96. Learning them can be a slow process, especially at first.
  • 97. However, knowing what tasks you’re working towards makes it easier to understand the purpose of the introductory lessons.
  • 98. It’s also easier to think about how the first rules you learn for any language or platform might affect your goals.
  • 99. And now, it’s your turn...
  • 100. For this activity, we recommend that you pair up, or form small groups to work together.
  • 101. Group Activity • What do you need to do with your data? • What units might that data exist in? • What categories do you need to create? • What relationships need to exist between the units and categories?
  • 102. Spring Workshops! • Project Ideation and Development • April 5th and April 26th (advance registration for DMDH participants at the end of Winter Quarter)
  • 103. DMDH content is developed by Paige Morgan, Sarah Kremen-Hicks, and Brian Gutierrez, with generous support from the Simpson Center for the Humanities at the University of Washington. Content is available under a Creative Commons Attribution-NonCommercial 3.0 Unported License. Please contact Paige at paigecm@uw.edu with questions.