Overview
Numbers and divisions
Thematic examples
Presence and relations
Training and evaluation
Final Comments
Acknowledgements
For the parts relating to Dramawebben (The
Swedish Drama Web) I gratefully acknowledge
financial support from the Swedish Research
Council (VR Dnr: 2011-6202).
Thanks to project co-workers, Riksteatern and
Musikverket.
Goals of Dramawebben
A baseline-encoded corpus of plays annotated
with the TEI drama module
Development of exploration and visualization
tools
Engaging a vibrant community
Educating students in TEI encoding and letting them
be ambassadors who spread the word
Target disciplines within the humanities, such as
linguistics, literary and theater history, studies in
children’s culture, practical and theoretical
research in children’s theater, and arts tertiary
institutions.
<http://www.dramawebben.se>
The context
The baseline encoding covers the basic
structure of the drama text.
On top of that, it is possible to add semantic
annotation, which goes beyond the text itself,
referring to the action below, behind or beyond
the actual words.
The context (continued)
The selection comprises all plays on Dramawebben
(<http://www.dramawebben.se>) printed between
1880 and 1900: more than 70 plays across all
genres (children’s plays, drama and comedy), by
female as well as male dramatists.
To tempt scholars in the humanities to try semantic
encoding, we have started with one theme,
textile handicraft, which was a recurrent
feature of the plays by female playwrights of the
1880s.
Visualisations
Several visualizations are generated automatically by
eXist-db extensions and apps.
One kind of visualization is the speech division
chart. To get an indication of a skewed relation
between female cast and female speeches, one
can show the female percentage of speeches
and of cast side by side, as in the accompanying chart.
In most plays in the selection, the female share
of speeches is on a par with the percentage of
female speaking roles.
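The two percentages in such a chart can be sketched as follows. This is a minimal illustration in Python, not the eXist-db implementation used by the project. It assumes that each cast member is a `<person>` with an `@sex` value of `F` or `M`, and that each `<sp>` points to its speaker(s) through `@who`. The sample play and the function name `female_shares` are invented for the example.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Invented miniature play, for illustration only.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><profileDesc><particDesc><listPerson>
    <person xml:id="anna" sex="F"/>
    <person xml:id="lisa" sex="F"/>
    <person xml:id="erik" sex="M"/>
    <person xml:id="karl" sex="M"/>
  </listPerson></particDesc></profileDesc></teiHeader>
  <text><body>
    <sp who="#anna"/><sp who="#erik"/><sp who="#karl"/><sp who="#erik"/>
    <sp who="#lisa"/><sp who="#karl"/><sp who="#erik"/><sp who="#karl"/>
  </body></text>
</TEI>"""


def female_shares(tei_xml):
    """Return (female % of speaking roles, female % of speeches)."""
    root = ET.fromstring(tei_xml)
    # Assumption: sex is recorded as @sex="F"/"M" on each <person>.
    sex_by_id = {p.get(XML_ID): p.get("sex") for p in root.iter(TEI + "person")}

    # One entry per speaker reference; @who may list several speakers.
    refs = [ref.lstrip("#")
            for sp in root.iter(TEI + "sp")
            for ref in sp.get("who", "").split()]
    # Ignore references that do not resolve to a listed person.
    refs = [r for r in refs if r in sex_by_id]
    if not refs:
        return (0.0, 0.0)

    roles = set(refs)  # persons with at least one speech
    female_roles = sum(1 for r in roles if sex_by_id[r] == "F")
    female_speeches = sum(1 for r in refs if sex_by_id[r] == "F")
    return (100 * female_roles / len(roles),
            100 * female_speeches / len(refs))


cast_pct, speech_pct = female_shares(SAMPLE)
print(f"female speaking roles: {cast_pct:.0f}%, female speeches: {speech_pct:.0f}%")
```

A play where the second figure is clearly lower than the first would stand out in the side-by-side chart as skewed.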
Handicraft
As an example of thematic coding we used
textile handicraft, since we believe that it can
raise interesting questions and serve as an instructive
example for other forms of semantic encoding.
The coding uses feature structure elements with
key-value pairs.
These can be tied to anchors, so that an annotation
can cover discontinuous stretches of text.
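As a rough sketch of what such an annotation might look like, the Python below builds a TEI `<fs>` with `<f>` name-value pairs and ties it to two separate anchor-delimited stretches. The linking mechanism is an assumption: the slides do not say how anchors and feature structures are connected, so the sketch uses `<span>` elements with `@from`, `@to` and `@ana`. The feature names, values and identifiers are hypothetical, and the TEI namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def feature_structure(fs_id, fs_type, features):
    """Build an <fs> holding one <f name="..."> per key-value pair."""
    fs = ET.Element("fs", {XML_ID: fs_id, "type": fs_type})
    for name, value in features.items():
        f = ET.SubElement(fs, "f", {"name": name})
        ET.SubElement(f, "symbol", {"value": value})
    return fs


def spans_for(fs_id, anchor_pairs):
    """One <span> per (start, end) anchor pair, all pointing at the same <fs>.

    Several spans sharing one @ana make the annotation discontinuous.
    (Assumed linking mechanism, not confirmed by the source.)
    """
    return [ET.Element("span", {"from": "#" + start,
                                "to": "#" + end,
                                "ana": "#" + fs_id})
            for start, end in anchor_pairs]


# Hypothetical example: one handicraft activity covering two stretches of text.
fs = feature_structure("hc1", "handicraft",
                       {"activity": "knitting", "object": "stocking"})
spans = spans_for("hc1", [("a1", "a2"), ("a3", "a4")])

print(ET.tostring(fs, encoding="unicode"))
for span in spans:
    print(ET.tostring(span, encoding="unicode"))
```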
Children’s play and food and drink terms
For children’s play we used the same basic parts of
the model as for handicraft, that is:
activity,
talk about activity,
and play objects.
As another example of potential thematic coding,
we extract food and drink terms. For this I created a
simple hierarchical lexical resource covering cooking
and serving utensils, ingredients, dishes, procedures
and so on. These concept words were expanded
morphologically using other lexical resources to cover
all forms and some spelling variation over time.
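The lexicon-based extraction described above can be sketched like this. It is a minimal illustration, not the project's resource: the categories, lemmas and word forms are a tiny invented sample, and the morphological expansion is written out by hand here rather than derived from other lexical resources.

```python
import re

# Illustrative two-level lexicon: category -> lemma -> expanded word forms.
# The real resource is larger and its expansion comes from other lexicons.
LEXICON = {
    "utensil": {
        "kopp": ["kopp", "koppen", "koppar", "kopparna"],
        "kastrull": ["kastrull", "kastrullen", "kastruller"],
    },
    "drink": {
        "kaffe": ["kaffe", "kaffet"],
        "te": ["te", "teet", "thé"],  # "thé" as an older spelling variant
    },
}

# Invert the hierarchy once so that each word form maps to (lemma, category).
FORM_INDEX = {
    form: (lemma, category)
    for category, lemmas in LEXICON.items()
    for lemma, forms in lemmas.items()
    for form in forms
}


def find_terms(text):
    """Return (token, lemma, category) for every lexicon hit, in text order."""
    hits = []
    for token in re.findall(r"\w+", text.lower()):
        if token in FORM_INDEX:
            lemma, category = FORM_INDEX[token]
            hits.append((token, lemma, category))
    return hits


print(find_terms("Har du kaffet i koppen?"))
```

Looking up whole tokens in an inverted index keeps matching exact, which is why the expansion to inflected forms and spelling variants matters: a form that is not listed is simply not found.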
Occupations
As another example of thematic coding we use
HISCO, the historical variant of the International
Standard Classification of Occupations (ISCO).
HISCO has 10 top-level categories (1–0) and five
levels of subcategories.
Statistics Sweden (SCB) also aligns its Svensk standard för
yrkesklassificering (SSYK) with the ISCO standard.
This makes it possible to compare occupations in
an international context and to link to other
datasets and implementations as Linked Open Data (LOD).
Presence and relations
Presence
Relations
We combine parts of the TEI namesdates module, such as
<listPerson> and <listOrg>, with relations in
<listRelation> elements to create graphs of relations
between persons (cast and non-cast) or sociograms of
interaction on stage (cast only).
Sociograms
Sociograms are created dynamically and can
be based on any criterion for what
constitutes interaction.
The relations can also be weighted by giving a numeric
value to the @sortKey attribute of the <relation>
element.
Other types of graphs can of course also be
created from dynamic data.
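The weighting can be illustrated with a short Python sketch that turns a `<listRelation>` into a weighted edge list. It is not the exist-tei-graphing implementation. It assumes that interactions are encoded as `<relation>` elements with a `@mutual` list of participants, that a missing `@sortKey` counts as weight 1, and that repeated relations between the same pair are summed. The sample data and the function name `weighted_edges` are invented.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import combinations

TEI = "{http://www.tei-c.org/ns/1.0}"

# Invented sample; @sortKey carries the numeric weight of each relation.
SAMPLE = """<listRelation xmlns="http://www.tei-c.org/ns/1.0">
  <relation name="interaction" mutual="#anna #erik" sortKey="5"/>
  <relation name="interaction" mutual="#anna #karl" sortKey="2"/>
  <relation name="interaction" mutual="#erik #anna" sortKey="1"/>
</listRelation>"""


def weighted_edges(list_relation_xml):
    """Map each undirected pair of participants to its summed weight."""
    root = ET.fromstring(list_relation_xml)
    edges = Counter()
    for rel in root.iter(TEI + "relation"):
        # Sort so that (anna, erik) and (erik, anna) become the same edge.
        members = sorted(ref.lstrip("#") for ref in rel.get("mutual", "").split())
        weight = float(rel.get("sortKey", "1"))  # assumed default weight
        for pair in combinations(members, 2):
            edges[pair] += weight
    return dict(edges)


for (a, b), weight in sorted(weighted_edges(SAMPLE).items()):
    print(a, b, weight)
```

Directed relations expressed with `@active` and `@passive` are left out of this sketch; they would need a directed edge list instead of sorted pairs.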
Personal relations
Personal relations are coded by hand.
Every <person> must have at least one
<relation> referencing it.
Organisations can also be part of these
relations.
To differentiate between persons and
organisations in the graphs we make the
<person> nodes elliptic and the <org> ones
rectangular.
Cast persons have a solid node outline while
non-cast persons have a dashed outline.
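The node conventions above can be sketched as a small Python function that writes a graph description. Graphviz DOT is used here only because it is compact; it is a substitute for illustration and is not named in the source. The function name `to_dot` and the sample identifiers are hypothetical.

```python
def to_dot(persons, orgs, edges):
    """Render an undirected relation graph as Graphviz DOT text.

    persons: dict mapping person id -> True if the person is in the cast
    orgs:    iterable of organisation ids
    edges:   iterable of (id, id) pairs
    """
    lines = ["graph relations {"]
    for pid, is_cast in sorted(persons.items()):
        # Persons are elliptic; cast is solid, non-cast is dashed.
        style = "solid" if is_cast else "dashed"
        lines.append(f'  "{pid}" [shape=ellipse, style={style}];')
    for oid in sorted(orgs):
        # Organisations are rectangular.
        lines.append(f'  "{oid}" [shape=box];')
    for a, b in edges:
        lines.append(f'  "{a}" -- "{b}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical data: two cast members, one non-cast person, one organisation.
dot = to_dot(
    persons={"anna": True, "erik": True, "moster": False},
    orgs=["syforening"],
    edges=[("anna", "erik"), ("anna", "moster"), ("anna", "syforening")],
)
print(dot)
```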
Training and evaluation
The freely available (open license) resources can be
used for training and evaluation:
The hand-coded and proof-read referential
strings (names, places)
The hand-coded and proof-read relations
The occupations resource
The period-specific lexical resources that
complement already existing ones
Named entity recognition (NER) resource
based on dw-delkorpus1:
<http://www.dramawebben.se/sites/default/files/dw-delkorpus1/swe-dw1-3class-model.ser.gz>
To be used with the eXist-db stanford-ner app.
NB: This is a fully automatically generated proof of concept,
but it can still be useful for your purposes.
Final Comments
Resources available with open licenses:
<http://www.dramawebben.se>
Tools used or mentioned (not specific to
Dramawebben) with open licenses: eXist-db apps
<https://github.com/ljo/exist-tei-graphing>,
<https://github.com/ljo/exist-sparql> and
<https://github.com/eXist-db/jfreechart>, more
under <https://github.com/ljo/> and
<https://github.com/eXist-db/>
Graphs can be output as SVG, GraphML and
GEXF.