Leif-Jöran Olsson "Dramawebben, The Swedish Drama Web" KB 9 oktober 2015

  1. Overview Numbers and divisions Thematic examples Presence and relations Training and evaluation Final Comments Dramawebben, The Swedish Drama Web Leif-Jöran Olsson Språkbanken, University of Gothenburg, CLARIN-ERIC, SWE-CLARIN 2015-10-09
  2. Overview Dramawebben, The Swedish Drama Web
  3. Acknowledgements: For the parts relating to Dramawebben (The Swedish Drama Web) I gratefully acknowledge financial support from the Swedish Research Council (VR Dnr: 2011-6202). Thanks to project co-workers, Riksteatern and Musikverket.
  4. Text Encoding Initiative (TEI): De facto standard for text encoding in the humanities. Organized in modules. Using XML Schema.
  5. Goals of Dramawebben: The project includes a baseline corpus of plays annotated with TEI drama markup. Develop exploration and visualization tools. Engage a vibrant community. Educate students in TEI encoding and let them act as ambassadors spreading the word. Target disciplines within the humanities, such as linguistics, literary and theater history, studies in children’s culture, practical and theoretical research in children’s theater, and arts tertiary institutions. <http://www.dramawebben.se>
  6. Perspective of Dramawebben: The perspective is that of the dramatic text as a working text.
  7. The context: The baseline encoding covers the basic structure of the drama text. On top of that, it is possible to add semantic annotation, which goes beyond the text itself, referring to the action below, behind or beyond the actual words.
  8. The context (continued): All plays on Dramawebben (<http://www.dramawebben.se>) printed 1880-1900 were selected. The selection ended up including 70+ plays in all genres: children’s plays, drama and comedy, by female as well as male dramatists. To tempt scholars in the humanities to try semantic encoding, we started with one theme, textile handicraft, which was a recurrent feature of the plays by female playwrights of the 1880s.
  9. Visualizations: We produce several visualizations automatically with eXist-db extensions and apps. One kind of visualization is the speech division chart. To reveal a skewed relation between the female share of the cast and the female share of speeches, one can show the two percentages side by side, as in the following chart. In most plays in the selection, the female share of speeches matches the percentage of female speaking roles.
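The side-by-side comparison described on this slide could be computed along the following lines. This is a minimal sketch, not the eXist-db implementation used by the project: it assumes that each speaker is a `<person>` with a `@sex` attribute and that each `<sp>` points to its speaker through `@who`. The sample document and the function name `female_shares` are invented for illustration.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Invented toy document; the encoding conventions are assumptions.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <listPerson>
    <person xml:id="a" sex="F"/>
    <person xml:id="b" sex="M"/>
    <person xml:id="c" sex="M"/>
    <person xml:id="d" sex="F"/>
  </listPerson>
  <body>
    <sp who="#a"><p>...</p></sp>
    <sp who="#b"><p>...</p></sp>
    <sp who="#b"><p>...</p></sp>
    <sp who="#c"><p>...</p></sp>
    <sp who="#a"><p>...</p></sp>
    <sp who="#b"><p>...</p></sp>
    <sp who="#c"><p>...</p></sp>
    <sp who="#b"><p>...</p></sp>
  </body>
</TEI>"""


def female_shares(tei_xml):
    """Return the female percentage of speaking roles and of speeches."""
    root = ET.fromstring(tei_xml)
    sex_by_id = {p.get(XML_ID): p.get("sex") for p in root.iter(TEI + "person")}

    speakers = set()   # ids of roles that actually speak
    speeches = 0       # one count per (speech, speaker) pair
    female_speeches = 0
    for sp in root.iter(TEI + "sp"):
        for ref in (sp.get("who") or "").split():
            speaker = ref.lstrip("#")
            speakers.add(speaker)
            speeches += 1
            if sex_by_id.get(speaker) == "F":
                female_speeches += 1

    if not speakers:
        return {"cast_pct": 0.0, "speech_pct": 0.0}
    female_speakers = sum(1 for s in speakers if sex_by_id.get(s) == "F")
    return {
        "cast_pct": 100.0 * female_speakers / len(speakers),
        "speech_pct": 100.0 * female_speeches / speeches,
    }


shares = female_shares(SAMPLE)
```

In the toy document one of three speaking roles is female but only two of eight speeches are, so the two percentages diverge; the silent role `d` is left out of the cast share because only speaking roles are compared.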
  10. Speaker gender and speaker division
  11. Speaker gender division per play
  12. Female and child speeches division per play
  13. Roles in cast list (and added) per play
  14. Result of searching for plays with a specific number of roles (excerpt)
  15. Handicraft: As a thematic coding we used the example of textile handicraft, since we believe that it can generate exciting issues and serve as an instructive example for other forms of semantic encoding. The annotations use feature structure elements with key-value pairs. They can be tied to anchors to make them discontinuous.
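To illustrate what a feature structure with key-value pairs looks like when read programmatically, here is a minimal sketch using the TEI `<fs>` and `<f>` elements. The feature names (`aspect`, `craft`, `object`) and their values are hypothetical and are not taken from the Dramawebben model; tying annotations to anchors is not covered here.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical annotation; feature names and values are invented.
SAMPLE = """<fs xmlns="http://www.tei-c.org/ns/1.0" type="handicraft">
  <f name="aspect"><symbol value="activity"/></f>
  <f name="craft"><symbol value="knitting"/></f>
  <f name="object"><string>stocking</string></f>
</fs>"""


def fs_to_dict(fs):
    """Flatten the <f> children of one <fs> element into key-value pairs."""
    pairs = {}
    for f in fs.findall(TEI + "f"):
        symbol = f.find(TEI + "symbol")
        if symbol is not None:
            # <symbol> carries its value in the @value attribute
            pairs[f.get("name")] = symbol.get("value")
            continue
        string = f.find(TEI + "string")
        if string is not None:
            # <string> carries its value as text content
            pairs[f.get("name")] = (string.text or "").strip()
    return pairs


annotation = fs_to_dict(ET.fromstring(SAMPLE))
```

Once flattened like this, annotations can be filtered or counted per play with ordinary dictionary operations.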
  16. The model (simplified)
  17. Ongoing handicraft in speeches per play
  18. Children’s play and food and drink terms: For children’s play we used the same basic parts of the model as for handicraft, that is: activity, talk about activity, and play objects. As another example of potential thematic coding we extract food and drink terms. For this I created a simple hierarchical lexical resource with cooking and serving utensils, ingredients, dishes, procedures, et cetera. These concept words were expanded morphologically with other lexical resources to cover all forms and some spelling variation over time.
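The extraction of food and drink terms can be sketched as a lookup from expanded word forms back to concepts in a small hierarchy. The lexicon below is a hypothetical miniature written by hand; the actual resource and the lexical resources used for morphological expansion are not reproduced here, and the category names are assumptions.

```python
import re
from collections import Counter

# Hypothetical miniature lexicon: category -> concept -> word forms.
# In the project the forms come from morphological expansion; here they
# are simply listed, including one older spelling ("thé").
LEXICON = {
    "utensils": {"kopp": ["kopp", "koppen", "koppar", "kopparna"]},
    "dishes": {"soppa": ["soppa", "soppan", "soppor"]},
    "drinks": {"te": ["te", "teet", "thé"]},
}


def build_index(lexicon):
    """Map every word form to its (category, concept) pair."""
    index = {}
    for category, concepts in lexicon.items():
        for concept, forms in concepts.items():
            for form in forms:
                index[form] = (category, concept)
    return index


def count_terms(text, index):
    """Count occurrences of lexicon concepts in a text, ignoring case."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(index[token] for token in tokens if token in index)


INDEX = build_index(LEXICON)
counts = count_terms(
    "Vill du ha en kopp thé? Koppen står på bordet, och soppan är varm.", INDEX
)
```

Counting at the concept level rather than the form level means that inflected forms and older spellings are aggregated under one entry per play.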
  19. Food and drink terms per play
  20. Occupations: As another example of thematic coding we use HISCO, the historical variant of the International Standard Classification of Occupations (ISCO). 10 top-level categories (1–0) and five levels of subcategories. Statistics Sweden (SCB) also aligns its Swedish standard for occupational classification (svensk standard för yrkesklassificering, SSYK) with the ISCO standard. This makes it possible to compare occupations in an international context and to link, as linked open data (LOD), to other datasets and implementations.
  21. Occupations in play text (difference HISCO 5-8, ISCO 0)
  22. Occupations of role characters
  23. Most common occupations in one selection for role characters (count, code, title):
      18  54020  arbetspiga
      16  58320  amiralitetslöjtnantdotter
      13  14120  adjunkt
      10  14140  elisabetsyster
       8  15120  f.d. författare
       8  99900  arbetare
       7  -1     allmosehjon
       7  20210  andre legationssekreterare
       7  17320  aktris
       6  58220  biträdande vaktkonstapel
       5  17120  kompositör
       5  54010  f.d. dräng
       5  55130  auktionsvaktmästare
       5  20110  borgarråd
       4  17140  dragspelerska
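The counts on this slide can be aggregated by top-level category. The sketch below assumes that the first digit of a five-digit code identifies its major group and that the code `-1` marks an occupation without a classification; both are assumptions for illustration, not statements from the slides.

```python
from collections import Counter

# Rows as listed on the slide: (count, code, title).
ROWS = [
    (18, "54020", "arbetspiga"),
    (16, "58320", "amiralitetslöjtnantdotter"),
    (13, "14120", "adjunkt"),
    (10, "14140", "elisabetsyster"),
    (8, "15120", "f.d. författare"),
    (8, "99900", "arbetare"),
    (7, "-1", "allmosehjon"),
    (7, "20210", "andre legationssekreterare"),
    (7, "17320", "aktris"),
    (6, "58220", "biträdande vaktkonstapel"),
    (5, "17120", "kompositör"),
    (5, "54010", "f.d. dräng"),
    (5, "55130", "auktionsvaktmästare"),
    (5, "20110", "borgarråd"),
    (4, "17140", "dragspelerska"),
]


def counts_by_major_group(rows):
    """Sum counts per leading digit; non-numeric codes go to 'unclassified'."""
    totals = Counter()
    for count, code, _title in rows:
        # Assumption: the first digit of a numeric code is its major group.
        group = code[0] if code.isdigit() else "unclassified"
        totals[group] += count
    return totals


totals = counts_by_major_group(ROWS)
```

For the rows above this gives 50 for codes starting with 5, 47 for codes starting with 1, 12 for codes starting with 2, 8 for codes starting with 9, and 7 unclassified.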
  24. Link to images for occupation
  25. Link to history of work DB
  26. Presence and relations: Topics: presence, relations. We combine parts of the namesdates module, such as <listPerson> and <listOrg>, with relations in <listRelation> elements to create graphs of relations between persons (cast and non-cast) or sociograms of interaction on stage (cast only).
  27. Presence
  28. Sociograms: Sociograms are created dynamically and can be based on any criterion for what constitutes interaction. Edges can also be weighted by giving a numeric value to the @sortKey attribute of the <relation> element. Other types of graphs can of course also be created from dynamic data.
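Reading weighted edges out of a `<listRelation>` could look roughly like this. It is a sketch under stated assumptions: relations are symmetric and expressed with `@mutual`, a missing or non-numeric `@sortKey` falls back to a default weight, and the identifiers and weights in the sample are invented. The project's own eXist-db apps may handle this differently.

```python
import xml.etree.ElementTree as ET
from itertools import combinations

TEI = "{http://www.tei-c.org/ns/1.0}"

# Invented sample; ids and weights are hypothetical.
SAMPLE = """<listRelation xmlns="http://www.tei-c.org/ns/1.0">
  <relation name="interaction" mutual="#p1 #p2" sortKey="5"/>
  <relation name="interaction" mutual="#p2 #p3" sortKey="2"/>
  <relation name="interaction" mutual="#p1 #p3"/>
</listRelation>"""


def weighted_edges(list_relation_xml, default_weight=1.0):
    """Return {(id_a, id_b): weight} for every pair named in @mutual."""
    root = ET.fromstring(list_relation_xml)
    edges = {}
    for rel in root.iter(TEI + "relation"):
        ids = sorted(ref.lstrip("#") for ref in (rel.get("mutual") or "").split())
        try:
            weight = float(rel.get("sortKey", default_weight))
        except ValueError:
            # Non-numeric sort keys are treated as unweighted.
            weight = default_weight
        # Every pair of participants in a mutual relation becomes an edge;
        # repeated pairs accumulate their weights.
        for a, b in combinations(ids, 2):
            edges[(a, b)] = edges.get((a, b), 0.0) + weight
    return edges


edges = weighted_edges(SAMPLE)
```

Sorting the identifiers before pairing keeps `(p1, p2)` and `(p2, p1)` from being stored as two different edges.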
  29. Sociogram for “The Father”
  30. Sociogram for “The Father”, alternative view
  31. Personal relations: Personal relations are coded by hand. Every <person> must have at least one <relation> referencing it. Organisations can also take part in these relations. To differentiate between persons and organisations in the graphs, <person> nodes are drawn as ellipses and <org> nodes as rectangles. Cast persons have a solid node outline, while non-cast persons have a dashed outline.
  32. Personal relations (continued): We have followed the default of three <relation> @type values: “personal”, “social”, and “other”. These are represented by dashed, solid, and dotted edges respectively.
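The drawing conventions on the last two slides can be expressed as a small mapping from node and relation properties to graph attributes. The sketch below emits Graphviz DOT, which is an assumption chosen for illustration; the node identifiers and the function name `to_dot` are hypothetical.

```python
# Conventions from the slides: persons are ellipses, organisations are
# rectangles; cast nodes have a solid outline, non-cast nodes a dashed one;
# edge style depends on the relation type.
NODE_SHAPE = {"person": "ellipse", "org": "box"}
EDGE_STYLE = {"personal": "dashed", "social": "solid", "other": "dotted"}


def to_dot(nodes, edges):
    """Render nodes {id: (kind, in_cast)} and edges [(a, b, type)] as DOT."""
    lines = ["graph relations {"]
    for node_id, (kind, in_cast) in sorted(nodes.items()):
        outline = "solid" if in_cast else "dashed"
        lines.append(f'  "{node_id}" [shape={NODE_SHAPE[kind]}, style={outline}];')
    for a, b, rel_type in edges:
        lines.append(f'  "{a}" -- "{b}" [style={EDGE_STYLE[rel_type]}];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical example: two cast persons, one non-cast person, one organisation.
NODES = {
    "p1": ("person", True),
    "p2": ("person", True),
    "p3": ("person", False),
    "o1": ("org", False),
}
EDGES = [("p1", "p2", "personal"), ("p1", "o1", "social"), ("p2", "p3", "other")]

dot_source = to_dot(NODES, EDGES)
```

Keeping the style rules in two lookup tables makes it straightforward to change a convention in one place or to add further relation types.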
  33. Person relations in “Fröken Julie”
  34. Person relations in “The Father”
  35. Training and evaluation: The freely available (open license) resources can be used for training and evaluation: the hand-coded and proofread referential strings (names, places); the hand-coded and proofread relations; the occupations resource; the time-specific complementary lexical resources, in addition to already existing ones.
  36. Named entity recognition (NER) based on dw-delkorpus1/dw-delkorpus2
  37. Named entity recognition (NER) resource based on dw-delkorpus1: <http://www.dramawebben.se/sites/default/files/dw-delkorpus1/swe-dw1-3class-model.ser.gz>, to be used with the eXist-db stanford-ner app. NB: This is a fully automatically generated proof of concept, but it can still be useful for your purposes.
  38. Female talk not about men
  39. Final Comments: Resources available with open licenses: <http://www.dramawebben.se>. Tools used or mentioned (not Dramawebben-specific) with open licenses: the eXist-db apps <https://github.com/ljo/exist-tei-graphing>, <https://github.com/ljo/exist-sparql> and <https://github.com/eXist-db/jfreechart>; more under <https://github.com/ljo/> and <https://github.com/eXist-db/>. Graphs can be output as SVG, GraphML and GEXF.