Enriching Media Collections for Event-based Exploration


Slides for the MTSR2017 presentation on event enrichment in DIVE+ in the context of CLARIAH.

By: Victor de Boer, Liliana Melgar, Oana Inel, Carlos Martinez Ortiz, Lora Aroyo, and Johan Oomen

Abstract: Scholars currently have access to large heterogeneous media collections on the Web, which they use as sources for their research. Exploration of such collections is an important part of their research, in which scholars make sense of these heterogeneous datasets. Knowledge graphs that relate media objects, people, and places with historical events can provide a valuable structure for more meaningful and serendipitous browsing. Based on an extensive requirements analysis done with historians and media scholars, we present a methodology to publish, represent, enrich, and link heritage collections so that they can be explored by domain expert users. We present four methods to derive events from media object descriptions. We also present a case study in which four datasets with mixed media types are made accessible to scholars, and we describe the building blocks for event-based proto-narratives in the knowledge graph.
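
As a minimal sketch of the kind of structure the abstract describes, the Python below stores a few triples linking events to actors, places, and media objects, and then collects everything attached to one event. The `ex:` identifiers, the specific links between them, and the direction of `dive:depictedBy` (event to media object) are assumptions made for illustration; they are not taken from the DIVE+ data.

```python
from collections import defaultdict

# Hypothetical triples for illustration only. The ex: identifiers and the
# links between them are assumptions, as is the direction of
# dive:depictedBy (event -> media object).
TRIPLES = [
    ("ex:event1", "rdf:type", "sem:Event"),
    ("ex:event1", "sem:hasActor", "ex:actor1"),
    ("ex:event1", "sem:hasPlace", "ex:place1"),
    ("ex:event1", "dive:depictedBy", "ex:media1"),
    ("ex:event1", "dive:depictedBy", "ex:media2"),
    ("ex:event2", "rdf:type", "sem:Event"),
    ("ex:event2", "sem:hasPlace", "ex:place1"),
    ("ex:event2", "dive:depictedBy", "ex:media3"),
]


def build_index(triples):
    """Index triples as subject -> predicate -> list of objects."""
    index = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in triples:
        index[subject][predicate].append(obj)
    return index


def describe_event(index, event):
    """Collect the actors, places, and media objects attached to one event."""
    props = index.get(event, {})
    return {
        "actors": list(props.get("sem:hasActor", [])),
        "places": list(props.get("sem:hasPlace", [])),
        "media": list(props.get("dive:depictedBy", [])),
    }


def events_at_place(index, place):
    """Return the events that share a place, a simple hook for browsing."""
    return sorted(
        subject
        for subject, props in index.items()
        if place in props.get("sem:hasPlace", [])
    )


if __name__ == "__main__":
    idx = build_index(TRIPLES)
    print(describe_event(idx, "ex:event1"))
    print(events_at_place(idx, "ex:place1"))
```

Indexing by subject and predicate keeps the lookup for a single event cheap, and the shared-place query hints at how two otherwise unrelated media objects can end up in the same browsing path.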

Published in: Education

  1. 1. ENRICHING MEDIA COLLECTIONS FOR EVENT-BASED EXPLORATION Victor de Boer, Liliana Melgar, Oana Inel, Carlos Martinez Ortiz, Lora Aroyo, and Johan Oomen MTSR 2017
  2. 2. Cultural Heritage Collections becoming available as Linked Open Data
  3. 3. Support exploratory, event-centric browsing of multiple, heterogeneous collections for Media Scholars
  4. 4. DIVE+ Case study
  5. 5. DIVE+ Collections and Vocabularies
      - OPENIMAGES.EU: 3,220 news broadcasts, Netherlands Institute for Sound & Vision; GTAA thesaurus
      - DELPHER.NL: 197,199 scans of radio bulletins, 1937–1984
      - AMSTERDAM MUSEUM: 73,447 cultural heritage objects; AM Thesaurus
      - TROPENMUSEUM: 78,270 cultural heritage objects; SVNC thesaurus
  6. 6. Goal: develop explorable Knowledge Graph. Interactive Exploration & Discovery in Context: linking objects to events and entities; building automatic storylines (proto-narratives)
  7. 7. Our recipe
  8. 8. 1. Mapping to generic schema (DIVE+). Mapping to popular vocabularies: am:obj_22093 am:contentPersonName “Job Cohen”; am:contentPersonName rdfs:subPropertyOf dcterms:subject
  9. 9. Simple Event Model (SEM). Van Hage, W. R., Malaisé, V., Segers, R., Hollink, L., & Schreiber, G. (2011). Design and use of the Simple Event Model (SEM). Web Semantics: Science, Services and Agents on the World Wide Web, 9(2), 128-136.
  10. 10. DIVE+ Generic data model
      - Classes: sem:Event, sem:Actor, sem:Place, sem:Time, skos:Concept, dive:MediaObject, oa:Annotation
      - Media object properties: rdfs:label, dive:source, dive:placeholder, dc:identifier, dc:description, etc.
      - Annotation properties: oa:hasBody, oa:hasTarget
      - Relations: dive:depictedBy; sem:hasActor, sem:hasPlace, sem:hasTime; dive:isRelatedTo; skos:broader, skos:narrower, etc.
  11. 11. DIVE+ manually created RDFS mapping files. Number of mapping triples: OI 3; NB - (conversion in project); AM 12; TM 18
  13. 13. Where do we get events from?
      - Original Metadata: LIDO, CIDOC, EDM; creationDateStart
      - Interpretation of content: interpretation of object
      - Named Entity Recognition: NLP tools, other pipelines
      - Human computation: crowdsourcing, nichesourcing
      - Hybrid pipeline
  14. 14. Original Metadata: am:Belgische opstand; am:besnijdenis; am:Beurs de Keyser; am:bevrijding; am:bezoekerscentrum; am:bibliotheken; am:Bijlmerramp; am:Boulevard of Broken Dreams; am:brand; am:brand van het oude stadhuis op de Dam; am:burgeroorlog; am:capitulatie; am:christendom geboorte van Christus; am:christendom kruisiging; am:christendom opstanding van Christus; am:christus aan het kruis; am:Christus schrijft op de grond; am:concert. Example object: “Fayence bord”
  15. 15. Crowdsourcing for Events in Texts & Videos
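  16. 16. FROG NLP toolkit: NER event extraction (Victor Kramer). Description → Event:
      - “Foto is genomen tijdens de Eerste Zuid Nieuw-Guinea Expeditie” (Photo taken during the First South New Guinea Expedition) → Eerste Zuid Nieuw-Guinea Expeditie
      - “Foto is genomen tijdens de Eerste- of de Tweede Zuid Nieuw-Guinea Expeditie” (Photo taken during the First or the Second South New Guinea Expedition) → Tweede Zuid Nieuw-Guinea Expeditie
      - “Masker gedragen tijdens oogstfeesten. Het feest in kwestie is het Sokari spel dat eenmaal per jaar wordt opgevoerd gedurende zeven opeenvolgende nachten na Nieuwjaar, medio april. …” (Mask worn during harvest festivals. The festival in question is the Sokari play, which is performed once a year on seven consecutive nights after New Year, mid-April. …) → Nieuwjaar (New Year)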
  17. 17. Radio news bulletins: Every object 1 event
  19. 19. Interactive vocabulary alignment
  20. 20. Result: Explorable Knowledge graph
      - Media objects (DIVE:MediaObject): “Nieuws uit Indonesië: opheffing van het KNIL” (News from Indonesia: dissolution of the KNIL); “Mannen bij het huis van Paul Spies aan de Parapattan 42, Djakarta” (Men at the house of Paul Spies at Parapattan 42, Djakarta); ANP:1950-08-11:50; Schaal
      - Events (sem:Event): ANP:1950-08-11:50; ontbindingsceremonie (dissolution ceremony)
      - Times (sem:Time): 25 Juli 1950; 11 Augustus 1950
      - Places (sem:Place): Djakarta; Indonesië
      - Actors (sem:Actor): “Mohammed Hatta”
      - Properties shown: dive:depictedBy, sem:hasTimestamp, sem:hasPlace, sem:hasActor, dive:isRelatedTo, dive:relatedPlace, dive:relatedActor
  21. 21. DIVE+ Enrichments

      | Collection | Enrichment method | Media Objects | Actors | Places | Events | Other | Alignments |
      |---|---|---|---|---|---|---|---|
      | OI | Crowd + NER | 3,204 | 1,249 | 1,412 | 1,916 | 185,846 | 623 |
      | NB | Interpreted + NER | 197,200 | 194,890 | 54,571 | 197,200 | 6,736 | 6,353 |
      | AM | original thesaurus | 73,447 | 66,966 | 5,973 | 148 | 28,047 | 6,865 |
      | TM | original thesaurus + FROG NER | 78,226 | 27,829 | 3,896 | 23* | 13,269 | - |
      | Total | | 352,077 | 290,934 | 65,852 | 199,264 | 233,898 | - |

      *) more to come
  22. 22. DIVE+ path fragments

      | Subject-Object | Property supertype | Count |
      |---|---|---|
      | Media Object-Event | dive:depictedBy or dive:isRelatedTo | 199,233 |
      | Event-Actor | sem:hasActor | 265,677 |
      | Event-Place | sem:hasPlace | 220,726 |
      | Event-Concept | dive:isRelatedTo | 230 |
  23. 23. ClioPatria triple store: 15M triples (for now); SPARQL endpoint. Provenance management at Named Graph level
  24. 24. DIVE+ UI API Layer
  25. 25. DIVE+ UI: INFINITY OF EXPLORATION. Supports exploration and serendipity; visual inspection of media objects and entities; lets users build, save, and share proto-narratives
  27. 27. filters; results; ordering
  28. 28. filter on media objects; order media objects by date
  29. 29. filter on events
  30. 30. explore event-related entities
  31. 31. explore event-related entities
  32. 32. place entity exploration
  33. 33. narrative
  34. 34. bookmarking
  35. 35. Take home: generic data model for connecting heterogeneous media collections; various data enrichment strategies to construct explorable event-centric knowledge graphs; DIVE+ case study
  36. 36. DIVE+
  37. 37. DIVE+ team
  38. 38. Current work: (Common) Event thesaurus? (Jessie Both & Didi de hooge)
      - Candidate event terms: Februaristaking WOII (February strike, WWII); Februaristaking (February strike); stakingen (strikes)
      - “De oproep 'Staakt!' voor deelname aan de februaristaking te Amsterdam op 25 en 26 februari 1941.” (The call 'Strike!' to take part in the February strike in Amsterdam on 25 and 26 February 1941.)
      - Eduard Hellendoorn: “Joseph Eijl Eduard Hellendoorn Hermanus Coenradi 13 maart 1941 gefusilleerd Waalsdorpervlakte” (executed by firing squad on 13 March 1941, Waalsdorpervlakte)
      - Waalsdorpervlakte
  39. 39. 3. Alignments to vocabularies
      - OI data: sem:Event oi:Opening_afsluitdijk; sem:Actor / dive:Person oi:Ingenieur_Lely; sem:Place oi:Afsluitdijk; dive:MediaObject / dive:Video oi:9999 (deo1.mpg); oa:Annotation dive:9999ann
      - KB data: dive:MediaObject / dive:Image kb:image2; sem:Actor / dive:Person KB:Lely; sem:Place / dive:Place kb:DenHaag1
      - GTAA: skos:Concept gtaa:lely; skos:Concept gtaa:DenHaag; skos:Concept gtaa:Zuid-Holland
      - Properties shown: dive:depictedBy; oa:hasBody, oa:hasTarget; sem:hasActor, sem:hasPlace; dive:isRelatedTo, dive:relatedActor, dive:relatedPlace; skos:broader
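
The alignment idea on the last slide can be sketched in a few lines of Python: entities from different collections that point at the same vocabulary concept are grouped together, and `skos:broader` links are followed to reach wider concepts. This is only a sketch under assumptions. The identifiers come from the slide, but which entity is linked to which concept is not recoverable from the transcript, so the `ALIGNMENTS` and `BROADER` mappings below are hypothetical.

```python
# Hypothetical alignment links (entity -> vocabulary concept). The identifiers
# appear on the slide; the pairing of entities with concepts is an assumption.
ALIGNMENTS = {
    "oi:Ingenieur_Lely": "gtaa:lely",
    "KB:Lely": "gtaa:lely",
    "kb:DenHaag1": "gtaa:DenHaag",
}

# Hypothetical skos:broader links (concept -> broader concept).
BROADER = {"gtaa:DenHaag": "gtaa:Zuid-Holland"}


def aligned_entities(alignments):
    """Group entities by shared concept, keeping concepts with 2+ entities."""
    groups = {}
    for entity, concept in alignments.items():
        groups.setdefault(concept, []).append(entity)
    return {
        concept: sorted(entities)
        for concept, entities in groups.items()
        if len(entities) > 1
    }


def broader_chain(concept, broader):
    """Follow skos:broader links upward, stopping if a cycle is detected."""
    chain = []
    seen = {concept}
    while concept in broader:
        concept = broader[concept]
        if concept in seen:
            break
        seen.add(concept)
        chain.append(concept)
    return chain


if __name__ == "__main__":
    print(aligned_entities(ALIGNMENTS))
    print(broader_chain("gtaa:DenHaag", BROADER))
```

Grouping on the shared concept is what lets a browsing path cross from one collection to another, and the broader chain lets a query for a province also reach entities aligned to a city within it.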