Automatic Heritage Metadata Enrichment with Historic Events

Here are the slides from the Agora presentation at Museums and the Web 2011. Johan Oomen and Marieke van Erp presented this first version of the Agora event extraction system, which enriches museum collections with historic events, during the Linked Data Session on Thursday 7 April 2011.

For more information see http://agora.cs.vu.nl

Published in: Technology

Slide notes:
  • Slide 28: Entirely correct 68.4%, partially correct 13.3%
  • Slide 29: Persons 77.08%, Location 65.8%
  • Slide 33: 45.6% actor correct, 41.1% location correct, and 51.5% date correct

  • Automatic Heritage Metadata Enrichment with Historic Events

    1. Automatic Heritage Metadata Enrichment with Historic Events • Marieke van Erp, Johan Oomen, Roxane Segers, Chiel van den Akker, Lora Aroyo, Geertje Jacobs, Susan Legêne, Lourens van der Meij, Jacco van Ossenbruggen and Guus Schreiber • http://agora.cs.vu.nl • 7/4/11 MW2011
    2. The Agora Project • 2009-2013 • VU Amsterdam (CS + History Departments), Netherlands Institute for Sound and Vision, Rijksmuseum Amsterdam • Funded by the Netherlands Organisation for Scientific Research (NWO) within the CATCH research programme
    3. “Enabling anything like seamless access to the cultural record will require developing tools to navigate among vast catalogs of born-digital and digitized materials, […] The return on this investment will be a humanities and social science cyberinfrastructure that will allow new questions to be asked, new patterns and relations to be discerned, and deep structures in language, society, and culture to be exposed and explored.” http://www.acls.org/programs/Default.aspx?id=644
    4. Gabriel Metsu (17th-century Dutch painter)
    5. [no slide text]
    6. Networked heritage?
    7. Europeana ~15,578,850
    8. Europeana Thoughtlab
    9. Europeana Thoughtlab
    10. Baseline: matching
    11. Baseline: matching • Metadata for the object
    12. Baseline: matching • Metadata for the object • Controlled place name from a vocabulary at the Rijksmuseum
    13. A "more specific Egypt"?
    14. [no slide text]
    15. [no slide text]
    16. were present at…
    17. [no slide text]
    18. Location is...
    19. [no slide text]
    20. role is…
    21. [no slide text]
    22. is part of…
    23. Why enrichment? • Historical context is missing • What happened before/after • ‘Grand narratives’ • Based on keyword search • Exact matches • ...and manual annotation is costly
    24. http://www.bl.uk/learning/timeline/
    25. Simple Event Model
    26. The Pipeline
    27. The Pipeline
    28. Recognising Events • Pattern-based approach to find Event names • “during the” <NP> • “after the” <NP>
    29. Recognising Events • Machine learning based approach to recognise persons and locations • Retrained Stanford NER system for Dutch • Regular expressions to recognise temporal expressions • [0-9]{1,2}/[0-9]{1,2}/[0-9]{4}
    30. The Pipeline
    31. Linking Event Elements • Check which pairs of event names and persons, locations or times co-occur most within a Wikipedia paragraph • Rank by most frequent
    32. The Pipeline
    33. Event Instances
    34. Thesaurus 1 | Thesaurus 2 | Direct Links | Links Via Events
        Rijksmuseum events | Sound and Vision locations | - | 20
        Rijksmuseum events | Sound and Vision people | - | 15
        Rijksmuseum people | Sound and Vision locations | - | 300
        Rijksmuseum locations | Sound and Vision people | 7 | 297
        Rijksmuseum events | Rijksmuseum locations | 488
        Rijksmuseum events | Rijksmuseum people | 395
    35. New Links
    36. Event-driven Browsing
    37. Conclusions • Events provide a framework for collection data enrichment • Language technology can be used to identify events • Events provide meaningful links between different collections
    38. Future Work
    39. Future Work • Fine-tuning event extraction approach • English version of the system • User involvement to improve event relations • User-generated narratives
    40. ? ? ¿ ¿ ? ¿ ? ¿ ? ¿
    41. Credits • publications etc. http://agora.cs.vu.nl • Thanks to the Web & Media Group at VU University for inspiration & images • Follow us on Twitter • @agoraproject • @johanoomen
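
The recognition and linking steps on slides 28, 29 and 31 can be illustrated with a small Python sketch. This is not the Agora code, only a rough approximation under stated assumptions: the <NP> slot in the event patterns is approximated by a run of capitalised words (the slides do not say how noun phrases are detected), persons and locations are passed in as a ready-made list instead of coming from the retrained Stanford NER system, and the function names (`find_event_names`, `find_dates`, `rank_links`) and sample sentences are invented for illustration. The date expression is the one quoted on slide 29.

```python
import re
from collections import Counter

# Temporal expression quoted on slide 29, e.g. 23/9/1830.
DATE_RE = re.compile(r"[0-9]{1,2}/[0-9]{1,2}/[0-9]{4}")

# Slide 28 patterns: "during the" <NP> and "after the" <NP>.
# Assumption: <NP> is approximated by a run of capitalised words;
# a real system would use a chunker or parser here.
EVENT_RE = re.compile(
    r"\b(?:[Dd]uring|[Aa]fter) the ((?:[A-Z][\w-]*)(?: [A-Z][\w-]*)*)"
)


def find_event_names(text):
    """Return candidate event names found by the two patterns."""
    return EVENT_RE.findall(text)


def find_dates(text):
    """Return all substrings matching the slide 29 date expression."""
    return DATE_RE.findall(text)


def rank_links(paragraphs, known_entities):
    """Count how often an event name co-occurs with a person, location
    or date inside one paragraph, and rank pairs by frequency (slide 31).

    known_entities stands in for NER output (hypothetical input).
    """
    counts = Counter()
    for paragraph in paragraphs:
        events = set(find_event_names(paragraph))
        elements = {e for e in known_entities if e in paragraph}
        elements.update(find_dates(paragraph))
        for event in events:
            for element in elements:
                counts[(event, element)] += 1
    return counts.most_common()


if __name__ == "__main__":
    # Invented sample paragraphs, for illustration only.
    sample = [
        "During the French Revolution, Napoleon Bonaparte rose in Paris.",
        "After the French Revolution, Napoleon Bonaparte took power.",
        "During the Belgian Revolution, fighting reached Brussels on 23/9/1830.",
    ]
    entities = ["Napoleon Bonaparte", "Paris", "Brussels"]
    for pair, count in rank_links(sample, entities):
        print(count, pair)
```

Ranking by raw co-occurrence count keeps the sketch close to the "rank by most frequent" wording on slide 31; any normalisation or thresholding would be a further assumption.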