20140130 metadata vocabularies_and_cultural_heritage_final

Advocates the use of the Event class to express the dynamics of things happening. This may be particularly useful when connecting concepts across domain boundaries.

Published in: Technology

  1. Metadata Vocabularies and Cultural Heritage: Reconciling static and dynamic views. Gerard Kuys, January 30, 2014
  2. The Tower of Progress (*)
     (*) Mundaneum illustrations from: Françoise Levie, L’homme qui voulait classer le monde (2006)
  3. The Tower of Progress
  4. Why linking data, and not a single frame for them all?
     • Throughout the web, there is no single point of truth; there may be one only for an individual person or organisation
     • Accordingly, people and organisations have organised their way of describing their universes following their own singular pattern
     • If we are going to link scattered data within scattered vocabularies, do we need to compel them all to fit within a single framework?
     • AAA(AAF): anybody can say anything about anything (in Almost Any Fashion)
     • But DBpedia can be used as a ‘hub of meanings’, providing a common reference for them all
  5. Connecting content, or rather: connecting meaningful concepts
     • Until now, every type of data collection has had its own way of describing its domain
     • How could we construct a common way of describing related domains?
     • Like Paul Otlet, we should move from Documents to Bits-of-Knowledge, from the Web of Documents to the Web of Data
     • But what current models lack is dynamics, not least the DBpedia ontology
     • The ‘collection model’ versus the ‘event model’
     • CIDOC and FRBRoo (*) try to bridge this gap by introducing a dual view:
         • Static view (what are the entities and artefacts)
         • Dynamic view (how did these entities and artefacts come about)
     • The Europeana Data Model has quite recently elaborated on events enough to offer a choice between ‘object models’ and ‘event models’
     • Could we ever reconcile these two views within the DBpedia ontology and, if so, how?
     (*) http://www.cidoc-crm.org/frbr_inro.html
  6. Why are dynamics so important?
     • Emerging trends in cultural productions:
         • Interactive additions to content offered
         • Semantic storytelling (like the BBC is doing)
         • Museums presenting their stuff by way of the ‘journey’ metaphor
         • Various interpretations and annotations (‘provenance’ will prove to be crucial)
     • Might be a valuable addition to collection modelling as well (e.g., reprints)
     • Enriching texts, like publishers do, will bring about ever more versions of a text
     • ‘Semantic publishing’
  7. How to get dynamics into the DBpedia ontology? Option 1: Incorporating CIDOC-CRM
  8. How to get dynamics into the DBpedia ontology? Option 1: Replicating CIDOC-CRM
     • DBpedia ontology might borrow some notions from CIDOC CRM
     • But replicating it all is not a good idea
     • This is Linked Open Data, after all
  9. CIDOC CRM on Life Cycle Events: The E2 Temporal Entity Hierarchy
  10. How to get dynamics into the DBpedia ontology? Option 2: Incorporating a model of Events
  11. How to get dynamics into the DBpedia ontology? Option 2: Incorporating a model of Events
     Other viable time / calendar models:
     • SNaP Event Ontology (http://data.press.net/ontology/event/)
     • Schema.org (http://www.schema.org/Event)
     • QUDT (http://www.qudt.org/)
     • RDF Calendar Workspace (http://www.w3.org/2002/12/cal/)
     • LODE (Linked Open Description of Events) (http://linkedevents.org/ontology/)
     • Events in the Europeana Data Model (http://pro.europeana.eu/tech-details)
  12. How to get dynamics into the DBpedia ontology? Option 2: Incorporating a model of Events
  13. How to get dynamics into the DBpedia ontology? Option 2: Incorporating a model of Events
  14. Now, let’s get practical
     • When linking datasets, at what points would we want DBpedia to provide an Event model?
     • When definitely not:
         • Vocabulary matching & reconciliation (with SKOS)
         • Establishing common identities (owl:sameAs)
     • When indeed:
         • Connecting persons to persons
         • Connecting persons to objects
         • Focus on life cycle / ‘Werdegang’
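
The distinction on slide 14 can be sketched in a few lines of Python. This is only an illustration, not the DBpedia ontology itself: triples are plain tuples, and every `ex:` identifier, the predicates `ex:hasParticipant` and `ex:involvesObject`, and the helper `connections_via_events` are hypothetical names chosen for the example. Identity is a direct `owl:sameAs` link, whereas person-to-person and person-to-object connections run through an event node.

```python
from collections import defaultdict
from itertools import combinations

# Triples as (subject, predicate, object) tuples. All "ex:" names are hypothetical.
TRIPLES = [
    # Identity: a direct link is enough, no event involved.
    ("ex:vanDerAa/person/1", "owl:sameAs", "ex:dbpedia/SomePerson"),
    # Person to object: a publication event ties an author to a work.
    ("ex:event/publication1", "rdf:type", "ex:Event"),
    ("ex:event/publication1", "ex:hasParticipant", "ex:person/author"),
    ("ex:event/publication1", "ex:involvesObject", "ex:work/dictionary"),
    # Person to person: a wedding event ties two people together.
    ("ex:event/wedding1", "rdf:type", "ex:Event"),
    ("ex:event/wedding1", "ex:hasParticipant", "ex:person/bride"),
    ("ex:event/wedding1", "ex:hasParticipant", "ex:person/groom"),
]

# Hypothetical predicates that attach a resource to an event.
LINK_PREDICATES = {"ex:hasParticipant", "ex:involvesObject"}


def connections_via_events(triples):
    """Return (event, a, b) for every pair of resources that share an event."""
    members = defaultdict(list)
    for subject, predicate, obj in triples:
        if predicate in LINK_PREDICATES:
            members[subject].append(obj)
    links = []
    for event, resources in sorted(members.items()):
        for a, b in combinations(sorted(resources), 2):
            links.append((event, a, b))
    return links


for link in connections_via_events(TRIPLES):
    print(link)
```

The `owl:sameAs` triple never shows up in the result, which mirrors the slide: establishing identity needs no event, while connecting resources does.
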
  15. Case # 1: A.J. van der Aa’s Aardrijkskundig Woordenboek
  16. A.J. van der Aa’s Aardrijkskundig Woordenboek
     • Comprises 14 volumes, published from 1837 to 1851
     • Is a historical description of places, from big cities to tiny hamlets
     • Connects these places to historical persons and to what they were doing there
     • The Person Index contains references to 22,360 historical persons
         • To be corrected for duplicate entries
         • To be validated against other sources / datasets
         • To be related to persons who have a lemma of their own in Wikipedia and, therefore, are a resource in DBpedia
     • And, of course, A.J. van der Aa’s book has its own Wikipedia lemma as well
  17. A.J. van der Aa’s Aardrijkskundig Woordenboek
  18. A.J. van der Aa’s Aardrijkskundig Woordenboek
  19. Case # 1: A.J. van der Aa’s Aardrijkskundig Woordenboek
  20. Case # 1: Do we need Events here?
     • No, this is about establishing identities
     • Since a lot of people have no lemma of their own in Wikipedia (nor do they occur in a list), we are considering whether to add these data to the DBpedia ontology without their being extracted
     • The Reference class proves to be a solid mechanism to mediate between texts and resources
     • There is, however, no development and no narrative, so there is no need here to introduce events into the model
  21. Case # 2: Connecting Wikipedia monument links (*)
     (*) With thanks to Roland Cornelissen
  22. Case # 2: Connecting Wikipedia monument links
     • XML version of a book on regional monuments
     • Wikipedia page
     • Concept representing a Monument, e.g. an information resource on an Amsterdam canal mansion
     • DBpedia Ontology:
         • Work
         • Annotation
         • Reference
  23. Case # 2: Do we need Events here?
     • No, this is about concept recognition
     • The Reference class again proves to be a solid mechanism to mediate between texts and resources
     • There is, still, no development and no narrative, so there is no need to introduce events into the model
  24. Case # 3: Connecting Van der Aa people to ‘citizens’
     • ‘Wie Was Wie’ database: contains data about 18 million people since 1811
     • Pivotal dataset for genealogical research
     • Based on municipal registers of birth, death, marriage etc. since their very beginning
     • Needs to reflect changes in municipal organisation (splits and mergers):
         • For that, we made a mapping from Wikipedia lists of former municipalities to a DBpedia class FormerMunicipality
         • We still have to implement some periodisation from the point of view of Dutch civil administration, and the changes it went through
     • We urgently need a time ontology other than the W3C Time Ontology (*): less physical and more cultural, including approximate time spans
     (*) http://www.w3.org/TR/owl-time/
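
What an ‘approximate time span’ could look like is sketched below. This is an assumption made for illustration, not a proposal from the slides and not part of any existing time ontology: the class `ApproxSpan` and its method `may_overlap` are hypothetical names, and an unknown boundary is simply modelled as `None`. The year values are the ones quoted in the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ApproxSpan:
    """A span of years in which either boundary may be unknown (None)."""
    earliest: Optional[int] = None
    latest: Optional[int] = None

    def may_overlap(self, other: "ApproxSpan") -> bool:
        # An unknown boundary never rules out an overlap.
        starts_in_time = (
            self.earliest is None
            or other.latest is None
            or self.earliest <= other.latest
        )
        ends_in_time = (
            other.earliest is None
            or self.latest is None
            or other.earliest <= self.latest
        )
        return starts_in_time and ends_in_time


civil_register = ApproxSpan(earliest=1811)            # "since 1811", open-ended
goes_births = ApproxSpan(earliest=1811, latest=1813)  # "1811-1813"
van_der_aa_goes = ApproxSpan(latest=1841)             # "... - 1841", start unknown

print(civil_register.may_overlap(van_der_aa_goes))
print(ApproxSpan(latest=1800).may_overlap(goes_births))
```

Treating an unknown boundary as "cannot exclude" keeps candidate matches alive for later validation instead of discarding them on incomplete dates.
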
  25. Life-cycle aware ontologies: the A2A Archive model
     • being born
     • dying
     • being wed
     • baptism
     • divorce
     • etc.
  26. Connecting an Event-driven dataset with a ‘static’ one
     • 540 infants in the town of Goes, 1811-1813
     • 542 mothers, of which 1 unknown
     • 542 fathers, of which 63 unknown
     • 76 Van der Aa celebrities related to the town of Goes, … - 1843
  27. Connecting an Event-driven dataset with a ‘static’ one
     • 1 match, not to the infant Servaas (* April 4, 1811) but to its father, vicar Jacobus de Kanter (called to Goes in 1811)
     • 540 infants in the town of Goes, 1811-1813
     • 542 mothers, of which 1 unknown
     • 542 fathers, of which 63 unknown
     • 76 Van der Aa celebrities related to the town of Goes, … - 1841
  28. Connecting an Event-driven dataset with a ‘static’ one
     • 1 match, not to the infant Servaas (* April 4, 1811) but to its father, vicar Jacobus de Kanter (dismissed from the Goes diacony in 1811)
     • 540 infants in the town of Goes, 1811-1813
     • 542 mothers, of which 1 unknown
     • 542 fathers, of which 63 unknown
     • 76 Van der Aa celebrities related to the town of Goes, … - 1841
  29. Connecting an Event-driven dataset with a ‘static’ one
     • 1 match, not to the infant Servaas (* April 4, 1811) but to its father, vicar Jacobus de Kanter (dismissed from the Goes diacony in 1811)
     • Connection to https://nl.dbpedia.org/resource/Johan_de_Kanter ??
     • 540 infants in the town of Goes, 1811-1813
     • 542 mothers, of which 1 unknown
     • 542 fathers, of which 63 unknown
     • 76 Van der Aa celebrities related to the town of Goes, … - 1841
  30. Case # 3: Do we need Events here?
     • Yes, official registers tend to be very much focused on the act of registration, which is almost an event in itself
     • (as is the case in depositing or revoking a will, and similar formal declarations of a person’s intents)
     • Events can be anything, from a person’s birth, to his being called somewhere as a vicar, or to a work being published or re-published
     • The match to vicar De Kanter would have been much more difficult if the Birth Register had been oriented towards single persons (the infant, especially), and not towards Events with several persons related to them
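
Why the event orientation makes the De Kanter match easy can be shown with a small Python sketch. It is an illustration under stated assumptions, not the matching procedure actually used: the `BirthEvent` structure, the role names and the helper `match_events` are hypothetical, matching is naive exact-name comparison, and the second register entry is invented filler. Only the De Kanter names and the birth date come from the slides.

```python
from dataclasses import dataclass


@dataclass
class BirthEvent:
    """One register entry: an event with several persons in different roles."""
    date: str
    participants: dict  # role -> name, or None when unknown


# First entry uses names from the slides; the second is invented for contrast.
EVENTS = [
    BirthEvent("1811-04-04", {"infant": "Servaas",
                              "father": "Jacobus de Kanter",
                              "mother": None}),
    BirthEvent("1812-01-01", {"infant": "Example Child",
                              "father": None,
                              "mother": "Example Mother"}),
]

# A 'static' list of persons, as in the Van der Aa person index.
CELEBRITIES = ["Jacobus de Kanter"]


def match_events(events, known_names, roles=None):
    """Return (date, role, name) for each participant found in known_names.

    If roles is given, only participants in those roles are considered.
    """
    known = {name.casefold() for name in known_names}
    hits = []
    for event in events:
        for role, name in event.participants.items():
            if roles is not None and role not in roles:
                continue
            if name is not None and name.casefold() in known:
                hits.append((event.date, role, name))
    return hits


print(match_events(EVENTS, CELEBRITIES))                    # all roles
print(match_events(EVENTS, CELEBRITIES, roles={"infant"}))  # infant only
```

Restricting the lookup to the infant finds nothing, while looking at every participant of the event surfaces the father, which is the point the slide makes.
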
  31. Case # 4: Connecting Across Collections
  32. Case # 4: Do we need Events here?
     • Yes, this is the stuff from which narratives and interactions with existing materials are made
     • Events can be anything; we have to think about an ontology of sentiments, not unlike the Sentiment Wortschatz (*), but one that we can apply ourselves when enriching descriptions
     • The Europeana Data Model hasMet property could be the container notion, but would be very much in need of specific subproperties
     • Maybe this would go as far as a gotInfatuatedWith subproperty in an interactive history play
     (*) http://datahub.io/dataset/sentiws
  33. Case # 5: Making Linked Open Data Fit for Enrichment
  34. Semantic Storytelling
  35. The Potential of Event Modeling - 1
     It is time to think about a shift in modeling:
     • We are still very much caught within the thought model of the State Machine: an Event causes a change of state within one or more resources
     • This works fine in an environment in which there is but a single process and a single thread of action
     • However, we are entering a stage in which various parallel courses of action will coexist, both for future scenarios and for history scripts
     • In all of these, authorship / provenance is of utmost importance in order to assess reliability
  36. The Potential of Event Modeling - 2
     To what extent would the DBpedia ontology have to reflect Event-related requirements?
     • I would suggest that DBpedia must remain above all a data hub, offering a common point of convergence for vocabularies that are much more refined
     • But in order to remain a data hub, the DBpedia ontology must at least accommodate a basic Event model
     • ‘Events’ in DBpedia to be distinguished between:
         • something that happens either in Nature (‘NatureEvent’) or in society (‘SocietalEvent’)
         • something that causes a change of state within a resource (‘LifeCycleEvent’)
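
The basic Event model suggested on slides 35 and 36 might be sketched as follows. The class names follow the slide; everything else is an assumption for illustration only: the fields `asserted_by`, `resource` and `new_state`, the helper `replay`, and all sample values are hypothetical and do not describe the actual DBpedia ontology. The sketch keeps one state table per asserting source, so that parallel accounts can coexist instead of overwriting each other.

```python
from dataclasses import dataclass


@dataclass
class Event:
    label: str
    asserted_by: str  # provenance: who claims that this happened


class NatureEvent(Event):
    """Something that happens in nature."""


class SocietalEvent(Event):
    """Something that happens in society."""


@dataclass
class LifeCycleEvent(Event):
    """Something that changes the state of a resource."""
    resource: str = ""
    new_state: str = ""


def replay(events):
    """Apply life cycle events in order, keeping one state table per source."""
    states = {}
    for event in events:
        if isinstance(event, LifeCycleEvent):
            states.setdefault(event.asserted_by, {})[event.resource] = event.new_state
    return states


# Hypothetical sample data: two sources give diverging accounts of one work.
EVENTS = [
    SocietalEvent("exhibition opens", asserted_by="source-A"),
    LifeCycleEvent("first edition", "source-A", "ex:work/1", "published"),
    LifeCycleEvent("second edition", "source-A", "ex:work/1", "reprinted"),
    LifeCycleEvent("first edition", "source-B", "ex:work/1", "published"),
]

print(replay(EVENTS))
```

Keying the state by source is one simple way to let several courses of action sit side by side while still recording who asserted each of them.
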
  37. Thank you for your attention. Questions?
