Genealogical domain


KEOD-2012 Conference (Barcelona)

Published in: Education
  • You can visit genealogy resources on the internet to see what is available. You will definitely want to find out whether others have already researched your line. Check places like The Church of Jesus Christ of Latter-day Saints (LDS) Family History Centers (their online site is FamilySearch). It would not be surprising, however, if the records you are looking for are not online. In that case, other primary sources will be helpful in your search (land, probate, church, county records). You will very likely have to reconcile data from different sources, and names or dates may not match, or the records may contain errors or contradictions.
  • Genealogical domain

    1. Modeling genealogical domain: an open problem. Joan Campanyà Artés, Jordi Conesa Caralt, Enric Mayol Sarroca. KEOD 2012 - Barcelona
    2. Could it be that you are a descendant of Charlemagne?
    3. Statistically, if your ancestors are predominantly European, it is virtually impossible not to be. But if you are not satisfied with that likelihood and wish to demonstrate kinship, you must consult reliable sources and historical records that support the assumption. Genealogy is the study of families and the tracing of their lineages and history.
    4. But we would like automated genealogy research... [Diagram labels: primary sources, online resources, data from users, applications; data processing and knowledge inference; family tree]
    5. … in any case relying on recognised sources. [Diagram labels: primary sources, online resources, data from users, applications; data processing and knowledge inference; family tree; links labelled "supported by"]
    6. A common conceptual model of the domain will make things easier.
    7. Index: Genealogy, a very complex domain; State of the art; Standards and specifications to share genealogical data; Genealogical knowledge processing; "Open World Assumption" (OWA) versus "Closed World Assumption" (CWA); Our proposal; Sources and statements; Modeling entities and relationships; Challenges for future work; Conclusions.
    8. Is modeling genealogy a problem? Intrinsic complexity of the domain. Syntactic variants: names of individuals and locations often appear with lexical variants that hinder proper recognition (examples: Joan Campanyà / Juan Campañá, Vic / Vich, Viella / Vielha). Structural heterogeneity: family patterns and the roles of individuals depend on the temporal and cultural context in which they occur (examples: paternal or maternal family name according to cultural context, blood relatives, ...). Data entry errors: these may be transcription errors or erroneous interpretations (examples: erroneous birth or death dates, inaccurate records due to forced translations for political reasons or ignorance, ...).
    9. Agreeing on a model: an opportunity! Distributed and independent data structures. Primary sources adopt heterogeneous data structures. Online and semantic web services provide access to specific data repositories. Private applications lack common and recognized standards (entities/relationships). [Diagram labels: primary sources, online resources, data from users, applications]
    10. GEDCOM. Difficult evolution: it is a proprietary format. Family-centered: this does not facilitate the search for ancestors, which is much of the work of genealogists. Ambiguity: the specification does not set limits on its hierarchical structure, so incompatibilities can arise between different implementations of the standard. Lack of source references: data are not tracked back to the research process, which makes subsequent verification or reuse of sources difficult. Inconsistencies may occur due to data duplication.
    11. GENTECH. Interesting features: all genealogical data are broken down into a series of short, formal genealogical statements; it introduces key concepts: events (anything that happened in someone's life) and relationships (between two people). Drawbacks: restrictive predefined categories of DataTypes, TypeValues and Collections; the model assumes its implementation on relational databases.
    12. Modeling with ontologies. Zandhuis, 2005: genealogical data modeled with OWL/RDF; enables the potential use of the Semantic Web; did not develop much beyond the class structure. Campbell, 2006: open network data, scalable, extensible, based on open standards and understandable by machines; genealogical data fragmented into subject-predicate-object sentences, in OWL-RDF files. Woodbury, 2010: information system based on individuals and events; textual data is analyzed using ontological patterns and regular expressions, complemented with SWRL rules for integrity constraints. … other interesting works must be considered.
    13. Limitations of existing standards and systems. We do not have a recognized, unified genealogical model as a standard. In this void, the GEDCOM file format is extensively used to exchange genealogical data. Most genealogical information systems presuppose a closed world (CWA), in the sense that everything not reflected in the form of tuples (i.e., not declared in the extension) is false or nonexistent. Then, where to start? We are interested in the semantic value of attributes and roles, not in the explicit record syntax or types. We need to transform implicit semantic knowledge into explicit knowledge, so as to reach an open world assumption (OWA).
    14. Our proposal. Any statement of genealogical facts must be supported by recognized sources. [Diagram labels: primary sources, online resources, data from users, applications; data processing and knowledge inference; links labelled "supported by"]
    15. Overall view. Formalize knowledge through ontologies. Agree on a reference domain model, flexible enough to adapt to different contexts. Proceed with an ontological mapping between this model and existing genealogy services and applications.
    16. Sources and Statements. Assertions are annotations of genealogical interest, and refer to one or more Statements. They are supported by documentary primary Sources. The Statement class records concepts and their relationships as atomic triples, in the form <subject, predicate, object>. Example: <Person "Person_10">, <GenealogicalPredicate "father">, <Person "Person_30">
    17. Modeling Entity and populating the Facts ontology
    18. Modeling Event, Place and Date
    19. PersonaEvents ontology. [Diagram labels: Facts ontology, automatic population, PersonaEvents ontology] Data extraction and knowledge inference will be executed over the PersonaEvents ontology. The Facts ontology will allow us to retrieve primary sources.
    20. Challenges for future work. Instance identification and record (entity) matching. Automatic population of the PersonaEvents ontology from basic statements in Facts ontologies, keeping references to Sources. Make knowledge inference from the PersonaEvents ontology decidable (OWL-DL and SWRL rules). Refine the model, in particular Properties and Attributes, to accommodate the widest possible range of contexts.
    21. Conclusions. Sharing data between genealogical resources would benefit from the existence of a reference model. The GEDCOM data exchange format is widely accepted, but recognition of family ties between resources requires some expert assistance. With ontologies we can model genealogical domain entities, properties and constraints. Extracting implicit knowledge from source statements is possible by logics and
    22. Are you eager to confirm that you are a descendant of Charlemagne?
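
The Sources and Statements idea from slide 16, together with the goal of keeping references to Sources during inference (slide 20), can be sketched roughly as follows. This is a minimal illustration, not the authors' ontology: the class fields, the `paternal_grandfather` predicate and the function `infer_paternal_grandfathers` are hypothetical names, and the triple is read here as "the father of the subject is the object", a direction the slides do not specify.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Source:
    """A documentary primary source (fields are assumptions for this sketch)."""
    source_id: str
    description: str


@dataclass(frozen=True)
class Statement:
    """An atomic <subject, predicate, object> triple."""
    subject: str
    predicate: str
    obj: str


@dataclass(frozen=True)
class Assertion:
    """An annotation that groups statements with the sources supporting them."""
    statements: Tuple[Statement, ...]
    sources: Tuple[Source, ...]


def infer_paternal_grandfathers(assertions: List[Assertion]) -> List[Assertion]:
    """Derive <x, paternal_grandfather, z> from <x, father, y> and <y, father, z>.

    Each derived assertion carries the sources of both premises, so the
    inferred fact can still be traced back to primary documents.
    """
    # Index every "father" statement by its subject, remembering its sources.
    father_of: Dict[str, Tuple[str, Tuple[Source, ...]]] = {}
    for assertion in assertions:
        for stmt in assertion.statements:
            if stmt.predicate == "father":
                father_of[stmt.subject] = (stmt.obj, assertion.sources)

    derived: List[Assertion] = []
    for child, (father, child_sources) in father_of.items():
        if father in father_of:
            grandfather, father_sources = father_of[father]
            stmt = Statement(child, "paternal_grandfather", grandfather)
            derived.append(Assertion((stmt,), child_sources + father_sources))
    return derived
```

Given one assertion stating that the father of Person_10 is Person_30 and another stating that the father of Person_30 is Person_50, the function returns a single derived assertion whose sources are the union of the two supporting sources. An actual implementation along the lines of the slides would express this with OWL classes and SWRL rules rather than plain Python.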

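The contrast between the closed and open world assumptions raised in slide 13 can be illustrated with a small sketch. It assumes facts are stored as plain triples; the names `Truth`, `ask_closed_world` and `ask_open_world` are hypothetical and only serve the illustration.

```python
from enum import Enum
from typing import Set, Tuple

Triple = Tuple[str, str, str]


class Truth(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"


def ask_closed_world(facts: Set[Triple], query: Triple) -> Truth:
    """CWA: anything not declared in the stored facts is taken to be false."""
    return Truth.TRUE if query in facts else Truth.FALSE


def ask_open_world(facts: Set[Triple], denied: Set[Triple], query: Triple) -> Truth:
    """OWA: a triple is false only if it has been explicitly denied.

    A triple that is neither stated nor denied is simply unknown, which
    matches genealogical practice, where a missing record proves nothing.
    """
    if query in facts:
        return Truth.TRUE
    if query in denied:
        return Truth.FALSE
    return Truth.UNKNOWN
```

With a single stored fact about Person_10's father, a query about an unrecorded relative is answered "false" under the closed world reading and "unknown" under the open world reading. The explicit `denied` set is a simplification; OWL reasoners derive negative conclusions from axioms such as disjointness rather than from a list of denied triples.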