Modeling genealogical domain:
      an open problem

        Joan Campanyà Artés
         Jordi Conesa Caralt
         Enric Mayol Sarroca


                          KEOD 2012 - Barcelona


                                                  1
Could it be that
  you are a
descendant of
Charlemagne?




                2

    Statistically, if your ancestors are predominantly
    European, it's virtually impossible not to be

     But if you are not
     satisfied with mere
     probability and wish
     to demonstrate
     kinship, you must
     consult reliable
     sources and historical
     records supporting that
     assumption

    Genealogy is the study of families and the tracing of
    their lineages and history
                                                         3
But we would like automated
      genealogy research...
[Diagram labels: primary sources; online resources; data from user's applications; data processing and knowledge inference; family tree]




                                                                  4
… in any case relying on
           recognised sources
[Diagram labels: primary sources; online resources; data from user's applications; data processing and knowledge inference; family tree; "supported by" (twice)]




                                                                    5
A common conceptual model of the domain
        will make things easier

    Modeling genealogical domain:
          an open problem



                                                       6
Index

    Genealogy: a very complex domain

    State of the art. Standards and Specifications to
    share genealogical data.

    Genealogical knowledge processing. "Open
    World Assumption" (OWA) versus "Closed World
    Assumption" (CWA)

    Our proposal. Sources and statements

    Modeling entities and relationships

    Challenges for future work

    Conclusions
                                                        7
Is modeling genealogy a problem?
Intrinsic complexity of the domain

    Syntactic variants: names of individuals and
    locations often appear with lexical variants that
    hinder proper recognition. (Examples: Joan Campanyà /
    Juan Campañá, Vic / Vich, Viella / Vielha)

    Structural heterogeneity: family patterns and the
    roles of individuals depend on the temporal and cultural
    context in which they occur. (Examples: paternal or maternal
    family name according to cultural contexts, blood relatives, ...)

    Data entry errors: these may be transcription errors or
    misinterpretations. (Examples: erroneous birth or death
    dates, inaccurate records due to forced translations for political reasons or
    ignorance, ...)
                                                                           8
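The syntactic variants listed on this slide can be illustrated with a small sketch. It is not the authors' method: it assumes that stripping diacritics and comparing strings with Python's standard difflib is an acceptable first approximation. The function names and the idea of a similarity score are introduced here for illustration only.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Case-fold a name and strip diacritics (an assumed normalization step)."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1] comparing the normalized forms of two names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


if __name__ == "__main__":
    # The pairs are the examples given on the slide.
    pairs = [
        ("Joan Campanyà", "Juan Campañá"),
        ("Vic", "Vich"),
        ("Viella", "Vielha"),
    ]
    for left, right in pairs:
        print(left, "/", right, round(similarity(left, right), 2))
```

A score alone cannot decide that two records name the same person or place. It only ranks candidates that a genealogist, or a later matching step, still has to confirm against sources.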
Agreeing on a model: an opportunity!
    Distributed and independent data
                structures
[Diagram labels: primary sources; online resources; data from user's applications]

 Primary sources adopt
heterogeneous data structures

 Online and semantic web services
provide access to specific data
repositories

 Private applications lack common
and recognized standards
(entities/relationships)
                                                                      9
GEDCOM

    Difficult evolution: it's a proprietary format

    Family-centered. This does not facilitate the search for
    ancestors, which is much of the work of genealogists

    Ambiguity: the specification does not set limits on its
    hierarchical structure, so incompatibilities can appear
    between different implementations of the standard

    Lack of source references: there is no tracking of
    data connected to the research process, which makes
    subsequent verification or reuse of sources difficult

    Inconsistencies may occur due to data duplication
                                                           10
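To make the "family-centered" point concrete, here is a minimal sketch of reading a GEDCOM-style fragment. The sample records and names are invented for illustration, and the parser is a deliberate simplification that only handles level 0 and level 1 lines. It shows that finding a person's father means going through a family (FAM) record and cannot be read directly from the individual (INDI) record.

```python
# Invented sample data using standard GEDCOM tags (INDI, FAM, FAMC, FAMS, HUSB, CHIL).
SAMPLE = """\
0 @I1@ INDI
1 NAME Child /Example/
1 FAMC @F1@
0 @I2@ INDI
1 NAME Father /Example/
1 FAMS @F1@
0 @F1@ FAM
1 HUSB @I2@
1 CHIL @I1@
"""


def parse(text):
    """Group level-1 tags under their level-0 record id (simplified parser)."""
    records = {}
    current = None
    for line in text.splitlines():
        parts = line.split(" ", 2)
        if parts[0] == "0":
            current = parts[1]  # for example "@I1@"
            records[current] = {}
        elif parts[0] == "1" and current is not None:
            value = parts[2] if len(parts) > 2 else ""
            records[current].setdefault(parts[1], []).append(value)
    return records


def father_of(records, person_id):
    """Follow INDI -> FAMC -> FAM -> HUSB; return None when nothing is recorded."""
    for family_id in records.get(person_id, {}).get("FAMC", []):
        for husband_id in records.get(family_id, {}).get("HUSB", []):
            return husband_id
    return None


if __name__ == "__main__":
    recs = parse(SAMPLE)
    print(father_of(recs, "@I1@"))
```

Each generation climbed repeats this two-step hop through a family record, which is one way to read the slide's remark that the format does not facilitate the search for ancestors.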
GENTECH
Interesting features:

    All genealogical data are broken down into a series of
    short, formal genealogical statements

    Introduces key concepts: Events (anything that happened
    in someone’s life) and relationships (between two
    people)
Drawbacks:

    Restrictive predefined categories of DataTypes,
    TypeValues and Collections

    The model assumes its implementation on relational
    databases
                                                      11
Modeling with ontologies

    Zandhuis, 2005. Genealogical data modeled with OWL/RDF.
    Enables the potential use of the Semantic Web. Was not
    developed much beyond the class structure

    Campbell, 2006. Open network data, scalable, extensible,
    based on open standards and understandable by machines.
    Genealogical data fragmented in the form of subject-
    predicate-object sentences, in OWL-RDF files.

    Woodbury, 2010. Information system based on individuals
    and events. Textual data is analyzed using ontological
    patterns and regular expressions, complemented with SWRL
    rules for integrity constraints.

    … other interesting works must be considered

                                                        12
Limitations of existing standards
              and systems

    There is no recognized, unified genealogical model that serves as a
    standard. In this void, the GEDCOM file format is extensively used to
    exchange genealogical data

    Most genealogical information systems presuppose a closed
    world (CWA), in the sense that everything that is not reflected in
    the form of tuples (i.e., not declared in the extension) is false or
    nonexistent.

Then, where to start?
We are interested in the semantic value of attributes and
roles, not in the explicit record syntax or types. We need
to move from implicit to explicit semantic knowledge,
working toward an open world assumption (OWA)
                                                                  13
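The difference between the two assumptions can be shown with a toy fact store. This is only a sketch: the set of tuples, the function names, and the identifier Person_50 are hypothetical, and a real system would rely on a database or an OWL reasoner instead of Python sets.

```python
from typing import FrozenSet, Optional, Tuple

Fact = Tuple[str, str, str]

# The only fact recorded (declared in the extension).
FACTS: FrozenSet[Fact] = frozenset({("Person_10", "father", "Person_30")})


def holds_cwa(fact: Fact) -> bool:
    """Closed world: anything not recorded is taken to be false."""
    return fact in FACTS


def holds_owa(fact: Fact, known_false: FrozenSet[Fact] = frozenset()) -> Optional[bool]:
    """Open world: a missing fact is unknown (None) unless explicitly refuted."""
    if fact in FACTS:
        return True
    if fact in known_false:
        return False
    return None


if __name__ == "__main__":
    unrecorded = ("Person_30", "father", "Person_50")
    print(holds_cwa(unrecorded))  # False: absence is read as falsehood
    print(holds_owa(unrecorded))  # None: absence is read as "not yet known"
```

For genealogy the open reading is usually the honest one: a missing baptism record does not show that a person had no father, only that no source has been found yet.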
Our proposal
[Diagram labels: primary sources; online resources; data from user's applications; data processing and knowledge inference; "supported by" (twice)]
Any statement of genealogical
facts must be supported by
recognized sources

                                                                      14
Overall view





    Formalize knowledge through ontologies

    Agree on a reference domain model, flexible enough
    to adapt to different contexts

    Proceed with an ontological mapping between this
    model and existing genealogy services and
    applications
                                                     15
Sources and Statements

    Assertions are
    annotations of
    genealogical interest, and
    refer to one or more
    Statements. They are
    supported by
    documentary primary
    Sources

    The Statement class records
    concepts and their
    relationships as atomic
    triples, in the form of
    <subject, predicate,
    object>
Example: <Person "Person_10">, <GenealogicalPredicate "father">, <Person "Person_30">
                                                                           16
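The relation between Assertions, Statements, and Sources can be sketched with plain data classes. The class names follow the slide, but every field name, the is_supported check, and the sample citation text are assumptions made for illustration. The slide does not show the actual ontology properties, nor does it say which of the two persons in the example is the father.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Source:
    """A documentary primary source (the citation field is hypothetical)."""
    citation: str


@dataclass(frozen=True)
class Statement:
    """An atomic <subject, predicate, object> triple."""
    subject: str
    predicate: str
    object: str


@dataclass
class Assertion:
    """An annotation that refers to statements and is backed by sources."""
    note: str
    statements: List[Statement]
    sources: List[Source] = field(default_factory=list)

    def is_supported(self) -> bool:
        # Assumed rule: at least one statement and at least one source.
        return bool(self.statements) and bool(self.sources)


if __name__ == "__main__":
    triple = Statement("Person_10", "father", "Person_30")
    backed = Assertion(
        note="Parentage taken from a register entry",
        statements=[triple],
        sources=[Source("Parish register (made-up citation)")],
    )
    print(backed.is_supported())
```

Keeping the source on the assertion, and not on each person, means the same triple can be backed or contradicted by several documents without duplicating the individuals involved.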
Modeling Entity and populating
       Facts ontology




                             17
Modeling Event, Place and Date




                             18
PersonaEvents ontology
[Diagram labels: Facts ontology; "Automatic population"; PersonaEvents ontology]

 Data extraction and knowledge inference will be executed
 over the PersonaEvents ontology.
 The Facts ontology will allow us to retrieve primary sources
                                                           19
Challenges for future work

    Instance identification and record (entity)
    matching

    Automatic population of the PersonaEvents
    ontology from basic statements in the Facts
    ontologies, keeping references to Sources

    Make knowledge inference from the
    PersonaEvents ontology decidable (OWL-DL and
    SWRL rules)

    Refine the model, in particular Properties and
    Attributes, to accommodate the widest possible
    range of contexts
                                                20
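As a rough stand-in for the kind of rule-based inference mentioned above, the sketch below derives ancestors from parent triples with a plain graph traversal. It uses neither OWL-DL nor SWRL. It also assumes that a triple (X, "father", Y) is read as "the father of X is Y", and that a "mother" predicate exists; neither is stated on the slides. Person_50 and Person_51 are invented identifiers.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]

# "father" appears on the slides; "mother" is an assumed companion predicate.
PARENT_PREDICATES = {"father", "mother"}


def ancestors(person: str, triples: Iterable[Triple]) -> Set[str]:
    """Return every ancestor reachable through parent triples."""
    parents: Dict[str, Set[str]] = {}
    for subject, predicate, obj in triples:
        if predicate in PARENT_PREDICATES:
            parents.setdefault(subject, set()).add(obj)

    found: Set[str] = set()
    frontier = [person]
    while frontier:
        current = frontier.pop()
        for parent in parents.get(current, ()):
            if parent not in found:  # also guards against cyclic data
                found.add(parent)
                frontier.append(parent)
    return found


if __name__ == "__main__":
    data = [
        ("Person_10", "father", "Person_30"),
        ("Person_30", "father", "Person_50"),
        ("Person_30", "mother", "Person_51"),
    ]
    print(sorted(ancestors("Person_10", data)))
```

Under these assumptions, a descent claim reduces to a membership test on the returned set, and each triple on the path can be traced back to the sources that support it.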
Conclusions

    Sharing data between genealogical resources
    would benefit from the existence of a reference
    model

    The GEDCOM data exchange format is widely
    accepted, but recognition of family ties
    between resources requires some expert
    assistance

    With ontologies we can model genealogical
    domain entities, properties and constraints

    Extracting implicit knowledge from source
    statements is possible by logics and
                                               21
Are you eager to confirm
that you are a descendant
     of Charlemagne?




                            22


Editor's Notes

  1. You can visit genealogy resources on the internet to see what is available. You will definitely want to find out if others have already done research on your line. Check places like The Church of Jesus Christ of Latter-Day Saints (LDS) Family History Centers (their online site is FamilySearch). But it would not be surprising if the records you are looking for aren't online. In this case, other primary sources will be helpful in your search (land, probate, church, county records). You will very likely have to reconcile data from different sources, but names and dates may not match, or the records may contain errors or contradictions.