The YAGO-SUMO integration incorporates millions of entities from YAGO, which is based on Wikipedia and WordNet, into the Suggested Upper Merged Ontology (SUMO), a highly axiomatized formal upper ontology. With the combined force of the two ontologies, an enormous, unprecedented corpus of formalized world knowledge is available for automated processing and reasoning, providing information about millions of entities such as people, cities, organizations, and companies.
Compared to the original YAGO, more advanced reasoning is possible thanks to the axiomatic knowledge delivered by SUMO. A reasoner can conclude, for example, that a child of a human must also be a human and cannot have been born before its parents, or that two people sharing the same parents must be siblings.
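The kind of inference this enables can be sketched as a toy forward-chaining loop in Python over two hand-written rules. This is illustrative only: SUMO's actual axioms are first-order formulas (in SUO-KIF) intended for theorem provers, not this simplified triple store, and the facts below are invented.

```python
# Toy fact base: each fact is a (relation, arg1, arg2) triple.
facts = {
    ("instanceOf", "alice", "Human"),
    ("parentOf", "alice", "bob"),
    ("parentOf", "alice", "carol"),
}

def infer(facts):
    """Apply two hand-written rules until no new facts appear."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        new = set()
        parent_facts = [f for f in facts if f[0] == "parentOf"]
        for _, parent, child in parent_facts:
            # Rule 1: a child of a Human is also a Human.
            if ("instanceOf", parent, "Human") in facts:
                new.add(("instanceOf", child, "Human"))
            # Rule 2: distinct children of the same parent are siblings.
            for _, parent2, child2 in parent_facts:
                if parent == parent2 and child != child2:
                    new.add(("siblingOf", child, child2))
        if not new <= facts:
            facts |= new
            changed = True
    return facts

derived = infer(facts)
print(("instanceOf", "bob", "Human") in derived)  # True
print(("siblingOf", "bob", "carol") in derived)   # True
```

A real axiomatized ontology would also let the reasoner detect contradictions, e.g. rejecting a birth date for a child that precedes the parents' birth dates.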
Machine Learning Methods for Analysing and Linking RDF Data (Jens Lehmann)
Invited Talk at the 8th International Conference on Scalable Uncertainty Management (SUM)
The talk outlines applications of supervised structured machine learning and presents a specific refinement-operator-based approach for RDF/OWL. It also outlines how similar ideas can be used in other (formal) languages, in particular link specifications.
The document provides guidelines for creating an ontology, including defining what an ontology is, why ontologies are useful, and the basic components and methodology for building one. It discusses evaluating ontology taxonomies and provides two examples - an e-commerce ontology and a banking ontology - to demonstrate the concepts. The key steps outlined are identifying important terms and concepts, organizing them hierarchically, defining attributes and relationships, and evaluating for issues like redundant or incomplete information.
This document discusses exposing Nobel Prize data as linked open data. It describes a two phase approach: 1) exposing the data externally to spread information and enable other apps, and 2) using the linked data internally to improve data quality and enhance webpages. It provides details on publishing the dataset, interlinking it with other datasets, and technical implementations like a SPARQL endpoint and linked data cache. The goal is to increase the value of Nobel Prize information for their organization and audiences while also contributing to the larger linked open data cloud.
The document summarizes a seminar on ontology mapping presented by Samhati Soor. The seminar covered the need for ontology mapping due to the proliferation of ontologies, and the purpose of mapping ontologies to achieve interoperability and sharing knowledge. It defined ontologies and ontology mapping and discussed categories of mapping including between global and local ontologies, between local ontologies, and for merging ontologies. Tools for ontology mapping discussed included GLUE and SAM. Evaluation criteria and challenges of ontology mapping were also summarized along with conclusions and references.
Machine Learning Techniques for the Semantic Web (pauldix)
The document discusses using machine learning techniques on semantic web data represented in RDF triples. It describes representing the RDF triples as a vector space to find relationships between subjects. Dimensionality reduction techniques like latent semantic analysis can be applied to cluster similar subjects based on predicate relationships. Both supervised and unsupervised machine learning approaches are applicable, such as classification to map between ontologies or clustering for ontology learning.
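The vector-space step can be sketched as follows: each subject becomes a vector over (predicate, object) features, and cosine similarity then exposes related subjects. An approach like latent semantic analysis would additionally apply SVD to this matrix; the triples below are invented for illustration.

```python
import math

# Invented mini RDF graph as (subject, predicate, object) triples.
triples = [
    ("Berlin", "type", "City"), ("Berlin", "locatedIn", "Germany"),
    ("Paris", "type", "City"), ("Paris", "locatedIn", "France"),
    ("Einstein", "type", "Person"), ("Einstein", "bornIn", "Ulm"),
]

# One dimension per (predicate, object) pair; one vector per subject.
features = sorted({(p, o) for _, p, o in triples})

def vector(subject):
    present = {(p, o) for s, p, o in triples if s == subject}
    return [1.0 if f in present else 0.0 for f in features]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Cities share the ("type", "City") dimension, so they end up closer
# to each other than to a person.
print(cosine(vector("Berlin"), vector("Paris")))     # 0.5
print(cosine(vector("Berlin"), vector("Einstein")))  # 0.0
```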
'Meaning is its use' - Towards the use of distributional semantics for conten... (Cataldo Musto)
The document discusses using distributional semantics for content-based recommender systems. Distributional semantics represents words and documents as vectors based on their contexts in a large text corpus. This allows calculating similarities between words and documents to find semantic relationships. The speaker proposes an enhanced vector space model (eVSM) that represents users, items, and their profiles as vectors for recommender systems. Representations based on distributional semantics are inherently multilingual as word contexts are largely language-independent. This allows cross-language recommendations without additional costs.
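A minimal illustration of the distributional idea, with an invented toy corpus: each word is represented by the counts of words co-occurring with it inside a small window, and cosine similarity over those counts surfaces semantically related words.

```python
import math
from collections import Counter

# Invented toy corpus; real distributional models use large text corpora.
corpus = ("the cat sat on the mat . "
          "the dog sat on the rug . "
          "stocks fell on the market .").split()

def context_vector(word, window=2):
    """Count words co-occurring with `word` within +/- `window` tokens."""
    counts = Counter()
    for i, tok in enumerate(corpus):
        if tok == word:
            lo, hi = max(0, i - window), min(len(corpus), i + window + 1)
            counts.update(t for t in corpus[lo:hi] if t != word)
    return counts

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in set(u) | set(v))
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# "cat" and "dog" occur in near-identical contexts, so their vectors align
# more closely than "cat" and "stocks".
print(cosine(context_vector("cat"), context_vector("dog")))
print(cosine(context_vector("cat"), context_vector("stocks")))
```

Because the vectors are built purely from co-occurrence statistics, the same construction works in any language, which is the basis of the multilinguality claim above.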
The document discusses processing OWL ontologies using the Jena ontology API in Java. It describes how to create an ontology model, read an existing ontology, retrieve classes and properties, and examine class and property hierarchies. Key points include getting ontology classes and iterating over them, examining class relationships and restrictions, retrieving object and datatype properties, and getting property domains and ranges. The document provides examples of working with ontologies using the Jena API in Java.
Support vector machines (SVM) are a supervised learning method used for classification and regression analysis. SVMs find a hyperplane that maximizes the margin between two classes of objects. They can handle non-linear classification problems by projecting data into a higher dimensional space. The training points closest to the separating hyperplane are called support vectors. SVMs learn the discrimination boundary between classes rather than modeling each class individually.
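The margin and support-vector notions can be made concrete with a toy example. Note that the hyperplane below is hand-chosen for illustration; a real SVM learns it by solving a quadratic optimization problem that maximizes the margin.

```python
import math

# Invented, linearly separable toy data: point -> class label.
points = {(2, 2): +1, (3, 3): +1, (1, 1): -1, (0, 0): -1}

# A hand-chosen (not learned) separating hyperplane w.x + b = 0.
w, b = (1.0, 1.0), -3.0

def distance(p):
    """Geometric distance from point p to the hyperplane."""
    return abs(w[0] * p[0] + w[1] * p[1] + b) / math.hypot(*w)

# The margin is fixed by the training points closest to the hyperplane;
# exactly those points are the support vectors.
margin = min(distance(p) for p in points)
support = [p for p in points if math.isclose(distance(p), margin)]
print(margin, support)  # 0.707... [(2, 2), (1, 1)]
```

Here (2, 2) and (1, 1) pin the margin from either side; moving any other point (without crossing the margin) would leave the learned boundary unchanged.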
This document summarizes a workshop on data integration using ontologies. It discusses how data integration is challenging due to differences in schemas, semantics, measurements, units and labels across data sources. It proposes that ontologies can help with data integration by providing definitions for schemas and entities referred to in the data. Core challenges discussed include dealing with multiple synonyms for entities and relationships between biological entities that depend on context. The document advocates for shared community ontologies that can be extended and integrated to facilitate flexible and responsive data integration across multiple sources.
Presentation of the Semantic Knowledge Graph research paper at the 2016 IEEE 3rd International Conference on Data Science and Advanced Analytics (Montreal, Canada - October 18th, 2016)
Abstract—This paper describes a new kind of knowledge representation and mining system which we are calling the Semantic Knowledge Graph. At its heart, the Semantic Knowledge Graph leverages an inverted index, along with a complementary uninverted index, to represent nodes (terms) and edges (the documents within intersecting postings lists for multiple terms/nodes). This provides a layer of indirection between each pair of nodes and their corresponding edge, enabling edges to materialize dynamically from underlying corpus statistics. As a result, any combination of nodes can have edges to any other nodes materialize and be scored to reveal latent relationships between the nodes. This provides numerous benefits: the knowledge graph can be built automatically from a real-world corpus of data, new nodes - along with their combined edges - can be instantly materialized from any arbitrary combination of preexisting nodes (using set operations), and a full model of the semantic relationships between all entities within a domain can be represented and dynamically traversed using a highly compact representation of the graph. Such a system has widespread applications in areas as diverse as knowledge modeling and reasoning, natural language processing, anomaly detection, data cleansing, semantic search, analytics, data classification, root cause analysis, and recommendations systems. The main contribution of this paper is the introduction of a novel system - the Semantic Knowledge Graph - which is able to dynamically discover and score interesting relationships between any arbitrary combination of entities (words, phrases, or extracted concepts) through dynamically materializing nodes and edges from a compact graphical representation built automatically from a corpus of data representative of a knowledge domain.
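The core indexing idea can be sketched in a few lines of Python. The documents and the overlap score below are simplified assumptions; the paper's actual relatedness scoring compares foreground and background corpus statistics rather than a plain intersection ratio.

```python
# Toy corpus: document id -> text.
docs = {
    1: "java spark hadoop", 2: "java spring hibernate",
    3: "python spark hadoop", 4: "java hadoop spark",
}

# Inverted index: term (graph node) -> set of document ids (postings list).
index = {}
for doc_id, text in docs.items():
    for term in text.split():
        index.setdefault(term, set()).add(doc_id)

def edge_weight(a, b):
    """An edge between nodes a and b materializes from the documents in
    which both terms occur; here scored as |both| / |docs containing a|."""
    both = index[a] & index[b]
    return len(both) / len(index[a])

print(edge_weight("spark", "hadoop"))  # 1.0: every 'spark' doc has 'hadoop'
print(edge_weight("java", "spark"))    # 2 of 3 'java' docs have 'spark'
```

The key property shown is the layer of indirection: no edge is stored anywhere; any pair (or set) of terms can be combined on demand and scored against the postings lists.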
Semantic Perspectives for Contemporary Question Answering Systems (Andre Freitas)
This document summarizes an academic presentation about semantic perspectives for contemporary question answering systems. It discusses multiple perspectives of semantic representation including facts versus definitions and temporality. It also outlines lightweight semantic representation using RDF and distributional semantics. The document then discusses knowledge graph extraction from text including taxonomy extraction, n-ary relation extraction, and argumentation representation. It concludes with an overview of querying knowledge graphs using distributional semantics and semantic approximation as core operations.
Presentation of Domain Specific Question Answering System Using N-gram Approach (Tasnim Ara Islam)
Design of an application for a domain-specific question answering system. Built a solution for finding answers to factoid questions using an N-gram mining approach. Calculated the percentage of related answers for each specific question. Built the application on the Java platform.
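The N-gram mining step for factoid questions can be sketched as follows. The deck's exact pipeline is not given here, so the passages, stopword list, and ranking below are illustrative assumptions: candidate answers are the n-grams that recur across retrieved passages, after removing stopwords and question words.

```python
from collections import Counter

question = "who invented the telephone"
# Hypothetical passages, as if retrieved by a search engine for the question.
passages = [
    "alexander graham bell invented the telephone in 1876",
    "the telephone was patented by alexander graham bell",
    "bell is credited with inventing the first practical telephone",
]
stop = {"the", "in", "by", "was", "is", "with", "first"} | set(question.split())

def ngrams(tokens, n):
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Mine 1- to 3-grams from the passages (stopwords and question words
# removed) and rank candidate answers by how often they recur.
counts = Counter()
for p in passages:
    toks = [t for t in p.split() if t not in stop]
    for n in (1, 2, 3):
        counts.update(ngrams(toks, n))

print(counts.most_common(3))
```

Real systems typically weight longer n-grams upward so that a full answer like "alexander graham bell" can outrank its individual constituent words.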
State-space graph representation
Global Problem Solver
Uninformed (blind) search algorithms
Informed search algorithms
Depth First Search
Breadth First Search
Best First Search
A, A*
Heuristic function, admissible heuristic function
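The informed-search side of this material, A* with an admissible heuristic, can be sketched as follows. The state graph and heuristic values are invented for illustration; admissibility (the heuristic never overestimates the true remaining cost) is what guarantees A* returns an optimal path.

```python
import heapq

def a_star(start, goal, neighbors, h):
    """A*: always expand the frontier node minimizing f = g + h, where
    g is the cost so far and h an admissible heuristic estimate."""
    frontier = [(h(start), 0, start, [start])]
    best_g = {}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if node in best_g and best_g[node] <= g:
            continue  # already reached this state more cheaply
        best_g[node] = g
        for nxt, cost in neighbors[node]:
            heapq.heappush(frontier,
                           (g + cost + h(nxt), g + cost, nxt, [*path, nxt]))
    return None

# Toy state graph: vertex -> [(successor, step cost), ...]
graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 5)],
         "B": [("G", 1)], "G": []}
# Heuristic values chosen to be admissible for this graph.
hvals = {"S": 4, "A": 3, "B": 1, "G": 0}
print(a_star("S", "G", graph, hvals.get))  # (4, ['S', 'A', 'B', 'G'])
```

With h = 0 everywhere the same code degrades into uniform-cost search, one of the uninformed strategies listed above.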
Deep Learning Models for Question Answering (Sujit Pal)
This document discusses deep learning models for question answering. It provides an overview of common deep learning building blocks such as fully connected networks, word embeddings, convolutional neural networks and recurrent neural networks. It then summarizes the authors' experiments using these techniques on benchmark question answering datasets like bAbI and a Kaggle science question dataset. Their best model achieved an accuracy of 76.27% by incorporating custom word embeddings trained on external knowledge sources. The authors discuss future work including trying additional models and deploying the trained systems.
UMBEL: Subject Concepts Layer for the Web (Mike Bergman)
UMBEL is a lightweight ontology and subject concept framework, comprising around 20,000 concepts and their relationships, that aims to provide context for web content and datasets. It serves as a reference structure for placing information into context with other data by defining common subject concepts and mapping entities and datasets to these concepts. UMBEL is freely available under an open source license and relies on existing vocabularies and ontologies like SKOS, RDFS, and OWL to provide interoperability.
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco... (Marko Rodriguez)
A graph is a data structure that links a set of vertices by a set of edges. Modern graph databases support multi-relational graph structures, where there exist different types of vertices (e.g. people, places, items) and different types of edges (e.g. friend, lives at, purchased). By means of index-free adjacency, graph databases are optimized for graph traversals and are interacted with through a graph traversal engine. A graph traversal is defined as an abstract path whose instance is realized on a graph dataset. Graph databases and traversals can be used for searching, scoring, ranking, and in concert, recommendation. This presentation will explore graph structures, algorithms, traversal algebras, graph-related software suites, and a host of examples demonstrating how to solve real-world problems, in real-time, with graphs. This is a whirlwind tour of the theory and application of graphs.
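A one-edge-label-at-a-time traversal of such a multi-relational graph, composed into a simple recommendation query, might look like this in Python. The graph is a toy stand-in; real graph databases express the same pattern through a traversal language such as Gremlin.

```python
# Toy multi-relational graph: vertex -> [(edge_label, neighbor), ...]
graph = {
    "alice": [("friend", "bob"), ("purchased", "book")],
    "bob":   [("purchased", "book"), ("purchased", "laptop")],
    "carol": [("purchased", "laptop")],
}

def out(vertex, label):
    """Follow only the outgoing edges carrying the given label."""
    return [v for lbl, v in graph.get(vertex, []) if lbl == label]

def recommend(user):
    """Traversal 'user -friend-> * -purchased-> item', scored by how many
    friends purchased each item, excluding the user's own purchases."""
    owned = set(out(user, "purchased"))
    scores = {}
    for friend in out(user, "friend"):
        for item in out(friend, "purchased"):
            if item not in owned:
                scores[item] = scores.get(item, 0) + 1
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("alice"))  # ['laptop']
```

The scoring step is what turns a plain traversal (searching) into ranking and, composed over friend edges, into recommendation, mirroring the progression the talk describes.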
This document summarizes support vector machines (SVMs), a machine learning technique for classification and regression. SVMs find the optimal separating hyperplane that maximizes the margin between positive and negative examples in the training data. This is achieved by solving a convex optimization problem that minimizes a quadratic function under linear constraints. SVMs can perform non-linear classification by implicitly mapping inputs into a higher-dimensional feature space using kernel functions. They have applications in areas like text categorization due to their ability to handle high-dimensional sparse data.
In machine learning, support vector machines (SVMs, also called support vector networks) are supervised learning models with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis. The basic SVM takes a set of input data and predicts, for each given input, which of two possible classes forms the output, making it a non-probabilistic binary linear classifier.
grammaticality, deep & surface structure, and ambiguity (Dedew Deviarini)
This document discusses English morphology and syntax. It covers several key topics:
1. What is syntax and syntactic structure, including parts of speech and phrase structure.
2. The difference between deep and surface structure, where deep structure is the underlying form and surface structure is the actual form after transformations.
3. Grammaticality, which refers to sentences that follow syntactic rules rather than other factors like meaning or truth.
4. Types of ambiguities, including lexical ambiguities due to ambiguous words, and structural ambiguities due to multiple possible syntactic trees.
Formalization and implementation of BFO 2 with a focus on the OWL implementation (golpedegato2)
Formalization and implementation of Basic Formal Ontology 2 with a focus on the OWL implementation.
With an introduction to some of the underlying technologies
ICBO 2018 Poster - Current Development in the Evidence and Conclusion Ontolog... (dolleyj)
The Evidence & Conclusion Ontology (ECO) has been developed to provide standardized descriptions for types of evidence within the biological domain. Best practices in biocuration require that when a biological assertion is made (e.g. linking a Gene Ontology (GO) term for a molecular function to a protein), the type of evidence supporting it is captured. In recent development efforts, we have been working with other ontology groups to ensure that ECO classes exist for the types of curation they support. These include the Ontology for Microbial Phenotypes and GO. In addition, we continue to support user-level class requests through our GitHub issue tracker. To facilitate the addition and maintenance of new classes, we utilize ROBOT (a command line tool for working with Open Biomedical Ontologies) as part of our standard workflow. ROBOT templates allow us to define classes in a spreadsheet and convert them to Web Ontology Language (OWL) axioms, which can then be merged into ECO. ROBOT is also part of our automated release process. Additionally, we are engaged in ongoing work to map ECO classes to Ontology for Biomedical Investigation classes using logical definitions. ECO is currently in use by dozens of groups engaged in biological curation and the number of ECO users continues to grow. The ontology, in OWL and Open Biomedical Ontology (OBO) formats, and associated resources can be accessed through our GitHub site (https://github.com/evidenceontology/evidenceontology) as well as the ECO web page (http://evidenceontology.org/).
The document discusses MIREOT (Minimal information to reference external ontology terms), an approach used by the Ontology for Biomedical Investigations (OBI) project to import terms from external ontologies. It describes three approaches to importing terms - creating duplicate terms, importing modules, and full imports. It proposes importing only the classes needed using a minimal set of information to unambiguously identify terms from external ontologies. This process has been implemented in OBI and an online tool called OntoFox has been developed to facilitate the MIREOT process.
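The minimal information set described above (the source ontology, the source term, and the superclass the term is placed under in the target ontology) can be captured in a small record. A sketch, with all IRIs as hypothetical placeholders rather than real ontology terms:

```python
from dataclasses import dataclass

# All IRIs below are hypothetical placeholders, not real ontology terms.
@dataclass(frozen=True)
class MireotTerm:
    source_ontology: str    # IRI of the ontology the term comes from
    source_term: str        # IRI of the external term itself
    target_superclass: str  # superclass it is placed under in the target

imported = [
    MireotTerm(
        source_ontology="http://example.org/source-ontology.owl",
        source_term="http://example.org/source-ontology.owl#Assay",
        target_superclass="http://example.org/target-ontology.owl#PlannedProcess",
    ),
]

# The importing ontology can now reference the external class without
# loading the rest of its source ontology.
print(imported[0].source_term)
```

This is the contrast with the other two strategies named above: duplicating terms loses provenance, and full imports pull in everything, while MIREOT keeps just enough to identify each borrowed term unambiguously.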
An Extension of Protégé for an Automatic Fuzzy-Ontology Building U... (ijcsit)
The process of building an ontology is a very complex and time-consuming process, especially when dealing with huge amounts of data. Unfortunately, current marketed tools are very limited and don't meet all user needs. Indeed, these tools build the core of the ontology from initial data that generates a large amount of information. In this paper, we aim to resolve these problems by adding an extension to the well-known ontology editor Protégé in order to work towards a complete FCA-based framework which resolves the limitations of other tools in building fuzzy ontologies. We give, in this paper, some details on our semi-automatic collaborative tool called FOD Tab Plug-in, which takes into consideration another degree of granularity in the generation process. In fact, it follows a bottom-up strategy based on conceptual clustering, fuzzy logic and Formal Concept Analysis (FCA), and it defines the ontology between classes resulting from a preliminary classification of the data rather than from the initial large amount of data.
The document discusses how ontologies and social media can support eLearning. It describes how ontologies can be enhanced with social tags to integrate formal and informal knowledge. An experiment used tags from Delicious to identify related tags and map them to concepts in a computing ontology. User evaluations found that beginners prefer tagged documents while advanced learners benefit from structured ontologies. Integrating ontologies, tags and social networks has potential to support knowledge discovery and recommendation across formal and informal learning resources and communities.
Towards Linked Ontologies and Data on the Semantic Web (Jie Bao)
The document outlines Jie Bao's research background and overview, including work on linked ontologies and linked data using semantic wikis. It discusses a modular ontology approach called P-DL that allows importing between ontologies similar to citation. It also describes using a semantic wiki to generate linked data from wiki revision histories and other semantic data. Future work includes applying these techniques to government data and improving scalability.
I gave this presentation at the first PKP Scholarly Publishing Conference in Vancouver, Canada, on July 12th, 2007. Check out the general conference blog if you want to know more about the event:
http://scholarlypublishing.blogspot.com/
You may also be interested in things marked with the "open-access" tag in my own blog:
http://corpblawg.ynada.com/
The document discusses the Portable Ontology Aligned Fragments (POAF) project. It describes how POAF aims to address issues with semantic integration by capturing relevant information from aligned ontologies in portable, machine-readable fragments. It provides an example of aligning terrorism-related ontologies and extracting a POAF. Future work areas are also outlined, such as dynamic namespace resolution and using POAF to enable faster semantic queries in distributed environments.
Gadgets pwn us? A pattern language for CALL (Lawrie Hunter)
The document discusses creating a pattern language for computer-assisted language learning (CALL). It explores the concept of a pattern language as defined by Christopher Alexander and proposes a framework for creating a CALL pattern language in the era of web 2.0. The paper seeks to rework concepts from other fields, like "formal learning design expression" and "task arc," and have participants brainstorm elements to include through graphical challenges. The overall goal is to establish foundational patterns for CALL work.
This document summarizes a workshop on data integration using ontologies. It discusses how data integration is challenging due to differences in schemas, semantics, measurements, units and labels across data sources. It proposes that ontologies can help with data integration by providing definitions for schemas and entities referred to in the data. Core challenges discussed include dealing with multiple synonyms for entities and relationships between biological entities that depend on context. The document advocates for shared community ontologies that can be extended and integrated to facilitate flexible and responsive data integration across multiple sources.
Presentation of the Semantic Knowledge Graph research paper at the 2016 IEEE 3rd International Conference on Data Science and Advanced Analytics (Montreal, Canada - October 18th, 2016)
Abstract—This paper describes a new kind of knowledge representation and mining system which we are calling the Semantic Knowledge Graph. At its heart, the Semantic Knowledge Graph leverages an inverted index, along with a complementary uninverted index, to represent nodes (terms) and edges (the documents within intersecting postings lists for multiple terms/nodes). This provides a layer of indirection between each pair of nodes and their corresponding edge, enabling edges to materialize dynamically from underlying corpus statistics. As a result, any combination of nodes can have edges to any other nodes materialize and be scored to reveal latent relationships between the nodes. This provides numerous benefits: the knowledge graph can be built automatically from a real-world corpus of data, new nodes - along with their combined edges - can be instantly materialized from any arbitrary combination of preexisting nodes (using set operations), and a full model of the semantic relationships between all entities within a domain can be represented and dynamically traversed using a highly compact representation of the graph. Such a system has widespread applications in areas as diverse as knowledge modeling and reasoning, natural language processing, anomaly detection, data cleansing, semantic search, analytics, data classification, root cause analysis, and recommendations systems. The main contribution of this paper is the introduction of a novel system - the Semantic Knowledge Graph - which is able to dynamically discover and score interesting relationships between any arbitrary combination of entities (words, phrases, or extracted concepts) through dynamically materializing nodes and edges from a compact graphical representation built automatically from a corpus of data representative of a knowledge domain.
Semantic Perspectives for Contemporary Question Answering SystemsAndre Freitas
This document summarizes an academic presentation about semantic perspectives for contemporary question answering systems. It discusses multiple perspectives of semantic representation including facts versus definitions and temporality. It also outlines lightweight semantic representation using RDF and distributional semantics. The document then discusses knowledge graph extraction from text including taxonomy extraction, n-ary relation extraction, and argumentation representation. It concludes with an overview of querying knowledge graphs using distributional semantics and semantic approximation as core operations.
Presentation of Domain Specific Question Answering System Using N-gram Approach.Tasnim Ara Islam
Design an application for a domain specific question answering system. Built a solution for finding answers of factoid questions by using N-gram Mining Approach. Calculated percentage about the related answers for the specific question. Built this application in Java platform.
Représentation sous forme de graphe d'états
Global Problem Solver
Algorithmes de Recherche Aveugles
Algorithmes de Recherche Informés
Depth First Search
Breadth First Search
Best First Search
A, A*
Fonction heuristique, Fonction heuristique admissible
Deep Learning Models for Question AnsweringSujit Pal
This document discusses deep learning models for question answering. It provides an overview of common deep learning building blocks such as fully connected networks, word embeddings, convolutional neural networks and recurrent neural networks. It then summarizes the authors' experiments using these techniques on benchmark question answering datasets like bAbI and a Kaggle science question dataset. Their best model achieved an accuracy of 76.27% by incorporating custom word embeddings trained on external knowledge sources. The authors discuss future work including trying additional models and deploying the trained systems.
UMBEL: Subject Concepts Layer for the WebMike Bergman
UMBEL is a lightweight ontology and subject concept framework comprised of around 20,000 concepts and their relationships that aims to provide context for web content and datasets. It serves as a reference structure for placing information into context with other data by defining common subject concepts and mapping entities and datasets to these concepts. UMBEL is freely available under an open source license and relies on existing vocabularies and ontologies like SKOS, RDFS, and OWL to provide interoperability.
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...Marko Rodriguez
A graph is a data structure that links a set of vertices by a set of edges. Modern graph databases support multi-relational graph structures, where there exist different types of vertices (e.g. people, places, items) and different types of edges (e.g. friend, lives at, purchased). By means of index-free adjacency, graph databases are optimized for graph traversals and are interacted with through a graph traversal engine. A graph traversal is defined as an abstract path whose instance is realized on a graph dataset. Graph databases and traversals can be used for searching, scoring, ranking, and in concert, recommendation. This presentation will explore graph structures, algorithms, traversal algebras, graph-related software suites, and a host of examples demonstrating how to solve real-world problems, in real-time, with graphs. This is a whirlwind tour of the theory and application of graphs.
This document summarizes support vector machines (SVMs), a machine learning technique for classification and regression. SVMs find the optimal separating hyperplane that maximizes the margin between positive and negative examples in the training data. This is achieved by solving a convex optimization problem that minimizes a quadratic function under linear constraints. SVMs can perform non-linear classification by implicitly mapping inputs into a higher-dimensional feature space using kernel functions. They have applications in areas like text categorization due to their ability to handle high-dimensional sparse data.
In machine learning, support vector machines (SVMs, also support vector networks[1]) are supervised learning models with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis. The basic SVM takes a set of input data and predicts, for each given input, which of two possible classes forms the output, making it a non-probabilistic binary linear classifier.
grammaticality, deep & surface structure, and ambiguityDedew Deviarini
This document discusses English morphology and syntax. It covers several key topics:
1. What is syntax and syntactic structure, including parts of speech and phrase structure.
2. The difference between deep and surface structure, where deep structure is the underlying form and surface structure is the actual form after transformations.
3. Grammaticality, which refers to sentences that follow syntactic rules rather than other factors like meaning or truth.
4. Types of ambiguities, including lexical ambiguities due to ambiguous words, and structural ambiguities due to multiple possible syntactic trees.
Formalization and implementation of BFO 2 with a focus on the OWL implementationgolpedegato2
Formalization and implementation of Basic Formal Ontology 2 with a focus on the OWL implementation.
With an introduction to some of the underlying technologies
ICBO 2018 Poster - Current Development in the Evidence and Conclusion Ontolog... (dolleyj)
The Evidence & Conclusion Ontology (ECO) has been developed to provide standardized descriptions for types of evidence within the biological domain. Best practices in biocuration require that when a biological assertion is made (e.g. linking a Gene Ontology (GO) term for a molecular function to a protein), the type of evidence supporting it is captured. In recent development efforts, we have been working with other ontology groups to ensure that ECO classes exist for the types of curation they support. These include the Ontology for Microbial Phenotypes and GO. In addition, we continue to support user-level class requests through our GitHub issue tracker. To facilitate the addition and maintenance of new classes, we utilize ROBOT (a command-line tool for working with Open Biomedical Ontologies) as part of our standard workflow. ROBOT templates allow us to define classes in a spreadsheet and convert them to Web Ontology Language (OWL) axioms, which can then be merged into ECO. ROBOT is also part of our automated release process. Additionally, we are engaged in ongoing work to map ECO classes to Ontology for Biomedical Investigation classes using logical definitions. ECO is currently in use by dozens of groups engaged in biological curation, and the number of ECO users continues to grow. The ontology, in OWL and Open Biomedical Ontology (OBO) formats, and associated resources can be accessed through our GitHub site (https://github.com/evidenceontology/evidenceontology) as well as the ECO web page (http://evidenceontology.org/).
The document discusses MIREOT (Minimal information to reference external ontology terms), an approach used by the Ontology for Biomedical Investigations (OBI) project to import terms from external ontologies. It describes three approaches to importing terms - creating duplicate terms, importing modules, and full imports. It proposes importing only the classes needed using a minimal set of information to unambiguously identify terms from external ontologies. This process has been implemented in OBI and an online tool called OntoFox has been developed to facilitate the MIREOT process.
An Extension of Protégé for an Automatic Fuzzy-Ontology Building U... (ijcsit)
Building an ontology is a complex and time-consuming process, especially when dealing with huge amounts of data. Unfortunately, current tools are very limited and do not meet all user needs: they build the core of the ontology directly from the initial data, which generates a large amount of information. In this paper, we aim to resolve these problems by adding an extension to the well-known ontology editor Protégé, working towards a complete FCA-based framework that overcomes the limitations of other tools in building fuzzy ontologies. We give some details on our semi-automatic collaborative tool, the FOD Tab plug-in, which takes into consideration another degree of granularity in the generation process. It follows a bottom-up strategy based on conceptual clustering, fuzzy logic, and Formal Concept Analysis (FCA), and it defines the ontology between classes resulting from a preliminary classification of the data rather than from the initial large amount of data.
The document discusses how ontologies and social media can support eLearning. It describes how ontologies can be enhanced with social tags to integrate formal and informal knowledge. An experiment used tags from Delicious to identify related tags and map them to concepts in a computing ontology. User evaluations found that beginners prefer tagged documents while advanced learners benefit from structured ontologies. Integrating ontologies, tags and social networks has potential to support knowledge discovery and recommendation across formal and informal learning resources and communities.
Towards Linked Ontologies and Data on the Semantic Web (Jie Bao)
The document outlines Jie Bao's research background and overview, including work on linked ontologies and linked data using semantic wikis. It discusses a modular ontology approach called P-DL that allows importing between ontologies similar to citation. It also describes using a semantic wiki to generate linked data from wiki revision histories and other semantic data. Future work includes applying these techniques to government data and improving scalability.
I gave this presentation at the first PKP Scholarly Publishing Conference in Vancouver, Canada, on July 12th 2007. Check out the general conference blog if you want to know more about the event:
http://scholarlypublishing.blogspot.com/
You may also be interested in things marked with the "open-access" tag in my own blog:
http://corpblawg.ynada.com/
The document discusses the Portable Ontology Aligned Fragments (POAF) project. It describes how POAF aims to address issues with semantic integration by capturing relevant information from aligned ontologies in portable, machine-readable fragments. It provides an example of aligning terrorism-related ontologies and extracting a POAF. Future work areas are also outlined, such as dynamic namespace resolution and using POAF to enable faster semantic queries in distributed environments.
Gadgets pwn us? A pattern language for CALL (Lawrie Hunter)
The document discusses creating a pattern language for computer-assisted language learning (CALL). It explores the concept of a pattern language as defined by Christopher Alexander and proposes a framework for creating a CALL pattern language in the era of web 2.0. The paper seeks to rework concepts from other fields, like "formal learning design expression" and "task arc," and have participants brainstorm elements to include through graphical challenges. The overall goal is to establish foundational patterns for CALL work.
The document discusses the impact of standardized terminologies and domain ontologies in multilingual information processing. It outlines how natural language processing (NLP) techniques can be used to semi-automatically populate ontologies by extracting information from text. Integrating knowledge from ontologies, NLP tools, and subject experts allows for more effective information access and management in an organization.
Cross-lingual ontology lexicalisation, translation and information extraction... (Tobias Wunner)
The document discusses cross-lingual ontology translation and lexicalization. It presents the lemon model for connecting ontology concepts to lexical information to facilitate tasks like machine translation. The lemon model represents lexical entries, forms, linguistic structure, meanings, and syntactic frames. It separates ontological semantics from lexical features to enable linking terminology to external resources for translation. The model supports representing multilingual labels and relating terms through concepts like narrower/broader. This enables cross-lingual information extraction and search over linked data.
The document discusses how modularity has allowed for increased evolvability in biological and technological systems by separating functions and allowing independent development of modules. It argues that a similar increase in modularity could benefit scientific communication by making more research processes and results transparent earlier through preprint servers, comments, and finer-grained publications. This would reduce wasted effort and increase opportunities for collaboration.
1) The document presents Topologos software, which allows modeling of both processes and objects to address issues with traditional separation of data and process modeling.
2) Traditional techniques model processes separately from data using techniques like data flow diagrams and class diagrams, but this separation makes it difficult to guarantee transparency of both processes and data.
3) Merging object and process modeling solves inheritance problems that occur when an object participates in multiple processes, as attributes depend on an object's location within a process.
Collaborative Construction of Large Biological Ontologies (Jie Bao)
This document discusses challenges with collaborative ontology building and proposes a modular ontology approach using fine-grained packages. Each package represents a fragment of the overall ontology and can be independently developed by different contributors concurrently. A Collaborative Ontology Building (COB) editor is presented to facilitate this package-based collaborative ontology development approach.
ANChor: A powerful approach to scientific communication (Josh Inouye)
In science, the failure to communicate effectively can mean the death of a proposal, the rejection of a paper, or failure to obtain a job. Coalescing the ideas of many experts in cognitive psychology and technical communication, I have created a general framework for scientific communication that I use extensively for scientific presentations, grant proposals, academic papers, and posters which I call ANChor (Assertions, Noise, and Cohesion). I propose that this approach is a powerful way to unify, clarify, and sharpen scientific messages for the benefit of both the audience and the author.
The methodology's effectiveness is evidenced by four presentation awards given to me and my colleagues for work that used this framework.
This document summarizes a PhD student's research on generating natural language explanations of entailments in OWL ontologies to help non-specialists understand and debug ontologies. The research aims to identify common justification patterns and develop an approach to explaining justifications in an accessible way using techniques from proof presentations. A preliminary study identified the most frequent patterns in a corpus of ontologies. The research will further analyze justification patterns and test explanations' effectiveness through user studies.
The document discusses 10 observations about using open access content and summarizing a lecture on the topic. It observes that the amount of scientific literature is increasing rapidly but can only be read fractionally. It notes that open access could change scholarly discourse by making literature freely available. It suggests merging databases and journals for a new learning experience, using semantic enrichment to better integrate content, and utilizing rich media like video to increase discovery rates.
The document discusses the problem of "posterior collapse" in variational autoencoders (VAEs), where the model ignores the latent variable z during training. The authors investigate this from the perspective of training dynamics, finding that the inference network fails to accurately approximate the true posterior distribution early in training, as it is a moving target. As a result, the model learns to ignore the latent encoding. To address this, the authors propose an approach where the inference network is aggressively optimized before each model update, depending on the current mutual information between z and x. Despite adding no new components, this approach avoids posterior collapse on text and image benchmarks, outperforming autoregressive baselines in terms of likelihood.
CDAO presentation.
The idea of the comparative analysis ontology has been presented worldwide, including at NESCent (USA), IGBMC (France), and UFRJ (Brazil). Providing a semantic framework for evolutionary analysis in a high-throughput way after next- and third-generation sequencing is the way to bring evolutionary studies into genome-wide analysis. The Darwinian core of reasoning also allows CDAO to be used with other entities.
Similar to YAGO-SUMO: Integrating YAGO into the Suggested Upper Merged Ontology (20)
SEMAC Graph Node Embeddings for Link Prediction (Gerard de Melo)
We present a new graph representation learning approach called SEMAC that jointly exploits fine-grained node features as well as the overall graph topology. In contrast to the SGNS or SVD methods espoused in previous representation-based studies, our model represents nodes in terms of subgraph embeddings acquired via a form of convex matrix completion to iteratively reduce the rank, and thereby, more effectively eliminate noise in the representation. Thus, subgraph embeddings and convex matrix completion are elegantly integrated into a novel link prediction framework.
While traditional scholarship has tended to emphasize thorough reading, reflection, and learning, many researchers nowadays – both in academia and industry – find themselves in a fast-paced and demanding environment. A successful research career crucially depends on management-related skills, and devoting some time to such skills is likely to pay off very quickly. One important example is time and task management, which is critical when there are numerous conflicting demands and opportunities. Another example is being able to cope with challenges and failure. Researchers also need to be creative and bold in defending their ideas. This talk provides an overview of these and other skills that are vital in modern research environments.
Knowlywood: Mining Activity Knowledge from Hollywood Narratives (Gerard de Melo)
Knowlywood is a new knowledge graph mined from movies, TV series, and literature. It provides commonsense knowledge about human activities, e.g. participants, preceding and following activities, and so on.
Learning Multilingual Semantics from Big Data on the Web (Gerard de Melo)
This document summarizes Gerard de Melo's presentation on learning multilingual semantics from big data on the web. It discusses how lexical and taxonomic knowledge can be extracted at large scale from online resources like Wiktionary, Wikipedia, and WordNet. Methods are presented for merging structured data like knowledge graphs and integrating taxonomies across languages using techniques like linear program relaxation and belief propagation. The goal is to build large yet reasonably clean multilingual knowledge bases to power applications in areas like semantic search and the digital humanities.
Big Data is more than just hype. The vast quantities of data now available have led to two important challenges that are fundamentally changing the way we develop data-intensive systems. The first is at the data management level, where we are finally moving beyond vanilla MapReduce towards infrastructure that allows for more flexible data processing pipelines. The second challenge is transitioning from quantity to quality and distilling genuine knowledge from the raw data. For this, we still need innovative algorithms that facilitate data cleaning, unsupervised and semi-supervised learning, knowledge harvesting, and knowledge integration. Examples include data integration, large-scale knowledge bases such as UWN/MENTA, and collections of commonsense knowledge such as WebChild.
Scalable Learning Technologies for Big Data Mining (Gerard de Melo)
These are slides of a tutorial by Gerard de Melo and Aparna Varde presented at the DASFAA 2015 conference.
As data expands into big data, enhanced or entirely novel data mining algorithms often become necessary. The real value of big data is often only exposed when we can adequately mine and learn from it. We provide an overview of new scalable techniques for knowledge discovery. Our focus is on the areas of cloud data mining and machine learning, semi-supervised processing, and deep learning. We also give practical advice for choosing among different methods and discuss open research problems and concerns.
These are slides of a tutorial at ECIR by Gerard de Melo and Katja Hose.
Search is currently undergoing a major paradigm shift away from the traditional document-centric “10 blue links” towards more explicit and actionable information. Recent advances in this area are Google’s Knowledge Graph, Virtual Personal Assistants such as Siri and Google Now, as well as the now ubiquitous entity-oriented vertical search results for places, products, etc. Apart from novel query understanding methods, these developments are largely driven by structured data that is blended into the Web Search experience. We discuss efficient indexing and query processing techniques to work with large amounts of structured data. Finally, we present query interpretation and understanding methods to map user queries to these structured data sources.
From Linked Data to Tightly Integrated Data (Gerard de Melo)
Invited Talk at the 3rd Workshop on Linked Data in Linguistics: Multilingual Knowledge Resources and Natural Language Processing. Reykjavik, Iceland, 27th May 2014
The ideas behind the Web of Linked Data have great allure. Apart from the prospect of large amounts of freely available data, we are also promised nearly effortless interoperability. Common data formats and protocols have indeed made it easier than ever to obtain and work with information from different sources simultaneously, opening up new opportunities in linguistics, library science, and many other areas.
In this talk, however, I argue that the true potential of Linked Data can only be appreciated when extensive cross-linkage and integration engenders an even higher degree of interconnectedness. This can take the form of shared identifiers, e.g. those based on Wikipedia and WordNet, which can be used to describe numerous forms of linguistic and commonsense knowledge. An alternative is to rely on sameAs and similarity links, which can automatically be discovered using scalable approaches like the LINDA algorithm but need to be interpreted with great care, as we have observed in experimental studies. A closer level of linkage is achieved when resources are also connected at the taxonomic level, as exemplified by the MENTA approach to taxonomic data integration. Such integration means that one can buy into ecosystems already carrying a range of valuable pre-existing assets. Even more tightly integrated resources like Lexvo.org combine triples from multiple sources into unified, coherent knowledge bases. Finally, I also comment on how to address some remaining challenges that are still impeding a more widespread adoption of Linked Data on the Web. In the long run, I believe that such steps will lead us to significantly more tightly integrated Linked Data.
Information Extraction from Web-Scale N-Gram Data (Gerard de Melo)
Search engines are increasingly relying on structured data to provide direct answers to certain types of queries. However, extracting such structured data from text is challenging, especially due to the scarcity of explicitly expressed knowledge. Even when relying on large document collections, pattern-based information extraction approaches typically expose only insufficient amounts of information. This paper evaluates to what extent n-gram statistics, derived from volumes of texts several orders of magnitude larger than typical corpora, can allow us to overcome this bottleneck. An extensive experimental evaluation is provided for three different binary relations, comparing different sources of n-gram data as well as different learning algorithms.
UWN: A Large Multilingual Lexical Knowledge Base (Gerard de Melo)
We present UWN, a large multilingual lexical knowledge base that describes the meanings and relationships of words in over 200 languages. This paper explains how link prediction, information integration and taxonomy induction methods have been used to build UWN based on WordNet and extend it with millions of named entities from Wikipedia. We additionally introduce extensions to cover lexical relationships, frame-semantic knowledge, and language data. An online interface provides human access to the data, while a software API enables applications to look up over 16 million words and names.
Multilingual Text Classification using Ontologies (Gerard de Melo)
In this paper, we investigate strategies for automatically classifying documents in different languages thematically, geographically or according to other criteria. A novel linguistically motivated text representation scheme is presented that can be used with machine learning algorithms in order to learn classifications from pre-classified examples and then automatically classify documents that might be provided in entirely different languages. Our approach makes use of ontologies and lexical resources but goes beyond a simple mapping from terms to concepts by fully exploiting the external knowledge manifested in such resources and mapping to entire regions of concepts. For this, a graph traversal algorithm is used to explore related concepts that might be relevant. Extensive testing has shown that our methods lead to significant improvements compared to existing approaches.
Extracting Sense-Disambiguated Example Sentences From Parallel Corpora (Gerard de Melo)
Example sentences provide an intuitive means of grasping the meaning of a word, and are frequently used to complement conventional word definitions. When a word has multiple meanings, it is useful to have example sentences for specific senses (and hence definitions) of that word rather than indiscriminately lumping all of them together. In this paper, we investigate to what extent such sense-specific example sentences can be extracted from parallel corpora using lexical knowledge bases for multiple languages as a sense index. We use word sense disambiguation heuristics and a cross-lingual measure of semantic similarity to link example sentences to specific word senses. From the sentences found for a given sense, an algorithm then selects a smaller subset that can be presented to end users, taking into account both representativeness and diversity. Preliminary results show that a precision of around 80% can be obtained for a reasonable number of word senses, and that the subset selection yields convincing results.
Towards a Universal Wordnet by Learning from Combined Evidence (Gerard de Melo)
Lexical databases are invaluable sources of knowledge about words and their meanings, with numerous applications in areas like NLP, IR, and AI. We propose a methodology for the automatic construction of a large-scale multilingual lexical database where words of many languages are hierarchically organized in terms of their meanings and their semantic relations to other words. This resource is bootstrapped from WordNet, a well-known English-language resource. Our approach extends WordNet with around 1.5 million meaning links for 800,000 words in over 200 languages, drawing on evidence extracted from a variety of resources including existing (monolingual) wordnets, (mostly bilingual) translation dictionaries, and parallel corpora. Graph-based scoring functions and statistical learning techniques are used to iteratively integrate this information and build an output graph. Experiments show that this wordnet has a high level of precision and coverage, and that it can be useful in applied tasks such as cross-lingual text classification.
Not Quite the Same: Identity Constraints for the Web of Linked Data (Gerard de Melo)
Linked Data is based on the idea that information from different sources can flexibly be connected to enable novel applications that individual datasets do not support on their own. This hinges upon the existence of links between datasets that would otherwise be isolated. The most notable form, sameAs links, are intended to express that two identifiers are equivalent in all respects. Unfortunately, many existing ones do not reflect such genuine identity. This study provides a novel method to analyse this phenomenon, based on a thorough theoretical analysis, as well as a novel graph-based method to resolve such issues to some extent. Our experiments on a representative Web-scale set of sameAs links from the Web of Data show that our method can identify and remove hundreds of thousands of constraint violations.
Good, Great, Excellent: Global Inference of Semantic Intensities (Gerard de Melo)
Adjectives like good, great, and excellent are similar in meaning, but differ in intensity. Intensity order information is very useful for language learners as well as in several NLP tasks, but is missing in most lexical resources (dictionaries, WordNet, and thesauri). In this paper, we present a primarily unsupervised approach that uses semantics from Web-scale data (e.g., phrases like good but not excellent) to rank words by assigning them positions on a continuous scale. We rely on Mixed Integer Linear Programming to jointly determine the ranks, such that individual decisions benefit from global information. When ranking English adjectives, our global algorithm achieves substantial improvements over previous work on both pairwise and rank correlation metrics (specifically, 70% pairwise accuracy as compared to only 56% by previous work). Moreover, our approach can incorporate external synonymy information (increasing its pairwise accuracy to 78%) and extends easily to new languages.
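The paper's global model is a Mixed Integer Linear Program; as a much-simplified stand-in, phrase evidence like "good but not excellent" can be read as pairwise order constraints and chained into an intensity scale. The evidence pairs below are invented for illustration:

```python
# Simplified sketch: treat each "X but not Y" pattern as evidence that X is
# weaker than Y, then order the adjectives with a topological sort. The real
# approach jointly optimizes continuous scale positions with MILP, which also
# handles noisy, conflicting evidence; this toy version assumes consistency.
from graphlib import TopologicalSorter

# hypothetical pattern evidence: (weaker, stronger) pairs extracted from
# phrases such as "good but not great"
evidence = [("good", "great"), ("great", "excellent"), ("good", "excellent")]

deps = {}
for weaker, stronger in evidence:
    deps.setdefault(stronger, set()).add(weaker)  # stronger comes after weaker

ranking = list(TopologicalSorter(deps).static_order())  # weakest first
```

A topological sort only yields an ordering; the MILP additionally assigns positions on a continuous scale so that global information can override individual noisy pairs.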
HCL Notes and Domino License Cost Reduction in the World of DLAU (panagenda)
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able to lower your costs through an optimized configuration and keep them low going forward.
These topics will be covered:
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc.
- Practical examples and best practices to implement right away
"Choosing proper type of scaling", Olena Syrota (Fwdays)
Imagine an IoT processing system that is already quite mature and production-ready and for which client coverage is growing and scaling and performance aspects are life and death questions. The system has Redis, MongoDB, and stream processing based on ksqldb. In this talk, firstly, we will analyze scaling approaches and then select the proper ones for our system.
Skybuffer SAM4U tool for SAP license adoption (Tatiana Kojar)
Manage and optimize your license adoption and consumption with SAM4U, an SAP free customer software asset management tool.
SAM4U, an SAP complimentary software asset management tool for customers, delivers a detailed and well-structured overview of license inventory and usage with a user-friendly interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment. You retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring fixed Total Cost of Ownership (TCO) and exceptional services through the SAP Fiori interface.
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors (DianaGray10)
Join us to learn how UiPath Apps can directly and easily interact with prebuilt connectors via Integration Service--including Salesforce, ServiceNow, Open GenAI, and more.
The best part is you can achieve this without building a custom workflow! Say goodbye to the hassle of using separate automations to call APIs. By seamlessly integrating within App Studio, you can now easily streamline your workflow, while gaining direct access to our Connector Catalog of popular applications.
We’ll discuss and demo the benefits of UiPath Apps and connectors including:
- Creating a compelling user experience for any software, without the limitations of APIs
- Accelerating the app creation process, saving time and effort
- Enjoying high-performance CRUD (create, read, update, delete) operations for seamless data management
Speakers:
Russell Alfeche, Technology Leader, RPA at qBotic and UiPath MVP
Charlie Greenberg, host
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe (Precisely)
Inconsistent user experience and siloed data, high costs, and changing customer expectations – Citizens Bank was experiencing these challenges while it was attempting to deliver a superior digital banking experience for its clients. Its core banking applications run on the mainframe and Citizens was using legacy utilities to get the critical mainframe data to feed customer-facing channels, like call centers, web, and mobile. Ultimately, this led to higher operating costs (MIPS), delayed response times, and longer time to market.
Ever-changing customer expectations demand more modern digital experiences, and the bank needed to find a solution that could provide real-time data to its customer channels with low latency and operating costs. Join this session to learn how Citizens is leveraging Precisely to replicate mainframe data to its customer channels and deliver on their “modern digital bank” experiences.
Introduction of Cybersecurity with OSS at Code Europe 2024 (Hiroshi SHIBATA)
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
Main news related to the CCS TSI 2023 (2023/1695) (Jakub Marek)
An English 🇬🇧 translation of the presentation for the speech I gave about the main changes introduced by CCS TSI 2023 at the biggest Czech conference on communications and signalling systems on railways, held at the Clarion Hotel Olomouc from 7th to 9th November 2023 (konferenceszt.cz). It was attended by around 500 participants and 200 online followers.
The original Czech 🇨🇿 version of the presentation can be found here: https://www.slideshare.net/slideshow/hlavni-novinky-souvisejici-s-ccs-tsi-2023-2023-1695/269688092 .
The videorecording (in Czech) from the presentation is available here: https://youtu.be/WzjJWm4IyPk?si=SImb06tuXGb30BEH .
Programming Foundation Models with DSPy - Meetup Slides (Zilliz)
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
Dandelion Hashtable: beyond billion requests per second on a commodity server (Antonios Katsarakis)
This slide deck presents DLHT, a concurrent in-memory hashtable. Despite optimization efforts that go as far as sacrificing core functionality, state-of-the-art hashtable designs still incur multiple memory accesses per request and block request processing in three cases. First, most hashtables block while waiting for data to be retrieved from memory. Second, open-addressing designs, which represent the current state of the art, either cannot free index slots on deletes or must block all requests to do so. Third, index resizes block every request until all objects are copied to the new index. Defying folklore wisdom, DLHT forgoes open addressing and adopts a fully-featured and memory-aware closed-addressing design based on bounded cache-line chaining. This design (1) offers lock-free index operations and deletes that free slots instantly, (2) completes most requests with a single memory access, (3) utilizes software prefetching to hide memory latencies, and (4) employs a novel non-blocking and parallel resizing. On a commodity server with a memory-resident workload, DLHT surpasses 1.6B requests per second and provides 3.5x (12x) the throughput of the state-of-the-art closed-addressing (open-addressing) resizable hashtable on Gets (Deletes).
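To illustrate the closed-addressing design family the deck argues for, here is a toy chained hashtable in which a delete frees its slot immediately, with no tombstones or blocking. This is nothing like DLHT itself (no lock-freedom, no cache-line-bounded chains, no prefetching); it only shows the chaining idea that open addressing gives up:

```python
# Toy closed-addressing (chained) hashtable: each bucket holds a small chain
# of (key, value) entries, so deletes free their slot instantly.
class ChainedHashTable:
    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite existing entry
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return default

    def delete(self, key):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket.pop(i)  # slot freed immediately, no tombstone needed
                return True
        return False
```

In an open-addressing table, the same delete would either leave a tombstone behind or require blocking other requests to compact the probe sequence, which is exactly the trade-off the abstract describes.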
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency (ScyllaDB)
Freshworks creates AI-boosted business software that helps employees work more efficiently and effectively. Managing data across multiple RDBMS and NoSQL databases was already a challenge at their current scale. To prepare for 10X growth, they knew it was time to rethink their database strategy. Learn how they architected a solution that would simplify scaling while keeping costs under control.
Driving Business Innovation: Latest Generative AI Advancements & Success Story (Safe Software)
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
Conversational agents, or chatbots, are increasingly used to access all sorts of services using natural language. While open-domain chatbots - like ChatGPT - can converse on any topic, task-oriented chatbots - the focus of this paper - are designed for specific tasks, like booking a flight, obtaining customer support, or setting an appointment. Like any other software, task-oriented chatbots need to be properly tested, usually by defining and executing test scenarios (i.e., sequences of user-chatbot interactions). However, there is currently a lack of methods to quantify the completeness and strength of such test scenarios, which can lead to low-quality tests, and hence to buggy chatbots.
To fill this gap, we propose adapting mutation testing (MuT) for task-oriented chatbots. To this end, we introduce a set of mutation operators that emulate faults in chatbot designs, an architecture that enables MuT on chatbots built using heterogeneous technologies, and a practical realisation as an Eclipse plugin. Moreover, we evaluate the applicability, effectiveness and efficiency of our approach on open-source chatbots, with promising results.
How to Interpret Trends in the Kalyan Rajdhani Mix Chart (Chart Kalyan)
A Mix Chart displays historical data of numbers in a graphical or tabular form. The Kalyan Rajdhani Mix Chart specifically shows the results of a sequence of numbers over different periods.
Taking AI to the Next Level in Manufacturing (ssuserfac0301)
Read Taking AI to the Next Level in Manufacturing to gain insights on AI adoption in the manufacturing industry, such as:
1. How quickly AI is being implemented in manufacturing.
2. Which barriers stand in the way of AI adoption.
3. How data quality and governance form the backbone of AI.
4. Organizational processes and structures that may inhibit effective AI adoption.
5. Ideas and approaches to help build your organization's AI strategy.
YAGO-SUMO: Integrating YAGO into the Suggested Upper Merged Ontology
G. de Melo1, F. Suchanek1, A. Pease2
1: Max Planck Institute for Informatics, Germany
2: Articulate Software, USA
2008-11-03
G. de Melo, F. Suchanek, A. Pease: Integrating YAGO into the Suggested Upper Merged Ontology
Outline
1 Introduction
Ontologies and KBs
SUMO
Extending Ontologies
YAGO
2 Approach
Incorporation
Class Information
Statements
3 Conclusion
Ongoing Work
Summary
Introduction
Ontologies/KBs: provide background knowledge for intelligent applications
Schism: formal ontologies vs. large KBs
Goal: large-scale formal ontology
formal ontologies: complex axioms (e.g. in FOL), but quite small
large-scale KBs (e.g. based on Wikipedia): only simple facts
combine the best of both worlds!
SUMO
Suggested Upper Merged Ontology
open source
based on KIF rather than e.g. OWL
large formal ontology (20,000 terms, 70,000 axioms)
axiomatization of general and domain-specific concepts
for applications requiring basic “common sense”
origins: IEEE standard upper ontology group
core owned by IEEE (basically Public Domain), portions GPL
e.g.: OpenCyc doesn’t include axioms of commercial Cyc
peer review, community of experts and users
formal verification with ATP systems
OWL without additional rules is not very expressive
KIF variant standardized as ISO/IEC IS 24707:2007 (Common Logic)
SUMO Example
(=>
(and
(parent ?CHILD ?PARENT)
(subclass ?CLASS Organism)
(instance ?PARENT ?CLASS))
(instance ?CHILD ?CLASS))
This implies, for example, that a child of a Human is also a Human.
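The effect of this axiom can be mimicked with a few lines of forward chaining. The following Python sketch is not SUMO's actual reasoner; the facts, the toy subclass table, and the person names are purely illustrative:

```python
# Minimal forward-chaining sketch of the parent/Organism axiom above.
# The subclass table and facts are illustrative, not taken from SUMO.
subclass_of = {"Human": "Organism", "Organism": "Entity"}  # class -> parent class

def is_subclass(cls, ancestor):
    """Walk the (toy) subclass chain upwards."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = subclass_of.get(cls)
    return False

def apply_parent_axiom(instances, parents):
    """If (parent ?CHILD ?PARENT) and (instance ?PARENT ?CLASS) with
    ?CLASS a subclass of Organism, infer (instance ?CHILD ?CLASS)."""
    inferred = dict(instances)
    for child, parent in parents:
        for cls in instances.get(parent, set()):
            if is_subclass(cls, "Organism"):
                inferred.setdefault(child, set()).add(cls)
    return inferred

instances = {"ImmanuelKant": {"Human"}}
parents = [("JohannKant", "ImmanuelKant")]  # (child, parent) pairs
result = apply_parent_axiom(instances, parents)
print(result["JohannKant"])  # {'Human'}
```

The rule fires only below Organism in the hierarchy, exactly as the `(subclass ?CLASS Organism)` condition requires.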
additional domain ontologies
however, SUMO is mainly an upper ontology
not enough instances and ground facts
e.g. for geography, finance, transportation
e.g. people, cities, books
Extending Ontologies: Possible Approaches
Manual work
slow process, low coverage
Semantic Wikis not yet accepted enough
Information extraction from corpora / the Web
low accuracy
not canonical / not in line with the upper ontology
Import from existing databases
feasible, but not universal enough
YAGO
combine entities and facts from Wikipedia with an upper ontology
excellent coverage: around 2 million entities
millions of facts about them
high quality: e.g. birth dates of people, location of cities
original YAGO: WordNet for the upper level
mainly a lexical knowledge base
e.g. hyponymic relationships do not strictly imply subsumptions
lack of formal axioms
New goal: integrate with SUMO, so the class information actually is meaningful
Incorporation
Idea: most Wikipedia articles become new entities
Semi-automatic matching: although SUMO contains only few instances, some degree of overlap exists
use weighted string similarity measure
additional manual validation
→ equivalence table
Entity Generation: produce a new unique term name for each Wikipedia article not listed in the equivalence table, subject to the following desiderata:
prevent clashes with SUMO or other entities
conciseness
abide by KIF syntax (Wikipedia uses Unicode)
must be a proper entity (not: “List of ...”)
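These desiderata can be illustrated with a small Python sketch. The exact rules of the YAGO-SUMO pipeline differ; the heuristics below (NFKD transliteration, CamelCasing, numeric suffixes for clashes) are assumptions chosen only to demonstrate the idea:

```python
import re
import unicodedata

def make_term_name(title, existing):
    """Sketch of generating a unique, KIF-safe term name for a Wikipedia
    article title. Not the actual YAGO-SUMO rules; illustrative only."""
    if title.startswith("List of"):
        return None  # not a proper entity
    # Reduce Unicode to ASCII (KIF term names) and drop unsafe characters.
    ascii_title = unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode()
    words = re.findall(r"[A-Za-z0-9]+", ascii_title)
    if not words:
        return None
    name = "".join(w[0].upper() + w[1:] for w in words)  # concise CamelCase
    # Prevent clashes with SUMO terms or previously generated names.
    candidate, n = name, 2
    while candidate in existing:
        candidate = f"{name}{n}"
        n += 1
    existing.add(candidate)
    return candidate

existing = {"ImmanuelKant"}          # pretend this term already exists
print(make_term_name("Immanuel Kant", existing))   # ImmanuelKant2
print(make_term_name("Olomouc (city)", existing))  # OlomoucCity
print(make_term_name("List of rivers", existing))  # None
```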
Class Information
YAGO: From Wikipedia to WordNet
goal: each entity should have class membership information
use Wikipedia category system, however cannot use it directly
first link categories to WordNet, then map to SUMO
requirement: distinguish thematic categories from categories encoding class membership
categorization not transitive
members of subcategories often unrelated to parent category
check WordNet for premodifier + headword, or headword only
disambiguate using frequency information
result: relationship to WordNet-derived class
e.g. “American singers of German origin” becomes linked as a subclass of the WordNet-derived class Person
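A toy version of this headword step can be written in a few lines of Python. YAGO uses a proper noun-group parser and real WordNet sense frequencies; the stop-word list and the tiny sense inventory below are illustrative assumptions:

```python
def category_headword(category):
    """Toy headword heuristic for a Wikipedia category name: strip the
    modifier introduced by 'of'/'in'/'from'/'by', then take the last
    remaining word, naively singularized. Illustrative only."""
    words = category.split()
    for stop in ("of", "in", "from", "by"):
        if stop in words:
            words = words[: words.index(stop)]
            break
    if not words:
        return None
    head = words[-1].lower()
    return head[:-1] if head.endswith("s") else head

# Toy sense inventory: headword -> candidate classes, most frequent first.
senses = {"singer": ["Person"], "bank": ["FinancialInstitution", "Slope"]}

def category_class(category):
    """Link a category to a class via its headword, disambiguating by
    frequency (pick the most frequent sense)."""
    cls = senses.get(category_headword(category))
    return cls[0] if cls else None

print(category_class("American singers of German origin"))  # Person
print(category_class("Banks of Germany"))                   # FinancialInstitution
```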
Voting Procedure
problem:
regular polysemy: Wikipedia articles simultaneously cover several metonymically related senses
e.g. Brown University is both a College and a GroupOfPeople
will cause inconsistencies when the axioms are added
solution:
look at top-level branches for each proposed class (locations, artifacts, abstract entities, etc.)
voting procedure to determine the most salient branch (ties broken arbitrarily)
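The voting step amounts to a majority count over top-level branches. A minimal Python sketch, with an illustrative branch table that does not reflect the real SUMO hierarchy:

```python
from collections import Counter

def most_salient_branch(proposed_classes, branch_of):
    """Map each class proposed for an article to its top-level branch and
    pick the branch with the most votes (ties broken arbitrarily, matching
    the slide). Branch assignments here are illustrative."""
    votes = Counter(branch_of[c] for c in proposed_classes if c in branch_of)
    if not votes:
        return None
    return votes.most_common(1)[0][0]

# Illustrative top-level branch table (not the real SUMO hierarchy).
branch_of = {
    "College": "Agent", "GroupOfPeople": "Agent",
    "Building": "Artifact", "Campus": "Region",
}
# Brown University proposed as College, GroupOfPeople, and Building:
print(most_salient_branch(["College", "GroupOfPeople", "Building"], branch_of))
# Agent wins with 2 votes against 1, so only Agent-branch classes are kept.
```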
From WordNet to SUMO
in further cases, the mappings yield a property or relation
→ create new WordNet-based class, add axioms of the form
(=>
(instance ?ENTITY Guitarist)
(property ?ENTITY Musician))
Then recursively move up WordNet’s class hierarchy, adding parent classes, until a genuine parent class in SUMO is available.
Statements
Information Extraction
YAGO uses manual rules and heuristics to extract information about entities from Wikipedia pages
mainly based on categories and infoboxes, not on article text, e.g. geographical location, spouse, etc.
manual rewriting rules to express facts using SUMO’s terms
sample evaluation: for each relation, at least 95% of the statements are accurate
SUMO Integration
mapping rules
new relations added to SUMO when necessary
incl. additional rules for reasoning
(instance establishedOnDate BinaryRelation)
(domain establishedOnDate 1 Agent)
(domain establishedOnDate 2 TimeInterval)
(=> (establishedOnDate ?OBJ ?TIME)
(exists (?FOUNDING)
(and (instance ?FOUNDING Founding)
(result ?FOUNDING ?OBJ)
(overlapsTemporally (WhenFn ?FOUNDING) ?TIME))))
Statements with Literals
proper encoding of literals with units:
e.g. (MeasureFn 3.0 SquareMeter)
date ranges are recast:
(exists (?DAYNO ?MONTHNO ?YEARNO)
(and
(birthdate HerveyDeStanton
(DayFn ?DAYNO
(MonthFn ?MONTHNO
(YearFn ?YEARNO))))
(greaterThanOrEqualTo ?YEARNO 1270)
(lessThanOrEqualTo ?YEARNO 1279)))
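Emitting such a statement for a year range is pure string templating. A Python sketch under the assumption that only the year is uncertain (function name and signature are my own, not from the YAGO-SUMO code):

```python
def birthdate_range_axiom(entity, year_lo, year_hi):
    """Render the KIF pattern from the slide for a birth date known only
    up to a range of years. Purely string templating, no reasoning."""
    return (
        "(exists (?DAYNO ?MONTHNO ?YEARNO)\n"
        "  (and\n"
        f"    (birthdate {entity}\n"
        "      (DayFn ?DAYNO\n"
        "        (MonthFn ?MONTHNO\n"
        "          (YearFn ?YEARNO))))\n"
        f"    (greaterThanOrEqualTo ?YEARNO {year_lo})\n"
        f"    (lessThanOrEqualTo ?YEARNO {year_hi})))"
    )

print(birthdate_range_axiom("HerveyDeStanton", 1270, 1279))
```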
Additional Grounding
statements of the form
(representsInLanguage
"Immanuel Kant" ImmanuelKant EnglishLanguage)
produce a greater level of formal grounding of the semantics of term names
when names are ambiguous, providing such symbolic strings for multiple languages can further reduce the range of possible interpretations
classes are better specified due to their extensional characterization
Summary
SUMO: axiomatic representation of common sense knowledge, but lack of simple encyclopedic facts
YAGO methodology: add entities and statements about them from Wikipedia
semi-automatic techniques, basic amount of manual work
→ formal ontology with around two million entities and several million statements and axioms
SUMO is catapulted from an upper-level ontology to a full-fledged all-purpose KB
Open source, available online: http://www.demelo.org/yagosumo/