Ontology is a branch of philosophy that deals with the nature and the organization of reality
Science of Being (Aristotle, Metaphysics)
What characterizes being?
Ultimately, what is being?
Ontologies in Computer Science
Ontology refers to an engineering artifact
a specific vocabulary used to describe a certain reality
a set of explicit assumptions regarding the intended meaning of the vocabulary
An Ontology is
an explicit specification of a conceptualization [Gruber 93]
a shared understanding of a domain of interest [Uschold/Gruninger 96]
Why Develop an Ontology?
Make domain assumptions explicit
Easier to change domain assumptions
Easier to understand and update legacy data
Separate domain knowledge from operational knowledge
Re-use domain and operational knowledge separately
A community reference for applications
Shared understanding of what information means
Types of Ontologies [Guarino, 98]
Top-level ontologies: Describe very general concepts like space, time, event, which are independent of a particular problem or domain. It seems reasonable to have unified top-level ontologies for large communities of users.
Domain ontologies: Describe the vocabulary related to a generic domain by specializing the concepts introduced in the top-level ontology.
Task ontologies: Describe the vocabulary related to a generic task or activity by specializing the concepts introduced in the top-level ontology.
Application ontologies: These are the most specific ontologies. Concepts in application ontologies often correspond to roles played by domain entities while performing a certain activity.
Ontologies and Their Relatives
Catalog / ID
Terms / Glossary
Thesauri
Informal Is-a
Formal Is-a
Formal Instance
Frames
Value Restrictions
General logical constraints
Axioms
Disjoint
Inverse Relations, ...
Knowledge Organization Systems
Semantic Lexicons – e.g. WordNet
… group together words according to lexical semantic relations like synonymy, hyponymy, meronymy, antonymy, etc.
Thesauri – e.g. MeSH, EuroVoc
… group together domain terms according to a set of taxonomic relations, including broader term, narrower term, sibling, etc.
Semantic Networks and Ontologies
… group together classes of objects according to a set of relations that originate in the nature of the domain of application.
Ontologies are defined by a formal semantics, whereas semantic networks may be defined only informally. In this sense every ontology can be read as a semantic network, but not every semantic network is an ontology.
Thesauri - Examples
MeSH
MeSH Heading: Databases, Genetic
Entry Term: Genetic Databases
Entry Term: Genetic Sequence Databases
Entry Term: OMIM
Entry Term: Online Mendelian Inheritance in Man
Entry Term: Genetic Data Banks
Entry Term: Genetic Data Bases
Entry Term: Genetic Databanks
Entry Term: Genetic Information Databases
See Also: Genetic Screening
EuroVoc
MT 3606 natural and applied sciences
UF gene pool
UF genetic resource
UF genetic stock
UF genotype
UF heredity
BT1 biology
BT2 life sciences
NT1 DNA
NT1 eugenics
RT genetic engineering (6411)
EuroVoc covers terminology in all of the official EU languages for all fields that concern the EU institutions, e.g., politics, trade, law, science, energy, agriculture; 27 such fields in total.
MeSH (Medical Subject Headings) is organized by terms (currently over 250,000) that correspond to a specific medical subject. For each such term a list of syntactic, morphological or semantic variants is given.
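The variant lists in a thesaurus entry can be used to normalize free-text terms to one preferred heading. The following is a minimal sketch of that idea, not how MeSH or EuroVoc are actually distributed or queried: the `ThesaurusEntry` class, its field names and `build_index` are hypothetical, and only the terms themselves come from the MeSH example above.

```python
from dataclasses import dataclass, field


@dataclass
class ThesaurusEntry:
    """One preferred term with its thesaurus relations (hypothetical layout)."""
    descriptor: str
    used_for: list = field(default_factory=list)  # entry terms / UF variants
    broader: list = field(default_factory=list)   # BT
    narrower: list = field(default_factory=list)  # NT
    related: list = field(default_factory=list)   # RT / See Also


def build_index(entries):
    """Map every descriptor and variant (lower-cased) to its descriptor."""
    index = {}
    for entry in entries:
        index[entry.descriptor.lower()] = entry.descriptor
        for variant in entry.used_for:
            index[variant.lower()] = entry.descriptor
    return index


# Terms copied from the MeSH example; the data structure is an assumption.
mesh_entry = ThesaurusEntry(
    descriptor="Databases, Genetic",
    used_for=[
        "Genetic Databases",
        "Genetic Sequence Databases",
        "OMIM",
        "Online Mendelian Inheritance in Man",
        "Genetic Data Banks",
        "Genetic Data Bases",
        "Genetic Databanks",
        "Genetic Information Databases",
    ],
    related=["Genetic Screening"],
)

index = build_index([mesh_entry])
print(index["omim"])  # Databases, Genetic
```

Related terms are deliberately left out of the index: "Genetic Screening" is a neighbouring subject, not a variant of the heading.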
Semantic Networks - Examples
UMLS Semantic Network
Pharmacologic Substance affects Pathologic Function
Pharmacologic Substance causes Pathologic Function
Pharmacologic Substance complicates Pathologic Function
Pharmacologic Substance diagnoses Pathologic Function
Pharmacologic Substance prevents Pathologic Function
Pharmacologic Substance treats Pathologic Function
Gene Ontology
Accession: GO:0009292
Ontology: biological process
Synonyms: broad: genetic exchange
Definition: In the absence of a sexual life cycle, the processes involved in the introduction of genetic information to create a genetically different individual.
Term Lineage
all : all (164142)
GO:0008150 : biological process (115947)
GO:0007275 : development (11892)
GO:0009292 : genetic transfer (69)
GO (Gene Ontology) allows for “consistent descriptions of gene products in different databases, including several of the world’s major repositories for plant, animal and microbial genomes…” Organizing principles are molecular function, biological process and cellular component.
UMLS (Unified Medical Language System) integrates linguistic, terminological and semantic information. The Semantic Network consists of 134 semantic types and 54 relations between types.
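Both examples can be read as small labelled graphs. As a rough sketch, assuming a plain set of (subject, relation, object) triples and a child-to-parent table (neither is the native format of UMLS or GO), the relations and the term lineage above can be queried as follows. The function names are hypothetical.

```python
# Type-level relations copied from the UMLS Semantic Network example.
TRIPLES = {
    ("Pharmacologic Substance", relation, "Pathologic Function")
    for relation in (
        "affects", "causes", "complicates", "diagnoses", "prevents", "treats",
    )
}

# GO term lineage from the example, stored as child -> parent links.
PARENT = {
    "GO:0009292": "GO:0007275",  # genetic transfer -> development
    "GO:0007275": "GO:0008150",  # development -> biological process
    "GO:0008150": "all",
}


def relations_between(subject, obj, triples=TRIPLES):
    """Return the sorted relation names that link subject to obj."""
    return sorted(rel for s, rel, o in triples if s == subject and o == obj)


def lineage(term, parent=PARENT):
    """Follow parent links from term up to the root."""
    path = [term]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


print(relations_between("Pharmacologic Substance", "Pathologic Function"))
print(lineage("GO:0009292"))
```

The triples here hold between semantic types, not between individual drugs and diseases; instance-level facts would need a separate layer.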
Example Ontology
Consider an Example Ontology for the Newspaper Domain
Ontologies are used to semantically organize and retrieve data (structured, textual, multimedia) through knowledge markup
Consider the following example:
Knowledge Markup from Text is based on Named-Entity Recognition, Semantic Tagging (Term to Class Mapping) and Relation Extraction
<news:story xmlns:jobs="http://www.jobs.org/owl-jobs#" xmlns:com="http://www.companies.org/owl-companies#" xmlns:it="http://www.it.net/owl-it#">
"We were surprised by several of the results, particularly the order of finish," said <jobs:SystemsAnalyst> Dan Olds </jobs:SystemsAnalyst>. <com:Company> IBM </com:Company> finished first with very strong results, and <com:Company> HP </com:Company> scored a solid number two; we expected to see <com:Company> Sun Microsystems </com:Company> challenging for first place or at least a strong second place. As the largest <it:operatingsystem> UNIX </it:operatingsystem> vendor in terms of number of installed systems, a third place finish should put their management on notice that their installed base may be vulnerable.
</news:story>
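Once text carries this kind of markup, retrieval by ontology class reduces to reading the element names. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened version of the story. The URI bound to the `news` prefix is a placeholder invented for the example (the original markup does not declare one), and `entities_by_class` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Shortened story; the "news" namespace URI is a placeholder assumption.
STORY = """<news:story xmlns:news="http://www.example.org/news#"
    xmlns:jobs="http://www.jobs.org/owl-jobs#"
    xmlns:com="http://www.companies.org/owl-companies#"
    xmlns:it="http://www.it.net/owl-it#">
"We were surprised by several of the results," said
<jobs:SystemsAnalyst> Dan Olds </jobs:SystemsAnalyst>.
<com:Company> IBM </com:Company> finished first, and
<com:Company> HP </com:Company> scored a solid number two.
</news:story>"""


def entities_by_class(xml_text):
    """Group annotated text spans by the class URI that marks them."""
    root = ET.fromstring(xml_text)
    found = defaultdict(list)
    for element in root:  # direct children are the annotated spans
        # ElementTree expands a prefixed tag to "{namespace-uri}LocalName".
        uri, local = element.tag[1:].split("}")
        found[uri + local].append(element.text.strip())
    return dict(found)


for class_uri, mentions in entities_by_class(STORY).items():
    print(class_uri, mentions)
```

Keying the result on the full URI rather than the prefix keeps the lookup stable even if another document binds the same ontology to a different prefix.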
Knowledge Markup - Images
Semantic Annotation of Medical Images (miAKT Project - UK)
Semantic Annotation of Video (SmartMedia – DFKI KM)
Ontology Life-Cycle – Ontology Population
Create/Select: Development and/or Selection
Populate: Knowledge Base Generation
Validate: Consistency Checks
Evolve: Extension, Modification
Maintain: Usability Tests
Deploy: Knowledge Retrieval
Ontology Population with SOBA
SOBA: SmartWeb Ontology-based Annotation
SmartWeb (http://www.smartweb-projekt.de/) – German Project around World-Cup 2006
Multimodal Dialog Processing
IR-based Question Answering
Ontology-Based Information Extraction
Semantic Web Services
Ontology-Based Information Extraction …
Semantic Wrapping of Semi-Structured Data
Semantic and Linguistic Annotation of Free Text
Inference Rules for Instantiation and Integration of Annotated Entities and Events
… and Display
Ontology-driven Hyperlink Generation for Display of Extracted Information
SOBA – Processing and Data Flow
[Figure components: Documents, Ontologies, Wrapping of Semi-Structured Data, PDF Analysis, Image Extraction, Linguistic Annotation, Named Entity Recognition & Semantic Tagging, Inference Rules for Instantiation & Integration, Knowledge Base]
Information Extraction from Free Text
MatchEvent [Score, Team1, Team2], FootballPlayer
Information Extraction from Image Captions
FoulEvent [FootballPlayer], FootballPlayer
Linguistic and Semantic Annotation
Mark Crossley saved twice with his legs from Huckerby.
Named Entity Recognition & Semantic Tagging
[Mark Crossley GOALKEEPER] [saved GOALKEEPER_ACTION] twice with his legs from [Huckerby PLAYER].
Linguistic Annotation
[Mark Crossley GOALKEEPER : SUBJ] [saved PRED : GOALKEEPER_ACTION] twice [with his legs PP_OBJ] [from [Huckerby PLAYER] PP_ADJUNCT].
[GOALKEEPER_ACTION = 'save', GOALKEEPER = 'Mark Crossley', PLAYER = 'Huckerby', MANNER = 'legs']
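The semantic tagging step can be approximated with a lookup table that maps surface forms to ontology classes. This is a toy sketch under that assumption, not SOBA's actual pipeline: the `GAZETTEER` and `LEMMAS` tables are invented for this one sentence, and the MANNER slot is left out because filling it needs the syntactic analysis shown above.

```python
import re

# Hypothetical gazetteer: surface form -> semantic class.
GAZETTEER = {
    "Mark Crossley": "GOALKEEPER",
    "Huckerby": "PLAYER",
    "saved": "GOALKEEPER_ACTION",
}
# Toy lemma table so the action slot holds 'save' rather than 'saved'.
LEMMAS = {"saved": "save"}


def tag(sentence, gazetteer=GAZETTEER):
    """Return (surface form, class) pairs in order of appearance."""
    # Longest entries first so multi-word names win over shorter overlaps.
    terms = sorted(gazetteer, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b"
    return [(m.group(0), gazetteer[m.group(0)])
            for m in re.finditer(pattern, sentence)]


def to_frame(tags):
    """Fill one slot per semantic class, lemmatizing where a lemma is known."""
    return {cls: LEMMAS.get(surface, surface) for surface, cls in tags}


sentence = "Mark Crossley saved twice with his legs from Huckerby."
tags = tag(sentence)
frame = to_frame(tags)
print(tags)
print(frame)
```

A dictionary keyed by class keeps only one filler per class, which is enough for this sentence but would overwrite earlier mentions in a sentence naming two players.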
Example Sentence from Match Report
Allerdings ist Petrow fuer die Partie gegen Schweden gesperrt und kann erst gegen Ungarn eingesetzt werden.
“However, Petrow is suspended for the match against Sweden and cannot be fielded again until the match against Hungary.”
Annotated/Extracted Information (with SProUT IE Tool - DFKI-LT )
player_action & [GAME_EVENT "Ban",
AGENT player & [SURNAME "PETROW"],
IN_MATCH game & [TEAM2 "SWE", TOURNAMENT "Match"]]
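To populate a knowledge base, a typed structure like this has to be turned into instances with properties. The sketch below shows one possible flattening into (subject, property, value) triples. The nested-dict transcription, the generated instance names and the `instantiate` function are assumptions made for illustration; they are not the SProUT output format or SOBA's inference rules.

```python
from itertools import count

# The extracted structure, transcribed by hand into nested dicts.
# "type" holds the part written before "&" in the original notation.
EXTRACTED = {
    "type": "player_action",
    "GAME_EVENT": "Ban",
    "AGENT": {"type": "player", "SURNAME": "PETROW"},
    "IN_MATCH": {"type": "game", "TEAM2": "SWE", "TOURNAMENT": "Match"},
}


def instantiate(structure, triples=None, ids=None):
    """Flatten a typed structure into triples; return (root id, triples)."""
    triples = [] if triples is None else triples
    ids = count(1) if ids is None else ids
    node = f"{structure['type']}_{next(ids)}"  # generated instance name
    triples.append((node, "type", structure["type"]))
    for feature, value in structure.items():
        if feature == "type":
            continue
        if isinstance(value, dict):
            # Nested structures become their own instances, linked by feature.
            child, _ = instantiate(value, triples, ids)
            triples.append((node, feature, child))
        else:
            triples.append((node, feature, value))
    return node, triples


root, triples = instantiate(EXTRACTED)
for triple in triples:
    print(triple)
```

Integration with existing knowledge (for example, recognizing that this player already exists in the knowledge base) is a separate step that this sketch does not attempt.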
Term Disambiguation & Compositional Interpretation
Statistical Analysis & Clustering (e.g. FCA)
(Shallow) Linguistic Parsing
Anonymous Relations (e.g. with Association Rules)
Named Relations (Linguistic Parsing)
(Linguistic) Compound Analysis
Web Mining, Social Network Analysis
(Linguistic) Compound Analysis (incl. WordNet)
Overview of Current Work:
Paul Buitelaar, Philipp Cimiano, Bernardo Magnini. Ontology Learning from Text: Methods, Evaluation and Applications. Frontiers in Artificial Intelligence and Applications Series, Vol. 123, IOS Press, July 2005.