The document discusses methods for evaluating ontologies. It proposes developing objective metrics to evaluate ontologies based on three criteria: correctness, completeness, and utility. Correctness evaluates how well an ontology expresses its design objectives. Completeness evaluates how fully an ontology captures required semantic components. Utility combines correctness and completeness and evaluates an ontology's usefulness for its intended use case. Examples are provided to illustrate evaluating ontologies based on the proposed metrics. The goal is to develop standardized evaluation methods to facilitate ontology development and reuse across different domains.
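As a rough illustration of how the three criteria could be turned into numbers, the sketch below scores a toy ontology. The formulas are assumptions made for this example only; the slides do not specify them. Correctness is taken as the share of the ontology's assertions that a reviewer has confirmed, completeness as the share of use-case terms the ontology covers, and utility as the harmonic mean of the two. The names `Scores`, `harmonic_mean`, and `score_ontology`, and the coffee data, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    correctness: float
    completeness: float
    utility: float


def harmonic_mean(a: float, b: float) -> float:
    """Harmonic mean of two scores; 0.0 if either score is 0."""
    return 0.0 if a == 0 or b == 0 else 2 * a * b / (a + b)


def score_ontology(asserted, confirmed, required_terms, ontology_terms) -> Scores:
    """Score an ontology under the illustrative (assumed) definitions.

    asserted       -- (subject, predicate, object) triples the ontology states
    confirmed      -- triples a domain reviewer accepts as right for the use case
    required_terms -- terms the use case needs
    ontology_terms -- terms the ontology actually defines
    """
    asserted, confirmed = set(asserted), set(confirmed)
    required, terms = set(required_terms), set(ontology_terms)

    # Assumed proxy for correctness: fraction of assertions that were confirmed.
    correctness = len(asserted & confirmed) / len(asserted) if asserted else 0.0
    # Assumed proxy for completeness: fraction of required terms that are covered.
    completeness = len(required & terms) / len(required) if required else 1.0
    # Assumed combination rule: penalizes an ontology that is weak on either axis.
    return Scores(correctness, completeness, harmonic_mean(correctness, completeness))


# Hypothetical toy data: one wrong assertion, one missing required term.
asserted = {
    ("Espresso", "subClassOf", "Coffee"),
    ("Latte", "subClassOf", "Coffee"),
    ("Coffee", "subClassOf", "Beverage"),
    ("Tea", "subClassOf", "Coffee"),
}
confirmed = asserted - {("Tea", "subClassOf", "Coffee")}
required_terms = {"Coffee", "Beverage", "Espresso", "Origin"}
ontology_terms = {"Coffee", "Beverage", "Espresso", "Latte", "Tea"}

example = score_ontology(asserted, confirmed, required_terms, ontology_terms)
print(example)
```

The harmonic mean is only one possible way to combine the two scores; a weighted sum or a minimum would serve equally well if the use case values one criterion over the other.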
Why are some websites successful (at behavioral change) Informs International...Joanne Luciano
Medicine is increasingly concerned with improving public health. This often implies motivating people to change behaviors and practice better health habits ( healthy eating, exercise, smoking cessation). The theoretical foundations for these models of behavior change are preventative but not truly participatory. There is more concern with metrics than with the lived experience of patients. In this talk we introduce the web as a medical device not simply a medium of exchange, by bringing together Health Web Science, a formulation for 21st century medicine, and behavioral change theory. We ask data analysts to aim to develop new metrics to help answer these questions and inform policy makers.
The General Ontology Evaluation Framework (GOEF) & the I-Choose Use CaseA ...Joanne Luciano
Example of the application of General Ontology Evaluation Framework (GOEF) to uses cases from I-Choose, a transnational project (NSF (USA) and CONACYT (Mexico)). to build an interoperable data architecture to support ethical consumption, with a focus on sustainable coffee products produced in Mexico and consumed and distributed in Canada and in the US.
Components of I-Choose System: A set of data standards to share information across sustainable supply-chain and a governance system.
National Institute of Standards and Technology
NIST Grant No.: 60NANB12D201
PI: Joanne S. Luciano
Luciano informs healthcare_2015 Nashville, TN USA July 30 2015Joanne Luciano
This talk presents and explains Health Web Science, Health Web Observatories, and the technologies needed to create and utilize them as an approach towards preferable health outcomes in the 21st century. Health Web Science (HWS), which impact of the Web on health and wellbeing, aims towards a preventative, participatory, personalized, and predictive (P4) model of healthcare. HWS posits this can be achieved by the leveraging of the Web’s data, resources and nature. In studying the Web, it is impossible to ignore the evolving social, political, economic, policy questions that emerge as a result of the use of the Web. Health Web Observatories play a role by enabling the study of these data, make available the metadata, and thereby enable it as a feedback mechanism for preferable futures.
Ontology Support for Influenza and Surveillance Joanne Luciano
This is a presentation about the construction of the influenza ontology to support Influenza research and surveillance as part of the Genomics for Bio-forensics MITRE sponsored research.
Translational Medicine: Patterns of Response to Antidepressant Treatment and ...Joanne Luciano
This is a talk I gave at the IEEE Schenectady Section - 17 MAY Membership Meeting.
The mission of my depression research is to help people figure out what they need to help them get out of a depressed state. That is, finding out what is best for them, not what is best for their doctor, friends, therapist, or anyone else. Depression is now a global problem. In the past 15 years it has gotten worse. Depression is complex; it has a wide range of varying symptoms and degrees of intensity. It can be challenging to determine the best course of action, whether medical treatment is necessary, or which of the many treatments (drug and non-drug) is the best match. Many people who are depressed do not get the help they need, and many people receive medications when they are not necessary. My work aims to bring together tools, technology, scientific and medical data and patient experience to help address depression, both personally and globally.
Joanne S. Luciano, PhD Defense @ Boston University, 1996. Neural Network Models of Unipolar Depression. Patterns of Recovery and Prediction of Outcome. Work led to two US patents.
1. Ontology Evaluation:
Methods and Metrics
MITRE Research Interest
Dr. Joanne S. Luciano
In collaboration with
Dr. Leo Obrst, PhD
Suzette Stoutenburg
Kevin Cohen
Jean Stanford
Approved for Public Release. Distribution Unlimited. Case Number: 08-0849
2. Ontology: A Key Technology
for Knowledge Management
Used to describe terms in a vocabulary
and the relationships among them.
Ontology languages vary in their
semantic expressiveness.
Based on work by Leo Obrst of MITRE as interpreted by
Dan McCreary. This can be viewed as a trade-off of
semantic clarity vs. the time and money it takes to construct.
http://www.mkbergman.com/?m=20070516
Ontologies have become the most
widespread form of knowledge
representation for multiple purposes.
Ontology Summit 2007 (NIST, Gaithersburg, MD, April 23-24, 2007)
3. The Problem
Ontology Elephants
• There is no single real elephant
• An elephant is abstract
• There must be a purpose for an elephant: use cases?
• An elephant is very abstract
• There must be an upper elephant
• An elephant is really very simple
• An elephant is the result of consensus
• There are only distributed elephants & their mappings
• Open vs. Closed Elephant
Prospects and Possibilities for Ontology Evaluation: The View from NCOR,
Obrst, Hughes, Ray, WWW 2006, May 22–26, 2006, Edinburgh, UK.
4. The Problem
Users need to be able to build sound ontologies and to reuse ontologies for different
purposes. There is no standard way to do that now.
At the Ontology Summit 2008 two competing “state of the art” evaluation proposals for the
Open Ontology Repository were presented. Both treat ontologies as “black boxes” and both
are subjective evaluations.
– Peer Review – self appointed editorial review board decides, non-overlapping domain so first one
gets preference, ‘best practices’
– User Ratings – users report their experience and rate them on a website
Language was inserted into the communiqué to provide mechanisms that enable ontology evaluation
by other metrics
Prior work on evaluating ontologies is limited, with no consensus
– Ontology Workshop Methods and Metrics, October 2007
Workshop materials at: http://sites.google.com/a/cme.nist.gov/workshop-on-ontology-evaluation/
– Formal logic based applications can use software reasoners to address logical consistency and
classification; many don’t take advantage of these tools
– Natural Language Processing (NLP) applications use NLP evaluation methods, but address only NLP
applications (text mark-up – aka “annotation”), information retrieval and extraction
– Alignment (mapping of ontologies) for data mining, integration, fusion
Ontology Summit 2007 (NIST, Gaithersburg, MD, April 23-24, 2007) See slide at end with notes.
5. Objective: Evolve toward Science &
Engineering Discipline for Ontology
• Create procedures, processes, methods to help define,
adjudicate, & ensure quality of knowledge
capture/representation
• Facilitate the education of communities on ontology
development & promote best practices for ontology
development
• Enable the best standards related to ontologies, &
promote linkages, liaisons among standards
organizations
6. Approach:
• Two stages (novel approach):
– Recast use case into its components:
• Functional objective
• Design objective & requirements specification
• Semantic components required to achieve above
– Evaluate components using objective metrics
• Place existing evaluation methods in context by utility
• Engage and rally the community / stakeholders
– Participate at appropriate meetings: present work and facilitate
community focus on objective metrics for evaluation
– Introduce users with complementary skills, joint vision, shared
needs to develop needed metrics, content, tests, tools
– Involve multiple government agencies, industry and academia
to support initiative
7. Research Plan: (1) Identify use cases
For each, recast the use cases into their components:
a. Specify functional objective (what it is, what it does)
e.g. enable investigation of data collected on influenza strain mutations
that cause death in birds
b. Specify design objective (how good it has to be, e.g., what
specifications have to be met? Is it a prototype, for commercial
use, or must it meet military specifications?)
e.g. Must meet: Minimum Information about an Influenza Genotype
and a Nomenclature Standard (MIIGNS)
c. Identify (or specify) the semantic components required to achieve
the functional objective to the level specified by the design
objective (Authoritative Sources such as engineering tolerances,
physical constants, legal jurisdictions, company policies)
e.g. To meet MIIGNS, the following semantic components must
be included:
biomaterial transformations
assays
data transformations
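The recasting step above can be sketched as a small data structure. This is only an illustrative sketch: the class name `UseCase`, its field names, and the helper `missing_components` are hypothetical and do not come from the slides; the example values are the influenza examples given on this slide.

```python
from dataclasses import dataclass, field


@dataclass
class UseCase:
    """One use case recast into the three components named on the slide.

    The class and field names are illustrative assumptions.
    """
    functional_objective: str   # what it is, what it does
    design_objective: str       # how good it has to be
    semantic_components: list[str] = field(default_factory=list)


# The influenza example from the slide, recast into its components.
influenza = UseCase(
    functional_objective=(
        "enable investigation of data collected on influenza strain "
        "mutations that cause death in birds"
    ),
    design_objective="MIIGNS",
    semantic_components=[
        "biomaterial transformations",
        "assays",
        "data transformations",
    ],
)


def missing_components(use_case, provided_components):
    """Return the required semantic components that are not yet provided."""
    provided = set(provided_components)
    return [c for c in use_case.semantic_components if c not in provided]
```

For instance, `missing_components(influenza, ["assays"])` would list the two components an ontology covering only assays still lacks.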
8. Research Plan (2) Develop Metrics
Develop metrics for 3 criteria for evaluation:
1. Correctness: how well the functional components express the
design objectives
a) Language expressiveness
b) Fluency / competency
2. Completeness: combines use case with design criteria
(requirement specification)
a) To what extent can requirements be met?
b) Which semantic components (authoritative sources) are needed/missing?
3. Utility: Is it useful? Does it work?
Combine 1 and 2 (correctness and completeness) and evaluate
against the use case (by competency questions, or other challenge
tests).
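One possible way to score the utility criterion with competency questions is sketched below. The slides do not define a scoring formula, so everything here is an assumption: the ontology is reduced to a set of (subject, predicate, object) triples, each competency question is a hypothetical check function, and utility is taken to be the fraction of questions that pass.

```python
def utility_score(triples, competency_questions):
    """Fraction of competency questions the ontology can answer.

    `competency_questions` is a list of (description, check) pairs, where
    `check` takes the set of triples and returns True or False.
    This scoring rule is an assumption, not something the slides define.
    """
    if not competency_questions:
        raise ValueError("at least one competency question is required")
    passed = sum(1 for _, check in competency_questions if check(triples))
    return passed / len(competency_questions)


# Hypothetical toy ontology fragment, for illustration only.
toy_ontology = {
    ("weight assay", "is_a", "assay"),
    ("mouse infection", "is_a", "biomaterial transformation"),
}

# Hypothetical competency questions expressed as simple checks.
questions = [
    ("Is a weight assay an assay?",
     lambda t: ("weight assay", "is_a", "assay") in t),
    ("Is any data transformation defined?",
     lambda t: any(obj == "data transformation" for _, _, obj in t)),
]

score = utility_score(toy_ontology, questions)
```

A real challenge test would likely run queries through a reasoner rather than inspect triples directly; the sketch only shows the shape of the evaluation loop.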
9. Examples:
• BioPAX (prior work)
• Habitat-Lite (subset of Environmental
Ontology in support of NSF-funded Mining
Metadata for Metagenomics)
• Influenza Infectious Disease Ontology (for
Genomics for Bioforensics MSR)
10. Example (1) BioPAX lack of fluency
chemical structure & pathway steps incorrectly modeled
- misunderstanding of the language (language has capability)
- modeled disjoint from the biology & chemistry
- leads to logical inconsistency
OWL has a steep learning curve; it’s easy to get things wrong.
11. Example (2) Habitat-Lite:
correctness & completeness
Objective: facilitate capture of habitat and
environmental metadata on genomic sequences
Approach: select subset of terms with highest frequency and
evaluate usefulness by correctness and completeness metrics
– Evaluated correctness
• 64% agreement (84 of 132 terms) of automated and expert mapping of
terms
– Evaluated coverage of terms
• 84% exact matches (“host,” “aquatic,” and “soil” covered 75%)
Hirschman, Clark, Cohen, Mardis, Luciano, Kottmann, Cole, Markowitz, Kyrpides, Field
Habitat-Lite: a GSC Case Study Based on Free Text Terms for Environmental Metadata
OMICS A Journal of Integrative Biology Volume 12, Number 2, 2008 (in press)
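The correctness figure above is a simple agreement rate: 84 of 132 terms is about 63.6%, which the slide reports as 64%. A minimal sketch of that calculation follows; the function name and the dictionary-based input format are assumptions for illustration, not the procedure used in the Habitat-Lite study.

```python
def percent_agreement(automated, expert):
    """Percentage of shared terms where automated and expert mappings agree.

    Both arguments map a free-text term to the ontology class chosen for it.
    The input format is an assumption made for this sketch.
    """
    shared = automated.keys() & expert.keys()
    if not shared:
        raise ValueError("no terms in common to compare")
    agree = sum(1 for term in shared if automated[term] == expert[term])
    return 100.0 * agree / len(shared)


# Reproducing the reported figure from the counts on the slide.
reported = 100.0 * 84 / 132   # about 63.6, reported as 64%
```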
12. Example (3) Enable Influenza Research
(proposed construction and subsequent evaluation)
Function: enable investigation of data collected on
influenza strain mutations that cause death in birds
Design objective: Minimum Information about an Influenza
Genotype and a Nomenclature Standard (MIIGNS)
Semantic components:
1. biomaterial transformations
a. recombinant plasmid biomaterial transformation
b. site-directed mutagenesis biomaterial transformation
c. reverse genetic virus production biomaterial transformation
d. Mouse infection biomaterial transformation
2. assays
a. weight assay
b. virus replication / mouse lung assay
c. Cytokine quantification assay
3. data transformations
a. statistical difference evaluation
13. Example (3) Enable Influenza Research
(proposed construction and subsequent evaluation)
Correctness:
Language expressivity: validate definitions against
OBO Foundry relations
Fluency: inter-developer agreement (3 developers; 2
code same, 3rd validates)
Completeness:
Calculate % coverage of minimum terms (18 terms)
Calculate % coverage of full terms (196 terms)
Utility: Challenge Questions
To be developed (by our collaborator BioHealthBase)
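The completeness measure above (% coverage of a required term list) can be sketched as follows. The function name and the exact-match rule are assumptions; the actual 18-term and 196-term lists are not given in the slides, so none are reproduced here.

```python
def coverage(ontology_terms, required_terms):
    """Return (% of required terms present, sorted list of missing terms).

    Matching is by exact string equality, which is an assumption; a real
    evaluation might normalize case or match synonyms.
    """
    required = set(required_terms)
    if not required:
        raise ValueError("required_terms must not be empty")
    covered = required & set(ontology_terms)
    percent = 100.0 * len(covered) / len(required)
    return percent, sorted(required - covered)
```

The same function would be called twice, once with the minimum term list and once with the full term list, to obtain the two coverage figures named above.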
14. Impact
Communities of Practice areas need objective methods and
metrics to facilitate the development, interoperability and reuse
of their ontologies
Some examples:
– Message Based Data Exchange
– BioSecurity
– Health Care and Biomedicine
– Life Sciences
– Disease
– Metagenomics
– Agile systems for rapid enterprise integration of
heterogeneous data
– Intelligence Community
15. Why MITRE?
MITRE is uniquely positioned to act as an
impartial experimental designer and
arbitrator in the development of an
evaluation methodology for ontologies
– MITRE has acted in this role in the past for natural language
technologies, such as text summarization in the Text
REtrieval Conferences (TREC) and information
extraction in the BioCreAtIvE challenge
16. Additional Notes on Specific Slides
Slide 1: Lower right graphic: Ontology Summit 2007 (NIST, Gaithersburg, MD, April 23-24, 2007) effort. See the following:
Ontology Summit 2007 - Ontology, Taxonomy, Folksonomy: Understanding the Distinctions. http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2007.
Ontology Summit 2007 Communique. http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2007_Communique.
Ontology Summit 2007 Ontology Dimensions Map. http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2007_FrameworksForConsideration/DimensionsMap.
Gruninger, Michael; Olivier Bodenreider; Frank Olken; Leo Obrst; Peter Yim. 2008. The 2007 Ontology Summit Joint
Communiqué. Ontology, Taxonomy, Folksonomy: Understanding the Distinctions. Journal of Applied Ontology, forthcoming.
Slide 2: Ontology Summit 2008 (NIST, Gaithersburg, MD, April 28-29, 2008).
Ontology Summit 2008: Toward an Open Ontology Repository. http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2008.
Ontology Summit 2008 Communique. http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2008_Communique.
Slide 4:
Concerning:
“At the Ontology Summit 2008 two competing “state of the art” evaluation proposals for the Open Ontology Repository were
presented. Both treat ontologies as “black boxes” and both are subjective evaluations.
– Peer Review – self appointed editorial review board decides, non-overlapping domain so first one gets preference, ‘best
practices’
– User Ratings – users report their experience and rate them on a website”
These points were made during the summit and discussed more fully during the Quality and Gatekeeping session of the
Ontology Summit 2008 (NIST, Gaithersburg, MD, April 28-29, 2008).
17. Background References
The BioCreAtIvE (Critical Assessment of Information Extraction systems in Biology) challenge evaluation consists of a community-wide effort for evaluating text mining
and information extraction systems applied to the biological domain. BioCreative (http://biocreative.sourceforge.net/).
The Text REtrieval Conference (TREC), co-sponsored by the National Institute of Standards and Technology (NIST) and U.S. Department of Defense, was started in
1992 as part of the TIPSTER Text program. Its purpose was to support research within the information retrieval community by providing the infrastructure
necessary for large-scale evaluation of text retrieval methodologies. (http://trec.nist.gov/overview.html)
The Message Understanding Conferences (MUC) were initiated and financed by DARPA to encourage the development of new and better methods of information
extraction. The character of this competition -- many concurrent research teams competing against one another -- necessitated the development of standards for
evaluation, e.g. the adoption of recall and precision. (http://en.wikipedia.org/wiki/Message_Understanding_Conference)
Lenat, Douglas B. “CYC: a large-scale investment in knowledge infrastructure,” Communications of the ACM, Volume 38, Issue 11 (November 1995), Pages: 33-38.
Project Halo (http://www.projecthalo.com/) is a project funded by Paul Allen's Vulcan Ventures. The project was initially led by Oliver Roup and Noah Friedland but is
currently led by Mark Greaves, a former DARPA Program Manager. Project Halo is an attempt to apply Artificial Intelligence techniques to the problem of producing
a "digital Aristotle" that might serve as a mentor, providing comprehensive access to the world's knowledge (http://en.wikipedia.org/wiki/Project_Halo).
Ontoprise is a commercial software provider of ontology-based solutions. (http://www.ontoprise.de/content/index_eng.html)
Maedche, Alexander and Staab, Steffen. Ontology learning for the semantic web. Special Issue on Semantic Web. IEEE Intelligent Systems, 16(2):72-79, MAR 2001.
López, M. F.; Gómez-Pérez, A.; Sierra, J. P. & Sierra, A. P. Building a chemical ontology using Methontology and the Ontology Design Environment. IEEE Intelligent
Systems and Their Applications, 1999, 14, 37-46
Oltramari, A.; Gangemi, A.; Guarino, N. & Masolo, C. Restructuring WordNet's Top-Level: The OntoClean approach. Proceedings of the Workshop OntoLex'2, Ontologies
and Lexical Knowledge Bases, 2002
Guarino, N. & Welty, C. Evaluating ontological decisions with OntoClean. Commun. ACM, ACM Press, 2002, 45, 61-65
Smith, B.; Williams, J. & Schulze-Kremer, S. The ontology of the gene ontology. AMIA Annu Symp Proc, Institute for Formal Ontology and Medical Information Science,
University of Leipzig., 2003, 609-613.
Smith, B. From concepts to clinical reality: an essay on the benchmarking of biomedical terminologies. J Biomed Inform, Department of Philosophy and National Center
for Biomedical Ontology, University at Buffalo, Buffalo, NY 14260, USA. phismith@buffalo.edu, 2006, 39, 288-298
Obrst, Leo; Todd Hughes; Steve Ray. 2006. Prospects and Possibilities for Ontology Evaluation: The View from NCOR. Workshop on Evaluation of Ontologies for the
Web (EON2006), Edinburgh, UK, May 22, 2006.
Obrst, Leo; Werner Ceusters; Inderjeet Mani; Steve Ray; Barry Smith. 2007. The Evaluation of Ontologies: Toward Improved Semantic Interoperability. Chapter in:
Semantic Web: Revolutionizing Knowledge Discovery in the Life Sciences, Christopher J. O. Baker and Kei-Hoi Cheung, Eds., Springer, 2007.
Gangemi, Aldo; Carola Catenacci; Massimiliano Ciaramita; Jos Lehmann; contributions by: Rosa Gil (in section 2.2), Francesco Bolici and Onofrio Strignano (in
section 2.4). 2004. Ontology evaluation and validation: An integrated formal model for the quality diagnostic task. OntoEval 2004.
Gangemi, A.; Catenacci, C.; Ciaramita, M.; Lehmann, J. 2005. A Theoretical Framework for Ontology Evaluation and Validation. In Proceedings of SWAP2005.
http://www.loa-cnr.it/Papers/swap_final_v2.pdf
Gangemi, Aldo; Carola Catenacci; Massimiliano Ciaramita; and Jos Lehmann. 2006. Modelling ontology evaluation and validation. To appear in Proceedings of
ESWC2006, Springer.
Lawrence Hunter, Mike Bada, K. Bretonnel Cohen, Helen Johnson, William Baumgartner, Jr. and Philip V. Ogren. "Ontology Quality Metrics," Center for Computational
Pharmacology University of Colorado School of Medicine, October 8, 2007.
Lynette Hirschman, Jong C. Park, Junichi Tsujii, Limsoon Wong, and Cathy H. Wu. Accomplishments and challenges in literature data mining for biology.
Bioinformatics 18: 1553-1561.
Proceedings of NIST 2007 Automatic Content Extraction Workshop (ACE), 2007. http://www.nist.gov/speech/tests/ace/ace07/.
Methods and Metrics for Ontology Evaluation Workshop (Sponsors: NIST and NIH), National Institute of Standards and Technology, Gaithersburg, MD, October 25-26.
Mani, I., B. Sondheim, D. House, L. Obrst. 1998. TIPSTER Text Summarization Evaluation: Final Report, MITRE technical report, Reston, VA, September, 1998.
18. Background: Prior Technical
Approaches
• Evaluation in use - Navigli et al. 2003, Porzel and Malaka
2005
– Best case: Halo Project - Friedland et al. 2004
• Data-driven evaluation - essentially the fit between the
ontology and a knowledge source e.g. a corpus - Brewster
et al. 2004
• Gold Standard approaches, very common, used by, for
example, Cimiano et al. 2005
– Major problem is the arbitrary choice of an ontology
– Dellschaft and Staab 2006, proposed a method to derive
IR/NLP type Precision/Recall/F-Measure
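For readers unfamiliar with the IR/NLP measures named above, here is a minimal sketch of precision, recall and F-measure computed over flat sets of terms against a gold standard. This is the plain set-based form only, offered as an assumption-laden illustration; it is not the method of Dellschaft and Staab, which the slide cites but does not describe.

```python
def precision_recall_f1(candidate_terms, gold_terms):
    """Set-based precision, recall and F1 of an ontology's terms
    against a gold-standard term set.

    Exact string matching is assumed; empty inputs yield 0.0 scores.
    """
    candidate, gold = set(candidate_terms), set(gold_terms)
    true_positives = len(candidate & gold)
    precision = true_positives / len(candidate) if candidate else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Because the result depends entirely on which gold standard is chosen, the sketch also makes the "arbitrary choice of an ontology" problem noted above easy to see: swapping `gold_terms` changes every score.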
19. Background : Philosophical and Methodological
Approaches
Methontology approach of Gomez-Perez:
• Focus on evaluating procedural or formative aspects of ontology
construction
• Criteria included: Consistency, Completeness, Conciseness,
Expandability
• Some tools developed reflecting these approaches:
o Lam et al. 2004, Knublauch et al. 2004, Alani 2005, 2006
OntoClean approach of Guarino and Welty:
• Philosophical approach based on theoretical principles:
• Proposed a set of "metaproperties":
o Rigidity
o Identity
o Unity
• Much focus on "cleaning up" existing "ontologies" such as
WordNet so as to make them more rigorous