Six Myths about Ontologies: The Basics of Formal Ontology
1. Six Myths about Ontologies
John Beverley, PhD
Assistant Professor, University at Buffalo
Co-Director, National Center for Ontological Research
Affiliate Faculty, Institute of Artificial Intelligence and Data Science
4. Information Silos
• An information silo is an information repository, e.g. management system,
database, the content of which cannot be integrated with that of other
information repositories using computing strategies
• Information silos may manifest for a variety of reasons:
• Ignorance – Do not realize a given information repository exists
• Inaccessible – Do not have the appropriate permissions to access
• Infeasible – Do not have the appropriate technology to access
• Insane – Do not care about integrating with other repositories
6. Cost of Silos
• A 2020 report by NIST estimated that the lack of interoperability across industrial datasets costs companies between 21 and 43 billion
• McKinsey estimates mid-size companies spend 20-50 million annually due to silos
10. Promise of Ontology Engineering
• Ontologies are formally well-defined machine-interpretable controlled
vocabularies designed to represent entities and logical relationships
among them
• Ontologies make explicit the implicit meanings buried in datasets, by
using basic principles of formal logic
• Ontologies provide a semantic layer to connect information silos
11. Promise of Ontology Engineering
• Exhibiting standardized syntax/semantics, addressing VARIETY
• Represented in formal languages facilitating consistency, addressing VERACITY
• Queryable for information and inferences, addressing VALUE
• Providing a lingua franca across information silos, addressing VOLUME
14. 6 Myths about Ontologies
1. Ontology development is easy
2. Ontology development is too hard
3. Ontologies require changing the way we speak
4. Ontology solutions require using one master ontology
5. We do not need ontologies because we have X
6. The DOD-IC Ontology Foundry will never work
16. Knowledge Representation is Easy!
• This myth exhibits a grain of truth and distortion of fact
• Constructing an ontology can be easy...just write some Python to read a file
and generate classes/relations from column headers
• Constructing an ontology according to a standard is more challenging
• But that is how we avoid information silos
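To make the "easy" route concrete, here is a minimal sketch of a script that reads a file and turns column headers into classes. The function name, the sample data, and the `http://example.org/naive#` namespace are all hypothetical. Note what is missing: none of the generated classes has a parent in any shared standard, which is how a new silo gets built.

```python
import csv
import io


def naive_ontology_from_csv(csv_text, base_iri="http://example.org/naive#"):
    """Turn each column header into an OWL class declaration (Turtle syntax).

    Illustrative only: the classes are not aligned to any top-level ontology.
    """
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader)  # first row holds the column headers
    lines = [
        f"@prefix : <{base_iri}> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "",
    ]
    for header in headers:
        # Build a CamelCase local name from the header text.
        local_name = "".join(word.capitalize() for word in header.strip().split())
        lines.append(f":{local_name} a owl:Class .")
    return "\n".join(lines)


# Hypothetical sample data.
sample = "patient name,blood pressure,visit date\nAda,120/80,2024-01-05\n"
turtle = naive_ontology_from_csv(sample)
print(turtle)
```

Two teams running a script like this on two spreadsheets would get two unrelated vocabularies, even where the columns mean the same thing.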
20. Basic Formal Ontology
BFO is such a standard, used by over 600 open-source groups, the first
ISO/IEC top-level ontology standard, and a “baseline standard” for
DOD-IC ontology development
22. Basic Formal Ontology
Aligning to BFO requires identifying where in the BFO hierarchy the
content of interest falls, a task that often requires careful analysis
23. Myth 1: Summary
• Just building ontologies that make explicit the semantics implicit in
data is not enough to avoid information silos
• If you do not build to align with a standard, then you will likely
recreate the interoperability issues ontologies are designed to address
• If you build to align with a standard, such as BFO, then the task is not
trivial
24. 6 Myths about Ontologies
1. Ontology development is easy
2. Ontology development is too hard
3. Ontologies require changing the way we speak
4. Ontology solutions require using one master ontology
5. We do not need ontologies because we have X
6. The DOD-IC Ontology Foundry will never work
25. Challenge of Ontology Development
• This myth exhibits a grain of truth and distortion of fact
• Ontology development is challenging, but not too challenging
• The BFO community has developed strategies for streamlining the
development process
• Regardless, the benefits of ontologies are worth the effort
26. Hub & Spoke Strategy
• Ontologies extending from BFO are modules in a larger hub & spoke structure
• Ontologies are extended by downward population: new classes have parent classes in a hierarchy ultimately leading to a BFO class
[Figure label: BFO]
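Downward population can be checked mechanically. The sketch below assumes a hypothetical child-to-parent table and treats "independent continuant" as a stand-in for the set of hub classes. It walks up the hierarchy and reports whether a class ultimately reaches the hub.

```python
# Hypothetical child -> parent links; labels are illustrative only.
PARENT = {
    "material entity": "independent continuant",
    "agent": "material entity",
    "person": "agent",
    "invoice": "document",  # "document" has no parent in this table
}

# Stand-in for the classes supplied by the hub ontology.
HUB_CLASSES = {"independent continuant"}


def reaches_hub(cls, parent=PARENT, hub=HUB_CLASSES):
    """Return True if following parent links from cls ends at a hub class."""
    seen = set()
    while cls not in hub:
        if cls in seen or cls not in parent:
            return False  # cycle, or a dead end outside the hub
        seen.add(cls)
        cls = parent[cls]
    return True


print(reaches_hub("person"))   # climbs person -> agent -> material entity -> hub
print(reaches_hub("invoice"))  # stops at "document", which has no parent
```

A class that fails this check is a candidate silo: nothing ties its semantics to the shared hub.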
27. Hub & Spoke Strategy
• Provides guardrails for promoting alignment between ontologies representing nearby domains
• Progress towards interoperability is ensured upfront, since spokes share semantics with the same hub
[Figure label: BFO]
29. Python Analogy
• BFO is analogous to the Python programming language; extensions of
BFO – such as CCO – are analogous to Python libraries
• You could create code that allows you to interact with, say, dataframes or you
could instead start with Python and import a library like Pandas
• You could create ontology elements that allow you to model artifacts and
processes or you could instead start with BFO and import a library like CCO
33. Definition Construction Strategy
• Material Entity (Elucidation) – An independent continuant that has some
portion of matter as part
• Agent (Definition) - A material entity that is capable of performing
intentional acts
Definitions of ontology elements are created following a recipe, to
ensure the hub semantics are preserved by spokes, and minimize
human error
Downward population leverages the definition scheme:
A is a B that Cs
Where B is the parent class under which A falls
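The "A is a B that Cs" recipe is regular enough to capture in a small data structure. This is a minimal sketch, not part of any BFO tooling; the `TermDefinition` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermDefinition:
    label: str        # A: the term being defined
    parent: str       # B: the parent class under which A falls
    differentia: str  # Cs: what distinguishes A from other Bs

    def render(self):
        """Render the definition in the 'A is a B that Cs' form."""
        article = "An" if self.parent[0].lower() in "aeiou" else "A"
        return f"{self.label} =def {article} {self.parent} that {self.differentia}"


agent = TermDefinition(
    label="Agent",
    parent="material entity",
    differentia="is capable of performing intentional acts",
)
print(agent.render())
# Agent =def A material entity that is capable of performing intentional acts
```

Because every definition names its parent, the hierarchy can be read directly off the definitions.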
38. Interoperability Guardrails
• The hub & spoke strategy provides guardrails for promoting
alignment between ontologies representing nearby and
overlapping domains
• By following the recipe, progress towards interoperability is ensured
upfront, since elements inherited from the hub ontologies overlap
semantically
• By not following such a strategy you get...say it with me now...
41. Scope Creep
• The hub & spoke strategy significantly cuts down on scope creep,
which arises when an ontology is created as an information silo, but over
time grows beyond its initial scope
• If the ontology does not extend from a higher-level ontology, it will
likely not be compatible with other ontologies
• Developers will then need to recreate existing work because it is not
interoperable with their ontology...
43. Myth 2: Summary
• Ontology development is not easy
• But it is not too challenging, especially when one follows established
guidelines for ontology engineering
• The benefits of high-quality ontologies outweigh costs by
addressing interoperability challenges, promoting reuse, and avoiding
scope creep
44. 6 Myths about Ontologies
1. Ontology development is easy
2. Ontology development is too hard
3. Ontologies require changing the way we speak
4. Ontology solutions require using one master ontology
5. We do not need ontologies because we have X
6. The DOD-IC Ontology Foundry will never work
46. What do you mean by “talk”?
I’m joking, calm down
47. Human Ontologists
• Again, a grain of truth mixed with a distortion of fact
• One major goal for an ontologist is translating the way domain experts
talk into an ontology representation
• But some ontologists do attempt to change the way domain experts talk,
inappropriately
48. Pop Quiz
• Suppose you approach a web developer to build a website that shows
videos of your travels on the landing page, which will involve JavaScript (JS)
• What should the web developer do:
A. Explain in detailed JS how they will create the landing page
B. Teach you how to read or write JS
C. Identify your needs then write the JS themselves
50. Ontology Analogy
• Suppose you approach an ontologist to represent your domain expertise,
which will involve using, say, BFO
• What should the ontologist do:
A. Explain in BFO terms how they will represent your domain
B. Teach you how to read or use BFO
C. Identify your needs then create the ontology themselves
55. Consensus-Building Exercises
• Let us not blame the field for human error
• Especially since more often ontologists
merely seem like they are attempting to
change the way domain experts talk
• Ontologists rely on domain experts when
modeling, using competency questions and
consensus-building exercises
56. Competency Questions
• Competency questions are – roughly – questions that domain experts
would like answers to with respect to a given domain
• Competency questions are used to guide ontology development and
generate automated checks to ensure answers are sufficient
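One way to read "automated checks" is as a test that a competency question can be answered from the data at all. The sketch below uses a hypothetical set of triples and a hypothetical question ("Which reports are about port activity?"). A real project would more likely run a query language such as SPARQL against a triple store.

```python
# Hypothetical (subject, predicate, object) triples for illustration.
TRIPLES = {
    ("report-17", "is about", "port-activity"),
    ("report-17", "has author", "analyst-3"),
    ("report-22", "is about", "port-activity"),
}


def subjects_with(predicate, obj, triples=TRIPLES):
    """Answer 'which things stand in this relation to this object?'."""
    return sorted(s for s, p, o in triples if p == predicate and o == obj)


def check_competency_question(answer, minimum=1):
    """Pass only if the question returns at least `minimum` answers."""
    return len(answer) >= minimum


answer = subjects_with("is about", "port-activity")
print(answer, check_competency_question(answer))

# A question the data cannot answer fails the check.
unanswered = subjects_with("has reviewer", "analyst-3")
print(unanswered, check_competency_question(unanswered))
```

A failing check points at a gap in the ontology or the data, which is the feedback the domain experts need.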
57. Consensus-Building Exercises
• Consensus-building exercises are where ontologists and domain experts
work towards an agreed understanding of ontology terms, definitions, etc.
• Importantly, whatever
agreement is reached
is meant to be added
to the ontology;
domain experts can
continue speaking as
they need
58. Consensus-Building Labels
• Domain experts need to use labels in specific ways for specific purposes
• We distinguish labels for ontologists from labels for users
• If domain experts find value in calling something X, let them
• X may have a different name in the ontology, but then again
“=SUM(A1; A3)” is not how I talk about addition in natural language...
59. Consensus-Building Definitions
• Intelligence =def The product resulting from the collection, processing,
integration, evaluation, analysis, and interpretation of available
information concerning foreign nations, hostile or
potentially hostile forces or elements, or areas of
actual or potential operations, as well as the activities
that result in the product and the organizations
engaged in such activities.
• Counterexamples:
• Redundant if ‘available’ means ‘accessible by investigation’
• Overlooks acquisition if ‘available’ means ‘pre-existing’
• Police gather intelligence on domestic criminal activity
• Companies gather intelligence prior to merger
• Apparently, you cannot gather intelligence on an individual
63. Myth 3: Summary
• Ontology engineering is not a normative discipline; it is descriptive
• Ontologists should model the way you talk, not correct it
• Unless you have made some significant error, in which case you would
probably be happy for the assist
64. 6 Myths about Ontologies
1. Ontology development is easy
2. Ontology development is too hard
3. Ontologies require changing the way we speak
4. Ontology solutions require using one master ontology
5. We do not need ontologies because we have X
6. The DOD-IC Ontology Foundry will never work
65. N-Squared Problem
• During the early days of the web, datasets were coded in distinct syntaxes
without an eye towards interoperability with other datasets
• Connecting disparate datasets
requires two-way mappings:
• 2 datasets – 2 mappings
• 3 datasets – 6 mappings
• 4 datasets – 12 mappings
....
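The counts above follow n × (n − 1): each of n datasets needs a mapping to every other dataset. The sketch below compares that with a hub arrangement, under the assumption (not stated on the slide) that each dataset needs one mapping to the hub and one back, for 2n in total.

```python
def pairwise_mappings(n):
    """Two-way mappings between every pair of n datasets: n * (n - 1)."""
    return n * (n - 1)


def hub_mappings(n):
    """Assumed hub cost: one mapping to the hub and one back per dataset."""
    return 2 * n


for n in (2, 3, 4, 10, 100):
    print(n, pairwise_mappings(n), hub_mappings(n))
```

Under this assumption the two approaches break even at three datasets, and the hub's advantage grows quickly after that.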
66. Semantic Web
• The advent of the “Semantic Web” ushered in a series of strategies for
promoting interoperability, partly aimed at addressing this problem
• The Resource Description
Framework was leveraged
as a way to mitigate the
n-squared problem, by
creating hubs of
interoperability
67. Rejecting the Mono-Ontology Myth
• It was recognized early that a single language used by everyone was unwise
• But progress towards interoperability could be made by mapping disparate
lexicons into as few standard languages as possible
• It is the ontologist’s
job to create a path
from such lexicons
into ontologies
68. Another Python Analogy
• Python is a popular language in no small part because it can be extended
with numerous libraries and bindings created to/from other languages
• Want to use a C++ library in Python? Use a Python-C++ binding.
Haskell? Use a Python-Haskell binding.
• Want to use the Credential Transparency Language in BFO? Hold
consensus-building exercises and let the ontologists work
69. Myth 4: Summary
• It is sometimes suggested that the ultimate goal of ontology engineering
is that everyone use a single monolithic ontology
• Such a goal seems impossible at worst and unwise at best
• Connecting to as few ontology hubs as possible is a tractable,
preferable, strategy in line with semantic web goals
70. 6 Myths about Ontologies
1. Ontology development is easy
2. Ontology development is too hard
3. Ontologies require changing the way we speak
4. Ontology solutions require using one master ontology
5. We do not need ontologies because we have X
6. The DOD-IC Ontology Foundry will never work
75. MadLibs Ontology Edition™
• Fill in the blank:
We don’t need ontologies because we have ___
Taxonomies
Concept Maps
Data Models
Code
76. First
• Ontologies are not
• Taxonomies – hierarchical classifications of types
• Concept maps – visual representations of class relationships
• Data models – abstract syntaxes to support information modeling
• Code – You know what code is
77. Second
• Ontologies are not replaceable by
• Taxonomies – But do contain them as proper parts
• Concept maps – As they are not machine-interpretable
• Data models – But are built on such things, e.g. RDF
• Code – Implicit business logic ≠ explicit semantics
78. MadLibs Ontology Edition™
• Fill in the blank:
We don’t need ontologies because we have ___
Taxonomies
Concept Maps
Data Models
Code
Machine Learning
79. To a Hammer, Everything is a Nail
• Many developers attempt to automate
ontology creation without recognizing
interoperability requires a
different toolkit
• You are not going to machine learn
your way to semantic interoperability
80. A Simple Thought Experiment
• Knowledge representation and machine learning are not at odds
• Machine learning strategies are fundamentally statistical, but imagine you
could train a machine learning algorithm for 100% accuracy
• The result would be a knowledge graph wrapped in code
81. Knowledge Graph Wrapped in Code
• Introducing ontologies and
knowledge graphs into
machine learning training
pipelines can rule out
noise
• Using less data to train
algorithms to the same
level of accuracy at a
significantly faster rate
82. A (Simple?) Example
• An open source intelligence agent wants to identify the 10 highest-impact
open source documents relevant to a given OSINT problem
• This is like looking for 10
needles in a million haystacks…
• Ontologies – alongside ML and
NLP – can help to automate
identification of the relevant
high-impact documents
83. Ontologies in a Pipeline
1. Ontologists work with subject-matter experts and research a domain to produce an ontology
2. High-impact texts for the domain are identified and annotated with ontology terms
3. The labeled texts are used to train a machine learning algorithm, which builds a model designed to identify other high-impact texts
4. The machine learning model is then used by natural language processing programs to automate text annotation
5. Results are evaluated for accuracy and errors are reported; this often creates a feedback loop that informs further ontology design
88. MadLibs Ontology Edition™
• Fill in the blank:
We don’t need ontologies because we have ___
Taxonomies
Concept Maps
Data Models
Code
Machine Learning
Large-Language Models
89. Present Summer
• The current AI summer is one
of big data, cheaper & faster
computing, sophisticated
machine learning and
generative techniques
• But also hype and
overpromising that should
strike anyone who knows
the history of AI as familiar
90. Present Summer
• Ontologies and knowledge
graphs are considered
crucial resources for
advances in generative AI
• They are presently at the
heart of numerous
strategies for addressing
LLM hallucinations and
mitigating bias
91. Present Summer
• Advances in generative AI
can help address concerns
raised in modern applications
of knowledge graphs
• Advances in knowledge
representation can help
temper expectations and
improve the quality of
generative AI
95. Enhancing Knowledge Graphs
• Knowledge graph
construction, e.g.
coreference resolution
• Knowledge graph Q/A,
e.g. entity and relation
extraction
96. Enhancing Both
• Constructing knowledge
graphs used for LLM-enhanced
Q/A responses
• Completing a knowledge
graph with LLM information
that reveals further gaps,
filling of which provides better
LLM training
97. Detour on RAG
• Retrieval-augmented generation (RAG) involves retrieving documents possibly
relevant to a question, using keyword search, then asking the model to
generate answers with the retrieved documents as additional context
• Effective when there is keyword overlap between retrieved documents
and the question
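The retrieval step described above can be sketched with plain keyword overlap. Everything here is illustrative: the documents, the function names, and the prompt layout are assumptions, and no model is called. The prompt is simply what would be handed to one.

```python
import re


def keywords(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, k=2):
    """Rank documents by how many keywords they share with the question."""
    q = keywords(question)
    scored = [(len(q & keywords(doc)), doc) for doc in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:k] if score > 0]


def build_prompt(question, documents):
    """Assemble the retrieved documents and the question into one prompt."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents))
    return f"Context:\n{context}\n\nQuestion: {question}"


# Hypothetical document store.
DOCS = [
    "BFO is a top-level ontology.",
    "Pandas is a Python library for dataframes.",
    "An information silo cannot be integrated with other repositories.",
]
question = "What is an information silo?"
prompt = build_prompt(question, DOCS)
print(prompt)
```

The second retrieved document matches only on the word "is", a small instance of retrieval pulling in content that shares keywords without being relevant.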
100. Detour on RAG
• Much less effective with code and math prompts since specifying
keywords that overlap with retrieved documents is challenging
• Models often get “distracted” by irrelevant content or ignore retrieved
documents, relying on parametric memory
• History doesn’t repeat, but it often rhymes
101. Myth 5: Summary
• ML and LLMs are not designed to address the interoperability challenges
where ontologies and knowledge graph solutions shine
• It is time AI researchers leverage a fuller toolkit
• Knowledge representation is a needed supplement to these more commonly
deployed technologies
• Let us not continue doing the same thing expecting different results...
102. 6 Myths about Ontologies
1. Ontology development is easy
2. Ontology development is too hard
3. Ontologies require changing the way we speak
4. Ontology solutions require using one master ontology
5. We do not need ontologies because we have X
6. The DOD-IC Ontology Foundry will never work
103. Gene Ontology - 1998
The mission of the GO Consortium
is to develop a comprehensive
model of biological systems,
ranging from the molecular to the
organism level, across species in
the tree of life.
104. Proliferation of Ontologies
• When developed correctly, ontologies provide common vocabularies
with common semantics across multiple domains
• The success of the Gene Ontology led to a proliferation of ontologies
developed by subject-matter experts, computer scientists, and logicians
• Almost none of which were developed in coordination
• The result was massive incompatibility of terms and relations,
confusion, in-fighting, name-calling, etc.
106. Open Biological and Biomedical Ontologies
• In 2005, a consortium of biologists
decided to create standards for ontology
development
• Such as requiring ontologies be open-source,
have documentation, include definitions for
vocabulary terms, and...
• Align to a top-level ontology which provides a
starting point for all ontology development...
107. Incomplete History of Foundry Efforts
[Timeline figure: OBO Foundry and Basic Formal Ontology (2005); Common Core Ontologies (2012); Industrial Ontologies Foundry (2018); Basic Formal Ontology, ISO 21838:2 (2020); DoD-IC Foundry (2024)]
108. Industrial Ontologies Foundry
• In 2018, stakeholders from manufacturing and
service industries – encountering the rise of
the Internet of Things and accompanying
interoperability challenges – followed the
OBO Foundry strategy
• The Industrial Ontologies Foundry also adopts
Basic Formal Ontology as its core
109. Industrial Ontologies Foundry
• Over 170 industry partners, many of whom have competing economic
interests, agree interoperability problems can only be addressed collectively
111. Myth 6: Summary
• If academics across various institutions and disciplines can create
and sustain a foundry...
• If individuals and organizations across various for-profit and non-profit
companies can create and sustain a foundry...
• Surely the DOD-IC can as well
112. Grace and Growth
• In this talk, I’ve targeted both ontologists and non-ontologists, those in
favor and those opposed
• My conclusion is that ontology engineering is a feasible path forward
• With a corollary that if you are, in fact, opposed to pursuing this path,
you’ll need more than myths
Editor's Notes
Promote the reuse of domain knowledge from other ontologies, through a common formal language: OWL
Internationalized Resource Identifiers (IRIs) promote tracking of entities and cross-ontology references, support interoperability
Promote human-human, human-computer, and computer-computer communication
Enable automated inference useful for deriving implicit information, as they effectively connect parts of taxonomies to other parts of taxonomies in machine-readable ways
This led to the development of the Knowledge Interchange Format, or KIF: If there were a single “interlingua”, i.e., logic-based language capable of expressing everything that is expressible in any of the systems in a given environment, then, to exchange information between systems, instead of O(n^2) pairwise translators, one would only need O(n).