Six Myths about Ontologies
John Beverley, PhD
Assistant Professor, University at Buffalo
Co-Director, National Center for Ontological Research
Affiliate Faculty, Institute of Artificial Intelligence and Data Science
Outline
• Preliminary Remarks
• Elephants in the Room
Information Silos
• An information silo is an information repository, e.g. management system,
database, the content of which cannot be integrated with that of other
information repositories using computing strategies
• Information silos may manifest for a variety of reasons:
• Ignorance – Do not realize a given information repository exists
• Inaccessible – Do not have the appropriate permissions to access
• Infeasible – Do not have the appropriate technology to access
• Insane – Do not care about integrating with other repositories
Cost of Silos
• A 2020 report by NIST estimated that the lack of interoperability across industrial datasets costs companies between 21 and 43 billion
• McKinsey estimates mid-size companies spend 20-50 million annually due to silos
Promise of Ontology Engineering
• Ontologies are formally well-defined machine-interpretable controlled
vocabularies designed to represent entities and logical relationships
among them
• Ontologies make explicit the implicit meanings buried in datasets, by
using basic principles of formal logic
• Ontologies provide a semantic layer to connect information silos
Promise of Ontology Engineering
• Exhibiting standardized syntax/semantics, addressing VARIETY
• Represented in formal languages facilitating consistency, addressing VERACITY
• Queryable for information and inferences, addressing VALUE
• Providing a lingua franca across information silos, addressing VOLUME
*A knowledge graph is just an ontology that includes data
6 Myths about Ontologies
1. Ontology development is easy
2. Ontology development is too hard
3. Ontologies require changing the way we speak
4. Ontology solutions require using one master ontology
5. We do not need ontologies because we have X
6. The DOD-IC Ontology Foundry will never work
Myth 1: Ontology development is easy
Knowledge Representation is Easy!
• This myth exhibits a grain of truth and distortion of fact
• Constructing an ontology can be easy: just write some Python to read a file
and generate classes/relations from column headers
• The result, however, is just another information silo
• Constructing an ontology according to a standard is more challenging
• But that is how we avoid information silos
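The "easy" route mentioned on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the two CSV snippets, the `http://example.org/...` namespaces, and the function name `naive_ontology_from_csv` are all hypothetical, and real tooling would emit RDF/OWL rather than plain strings.

```python
import csv
import io


def naive_ontology_from_csv(csv_text, namespace):
    """Turn each column header into a class identifier in a local namespace.

    This is the 'easy' approach: no shared top-level ontology, no definitions.
    """
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader)
    return {
        f"{namespace}#{header.strip().title().replace('_', '')}"
        for header in headers
    }


# Two hypothetical datasets describing the same kind of thing with different headers.
hr_csv = "employee_name,hire_date\nAda,2020-01-01\n"
payroll_csv = "staff_member,start_day\nAda,2020-01-01\n"

hr_classes = naive_ontology_from_csv(hr_csv, "http://example.org/hr")
payroll_classes = naive_ontology_from_csv(payroll_csv, "http://example.org/payroll")

# Nothing connects the two vocabularies, so the shared meaning stays implicit.
shared = hr_classes & payroll_classes
print(sorted(hr_classes))
print(sorted(payroll_classes))
print("shared classes:", shared)
```

Each file yields a tidy set of classes, yet the two sets have nothing in common, which is the silo problem reappearing one level up.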
Basic Formal Ontology
BFO is such a standard, used by over 600 open-source groups, the first
ISO/IEC top-level ontology standard, and a “baseline standard” for
DOD-IC ontology development
BFO Ecosystem 600+ Projects
Basic Formal Ontology
Aligning to BFO requires identifying where in the BFO hierarchy the content of
interest falls, a task that often requires careful analysis
Myth 1: Summary
• Just building ontologies that make explicit the semantics implicit in
data is not enough to avoid information silos
• If you do not build to align with a standard, then you will likely
recreate the interoperability issues ontologies are designed to address
• If you build to align with a standard, such as BFO, then the task is not
trivial
Myth 2: Ontology development is too hard
Challenge of Ontology Development
• This myth exhibits a grain of truth and distortion of fact
• Ontology development is challenging, but not too challenging
• The BFO community has developed strategies for streamlining the
development process
• Regardless, the benefits of ontologies are worth the effort
Hub & Spoke Strategy
• Ontologies extending from BFO are modules in a larger hub & spoke structure
• Ontologies are extended by downward population: new classes have parent classes in a hierarchy ultimately leading to a BFO class
Hub & Spoke Strategy
• Provides guardrails for promoting alignment between ontologies representing nearby domains
• Progress towards interoperability is ensured upfront, since spokes share semantics with the same hub
Python Analogy
• BFO is analogous to the Python programming language; extensions of
BFO, such as the Common Core Ontologies (CCO), are analogous to Python libraries
• You could create code that allows you to interact with, say, dataframes or you
could instead start with Python and import a library like Pandas
• You could create ontology elements that allow you to model artifacts and
processes or you could instead start with BFO and import a library like CCO
The Common Core Ontologies
Definition Construction Strategy
• Material Entity (Elucidation) – An independent continuant that has some
portion of matter as part
• Agent (Definition) - A material entity that is capable of performing
intentional acts
Definitions of ontology elements are created following a recipe, to
ensure the hub semantics are preserved by spokes, and minimize
human error
Downward population leverages the definition scheme:
A is a B that Cs
Where B is the parent class under which A falls
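A minimal sketch of the "A is a B that Cs" scheme, assuming a toy in-memory representation. The `OntologyClass` type, the helper names, and the `analyst` and `widget` classes are hypothetical; only the Material Entity and Agent wording comes from the slide, and `BFO_CLASSES` is a two-item subset used purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

# A tiny subset of BFO class labels, for illustration only.
BFO_CLASSES = {"independent continuant", "material entity"}


def _article(noun: str) -> str:
    return "an" if noun[0].lower() in "aeiou" else "a"


@dataclass(frozen=True)
class OntologyClass:
    label: str        # A
    parent: str       # B, the parent class under which A falls
    differentia: str  # the "that Cs" part

    def definition(self) -> str:
        """Render the definition following the scheme 'A is a B that Cs'."""
        head = f"{_article(self.parent).capitalize()} {self.parent}"
        return f"{self.label} =def {head} that {self.differentia}"


def ancestors(label: str, classes: Dict[str, OntologyClass]) -> List[str]:
    """Walk parent links upward, stopping at unknown parents or on a cycle."""
    chain: List[str] = []
    current = classes.get(label)
    while current is not None and current.parent not in chain:
        chain.append(current.parent)
        current = classes.get(current.parent)
    return chain


def reaches_hub(label: str, classes: Dict[str, OntologyClass]) -> bool:
    """True when the class, or one of its ancestors, is a hub (BFO) class."""
    return label in BFO_CLASSES or any(a in BFO_CLASSES for a in ancestors(label, classes))


CLASSES = {
    "material entity": OntologyClass(
        "material entity", "independent continuant", "has some portion of matter as part"),
    "agent": OntologyClass(
        "agent", "material entity", "is capable of performing intentional acts"),
    # Hypothetical spoke class added by downward population.
    "analyst": OntologyClass("analyst", "agent", "is trained to evaluate information"),
    # Hypothetical class with no path to the hub.
    "widget": OntologyClass("widget", "thing", "is sold in our store"),
}

for name in CLASSES:
    print(CLASSES[name].definition(), "| reaches hub:", reaches_hub(name, CLASSES))
```

Because every definition names its parent, a check like `reaches_hub` can flag classes that were added without a path back to the hub.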
Interoperability Guardrails
• The hub & spoke strategy provides guardrails for promoting
alignment between ontologies representing nearby and
overlapping domains
• By following the recipe, progress towards interoperability is ensured
upfront, since elements inherited from the hub ontologies overlap
semantically
• By not following such a strategy you get...say it with me now...
SILO
[Figure panels: NO TOP-LEVEL, TOP-LEVEL]
Scope Creep
• The hub & spoke strategy significantly cuts down on scope creep,
which arises when an ontology is created as an information silo, but over
time grows beyond its initial scope
• If the ontology does not extend from a higher-level ontology, it will
likely not be compatible with other ontologies
• Developers will then need to recreate existing work because it is not
interoperable with their ontology...
SCOPE CREEP
Myth 2: Summary
• Ontology development is not easy
• But it is not too challenging, especially when one follows established
guidelines for ontology engineering
• The benefits of high-quality ontologies outweigh costs by
addressing interoperability challenges, promoting reuse, and avoiding
scope creep
Myth 3: Ontologies require changing the way we speak
What do you mean by “talk”?
I’m joking, calm down
Human Ontologists
• Again, a grain of truth mixed with a distortion of fact
• One major goal for an ontologist is translating the way domain experts
talk into an ontology representation
• But some ontologists do attempt to change the way domain experts talk,
inappropriately
Pop Quiz
• Suppose you approach a web developer to build a website that shows
videos of your travels on the landing page, which will involve Javascript (JS)
• What should the web developer do:
A. Explain in detailed JS how they will create the landing page
B. Teach you how to read or write JS
C. Identify your needs then write the JS themselves
Ontology Analogy
• Suppose you approach an ontologist to represent your domain expertise,
which will involve using, say, BFO
• What should the ontologist do:
A. Explain in BFO terms how they will represent your domain
B. Teach you how to read or use BFO
C. Identify your needs then create the ontology themselves
FOR HUMANS
*Slide courtesy of CUBRC, reflecting the Joint Doctrine Ontology project; for more information, contact Alex Cox (alexander.cox@cubrc.org)
FOR MACHINES
Consensus-Building Exercises
• Let us not blame the field for human error
• Especially since, more often than not, ontologists merely seem to be attempting to change the way domain experts talk
• Ontologists rely on domain experts when modeling, using competency questions and consensus-building exercises
Competency Questions
• Competency questions are – roughly – questions that domain experts
would like answers to with respect to a given domain
• Competency questions are used to guide ontology development and
generate automated checks to ensure answers are sufficient
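One way a competency question might become an automated check, sketched with a tiny in-memory triple set rather than a real triple store or SPARQL endpoint. Every name here (`alice`, `report_17`, `author_of`, and the helper functions) is hypothetical.

```python
# A tiny knowledge graph as (subject, predicate, object) triples.
TRIPLES = {
    ("alice", "instance_of", "Agent"),
    ("report_17", "instance_of", "Document"),
    ("alice", "author_of", "report_17"),
}


def query(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return {
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    }


def who_authored(triples, document):
    """Competency question: 'Which agents authored a given document?'"""
    authors = {s for (s, _, _) in query(triples, predicate="author_of", obj=document)}
    agents = {s for (s, _, _) in query(triples, predicate="instance_of", obj="Agent")}
    return authors & agents


def check_competency(triples):
    """Automated check: does the graph give the answer the domain experts expect?"""
    return {"who authored report_17?": who_authored(triples, "report_17") == {"alice"}}


print(check_competency(TRIPLES))
```

If a later change to the ontology drops the `author_of` relation or the `Agent` class, the check fails, which is the kind of regression signal competency questions are meant to provide.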
Consensus-Building Exercises
• Consensus-building exercises are where ontologists and domain experts
work towards an agreed understanding of ontology terms, definitions, etc.
• Importantly, whatever agreement is reached is meant to be added to the ontology; domain experts can continue speaking as they need
Consensus-Building Labels
• Domain experts need to use labels in specific ways for specific purposes
• We distinguish labels for ontologists from labels for users
• If domain experts find value in calling something X, let them
• X may have a different name in the ontology, but then again
“=SUM(A1; A3)” is not how I talk about addition in natural language...
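The split between labels for ontologists and labels for users can be pictured as several names resolving to one identifier. This is a sketch only; the identifier, the labels, and the `resolve` helper are hypothetical and not drawn from any published ontology.

```python
# Hypothetical term record: one identifier, one label for ontologists, several for users.
TERMS = {
    "http://example.org/ont/0001": {
        "ontology_label": "act of information evaluation",
        "user_labels": {"intel assessment", "assessment"},
    },
}


def resolve(label, terms=TERMS):
    """Map any accepted label (ontologist- or user-facing) to the same identifier."""
    wanted = label.strip().lower()
    for iri, record in terms.items():
        if wanted == record["ontology_label"] or wanted in record["user_labels"]:
            return iri
    return None


print(resolve("Assessment") == resolve("act of information evaluation"))
```

Domain experts keep saying "assessment"; the ontology keeps its own label; both point at the same term.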
Consensus-Building Definitions
• Intelligence =def The product resulting from the collection, processing,
integration, evaluation, analysis, and interpretation of available
information concerning foreign nations, hostile or
potentially hostile forces or elements, or areas of
actual or potential operations, as well as the activities
that result in the product and the organizations
engaged in such activities.
• Counterexamples:
• Redundant if ‘available’ means ‘accessible by investigation’
• Overlooks acquisition if ‘available’ means ‘pre-existing’
• Police gather intelligence on domestic criminal activity
• Companies gather intelligence prior to merger
• Apparently, you cannot gather intelligence on an individual
Myth 3: Summary
• Ontology engineering is not a normative discipline; it is descriptive
• Ontologists should model the way you talk, not correct it
• Unless you have made some significant error, in which case you would
probably be happy for the assist
Myth 4: Ontology solutions require using one master ontology
N-Squared Problem
• During the early days of the web, datasets were coded in distinct syntaxes
without an eye towards interoperability with other datasets
• Connecting disparate datasets requires two-way mappings:
• 2 datasets – 2 mappings
• 3 datasets – 6 mappings
• 4 datasets – 12 mappings
....
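The counts on this slide follow n(n - 1): each of n datasets needs a mapping to each of the other n - 1. The sketch below compares that with a hub arrangement, assuming (this is an assumption, not a figure from the slides) that each dataset needs one mapping to the hub and one back, for 2n in total.

```python
def pairwise_mappings(n):
    """Directed mappings needed to connect n datasets directly to one another."""
    return n * (n - 1)


def hub_mappings(n):
    """Directed mappings needed when each dataset maps to and from one shared hub.

    Assumes one mapping in each direction per dataset.
    """
    return 2 * n


for n in (2, 3, 4, 10, 100):
    print(f"{n} datasets: pairwise={pairwise_mappings(n)}, via hub={hub_mappings(n)}")
```

For small n the hub is no cheaper, but the pairwise count grows quadratically while the hub count grows linearly.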
Semantic Web
• The advent of the “Semantic Web” ushered in a series of strategies for
promoting interoperability, partly aimed at addressing this problem
• The Resource Description Framework was leveraged as a way to mitigate the n-squared problem, by creating hubs of interoperability
Rejecting the Mono-Ontology Myth
• It was recognized early that a single language used by everyone was unwise
• But progress towards interoperability could be made by mapping disparate
lexicons into as few standard languages as possible
• It is the ontologist’s job to create a path from such lexicons into ontologies
Another Python Analogy
• Python is a popular language in no small part because it can be extended
with numerous libraries and bindings created to/from other languages
• Want to use a C++ library in Python? Use a Python-C++ binding.
Haskell? Use a Python-Haskell binding.
• Want to use the Credential Transparency Language in BFO? Hold
consensus-building exercises and let the ontologists work
Myth 4: Summary
• It is sometimes suggested that the ultimate goal of ontology engineering
is that everyone use a single monolithic ontology
• Such a goal seems impossible at worst and unwise at best
• Connecting to as few ontology hubs as possible is a tractable,
preferable, strategy in line with semantic web goals
Myth 5: We do not need ontologies because we have X
MadLibs Ontology Edition™
• Fill in the blank:
We don’t need ontologies because we have ___
Taxonomies
Concept Maps
Data Models
Code
First
• Ontologies are not
• Taxonomies – hierarchical classifications of types
• Concept maps – visual representations of class relationships
• Data models – abstract syntaxes to support information modeling
• Code – You know what code is
Second
• Ontologies are not replaceable by
• Taxonomies – But do contain them as proper parts
• Concept maps – As they are not machine-interpretable
• Data models – But are built on such things, e.g. RDF
• Code – Implicit business logic ≠ explicit semantics
MadLibs Ontology Edition™
• Fill in the blank:
We don’t need ontologies because we have ___
Machine Learning
To a Hammer, Everything is a Nail
• Many developers attempt to automate ontology creation without recognizing that interoperability requires a different toolkit
• You are not going to machine learn your way to semantic interoperability
A Simple Thought Experiment
• Knowledge representation and machine learning are not at odds
• Machine learning strategies are fundamentally statistical, but imagine you
could train a machine learning algorithm for 100% accuracy
• The result would be a knowledge graph wrapped in code
Knowledge Graph Wrapped in Code
• Introducing ontologies and knowledge graphs into machine learning training pipelines can rule out noise
• This allows algorithms to be trained to the same level of accuracy with less data and at a significantly faster rate
A (Simple?) Example
• An open source intelligence agent wants to identify the 10 highest-impact
open source documents relevant to a given OSINT problem
• This is like looking for 10 needles in a million haystacks…
• Ontologies – alongside ML and NLP – can help to automate identification of the relevant high-impact documents
Ontologies in a Pipeline
1. Ontologists work with subject-matter experts and research a domain to produce an ontology
2. High-impact texts for the domain are identified and annotated with ontology terms
3. The labeled texts are used to train a machine learning algorithm, which builds a model designed to identify other high-impact texts
4. The machine learning model is then used by natural language processing programs to automate text annotation
5. Results are evaluated for accuracy and errors are reported; this often creates a feedback loop that informs further ontology design
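The pipeline described above can be caricatured in a few functions. This is a toy sketch, not the pipeline used in any actual project: the `ex:` terms, the label sets, the example texts, and the crude count-based "training" are all hypothetical stand-ins for a real ontology and a real learner.

```python
from collections import Counter

# Hypothetical output of the ontology-building stage: terms and the words that signal them.
ONTOLOGY = {
    "ex:Vessel": {"ship", "vessel", "tanker"},
    "ex:Port": {"port", "harbor"},
    "ex:Weather": {"rain", "storm"},
}


def annotate(text, ontology=ONTOLOGY):
    """Annotation stage: tag a text with the ontology terms whose labels occur in it."""
    words = set(text.lower().split())
    return {term for term, labels in ontology.items() if words & labels}


def train(labeled_texts):
    """Training stage: learn one weight per term from expert-labeled texts.

    A term gains a point for each high-impact text it appears in and loses one
    for each other text; a deliberately crude stand-in for a real learner.
    """
    weights = Counter()
    for text, high_impact in labeled_texts:
        for term in annotate(text):
            weights[term] += 1 if high_impact else -1
    return weights


def rank(texts, weights, k):
    """Automation stage: score unseen texts by learned weights, return the top k."""
    def score(text):
        return sum(weights[term] for term in annotate(text))
    return sorted(texts, key=score, reverse=True)[:k]


def evaluate(ranked, expected):
    """Evaluation stage: fraction of expected high-impact texts that were recovered."""
    return len(set(ranked) & set(expected)) / len(expected)


training = [
    ("tanker docked at the port", True),
    ("vessel left the harbor", True),
    ("storm and rain all week", False),
]
unseen = ["a ship entered the port", "rain expected tomorrow", "nothing to report"]

weights = train(training)
top = rank(unseen, weights, k=1)
print(top, evaluate(top, ["a ship entered the port"]))
```

A low evaluation score would point either at the learner or at missing ontology terms and labels, which is the feedback loop into ontology design.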
MadLibs Ontology Edition™
• Fill in the blank:
We don’t need ontologies because we have ___
Large Language Models
Present Summer
• The current AI summer is one of big data, cheaper & faster computing, sophisticated machine learning and generative techniques
• But also hype and overpromising that should strike anyone who knows the history of AI as familiar
• Ontologies and knowledge graphs are considered crucial resources for advances in generative AI
• They are presently at the heart of numerous strategies for addressing LLM hallucinations and mitigating bias
• Advances in generative AI can help address concerns raised in modern applications of knowledge graphs
• Advances in knowledge representation can help temper expectations and improve the quality of generative AI
Knowledge Representation and LLMs*
*From Unifying Large Language Models and Knowledge Graphs: A Roadmap
Enhancing LLMs
• Integrate knowledge graphs into the training or prompt inputs
• Interpret prompt outputs using knowledge graphs
Enhancing Knowledge Graphs
• Knowledge graph construction, e.g. coreference resolution
• Knowledge graph Q/A, e.g. entity and relation extraction
Enhancing Both
• Constructing knowledge graphs used for LLM-enhanced Q/A responses
• Completing a knowledge graph with LLM information that reveals further gaps, the filling of which provides better LLM training
Detour on RAG
• Retrieval augmented generation involves retrieving documents possibly
relevant to a question, using keyword search, then asking the model to
generate answers with additional context
• Effective when there is keyword overlap between retrieved documents
and the question
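A minimal sketch of the keyword-based retrieval step as described on this slide, assuming a naive whitespace tokenizer and a hand-picked stop-word list. The documents, the function names, and the prompt layout are hypothetical, and the call to a language model is left out entirely.

```python
STOP_WORDS = {"the", "a", "an", "is", "of", "to", "what", "in", "and"}


def keywords(text):
    """Lowercased word set with punctuation trimmed and stop words removed."""
    return {w.strip(".,?!").lower() for w in text.split()} - STOP_WORDS


def retrieve(question, documents, k=2):
    """Rank documents by how many keywords they share with the question."""
    q = keywords(question)
    scored = [(len(q & keywords(doc)), doc) for doc in documents]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def build_prompt(question, documents, k=2):
    """Prepend the retrieved context to the question; the model call is omitted."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


DOCS = [
    "Basic Formal Ontology is a top-level ontology.",
    "The Gene Ontology describes biological systems.",
    "Pandas is a Python library for dataframes.",
]

print(build_prompt("What is Basic Formal Ontology?", DOCS))
print(retrieve("How do I merge two tables?", DOCS))
```

The second query retrieves nothing even though the third document is arguably relevant, since "merge two tables" shares no keywords with it.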
Cold Water on RAG
NONSENSE ON STILTS
Detour on RAG
• Much less effective with code and math prompts since specifying
keywords that overlap with retrieved documents is challenging
• Models often get “distracted” by irrelevant content or ignore retrieved
documents, relying on parametric memory
• History doesn’t repeat, but it often rhymes
Myth 5: Summary
• ML and LLMs are not designed to address the interoperability challenges
where ontologies and knowledge graph solutions shine
• It is time AI researchers leverage a fuller toolkit
• Knowledge representation is a needed supplement to these more commonly
deployed technologies
• Let us not continue doing the same thing expecting different results...
Myth 6: The DOD-IC Ontology Foundry will never work
Gene Ontology - 1998
The mission of the GO Consortium is to develop a comprehensive model of biological systems, ranging from the molecular to the organism level, across species in the tree of life.
Proliferation of Ontologies
• When developed correctly, ontologies provide common vocabularies
with common semantics across multiple domains
• The success of the Gene Ontology led to a proliferation of ontologies
developed by subject-matter experts, computer scientists, and logicians
• Almost none of which were developed in coordination
• The result was massive incompatibility of terms and relations,
confusion, in-fighting, name-calling, etc.
Open Biological and Biomedical Ontologies
• In 2005, a consortium of biologists decided to create standards for ontology development
• Such as requiring ontologies be open-source, have documentation, include definitions for vocabulary terms, and...
• Align to a top-level ontology which provides a starting point for all ontology development...
Incomplete History of Foundry Efforts
• 2005: OBO Foundry; Basic Formal Ontology
• 2012: Common Core Ontologies
• 2018: Industrial Ontologies Foundry
• 2020: Basic Formal Ontology, ISO 21838:2
• 2024: DoD-IC Foundry
Industrial Ontologies Foundry
• In 2018, stakeholders from manufacturing and service industries – encountering the rise of the Internet of Things and accompanying interoperability challenges – followed the OBO Foundry strategy
• The Industrial Ontologies Foundry also adopts Basic Formal Ontology as its core
Industrial Ontologies Foundry
• Over 170 industry partners, many of whom have competing economic
interests, agree interoperability problems can only be addressed collectively
2024 IOF Summit
Myth 6: Summary
• If academics across various institutions and disciplines can create
and sustain a foundry...
• If individuals and organizations across various for-profit and non-profit
companies can create and sustain a foundry...
• Surely the DOD-IC can as well
Grace and Growth
• In this talk, I’ve targeted both ontologists and non-ontologists, those in
favor and those opposed
• My conclusion is that ontology engineering is a feasible path forward
• With a corollary that if you are, in fact, opposed to pursuing this path,
you’ll need more than myths

More Related Content

Similar to Six Myths about Ontologies: The Basics of Formal Ontology

Graph DB + Bioinformatics: Bio4j, recent applications and future directions
Graph DB + Bioinformatics:  Bio4j, recent applications and future directions Graph DB + Bioinformatics:  Bio4j, recent applications and future directions
Graph DB + Bioinformatics: Bio4j, recent applications and future directions Pablo Pareja Tobes
 
Open interoperability standards, tools and services at EMBL-EBI
Open interoperability standards, tools and services at EMBL-EBIOpen interoperability standards, tools and services at EMBL-EBI
Open interoperability standards, tools and services at EMBL-EBIPistoia Alliance
 
ontology meets big data: immutability
ontology meets big data:immutabilityontology meets big data:immutability
ontology meets big data: immutabilityChris Partridge
 
Connecting Intelligent Content with Micropublishing and Beyond
Connecting Intelligent Content with Micropublishing and BeyondConnecting Intelligent Content with Micropublishing and Beyond
Connecting Intelligent Content with Micropublishing and BeyondDon Day
 
Objective Fiction, i-semantics keynote
Objective Fiction, i-semantics keynoteObjective Fiction, i-semantics keynote
Objective Fiction, i-semantics keynoteAldo Gangemi
 
Staying Broad and Shallow: Learning on the Fly (Eric Monson)
Staying Broad and Shallow: Learning on the Fly (Eric Monson)Staying Broad and Shallow: Learning on the Fly (Eric Monson)
Staying Broad and Shallow: Learning on the Fly (Eric Monson)DukeDigitalScholarship
 
Taxonomy, ontology, folksonomies & SKOS.
Taxonomy, ontology, folksonomies & SKOS.Taxonomy, ontology, folksonomies & SKOS.
Taxonomy, ontology, folksonomies & SKOS.Janet Leu
 
NYAI #27: Cognitive Architecture & Natural Language Processing w/ Dr. Catheri...
NYAI #27: Cognitive Architecture & Natural Language Processing w/ Dr. Catheri...NYAI #27: Cognitive Architecture & Natural Language Processing w/ Dr. Catheri...
NYAI #27: Cognitive Architecture & Natural Language Processing w/ Dr. Catheri...Maryam Farooq
 
Semantic Web - Ontologies
Semantic Web - OntologiesSemantic Web - Ontologies
Semantic Web - OntologiesSerge Linckels
 
Wageningen phenotype meeting
Wageningen phenotype meetingWageningen phenotype meeting
Wageningen phenotype meetingthehyve
 
Iot ontologies state of art$$$
Iot ontologies state of art$$$Iot ontologies state of art$$$
Iot ontologies state of art$$$Sof Ouni
 
What are some key topics and concepts that candidates are commonly expected t...
What are some key topics and concepts that candidates are commonly expected t...What are some key topics and concepts that candidates are commonly expected t...
What are some key topics and concepts that candidates are commonly expected t...DivyanshWsCube
 
Pal gov.tutorial4.session8 2.stepwisemethodologies
Pal gov.tutorial4.session8 2.stepwisemethodologiesPal gov.tutorial4.session8 2.stepwisemethodologies
Pal gov.tutorial4.session8 2.stepwisemethodologiesMustafa Jarrar
 
ARTIFICIAL INTELLIGENCE---UNIT 4.pptx
ARTIFICIAL INTELLIGENCE---UNIT 4.pptxARTIFICIAL INTELLIGENCE---UNIT 4.pptx
ARTIFICIAL INTELLIGENCE---UNIT 4.pptxRuchitaMaaran
 
Object And Oriented Programing ( Oop ) Languages
Object And Oriented Programing ( Oop ) LanguagesObject And Oriented Programing ( Oop ) Languages
Object And Oriented Programing ( Oop ) LanguagesJessica Deakin
 
Ontological realism as a strategy for integrating ontologies
Ontological realism as a strategy for integrating ontologiesOntological realism as a strategy for integrating ontologies
Ontological realism as a strategy for integrating ontologiesBarry Smith
 
Dialogare con agenti artificiali
Dialogare con agenti artificiali  Dialogare con agenti artificiali
Dialogare con agenti artificiali Agnese Augello
 
OpenKollab Project Matching
OpenKollab Project MatchingOpenKollab Project Matching
OpenKollab Project MatchingSuresh Fernando
 
20111120 warsaw learning curve by b hyland notes
20111120 warsaw   learning curve by b hyland notes20111120 warsaw   learning curve by b hyland notes
20111120 warsaw learning curve by b hyland notesBernadette Hyland-Wood
 

Similar to Six Myths about Ontologies: The Basics of Formal Ontology (20)

Graph DB + Bioinformatics: Bio4j, recent applications and future directions
Graph DB + Bioinformatics:  Bio4j, recent applications and future directions Graph DB + Bioinformatics:  Bio4j, recent applications and future directions
Graph DB + Bioinformatics: Bio4j, recent applications and future directions
 
Open interoperability standards, tools and services at EMBL-EBI
Open interoperability standards, tools and services at EMBL-EBIOpen interoperability standards, tools and services at EMBL-EBI
Open interoperability standards, tools and services at EMBL-EBI
 
ontology meets big data: immutability
ontology meets big data:immutabilityontology meets big data:immutability
ontology meets big data: immutability
 
Connecting Intelligent Content with Micropublishing and Beyond
Connecting Intelligent Content with Micropublishing and BeyondConnecting Intelligent Content with Micropublishing and Beyond
Connecting Intelligent Content with Micropublishing and Beyond
 
Objective Fiction, i-semantics keynote
Objective Fiction, i-semantics keynoteObjective Fiction, i-semantics keynote
Objective Fiction, i-semantics keynote
 
Staying Broad and Shallow: Learning on the Fly (Eric Monson)


Six Myths about Ontologies: The Basics of Formal Ontology

  • 1. Six Myths about Ontologies John Beverley, PhD Assistant Professor, University at Buffalo Co-Director, National Center for Ontological Research Affiliate Faculty, Institute of Artificial Intelligence and Data Science
  • 2. Outline • Preliminary Remarks • Elephants in the Room
  • 4. Information Silos • An information silo is an information repository, e.g. management system, database, the content of which cannot be integrated with that of other information repositories using computing strategies
  • 5. Information Silos • An information silo is an information repository, e.g. management system, database, the content of which cannot be integrated with that of other information repositories using computing strategies • Information silos may manifest for a variety of reasons: • Ignorance – Do not realize a given information repository exists • Inaccessible – Do not have the appropriate permissions to access • Infeasible – Do not have the appropriate technology to access • Insane – Do not care about integrating with other repositories
  • 6. Cost of Silos • A 2020 report by NIST estimated that the lack of interoperability across industrial datasets costs companies between 21 and 43 billion • McKinsey estimates mid-size companies spend 20-50 million annually due to silos
  • 10. Promise of Ontology Engineering • Ontologies are formally well-defined machine-interpretable controlled vocabularies designed to represent entities and logical relationships among them • Ontologies make explicit the implicit meanings buried in datasets, by using basic principles of formal logic • Ontologies provide a semantic layer to connect information silos
  • 11. Promise of Ontology Engineering • Exhibiting standardized syntax/semantics, addressing VARIETY • Represented in formal languages facilitating consistency, addressing VERACITY • Queryable for information and inferences, addressing VALUE • Providing a lingua franca across information silos, addressing VOLUME
  • 12. *A knowledge graph is just an ontology that includes data
  • 13. Outline • Preliminary Remarks • Elephants in the Room
  • 14. 6 Myths about Ontologies 1. Ontology development is easy 2. Ontology development is too hard 3. Ontologies require changing the way we speak 4. Ontology solutions require using one master ontology 5. We do not need ontologies because we have X 6. The DOD-IC Ontology Foundry will never work
  • 16. Knowledge Representation is Easy! • This myth exhibits a grain of truth and distortion of fact • Constructing an ontology can be easy...just write some Python to read a file and generate classes/relations from column headers
  • 19. Knowledge Representation is Easy! • This myth exhibits a grain of truth and distortion of fact • Constructing an ontology can be easy...just write some Python to read a file and generate classes/relations from column headers • Constructing an ontology according to a standard is more challenging • But that is how we avoid information silos
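
The "easy" route described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not a method from the talk: the function name `classes_from_headers` and both sample exports are hypothetical. It shows why header-derived classes alone do not avoid silos: two files about the same people yield class names with nothing in common.

```python
import csv
import io


def classes_from_headers(csv_text):
    """Turn each column header into a naive CamelCase class name."""
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader)  # first row holds the column headers
    classes = []
    for header in headers:
        words = header.replace("_", " ").split()
        classes.append("".join(word.capitalize() for word in words))
    return classes


# Two hypothetical exports describing the same kind of thing.
hr_export = "employee_id,full name,dept\n1,Ada Lovelace,R&D\n"
payroll_export = "staff_no,name,division\n1,Ada Lovelace,R&D\n"

hr_classes = classes_from_headers(hr_export)
payroll_classes = classes_from_headers(payroll_export)

# Class names the two generated "ontologies" have in common.
shared = set(hr_classes) & set(payroll_classes)

print(hr_classes)
print(payroll_classes)
print(shared)
```

Each file gets its own vocabulary, so the generated classes reproduce the silo they were meant to remove. Aligning both to a shared standard is the harder step the slide refers to.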
  • 20. Basic Formal Ontology • BFO is such a standard, used by over 600 open-source groups, the first ISO/IEC top-level ontology standard, and a “baseline standard” for DOD-IC ontology development
  • 21. BFO Ecosystem 600+ Projects
  • 22. Basic Formal Ontology • Aligning to BFO requires that one identify where the content of interest falls within the BFO hierarchy, a task that often requires careful analysis
  • 23. Myth 1: Summary • Just building ontologies that make explicit the semantics implicit in data is not enough to avoid information silos • If you do not build to align with a standard, then you will likely recreate the interoperability issues ontologies are designed to address • If you build to align with a standard, such as BFO, then the task is not trivial
  • 24. 6 Myths about Ontologies 1. Ontology development is easy 2. Ontology development is too hard 3. Ontologies require changing the way we speak 4. Ontology solutions require using one master ontology 5. We do not need ontologies because we have X 6. The DOD-IC Ontology Foundry will never work
  • 25. Challenge of Ontology Development • This myth exhibits a grain of truth and distortion of fact • Ontology development is challenging, but not too challenging • The BFO community has developed strategies for streamlining the development process • Regardless, the benefits of ontologies are worth the effort
  • 26. Hub & Spoke Strategy • Ontologies extending from BFO are modules in a larger hub & spoke structure • Ontologies are extended by downward population: new classes have parent classes in a hierarchy ultimately leading to a BFO class
  • 27. Hub & Spoke Strategy • Provides guardrails for promoting alignment between ontologies representing nearby domains • Progress towards interoperability is ensured upfront, since spokes share semantics with the same hub
  • 29. Python Analogy • BFO is analogous to the Python programming language; extensions of BFO, such as CCO, are analogous to Python libraries • You could create code that allows you to interact with, say, dataframes, or you could instead start with Python and import a library like Pandas • You could create ontology elements that allow you to model artifacts and processes, or you could instead start with BFO and import a library like CCO
  • 33. Definition Construction Strategy • Material Entity (Elucidation) – An independent continuant that has some portion of matter as part • Agent (Definition) – A material entity that is capable of performing intentional acts • Definitions of ontology elements are created following a recipe, to ensure the hub semantics are preserved by spokes, and minimize human error
  • 34. Definition Construction Strategy • Material Entity (Elucidation) – An independent continuant that has some portion of matter as part • Agent (Definition) – A material entity that is capable of performing intentional acts • Downward population leverages the definition scheme “A is a B that Cs”, where B is the parent class under which A falls
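
The "A is a B that Cs" recipe and downward population can be illustrated with a small Python sketch. The `OntologyClass` type and the `lineage` and `extends_hub` helpers are hypothetical names chosen for this example, and the hierarchy is cut off at "independent continuant" for brevity, so this is an illustration of the scheme rather than an encoding of BFO itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OntologyClass:
    label: str
    parent: Optional["OntologyClass"] = None
    differentia: str = ""  # the "that Cs" part of the definition

    def definition(self):
        """Render the class following the scheme 'A is a B that Cs'."""
        if self.parent is None:
            return f"{self.label} (top of this sketch; no parent given)"
        return f"{self.label} =def {self.parent.label} that {self.differentia}"


def lineage(cls):
    """Walk parent links upward and return the labels encountered."""
    chain = []
    node = cls
    while node is not None:
        chain.append(node.label)
        node = node.parent
    return chain


def extends_hub(cls, hub_label):
    """True if the class ultimately falls under the given hub class."""
    return hub_label in lineage(cls)


# Hub class (top of this sketch) and two classes populated downward from it.
independent_continuant = OntologyClass("independent continuant")
material_entity = OntologyClass(
    "material entity",
    parent=independent_continuant,
    differentia="has some portion of matter as part",
)
agent = OntologyClass(
    "agent",
    parent=material_entity,
    differentia="is capable of performing intentional acts",
)

print(agent.definition())
print(lineage(agent))
print(extends_hub(agent, "independent continuant"))
```

Because every new class must name a parent, its definition inherits the parent's semantics, and a check like `extends_hub` can confirm that a spoke class still leads back to the hub.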
  • 38. Interoperability Guardrails • The hub & spoke strategy provides guardrails for promoting alignment between ontologies representing nearby and overlapping domains • By following the recipe, progress towards interoperability is ensured upfront, since elements inherited from the hub ontologies overlap semantically • By not following such a strategy you get...say it with me now...
  • 41. Scope Creep • The hub & spoke strategy significantly cuts down on scope creep, which arises when an ontology is created as an information silo, but over time grows beyond its initial scope • If the ontology does not extend from a higher-level ontology, it will likely not be compatible with other ontologies • Developers will then need to recreate existing work because it is not interoperable with their ontology...
  • 43. Myth 2: Summary • Ontology development is not easy • But it is not too challenging, especially when one follows established guidelines for ontology engineering • The benefits of high-quality ontologies outweigh costs by addressing interoperability challenges, promoting reuse, and avoiding scope creep
  • 44. 6 Myths about Ontologies 1. Ontology development is easy 2. Ontology development is too hard 3. Ontologies require changing the way we speak 4. Ontology solutions require using one master ontology 5. We do not need ontologies because we have X 6. The DOD-IC Ontology Foundry will never work
  • 45. What do you mean by “talk”?
  • 46. What do you mean by “talk”? I’m joking, calm down
  • 47. Human Ontologists • Again, a grain of truth mixed with a distortion of fact • One major goal for an ontologist is translating the way domain experts talk into an ontology representation • But some ontologists do attempt to change the way domain experts talk, inappropriately
  • 48. Pop Quiz • Suppose you approach a web developer to build a website that shows videos of your travels on the landing page, which will involve JavaScript (JS) • What should the web developer do: A. Explain in detailed JS how they will create the landing page B. Teach you how to read or write JS C. Identify your needs then write the JS themselves
  • 50. Ontology Analogy • Suppose you approach an ontologist to represent your domain expertise, which will involve using, say, BFO • What should the ontologist do: A. Explain in BFO terms how they will represent your domain B. Teach you how to read or use BFO C. Identify your needs then create the ontology themselves
  • 53. *Slide courtesy of CUBRC reflecting the Joint Doctrine Ontology project; contact Alex Cox for more information alexander.cox@cubrc.org
  • 55. Consensus-Building Exercises • Let us not blame the field for human error • Especially since more often ontologists merely seem like they are attempting to change the way domain experts talk • Ontologists rely on domain experts when modeling, using competency questions and consensus-building exercises
  • 56. Competency Questions • Competency questions are – roughly – questions that domain experts would like answers to with respect to a given domain • Competency questions are used to guide ontology development and generate automated checks to ensure answers are sufficient
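
One way to read "generate automated checks" is to express each competency question as a query over the ontology's data and flag questions that return no answer. The sketch below assumes a toy triple store held in a Python set; the triples, the `match` helper, and the sample question are all hypothetical and stand in for whatever query language a real project would use.

```python
# Toy knowledge graph: (subject, predicate, object) triples.
triples = {
    ("agent_1", "is_a", "Agent"),
    ("act_1", "is_a", "IntentionalAct"),
    ("agent_1", "performs", "act_1"),
}


def match(pattern, triples):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        triple
        for triple in triples
        if all(p is None or p == value for p, value in zip(pattern, triple))
    ]


def agents_performing_acts(triples):
    """Competency question: which agents perform some act?"""
    agents = {s for s, _, _ in match((None, "is_a", "Agent"), triples)}
    performers = {s for s, _, _ in match((None, "performs", None), triples)}
    return sorted(agents & performers)


# Each competency question is paired with the query that should answer it.
competency_checks = {
    "Which agents perform some act?": agents_performing_acts,
}

results = {question: query(triples) for question, query in competency_checks.items()}
unanswered = [question for question, answers in results.items() if not answers]

print(results)
print(unanswered)
```

A non-empty `unanswered` list would signal that the ontology or its data cannot yet answer something the domain experts asked for.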
  • 57. Consensus-Building Exercises • Consensus-building exercises are where ontologists and domain experts work towards an agreed understanding of ontology terms, definitions, etc. • Importantly, whatever agreement is reached is meant to be added to the ontology; domain experts can continue speaking as they need
  • 58. Consensus-Building Labels • Domain experts need to use labels in specific ways for specific purposes • We distinguish labels for ontologists from labels for users • If domain experts find value in calling something X, let them • X may have a different name in the ontology, but then again “=SUM(A1; A3)” is not how I talk about addition in natural language...
  • 59. Consensus-Building Definitions • Intelligence =def The product resulting from the collection, processing, integration, evaluation, analysis, and interpretation of available information concerning foreign nations, hostile or potentially hostile forces or elements, or areas of actual or potential operations, as well as the activities that result in the product and the organizations engaged in such activities.
  • 60. Consensus-Building Definitions • Intelligence =def The product resulting from the collection, processing, integration, evaluation, analysis, and interpretation of available information concerning foreign nations, hostile or potentially hostile forces or elements, or areas of actual or potential operations, as well as the activities that result in the product and the organizations engaged in such activities. • Counterexamples: • Redundant if ‘available’ means ‘accessible by investigation’ • Overlooks acquisition if ‘available’ means ‘pre-existing’
  • 61. Consensus-Building Definitions • Intelligence =def The product resulting from the collection, processing, integration, evaluation, analysis, and interpretation of available information concerning foreign nations, hostile or potentially hostile forces or elements, or areas of actual or potential operations, as well as the activities that result in the product and the organizations engaged in such activities. • Counterexamples: • Police gather intelligence on domestic criminal activity • Companies gather intelligence prior to merger
  • 62. Consensus-Building Definitions • Intelligence =def The product resulting from the collection, processing, integration, evaluation, analysis, and interpretation of available information concerning foreign nations, hostile or potentially hostile forces or elements, or areas of actual or potential operations, as well as the activities that result in the product and the organizations engaged in such activities. • Counterexamples: • Apparently, you cannot gather intelligence on an individual
  • 63. Myth 3: Summary • Ontology engineering is not a normative discipline; it is descriptive • Ontologists should model the way you talk, not correct it • Unless you have made some significant error, in which case you would probably be happy for the assist
  • 64. 6 Myths about Ontologies 1. Ontology development is easy 2. Ontology development is too hard 3. Ontologies require changing the way we speak 4. Ontology solutions require using one master ontology 5. We do not need ontologies because we have X 6. The DOD-IC Ontology Foundry will never work
  • 65. N-Squared Problem • During the early days of the web, datasets were coded in distinct syntax without an eye towards interoperability with other datasets • Connecting disparate datasets requires two-way mappings: • 2 datasets – 2 mappings • 3 datasets – 6 mappings • 4 datasets – 12 mappings ....
  • 66. Semantic Web • The advent of the “Semantic Web” ushered in a series of strategies for promoting interoperability, partly aimed at addressing this problem • The Resource Description Framework was leveraged as a way to mitigate the n-squared problem, by creating hubs of interoperability
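
The mapping counts on the N-Squared slide follow n(n-1): every dataset needs a mapping to every other dataset, in both directions. As a rough comparison, the sketch below assumes a hub arrangement where each dataset needs only one mapping to and one from a shared hub, giving 2n. The function names are illustrative, and the 2n figure is an assumption about how a hub is wired, not a number from the slides.

```python
def pairwise_mappings(n):
    """Two-way mappings between every pair of n datasets: n * (n - 1)."""
    return n * (n - 1)


def hub_mappings(n):
    """Assumed hub layout: one mapping to and one from the hub per dataset."""
    return 2 * n


for n in (2, 3, 4, 10, 100):
    print(n, pairwise_mappings(n), hub_mappings(n))
```

For 2, 3, and 4 datasets the pairwise counts are 2, 6, and 12, matching the slide. Under the stated assumption the hub breaks even at 3 datasets and grows linearly after that, while the pairwise count grows quadratically.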
  • 67. Rejecting the Mono-Ontology Myth • It was recognized early that a single language used by everyone was unwise • But progress towards interoperability could be made by mapping disparate lexicons into as few standard languages as possible • It is the ontologist’s job to create a path from such lexicons into ontologies
  • 68. Another Python Analogy • Python is a popular language in no small part because it can be extended with numerous libraries and bindings created to/from other languages • Want to use a C++ library in Python? Use Python-C++ binding. Haskell? Python-Haskell binding. • Want to use the Credential Transparency Language in BFO? Hold consensus-building exercises and let the ontologists work
  • 69. Myth 4: Summary • It is sometimes suggested that the ultimate goal of ontology engineering is that everyone use a single monolithic ontology • Such a goal seems impossible at worst and unwise at best • Connecting to as few ontology hubs as possible is a tractable, preferable, strategy in line with semantic web goals
  • 70. 6 Myths about Ontologies 1. Ontology development is easy 2. Ontology development is too hard 3. Ontologies require changing the way we speak 4. Ontology solutions require using one master ontology 5. We do not need ontologies because we have X 6. The DOD-IC Ontology Foundry will never work
  • 71. MadLibs Ontology Edition™ • Fill in the blank: We don’t need ontologies because we have ___
  • 72. MadLibs Ontology Edition™ • Fill in the blank: We don’t need ontologies because we have ___ Taxonomies
  • 73. MadLibs Ontology Edition™ • Fill in the blank: We don’t need ontologies because we have ___ Taxonomies, Concept Maps
  • 74. MadLibs Ontology Edition™ • Fill in the blank: We don’t need ontologies because we have ___ Taxonomies, Concept Maps, Data Models
  • 75. MadLibs Ontology Edition™ • Fill in the blank: We don’t need ontologies because we have ___ Taxonomies, Concept Maps, Data Models, Code
  • 76. First • Ontologies are not • Taxonomies – hierarchical classifications of types • Concept maps – visual representations of class relationships • Data models – abstract syntaxes to support information modeling • Code – You know what code is
  • 77. Second • Ontologies are not replaceable by • Taxonomies – But do contain them as proper parts • Concept maps – As they are not machine-interpretable • Data models – But are built on such things, e.g. RDF • Code – Implicit business logic ≠ explicit semantics
  • 78. MadLibs Ontology Edition™ • Fill in the blank: We don’t need ontologies because we have ___ Taxonomies, Concept Maps, Data Models, Code, Machine Learning
  • 79. To a Hammer, Everything is a Nail • Many developers attempt to automate ontology creation without recognizing interoperability requires a different toolkit • You are not going to machine learn your way to semantic interoperability
  • 80. A Simple Thought Experiment • Knowledge representation and machine learning are not at odds • Machine learning strategies are fundamentally statistical, but imagine you could train a machine learning algorithm for 100% accuracy • The result would be a knowledge graph wrapped in code
  • 81. Knowledge Graph Wrapped in Code • Introducing ontologies and knowledge graphs into machine learning training pipelines can rule out noise • Using less data to train algorithms to the same level of accuracy at a significantly faster rate
  • 82. A (Simple?) Example • An open source intelligence agent wants to identify the 10 highest-impact open source documents relevant to a given OSINT problem • This is like looking for 10 needles in a million haystacks… • Ontologies – alongside ML and NLP – can help to automate identification of the relevant high-impact documents
  • 83. Ontologies in a Pipeline • Ontologists work with subject-matter experts and research a domain to produce an ontology
  • 84. Ontologies in a Pipeline • High-impact texts for the domain are identified and annotated with ontology terms
  • 85. Ontologies in a Pipeline • The labeled texts are used to train a machine learning algorithm, which builds a model designed to identify other high-impact texts
  • 86. Ontologies in a Pipeline • The machine learning model is then used by natural language processing programs to automate text annotation
  • 87. Ontologies in a Pipeline • Results are evaluated for accuracy and errors are reported; this often creates a feedback loop that informs further ontology design
  • 88. MadLibs Ontology Edition™ • Fill in the blank: We don’t need ontologies because we have ___ Taxonomies, Concept Maps, Data Models, Code, Machine Learning, Large-Language Models
  • 89. Present Summer • The current AI summer is one of big data, cheaper & faster computing, sophisticated machine learning and generative techniques • But also hype and overpromising that should strike anyone who knows the history of AI as familiar
  • 90. Present Summer • Ontologies and knowledge graphs are considered crucial resources for advances in generative AI • They are presently at the heart of numerous strategies for addressing LLM hallucinations and mitigating bias
  • 91. Present Summer • Advances in generative AI can help address concerns raised in modern applications of knowledge graphs • Advances in knowledge representation can help temper expectations and improve the quality of generative AI
  • 92. Knowledge Representation and LLMs* *From Unifying Large Language Models and Knowledge Graphs: A Roadmap
  • 93. Enhancing LLMs • Integrate knowledge graphs into the training or prompt inputs • Interpret prompt outputs using knowledge graphs
  • 95. Enhancing Knowledge Graphs • Knowledge graph construction, e.g. coreference resolution • Knowledge graph Q/A, e.g. entity and relation extraction
  • 96. Enhancing Both • Constructing knowledge graphs used for LLM-enhanced Q/A responses • Completing a knowledge graph with LLM information that reveals further gaps, filling of which provides better LLM training
  • 97. Detour on RAG • Retrieval augmented generation involves retrieving documents possibly relevant to a question, using keyword search, then asking the model to generate answers with additional context • Effective when there is keyword overlap between retrieved documents and the question
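
The retrieval step described on this slide can be sketched with plain keyword overlap. This is a simplified stand-in, assuming a toy scoring rule (count of shared lowercase tokens) and hypothetical documents; it does not call a model, and `build_prompt` only shows how the retrieved context would be attached to the question.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, k=2):
    """Return up to k documents sharing at least one keyword with the question."""
    question_tokens = tokenize(question)
    scored = [(len(question_tokens & tokenize(doc)), doc) for doc in documents]
    scored = [(score, doc) for score, doc in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable: ties keep input order
    return [doc for _, doc in scored[:k]]


def build_prompt(question, documents, k=2):
    """Attach the retrieved documents to the question as extra context."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


# Hypothetical document collection.
docs = [
    "BFO is a top-level ontology standard.",
    "Pandas is a Python library for dataframes.",
    "Compute the integral of x squared.",
]

print(retrieve("What is BFO?", docs, k=1))
print(retrieve("What is BFO?", docs, k=2))
print(retrieve("Differentiate f(t) = t^3", docs))
```

With `k=2` the Pandas sentence is also retrieved, because it shares the word "is" with the question, and the math question retrieves nothing at all since no keywords overlap. Both behaviors reflect the dependence on keyword overlap that the slide points out.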
  • 99. Cold Water on RAG NONSENSE ON STILTS
  • 100. Detour on RAG • Much less effective with code and math prompts since specifying keywords that overlap with retrieved documents is challenging • Models often get “distracted” by irrelevant content or ignore retrieved documents, relying on parametric memory • History doesn’t repeat, but it often rhymes
  • 101. Myth 5: Summary • ML and LLMs are not designed to address the interoperability challenges where ontologies and knowledge graph solutions shine • It is time AI researchers leverage a fuller toolkit • Knowledge representation is a needed supplement to these more commonly deployed technologies • Let us not continue doing the same thing expecting different results...
  • 102. 6 Myths about Ontologies 1. Ontology development is easy 2. Ontology development is too hard 3. Ontologies require changing the way we speak 4. Ontology solutions require using one master ontology 5. We do not need ontologies because we have X 6. The DOD-IC Ontology Foundry will never work
  • 103. Gene Ontology - 1998 • The mission of the GO Consortium is to develop a comprehensive, computational model of biological systems, ranging from the molecular to the organism level, across species in the tree of life.
  • 105. Proliferation of Ontologies • When developed correctly, ontologies provide common vocabularies with common semantics across multiple domains • The success of the Gene Ontology led to a proliferation of ontologies developed by subject-matter experts, computer scientists, and logicians • Almost none of which were developed in coordination • The result was massive incompatibility of terms and relations, confusion, in-fighting, name-calling, etc.
  • 106. Open Biological and Biomedical Ontologies • In 2005, a consortium of biologists decided to create standards for ontology development • Such as requiring ontologies be open-source, have documentation, include definitions for vocabulary terms, and... • Align to a top-level ontology which provides a starting point for all ontology development...
  • 107. Incomplete History of Foundry Efforts • 2005: OBO Foundry, Basic Formal Ontology • 2012: Common Core Ontologies • 2018: Industrial Ontologies Foundry • 2020: Basic Formal Ontology ISO 21838:2 • 2024: DoD-IC Foundry
  • 108. Industrial Ontologies Foundry • In 2018, stakeholders from manufacturing and service industries – encountering the rise of the Internet of Things and accompanying interoperability challenges – followed the OBO Foundry strategy • The Industrial Ontologies Foundry also adopts Basic Formal Ontology as its core
  • 109. Industrial Ontologies Foundry • Over 170 industry partners, many of whom have competing economic interests, agree interoperability problems can only be addressed collectively
  • 111. Myth 6: Summary • If academics across various institutions and disciplines can create and sustain a foundry... • If individuals and organizations across various for-profit and non-profit companies can create and sustain a foundry... • Surely the DOD-IC can as well
  • 112. Grace and Growth • In this talk, I’ve targeted both ontologists and non-ontologists, those in favor and those opposed • My conclusion is that ontology engineering is a feasible path forward • With a corollary that if you are, in fact, opposed to pursuing this path, you’ll need more than myths
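
The keyword-based retrieval step described on the "Detour on RAG" slides can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names (`tokenize`, `retrieve`, `build_prompt`), the toy documents, and the scoring rule (a raw count of shared word tokens) are not from the talk, and no language model is actually called. The sketch only shows why retrieval works when the question and documents share keywords, and why it returns nothing useful when they do not.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase a string and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by keyword overlap with the question.

    Only documents sharing at least one token with the question are kept,
    and at most k of them are returned, highest overlap first.
    """
    q_tokens = tokenize(question)
    scored = [(len(q_tokens & tokenize(doc)), doc) for doc in documents]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]

def build_prompt(question: str, context: list[str]) -> str:
    """Assemble the prompt that would be handed to a model (hypothetical format)."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {question}\nAnswer:"

# Toy corpus, invented for this example.
docs = [
    "Basic Formal Ontology is a top-level ontology.",
    "The Gene Ontology describes biological systems.",
    "Retrieval augmented generation adds retrieved context to a prompt.",
]
question = "What is Basic Formal Ontology?"

print(build_prompt(question, retrieve(question, docs)))
# A math-style prompt shares no keywords with the corpus, so nothing is retrieved.
print(retrieve("x = y + z", docs))
```

Note that the second-ranked document is retrieved only because it shares the single word "ontology" with the question, which is a small instance of the "distraction by irrelevant content" concern raised on the slides.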

Editor's Notes

  1. Promote the reuse of domain knowledge from other ontologies through a common formal language: OWL. Internationalized Resource Identifiers (IRIs) promote tracking of entities and cross-ontology references, and support interoperability. Promote human-human, human-computer, and computer-computer communication. Enable automated inference useful for deriving implicit information, as they effectively connect parts of taxonomies to other parts of taxonomies in machine-readable ways.
  3. This led to the development of the Knowledge Interchange Format, or KIF: If there were a single “interlingua”, i.e., a logic-based language capable of expressing everything that is expressible in any of the systems in a given environment, then, to exchange information between systems, instead of O(n²) pairwise translators, one would only need O(n).
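
The scaling argument in this note can be made concrete with a small count. The sketch below rests on two assumptions that the note does not state: translators are one-directional (one per ordered pair of systems), and each system needs exactly two translators to use an interlingua (one into it and one out of it). The function names are hypothetical.

```python
def pairwise_translators(n: int) -> int:
    """Translators needed if every system translates directly to every other.

    Assumes one translator per ordered pair of distinct systems: n * (n - 1).
    """
    return n * (n - 1)

def interlingua_translators(n: int) -> int:
    """Translators needed if every system goes through one shared interlingua.

    Assumes one translator into the interlingua and one out of it per system: 2 * n.
    """
    return 2 * n

for n in (3, 10, 50):
    print(n, pairwise_translators(n), interlingua_translators(n))
```

Under these assumptions the two approaches tie at three systems (six translators each), and the gap widens quickly after that: ten systems need 90 pairwise translators but only 20 through an interlingua.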