The K in “neuro-symbolic”
stands for “knowledge” *
Frank van Harmelen,
Learning & Reasoning Group
Vrije Universiteit Amsterdam
Creative Commons License
CC BY 3.0:
Allowed to copy, redistribute,
remix & transform,
but must attribute
1
* With thanks to Wouter Beek
Highlights
• Bluffer’s guide to KG embedding
• What is semantics?
• What is not a knowledge graph
• A very fishy picture
• A lesson from 1976
• A baby in a bath
• Some hope for the K in neuro-symbolic
2
The K in “neuro-symbolic”
stands for “knowledge” *
Warning: no generative AI, no ChatGPT, no Large Language Models….
Is neuro-symbolic really important for SemWeb?
[Bar chart: ML papers at ESWC 2023 — 41 papers, 50%]
4
Bluffer’s Guide to KG Embedding
https://docs.ampligraph.org/
[Diagram: knowledge graph with Drug1, Drug2, Protein1, Protein2, known “binds” edges, and a missing “basedIn ?” link]
Bluffer’s Guide to KG Embedding
• Link prediction
But also:
• Entity classification
• Relation extraction
• …
[Diagram: a prediction algorithm proposes the missing “binds” link]
9
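To make the bluffer’s guide concrete, here is a minimal sketch of embedding-based link prediction. It is illustrative only: the vectors are random rather than learned, the TransE-style scoring function is just one of many choices, and the entity names are taken from the diagram above. It is not AmpliGraph’s API.

```python
import numpy as np

rng = np.random.default_rng(0)
entities = ["Drug1", "Drug2", "Protein1", "Protein2"]
dim = 8

# In a real system these vectors are *learned* from the known triples;
# here they are random, just to show the mechanics.
E = {e: rng.normal(size=dim) for e in entities}
R = {"binds": rng.normal(size=dim)}

def score(h, r, t):
    """TransE-style plausibility: the closer h + r is to t, the better."""
    return -np.linalg.norm(E[h] + R[r] - E[t])

# Link prediction: rank candidate tails for <Drug2, binds, ?>.
candidates = ["Protein1", "Protein2"]
ranked = sorted(candidates, key=lambda t: score("Drug2", "binds", t), reverse=True)
print(ranked)  # best-scoring candidate first
```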
Bluffer’s Guide to KG Embedding:
From Symbols to Vectors and back again
Some things are better done geometrically (and not symbolically):
• Link prediction
• Node classification
• Relation extraction
10
Bluffer’s Guide to KG Embedding:
Different ways to compute the embeddings
TransE: ‖h + r − t‖
RotatE: relations as rotations, t ≈ h ∘ r
symmetry: ⟨h,r,t⟩ and ⟨t,r,h⟩
composition: father’s mother = mother’s father
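A small sketch of why the choice of embedding matters (my illustration, not from the slides): a RotatE rotation by π fits a symmetric relation exactly, while TransE can only make a relation symmetric by forcing r ≈ 0, which collapses h and t onto each other.

```python
import numpy as np

dim = 4
rng = np.random.default_rng(1)

# TransE scores with t ≈ h + r. Requiring score(h,r,t) = score(t,r,h)
# for all pairs forces r ≈ 0, i.e. h ≈ t.
def transe_score(h, r, t):
    return -np.linalg.norm(h + r - t)

# RotatE uses complex embeddings; a relation is an element-wise rotation
# (|r_i| = 1), with t ≈ h ∘ r. A rotation by pi (r_i = -1) is its own
# inverse, so <h,r,t> and <t,r,h> can both fit perfectly.
def rotate_score(h, r, t):
    return -np.linalg.norm(h * r - t)

h = rng.normal(size=dim) + 1j * rng.normal(size=dim)
r_sym = np.full(dim, -1 + 0j)   # rotation by pi in every dimension
t = h * r_sym                   # place t exactly where the rotation sends h

print(rotate_score(h, r_sym, t))  # 0.0: fits <h,r,t> ...
print(rotate_score(t, r_sym, h))  # 0.0: ... and <t,r,h> at the same time
# Rotations also compose (and commute), which is how RotatE can model
# composed relations of the "father's mother" kind.
```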
Claim
None of the commonly used embeddings
capture any semantics
What is “semantics”?
11
This is not semantics
12
It is “wishful mnemonics”
13
Artificial Intelligence meets natural stupidity,
Drew McDermott, 1976
14
Artificial Intelligence meets natural stupidity,
Drew McDermott, 1981
15
Wishful Mnemonics
A major source of confusion in AI programs is the use of
mnemonics like “UNDERSTAND” or “GOAL”. If a
programmer calls the main loop of their program
“UNDERSTAND”, they may mislead a lot of people, most
prominently themselves.
What they should do instead is refer to this main loop as
“G0034” and see if they can convince themselves or
anyone else that G0034 implements some part of
understanding.
It is much harder to do this when using terms like “G0034”.
When you say UNDERSTAND(x), you can just feel the …
Artificial Intelligence meets natural stupidity,
Drew McDermott, 1981
16
As a field, AI has always been on the border of
respectability,
and therefore on the border of crackpottery.
[…]
In this paper, I have criticised AI researchers (including
myself) very harshly. To say anything good about anyone is
beyond the scope of this paper.
Prescription medicine for every AI researcher:
In order to maintain your mental hygiene, read
“Artificial Intelligence meets natural stupidity”
once yearly
So, this is “wishful mnemonics”
17
“wishful mnemonics” is not semantics
for your computer
18
[Diagram: a knowledge graph whose labels are all meaningless identifiers: G0034, H9945, XB56B, RB56, B599, Z077, W87, U654, 86G, KL64, BA21, BA51, 86H, …]
19
“logical semantics” is also not semantics
for your computer
So what is semantics for your computer?
[Diagram: triple ⟨Frank, birth-place, Bussum⟩]
So what is semantics for your computer?
[Diagram: triple ⟨Frank, has-birth-place, Bussum⟩, plus an ontology box: has-birth-place (domain: person, range: location); Frank is a person.
The enforced conclusions give a lower bound on agreement.]
So what is semantics for your computer?
[Diagram: triples ⟨Frank, has-birth-place, Bussum⟩ and ⟨Frank, has-birth-place, Meren⟩, plus an ontology box: has-birth-place (domain: person, range: location); Frank is a person; has-birth-place relates with min-cardinality: 1 and max-cardinality: 1.
⇒ Bussum = Meren: the enforced and the forbidden conclusions give a lower bound and an upper bound on agreement.]
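The mind-reading game can be run on a machine. A minimal sketch using the rdflib and owlrl Python packages (the example IRIs are made up): because rdfs:domain and rdfs:range are reserved symbols, any conforming reasoner is forced to draw the same conclusions, which is exactly the predictable inference meant here. The cardinality/equality step (Bussum = Meren) needs OWL rather than RDFS, so it is left out.

```python
from rdflib import Graph, Namespace, RDF, RDFS
from owlrl import DeductiveClosure, RDFS_Semantics

EX = Namespace("http://example.org/")
g = Graph()

# The data triple the audience sees:
g.add((EX.Frank, EX.hasBirthPlace, EX.Bussum))
# The shared background knowledge (the "ontology" box):
g.add((EX.hasBirthPlace, RDFS.domain, EX.Person))
g.add((EX.hasBirthPlace, RDFS.range, EX.Location))

# Expand the graph with everything the RDFS semantics *forces* us to infer.
DeductiveClosure(RDFS_Semantics).expand(g)

print((EX.Frank, RDF.type, EX.Person) in g)     # True: enforced conclusion
print((EX.Bussum, RDF.type, EX.Location) in g)  # True: enforced conclusion
```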
RDF Schema
The semantics is in the Reserved Symbols
[Diagram: Ontology (Schema) layer above Instance (Data) layer]
Claim
26
None of the commonly used embeddings
capture any semantics
Because none of the commonly used embeddings respect
any of the reserved symbols from RDF Schema or OWL.
Embeddings do “distributional semantics”, but
predictable co-occurrence ≠ predictable inference
Claim
None of the commonly used embeddings
capture any semantics
Because none of the commonly used embeddings
can represent universal quantification
(and that’s where the inference comes from)
Embeddings do “variable-free sentences” only,
and those don’t allow for any inference.
[Diagram: ontology box: has-birth-place (domain: person, range: location)]
27
This is not a knowledge graph
28
It is a data graph
because it doesn’t support any inference
and therefore doesn’t have any semantics
But surely other people
have noticed this before?
29
Make embeddings semantic again!
(Outrageous Ideas paper at ISWC 2018)
Abstract
The original Semantic Web vision foresees to describe
entities in a way that the meaning can be interpreted both
by machines and humans. [But] embeddings describe an
entity as a numerical vector, without any semantics
attached to the dimensions. Thus, embeddings are as far
from the original Semantic Web vision as can be. In this
paper, we make a claim for semantic embeddings.
Proposal 1: A Posteriori Learning of Interpretations.
Reconstruct a human-readable interpretation from the
vector space.
Proposal 2: Pattern-based Embeddings.
Use patterns in the knowledge graph to choose
human-interpretable dimensions in the vector space.
30
Neither of these is aimed at predictable inference
→ no semantics
From TransE to TransOWL
(and from TransR to TransROWL)
31
[Formula: the TransOWL loss, built on the TransE loss function, summed over all triples]
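For reference, a numpy sketch of the TransE margin loss that TransOWL builds on, plus a simplified stand-in for the axiom-injection idea; the exact TransOWL formulation differs, and `gamma`, the corruption scheme, and the `inverse_of` mapping here are illustrative assumptions.

```python
import numpy as np

def transe_loss(E, R, triples, corrupted, gamma=1.0):
    """TransE margin ranking loss, summed over all (true, corrupted) pairs."""
    loss = 0.0
    for (h, r, t), (h2, _, t2) in zip(triples, corrupted):
        d_pos = np.linalg.norm(E[h] + R[r] - E[t])    # distance of the true triple
        d_neg = np.linalg.norm(E[h2] + R[r] - E[t2])  # distance of the corrupted one
        loss += max(0.0, gamma + d_pos - d_neg)       # hinge: true should beat corrupted
    return loss

# TransOWL-style idea (simplified): derive extra training triples from schema
# axioms, so the loss is also pushed to respect the reserved symbols.
def inject_inverses(triples, inverse_of):
    """E.g. from "hasChild owl:inverseOf hasParent" and <a, hasChild, b>,
    also train on <b, hasParent, a>."""
    extra = [(t, inverse_of[r], h) for (h, r, t) in triples if r in inverse_of]
    return triples + extra

print(inject_inverses([("anna", "hasChild", "ben")], {"hasChild": "hasParent"}))
```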
More radical idea:
use more of the geometry
to capture the semantics
32
[Diagram, spheres (ELEm, EmEL++): the Father sphere lies inside the Male sphere and inside the Parent sphere, encoding 𝐹𝑎𝑡ℎ𝑒𝑟 ⊑ 𝑀𝑎𝑙𝑒 and 𝐹𝑎𝑡ℎ𝑒𝑟 ⊑ 𝑃𝑎𝑟𝑒𝑛𝑡.
Boxes (BoxEL, Box2EL): Male, Parent, Father as boxes. Can they encode 𝑃𝑎𝑟𝑒𝑛𝑡 ⊓ 𝑀𝑎𝑙𝑒 ⊑ 𝐹𝑎𝑡ℎ𝑒𝑟 ?]
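A sketch of the geometric intuition, with made-up boxes: axis-aligned boxes are closed under intersection (the intersection of two boxes is again a box), so a conjunction like Parent ⊓ Male is itself a box and the subsumption can be checked by containment. The intersection of two spheres is not a sphere, which is why sphere-based models struggle with exactly this kind of axiom.

```python
import numpy as np

class Box:
    def __init__(self, lo, hi):
        self.lo, self.hi = np.asarray(lo, float), np.asarray(hi, float)

    def contains(self, other):   # other ⊑ self
        return bool(np.all(self.lo <= other.lo) and np.all(other.hi <= self.hi))

    def intersect(self, other):  # self ⊓ other is again a Box
        return Box(np.maximum(self.lo, other.lo), np.minimum(self.hi, other.hi))

# Hypothetical learned class boxes in 2D:
male   = Box([0, 0], [4, 4])
parent = Box([2, 0], [6, 4])
father = Box([2, 0], [4, 4])   # here exactly the intersection of the two

print(male.contains(father), parent.contains(father))  # Father ⊑ Male, Father ⊑ Parent
print(father.contains(parent.intersect(male)))          # Parent ⊓ Male ⊑ Father: True
```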
Highlights
• Bluffer’s guide to KG embedding
• What is semantics?
• What is not a knowledge graph
• A very fishy picture
• A lesson from 1976
• A baby in a bath
• Some hope for the K in neuro-symbolic
34
35
[Image: the bathwater labelled “Symbolic representation”, the baby labelled “Semantics”]
36
Message 1:
Even if you throw out
the symbolic bathwater,
make sure to keep
the semantic baby.
#ISWS2023
37
Message 2:
You can even keep
the symbolic bathwater!
Symbolic loss function
during training
See the survey of 100+ systems in von Rueden et al., Informed Machine Learning, 2019
[Image: scene labelling — flower? cushion?]
“Parts of a chair are: cushion and armrest”
“Given the context of chair, a cushion is much more likely than a flower”
P(cushion | chair) ≫ P(flower | chair)
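A minimal sketch of the idea, assumed rather than taken from any of the surveyed systems: the training loss is the usual cross-entropy plus a penalty on the probability mass that the symbolic background knowledge forbids in this context.

```python
import numpy as np

labels = ["cushion", "armrest", "flower"]
# Background knowledge: parts of a chair are cushion and armrest, not flower.
allowed_in_chair = np.array([1.0, 1.0, 0.0])

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def total_loss(logits, target_idx, lam=1.0):
    p = softmax(logits)
    data_loss = -np.log(p[target_idx])                    # ordinary cross-entropy
    symbolic_loss = np.sum(p * (1.0 - allowed_in_chair))  # mass on forbidden labels
    return data_loss + lam * symbolic_loss                # lam trades off the two terms

print(total_loss(np.array([2.0, 0.5, 1.5]), target_idx=0))
```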
Symbolic consistency check
(during inference)
[Diagram: for ⟨queen, wears, ?⟩ the model predicts candidates “crown?” and “shower cap?”; pipeline Predict → Select, where the KG triple ⟨queen, wears, crown⟩ selects “crown”]
39
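A toy sketch of Predict → Select (the KG and the scores are made up): the neural model ranks candidates, and the knowledge graph keeps only the best candidate that is consistent with it.

```python
# Tiny stand-in for a real knowledge graph.
kg = {("queen", "wears", "crown")}

def select(subject, relation, candidates):
    """Keep the highest-scoring candidate that is consistent with the KG."""
    for label, score in sorted(candidates, key=lambda c: -c[1]):
        if (subject, relation, label) in kg:
            return label
    return None  # no candidate survives the consistency check

candidates = [("shower cap", 0.55), ("crown", 0.45)]  # raw neural scores
print(select("queen", "wears", candidates))           # -> "crown"
```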
Symbolic justification
(after inference)
[Diagram: ⟨queen, wears, ?⟩ with candidates “crown?” and “shower cap?”; pipeline Predict → Justify → Explain]
40
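And a toy sketch of Predict → Justify → Explain, again with a made-up KG: after a prediction is made, the supporting triples are retrieved and rendered as an explanation.

```python
kg = {("queen", "wears", "crown"), ("crown", "is_a", "headgear")}

def justify(subject, relation, prediction):
    """Return the KG triples that support the prediction, as readable sentences."""
    support = [t for t in kg if t == (subject, relation, prediction)]
    return [f"{s} {r} {o}" for (s, r, o) in support]

print(justify("queen", "wears", "crown"))  # ['queen wears crown']
```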
Takeaways
41
Some things are best done symbolically,
some numerically, some geometrically
The Semantic Web community is now truly a
“learning and reasoning” community
Semantics ≠ symbolic representation.
Instead: semantics = predictable inference
Whichever representation you choose,
make sure not to lose the predictable inference
There is more to neuro-symbolic methods
than embeddings


Editor's Notes

  • #10 This one isn’t right, is it? See earlier.
  • #21–23 Mind-reading game to explain semantics. If I show the audience the top triple, and we share a little bit of background knowledge in the square box (“ontology”), I can predict what the audience will infer from the top triple. The shared background knowledge forces us to believe certain things (such as that the right blobs must be locations), and forbids us to believe certain things (such as that the two right blobs are different). By increasing the background knowledge, the enforced conclusions (lower bound on agreement) and the forbidden conclusions (upper bound on agreement) get closer and closer, and the remaining space for ambiguity and misunderstanding shrinks. Not only misunderstanding between people, but also between machines. Slogan: semantics is when I can predict what you will infer when I send you something.