Modular design patterns for
systems that learn and reason:
a boxology
Frank van Harmelen, Annette ten Teije (V1)
Vrije Universiteit Amsterdam
+ Michael van Bekkum, Maaike de Boer, André Meyer (V2)
TNO Netherlands
(https://arxiv.org/abs/2102.11965)
Creative Commons License CC BY 3.0:
allowed to copy, redistribute, remix & transform,
but must attribute
Increasingly broad consensus in AI
(“the third wave”)
The next progress in AI will be driven by systems
that combine neural and symbolic techniques
Position papers by
Marcus, Lamb & Garcez, Darwiche, Pearl, Kautz, …
Keynotes at AAAI17, IJCAI18, IJCAI19, AAAI20,….
(“proof by authority”)
So let’s compare
Strengths & Weaknesses

                Symbolic             Connectionist
Construction    Human effort         Data hunger
Scalable        +/-                  +/-
Explainable     +                    -
Generalisable   Performance cliff    Performance cliff
The “knowledge acquisition bottleneck”:
300,000 medical definitions, 40 years of effort, 10,000 updates every year
“Sample inefficiency” (data hunger):
10M training images, 4.8M training games
Symbolic: worse with more data (“combinatorial explosion”)
Connectionist: worse with less data (“sample inefficiency”)
The “black box problem” (explainability)
[plot: quality vs. generality, illustrating the performance cliff]
“Out-of-distribution generalisability”:
Class: 793, Label: n04209133 (shower cap), Certainty: 99.7%
Can we get them
to collaborate?
So we started reading…
• 3 years of weekly reading group ≈ 75 papers
• 3x 8-week seminars with 15 students ≈ 100 papers
It was a mess…
Lots of techniques, tricks, ideas, methods, math
No structure, no guidance, no map, no theory
Ontology learning
Description logic learning
Hyperbolic embeddings
Goal 1: Can we make a reading map?
(as educators)
Goal 2: Can we make a modular theory?
(as scientists)
Inspired by Software Engineering:
a theory of re-usable patterns
(“Gang of Four”)
Inspired by Software Engineering:
a theory of re-usable patterns
Inspired by Process Mining:
a theory of re-usable patterns
Inspired by Knowledge Engineering:
a theory of re-usable patterns
Task types:
  knowledge-intensive task
    analytic task: classification, assessment, diagnosis, monitoring, prediction
    synthetic task: design, configuration, planning, scheduling, assignment, modelling
Task templates (e.g. for classification):
  inference steps generate, specify, match, obtain
  over the roles object, class, attribute, feature, truth value
Plan:
make compositional patterns
by loose coupling
of elementary components
a “boxology”
elementary boxes: learning, inference
Example: a classical ML system
[diagram: a learning box combined with an inference box]
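The two elementary boxes can be sketched in a few lines; this is a minimal illustration of the data → learning → model → inference pattern, with an invented toy 1-D dataset and a simple threshold standing in for the learned model.

```python
def learning(examples):
    """Learning box: data in, model out.
    Here the 'model' is just a decision threshold (midpoint of class means)."""
    neg = [x for x, y in examples if y == 0]
    pos = [x for x, y in examples if y == 1]
    threshold = (sum(neg) / len(neg) + sum(pos) / len(pos)) / 2
    return {"threshold": threshold}

def inference(model, x):
    """Inference box: model + new input in, prediction out."""
    return 1 if x >= model["threshold"] else 0

examples = [(1.0, 0), (1.5, 0), (4.0, 1), (4.5, 1)]
model = learning(examples)       # data --> [learning] --> model
print(inference(model, 4.2))     # model + input --> [inference] --> 1
print(inference(model, 1.2))     # --> 0
```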
Example: Inductive Logic Programming
% Given facts (names lowercased to be valid Prolog atoms):
parent(mary,vicky).
parent(mary,andy).
parent(carrey,vicky).
mother(mary,vicky).
mother(mary,andy).
father(carrey,vicky).
father(carrey,andy).
% Induced rules:
parent(X,Y) :- mother(X,Y).
parent(X,Y) :- father(X,Y).
% Newly derived fact:
parent(carrey,andy).
Symbols in, symbols out
• Inductive Logic Programming
• Probabilistic Soft Logic
• Markov Logic Networks
• …
Intermezzo: Symbol or data?
[figure: a classical machine learning system labelling an image “cat”]
“What the <0.70, 1.17, 0.99, 1.07> is a Symbol?”
Istvan Berkeley, Minds & Machines, 2008:
1. a symbol must designate an object, a class or a relation in the world
(= the “interpretation” of the symbol)
2. symbols can be either atomic or complex
(= composed of other symbols according to compositional rules)
3. there must be a system of operations that, when applied to a symbol,
generates new symbols, which again must have a designation.
Symbolic prior (informed ML)
P(cushion|chair) >> P(flower|chair)
See the survey of 100+ systems in von Rueden et al., Informed Machine Learning, 2019
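A sketch of the symbolic-prior idea: multiply a network's (ambiguous) scores by a prior taken from background knowledge and renormalise. The numbers and the chair/cushion prior are invented for illustration; the pattern is simply posterior ∝ network score × symbolic prior.

```python
# Raw network scores for an object seen next to a chair (a tie)
net_scores = {"cushion": 0.4, "flower": 0.4, "car": 0.2}

# Symbolic prior from a knowledge base: P(object | next to chair)
prior = {"cushion": 0.7, "flower": 0.2, "car": 0.1}

def apply_prior(scores, prior):
    """Combine network scores with a symbolic prior, renormalised."""
    combined = {k: scores[k] * prior.get(k, 0.0) for k in scores}
    z = sum(combined.values())
    return {k: v / z for k, v in combined.items()}

posterior = apply_prior(net_scores, prior)
print(max(posterior, key=posterior.get))  # the prior breaks the tie: cushion
```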
Learning intermediate abstractions for reasoning
:- see( , 3), see( , 5), add(3,5,8).   [digit images omitted]
End-to-end: 2x 784 inputs, 19 outputs
AlphaGo
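The see/add pattern can be sketched as a perception step feeding a symbolic reasoning step. The stub classifier and image names below are invented stand-ins for a real neural digit recogniser; only the division of labour is the point.

```python
def see(image):
    """Stand-in for a neural digit classifier: image -> digit symbol.
    A real system would run a network over 784 pixel inputs."""
    fake_classifier = {"img_a": 3, "img_b": 5}
    return fake_classifier[image]

def add(x, y):
    """Symbolic reasoning step over the recognised symbols."""
    return x + y

# Corresponds to: see( , 3), see( , 5), add(3, 5, 8).
result = add(see("img_a"), see("img_b"))
print(result)  # 8
```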
Learning intermediate abstractions for learning
Symbolic front end + neural back end
Example: reinforcement learning for spatial navigation
Faster adaptation to changes; better transfer learning
Explainable ML by rational reconstruction
[diagram: a “shower cap” prediction reconstructed via the symbols queen, crown, wears]
Ranking hypotheses (≈ explaining why not)
[diagram: the same symbols queen, crown, wears vs. shower cap]
From symbols to data and back again
Knowledge Graph completion
[diagram: symbols → ML → prediction algorithm → ML → symbols]
From symbols to data and back again
Knowledge Graph completion
From: a graph with edges such as (Michael Jackson, Publish_song, Beat It)
Predict: the missing edge (Rolling Stones, Publish_song, Angi)
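The symbols → data → symbols round trip can be sketched with embedding-based link prediction. The 2-D vectors below are hand-picked toy values so the example works; a real system would learn them from the graph (TransE-style, where head + relation ≈ tail for true triples).

```python
emb = {  # toy entity/relation embeddings (invented for illustration)
    "RollingStones":  (0.0, 0.0),
    "MichaelJackson": (1.0, 1.0),
    "Angi":           (0.1, 1.0),
    "BeatIt":         (1.1, 2.0),
    "publish_song":   (0.0, 1.0),  # relation as a translation vector
}

def score(h, r, t):
    """TransE-style score: small when head + relation is close to tail."""
    hv, rv, tv = emb[h], emb[r], emb[t]
    return sum((hv[i] + rv[i] - tv[i]) ** 2 for i in range(2))

# Complete the triple (RollingStones, publish_song, ?)
candidates = ["Angi", "BeatIt"]
best = min(candidates, key=lambda t: score("RollingStones", "publish_song", t))
print(best)  # Angi scores closest
```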
Knowledge-based auto-ML
• Algorithmic configuration
• Hyperparameter tuning
• Selection of training examples
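One way knowledge can steer auto-ML is by pruning the search space before tuning. The grid and the admissibility rules below are purely illustrative; the point is that symbolic background knowledge cuts down which configurations are worth evaluating at all.

```python
import itertools

grid = {
    "model": ["linear", "tree"],
    "max_depth": [None, 3, 10],
    "regularisation": [0.0, 0.1, 1.0],
}

def admissible(cfg):
    """Symbolic constraints from background knowledge (toy rules)."""
    if cfg["model"] == "linear" and cfg["max_depth"] is not None:
        return False  # depth is meaningless for a linear model
    if cfg["model"] == "tree" and cfg["regularisation"] > 0:
        return False  # (toy rule) no regularisation term for plain trees
    return True

configs = [dict(zip(grid, vals)) for vals in itertools.product(*grid.values())]
pruned = [c for c in configs if admissible(c)]
print(len(configs), "->", len(pruned))  # 18 -> 6
```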
Concluding
remarks
Goal 1: Create some structure
in the huge number of proposals
for combining learning and reasoning
Goal 2: Create modular architectures
Contribution:
A set of re-usable architectural patterns
for modular systems that learn and reason
Next steps:
• Formalise informal diagrams as pre/post-conditions
• Implement informal diagrams as a code library
• Generate diagrams via a grammar
(and predict unexplored patterns)
