Inference is something we humans do all the time. Given a set of facts about the world, we derive new ones using some form of inference. Automated reasoning has been studied extensively but its value in providing a more powerful abstraction layer for database languages has been overlooked so far.
This talk explores deductive inference in Grakn, a hyper-relational database that has automated inference as one of its core features. Rather than defining SQL views or writing ad hoc code, in Grakn we can define logical rules that provide a more intuitive way to describe higher level domain concepts. In the talk we give a quick overview of computational logic semantics and of top-down and bottom-up inference algorithms. Then, after introducing some preliminary Grakn concepts, we show how logical rules are resolved in a query.
Logical Inference in a Hyper-Relational Database
1. Join our community at grakn.ai/community
THE DATABASE FOR AI
Logical Inference in a Hyper-Relational Database
By Domenico Corapi
Lead Engineer at GRAKN.AI
2. Follow us @GraknLabs
WHO AM I?
• @DomenicoCorapi
• PhD in Computational Logic / Machine Learning at Imperial College
• Web Services and Distributed Systems at Microsoft Yammer and Skyscanner
• Finance stuff at Citi and Bloomberg
4. INFERENCE
• We know some facts about the world: a Knowledge Base
• Using some inference rules, what other facts can we derive?
Knowledge Base -> Inference -> Conclusions
5. INFERENCE
• Deductive reasoning
• Abductive reasoning
• Inductive reasoning
• Probabilistic reasoning
• …
Deduction
Knowledge Base:
• Today is a sunny day
• I don’t work if it’s a sunny day or if it’s a bank holiday
Inference
Conclusions:
• I don’t work today
8. Follow us @GraknLabs
SOME NOTATION
First-order logic
Formal logic
Computational logic
Prolog
Datalog
Answer Set Programming
Graql rules
9. Follow us @GraknLabs
SOME NOTATION
• human(socrates)
socrates is human
• human(X)
X is human
• likes(socrates, X), paradox(X)
socrates likes X and X is a paradox
• likes(socrates, X) <- paradox(X)
socrates likes X if X is a paradox
• h <- b1, b2, …, bn
Definite clause
12. Follow us @GraknLabs
HERBRAND BASE
A Herbrand Base is the set of all possible ground (no variables) assertions
Knowledge Base:
• human(socrates)
• human(X) -> mortal(X)
Herbrand Base:
• human(socrates)
• mortal(socrates)
13. Follow us @GraknLabs
INTERPRETATION
An Interpretation is an assignment of truth values to elements in the Herbrand Base
Herbrand Base:
• human(socrates)
• mortal(socrates)
Possible interpretations (each listed as the set of atoms assigned true):
• {}
• {human(socrates)}
• {mortal(socrates)}
• {mortal(socrates), human(socrates)}
14. Follow us @GraknLabs
MODEL
A Model for a Knowledge Base KB is an interpretation of KB that satisfies each clause in KB
Knowledge Base:
• human(socrates)
• human(X) -> mortal(X)
Possible interpretations:
• {}
• {human(socrates)}
• {mortal(socrates)}
• {mortal(socrates), human(socrates)} (Model)
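The check on this slide can be done mechanically. The following is a minimal Python sketch, not taken from the talk: it enumerates every interpretation of the two-atom Herbrand Base and keeps those that satisfy both clauses. The function names are illustrative, and the rule is grounded by hand for X = socrates.

```python
from itertools import combinations

# Herbrand Base of the slide's knowledge base.
HERBRAND_BASE = ["human(socrates)", "mortal(socrates)"]

def interpretations(base):
    """Yield every subset of the Herbrand Base (the atoms taken to be true)."""
    for size in range(len(base) + 1):
        for subset in combinations(base, size):
            yield frozenset(subset)

def is_model(interp):
    """Return True when the interpretation satisfies both clauses of the KB."""
    fact_holds = "human(socrates)" in interp
    # human(X) -> mortal(X), grounded by hand with X = socrates.
    rule_holds = "human(socrates)" not in interp or "mortal(socrates)" in interp
    return fact_holds and rule_holds

all_interps = list(interpretations(HERBRAND_BASE))
models = [i for i in all_interps if is_model(i)]
print(len(all_interps), [sorted(m) for m in models])
```

Of the four interpretations, only the one containing both atoms passes the check.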
15. Follow us @GraknLabs
SOME FUNDAMENTAL THEOREM
A Knowledge Base of definite clauses (no negation) has one and only
one minimal model
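For ground definite clauses, that minimal model can be computed by applying the rules repeatedly until nothing new is derived. The sketch below is a naive fixpoint loop in Python, written for illustration only; the `(head, body)` encoding and the function name `least_model` are assumptions, and the rules are already grounded for the constants socrates and plato.

```python
def least_model(facts, rules):
    """Naive bottom-up evaluation of ground definite clauses.

    rules is a list of (head, body) pairs, where body is a list of atoms
    that must all be true before head is added.
    """
    model = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in model and all(atom in model for atom in body):
                model.add(head)
                changed = True
    return model

facts = {"human(socrates)", "human(plato)"}
# human(X) -> mortal(X), grounded by hand for each constant.
rules = [
    ("mortal(socrates)", ["human(socrates)"]),
    ("mortal(plato)", ["human(plato)"]),
]
print(sorted(least_model(facts, rules)))
```

Every atom in the result is forced by some fact or rule, which is why no smaller model exists.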
16. Follow us @GraknLabs
COMPUTATIONAL LOGIC IN A NUTSHELL
KB ⊨ G
G is entailed by KB
KB ⊢ G
G is provable from KB
• What language is KB in?
• What semantics is represented by ⊨?
• What proof procedure is used in ⊢?
• Under what conditions do ⊨ and ⊢ coincide (soundness and completeness)?
17. Follow us @GraknLabs
CLOSED WORLD ASSUMPTION
Flights
Airline  Number  Origin  Dest
AS       98      ANC     SEA
AA       2336    LAX     PBI
US       840     SFO     CLT
AA       258     LAX     MIA
AS       153     SEA     ANC
…
select * from flights
where origin = 'LHR'
What do we know about entries that are NOT in the table?
Not added yet?
They don’t exist?
We don’t know?
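The difference between the possible readings can be made concrete. In this small Python sketch (an illustration, not something shown in the talk), only the five rows visible on the slide are used and the helper names are hypothetical. Under the closed world assumption an empty result means "false"; under an open world reading it only means "unknown".

```python
# The five rows visible on the slide: (airline, number, origin, dest).
FLIGHTS = [
    ("AS", 98, "ANC", "SEA"),
    ("AA", 2336, "LAX", "PBI"),
    ("US", 840, "SFO", "CLT"),
    ("AA", 258, "LAX", "MIA"),
    ("AS", 153, "SEA", "ANC"),
]

def flights_from(origin):
    """Equivalent of: select * from flights where origin = <origin>."""
    return [row for row in FLIGHTS if row[2] == origin]

def departs_from_cwa(origin):
    """Closed world: whatever is not in the table is taken to be false."""
    return bool(flights_from(origin))

def departs_from_owa(origin):
    """Open world: absence from the table only means 'unknown' (None)."""
    return True if flights_from(origin) else None

print(flights_from("LHR"), departs_from_cwa("LHR"), departs_from_owa("LHR"))
```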
18. Follow us @GraknLabs
NEGATION IS HARD
man(adam).
single(X) <- man(X), not husband(X).
husband(X) <- man(X), not single(X).
Herbrand Base:
{man(adam),
single(adam),
husband(adam)}
Minimal models:
• {man(adam), single(adam)}
• {man(adam), husband(adam)}
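The two minimal models can be reproduced by brute force. The Python sketch below is illustrative only and makes one simplifying assumption: `not` is read as classical negation, so each rule becomes an ordinary implication. It enumerates all subsets of the Herbrand Base, keeps those that satisfy the program, and discards any model that strictly contains another.

```python
from itertools import combinations

ATOMS = ["man(adam)", "single(adam)", "husband(adam)"]

def satisfies(interp):
    """Check the fact and both rules, reading 'not' classically."""
    man = "man(adam)" in interp
    single = "single(adam)" in interp
    husband = "husband(adam)" in interp
    fact = man
    # single(X) <- man(X), not husband(X).
    rule1 = single or not (man and not husband)
    # husband(X) <- man(X), not single(X).
    rule2 = husband or not (man and not single)
    return fact and rule1 and rule2

def minimal_models():
    """Return the models that have no strictly smaller model below them."""
    subsets = (
        frozenset(c)
        for size in range(len(ATOMS) + 1)
        for c in combinations(ATOMS, size)
    )
    models = [s for s in subsets if satisfies(s)]
    return [m for m in models if not any(other < m for other in models)]

for model in minimal_models():
    print(sorted(model))
```

The interpretation containing all three atoms also satisfies the program, but it is not minimal, so two incomparable answers remain.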
20. Follow us @GraknLabs
BOTTOM UP INFERENCE
• Answer Set Programming
• Multiple models as answer
• Use of disjunction, cardinality constraints, integrity constraints, classical negation, CWA negation… (no functors)
• Variables are typed
23. Follow us @GraknLabs
COMPUTE THE MODELS
(NOT human(socrates) OR mortal(socrates))
AND
(NOT human(plato) OR mortal(plato))
AND
human(plato)
AND
human(socrates)
SAT SOLVER
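To see what the solver is asked to do, the formula above can be checked exhaustively. The Python sketch below is a brute-force enumeration, not a real SAT solver (production solvers use far more efficient search); the clause encoding as `(atom, polarity)` pairs is an assumption made for this example.

```python
from itertools import product

VARIABLES = [
    "human(socrates)", "mortal(socrates)",
    "human(plato)", "mortal(plato)",
]

# Each clause is a list of (atom, polarity) literals; polarity False means NOT.
CNF = [
    [("human(socrates)", False), ("mortal(socrates)", True)],
    [("human(plato)", False), ("mortal(plato)", True)],
    [("human(plato)", True)],
    [("human(socrates)", True)],
]

def satisfying_assignments(variables, cnf):
    """Yield every truth assignment that makes all clauses true."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(
            any(assignment[atom] == polarity for atom, polarity in clause)
            for clause in cnf
        ):
            yield assignment

solutions = list(satisfying_assignments(VARIABLES, CNF))
for solution in solutions:
    print(sorted(atom for atom, value in solution.items() if value))
```

Each satisfying assignment corresponds to a model of the knowledge base; here there is exactly one.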
26. Follow us @GraknLabs
GRAKN KNOWLEDGE MODEL
Entity: socrates, plato
Attribute: mortal, e.g. mortal(socrates)
Relationship: teaches, e.g. teaches(socrates, plato)
Role: teacher, student, e.g. teaches(teacher: socrates, student: plato)
29. Follow us @GraknLabs
RULES IN GRAKN
human(X) -> mortal(X)
when {
$x isa human;
}, then {
$x has mortal "true";
}
human sub mortal
30. Follow us @GraknLabs
RULES IN GRAKN
is_located_in(X, Y), is_located_in(Y, Z) -> is_located_in(X, Z)
when {
(geo-entity: $x, entity-location: $y) isa is-located-in;
(geo-entity: $y, entity-location: $z) isa is-located-in;
}, then {
(geo-entity: $x, entity-location: $z) isa is-located-in;
}
31. Follow us @GraknLabs
STEP BY STEP
match $x isa city; $y has name 'Poland'; ($x, $y) isa is-located-in; get;
iterator.next()
1. Pop a subgoal from the stack.
2. If the subgoal is an answer, return it.
3. Otherwise, expand the goal into subgoals, push them onto the stack,
and go back to 1.
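The loop above can be sketched as a Python generator. This is an illustrative toy, not Grakn's implementation: the facts are a hypothetical in-memory list standing in for the graph, atoms are plain tuples, and a `max_depth` bound on rule applications is an assumption added here so that the recursive rule terminates (the talk mentions caching of intermediate results instead).

```python
import itertools

# Toy ground facts standing in for the graph (illustrative data only).
FACTS = [
    ("name", "poland", "Poland"),
    ("city", "warsaw"),
    ("is-located-in", "warsaw", "masovia"),
    ("is-located-in", "masovia", "poland"),
    ("is-located-in", "silesia", "poland"),
]
# is_located_in(X, Z) <- is_located_in(X, Y), is_located_in(Y, Z)
RULE_HEAD = ("is-located-in", "$X", "$Z")
RULE_BODY = [("is-located-in", "$X", "$Y"), ("is-located-in", "$Y", "$Z")]

def is_var(term):
    return term.startswith("$")

def walk(term, subst):
    """Follow variable bindings until a constant or an unbound variable."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term

def unify(a, b, subst):
    """Unify two atoms; return an extended copy of subst, or None."""
    if len(a) != len(b) or a[0] != b[0]:
        return None
    subst = dict(subst)
    for s, t in zip(a[1:], b[1:]):
        s, t = walk(s, subst), walk(t, subst)
        if s == t:
            continue
        if is_var(s):
            subst[s] = t
        elif is_var(t):
            subst[t] = s
        else:
            return None
    return subst

def solve(query, max_depth=3):
    """Lazily yield answers; each next() resumes from the saved stack."""
    fresh = itertools.count()
    stack = [(list(query), {}, 0)]
    while stack:
        goals, subst, depth = stack.pop()          # 1. pop a state
        if not goals:                              # 2. no goals left: answer
            yield {v: walk(v, subst)
                   for atom in query for v in atom[1:] if is_var(v)}
            continue
        first, rest = goals[0], goals[1:]
        if depth < max_depth:                      # 3a. expand with the rule
            n = next(fresh)
            rename = lambda atom: tuple(
                f"{t}_{n}" if is_var(t) else t for t in atom)
            s = unify(first, rename(RULE_HEAD), subst)
            if s is not None:
                body = [rename(b) for b in RULE_BODY]
                stack.append((body + rest, s, depth + 1))
        for fact in FACTS:                         # 3b. look up stored facts
            s = unify(first, fact, subst)
            if s is not None:
                stack.append((rest, s, depth))

QUERY = [("name", "$y", "Poland"), ("is-located-in", "$x", "$y"), ("city", "$x")]
answers = solve(QUERY)
first_answer = next(answers)   # plays the role of iterator.next()
print(first_answer)
```

Because fact lookups are pushed after the rule expansion, they are popped first, so direct matches are tried before the rule is applied. Nothing is computed beyond what the caller asks for with `next()`.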
32. Follow us @GraknLabs
STEP BY STEP
$y has name 'Poland'
($x, $y) isa is-located-in
match $x isa city
G
lookup $y such that $y has name 'Poland'
($x, $y) isa is-located-in
match $x isa city
($x, NODE_POLAND) isa is-located-in
match $x isa city
$y / NODE_POLAND
lookup ($x, NODE_POLAND) isa is-located-in
match $x isa city
33. Follow us @GraknLabs
STEP BY STEP
$y has name 'Poland'
($x, $y) isa is-located-in
match $x isa city
lookup $y such that $y has name 'Poland'
G
G
($x, NODE_POLAND) isa is-located-in
match $x isa city
$y / NODE_POLAND
lookup ($x, NODE_POLAND) isa is-located-in
match $x isa city
lookup ($x, NODE_POLAND) isa is-located-in
…
lookup $y has name 'Poland'
…
match NODE_SILESIA isa city
$x / NODE_SILESIA
34. Follow us @GraknLabs
STEP BY STEP
$y has name 'Poland'
($x, $y) isa is-located-in
match $x isa city
lookup $y such that $y has name 'Poland'
G
G
($x, NODE_POLAND) isa is-located-in
match $x isa city
$y / NODE_POLAND
lookup ($x, NODE_POLAND) isa is-located-in
match $x isa city
$x / NODE_SILESIA
match NODE_SILESIA isa city
{}
35. Follow us @GraknLabs
STEP BY STEP
lookup ($x, NODE_POLAND) isa is-located-in
match $x isa city
$x / NODE_SILESIA
match NODE_SILESIA isa city
{}
$x / NODE_MASOVIA
match NODE_MASOVIA isa city
{}
($x, NODE_POLAND) isa is-located-in
match $x isa city
is_located_in(X, Z) <-
is_located_in(X, Y),
is_located_in(Y, Z)
X/$x
Z/NODE_POLAND
is-located-in($x, NODE_POLAND)
<- is-located-in($x, $z),
is-located-in($z, NODE_POLAND)
($x, $z) isa is-located-in
($z, NODE_POLAND) isa is-located-in
match $x isa city
36. Follow us @GraknLabs
STEP BY STEP
($x, $z) isa is-located-in
($z, NODE_POLAND) isa is-located-in
($x, NODE_MASOVIA) isa is-located-in
match $x isa city
$z / NODE_MASOVIA
Answer: {$x: NODE_WARSAW}
…
…
37. Follow us @GraknLabs
SLD RESOLUTION
This is very similar to what Prolog implementations do
Differences
• Things are represented as nodes! No index lookup, but just getting the edges
• Interleaving graph lookups with resolution steps
• Type checks and roles in relationships
• Fetching results lazily. We keep a pointer rather than saving goals on the stack
• Caching intermediate results (similar to tabling in Prolog)
38. Follow us @GraknLabs
CONCLUSION
Inference is a more powerful and general way to derive conclusions from data.
Grakn uses rules to define higher-level concepts and applies a top-down procedure to perform inference.
Lookups are expensive, so we need to be lazy.
We use a stack to save the state of the computation, and we keep pointers to nodes in the underlying graph from which we resume the inference procedure.