by Federico Cerutti; Lance Kaplan; Angelika Kimmig; Murat Sensoy
Paper accepted at AAAI 2019
We enable aProbLog, a probabilistic logic programming approach, to reason in the presence of uncertain probabilities represented as Beta-distributed random variables. We achieve the same performance as state-of-the-art algorithms for highly specified and engineered domains, while maintaining the flexibility offered by aProbLog in handling complex relational domains. Our motivation is that faithfully capturing the distribution of probabilities is necessary to compute an expected utility for effective decision making under uncertainty: unfortunately, these probability distributions can be highly uncertain due to sparse data. To understand and accurately manipulate such probability distributions we need a well-defined theoretical framework, provided by the Beta distribution, which specifies a distribution over all the possible values of a probability when its exact value is unknown.
2. Reasoning about objects: attributes and relations

• Reasoning about objects (attributes and relations): expressing statements like "there is a relation between smoking and asthma". E.g. Logic Programming.
• Probabilistic Reasoning: some attributes and relations in the real world are probabilistic. E.g. Probabilistic Logic Programming.
• Reasoning about Confidence in Probabilities: let's toss a coin 3 times and obtain 2 heads and 1 tail: is the coin fair? Let's toss the same coin 3000 times and obtain 2000 heads and 1000 tails: is the coin fair? E.g. Dempster-Shafer, possibility theory, imprecise probabilities… and our proposal.
3. Probabilistic Logic Programming
0.6::asthma(X) :- smokes(X).
smokes(bill).
Probability of a query:

P(q) := \sum_{\Lambda' \subseteq \Lambda,\ \Lambda' \models q} P_{\Lambda}(\Lambda') = \sum_{\Lambda' \subseteq \Lambda,\ \Lambda' \models q} \ \prod_{\lambda_i \in \Lambda'} p_i \cdot \prod_{\lambda_i \in \Lambda \setminus \Lambda'} (1 - p_i)
Bill suffers from asthma with probability 0.6 if he smokes
Sato, T. 1995. A statistical learning method for logic programs with distribution semantics. In Proceedings of ICLP-1995, 715–729.
De Raedt, L.; Kimmig, A.; and Toivonen, H. 2007. ProbLog: A probabilistic Prolog and its application in link discovery. In Proceedings of IJCAI-2007, 2462–2467.
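The query probability above can be sketched by brute-force enumeration of total choices. The fact name and the `entails_query` check below are a hypothetical hand-grounding of the smokers example, not ProbLog's API:

```python
from itertools import product

# Hypothetical grounding of the slide's program: one probabilistic
# fact for the rule instance 0.6::asthma(bill) :- smokes(bill).
facts = {"asthma_if_smokes_bill": 0.6}

def entails_query(world):
    # smokes(bill) is a certain fact, so asthma(bill) holds exactly
    # when the probabilistic rule instance is chosen in this world.
    return world["asthma_if_smokes_bill"]

# P(q) = sum, over subsets of facts whose induced world entails q,
# of the product of chosen/unchosen fact probabilities.
p_query = 0.0
for choices in product([True, False], repeat=len(facts)):
    world = dict(zip(facts, choices))
    weight = 1.0
    for fact, chosen in world.items():
        weight *= facts[fact] if chosen else 1.0 - facts[fact]
    if entails_query(world):
        p_query += weight

print(p_query)  # 0.6
```

This enumerates all 2^|facts| total choices, so it only illustrates the semantics; real ProbLog inference uses knowledge compilation instead.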
4. Where do the numbers come from?
# Smokes Asthma
1 T T
2 T T
3 T F
4 T T
5 T T
6 T F
7 T T
8 T F
9 T T
10 T F
π: the true (unknown) probability of asthma conditioned on smoking.

Let y be the number of occurrences of asthma over n patients when the patient smokes (y = 6).

From Bayes' theorem, we can estimate the posterior distribution of π given the data on the basis of a prior:

g(\pi \mid y) \propto g(\pi) \cdot f(y \mid \pi)

The conjugate prior of the binomial likelihood is the Beta distribution. If:

g(\pi; a, b) = \mathrm{Beta}(a, b) = \frac{\Gamma(a + b)}{\Gamma(a)\,\Gamma(b)} \, \pi^{a-1} (1 - \pi)^{b-1}

then: g(\pi \mid y) = \mathrm{Beta}(y + a,\ n - y + b)

If a = b = 1 (uniform prior), then g(\pi \mid y) = \mathrm{Beta}(y + 1,\ n - y + 1)

In the example, g(\pi \mid y = 6, n = 10) = \mathrm{Beta}(7, 5)
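The conjugate update above amounts to two additions; a minimal sketch for the slide's numbers (y = 6 asthma cases among n = 10 smokers, uniform prior):

```python
# Beta-binomial conjugate update: posterior = Beta(y + a, n - y + b).
a, b = 1, 1        # uniform prior Beta(1, 1)
n, y = 10, 6       # 6 asthma cases among 10 smokers

post_a, post_b = y + a, n - y + b          # Beta(7, 5)
mean = post_a / (post_a + post_b)          # posterior mean of pi
var = post_a * post_b / ((post_a + post_b) ** 2 * (post_a + post_b + 1))

print(post_a, post_b)   # 7 5
print(round(mean, 4))   # 0.5833
```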
6. Proposal: extend Probabilistic Logic Programming to manipulate Beta-distributed
random variables rather than probabilities
Advantage: enable reasoning both about the probabilities of things and the uncertainty
associated with our inferences
Technical Solution:
1. Derive addition and multiplication operators over Beta distributions returning a Beta distribution via moment matching, to be used within the algebraic ProbLog (aProbLog) proposal*
2. Extend aProbLog to include a conditioning operator
3. Derive a conditioning operator over Beta distributions returning a Beta distribution
via moment-matching
*Kimmig, A.; Van den Broeck, G.; and De Raedt, L. 2011. An algebraic Prolog for reasoning about possible worlds. In Proceedings of AAAI 2011, 209–214.
7. Step 0: aProbLog*
P(q) = \sum_{\Lambda' \subseteq \Lambda,\ \Lambda' \models q} \ \prod_{\lambda_i \in \Lambda'} p_i \cdot \prod_{\lambda_i \in \Lambda \setminus \Lambda'} (1 - p_i)

⇓

A(q) = \bigoplus_{I \in I(q)} \bigotimes_{i \in I} \delta(i)

Requirement: a commutative semiring \langle \mathcal{A}, \oplus, \otimes, e^{\oplus}, e^{\otimes} \rangle
*Kimmig, A.; Van den Broeck, G.; and De Raedt, L. 2011. An algebraic Prolog for reasoning about possible worlds. In Proceedings of AAAI 2011, 209–214.
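The semiring requirement can be made concrete with a small sketch: an aProbLog-style label is computed by ⊕-summing over the interpretations of the query and ⊗-multiplying literal labels. The `Semiring` class, `label` function, and toy labels below are illustrative, not aProbLog's actual interface:

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Semiring:
    plus: Callable[[Any, Any], Any]   # ⊕
    times: Callable[[Any, Any], Any]  # ⊗
    zero: Any                         # e⊕, neutral element of ⊕
    one: Any                          # e⊗, neutral element of ⊗

# The probability semiring recovers ordinary ProbLog inference.
prob_semiring = Semiring(lambda x, y: x + y, lambda x, y: x * y, 0.0, 1.0)

def label(interpretations, delta, sr):
    """A(q) = ⊕ over I in I(q) of (⊗ over i in I of delta(i))."""
    total = sr.zero
    for I in interpretations:
        w = sr.one
        for lit in I:
            w = sr.times(w, delta[lit])
        total = sr.plus(total, w)
    return total

# Toy query with two models: {f1} and {¬f1, f2}.
delta = {"f1": 0.6, "not_f1": 0.4, "f2": 0.5}
models = [["f1"], ["not_f1", "f2"]]
print(round(label(models, delta, prob_semiring), 6))  # 0.8
```

Swapping `prob_semiring` for a different ⟨A, ⊕, ⊗, e⊕, e⊗⟩ changes what the same evaluation computes, which is exactly the flexibility the paper exploits.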
8. Step 1: Addition and Multiplication Operators for Beta Variables
Given X and Y independent Beta-distributed random variables:

• the sum (⊕β) of X and Y is defined as the Beta-distributed random variable Z such that:

E[Z] = E[X + Y] = E[X] + E[Y]

and

Var(Z) = Var(X + Y) = Var(X) + Var(Y)

• the product (⊗β) of X and Y is defined as the Beta-distributed random variable Z such that:

E[Z] = E[XY] = E[X]\,E[Y]

and

Var(Z) = Var(XY) = Var(X)\,(E[Y])^2 + Var(Y)\,(E[X])^2 + Var(X)\,Var(Y)
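The moment-matching idea can be sketched as follows (the (α, β)-pair representation and helper names are mine, not the paper's code): compute the target mean and variance from the operator's definition, then invert them back to Beta parameters.

```python
def beta_moments(a, b):
    """Mean and variance of Beta(a, b)."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var

def moments_to_beta(mean, var):
    """Moment matching: the Beta(a, b) with the given mean and variance
    (requires var < mean * (1 - mean))."""
    common = mean * (1 - mean) / var - 1
    return mean * common, (1 - mean) * common

def beta_product(x, y):
    """⊗β: product of independent Beta variables, matched to a Beta."""
    mx, vx = beta_moments(*x)
    my, vy = beta_moments(*y)
    mean = mx * my
    var = vx * my**2 + vy * mx**2 + vx * vy
    return moments_to_beta(mean, var)

# ⊕β is analogous: add the means and the variances before matching.
a, b = beta_product((2, 2), (2, 2))
print(round(beta_moments(a, b)[0], 4))  # 0.25
```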
9. Step 2: Conditioning operator
A(q \mid E = e) = A(I(q \wedge E = e)) \oslash A(I(E = e))

(the label of q ∧ E = e given the label of E = e)
10. Step 3: Conditioning Operator for Beta Variables
Given X and Y Beta-distributed random variables, with Y = A(I(E = e)) = A(I(q \wedge E = e)) ⊕β A(I(\neg q \wedge E = e)) and X = A(I(q \wedge E = e)), the conditioning-division (⊘β) of X by Y is defined as the Beta-distributed random variable Z such that:

E[Z] = E\!\left[\frac{X}{Y}\right] = E[X]\,E\!\left[\frac{1}{Y}\right] \approx \frac{E[X]}{E[Y]}

and

Var(Z) \approx (E[Z])^2 (1 - E[Z])^2 \cdot \left( \frac{Var(X)}{(E[X])^2} + \frac{Var(Y) - Var(X)}{(E[Y] - E[X])^2} + \frac{2\,Var(X)}{E[X]\,(E[Y] - E[X])} \right)
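Under the same moment-matching scheme, the conditioning-division can be sketched as below. The helper functions and the example parameters are illustrative; the variance expression transcribes the slide's approximation:

```python
def beta_moments(a, b):
    """Mean and variance of Beta(a, b)."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var

def moments_to_beta(mean, var):
    """Moment matching back to Beta parameters."""
    common = mean * (1 - mean) / var - 1
    return mean * common, (1 - mean) * common

def beta_conditioning_division(x, y):
    """⊘β: E[Z] ≈ E[X]/E[Y]; Var(Z) uses the slide's approximation."""
    mx, vx = beta_moments(*x)
    my, vy = beta_moments(*y)
    mz = mx / my
    vz = mz**2 * (1 - mz)**2 * (
        vx / mx**2
        + (vy - vx) / (my - mx) ** 2
        + 2 * vx / (mx * (my - mx))
    )
    return moments_to_beta(mz, vz)

# X labels q ∧ E=e and Y labels E=e, so E[X] < E[Y] by construction.
a, b = beta_conditioning_division((3, 7), (6, 4))
print(round(a, 2), round(b, 2))  # 2.5 2.5
```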
11. Summary of the main contribution
Sβ: a new aProbLog parametrisation with our newly defined operators ⊕β, ⊗β, and ⊘β
14. p1::stress(X) :- person(X).
...

⇓ 100 random choices for pX, e.g. p1 = 0.3

[Figure: ten Beta distributions fitted from Nins = 10, Nins = 50, and Nins = 100 samples of p1 (sample sets #1–#10 in each panel).]
15. State of the art*
A Beta-distributed random variable X ∼ Beta(α, β) is equivalent to a subjective logic opinion.

SSL is the aProbLog parametrisation that uses the operators ⊕SL, ⊗SL, and ⊘SL.*
*Jøsang, A. 2016. Subjective Logic: A Formalism for Reasoning Under Uncertainty. Springer
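The equivalence can be sketched via Jøsang's bijection between Beta parameters and binomial opinions. The prior weight W = 2 and the function name below follow the book's conventions but are my assumptions, not code from the paper:

```python
W = 2.0  # non-informative prior weight in Jøsang's convention

def beta_to_opinion(alpha, beta, base_rate=0.5):
    """Map Beta(alpha, beta) to a binomial opinion (belief, disbelief,
    uncertainty, base_rate) via evidence counts r = alpha - W*base_rate
    and s = beta - W*(1 - base_rate)."""
    r = alpha - W * base_rate
    s = beta - W * (1 - base_rate)
    u = W / (r + s + W)                  # uncertainty shrinks with evidence
    return r / (r + s + W), s / (r + s + W), u, base_rate

b, d, u, a = beta_to_opinion(7, 5)  # the Beta(7, 5) posterior from before
print(round(b, 3), round(d, 3), round(u, 3))  # 0.5 0.333 0.167
```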
16.

Nins            Sβ       SSL
10   Actual     0.1014   0.1514
     Predicted  0.1727   0.1178
50   Actual     0.0620   0.1123
     Predicted  0.0926   0.0815
100  Actual     0.0641   0.1253
     Predicted  0.1150   0.0893
RMSE for the queried variables in the Friends & Smokers program.
Best results for the actual RMSE highlighted.
18. EXPERIMENT 2: Sβ is as good as state-of-the-art approaches on Bayesian network benchmarks
19. [Figure: three Bayesian networks, Net1, Net2, and Net3, over the nodes A, B, C, D, E, F, G, H, and L, each compiled to a logic program:]

pA::a.
pB1::b :- a.
pB2::b :- \+a.
...

where pA = P(A), pB1 = P(B|A), pB2 = P(B|¬A), …
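The network-to-program translation can be sketched for a binary node: one annotated clause per row of the node's CPT, following the slide's pattern. The function and the example probabilities below are illustrative:

```python
def cpt_to_clauses(node, parents, cpt):
    """Emit ProbLog-style annotated clauses for a binary node.
    cpt maps tuples of parent truth values to P(node=true | parents)."""
    clauses = []
    for assignment, p in cpt.items():
        body = ", ".join(
            name if value else r"\+" + name
            for name, value in zip(parents, assignment)
        )
        clauses.append(f"{p}::{node} :- {body}." if body else f"{p}::{node}.")
    return clauses

# Root node a and child b with parent a (probabilities are made up).
print("\n".join(cpt_to_clauses("a", [], {(): 0.3})))
print("\n".join(cpt_to_clauses("b", ["a"], {(True,): 0.7, (False,): 0.2})))
# 0.3::a.
# 0.7::b :- a.
# 0.2::b :- \+a.
```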
20. State of the art

SBN: Subjective Bayesian Network*
• Bayesian network where the conditionals are subjective opinions instead of dogmatic probabilities
• Builds on top of Pearl's message-passing inference method

GBT: Belief Networks†
• Based on Dempster-Shafer theory
• Forward and backward propagation enabled via the generalized Bayes theorem (GBT)

Credal: Credal Network‡
• Replaces single probability values with closed intervals representing the possible range of probability values
• Extends Pearl's message-passing inference method

*Kaplan, L., and Ivanovska, M. 2018. Efficient belief propagation in second-order Bayesian networks for singly-connected graphs. International Journal of Approximate Reasoning 93:132–152.
†Smets, P. 1993. Belief functions: The disjunctive rule of combination and the generalized Bayesian theorem. International Journal of Approximate Reasoning 9:1–35.
‡Zaffalon, M., and Fagiuoli, E. 1998. 2U: An exact interval propagation algorithm for polytrees with binary variables. Artificial Intelligence 106(1):77–107.
21.

     Nins            Sβ       SSL      SBN      GBT      Credal
Net1 10   Actual     0.1505   0.2078   0.1505   0.1530   0.1631
          Predicted  0.1994   0.1562   0.1470   0.0868   0.2009
     50   Actual     0.0555   0.0895   0.0555   0.0619   0.0553
          Predicted  0.0950   0.0579   0.0563   0.0261   0.0761
     100  Actual     0.0766   0.1182   0.0766   0.0795   0.0771
          Predicted  0.1280   0.0772   0.0763   0.0373   0.1028
Net2 10   Actual     0.1387   0.2089   0.1387   0.1416   0.1459
          Predicted  0.2031   0.1662   0.1391   0.1050   0.1849
     50   Actual     0.0537   0.0974   0.0537   0.0561   0.0528
          Predicted  0.1002   0.0671   0.0520   0.0342   0.0683
     100  Actual     0.0730   0.1229   0.0726   0.0752   0.0728
          Predicted  0.1380   0.0863   0.0725   0.0482   0.0949
Net3 …
RMSE for the queried variables in the various Bayesian networks (selection).
Best results for the actual RMSE highlighted.
22. [Figure: Actual Confidence versus Desired Confidence curves for Sβ, SSL, SBN, GBT, and Credal on Net1, with Nins = 10 and Nins = 50.]

Actual versus desired significance of bounds derived from the uncertainty for the various Bayesian networks (selection). Best is closest to the diagonal.
24. Reasoning about objects: attributes and relations (recap of the overview slide)
25. • We enabled aProbLog, a probabilistic logic programming approach, to reason in the presence of uncertain probabilities represented as Beta-distributed random variables
• The proposed operators outperform existing proposals for uncertain probabilities
• The proposed operators are as good as state-of-the-art approaches for uncertain probabilities in Bayesian networks, while being able to handle much more complex problems
27. • Provide a different characterisation of the variance in the conditioning operator
• Test the boundaries of our approximations to provide practitioners with pragmatic
assessments and assurances
• Introduce an expectation-maximisation (EM) algorithm for parameter learning