Ignorance

Motivation

PSAT

oPSAT

Application

Conclusion

Applications of Probabilistic Logic to
Materials Discovery
Solving problems with hard and soft constraints
Marcelo Finger
Department of Computer Science
Institute of Mathematics and Statistics
University of Sao Paulo, Brazil

Nov 2013

Marcelo Finger
HardSoft & PSAT

Topics

1. What You Probably Don’t Know
2. Pragmatic Motivation
3. Probabilistic Satisfiability
4. Optimizing Probability Distributions with oPSAT
5. oPSAT and Combinatorial Materials Discovery
6. Conclusions

My Anguish
How to Lead My Career?

Basic Research
Advantages: does not require a lot of resources; there is a community to talk to; can be beautiful
Problems: beauty is hard to achieve; boring and uninteresting to most people; attracts little money

Application Oriented Research
Advantages: sexy; attracts visibility and money
Problems: data is very hard to obtain (and money is a must!); results quickly become obsolete

No answer, but keep this in mind

What you most certainly don’t know

Combinatorial Chemistry: discovery of new products
Lots of experiments
Each has different settings
Search for the right setting that produces an interesting product
Very useful for finding new medicines, paints, adhesives, etc.

Combinatorial Materials Science

Search for new materials.
Aim: new catalysts for the purification of water and air
Buzzword: Sustainability
This work was developed within the Institute for Computational Sustainability at the Department of Computer Science, Cornell University

The Materials Discovery Problem
In a reaction, where are the materials formed?

The Crazy Idea

Use Probabilistic Logic to solve this problem
Consider only peaks
Each peak is in a layer
Connect all peaks in the same layer with a graph
Model graphs with propositional logic
Each disconnected peak is a defect
Different models, different graphs, different defects
Limit the probability of the occurrence of defects

Challenges: Only for the tough

Logic
Probability
Linear Algebra
Linear Programming
Optimization
Algorithms

Student Profile

Pragmatic Motivation

Practical problems combine real-world hard constraints with
soft constraints
Soft constraints: preferences, uncertainties, flexible
requirements
We explore probabilistic logic as a means of dealing with
combined soft and hard constraints

Goals

Aim: Combine Logic and Probabilistic reasoning to deal
with Hard (L) and Soft (P) constraints
Method: develop optimized Probabilistic Satisfiability
(oPSAT)
Application: Demonstrate effectiveness on a real-world
reasoning task in the domain of Materials Discovery.

An Example

Summer course enrollment
m students and k summer courses.
Potential teammates, to develop coursework.
Constraints:
Hard: Coursework to be done alone or in pairs.
Hard: Students must enroll in at least one and at most three courses.
Hard: There is a limit of ℓ students per course.
Soft: Avoid having students with no teammate.
In our framework: P(student with no teammate) “minimal” or bounded

Combining Logic and Probability

Many proposals in the literature
Markov Logic Networks [Richardson & Domingos 2006]
Probabilistic Inductive Logic Programming [De Raedt et al. 2008]
Relational Models [Friedman et al. 1999], etc.

Our choice: Probabilistic Satisfiability (PSAT)
Natural extension of Boolean Logic
Desirable properties, e.g. respects Kolmogorov axioms
Probabilistic reasoning free of independence presuppositions

What is PSAT?

Is PSAT a Zombie Idea?

An idea that refuses to die!

A Brief History of PSAT

Proposed by [Boole 1854], An Investigation of the Laws of Thought
Rediscovered several times since Boole
De Finetti [1937, 1974], Good [1950], Smith [1961]
Studied by Hailperin [1965]
Nilsson [1986] (re)introduces PSAT to AI
PSAT is NP-complete [Georgakopoulos et al. 1988]
Nilsson [1993]: “complete impracticability” of PSAT computation
Many other works; see Hansen & Jaumard [2000]

Or a Wild Amazonian Flower?

Awaits special conditions to bloom!
(Linear programming + SAT-based techniques)

The Setting
Formulas α1, ..., αℓ over logical variables P = {x1, ..., xn}
Propositional valuation v : P → {0, 1}
A probability distribution over propositional valuations
$\pi : V \to [0, 1]$, with $\sum_{i=1}^{2^n} \pi(v_i) = 1$
Probability of a formula α according to π
$P_\pi(\alpha) = \sum \{\pi(v_i) \mid v_i(\alpha) = 1\}$

The PSAT Problem
Consider ℓ formulas α1, ..., αℓ defined on n atoms {x1, ..., xn}
A PSAT problem Σ is a set of ℓ restrictions
$\Sigma = \{P(\alpha_i) \bowtie_i p_i \mid 1 \le i \le \ell\}$, where each $\bowtie_i$ is a comparison such as $=$ or $\le$
Probabilistic Satisfiability: is there a π that satisfies Σ?
In our framework, ℓ = m + k, Σ = Γ ∪ Ψ:
Hard: Γ = {α1, ..., αm}, P(αi) = 1 (clauses)
Soft: Ψ = {P(si) ≤ pi | 1 ≤ i ≤ k}
si atomic; pi given, learned or minimized

Example continued
Only one course, three student enrollments: x, y and z
Potential partnerships: pxy and pxz, mutually exclusive.
Hard constraint
P(x ∧ y ∧ z ∧ ¬(pxy ∧ pxz)) = 1
Soft constraints
P(x ∧ ¬pxy ∧ ¬pxz) ≤ 0.25
P(y ∧ ¬pxy) ≤ 0.60
P(z ∧ ¬pxz) ≤ 0.60
(Small) solution: distribution π
π(x, y, z, ¬pxy, ¬pxz) = 0.1
π(x, y, z, ¬pxy, pxz) = 0.5
π(x, y, z, pxy, ¬pxz) = 0.4
π(v) = 0 for other 29 valuations
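The small solution can be checked mechanically. The sketch below is only an illustration: the tuple encoding of valuations and the helper names `prob`, `hard`, `soft` and `is_solution` are assumptions made here, not part of the oPSAT implementation. It sums the mass of the valuations that satisfy each formula, exactly as in the definition of Pπ(α).

```python
from itertools import product

# Variable order inside a valuation tuple: (x, y, z, pxy, pxz).
NAMES = ("x", "y", "z", "pxy", "pxz")

# The three valuations with non-zero mass; every other valuation has mass 0.
pi = {
    (1, 1, 1, 0, 0): 0.1,
    (1, 1, 1, 0, 1): 0.5,
    (1, 1, 1, 1, 0): 0.4,
}

def prob(formula, dist):
    """P(formula): total mass of the valuations that satisfy the formula."""
    return sum(mass for v, mass in dist.items() if formula(*v))

def hard(x, y, z, pxy, pxz):
    """x ∧ y ∧ z ∧ ¬(pxy ∧ pxz)"""
    return bool(x and y and z and not (pxy and pxz))

# Each soft constraint: (formula, upper bound on its probability).
soft = {
    "x alone": (lambda x, y, z, pxy, pxz: x and not pxy and not pxz, 0.25),
    "y alone": (lambda x, y, z, pxy, pxz: y and not pxy, 0.60),
    "z alone": (lambda x, y, z, pxy, pxz: z and not pxz, 0.60),
}

def is_solution(dist, eps=1e-9):
    """True if dist sums to 1, gives the hard constraint probability 1,
    and respects every soft upper bound."""
    total_ok = abs(sum(dist.values()) - 1.0) < eps
    hard_ok = abs(prob(hard, dist) - 1.0) < eps
    soft_ok = all(prob(f, dist) <= bound + eps for f, bound in soft.values())
    return total_ok and hard_ok and soft_ok

n_valuations = len(list(product((0, 1), repeat=len(NAMES))))
print("zero-mass valuations:", n_valuations - len(pi))
print("is a PSAT solution:", is_solution(pi))
```

With five atoms there are 32 valuations, so 29 of them carry no mass, and the three soft probabilities come out as 0.1, 0.6 and 0.5, all within their bounds.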

Solving PSAT

Linear algebraic formulation
Optimization problem with exponential columns
Columns are not explicitly represented, but provided by a
column generation process
Goal: minimize the probability of inconsistent columns
A SAT solver is employed for column generation and to force
a decrease in cost
Interface between logic and algebra is an algebraic constraint
that can be seen as a logic formula
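One way to write down the linear algebraic formulation described above is sketched here. The symbols A, c and p are assumptions introduced for this illustration, and the sketch covers only constraints stated as equalities; inequalities would need slack variables.

```latex
\begin{aligned}
\min_{\pi}\quad & c^{\top}\pi \\
\text{subject to}\quad & A\,\pi = p, \qquad \pi \ge 0, \\
& A_{\cdot j} = [\,1,\ a_{1j},\ \dots,\ a_{\ell j}\,]^{\top},
  \qquad a_{ij} \in \{0, 1\}, \\
& p = [\,1,\ p_1,\ \dots,\ p_\ell\,]^{\top}, \\
& c_j =
  \begin{cases}
    0 & \text{if some valuation } v \text{ has } v(\alpha_i) = a_{ij}
        \text{ for all } i, \\
    1 & \text{otherwise (column } j \text{ is inconsistent).}
  \end{cases}
\end{aligned}
```

Under this reading, the first row of A forces the probabilities to sum to 1, the objective is the total probability placed on inconsistent columns, and the PSAT instance is satisfiable exactly when the minimum is 0. There are 2^ℓ candidate columns, which is why they are generated on demand rather than listed.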

Example of PSAT solution
Add variables for each soft violation: sx, sy, sz.
Γ = { x, y, z, ¬pxy ∨ ¬pxz, (x ∧ ¬pxy ∧ ¬pxz) → sx, (y ∧ ¬pxy) → sy, (z ∧ ¬pxz) → sz }
Ψ = { P(sx) = 0.25, P(sy) = 0.6, P(sz) = 0.6 }
In each matrix the first row is all ones (the probabilities sum to 1) and rows 2 to 4 correspond to sx, sy, sz.

Iteration 0:
$$
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 1 & 1 \end{bmatrix}
\cdot
\begin{bmatrix} 0.4 \\ 0 \\ 0.35 \\ 0.25 \end{bmatrix}
=
\begin{bmatrix} 1 \\ 0.25 \\ 0.60 \\ 0.60 \end{bmatrix}
$$
$\mathrm{cost}^{(0)} = 0.4$, $b^{(0)} = [1\ 0\ 1\ 0]'$ : col 3

Iteration 1:
$$
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \end{bmatrix}
\cdot
\begin{bmatrix} 0.05 \\ 0.35 \\ 0.35 \\ 0.25 \end{bmatrix}
=
\begin{bmatrix} 1 \\ 0.25 \\ 0.60 \\ 0.60 \end{bmatrix}
$$
$\mathrm{cost}^{(1)} = 0.05$, $b^{(1)} = [1\ 1\ 0\ 1]'$ : col 1

Iteration 2:
$$
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \end{bmatrix}
\cdot
\begin{bmatrix} 0.05 \\ 0.35 \\ 0.40 \\ 0.20 \end{bmatrix}
=
\begin{bmatrix} 1 \\ 0.25 \\ 0.60 \\ 0.60 \end{bmatrix}
$$
$\mathrm{cost}^{(2)} = 0$
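The three iterations of this example can be replayed in a few lines. The sketch below is an illustration under stated assumptions: the columns and probability vectors are the ones read from the slide, the function `consistent` is a brute-force stand-in for the SAT solver call (it simply tries all four values of pxy and pxz), and the names `product_and_cost` and `iterations` are invented here. It checks that each basis reproduces the right-hand side [1, 0.25, 0.60, 0.60] and recomputes the cost as the mass sitting on columns that are inconsistent with Γ.

```python
from itertools import product

def consistent(sx, sy, sz):
    """Brute-force stand-in for the SAT call: is there a choice of
    (pxy, pxz) that satisfies Gamma together with these s-values?
    Gamma forces x, y and z to be true, so they are treated as 1."""
    for pxy, pxz in product((0, 1), repeat=2):
        if pxy and pxz:                        # violates ¬pxy ∨ ¬pxz
            continue
        if (not pxy and not pxz) and not sx:   # (x ∧ ¬pxy ∧ ¬pxz) → sx
            continue
        if not pxy and not sy:                 # (y ∧ ¬pxy) → sy
            continue
        if not pxz and not sz:                 # (z ∧ ¬pxz) → sz
            continue
        return True
    return False

# Each column is (1, sx, sy, sz); each entry pairs a basis with its pi.
iterations = [
    ([(1, 0, 0, 0), (1, 0, 0, 1), (1, 0, 1, 1), (1, 1, 1, 1)],
     [0.40, 0.00, 0.35, 0.25]),
    ([(1, 0, 0, 0), (1, 0, 0, 1), (1, 0, 1, 0), (1, 1, 1, 1)],
     [0.05, 0.35, 0.35, 0.25]),
    ([(1, 1, 0, 1), (1, 0, 0, 1), (1, 0, 1, 0), (1, 1, 1, 1)],
     [0.05, 0.35, 0.40, 0.20]),
]
target = [1.0, 0.25, 0.60, 0.60]

def product_and_cost(columns, pi):
    """Return (A @ pi, cost), where cost is the probability mass placed
    on columns that no valuation satisfying Gamma can realize."""
    rhs = [sum(col[r] * w for col, w in zip(columns, pi)) for r in range(4)]
    cost = sum(w for col, w in zip(columns, pi) if not consistent(*col[1:]))
    return rhs, cost

for k, (columns, pi) in enumerate(iterations):
    rhs, cost = product_and_cost(columns, pi)
    assert all(abs(a - b) < 1e-9 for a, b in zip(rhs, target))
    print(f"iteration {k}: cost = {cost:.2f}")
```

The only inconsistent column in these bases is the one with sx = sy = sz = 0: avoiding both sy and sz would require pxy and pxz to hold together, which Γ forbids. Its mass drops from 0.40 to 0.05 and the column finally leaves the basis, giving cost 0.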

Optimizing PSAT solutions

Solutions to PSAT are not unique
First optimization phase: determines if constraints are solvable
Second optimization phase to obtain a distribution with
desirable properties.
A different objective (cost) function

Minimizing Expected Violations

Idea: minimize the expected number of soft constraints violated by each valuation, S(v)
$E(S) = \sum_{v_i \mid v_i(\Gamma) = 1} S(v_i)\,\pi(v_i)$
Theorem: Every linear function of a model (valuation) has constant expected value for any PSAT solution
In particular, E(S) is constant, so there is no point in minimizing it
No other linear expectation function is a candidate for minimization either
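For the particular function S, the reason is short. The derivation below is a sketch that assumes the soft constraints are stated as equalities P(si) = pi over k violation atoms, as in the worked example with sx, sy and sz; writing S as a sum of the si is also an assumption of this sketch.

```latex
\begin{aligned}
S(v) &= \sum_{i=1}^{k} v(s_i), \\
E(S) &= \sum_{v} S(v)\,\pi(v)
      = \sum_{i=1}^{k} \sum_{v} v(s_i)\,\pi(v)
      = \sum_{i=1}^{k} P_\pi(s_i)
      = \sum_{i=1}^{k} p_i .
\end{aligned}
```

The right-hand side does not depend on π, so every PSAT solution has the same E(S). In the worked example this gives E(S) = 0.25 + 0.6 + 0.6 = 1.45.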

Minimizing Variance

Idea: penalize high S, minimize $E(S^2)$
Lemma: The distribution that minimizes $E(S^2)$ also minimizes variance, $\mathrm{Var}(S) = E((S - E(S))^2)$
oPSAT is a second-phase minimization whose objective function is $E(S^2)$
Problem: computing a SAT formula that decreases cost is harder than in PSAT
oPSAT needs a more elaborate interface between logic and linear algebra

Problem Definition

Problem Modeling

Find an association of peaks to phases, respecting structural
constraints, such that the probability of defect is limited.

Modeling Process

Logic modeling
Identify the structural elements of the problem
Identify the {0,1}-variables of the problem
Formulas encode the constraints on relationships between
variables
Identify the possible defects
Limit defects with “acceptable” upper probabilities. May
require iteration and/or learning

Implementation
Implemented in C++
Linear solver uses BLAS and LAPACK
SAT solver: MiniSat
PSAT formula (DIMACS extension) generated by a C++ formula generator
$P(d_p) \le 2\% \implies P(d_i) \le 1 - (1 - P(d_p))^{L_i}$
Input: Peaks at sample point
Output: oPSAT most probable model
Compare with SMT implementation in SAT 2012
psat.sourceforge.net
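The defect bound above is easy to tabulate. The sketch below assumes one reading of the formula that the slide does not spell out: d_p is a defect at a single peak, d_i is a defect somewhere in phase i, and L_i is the number of peaks in that phase. The function name `defect_bound` and the sample values of L_i are illustrative choices, not part of the implementation.

```python
def defect_bound(p_peak: float, n_peaks: int) -> float:
    """Upper bound 1 - (1 - p_peak) ** n_peaks on the probability of a
    defect in a group of n_peaks peaks, given a per-peak bound p_peak."""
    if not 0.0 <= p_peak <= 1.0:
        raise ValueError("p_peak must be a probability")
    if n_peaks < 0:
        raise ValueError("n_peaks must be non-negative")
    return 1.0 - (1.0 - p_peak) ** n_peaks

# Per-peak bound of 2% and a few illustrative group sizes.
for n_peaks in (1, 6, 8, 10):
    print(n_peaks, round(defect_bound(0.02, n_peaks), 4))
```

With a 2% per-peak bound, a group of 6 peaks gets a bound of about 11.4%, so the group-level limits grow quickly with the number of peaks.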

Experimental Results

          Dataset                 SMT       oPSAT
System    P    L*   K    #Peaks   Time(s)   Time(s)   Accuracy
Al/Li/Fe  28    6   6       170       346       5.3      84.7%
Al/Li/Fe  28    8   6       424     10076       8.8      90.5%
Al/Li/Fe  28   10   6       530     28170      12.6      83.0%
Al/Li/Fe  45    7   6       651     18882     121.1      82.0%
Al/Li/Fe  45    8   6       744     46816     128.0      80.3%

The accuracy of SMT is 100%
P: number of sample points; L*: average number of peaks per phase
K: number of basis patterns; #Peaks: overall number of peaks
#Aux variables > 10 000
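The trade-off in these results is easier to see as a ratio of run times. The sketch below assumes the column reading P, L*, K, #Peaks, SMT time, oPSAT time, oPSAT accuracy for the five Al/Li/Fe rows; the tuple layout and the function name `speedups` are choices made for this illustration.

```python
# (P, L*, K, n_peaks, smt_seconds, opsat_seconds, opsat_accuracy_percent)
rows = [
    (28, 6, 6, 170, 346, 5.3, 84.7),
    (28, 8, 6, 424, 10076, 8.8, 90.5),
    (28, 10, 6, 530, 28170, 12.6, 83.0),
    (45, 7, 6, 651, 18882, 121.1, 82.0),
    (45, 8, 6, 744, 46816, 128.0, 80.3),
]

def speedups(table):
    """Ratio of SMT time to oPSAT time for each row of the table."""
    return [smt / opsat for (_, _, _, _, smt, opsat, _) in table]

for row, ratio in zip(rows, speedups(rows)):
    n_peaks, accuracy = row[3], row[6]
    print(f"{n_peaks:4d} peaks: about {ratio:7.1f}x faster, "
          f"accuracy {accuracy:.1f}% (SMT: 100%)")
```

Under this reading, oPSAT is faster than SMT on every instance, by a factor ranging from roughly 65 to over 2000, while giving up between about 10 and 20 percentage points of accuracy.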

Conclusions and the Future

oPSAT can be effectively implemented to deal with hard and soft constraints
It can be successfully applied to non-trivial problems of materials discovery, with acceptable precision and run times superior to those of existing methods
Other forms of logic-probabilistic inference are under investigation
