
# Applications of Probabilistic Logic to Materials Discovery: Solving problems with hard and soft constraints

Author: Marcelo Finger (http://www.ime.usp.br/~mfinger/)

Published in: Education, Technology

### Applications of Probabilistic Logic to Materials Discovery: Solving problems with hard and soft constraints

1. 1. Ignorance Motivation PSAT oPSAT Application Conclusion Applications of Probabilistic Logic to Materials Discovery Solving problems with hard and soft constraints Marcelo Finger Department of Computer Science Institute of Mathematics and Statistics University of Sao Paulo, Brazil Nov 2013 Marcelo Finger HardSoft & PSAT
2. 2. Topics: 1 What You Probably Don’t Know; 2 Pragmatic Motivation; 3 Probabilistic Satisfiability; 4 Optimizing Probability Distributions with oPSAT; 5 oPSAT and Combinatorial Materials Discovery; 6 Conclusions
3. 3. Next Topic: 1 What You Probably Don’t Know
9. 9. My Anguish: How to Lead My Career? Basic Research. Advantages: does not require a lot of resources; there is a community to talk to; can be beautiful. Problems: beauty is hard to achieve; boring, not interesting to most; attracts little money. Application-Oriented Research. Advantages: sexy, attracts visibility and money. Problems: very hard to obtain data (and money is a must!); easily obsolescent. No answer, but keep this in mind.
10. 10. What you most certainly don’t know. Combinatorial Chemistry: discovery of new products. Lots of experiments, each with different settings. Search for the right setting that produces an interesting product. Very useful for finding new medicines, paints, adhesives, etc.
12. 12. Combinatorial Materials Science. Search for new materials. Aim: new catalysts for the purification of water and air. Buzz word: Sustainability. This work was developed within the Institute of Computational Sustainability at the Department of Computer Science, Cornell University.
13. 13. The Materials Discovery Problem: in a reaction, where are the materials formed?
16. 16. The Crazy Idea: use Probabilistic Logic to solve this problem. Consider only peaks. Each peak is in a layer. Connect all peaks in the same layer with a graph. Model graphs with propositional logic. Each disconnected peak is a defect. Different models, different graphs, different defects. Limit the probability of the occurrence of defects.
18. 18. Challenges: Only for the tough. Logic; Probability; Linear Algebra; Linear Programming; Optimization Algorithms.
19. 19. Student Profile
20. 20. Next Topic: 2 Pragmatic Motivation
21. 21. Pragmatic Motivation. Practical problems combine real-world hard constraints with soft constraints. Soft constraints: preferences, uncertainties, flexible requirements. We explore probabilistic logic as a means of dealing with combined soft and hard constraints.
22. 22. Goals. Aim: combine logical and probabilistic reasoning to deal with hard (L) and soft (P) constraints. Method: develop optimized Probabilistic Satisfiability (oPSAT). Application: demonstrate effectiveness on a real-world reasoning task in the domain of Materials Discovery.
24. 24. An Example: summer course enrollment. There are m students and k summer courses, and potential teammates to develop coursework. Hard constraints: coursework is to be done alone or in pairs; students must enroll in at least one and at most three courses; there is a limit of ℓ students per course. Soft constraint: avoid having students with no teammate. In our framework: P(student with no teammate) is “minimal” or bounded.
27. 27. Combining Logic and Probability. Many proposals in the literature: Markov Logic Networks [Richardson & Domingos 2006]; Probabilistic Inductive Logic Programming [De Raedt et al. 2008]; Relational Models [Friedman et al. 1999]; etc. Our choice: Probabilistic Satisfiability (PSAT). It is a natural extension of Boolean logic; it has desirable properties, e.g. it respects the Kolmogorov axioms; it gives probabilistic reasoning free of independence presuppositions. What is PSAT?
28. 28. Next Topic: 3 Probabilistic Satisfiability
29. 29. Is PSAT a Zombie Idea? An idea that refuses to die!
30. 30. A Brief History of PSAT. Proposed by [Boole 1854], On the Laws of Thought. Rediscovered several times since Boole: De Finetti [1937, 1974], Good [1950], Smith [1961]. Studied by Hailperin [1965]. Nilsson [1986] (re)introduces PSAT to AI. PSAT is NP-complete [Georgakopoulos et al. 1988]. Nilsson [1993]: “complete impracticability” of PSAT computation. Many other works; see Hansen & Jaumard [2000].
31. 31. Or a Wild Amazonian Flower? Awaits special conditions to bloom! (Linear programming + SAT-based techniques)
34. 34. The Setting. Formulas $\alpha_1, \ldots, \alpha_\ell$ over logical variables $P = \{x_1, \ldots, x_n\}$. Propositional valuation $v : P \to \{0, 1\}$. A probability distribution over propositional valuations, $\pi : V \to [0, 1]$, with $\sum_{i=1}^{2^n} \pi(v_i) = 1$. Probability of a formula $\alpha$ according to $\pi$: $P_\pi(\alpha) = \sum \{\pi(v_i) \mid v_i(\alpha) = 1\}$.
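The definition of $P_\pi(\alpha)$ can be made concrete by brute-force enumeration of all $2^n$ valuations. The sketch below is only an illustration: the function name `formula_probability`, the two-variable distribution, and the representation of formulas as Python predicates are assumptions introduced here, not part of the slides.

```python
from fractions import Fraction
from itertools import product


def formula_probability(variables, pi, formula):
    """Sum pi(v) over every valuation v that satisfies `formula`.

    `pi` maps tuples of 0/1 values (one per variable, in order) to
    probabilities; valuations missing from `pi` count as probability 0.
    """
    total = Fraction(0)
    for bits in product((0, 1), repeat=len(variables)):
        valuation = dict(zip(variables, bits))
        if formula(valuation):
            total += pi.get(bits, Fraction(0))
    return total


# Hypothetical distribution over the 2^2 valuations of (x1, x2).
variables = ("x1", "x2")
pi = {
    (0, 0): Fraction(1, 10),
    (0, 1): Fraction(2, 10),
    (1, 0): Fraction(3, 10),
    (1, 1): Fraction(4, 10),
}

p_or = formula_probability(variables, pi, lambda v: v["x1"] or v["x2"])
p_and = formula_probability(variables, pi, lambda v: v["x1"] and v["x2"])
print("P(x1 or x2)  =", p_or)
print("P(x1 and x2) =", p_and)
```

Exact fractions are used so that the sums are not blurred by floating-point rounding. Enumeration is exponential in $n$, so this is only usable for toy cases.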
36. 36. The PSAT Problem. Consider $\ell$ formulas $\alpha_1, \ldots, \alpha_\ell$ defined on $n$ atoms $\{x_1, \ldots, x_n\}$. A PSAT problem $\Sigma$ is a set of $\ell$ restrictions $\Sigma = \{P(\alpha_i) \bowtie p_i \mid 1 \le i \le \ell\}$, where $\bowtie$ is a comparison such as $=$ or $\le$. Probabilistic Satisfiability: is there a $\pi$ that satisfies $\Sigma$? In our framework, $\ell = m + k$ and $\Sigma = \Gamma \cup \Psi$. Hard: $\Gamma = \{\alpha_1, \ldots, \alpha_m\}$, with $P(\alpha_i) = 1$ (clauses). Soft: $\Psi = \{P(s_i) \le p_i \mid 1 \le i \le k\}$, with $s_i$ atomic and $p_i$ given, learned or minimized.
38. 38. Example continued. Only one course, three student enrollments: $x$, $y$ and $z$. Potential partnerships: $p_{xy}$ and $p_{xz}$, mutually exclusive. Hard constraint: $P(x \land y \land z \land \neg(p_{xy} \land p_{xz})) = 1$. Soft constraints: $P(x \land \neg p_{xy} \land \neg p_{xz}) \le 0.25$; $P(y \land \neg p_{xy}) \le 0.60$; $P(z \land \neg p_{xz}) \le 0.60$. (Small) solution: distribution $\pi$ with $\pi(x, y, z, \neg p_{xy}, \neg p_{xz}) = 0.1$, $\pi(x, y, z, \neg p_{xy}, p_{xz}) = 0.5$, $\pi(x, y, z, p_{xy}, \neg p_{xz}) = 0.4$, and $\pi(v) = 0$ for the other 29 valuations.
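The small solution above can be checked mechanically. The following sketch enumerates the 32 valuations of the five atoms and evaluates the hard and soft constraints under the stated distribution; the variable ordering and the helper name `prob` are choices made here for illustration.

```python
from fractions import Fraction
from itertools import product

VARS = ("x", "y", "z", "pxy", "pxz")

# Distribution from the example; valuations not listed have probability 0.
pi = {
    (1, 1, 1, 0, 0): Fraction(1, 10),
    (1, 1, 1, 0, 1): Fraction(5, 10),
    (1, 1, 1, 1, 0): Fraction(4, 10),
}


def prob(formula):
    """Probability of `formula`: total mass of the valuations satisfying it."""
    total = Fraction(0)
    for bits in product((0, 1), repeat=len(VARS)):
        v = dict(zip(VARS, bits))
        if formula(v):
            total += pi.get(bits, Fraction(0))
    return total


hard = prob(lambda v: v["x"] and v["y"] and v["z"] and not (v["pxy"] and v["pxz"]))
soft_x = prob(lambda v: v["x"] and not v["pxy"] and not v["pxz"])
soft_y = prob(lambda v: v["y"] and not v["pxy"])
soft_z = prob(lambda v: v["z"] and not v["pxz"])
zero_valuations = 2 ** len(VARS) - len(pi)

print("hard constraint:", hard)
print("x without teammate:", soft_x, "(bound 1/4)")
print("y without teammate:", soft_y, "(bound 3/5)")
print("z without teammate:", soft_z, "(bound 3/5)")
print("valuations with probability 0:", zero_valuations)
```

The hard constraint gets probability 1, and the three soft probabilities come out as 1/10, 3/5 and 1/2, all within their bounds.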
39. 39. Solving PSAT. Linear algebraic formulation: an optimization problem with exponentially many columns. Columns are not explicitly represented, but provided by a column generation process. Goal: minimize the probability of inconsistent columns. A SAT solver is employed for column generation and to force a decrease in cost. The interface between logic and algebra is an algebraic constraint that can be seen as a logic formula.
40. 40. Example of PSAT solution. Add variables for each soft violation: $s_x$, $s_y$, $s_z$. $\Gamma = \{x,\ y,\ z,\ \neg p_{xy} \lor \neg p_{xz},\ (x \land \neg p_{xy} \land \neg p_{xz}) \to s_x,\ (y \land \neg p_{xy}) \to s_y,\ (z \land \neg p_{xz}) \to s_z\}$; $\Psi = \{P(s_x) = 0.25,\ P(s_y) = 0.6,\ P(s_z) = 0.6\}$. Iteration 0 (matrix rows: all-ones row, $s_x$, $s_y$, $s_z$): $\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 1 & 1 \end{bmatrix} \cdot \begin{bmatrix} 0.4 \\ 0 \\ 0.35 \\ 0.25 \end{bmatrix} = \begin{bmatrix} 1 \\ 0.25 \\ 0.60 \\ 0.60 \end{bmatrix}$; $\mathrm{cost}^{(0)} = 0.4$; $b^{(0)} = [1\ 0\ 1\ 0]'$: col 3.
41. 41. Example of PSAT solution (same $\Gamma$ and $\Psi$). Iteration 1: $\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} 0.05 \\ 0.35 \\ 0.35 \\ 0.25 \end{bmatrix} = \begin{bmatrix} 1 \\ 0.25 \\ 0.60 \\ 0.60 \end{bmatrix}$; $\mathrm{cost}^{(1)} = 0.05$; $b^{(1)} = [1\ 1\ 0\ 1]'$: col 1.
42. 42. Example of PSAT solution (same $\Gamma$ and $\Psi$). Iteration 2: $\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} 0.05 \\ 0.35 \\ 0.40 \\ 0.20 \end{bmatrix} = \begin{bmatrix} 1 \\ 0.25 \\ 0.60 \\ 0.60 \end{bmatrix}$; $\mathrm{cost}^{(2)} = 0$.
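The three iterations can be re-checked numerically. The sketch below recomputes the matrix-vector products and the cost (total probability of inconsistent columns) for each iteration. The function `consistent` is a brute-force stand-in for the SAT solver call, written for this example only; it is not the author's implementation. The columns are transcribed from the slides' matrices as read here.

```python
from fractions import Fraction as F
from itertools import product


def consistent(sx, sy, sz):
    """Brute-force stand-in for the SAT call: can (sx, sy, sz) hold under Gamma?

    Gamma forces x, y, z to be true, so only pxy and pxz remain free.
    """
    for pxy, pxz in product((0, 1), repeat=2):
        if pxy and pxz:
            continue  # violates (not pxy) or (not pxz)
        if not pxy and not pxz and not sx:
            continue  # violates (x and not pxy and not pxz) -> sx
        if not pxy and not sy:
            continue  # violates (y and not pxy) -> sy
        if not pxz and not sz:
            continue  # violates (z and not pxz) -> sz
        return True
    return False


# Right-hand side: normalization, P(sx), P(sy), P(sz).
p = [F(1), F(25, 100), F(60, 100), F(60, 100)]

# Each entry pairs a column (1, sx, sy, sz) with its probability.
iterations = [
    [((1, 0, 0, 0), F(40, 100)), ((1, 0, 0, 1), F(0)),
     ((1, 0, 1, 1), F(35, 100)), ((1, 1, 1, 1), F(25, 100))],
    [((1, 0, 0, 0), F(5, 100)), ((1, 0, 0, 1), F(35, 100)),
     ((1, 0, 1, 0), F(35, 100)), ((1, 1, 1, 1), F(25, 100))],
    [((1, 1, 0, 1), F(5, 100)), ((1, 0, 0, 1), F(35, 100)),
     ((1, 0, 1, 0), F(40, 100)), ((1, 1, 1, 1), F(20, 100))],
]


def check(columns):
    """Return (does A*pi equal p?, total probability of inconsistent columns)."""
    lhs = [sum(col[r] * w for col, w in columns) for r in range(4)]
    cost = sum(w for col, w in columns if not consistent(*col[1:]))
    return lhs == p, cost


results = [check(cols) for cols in iterations]
for i, (ok, cost) in enumerate(results):
    print(f"iteration {i}: A*pi == p: {ok}, cost = {cost}")
```

Only the column $(s_x, s_y, s_z) = (0, 0, 0)$ is inconsistent with $\Gamma$, which is why the cost drops to 0 once that column leaves the basis in iteration 2.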
43. 43. Next Topic: 4 Optimizing Probability Distributions with oPSAT
44. 44. Optimizing PSAT solutions. Solutions to PSAT are not unique. First optimization phase: determines whether the constraints are solvable. Second optimization phase: obtains a distribution with desirable properties, using a different objective (cost) function.
45. 45. Minimizing Expected Violations. Idea: minimize the expected number of soft constraints violated by each valuation, $S(v)$: $E(S) = \sum_{v_i \mid v_i(\Gamma) = 1} S(v_i)\,\pi(v_i)$. Theorem: every linear function of a model (valuation) has constant expected value for any PSAT solution. In particular, $E(S)$ is constant, so there is no point in minimizing it. No other linear expectation function is a candidate for minimization either.
46. 46. Minimizing Variance. Idea: penalize high $S$, minimize $E(S^2)$. Lemma: the distribution that minimizes $E(S^2)$ also minimizes the variance, $\mathrm{Var}(S) = E((S - E(S))^2)$; this follows because $\mathrm{Var}(S) = E(S^2) - E(S)^2$ and $E(S)$ is constant over PSAT solutions. oPSAT is a second-phase minimization whose objective function is $E(S^2)$. Problem: computing a SAT formula that decreases cost is harder than in PSAT, so oPSAT needs a more elaborate interface between logic and linear algebra.
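A small numeric illustration of why $E(S)$ is useless as an objective while $E(S^2)$ is not: the sketch below compares two distributions over $(s_x, s_y, s_z)$ that both satisfy $P(s_x) = 0.25$, $P(s_y) = 0.6$, $P(s_z) = 0.6$ from the enrollment example. The first is the final distribution of the PSAT example; the second, `alternative`, was built by hand for this illustration and is not claimed to be the oPSAT optimum.

```python
from fractions import Fraction as F


def moments(dist):
    """Return (E(S), E(S^2), Var(S)), where S = sx + sy + sz per valuation."""
    e_s = sum(sum(col) * w for col, w in dist.items())
    e_s2 = sum(sum(col) ** 2 * w for col, w in dist.items())
    return e_s, e_s2, e_s2 - e_s ** 2


def marginals(dist):
    """Return [P(sx), P(sy), P(sz)] under the distribution."""
    return [sum(w for col, w in dist.items() if col[i]) for i in range(3)]


# Final distribution of the PSAT example, keyed by (sx, sy, sz).
psat_solution = {
    (1, 0, 1): F(5, 100),
    (0, 0, 1): F(35, 100),
    (0, 1, 0): F(40, 100),
    (1, 1, 1): F(20, 100),
}

# Hypothetical second solution with the same marginals (hand-built here).
alternative = {
    (0, 0, 1): F(275, 1000),
    (0, 1, 0): F(275, 1000),
    (1, 0, 1): F(125, 1000),
    (1, 1, 0): F(125, 1000),
    (0, 1, 1): F(200, 1000),
}

for name, dist in (("psat_solution", psat_solution), ("alternative", alternative)):
    e_s, e_s2, var = moments(dist)
    print(f"{name}: marginals={marginals(dist)}, E(S)={e_s}, E(S^2)={e_s2}, Var={var}")
```

Both distributions give the same $E(S)$, as the theorem predicts, since $E(S)$ is just the sum of the three fixed marginals. Their $E(S^2)$ values differ, and the variance differs by exactly the same amount.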
47. 47. Next Topic: 5 oPSAT and Combinatorial Materials Discovery
48. 48. Problem Definition
49. 49. Problem Modeling. Find an association of peaks to phases, respecting structural constraints, such that the probability of defect is limited.
50. 50. Modeling Process. Logic modeling: identify the structural elements of the problem; identify the {0,1}-variables of the problem; formulas encode the constraints on relationships between variables; identify the possible defects. Limit defects with “acceptable” upper probabilities. May require iteration and/or learning.
51. 51. Implementation. Implemented in C++. The linear solver uses BLAS and LAPACK. SAT solver: MiniSat. The PSAT formula (DIMACS extension) is generated by a C++ formula generator. $P(d_p) \le 2\% \implies P(d_i) \le 1 - (1 - P(d_p))^{L_i}$. Input: peaks at sample point. Output: oPSAT most probable model. Compare with SMT implementation in SAT 2012. psat.sourceforge.net
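The defect bound on the implementation slide is easy to tabulate. The sketch below assumes one particular reading that the slide does not spell out: $d_p$ is a defect at a single peak, $d_i$ is a defect somewhere in phase $i$, and $L_i$ is the number of peaks in that phase. The function name and the sample values of $L_i$ are illustrative only.

```python
def phase_defect_bound(p_peak_defect, num_peaks):
    """Upper bound 1 - (1 - P(d_p))^L_i on the defect probability of a phase.

    Assumes (a reading of the slide, not stated in it) that `p_peak_defect`
    bounds the per-peak defect probability and `num_peaks` is L_i.
    """
    return 1.0 - (1.0 - p_peak_defect) ** num_peaks


# Illustrative values of L_i with the 2% per-peak bound from the slide.
bounds = {L: phase_defect_bound(0.02, L) for L in (1, 2, 6, 8, 10)}
for L, bound in bounds.items():
    print(f"L_i = {L:2d}: P(d_i) <= {bound:.4f}")
```

The bound grows with $L_i$, so phases with more peaks are allowed a larger defect probability under the same per-peak limit.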
52. 52. Experimental Results

    | System | P | L* | K | #Peaks | SMT time (s) | oPSAT time (s) | oPSAT accuracy |
    |---|---|---|---|---|---|---|---|
    | Al/Li/Fe | 28 | 6 | 6 | 170 | 346 | 5.3 | 84.7% |
    | Al/Li/Fe | 28 | 8 | 6 | 424 | 10076 | 8.8 | 90.5% |
    | Al/Li/Fe | 28 | 10 | 6 | 530 | 28170 | 12.6 | 83.0% |
    | Al/Li/Fe | 45 | 7 | 6 | 651 | 18882 | 121.1 | 82.0% |
    | Al/Li/Fe | 45 | 8 | 6 | 744 | 46816 | 128.0 | 80.3% |

    The accuracy of SMT is 100%. Dataset columns: System, P, L*, K, #Peaks. P: number of sample points; L*: the average number of peaks per phase; K: number of basis patterns; #Peaks: overall number of peaks. #Aux variables > 10 000.
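To put the run times in perspective, the speedup of oPSAT over SMT can be computed directly from the reported figures. The sketch below only does arithmetic on the numbers in the results slide, as read from the flattened table; the tuple layout is a choice made here.

```python
# (P, L*, K, #Peaks, SMT seconds, oPSAT seconds, oPSAT accuracy in %)
rows = [
    (28, 6, 6, 170, 346, 5.3, 84.7),
    (28, 8, 6, 424, 10076, 8.8, 90.5),
    (28, 10, 6, 530, 28170, 12.6, 83.0),
    (45, 7, 6, 651, 18882, 121.1, 82.0),
    (45, 8, 6, 744, 46816, 128.0, 80.3),
]

# Ratio of SMT time to oPSAT time for each dataset.
speedups = [smt / opsat for (_, _, _, _, smt, opsat, _) in rows]

for row, speedup in zip(rows, speedups):
    print(f"P={row[0]}, L*={row[1]}, #Peaks={row[3]}: "
          f"oPSAT is {speedup:.1f}x faster, accuracy {row[6]}%")
```

On these figures the speedup ranges from roughly 65x on the smallest dataset to more than 2000x, at the price of accuracy between about 80% and 90% instead of SMT's 100%.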
53. 53. Next Topic: 6 Conclusions
54. 54. Conclusions and the Future. oPSAT can be effectively implemented to deal with hard and soft constraints. It can be successfully applied to non-trivial problems of materials discovery, with acceptable precision and run times superior to those of existing methods. Other forms of logic-probabilistic inference are under investigation.