2. Theorem proving: is it useful?
Math: the four color theorem states that, given any map, no more than
four colors are required to color the regions of the map so that no two
adjacent regions have the same color.
3. Theorem proving: is it useful?
Hardware verification: equivalence checking, Bounded Model
Checking
4. Theorem proving: is it useful?
Safety-critical systems: B-method, Prover Technology
6. Theorem proving: is it useful?
Cryptology :
• Cryptanalysis of Hash Functions
• Provable security: a full formal machine-checked
verification of a C program: the
OpenSSL implementation of SHA-256.
7. First implementation
• In 1954, Martin Davis programmed
Presburger's algorithm for
a JOHNNIAC vacuum tube computer at
the Institute for Advanced Study in Princeton.
• "Its great triumph was to prove that the sum
of two even numbers is even"
8. Related problem: proof verification
• Certify that an existing proof is valid
• Proof assistants require a human user to give
hints to the system.
– They act as proof checkers,
– Significant proof tasks can still be performed
automatically.
9. A. Propositional logic
• Why is it interesting?
– It is simple
– It is decidable
– It has practical and industrial applications
(equivalence checking, Bounded Model Checking)
– You can model a lot of things with it, e.g.
P(n) for all n = 0..255 becomes P(0) & … & P(255)
– It is of theoretical importance (P ?= NP)
10. Syntax
• P a countable alphabet of proposition
symbols
• R a finite set of connectives (¬, ∧, ∨)
• C a finite set of constants (false, true)
• A formula is:
– A proposition symbol
– A constant
– A ∧ B, A ∨ B, ¬A with A, B formulas
– A => B is used as shorthand for ¬A ∨ B
11. Semantic
• A formula F is a tautology if v(F) = true for all
valuations v
• A formula F is satisfiable if there exists a
valuation v with v(F) = true
12. Semantic
• v is a Boolean valuation if:
– It assigns a Boolean to each atom
– v(true) = true
– v(false) = false
– v(¬A) = ¬v(A)
– v(A ∧ B) = v(A) ∧ v(B)
– v(A ∨ B) = v(A) ∨ v(B)
13. So how do I find that a propositional
formula is a tautology?
• Truth table
• Propositional calculus
• (complete) SAT solvers
• Resolution
• Analytic tableaux
14. Truth table
• What is the problem?
A | ¬A | A ∨ ¬A
T | F  | T
F | T  | T

A | B | A => (B => A) = ¬A ∨ (¬B ∨ A)
T | T | T
F | T | T
T | F | T
F | F | T
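The truth table method can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the encoding of a formula as a Python function over a dictionary of Boolean values, and the names `is_tautology` and `implies`, are assumptions made for this example. The loop visits one valuation per row of the table, so a formula with n variables needs 2^n rows.

```python
from itertools import product

def is_tautology(formula, variables):
    """Return True if `formula` is true under every valuation of `variables`.

    Assumed encoding: `formula` is a function taking a dict
    {variable name: bool} and returning a bool.
    """
    # One truth table row per valuation: 2 ** len(variables) rows in total.
    for values in product([True, False], repeat=len(variables)):
        valuation = dict(zip(variables, values))
        if not formula(valuation):
            return False  # this row falsifies the formula
    return True

def implies(a, b):
    # Material implication: a => b is (not a) or b.
    return (not a) or b

excluded_middle = lambda v: v["A"] or not v["A"]                # A ∨ ¬A
axiom_1 = lambda v: implies(v["A"], implies(v["B"], v["A"]))    # A => (B => A)

print(is_tautology(excluded_middle, ["A"]))
print(is_tautology(axiom_1, ["A", "B"]))
```

Both calls report a tautology; the cost is what makes the method impractical beyond a few dozen variables.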
15. Propositional calculus
• A set of derivation rules,
– e.g. modus ponens: from A and A => B, derive B
• A set of axioms:
– Implication axioms
1. A => (B => A)
2. (A => (B => C)) => ((A => B) => (A => C))
• Example: proof of f => f
• Axiom 2 with A = f, B = (f => f), C = f:
(f => ((f => f) => f)) => ((f => (f => f)) => (f => f))
• Axiom 1: f => ((f => f) => f)
• (f => (f => f)) => (f => f) (modus ponens)
• Axiom 1: f => (f => f)
• f => f (modus ponens)
16. Additional axioms
• Negation axiom: reductio ad absurdum:
(¬A => B) => ((¬A => ¬B) => A)
• Variation:
– (A => False) => ¬A
– (¬A => False) => A
17. Additional axioms
• Conjunction axioms:
– A => (B => (A ∧ B))
– (A ∧ B) => A
– (A ∧ B) => B
• Disjunction axioms:
– A => (A ∨ B)
– B => (A ∨ B)
– (A ∨ B) => ((A => C) => ((B => C) => C))
18. Deduction and completeness theorems
• « B can be derived from A » is equivalent to:
« A => B can be derived from the axioms »
In other words
« A |- B  iff  |- A => B »
• Completeness and soundness:
|- A  iff  |= A
• What is the problem for theorem proving?
19. Convert to a SAT problem and use a
SAT solver (1)
• Satisfiability (SAT): determine if the variables of a
boolean formula can be assigned to make the
formula evaluate to True.
• Main idea:
– If ¬A is satisfiable, A is not a tautology
– If ¬A is not satisfiable, A is a tautology
20. Convert to a SAT problem and use a
SAT solver (2)
To check if formula A is a tautology:
• Convert ¬A to Conjunctive Normal Form
• Find out if ¬A is satisfiable
• If yes, A is not a tautology:
– there is a valuation v such that v(¬A) = True, so v(A) =
False
• If no, A is a tautology
21. Convert to a SAT problem and use a
SAT solver (3)
• The SAT problem is NP-complete: it was the first decision
problem proved to be NP-complete, in 1971
• Implication: any general SAT algorithm is probably
exponential in time in the worst case (unless P = NP)
• Question: why is this better than the truth table
method?
22. Conjunctive Normal Form (CNF)
• A is in CNF if it is a conjunction of clauses
• A clause is a disjunction of literals
• A literal is either an atom or the negation of
an atom
• Example: (A ∨ B) ∧ (B ∨ C ∨ D) ∧ (D ∨ E)
23. CNFs make life easier
• A clause is satisfied if at least one of its literals
is assigned to True
– e.g. (B ∨ C ∨ D)
• A clause cannot be satisfied if all its literals are
assigned to False
• A formula is satisfied if all its clauses are
satisfied
• A formula is unsatisfied if at least one of its
clauses is unsatisfied
24. Conversion to CNF
• Every propositional formula can be converted
into an equivalent CNF formula using De
Morgan's laws and the distributive law.
• However…
• The following formula
(A1 ∧ B1) ∨ (A2 ∧ B2) ∨ … ∨ (An ∧ Bn)
is converted into a formula with 2^n clauses!
25. Conversion into 2^n clauses: example
(x1 & x2) | (x3 & x4) | (x5 & x6) | (x7 & x8) in CNF:
(x1 V x3 V x5 V x7) &
(x2 V x3 V x5 V x7) &
(x1 V x4 V x5 V x7) &
(x2 V x4 V x5 V x7) &
(x1 V x3 V x6 V x7) &
(x2 V x3 V x6 V x7) &
(x1 V x4 V x6 V x7) &
(x2 V x4 V x6 V x7) &
(x1 V x3 V x5 V x8) &
(x2 V x3 V x5 V x8) &
(x1 V x4 V x5 V x8) &
(x2 V x4 V x5 V x8) &
(x1 V x3 V x6 V x8) &
(x2 V x3 V x6 V x8) &
(x1 V x4 V x6 V x8) &
(x2 V x4 V x6 V x8)
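The blow-up can be reproduced mechanically: distributing OR over AND amounts to picking one literal from each conjunction, in every possible way. The sketch below assumes a disjunction of conjunctions given as a list of tuples of literal names; `dnf_to_cnf` is a name chosen for this example.

```python
from itertools import product

def dnf_to_cnf(terms):
    """Distribute OR over AND: one clause per way of picking a literal in each term."""
    return [tuple(choice) for choice in product(*terms)]

terms = [("x1", "x2"), ("x3", "x4"), ("x5", "x6"), ("x7", "x8")]
clauses = dnf_to_cnf(terms)

print(len(clauses))   # 2**4 = 16 clauses
print(clauses[0])     # ('x1', 'x3', 'x5', 'x7')
```

With n conjunctions of two literals each, the result has 2^n clauses of n literals.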
26. Tseitin transformation
• T(A) is satisfiable iff A is satisfiable
• T linearly increases size of A.
• Based on the following transformations:
• Example:
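A minimal sketch of a Tseitin-style encoding, under stated assumptions: formulas are nested tuples (`("not", f)`, `("and", f, g)`, `("or", f, g)`) with strings as atoms, and clauses are lists of nonzero integers where `-n` is the negation of variable `n`. Each compound subformula gets a fresh variable x together with the clauses stating x <=> subformula, so the output grows linearly with the input.

```python
from itertools import count

def tseitin(formula):
    """Return (clauses, names): an equisatisfiable CNF and the atom numbering."""
    fresh = count(1)
    names = {}     # atom name -> variable number
    clauses = []

    def encode(f):
        # Return a literal equivalent to the subformula f.
        if isinstance(f, str):
            if f not in names:
                names[f] = next(fresh)
            return names[f]
        if f[0] == "not":
            return -encode(f[1])
        a, b = encode(f[1]), encode(f[2])
        x = next(fresh)                      # fresh variable standing for f
        if f[0] == "and":                    # x <=> (a ∧ b)
            clauses.extend([[-x, a], [-x, b], [x, -a, -b]])
        elif f[0] == "or":                   # x <=> (a ∨ b)
            clauses.extend([[x, -a], [x, -b], [-x, a, b]])
        else:
            raise ValueError(f"unknown connective {f[0]!r}")
        return x

    clauses.append([encode(formula)])        # the whole formula must hold
    return clauses, names

formula = ("or", ("and", "x1", "x2"), ("and", "x3", "x4"))
clauses, names = tseitin(formula)
print(len(clauses))   # 10 clauses: 3 per connective, plus the final unit clause
```

The result is equisatisfiable with the input, not equivalent to it, because of the added variables.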
27. Naive SAT-solver algorithm
• Choose a literal and assign it a boolean value
• Simplify and recursively check if the simplified
formula is satisfiable
– Yes: original formula is satisfiable
– No: same recursive check is done with opposite
Boolean value
• Simplification: removes all clauses which become
true and all literals that become false from the
remaining clauses.
• Python example there
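The steps above can be sketched as follows. This is an illustrative version, assuming clauses are lists of nonzero integers (`-n` is the negation of variable `n`); the names `simplify` and `naive_sat` are chosen for this example.

```python
def simplify(clauses, literal):
    """Assign `literal` to True: drop true clauses, remove false literals."""
    result = []
    for clause in clauses:
        if literal in clause:
            continue                                   # clause became true
        result.append([l for l in clause if l != -literal])
    return result

def naive_sat(clauses):
    """Return a list of true literals satisfying `clauses`, or None."""
    if not clauses:
        return []                                      # empty conjunction: satisfiable
    if any(len(clause) == 0 for clause in clauses):
        return None                                    # empty clause: unsatisfiable
    literal = clauses[0][0]                            # choose a literal
    for choice in (literal, -literal):                 # try a value, then the opposite
        model = naive_sat(simplify(clauses, choice))
        if model is not None:
            return [choice] + model
    return None

print(naive_sat([[1, 2], [-1, 2], [-2, 3]]))   # [1, 2, 3]
print(naive_sat([[1], [-1]]))                  # None
```

To test whether a formula A is a tautology, one would run `naive_sat` on the clauses of ¬A: the answer `None` means that A is a tautology.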
28. Empty disjunctions and conjunction
• Empty disjunction always false
– So empty clause always false
• Empty conjunction always true
Why?
– In general, a commutative, associative binary
operation applied to an empty set should yield the
identity element for that operation (False for
disjunction, True for conjunction).
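Python follows the same convention, which a short check makes concrete. The helper names `clause_value` and `cnf_value` are assumptions for this example; `reduce` starts from the identity element, so an empty input returns it unchanged.

```python
from functools import reduce
from operator import and_, or_

def clause_value(literal_values):
    # A clause is a disjunction: start from False, the identity of "or".
    return reduce(or_, literal_values, False)

def cnf_value(clause_values):
    # A CNF formula is a conjunction: start from True, the identity of "and".
    return reduce(and_, clause_values, True)

print(clause_value([]), any([]))   # False False
print(cnf_value([]), all([]))      # True True
```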
29. DPLL
• DPLL = Davis-Putnam-Logemann-Loveland
• Introduced in two articles in the early 60s
• Impressive engineering work made it efficient
over the years
31. General concept
• Start from a CNF formula
• Try to build an assignment that satisfies the
formula
• The assignment is built using an efficient
backtracking mechanism:
– Truth values are propagated to reduce the number
of future assignments
32. Davis-Putnam-Logemann-Loveland
(DPLL) - 1962
Improves the naive algorithm with
1. Unit propagation:
– If a clause is a unit clause (it contains a single
literal), it can only be satisfied by assigning the
value that makes this literal true. Thus, no choice
is necessary.
– In practice, this often leads to deterministic
cascades of units, avoiding a large part of the
naive search space.
33. Unit propagation
• (P ∨ Q ∨ ¬R) ∧ (P ∨ ¬Q) ∧ ¬P
• Apply unit propagation for ¬P: P = False
– (Q ∨ ¬R) ∧ ¬Q
• Apply unit propagation for ¬Q: Q = False
– ¬R
• R = False
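A minimal sketch of unit propagation, assuming clauses are lists of nonzero integers (`-n` is the negation of variable `n`) and using the hypothetical name `unit_propagate`. The example encodes (P ∨ Q ∨ ¬R) ∧ (P ∨ ¬Q) ∧ ¬P with P, Q, R numbered 1, 2, 3.

```python
def unit_propagate(clauses):
    """Assign unit clauses until none is left.

    Returns (simplified clauses, literals assigned to True).
    """
    clauses = [list(clause) for clause in clauses]
    assigned = []
    while True:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses, assigned
        assigned.append(unit)
        # Drop clauses satisfied by `unit`, remove its negation from the others.
        clauses = [[l for l in c if l != -unit] for c in clauses if unit not in c]

# P = 1, Q = 2, R = 3
remaining, assigned = unit_propagate([[1, 2, -3], [1, -2], [-1]])
print(remaining)   # []: every clause is satisfied
print(assigned)    # [-1, -2, -3]: P, Q and R are all False
```

An empty clause left in the result would signal a conflict; here the cascade satisfies the whole formula without any branching.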
34. Davis-Putnam-Logemann-Loveland
(DPLL) - 1962
2. Pure literal elimination:
– If a propositional variable occurs with only one
polarity in the formula, it can always be assigned in
a way that makes all clauses containing it true.
– These clauses do not constrain the search anymore
and can be deleted.
35. Pure literal elimination
L is a pure literal of f if it occurs with only one
polarity in f. L can always be assigned safely
• (P ∨ Q) ∧ (P ∨ ¬Q) ∧ (R ∨ Q) ∧ (R ∨ ¬Q)
• P always occurs with a positive polarity. We can
assign P to True and get
• (R ∨ Q) ∧ (R ∨ ¬Q)
• Then, in the same way, we assign R to True
• We get the empty conjunction: the formula is
satisfiable
37. Modern SAT Solvers
Conflict-driven:
• efficient conflict analysis,
• clause learning,
• non-chronological backtracking,
• "two-watched-literals" unit propagation,
• adaptive branching, and random restarts.
Essential for handling the large SAT instances in
Electronic Design Automation
38. Modern SAT Solvers
Look-ahead:
• strengthened reductions (going beyond unit-
clause propagation) and heuristics
Generally stronger than conflict-driven solvers on
hard (small) instances.
39. SAT problem libraries
• Standardized input format (DIMACS CNF): a formula in CNF
becomes a text file whose header line is
p cnf NUMBER_OF_VARIABLES NUMBER_OF_CLAUSES
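In the DIMACS CNF format, the header is followed by one line per clause: variables are numbered from 1, a negative number stands for a negated variable, and each clause ends with 0. A minimal writer, with `to_dimacs` as a name chosen for this example:

```python
def to_dimacs(clauses):
    """Serialize clauses (lists of nonzero integers) as DIMACS CNF text."""
    num_vars = max((abs(l) for clause in clauses for l in clause), default=0)
    lines = [f"p cnf {num_vars} {len(clauses)}"]
    for clause in clauses:
        # Each clause is terminated by 0.
        lines.append(" ".join(str(l) for l in [*clause, 0]))
    return "\n".join(lines)

# (x1 ∨ ¬x2) ∧ (x2 ∨ x3 ∨ ¬x1)
print(to_dimacs([[1, -2], [2, 3, -1]]))
# p cnf 3 2
# 1 -2 0
# 2 3 -1 0
```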
40. Some references
• SATLIB, SAT competition, SAT race
• A reference solver: MiniSat
• pycosat: Python bindings to PicoSAT
• http://logictools.org/ SAT solver running in your
browser
• Prover Technology (Stålmarck's method): a
commercial prover based on proprietary
technology
• Naive Python SAT implementation as a basis for
hands-on work
41. Hornsat
• A Horn clause is a clause with at most one
positive literal
E.g. A ∨ ¬p1 ∨ ¬p2 ∨ … ∨ ¬pn, i.e. (p1 ∧ p2 ∧ … ∧ pn) => A
• Finding a satisfying assignment for a conjunction of
(propositional) Horn clauses can be done in
linear time
• Basis of the Prolog programming language.
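Horn satisfiability can be decided by forward chaining: start with every atom False and set an atom to True only when a clause forces it. The sketch below is a simple quadratic version, not the linear-time algorithm, which additionally keeps a counter per clause. The encoding of a clause as a `(body, head)` pair is an assumption for this example.

```python
def horn_sat(clauses):
    """Decide a set of Horn clauses by forward chaining.

    Each clause is (body, head), read as (p1 ∧ … ∧ pn) => head.
    head None means the body implies False; an empty body is a fact.
    Returns the set of atoms that must be True, or None if unsatisfiable.
    """
    true_atoms = set()
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if set(body) <= true_atoms:        # all premises are already True
                if head is None:
                    return None                # a goal clause is violated
                if head not in true_atoms:
                    true_atoms.add(head)
                    changed = True
    return true_atoms

rules = [((), "a"), (("a",), "b"), (("a", "b"), "c"), (("d",), None)]
print(horn_sat(rules))                          # the atoms a, b and c
print(horn_sat([((), "a"), (("a",), None)]))    # None
```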
42. (propositional) resolution
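• It is a refutation technique (like DPLL) working on CNF
• One inference rule: from the clauses (a1 ∨ … ∨ an) and
(b1 ∨ … ∨ bm) with ai = ¬bj, derive the clause made of all
the literals of both clauses except ai and bj
Example: from (a ∨ b) and (¬b ∨ c), derive (a ∨ c)
• The resolution rule is applied to all possible pairs of
clauses (at each step, repeated literals are removed)
• If the derivation ends up with the empty clause, the negated
formula is unsatisfiable (it implies false)
• If the empty clause cannot be reached, the negated formula is
satisfiable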
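A naive saturation loop illustrates the procedure. It assumes clauses given as collections of nonzero integers (`-n` is the negation of variable `n`) and stores them as frozensets, so repeated literals disappear automatically. The function names are chosen for this example, and no optimization such as subsumption is attempted.

```python
from itertools import combinations

def resolvents(c1, c2):
    """All clauses obtained by resolving c1 and c2 on one complementary pair."""
    return [(c1 - {lit}) | (c2 - {-lit}) for lit in c1 if -lit in c2]

def resolution_unsat(clauses):
    """Saturate under resolution; True iff the empty clause is derived."""
    known = {frozenset(clause) for clause in clauses}
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for resolvent in resolvents(c1, c2):
                if not resolvent:
                    return True        # empty clause: the set is unsatisfiable
                new.add(resolvent)
        if new <= known:
            return False               # saturated without the empty clause
        known |= new

# (a ∨ ¬b) ∧ b ∧ ¬a with a = 1, b = 2
print(resolution_unsat([[1, -2], [2], [-1]]))   # True
print(resolution_unsat([[1, 2], [-1, 2]]))      # False
```

To prove a formula A, one would pass the clauses of ¬A: a `True` result means that A is a tautology.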
43. Analytic tableaux
• Refutation technique
• Formulas in nodes of the same branch are in
conjunction
• Different branches are in disjunction
• Popular for modal logic.
• Example: {(a⋁¬b)⋀b, ¬a} is not satisfiable
44. Model checking
• Model checking is a brute force semantic
approach
• Mainly focuses on temporal propositional logic
• Equivalence checking can also rely on a similar
approach, e.g. with BDDs
46. B. First order logic
• Why is it interesting?
– Expressive enough to express a wide range of
problems
– Mature subfield of theorem proving: even if it is
undecidable, there are numerous fully automated
systems
47. Syntax
• L = (P, F, C, V) is a first order language where
– P is a countable set of predicate symbols
– F is a countable set of function symbols
– C is a countable set of constant symbols
– V is a countable set of variable symbols
• A term for L is either:
– A constant or a variable
– f(t1, t2, …, tn) where the ti are terms and f a function
symbol of arity n
48. Formulas
• A formula is either
– An atom: P(t1, t2, …, tn) where the ti are terms and P a
predicate symbol of arity n
– R(P1, P2, …, Pn) where R is a connective of arity n and
the Pi are formulas. Connectives: ¬, ∧, ∨
– Qx.A where Q is a quantifier and A a formula.
Quantifiers: ∀, ∃
• Constants: True, False
• A free variable is a variable not under the scope
of a quantifier
49. Semantic
• A model M = (D, I) of a first order language is:
– D a non-empty set: the interpretation domain
– A function I (the interpretation) which maps
• Each constant to an element of D
• Each function symbol of arity n to a function D^n -> D
• Each predicate symbol of arity n to an n-ary relation on D
• A formula is evaluated given an interpretation and
a variable assignment
A variable assignment m for M = (D, I) is a function
which associates an element of D to each variable
51. Semantic (Cont.)
Valuation v of a formula with respect to a model M and a
variable assignment m
– v(P(t1,…, tn)) = true iff (v(t1), …, v(tn)) ∈ I(P)
– v(¬A) = ¬v(A)
– v(A ∧ B) = v(A) ∧ v(B),
– v(A ∨ B) = v(A) ∨ v(B)
– v(∀x. A) = true iff v(A) = true for all variable
assignments which differ from m at most on the mapping
of x
– v(∃x. A) = true iff v(A) = true for at least one variable
assignment which differs from m at most on the mapping of x
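When the domain D is finite, these clauses can be executed directly. The sketch below is only an illustration under assumed encodings: formulas and terms are nested tuples, strings are variables (when bound in the assignment `m`) or constants, and the interpretation is a dictionary mapping constants to elements, function symbols to Python functions and predicate symbols to sets of tuples. The example model (`zero`, `succ`, `Less` on {0, 1, 2}) is made up for the demonstration.

```python
def term_value(term, interp, m):
    """Value of a term: variables are read from m, constants from interp."""
    if isinstance(term, str):
        return m[term] if term in m else interp[term]
    f, *args = term
    return interp[f](*(term_value(t, interp, m) for t in args))

def holds(formula, domain, interp, m):
    """Truth value of `formula` in the model (domain, interp) under assignment m."""
    op = formula[0]
    if op == "not":
        return not holds(formula[1], domain, interp, m)
    if op == "and":
        return holds(formula[1], domain, interp, m) and holds(formula[2], domain, interp, m)
    if op == "or":
        return holds(formula[1], domain, interp, m) or holds(formula[2], domain, interp, m)
    if op in ("forall", "exists"):
        _, x, body = formula
        # Try every assignment that differs from m at most on x.
        results = (holds(body, domain, interp, {**m, x: d}) for d in domain)
        return all(results) if op == "forall" else any(results)
    # Atom P(t1, ..., tn): true iff the tuple of values belongs to I(P).
    pred, *args = formula
    return tuple(term_value(t, interp, m) for t in args) in interp[pred]

domain = {0, 1, 2}
interp = {
    "zero": 0,
    "succ": lambda n: (n + 1) % 3,
    "Less": {(0, 1), (0, 2), (1, 2)},
}
every_has_greater = ("forall", "x", ("exists", "y", ("Less", "x", "y")))
has_minimum = ("exists", "x", ("forall", "y", ("not", ("Less", "y", "x"))))

print(holds(every_has_greater, domain, interp, {}))   # False: nothing is above 2
print(holds(has_minimum, domain, interp, {}))         # True: nothing is below 0
```

This enumeration only works for finite domains; for infinite domains no such truth table exists, which is why calculi and refutation techniques are needed.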
52. Semantic (cont.)
• A formula F is true in a model if, for all
variable assignments, v(F) = true
• A formula is valid if it is true in all models
• A formula F is satisfiable in a model M if there
exists a variable assignment such that v(F) = true
53. Substitution
• Finite set of mappings from distinct variables to
terms: [x1 := t1, …, xn := tn]
• A substitution can be applied to a term or a formula by
simultaneously replacing each of the xi with
the ti, e.g.
A(f(x), g(y), c)[x := h(a, y), y := b] = A(f(h(a, y)), g(b), c)
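A short sketch of simultaneous substitution, assuming terms and atoms encoded as nested tuples `(symbol, arg1, …, argn)` with strings for variables and constants; `substitute` is a name chosen for this example. The replacement is done in one pass, so the y introduced by h(a, y) is not rewritten to b.

```python
def substitute(term, subst):
    """Apply `subst` (dict: variable -> term) simultaneously to `term`."""
    if isinstance(term, str):
        return subst.get(term, term)     # variable or constant
    head, *args = term
    return (head, *(substitute(arg, subst) for arg in args))

# A(f(x), g(y), c)[x := h(a, y), y := b]
atom = ("A", ("f", "x"), ("g", "y"), "c")
subst = {"x": ("h", "a", "y"), "y": "b"}

print(substitute(atom, subst))
# ('A', ('f', ('h', 'a', 'y')), ('g', 'b'), 'c'), i.e. A(f(h(a, y)), g(b), c)
```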
54. Calculus
• Propositional calculus
• ∀x. A => A[x:=t]
• ∀x.(A => B) => (∀x. A => ∀x. B)
• A => ∀x. A where x is not a free variable of A
• ∀x. (A => ∃y.A[x:=y])
• ∀x.(A => B) => ∃x. (A => B)
55. Deduction and completeness theorems
• « B can be derived from A » is equivalent to
– A => B can be derived from the axioms
• Said in other words
– A |- B  iff  |- A => B
• Completeness and soundness:
– |- A  iff  |= A
56. Theorem proving methods
• No truth table
• Calculus, eg Sequent calculus
• Refutation techniques: resolution
58. Algorithms based on cut-elimination
theorem
• Start with the sequent s to be proved
• Choose an inference rule having s as its
conclusion
• Start again for s1,..., sn, the premises of the
inference rule.
• Iterate until all branches end with an axiom
59. The way to resolution
• Unification
• Normal form
– Prenex form and skolem form
– Clausal form
• Resolution
60. Unification
• A substitution S1 is more general than S2 if there is a
substitution S such that
S2 = S o S1
• r and t are unifiable if there is a substitution S such
that
S(r) = S(t)
• A unifier for r and t is a most general unifier (mgu) if
it is more general than all unifiers for r and t
• mgus are equivalent modulo variable renaming
61. Unification - Conflict set
For E a finite set of expressions (terms or atoms),
a conflict set D is defined as follows:
– From the left, find the position of the first symbol such
that another element of E does not have the same
symbol at the same position,
– Extract the sub-expression starting at this position for
all elements (e1, …, en) and add it to D
– Start again
62. Unification – a mgu algorithm
• Conflict(e, f) :=
– return id if there is no discordance
– d = (i, j) first discordance of e and f
– return Failed if neither i nor j is a variable
– return Failed if i is a variable occurring in j, e.g. j = f(i)
(occurs check)
– return [i := j] (with i the variable)
• k := 0, S0 := id, e0 := e, f0 := f
• R = Conflict(ek, fk)
– If R = Failed, stop: no mgu
– If R = id, stop: the mgu is Sk
– Else Sk+1 := R o Sk,
ek+1 := R(ek); fk+1 := R(fk); k := k+1, and repeat
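A compact unification sketch in Python. It is a standard decomposition-based variant rather than a literal transcription of the position-by-position conflict search above, but it performs the same checks (symbol clash, occurs check) and composes substitutions in the same way. Terms are assumed to be nested tuples `(symbol, arg1, …, argn)` with strings for variables and constants, and the set of variable names is passed explicitly; all names are chosen for this example.

```python
def substitute(term, subst):
    """Apply a substitution (dict: variable -> term) to a term."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0], *(substitute(arg, subst) for arg in term[1:]))

def occurs(var, term):
    """True if variable `var` occurs in `term`."""
    if isinstance(term, str):
        return var == term
    return any(occurs(var, arg) for arg in term[1:])

def unify(s, t, variables):
    """Return an mgu of s and t as a dict, or None if they do not unify."""
    subst = {}
    stack = [(s, t)]
    while stack:
        a, b = stack.pop()
        a, b = substitute(a, subst), substitute(b, subst)
        if a == b:
            continue
        if isinstance(b, str) and b in variables and not (isinstance(a, str) and a in variables):
            a, b = b, a                       # keep the variable on the left
        if isinstance(a, str) and a in variables:
            if occurs(a, b):
                return None                   # occurs check, e.g. x against f(x)
            # Compose [a := b] with the bindings found so far.
            subst = {v: substitute(u, {a: b}) for v, u in subst.items()}
            subst[a] = b
        elif isinstance(a, str) or isinstance(b, str) or a[0] != b[0] or len(a) != len(b):
            return None                       # clash of symbols or arities
        else:
            stack.extend(zip(a[1:], b[1:]))   # same symbol: unify the arguments
    return subst

# g(x, x) against g(c, t), with variables x and t
print(unify(("g", "x", "x"), ("g", "c", "t"), {"x", "t"}))   # x := c and t := c
```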
63. Small exercise
• Unify f(i(c, t), h(y, t)) with f(x, h(t, t))
• Unify g(x, x) with g(c, t)
• Unify h(x, f(x)) with h(y, x)
64. Prenex normal form
• A formula is in prenex normal form if “all
quantifiers are in front”:
∀x ∀y (A(x, y) & B(x))
• Every formula can be rewritten into an equivalent
formula in prenex normal form with separate
variables
65. Skolemization
• Skolemization removes all existential quantifiers
from a formula F in prenex normal form with
separate variables
• F is satisfiable iff skolem(F) is satisfiable
• It works by applying the following second order
equivalence:
∀x ∃y R(x, y) <=> ∃f ∀x R(x, f(x))
where f is a Skolem function that maps x to y.
• Example: ∀x∃y∃z P(x, y, z) -> ∀x∃y P(x, y, f(x)) -> ∀x P(x, g(x), f(x))
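A sketch of Skolemization for a formula already in prenex form, under assumed encodings: the prefix is a list of `(quantifier, variable)` pairs and the matrix is a quantifier-free formula written as nested tuples. The Skolem symbols `sk1`, `sk2`, … are hypothetical names and are assumed not to occur in the input. Each existential variable is replaced by a Skolem term over the universal variables that precede it.

```python
from itertools import count

def substitute(term, subst):
    """Apply a substitution (dict: variable -> term) to a nested tuple."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0], *(substitute(arg, subst) for arg in term[1:]))

def skolemize(prefix, matrix):
    """Remove the existential quantifiers of a prenex formula.

    Returns (universal prefix, Skolemized matrix).
    """
    universals = []
    subst = {}
    fresh = count(1)
    for quantifier, var in prefix:
        if quantifier == "forall":
            universals.append(var)
        else:
            name = f"sk{next(fresh)}"
            # Skolem function of the universals seen so far, or a constant if none.
            subst[var] = (name, *universals) if universals else name
    return [("forall", v) for v in universals], substitute(matrix, subst)

# ∀x ∃y ∃z P(x, y, z)
prefix = [("forall", "x"), ("exists", "y"), ("exists", "z")]
print(skolemize(prefix, ("P", "x", "y", "z")))
# ([('forall', 'x')], ('P', 'x', ('sk1', 'x'), ('sk2', 'x')))
```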
66. Clausal Normal Form
• Put the formula in Negation Normal Form
• Then in Prenex Normal Form with separate
variables
• Skolemize
• Discard the universal quantifiers (they are
implicit in CNF)
• Use the Tseitin transformation to get a conjunction
of disjunctions (or De Morgan and distributivity to
preserve equivalence)
68. Equality
• Axioms
– Reflexivity: for each variable x, x = x
– Substitution for functions: for all variables x, y and any
function symbol f
(x = y) => f(…, x, …) = f(…, y, …)
– Substitution for formulas: for any variables x, y and any
formula F(x), if F’ is obtained by replacing any number of
occurrences of x in F with y, such that these remain free
occurrences of y, then
(x = y) => (F => F’)
• Practical theorem provers are likely to handle
equality one way or another
69. Distributivity from a nonstandard
Boolean algebra with Prover9
Assumptions
x v (y v z) = y v (x v z).
x ^ y = (x' v y')'.
x v x' = y v y'.
(x v y') ^ (x v y) = x.
Goal
x ^ (y v z) = (x ^ y) v (x ^ z)
70. √2 is irrational expressed for Prover9
% This proof presumes numbers only range over positive integers (1,2,...).
% It's a proof by contradiction; the assumptions are entered, and prover9 shows that there is a
% contradiction.
formulas(assumptions).
1*x = x. % identity
x*1 = x.
x*(y*z) = (x*y)*z. % associativity
x*y = y*x. % commutativity
(x*y = x*z ) -> y = z. % cancellation (0 is not allowed, so x!=0).
% Now let's define divides(x,y): x divides y. divides(2,6) is true b/c 2*3=6.
divides(x,y) <-> (exists z x*z = y).
divides(2,x*y) -> (divides(2,x) | divides(2,y)). % If 2 divides x*y, it divides x or y.
2 != 1.
a*a = (2*(b*b)). % a/b = sqrt(2), so a^2 = 2 * b^2.
(x != 1) -> -(divides(x,a) & divides(x,b)). % a/b is in lowest terms
end_of_list.
71. • 1 x * y = x * z -> y = z # label(non_clause). [assumption].
• 2 divides(x,y) <-> (exists z x * z = y) # label(non_clause). [assumption].
• 3 divides(2,x * y) -> divides(2,x) | divides(2,y) # label(non_clause). [assumption].
• 4 x != 1 -> -(divides(x,a) & divides(x,b)) # label(non_clause). [assumption].
• 7 x * (y * z) = (x * y) * z. [assumption].
• 8 (x * y) * z = x * (y * z). [copy(7),flip(a)].
• 9 x * y = y * x. [assumption].
• 10 x * y != x * z | y = z. [clausify(1)].
• 11 -divides(x,y) | x * f1(x,y) = y. [clausify(2)].
• 12 divides(x,y) | x * z != y. [clausify(2)].
• 13 -divides(2,x * y) | divides(2,x) | divides(2,y). [clausify(3)].
• 14 a * a = 2 * (b * b). [assumption].
• 15 2 * (b * b) = a * a. [copy(14),flip(a)].
• 16 x = 1 | -divides(x,a) | -divides(x,b). [clausify(4)].
• 17 1 = x | -divides(x,a) | -divides(x,b). [copy(16),flip(a)].
• 18 2 != 1. [assumption].
• 19 1 != 2. [copy(18),flip(a)].
• 20 -divides(2,x * x) | divides(2,x). [factor(13,b,c)].
• 21 x * (y * z) = y * (x * z). [para(9(a,1),8(a,1,1)),rewrite([8(2)])].
• 37 divides(2,a * a). [resolve(15,a,12,b)].
• 40 a * a != 2 * x | b * b = x. [para(15(a,1),10(a,1))].
• 83 divides(2,a). [resolve(37,a,20,a)].
• 85 2 * f1(2,a) = a. [resolve(83,a,11,a)].
• 86 -divides(2,b). [ur(17,a,19,a,b,83,a)].
• 105 -divides(2,b * b). [ur(20,b,86,a)].
• 112 b * b != 2 * x. [ur(12,a,105,a),flip(a)].
• 177 b * b != x * (2 * y). [para(21(a,1),112(a,2))].
• 190 2 * (f1(2,a) * x) = a * x. [para(85(a,1),8(a,1,1)),flip(a)].
• 643 b * b != x * a. [para(85(a,1),177(a,2,2))].
• 646 a * a != 2 * (x * a). [ur(40,b,643,a)].
• 647 $F. [resolve(646,a,190,a(flip))].
73. First order theorem proving is relatively
mature
• Thousands of Problems for Theorem Provers
(TPTP) Problem Library
• The CADE ATP System Competition: The World
Championship for Automated Theorem
Proving
• Some provers
– Vampire
– E prover
– Prover9 (Otter successor)