Theorem proving 2018 2019

Theorem Proving foundations
Emmanuel.zarpas@sap.com
October 2018

Theorem proving: is it useful?
Math: the four color theorem states that, given any map, no more than four
colors are required to color the regions of the map so that no two adjacent
regions have the same color.

Hardware verification: equivalence checking, Bounded Model
Checking

Safety-critical systems: B-method, Prover Technology

Automated planning

Cryptology:
• Cryptanalysis of hash functions
• Provable security: a full formal machine-checked
verification of a C program: the OpenSSL
implementation of SHA-256.

First implementation
• In 1954, Martin Davis programmed
Presburger's algorithm for
a JOHNNIAC vacuum tube computer at
the Princeton Institute for Advanced Study.
• "Its great triumph was to prove that the sum
of two even numbers is even"

Related problem: proof verification
• Certify that an existing proof is valid
• Proof assistants require a human user to give
hints to the system.
– They act as proof checkers,
– Significant proof tasks can be performed
automatically.

A. Propositional logic
• Why is it interesting?
– It is simple
– It is decidable
– It has practical and industrial applications
(equivalence checking, Bounded Model Checking)
– You can actually model a lot of things with it, e.g.
P(n) with n = 0..255 becomes P(0) & … & P(255)
– It is of theoretical importance (P ?= NP)

Syntax
• P a countable alphabet of proposition
symbols
• R a finite set of connectives (∧, ∨, ¬)
• C a finite set of constants (false, true)
• A formula is:
– A proposition symbol
– A constant
– A ∧ B, A ∨ B, ¬A with A, B formulas

Semantics
• A formula F is a tautology if v(F) = true for all
valuations v
• A formula F is satisfiable if there exists a
valuation v with v(F) = true


Semantics
• v is a boolean valuation if:
– It assigns a boolean to each atom
– v(true) = true;
– v(false) = false
– v(¬A) = ¬v(A);
– v(A ∧ B) = v(A) ∧ v(B);
– v(A ∨ B) = v(A) ∨ v(B)

So how do I find that a propositional
formula is a tautology?
• Truth table
• Propositional calculus
• (complete) SAT solvers
• Resolution
• Analytic tableaux

Truth table
• What is the problem?

A   ¬A   A ∨ ¬A
T   F    T
F   T    T

A   B   A → (B → A) = ¬A ∨ (¬B ∨ A)
T   T   T
F   T   T
T   F   T
F   F   T
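
The truth table method can be sketched in a few lines of Python. This is only an illustration: the function names `is_tautology` and `implies` are made up for this sketch, and formulas are represented as plain Python functions of booleans. The loop enumerates all 2^n rows, which is exactly why the method does not scale.

```python
from itertools import product

def implies(a, b):
    """Material implication: a -> b is (not a) or b."""
    return (not a) or b

def is_tautology(formula, n_vars):
    """Evaluate formula on every row of its truth table (2**n_vars rows)."""
    return all(formula(*row) for row in product([True, False], repeat=n_vars))

# The two tables above:
print(is_tautology(lambda a: a or not a, 1))
print(is_tautology(lambda a, b: implies(a, implies(b, a)), 2))
```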

Propositional calculus
• A set of derivation rules,
– e.g. modus ponens: from A and A → B, derive B
• A set of axioms:
– Implication axioms
1. A → (B → A)
2. (A → (B → C)) → ((A → B) → (A → C))
• Example: proof of f → f
• With A = f, B = (f → f), C = f
2: (f → ((f → f) → f)) → ((f → (f → f)) → (f → f))
• 1: f → ((f → f) → f)
• (f → (f → f)) → (f → f) (modus ponens)
• 1: f → (f → f)
• f → f (modus ponens)

Additional axioms
• Negation axiom: reductio ad absurdum:
(¬A → B) → ((¬A → ¬B) → A)
• Variation:
– (A → False) → ¬A
– (¬A → False) → A

Additional axioms
• Conjunction axioms:
– A → (B → (A ∧ B))
– (A ∧ B) → A
– (A ∧ B) → B
• Disjunction axioms:
– A → (A ∨ B)
– B → (A ∨ B)
– (A ∨ B) → ((A → C) → ((B → C) → C))

Deduction and completeness theorems
• "B can be derived from A" is equivalent to:
« A → B can be derived from the axioms »
In other words
« A |- B ⇔ |- A → B »
• Completeness and soundness:
|- A ⇔ |= A
• What is the problem for theorem proving?

Convert to a SAT problem and use a
SAT solver (1)
• Satisfiability (SAT): determine whether the variables of a
boolean formula can be assigned so that the
formula evaluates to True.
• Main idea:
– If ¬A is satisfiable, A is not a tautology
– If ¬A is not satisfiable, A is a tautology

SAT solver (2)
To check if formula A is a tautology:
• Convert ¬A to Conjunctive Normal Form
• Find out if ¬A is satisfiable
• If yes, A is not a tautology:
– there is a valuation v such that v(¬A) = True, so v(A) =
False
• If no, A is a tautology

SAT solver (3)
• The SAT problem is NP-complete: it was the first decision
problem proved to be NP-complete, in 1971
• Implication: any general SAT algorithm is probably
exponential in time (unless P = NP)
• Question: why is this better than the truth table
method?

Conjunctive Normal Form (CNF)
• A is in CNF if it is a conjunction of clauses
• A clause is a disjunction of literals
• A literal is either an atom or the negation of
an atom
• Example: (A ∨ B) ∧ (B ∨ C ∨ D) ∧ (D ∨ E)

CNFs make life easier
• A clause is satisfied if at least one of its literals
is assigned to True
– (B ∨ C ∨ D)
• A clause cannot be satisfied if all its literals are
assigned to False
• A formula is satisfied if all its clauses are
satisfied
• A formula is unsatisfied if at least one of its
clauses is unsatisfied
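
These rules are easy to check mechanically. The sketch below assumes a representation that is not part of the slides: a clause is a list of non-zero integers, where `n` stands for variable n and `-n` for its negation, and a partial assignment is a dict from variable number to boolean. The names `literal_value` and `clause_status` are illustrative.

```python
def literal_value(lit, assignment):
    """Value of a literal under a partial assignment, or None if unassigned."""
    value = assignment.get(abs(lit))
    if value is None:
        return None
    return value if lit > 0 else not value

def clause_status(clause, assignment):
    """Classify a clause as satisfied, unsatisfied or undetermined."""
    values = [literal_value(lit, assignment) for lit in clause]
    if any(v is True for v in values):
        return "satisfied"        # at least one literal is True
    if all(v is False for v in values):
        return "unsatisfied"      # every literal is False
    return "undetermined"         # some literal is still unassigned

# (B | C | D) with B=2, C=3, D=4 and only C assigned to True
print(clause_status([2, 3, 4], {3: True}))
```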

Conversion to CNF
• Every propositional formula can be converted
into an equivalent CNF formula using De
Morgan's laws and the distributive law.
• However…
• The following formula
(A1 ∧ B1) ∨ (A2 ∧ B2) ∨ … ∨ (An ∧ Bn)
is converted into a formula with 2^n clauses!

Conversion to 2^n clauses: example
(x1 & x2) | (x3 & x4) | (x5 & x6) | (x7 & x8) in CNF:
(x1 V x3 V x5 V x7) &
(x2 V x3 V x5 V x7) &
(x1 V x4 V x5 V x7) &
(x2 V x4 V x5 V x7) &
(x1 V x3 V x6 V x7) &
(x2 V x3 V x6 V x7) &
(x1 V x4 V x6 V x7) &
(x2 V x4 V x6 V x7) &
(x1 V x3 V x5 V x8) &
(x2 V x3 V x5 V x8) &
(x1 V x4 V x5 V x8) &
(x2 V x4 V x5 V x8) &
(x1 V x3 V x6 V x8) &
(x2 V x3 V x6 V x8) &
(x1 V x4 V x6 V x8) &
(x2 V x4 V x6 V x8)

Tseitin transformation
• T(A) is satisfiable iff A is satisfiable
• T increases the size of A only linearly.
• Based on the following transformations:
• Example:
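
The rule table and example from the original slide did not survive, so the following is only a sketch of one common formulation, not necessarily the one shown in the lecture: each ∧ or ∨ node gets a fresh variable x, together with three clauses stating that x is equivalent to the node. Formulas are assumed to be nested tuples such as `("and", "p", ("not", "q"))`, with atoms as strings; the function name `tseitin` is illustrative.

```python
from itertools import count

def tseitin(formula):
    """Return (clauses, atom_numbers) for an equisatisfiable CNF.

    Clauses are lists of non-zero integers (n = variable n, -n = its negation).
    """
    atom_numbers = {}          # atom name -> variable number
    fresh = count(1)           # source of new variable numbers
    clauses = []

    def encode(f):
        """Return a literal that is equivalent to subformula f."""
        if isinstance(f, str):                     # atom
            if f not in atom_numbers:
                atom_numbers[f] = next(fresh)
            return atom_numbers[f]
        if f[0] == "not":
            return -encode(f[1])
        a, b = encode(f[1]), encode(f[2])
        x = next(fresh)                            # fresh variable naming this node
        if f[0] == "and":                          # x <-> (a & b)
            clauses.extend([[-x, a], [-x, b], [x, -a, -b]])
        elif f[0] == "or":                         # x <-> (a | b)
            clauses.extend([[-x, a, b], [x, -a], [x, -b]])
        else:
            raise ValueError(f"unknown connective {f[0]!r}")
        return x

    clauses.append([encode(formula)])              # assert the whole formula
    return clauses, atom_numbers

# (x1 & x2) | (x3 & x4) | (x5 & x6) | (x7 & x8)
pairs = [("and", f"x{i}", f"x{i + 1}") for i in (1, 3, 5, 7)]
big_or = ("or", ("or", pairs[0], pairs[1]), ("or", pairs[2], pairs[3]))
clauses, atoms = tseitin(big_or)
print(len(clauses))
```

With this encoding the clause count is three per binary connective plus one, so it grows linearly with n instead of as 2^n.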

Naive SAT-solver algorithm
• Choose a literal and assign it a boolean value
• Simplify and recursively check if the simplified
formula is satisfiable
– Yes: original formula is satisfiable
– No: same recursive check is done with opposite
Boolean value
• Simplification: remove all clauses which become
true, and remove all literals that become false from the
remaining clauses.
• Python example there
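
The Python example from the hands-on material is not reproduced here; the following is an independent minimal sketch of the same idea. It assumes clauses are lists of non-zero integers (`n` for a variable, `-n` for its negation), and the names `simplify` and `naive_sat` are illustrative.

```python
def simplify(clauses, lit):
    """Set lit to True: drop satisfied clauses, strip the falsified literal."""
    return [[l for l in clause if l != -lit]
            for clause in clauses if lit not in clause]

def naive_sat(clauses):
    """Return True iff the CNF formula is satisfiable."""
    if not clauses:                       # empty conjunction: true
        return True
    if any(not clause for clause in clauses):
        return False                      # empty clause: false
    lit = clauses[0][0]                   # choose a literal
    # Try lit = True first, then the opposite value.
    return naive_sat(simplify(clauses, lit)) or naive_sat(simplify(clauses, -lit))

print(naive_sat([[1, 2, -3], [1, -2], [-1]]))
print(naive_sat([[1], [-1]]))
```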

Empty disjunctions and conjunctions
• An empty disjunction is always false
– So the empty clause is always false
• An empty conjunction is always true
Why?
– In general, a commutative, associative binary
operation applied to an empty set should yield the
identity element for that operation.

DPLL
• DPLL = Davis-Putnam-Logemann-Loveland
• Introduced in two articles in the early 60s
• Impressive engineering work made it efficient
over the years

Python example for hands-on session

General concept
• Start from a CNF formula
• Try to build an assignment that satisfies the
formula
• The assignment is built using an efficient
backtracking mechanism:
– Truth values are propagated to reduce the number
of future assignments

Davis-Putnam-Logemann-Loveland
(DPLL) - 1962
Improve naive algorithm with
1. Unit propagation:
– If a clause is a unit clause, it can only be satisfied
by assigning the necessary value to make this
literal true. Thus, no choice is necessary.
– In practice, this often leads to deterministic
cascades of units, avoiding a large part of the
naive search space.

Unit propagation
• (P ∨ Q ∨ ¬R) ∧ (P ∨ ¬Q) ∧ ¬P
• Apply unit propagation for ¬P: P = False
– (Q ∨ ¬R) ∧ ¬Q
• Apply unit propagation for ¬Q: Q = False
– ¬R
• R = False
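
A minimal sketch of unit propagation, assuming clauses are lists of non-zero integers (`n` for a variable, `-n` for its negation). The function name `unit_propagate` is illustrative and not taken from the hands-on code.

```python
def unit_propagate(clauses):
    """Repeatedly satisfy unit clauses.

    Returns (simplified clauses, list of literals forced to True).
    An empty clause in the result signals a conflict.
    """
    forced = []
    while True:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            return clauses, forced
        forced.append(unit)
        # Drop clauses satisfied by unit, remove its negation elsewhere.
        clauses = [[l for l in c if l != -unit] for c in clauses if unit not in c]

# (P | Q | !R) & (P | !Q) & !P with P=1, Q=2, R=3
print(unit_propagate([[1, 2, -3], [1, -2], [-1]]))  # → ([], [-1, -2, -3])
```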

Davis-Putnam-Logemann-Loveland
(DPLL) - 1962
2. Pure literal elimination:
– If a propositional variable occurs with only one
polarity in the formula, it can always be assigned in
a way that makes all clauses containing it true.
– These clauses do not constrain the search anymore
and can be deleted.

Pure literal elimination
If L is a pure literal of f, i.e. it occurs with only one
polarity in f, L can always be assigned safely
• (P ∨ Q) ∧ (P ∨ ¬Q) ∧ (R ∨ Q) ∧ (R ∨ ¬Q)
• P always occurs with a positive polarity. We can assign
P to True and get
• (R ∨ Q) ∧ (R ∨ ¬Q)
• Then, in the same way, we assign R to True
• We get the empty conjunction, so the formula is
satisfiable
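
A minimal sketch of pure literal elimination, assuming clauses are lists of non-zero integers (`n` for a variable, `-n` for its negation). The name `eliminate_pure_literals` is illustrative. Unlike the step-by-step walk-through above, this version assigns every pure literal found in one pass at once.

```python
def eliminate_pure_literals(clauses):
    """Repeatedly assign pure literals to True and drop the clauses they satisfy.

    Returns (remaining clauses, list of literals assigned to True).
    """
    assigned = []
    while True:
        literals = {l for clause in clauses for l in clause}
        pure = {l for l in literals if -l not in literals}
        if not pure:
            return clauses, assigned
        assigned.extend(sorted(pure))
        clauses = [c for c in clauses if not any(l in pure for l in c)]

# (P | Q) & (P | !Q) & (R | Q) & (R | !Q) with P=1, Q=2, R=3
print(eliminate_pure_literals([[1, 2], [1, -2], [3, 2], [3, -2]]))  # → ([], [1, 3])
```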

Example
• (A|!B|D) & (A|!B|E) & (!B|!D) &
(A|B|C|D) & (A|B|C|!D) & (B|!C|E) &
(!B|D)
• Pure literal elimination: A occurs only positively, so
A := true removes every clause containing A
• (!B|!D) & (B|!C|E) & (!B|D)
• Try B := true
• (!D) & (D): unit propagation gives D := false, conflict
• Try B := false
• (!C|E): satisfiable with E := true
Modern SAT Solvers
Conflict-driven:
• efficient conflict analysis,
• clause learning,
• non-chronological backtracking,
• "two-watched-literals" unit propagation,
• adaptive branching, and random restarts.
Essential for handling the large SAT instances in
Electronic Design Automation

Modern SAT Solvers
Look-ahead:
• strengthened reductions (going beyond unit-
clause propagation) and heuristics
Generally stronger than conflict-driven solvers on
hard (small) instances.

SAT problem libraries
• Standardized input format (DIMACS CNF); the header line is
p cnf NUMBER_OF_VARIABLES NUMBER_OF_CLAUSES
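
As a sketch of what the format looks like in practice: in DIMACS CNF each clause is written on its own line as a list of signed variable numbers terminated by 0. The helper name `to_dimacs` is made up for this illustration.

```python
def to_dimacs(clauses):
    """Serialize a list of integer clauses as DIMACS CNF text."""
    n_vars = max((abs(l) for c in clauses for l in c), default=0)
    lines = [f"p cnf {n_vars} {len(clauses)}"]
    for clause in clauses:
        # Each clause line ends with the terminator 0.
        lines.append(" ".join(str(l) for l in [*clause, 0]))
    return "\n".join(lines)

# (x1 | !x2) & (x2 | x3)
print(to_dimacs([[1, -2], [2, 3]]))
```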

Some references
• SATLIB, SAT competition, SAT race
• A reference solver: MiniSat
• pycosat: Python bindings to PicoSAT
• http://logictools.org/ SAT solver running in your
browser
• Prover Technology (Stålmarck's method): a
commercial prover based on proprietary
technology
• Naive Python SAT implementation as a basis for the
hands-on session

Horn-SAT
• A Horn clause is a clause with at most one
positive literal
E.g.: A ← (p1 ∧ p2 ∧ … ∧ pn)
• Finding a satisfying assignment for a conjunction of
(propositional) Horn clauses can be done in
linear time
• Basis of the Prolog programming language.
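
A sketch of the underlying idea (forward chaining), under an assumed representation: each Horn clause is a pair `(body, head)` meaning body → head, where `body` is a set of atoms and `head` is an atom, or `None` for a clause with no positive literal. This simple fixpoint loop is quadratic in the worst case; the linear-time algorithm additionally keeps a counter of unsatisfied body atoms per clause. The name `horn_sat` is illustrative.

```python
def horn_sat(clauses):
    """Return the minimal set of atoms that must be True, or None if unsatisfiable."""
    true_atoms = set()
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if body <= true_atoms:          # all premises already derived
                if head is None:
                    return None             # a purely negative clause is violated
                if head not in true_atoms:
                    true_atoms.add(head)
                    changed = True
    return true_atoms

# p,  p -> q,  (q & r) -> s
print(horn_sat([(set(), "p"), ({"p"}, "q"), ({"q", "r"}, "s")]))
```

Every atom not in the returned set is assigned False.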

(Propositional) resolution
• It is a refutation technique (like DPLL) on CNF
• One inference rule: from a1 | … | an and b1 | … | bm with ai = !bj,
derive the clause made of all the other literals
Example: from (p | q) and (!q | r), derive (p | r)
• The resolution rule is applied to all possible pairs of
clauses (at each step, repeated literals are removed)
• If the derivation ends with the empty clause, the negated
formula is unsatisfiable (it implies false)
• If the empty clause is not reached, the negated formula is
satisfiable
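
A minimal saturation sketch of this procedure, assuming clauses are given as lists of non-zero integers (`n` for a variable, `-n` for its negation). Clauses are stored as frozensets, which removes repeated literals automatically. The names `resolvents` and `resolution_unsat` are illustrative, and tautological resolvents are skipped as a simple optimization.

```python
from itertools import combinations

def resolvents(c1, c2):
    """All clauses obtained by resolving c1 and c2 on one complementary pair."""
    return [(c1 - {l}) | (c2 - {-l}) for l in c1 if -l in c2]

def resolution_unsat(clauses):
    """Return True iff the empty clause is derivable (the CNF is unsatisfiable)."""
    known = {frozenset(c) for c in clauses}
    if frozenset() in known:
        return True
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for r in resolvents(c1, c2):
                if not r:
                    return True                    # empty clause reached
                if not any(-l in r for l in r):    # skip tautologies
                    new.add(r)
        if new <= known:
            return False                           # saturated, no empty clause
        known |= new

# (a | !b) & b & !a with a=1, b=2
print(resolution_unsat([[1, -2], [2], [-1]]))
```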

Analytic tableaux
• Refutation technique
• Formulas in nodes of the
same branch are in
conjunction
• Different branches are in
disjunction
• Popular for modal logic.
• Example: {(a⋁¬b)⋀b, ¬a} is not
satisfiable

Model checking
• Model checking is a brute force semantic
approach
• Mainly focuses on temporal propositional logic
• Equivalence checking can also rely on a similar
approach, e.g. BDDs

Proposed exercises
• Implement a propositional calculus engine
• Implement the DPLL algorithm
• Implement propositional resolution

B. First order logic
• Why is it interesting?
– Expressive enough to express a wide range of
problems
– Mature subfield of theorem proving: even though it is
undecidable, there are numerous fully automated
systems

Syntax
• L=(P, F, C, V) is a first order language with
– P is a countable set of predicate symbols
– F is a countable set of function symbols
– C is a countable set of constant symbols
– V is a countable set of variable symbols
• A term for L is either:
– A constant or a variable
– f(t1, t2, …, tn) where the ti are terms and f a function
symbol

Formulas
• A formula is either
– An atom: P(t1, t2, …, tn) where the ti are terms and P a
predicate symbol of arity n
– R P1, P2, …, Pn where R is a connective of arity n and
the Pi are formulas. Connectives: ∧, ∨, ¬
– Qx.A where Q is a quantifier and A a formula.
Quantifiers: ∀, ∃
• Constants: True, False
• A free variable is a variable not under the scope
of a quantifier

Semantics
• A model M = (D, I) of a first order language is:
– D, a non-empty set: the interpretation domain
– A function I (the interpretation) which maps
• Each constant to an element of D
• Each function symbol of arity n to a function D^n -> D
• Each predicate symbol to a relation on D
• A formula is evaluated given an interpretation and
a variable assignment
A variable assignment m for M = (D, I) is a function
which associates an element of D with each variable

Semantics (Cont.)
Valuation v of a term:
– v(c) = I(c)
– v(x) = m(x)
– v(f(t1,…, tn)) = I(f)(v(t1),…, v(tn))

Semantics (Cont.)
Valuation of a formula wrt model M and variable
assignment m
– v(P(t1,…, tn)) = true ⇔ (v(t1), …, v(tn)) ∈ I(P)
– v(¬A) = ¬v(A)
– v(A ∧ B) = v(A) ∧ v(B),
– v(A ∨ B) = v(A) ∨ v(B)
– v(∀x. A) = true iff v(A) = true for all variable
assignments which differ from m at most on the mapping
of x
– v(∃x. A) = true iff v(A) = true for some variable assignment
which differs from m at most on the mapping of x

Semantics (Cont.)
• A formula F is true in a model if, for all
variable assignments, v(F) = true
• A formula is valid if it is true in all models
• A formula F is satisfiable in a model M if there
exists a variable assignment for which v(F) = true

Substitution
• Finite set of mappings from distinct variables to
terms: [x1 := t1, …, xn := tn]
• A substitution can be applied to an expression by
simultaneously replacing each of the xi with
the ti, e.g.
A(f(x, g(y), c))[x := h(a, y), y := b] = A(f(h(a, y), g(b), c))
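
A small sketch of simultaneous substitution, under an assumed term representation that is not part of the slides: a variable is a Python string, and a function or predicate application is a tuple `(symbol, arg1, …, argn)`, so the constant c is the 1-tuple `("c",)`. The name `substitute` is illustrative.

```python
def substitute(term, subst):
    """Apply subst (dict: variable -> term) to term, all variables at once."""
    if isinstance(term, str):                 # variable
        return subst.get(term, term)
    symbol, *args = term                      # application or constant
    return (symbol, *(substitute(arg, subst) for arg in args))

a, b, c = ("a",), ("b",), ("c",)
term = ("A", ("f", "x", ("g", "y"), c))
subst = {"x": ("h", a, "y"), "y": b}
# The y introduced by h(a, y) is not replaced again: the substitution is simultaneous.
print(substitute(term, subst))
```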

Calculus
• Propositional calculus
• ∀x. A => A[x:=t]
• ∀x.(A => B) => (∀x. A => ∀x. B)
• A => ∀x. A where x is not a free variable of A
• ∀x. (A => ∃y.A[x:=y])
• ∀x.(A => B) => ∃x. (A => B)

Deduction and completeness theorems
• "B can be derived from A" is equivalent to
– A => B can be derived from the axioms
• Said in other words
– A |- B ⇔ |- A => B
• Completeness and soundness:
– |- A ⇔ |= A

Theorem proving methods
• No truth table
• Calculus, eg Sequent calculus
• Refutation techniques: resolution

Algorithms based on cut-elimination
theorem
• Start with the sequent s to be proved
• Choose an inference rule having s as its
conclusion
• Start again for s1, ..., sn, the premises of the
inference rule.
• Iterate until all branches end with an axiom

The way to resolution
• Unification
• Normal form
– Prenex form and skolem form
– Clausal form
• Resolution

Unification
• A substitution S1 is more general than S2 if there is a
substitution S such that
S2 = S o S1
• r and t are unifiable if there is a substitution S such
that
S(r) = S(t)
• A unifier for r and t is a most general unifier (mgu) if
it is more general than all unifiers for r and t
• mgus are equivalent modulo variable renaming

Unification - Conflict set
For E a finite set of expressions (terms or atoms),
a conflict set D is defined as follows:
– From the left, find the position of the first symbol such
that another element of E does not have the same
symbol at the same position,
– Extract the sub-expression starting at that position for
all elements (e1, …, en) and add it to D
– Start again

Unification – an mgu algorithm
• Conflict(e, f) :=
– return Id if there is no discordance
– d = (i, j), the first discordance of e and f
– return Failed if neither i nor j is a variable
(otherwise take i to be the variable, swapping i and j if needed)
– return Failed if i is a variable occurring in j, e.g. j = f(i)
– return [i := j]
• k := 0; S0 := Id; e0 := e; f0 := f
• R = Conflict(ek, fk)
– If R = Failed, stop: there is no mgu
– If R = Id, stop: the mgu is Sk
– Else Sk+1 := R o Sk,
ek+1 := R(ek); fk+1 := R(fk); k := k+1, and repeat
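
A recursive sketch of an mgu computation in the same spirit, which can serve as a starting point for the exercises. It uses an assumed term representation: variables are Python strings and applications are tuples `(symbol, arg1, …, argn)`, so a constant c is `("c",)`. The names `substitute`, `occurs` and `unify` are illustrative, and the recursive structure differs from the iterative presentation above.

```python
def substitute(term, subst):
    """Apply subst (dict: variable -> term) to term."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0], *(substitute(arg, subst) for arg in term[1:]))

def occurs(var, term):
    """Occurs check: does variable var appear in term?"""
    if isinstance(term, str):
        return var == term
    return any(occurs(var, arg) for arg in term[1:])

def unify(s, t, subst=None):
    """Return an mgu of s and t as a dict, or None if they are not unifiable."""
    subst = {} if subst is None else subst
    s, t = substitute(s, subst), substitute(t, subst)
    if s == t:                                   # no discordance
        return subst
    if isinstance(t, str) and not isinstance(s, str):
        s, t = t, s                              # make s the variable
    if isinstance(s, str):
        if occurs(s, t):
            return None                          # e.g. x against f(x)
        # Compose [s := t] with the bindings found so far.
        new = {v: substitute(u, {s: t}) for v, u in subst.items()}
        new[s] = t
        return new
    if s[0] != t[0] or len(s) != len(t):
        return None                              # clash of function symbols
    for a, b in zip(s[1:], t[1:]):
        subst = unify(a, b, subst)
        if subst is None:
            return None
    return subst

# p(x, g(y)) against p(f(a), g(x)), with a a constant
print(unify(("p", "x", ("g", "y")), ("p", ("f", ("a",)), ("g", "x"))))
```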

Small exercise
• Unify f(i(c, t), h(y, t)) with f(x, h(t, t))
• Unify g(x, x) with g(c, t)
• Unify h(x, f(x)) with h(y, x)

Prenex normal form
• A formula is in prenex normal form if “all
quantifiers are in front”:
∀x ∀y (A(x, y) & B(x))
• Every formula can be rewritten into an equivalent
formula in prenex normal form with separate
variables

Skolemization
• Skolemization: removes all existential quantifiers
from a formula F in prenex normal form with
separate variables
• F is satisfiable iff skolem(F) is satisfiable
• It works by applying the following second order
equivalence
∀x ∃y R(x, y) ⇔ ∃f ∀x R(x, f(x))
where f is a Skolem function that maps x to y.
• Example:
∀x∃y∃z P(x, y, z) -> ∀x∃y P(x, y, f(x)) -> ∀x P(x, g(x), f(x))

Clausal Normal Form
• Put the formula in Negation Normal Form
• Then in prenex normal form with separate
variables
• Skolemize
• Discard the universal quantifiers (they are
implicit in CNF)
• Use the Tseitin transformation to get a conjunction
of disjunctions (or De Morgan's laws and distributivity
to keep equivalence)

Resolution
• Refutation technique
• As for propositional logic, with the resolution rule
extended: from the clauses C ∨ t and D ∨ ¬r, where t and r
are atoms, derive S(C ∨ D)
with S = mgu(t, r)

Equality
• Axioms
– Reflexivity: for each variable x, x = x
– Substitution for functions: for all variables x, y and any
function symbol f
(x = y) → f(…, x, …) = f(…, y, …)
– Substitution for formulas: for any variables x, y and any
formula F(x), if F’ is obtained by replacing any number of
occurrences of x in F with y, such that these remain free
occurrences of y, then
(x = y) → (F → F’)
• Practical theorem provers are likely to handle
equality in one way or another
Distributivity from a nonstandard
Boolean algebra with Prover9
Assumptions
x v (y v z) = y v (x v z).
x ^ y = (x' v y')'.
x v x' = y v y'.
(x v y') ^ (x v y) = x.
Goal
x ^ (y v z) = (x ^ y) v (x ^ z)

√2 is irrational expressed for Prover9
% This proof presumes numbers only range over positive integers (1,2,...).
% It's a proof by contradiction; the assumptions are entered, and prover9 shows that there is a
% contradiction.
formulas(assumptions).
1*x = x. % identity
x*1 = x.
x*(y*z) = (x*y)*z. % associativity
x*y = y*x. % commutativity
(x*y = x*z ) -> y = z. % cancellation (0 is not allowed, so x!=0).
% Now let's define divides(x,y): x divides y. divides(2,6) is true b/c 2*3=6.
divides(x,y) <-> (exists z x*z = y).
divides(2,x*y) -> (divides(2,x) | divides(2,y)). % If 2 divides x*y, it divides x or y.
2 != 1.
a*a = (2*(b*b)). % a/b = sqrt(2), so a^2 = 2 * b^2.
(x != 1) -> -(divides(x,a) & divides(x,b)). % a/b is in lowest terms
end_of_list.

• 1 x * y = x * z -> y = z # label(non_clause). [assumption].
• 2 divides(x,y) <-> (exists z x * z = y) # label(non_clause). [assumption].
• 3 divides(2,x * y) -> divides(2,x) | divides(2,y) # label(non_clause). [assumption].
• 4 x != 1 -> -(divides(x,a) & divides(x,b)) # label(non_clause). [assumption].
• 7 x * (y * z) = (x * y) * z. [assumption].
• 8 (x * y) * z = x * (y * z). [copy(7),flip(a)].
• 9 x * y = y * x. [assumption].
• 10 x * y != x * z | y = z. [clausify(1)].
• 11 -divides(x,y) | x * f1(x,y) = y. [clausify(2)].
• 12 divides(x,y) | x * z != y. [clausify(2)].
• 13 -divides(2,x * y) | divides(2,x) | divides(2,y). [clausify(3)].
• 14 a * a = 2 * (b * b). [assumption].
• 15 2 * (b * b) = a * a. [copy(14),flip(a)].
• 16 x = 1 | -divides(x,a) | -divides(x,b). [clausify(4)].
• 17 1 = x | -divides(x,a) | -divides(x,b). [copy(16),flip(a)].
• 18 2 != 1. [assumption].
• 19 1 != 2. [copy(18),flip(a)].
• 20 -divides(2,x * x) | divides(2,x). [factor(13,b,c)].
• 21 x * (y * z) = y * (x * z). [para(9(a,1),8(a,1,1)),rewrite([8(2)])].
• 37 divides(2,a * a). [resolve(15,a,12,b)].
• 40 a * a != 2 * x | b * b = x. [para(15(a,1),10(a,1))].
• 83 divides(2,a). [resolve(37,a,20,a)].
• 85 2 * f1(2,a) = a. [resolve(83,a,11,a)].
• 86 -divides(2,b). [ur(17,a,19,a,b,83,a)].
• 105 -divides(2,b * b). [ur(20,b,86,a)].
• 112 b * b != 2 * x. [ur(12,a,105,a),flip(a)].
• 177 b * b != x * (2 * y). [para(21(a,1),112(a,2))].
• 190 2 * (f1(2,a) * x) = a * x. [para(85(a,1),8(a,1,1)),flip(a)].
• 643 b * b != x * a. [para(85(a,1),177(a,2,2))].
• 646 a * a != 2 * (x * a). [ur(40,b,643,a)].
• 647 $F. [resolve(646,a,190,a(flip))].

Exercises
• Implement unification
• Implement resolution
• Implement a theorem prover for Presburger
arithmetic (doubly exponential)

First order theorem proving is relatively
mature
• Thousands of Problems for Theorem Provers
(TPTP) Problem Library
• The CADE ATP System Competition: The World
Championship for Automated Theorem
Proving
• Some provers
– Vampire
– E prover
– Prover9 (Otter successor)

More logics
• Intuitionistic first order logic
• Linear logic
• Higher order logic
