This document provides an overview of algorithm design and complexity. It discusses different classes of problems including P vs NP problems. P problems can be solved in polynomial time, while NP problems can be verified in polynomial time but may not be solvable in polynomial time. NP-hard problems are at least as hard as NP problems, and NP-complete problems are NP-hard problems that are also in NP. The document describes techniques for solving difficult problems like backtracking and discusses examples like the n-queens problem.
3. Classes of Problems
Complexity of an algorithm
Asymptotic notation
Used to compare algorithms in order to determine which
one is better
Complexity of a problem?
We would like to know how difficult a problem is in order
to know what solution to look for
Used to compare problems in order to determine which
one is more difficult, regardless of the algorithm used to
solve it!
In other words, by taking into consideration the best
algorithms we could devise to solve the problems
4. Classes of Problems (2)
We are able to define classes of problems, but not as
many as the classes of algorithms
P = any problem for which we can find a correct solution
in polynomial time
We can solve the problem in polynomial time
There exists at least one such solution/algorithm
The problem is called tractable
NP = any problem for which we can verify that a solution
is correct in polynomial time
We may not be able to solve the problem in polynomial time
Polynomial time = O(n^k), where k is a constant
5. P vs NP
Therefore, P and NP are classes of problems
Any problem in P is also in NP
If we can find a solution in polynomial time, we surely can verify that a
solution is correct in polynomial time
P ⊆ NP
Is every problem in NP also in P?
It is easier to verify that a “guessed” solution is correct or not than to
compute this solution
This has not been proved until now!
NP ⊆ P ???
It is unknown if P = NP, but most researchers believe that P and NP
are not the same class
1 million dollar prize for proving that P = NP or P != NP
http://en.wikipedia.org/wiki/P_versus_NP_problem
6. P vs NP (2)
Conclusion: the problems in NP \ P should be more
difficult than the problems in P
For these problems, no polynomial-time algorithm is
known at this moment in time
Maybe one will be found in the future, especially if someone
manages to prove that P = NP
Therefore, at this moment in time we could separate
between problems in:
P – less difficult
NP \ P – more difficult
7. Polynomial Reduction
Given 2 decision problems, A1 and A2
Remember: A decision problem is a problem that has only two
possible outputs: yes/no
We say that problem A1 can be reduced in polynomial
time to problem A2 (A1 ≤P A2) if:
There exists a polynomial time algorithm F that transforms any
input data for A1, x, into an input data for A2, F(x)
A1(x) == yes <=> A2(F(x)) == yes
A1(x) == yes => A2(F(x)) == yes
A1(x) == no => A2(F(x)) == no
Image source: http://homepages.ius.edu/rwisman/C455/html/notes/Chapter34/NP-Completeness.htm
8. Polynomial Reduction (2)
Thus, we can use any solution to A2 to solve A1
If we have a solution for A2, we also have a solution
for A1
If A1 ≤P A2, then we say that problem A1 is easier than or
at most as difficult as A2
This looks a bit strange from the algorithm complexity
point of view, doesn’t it?
The next slide explains why
9. Polynomial Reduction (3)
SolveA1(x)
    x2 = F(x)
    RETURN SolveA2(x2)
From the algorithm point of view, it seems that the algorithm
for solving A1 has a greater complexity than the one for
solving A2
Complexity(SolveA1) = Complexity(F) + Complexity(SolveA2)
However, we are interested in the classes of problems point
of view:
We can use the best algorithm for solving A2 and then use F to
solve A1
If A2 ∈ P => A1 ∈ P
If A2 ∈ NP => A1 ∈ P or NP (If A2 ∈ NP \ P => A1 ∈ P or NP \ P)
Therefore, A1 is always easier than or at most as difficult as A2
10. Polynomial Reduction (4)
Another property of polynomial reduction:
If A1 ∉ P => A2 ∉ P (If A1 ∈ NP \ P and A2 ∈ NP => A2 ∈ NP \ P)
Because A2 cannot be easier than A1
Therefore, we shall use polynomial reduction to
highlight that a problem is more difficult than another
one w.r.t. classes of problems
11. NP-hard and NP-complete
A problem Q is called NP-hard if:
For any Q1 ∈ NP: Q1 ≤P Q
It is more difficult than or at least as difficult as any problem in
NP
However, Q may not even be in NP! There are problems
even more difficult than those in NP! They are NP-hard
A problem Q is called NP-complete if
It is NP-hard and it is in NP
These are the most difficult problems in NP!
12. NP-hard and NP-complete (2)
NP-hard and NP-complete are also classes of
problems
A possible graphical representation is:
Image source: http://en.wikipedia.org/wiki/NP-complete
13. NP-complete Problems
Graph Clique
Given an undirected graph G(V, E). Is there a clique of size k in
G?
Graph Vertex Cover
Given an undirected graph G(V, E). Is there a vertex cover of
size k?
Quick Info:
A clique is a subset of vertices V’ ⊆ V such that for any two distinct
v1, v2 ∈ V’ there is an edge (v1, v2) ∈ E[G]
A vertex cover is a subset of vertices V’ ⊆ V such that for any
edge (v1, v2) ∈ E[G], v1 ∈ V’ and/or v2 ∈ V’
At least one endpoint of every edge in the graph is in V’
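The two problems above also make a compact illustration of polynomial reduction. The sketch below assumes the standard fact that G has a clique of size k exactly when the complement graph of G has a vertex cover of size n - k. The function names, the 0-based vertex labels, and the brute-force solver standing in for SolveA2 are all choices made for this example, not part of the original slides.

```python
from itertools import combinations

def has_vertex_cover(n, edges, size):
    """Brute-force stand-in for SolveA2: is there a vertex cover with `size` vertices?"""
    if size < 0 or size > n:
        return False
    for cover in combinations(range(n), size):
        chosen = set(cover)
        # every edge must have at least one endpoint in the chosen set
        if all(u in chosen or v in chosen for u, v in edges):
            return True
    return False

def complement_edges(n, edges):
    """The transformation F: build the complement graph in O(n^2) time."""
    present = {frozenset(e) for e in edges}
    return [(u, v) for u, v in combinations(range(n), 2)
            if frozenset((u, v)) not in present]

def has_clique(n, edges, k):
    """SolveA1 via the reduction: transform the input, then call SolveA2."""
    return has_vertex_cover(n, complement_edges(n, edges), n - k)

# A triangle 0-1-2 with an extra vertex 3 attached to vertex 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
print(has_clique(4, edges, 3))  # the triangle is a 3-clique
print(has_clique(4, edges, 4))  # no 4-clique exists
```

Only `complement_edges` has to run in polynomial time for this to count as a polynomial reduction; the solver for vertex cover can be swapped for any better algorithm without touching `has_clique`.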
14. NP-complete Problems (2)
Graph Coloring
N-Queens Problem (the completion variant, with some queens already placed)
Hamiltonian Cycle
Travelling Salesman Problem
Minesweeper
Task scheduling
Etc.
A lot of interesting problems are NP-hard (some of
them are NP-complete)
15. Conclusions
There are a lot of problems that are very difficult to
solve (NP-hard)
No polynomial time solution is known for them, at
least at this moment in time
We need a method for solving them
Simple solution: backtracking (with heuristics)
16. Backtracking
Useful to solve difficult problems:
Many optimization problems
Combinatorial problems
Problems for which you want to know all the solutions
NP-complete problems
Backtracking improves the brute-force “generate and
test” solution for a problem
Generate and test
Generate all possible solutions
After a solution is final, test if it is correct
17. Generate and Test
Example: k-clique for a graph G(V, E)
1. Generate all combinations of k vertices
1.a. Choose a value from V for the 1st vertex in the clique
1.b. Choose a value from V for the 2nd vertex in the
clique
…
1.?. Choose a value from V for the kth vertex in the clique
2. Test if the generated solution is correct
2.a. If it is correct and you are looking for only 1 solution,
then stop
2.b. Else continue generating the next solution
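The generate-and-test steps for k-clique can be sketched in Python as follows. This is a minimal illustration, assuming the graph is given as a vertex list and an edge list; the function name `find_k_clique` is made up for this example, and `itertools.combinations` plays the role of step 1.

```python
from itertools import combinations

def find_k_clique(vertices, edges, k):
    """Return the first k-clique found as a tuple, or None if there is none."""
    adjacent = {frozenset(e) for e in edges}
    # Step 1: generate every combination of k vertices.
    for candidate in combinations(vertices, k):
        # Step 2: test the finished candidate; every pair must be joined by an edge.
        if all(frozenset(pair) in adjacent
               for pair in combinations(candidate, 2)):
            return candidate  # step 2.a: one solution is enough, so stop
    return None

vertices = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(find_k_clique(vertices, edges, 3))
print(find_k_clique(vertices, edges, 4))
```

Note that a candidate is tested only after all k vertices have been chosen, which is exactly the weakness backtracking addresses.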
18. Alternative View of a Problem
Most of these problems can be transformed into the following
problem:
We have a set of n variables: V1, …, Vn
Each of these variables has a specified domain:
Vi ∈ Dom(Vi) = Domi[1..ki]; each domain has ki possible values to choose
from
One value from its domain should be assigned to each variable in the
final solution
There is a set of constraints that should be respected by the final
solution:
Constraints for a single variable (E.g.: V2 != 3)
Constraints between two variables (E.g.: V2 != V3 or V1+V3 = 2)
Other kinds of constraints
We need to determine a value for each variable that is part of the
domain of that variable, such that this instantiation respects all the defined
constraints
Constraint Satisfaction Problem (CSP)
19. Example – k-clique
We have k variables: V1, …, Vk – the vertices of the
k-clique
Each variable can take values from all the vertices of
the graph G(Vertices[n], Edges[m])
Dom(V1) = … = Dom(Vk) = {1, …, n}
Considering the vertices of the graph are labeled from
1..n
Constraints:
Vi != Vj for all 1 <= i < j <= k
(Vi, Vj) ∈ Edges for all 1 <= i < j <= k
20. Generate and Test – Revisited
For the new formulation of the problems
GenerateAndTest(Vars, Domains, Constraints)
    FOR (Vars[1] in Domains[1])
        FOR (Vars[2] in Domains[2])
            …
                FOR (Vars[n] in Domains[n])
                    CheckConstraints(Vars, Constraints)
Complexity: Θ(k1 * k2 * … * kn) where ki =
size(Domains[i])
If k1 = k2 = … = kn = k => Θ(k^n)
Exponential complexity
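As a rough illustration of this count, the sketch below runs generate and test on a tiny CSP with three variables, each with domain {1, 2, 3}, and the example constraints V2 != 3, V2 != V3 and V1 + V3 = 2. Representing constraints as Python functions over a full candidate tuple is an assumption of this example, as is the name `generate_and_test`; indices are 0-based, so V1 is `v[0]`.

```python
from itertools import product

def generate_and_test(domains, constraints):
    """Return (solutions, number of full candidates that were tested)."""
    solutions, tested = [], 0
    # product(*domains) plays the role of the n nested FOR loops.
    for candidate in product(*domains):
        tested += 1
        if all(check(candidate) for check in constraints):
            solutions.append(candidate)
    return solutions, tested

domains = [[1, 2, 3]] * 3
constraints = [
    lambda v: v[1] != 3,         # V2 != 3
    lambda v: v[1] != v[2],      # V2 != V3
    lambda v: v[0] + v[2] == 2,  # V1 + V3 = 2
]
solutions, tested = generate_and_test(domains, constraints)
print(solutions, tested)
```

All 3 * 3 * 3 = 27 candidates are built and checked, even though only one of them, (1, 2, 1), satisfies every constraint.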
21. Generate and Test – Recursive
We can easily write a recursive solution
Same complexity as the previous algorithm
GenerateAndTestRecursive(Vars[1..n], Domains, Constraints, k)
    IF (k == n + 1)
        CheckConstraints(Vars, Constraints)
    ELSE
        FOR (i = 1; i <= size(Domains[k]); i++)
            Vars[k] = Domains[k][i]
            GenerateAndTestRecursive(Vars[1..n], Domains,
                                     Constraints, k+1)
Initial call: GenerateAndTestRecursive(Vars, Domains, Constraints, 1)
22. Solution Tree
Root level: No variable is assigned
First level: First variable is assigned with all possible
values from the domain
Last level: The last variable is assigned
Complexity: generated by all the levels in the tree!
Depends on the height – d
Depends on the (average) branching factor – b
O(b^d)
More details on whiteboard
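The bound can be obtained by counting the nodes level by level. The derivation below assumes, as a simplification, that every internal node has exactly b children and that b > 1; in a real solution tree the branching factor may vary from level to level.

```latex
\sum_{i=0}^{d} b^{i} \;=\; \frac{b^{d+1} - 1}{b - 1} \;=\; O\!\left(b^{d}\right), \qquad b > 1
```

The last level alone holds b^d nodes, so it dominates the total.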
23. Problems with G&T
A correct solution = a solution that is consistent w.r.t. all
the constraints that are checked
Some inconsistencies appear while building the solution
Why only check for consistency when the solution is
final?
Also check the consistency of partial solutions
If a partial solution is not consistent, abandon it
And assign the next value in the domain to the current variable,
if any left
If no values are left in the domain of the current variable, go
back to the previous one and continue
This is called backtracking
24. Backtracking
Improvement of G&T that checks for the consistency of
the partial solutions
Thus, search in the solution tree is pruned
We can further improve the search by using heuristics
=> the complexity for finding the correct solution is reduced
We cannot reduce the height of the tree
But we can reduce the average branching factor of the pruned
solution tree
However, usually backtracking is still O(b^d) for the worst
case
We cannot guarantee it to have a lower complexity
25. Backtracking – recursive scheme
We can devise a recursive scheme for most problems solvable
using backtracking
BKTRecursive(Vars[1..n], Domains, Constraints, k)
    IF (k == n + 1)
        PrintSolution(Vars)
    ELSE
        // when no next value exists, the index is reset to the first value
        WHILE (ExistsNextValue(Vars[k], Domains[k]))
            Vars[k] = NextValue(Vars[k], Domains[k], Constraints)
            IF (CheckConstraints(Vars, Constraints, k))
                BKTRecursive(Vars[1..n], Domains,
                             Constraints, k+1)
PrintSolution, ExistsNextValue, NextValue, CheckConstraints
are problem- and method-dependent
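One possible Python rendering of the recursive scheme is sketched below. It is not the only way to implement it: here the partial solution is a list that grows and shrinks, the hypothetical callback `is_consistent` stands in for CheckConstraints and receives only the assigned prefix, and solutions are collected in a list rather than printed.

```python
def backtrack(domains, is_consistent, assignment=None, solutions=None):
    """Collect every full assignment whose partial prefixes all pass is_consistent."""
    if assignment is None:
        assignment, solutions = [], []
    k = len(assignment)                  # index of the variable to assign next
    if k == len(domains):
        solutions.append(tuple(assignment))
        return solutions
    for value in domains[k]:             # plain iteration plays the role of NextValue
        assignment.append(value)
        if is_consistent(assignment):    # check the partial solution right away
            backtrack(domains, is_consistent, assignment, solutions)
        assignment.pop()                 # undo the choice and try the next value
    return solutions

def all_different(partial):
    """Example constraint: the newest value must differ from all earlier ones."""
    return partial[-1] not in partial[:-1]

print(backtrack([[1, 2, 3]] * 3, all_different))
```

With the all-different constraint the search yields the six permutations of (1, 2, 3), and any prefix such as (1, 1) is abandoned before its subtree is explored.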
26. Remarks
Initial call:
BKTRecursive(Vars, Domains, Constraints, 1)
Because consistency is verified after choosing each value for
a variable in the partial solution, the final solution
is also consistent
Therefore, just print it
ExistsNextValue, NextValue – method dependent
Simple to implement for usual backtracking, just iterate through
the domain array for each variable until reaching the end
More complex when using BKT with heuristics
CheckConstraints, PrintSolution – problem dependent
Among the only things that change from problem to problem
27. Backtracking – iterative scheme
We can also devise an iterative scheme for most problems
solvable using backtracking
BKTIterative(Vars[1..n], Domains, Constraints)
    k = 1
    WHILE (k >= 1)
        IF (k == n + 1)
            PrintSolution(Vars)
            k--
            CONTINUE
        found = false
        // when no next value exists, the index is reset to the first value
        WHILE (ExistsNextValue(Vars[k], Domains[k]))
            Vars[k] = NextValue(Vars[k], Domains[k], Constraints)
            IF (CheckConstraints(Vars, Constraints, k))
                found = true
                BREAK
        IF (found)
            k++
        ELSE
            k--
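The iterative scheme can be sketched in Python as below. The explicit `index` array, which remembers the current position inside each domain, is an implementation choice of this example (it stands in for ExistsNextValue and NextValue), and the search stops once k drops below the first variable. The callback `is_consistent` is again a hypothetical stand-in for CheckConstraints.

```python
def backtrack_iterative(domains, is_consistent):
    """Iterative backtracking; returns all consistent full assignments."""
    n = len(domains)
    solutions = []
    index = [-1] * n          # index[k]: position in domains[k] of the current value
    values = [None] * n
    k = 0
    while k >= 0:
        if k == n:            # every variable is assigned: record and step back
            solutions.append(tuple(values))
            k -= 1
            continue
        advanced = False
        while index[k] + 1 < len(domains[k]):   # a next value exists
            index[k] += 1
            values[k] = domains[k][index[k]]
            if is_consistent(values[:k + 1]):
                advanced = True
                break
        if advanced:
            k += 1
        else:
            index[k] = -1     # reset so the domain is rescanned on the next visit
            k -= 1
    return solutions

def all_different(partial):
    """Example constraint: the newest value must differ from all earlier ones."""
    return partial[-1] not in partial[:-1]

print(backtrack_iterative([[1, 2, 3]] * 3, all_different))
```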
28. Example: n-Queens Problem
Given a chessboard of size n x n, find a possible
positioning of n queens such that no two queens
attack each other
Or find all possible positionings
n = 8: the usual chessboard
n = 1 => 1 solution
n = 2, 3 => 0 solutions
n = 4 => 2 solutions
…
n = 8 => 92 solutions
…
n = 25 => 2,207,893,435,808,352 solutions
30. n-Queens Problem
Three possible approaches
First approach
n^2 variables – one for each position on the table
Each variable has the domain {0, 1}: 1 if a queen is placed on that
particular position, 0 otherwise
Complexity: O(2^(n*n))
Branching factor for each node: 2
Height of tree: n*n
Second approach
n variables – one for each queen
Each variable has the domain {1, …, n^2} – the position on the table
for each queen
Complexity: O((n^2)^n) = O(n^(2n))
Branching factor: n^2
Height of tree: n
31. n-Queens Problem
CheckConsistency1(Vars[1..n*n], Domains, k)
    FOR (i = 1..k-1)
        // integer division and remainder give the 0-based row and column
        rowi = (i - 1) / n
        columni = (i - 1) % n
        rowk = (k - 1) / n
        columnk = (k - 1) % n
        IF (rowi == rowk || columni == columnk ||
            abs(rowi - rowk) == abs(columni - columnk))
            IF (Vars[i] == 1 AND Vars[k] == 1)
                // we already have a queen on the same row
                // or the same column or the same diagonal
                RETURN false
    RETURN true
32. n-Queens Problem
CheckConsistency2(Vars[1..n], Domains, k)
    FOR (i = 1..k-1)
        // Vars[i] is a position in 1..n^2; convert it to a 0-based row and column
        rowi = (Vars[i] - 1) / n
        columni = (Vars[i] - 1) % n
        rowk = (Vars[k] - 1) / n
        columnk = (Vars[k] - 1) % n
        IF (rowi == rowk || columni == columnk ||
            abs(rowi - rowk) == abs(columni - columnk))
            // we already have a queen on the same row
            // or the same column or the same diagonal
            RETURN false
    RETURN true
33. n-Queens Problem
Third approach
Idea: the queens cannot be placed on the same row!
n variables – one for the position of queen i on row i
(i=1..n)
Each variable has the domain {1, …, n} – the column
where the queen is placed on each row
Complexity: O(n^n)
Branching factor for each node: n
Height of tree: n
The position of each queen would be (i, Vars[i])
i = 1..n
34. n-Queens Problem
CheckConsistency3(Vars[1..n], Domains, k)
    FOR (i = 1..k-1)
        rowi = i
        columni = Vars[i]
        rowk = k
        // always rowk != rowi
        columnk = Vars[k]
        IF (rowi == rowk || columni == columnk ||
            abs(rowi - rowk) == abs(columni - columnk))
            // we already have a queen on the same row
            // or the same column or the same diagonal
            RETURN false
    RETURN true
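A short Python sketch of the third approach follows: one variable per row, whose value is the column of the queen on that row, with the consistency check applied to every partial placement. The name `count_n_queens` and the 0-based rows and columns are choices made for this example.

```python
def count_n_queens(n):
    """Count all placements of n non-attacking queens, one queen per row."""
    columns = []  # columns[i] = column of the queen already placed on row i

    def consistent(row, col):
        # Rows always differ, so only columns and diagonals need checking.
        return all(c != col and abs(r - row) != abs(c - col)
                   for r, c in enumerate(columns))

    def place(row):
        if row == n:
            return 1              # all n queens placed consistently
        total = 0
        for col in range(n):
            if consistent(row, col):
                columns.append(col)
                total += place(row + 1)
                columns.pop()     # backtrack
        return total

    return place(0)

for size in (1, 2, 3, 4, 8):
    print(size, count_n_queens(size))
```

For n = 1, 2, 3, 4 and 8 the counts are 1, 0, 0, 2 and 92.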
35. Example: Graph Coloring Problem
Given an undirected graph G(V, E), can we color
each vertex of the graph using k colors such that any
two vertices joined by an edge have different colors?
For all (u, v) ∈ E: color[u] != color[v]
Modeling the problem:
n = |V| variables – one for each vertex
Each domain has k values: {1, …, k} – the color of each
vertex
Complexity: O(k^n)
Height: n
Branching factor: k
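This model maps directly onto backtracking. The sketch below is one possible Python version, assuming vertices are labeled 0..n-1 and the graph is given as an edge list; the function name `color_graph` is made up for this example. It returns the first coloring found, or None when k colors are not enough.

```python
def color_graph(n, edges, k):
    """Return a list of colors (1..k), one per vertex, or None if impossible."""
    neighbours = [[] for _ in range(n)]
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    color = [None] * n        # None marks a vertex that is not colored yet

    def assign(vertex):
        if vertex == n:
            return True       # every vertex has a consistent color
        for c in range(1, k + 1):
            # consistency check against neighbours that are already colored
            if all(color[w] != c for w in neighbours[vertex]):
                color[vertex] = c
                if assign(vertex + 1):
                    return True
        color[vertex] = None  # no color works here: backtrack
        return False

    return color if assign(0) else None

triangle = [(0, 1), (1, 2), (0, 2)]
print(color_graph(3, triangle, 2))
print(color_graph(3, triangle, 3))
```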
37. Improvements for Backtracking
How can we improve backtracking ?
How we model the problem
Use of heuristics in order to reduce the average
branching factor
Look at n-queens
Variables that have a smaller domain should be instantiated
first
Variables that have the most constraints should be instantiated
first
Etc.
Forward Checking: more on whiteboard
Advanced: Arc-Consistency (AC) algorithms
38. Heuristics for Backtracking
Minimum Remaining Values (MRV) - Choose the variable that
has the least number of valid values in its domain
Combined with Forward-Checking
Most Constraining Variable (MCV) – Choose the variable that
is involved in the most constraints with the remaining (unassigned) variables
Least Constraining Value (LCV) – Choose the value that
imposes the least number of constraints on the remaining
variables
Mainly useful if we want to find a solution, not all of them
More info here:
http://www.ai.kun.nl/aicourses/bki212a/slides/AISP-CSPch5.pdf
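To make MRV combined with forward checking more concrete, here is a small sketch on the n-queens model with one variable per row. The data layout (a dictionary mapping each row to the set of columns still allowed) and the function name `solve_queens_mrv` are assumptions of this example. It looks for a single solution, which is the setting where these heuristics help most.

```python
def solve_queens_mrv(n):
    """Return one solution as {row: column}, or None if there is none."""

    def search(assignment, domains):
        if len(assignment) == n:
            return assignment
        # MRV: choose the unassigned row with the fewest remaining columns.
        row = min((r for r in range(n) if r not in assignment),
                  key=lambda r: len(domains[r]))
        for col in sorted(domains[row]):
            # Forward checking: remove the attacked columns from every other
            # unassigned row before going deeper.
            pruned = {}
            for r in range(n):
                if r in assignment or r == row:
                    pruned[r] = domains[r]
                else:
                    d = abs(r - row)
                    pruned[r] = domains[r] - {col, col - d, col + d}
            # If some unassigned row has no column left, abandon this value.
            if all(pruned[r] for r in range(n)
                   if r not in assignment and r != row):
                result = search({**assignment, row: col}, pruned)
                if result is not None:
                    return result
        return None

    return search({}, {r: set(range(n)) for r in range(n)})

print(solve_queens_mrv(8))
```

Because the domains are pruned after every assignment, each value still present in a domain is already consistent with all placed queens, so no separate consistency check is needed.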
39. Conclusions
There are some problems that are very difficult
NP-complete
But solvable using exponential algorithms
For these problems it’s ok to use backtracking
Maybe with heuristics
However, for optimization problems, instead
of backtracking, it may sometimes be useful to find
an approximate solution and not the optimum one
In polynomial time