3. CS3024-FAZ 3
Algorithm Design and Analysis Process
Understand the problem
Decide on:
Computational means,
exact vs approximate
solving, data structure(s),
algorithm design
technique
Design an algorithm
Prove correctness
Analyze the algorithm
Code the algorithm
An input instance of the problem;
specify the range of instances
The capabilities of a computational
device
Approximation is appropriate when:
The problem cannot be solved exactly,
e.g. extracting square roots
Available exact algorithms are
unacceptably slow
The approximation algorithm is a part of a more
sophisticated exact algorithm
Algorithm + Data Structures =
Program
A general approach to solving
problems algorithmically
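The square-root remark above can be illustrated with a small sketch. It uses Newton's iteration, which is one common choice and not something the slides prescribe; the function name `approx_sqrt` and the `tolerance` parameter are assumptions made for this illustration.

```python
def approx_sqrt(x: float, tolerance: float = 1e-12) -> float:
    """Approximate sqrt(x) for x >= 0 with Newton's iteration (illustrative sketch)."""
    if x < 0:
        raise ValueError("x must be non-negative")
    if x == 0:
        return 0.0
    guess = x if x >= 1 else 1.0
    # Stop once guess^2 is within a relative tolerance of x:
    # the answer is approximate, but the error is below a chosen limit.
    while abs(guess * guess - x) > tolerance * x:
        guess = (guess + x / guess) / 2
    return guess


print(approx_sqrt(2.0))
```

The result is never exact for inputs such as 2, but the caller controls how small the error must be.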
4. CS3024-FAZ 4
Specifying an algorithm:
Using natural language
Using flowchart
Using hardware design
Using program source code
Using pseudocode
Other more convenient form?
Correctness: prove that the
algorithm yields a required result for
every legitimate input in a finite
amount of time
Usually using mathematical
induction
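Can we use simple tracing? Tracing a few inputs can only demonstrate incorrectness: one input on which the algorithm fails is enough
For an approximation algorithm: show that the error does not exceed a predefined limit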
Algorithm Design and Analysis Process
5. CS3024-FAZ 5
Algorithm qualities:
Correctness
Efficiency:
Time efficiency
Space efficiency
Simplicity
Generality: of the problem solved and of the range of inputs accepted
Programming an algorithm:
Peril: the transition from algorithm to program can be made incorrectly or inefficiently
Formal proofs of program correctness?
In practice: testing & debugging
Algorithm Design and Analysis Process
6. CS3024-FAZ 6
Important Problem Types
Sorting
Searching
String processing
Graph problems
Combinatorial problems
Geometric problems
Numerical problems
7. CS3024-FAZ 7
Problem Types: Sorting
The problem: rearrange the items of a given
list in ascending order
In case of records, we need a key
There are dozens of sorting algorithms
Two properties of sorting algorithms:
Stable: it preserves the relative order of any
two equal elements in its input
In place: it does not require extra memory,
except, possibly, for a few memory units
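A brief sketch of the two notions above, using Python's built-in `sorted`, which is documented to be stable. The record contents are made up for illustration; the second field plays the role of the key.

```python
# Records of (name, key); the names are hypothetical sample data.
records = [("Dewi", 3), ("Budi", 1), ("Andi", 3), ("Citra", 1)]

# sorted() is stable: records with equal keys keep their original relative order.
by_key = sorted(records, key=lambda record: record[1])
print(by_key)

# list.sort() rearranges the same list object instead of building a new one
# (it may still use some auxiliary memory internally, so it is not strictly in place).
records.sort(key=lambda record: record[1])
```

Here "Budi" stays ahead of "Citra" and "Dewi" stays ahead of "Andi", because each pair shares a key.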
8. CS3024-FAZ 8
Problem Types: Searching
The problem: finding a given value (search key)
in a given set
Searching algorithms range from
sequential search to binary search (spectacularly
efficient, but limited to sorted arrays) and algorithms based on
representing the set in a different form more
conducive to searching
Challenges:
Very large data set
Update: add, edit, delete
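As a sketch of the "efficient but limited" end of that range, here is a plain binary search. It assumes the input list is already sorted in ascending order; the function name is chosen for this illustration.

```python
def binary_search(sorted_items, key):
    """Return an index of key in sorted_items, or -1 if key is absent.

    Assumes sorted_items is sorted in ascending order.
    """
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == key:
            return mid
        if sorted_items[mid] < key:
            low = mid + 1   # key can only be in the right half
        else:
            high = mid - 1  # key can only be in the left half
    return -1


print(binary_search([3, 14, 27, 31, 39, 42, 55], 31))
```

Each iteration halves the remaining range, which is why the method is so fast, but every insertion or deletion must keep the list sorted.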
9. CS3024-FAZ 9
Problem Types: String Processing
String = a sequence of characters from
an alphabet
Particular interest: text strings, binary
strings, gene sequences etc.
One particular problem: string matching
Searching for a given word in a text
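A minimal sketch of string matching by brute force, which simply tries every alignment of the pattern against the text. This is only the most straightforward approach, not necessarily the one the course goes on to recommend; the function name is an assumption.

```python
def brute_force_match(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    for start in range(n - m + 1):
        matched = 0
        # Compare the pattern with the text character by character.
        while matched < m and text[start + matched] == pattern[matched]:
            matched += 1
        if matched == m:
            return start
    return -1


print(brute_force_match("NOBODY_NOTICED_HIM", "NOT"))
```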
10. CS3024-FAZ 10
Problem Types: Graph Problems
Basic graph algorithms: graph traversal,
shortest-path, topological sorting for graphs
with directed edges
Some problems are computationally very
hard (only very small instances can be
solved in a realistic amount of time):
Traveling Salesman Problem
Graph-Coloring Problem
11. CS3024-FAZ 11
Problem Types: Combinatorial Problems
The problem: find a combinatorial object –such
as a permutation, a combination, or a subset –
that satisfies certain constraints and has some
desired property
The most difficult problems
The number of combinatorial objects typically grows
extremely fast with a problem’s size
There are no known exact algorithms for solving such
problems in an acceptable amount of time
From a more abstract perspective, TSP & GCP
are examples of combinatorial problems
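To make the growth of the search space concrete, the sketch below solves a tiny Traveling Salesman instance by checking every permutation of the cities. The distance matrix is invented for illustration, and exhaustive search is used only because it is the simplest exact method; with n cities it examines (n-1)! tours.

```python
from itertools import permutations


def shortest_tour(dist):
    """Exhaustively find the shortest tour that starts and ends at city 0."""
    n = len(dist)
    best_length, best_tour = float("inf"), None
    for order in permutations(range(1, n)):  # (n-1)! candidate orders
        tour = (0,) + order + (0,)
        length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if length < best_length:
            best_length, best_tour = length, tour
    return best_length, best_tour


# Hypothetical symmetric distances between 4 cities.
distances = [
    [0, 2, 5, 7],
    [2, 0, 8, 3],
    [5, 8, 0, 1],
    [7, 3, 1, 0],
]
print(shortest_tour(distances))
```

Four cities give only 6 tours to check, but 20 cities already give 19! (more than 10^17) tours.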
12. CS3024-FAZ 12
Problem Types: Geometric Problems
Deals with geometric objects: points, lines,
polygons etc.
Ancient Greeks: constructing simple geometric
shapes (triangles, circles, etc.) with an unmarked
ruler and a compass
Today: applications to computer graphics,
robotics, tomography etc.
Classic problems:
Closest-pair problem: given n points in the plane, find
the closest pair among them
Convex hull problem: find the smallest convex
polygon that would include all the points of a given set
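The closest-pair problem can be sketched by brute force: compute the distance between every pair of points and keep the smallest. This is an illustrative baseline only; the sample points are invented.

```python
from itertools import combinations
from math import dist


def closest_pair(points):
    """Return the pair of points with the smallest Euclidean distance."""
    # Examines all n(n-1)/2 pairs.
    return min(combinations(points, 2), key=lambda pair: dist(*pair))


sample_points = [(0, 0), (5, 5), (1, 1), (9, 0)]
print(closest_pair(sample_points))
```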
13. CS3024-FAZ 13
Problem Types: Numerical Problems
Involve mathematical objects of continuous nature:
solving equations and systems of equations, computing
definite integrals, evaluating functions etc.
The majority of such mathematical problems can only
be solved approximately
Computers can only represent real numbers approximately
Round-off errors accumulate
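A small sketch of round-off accumulation with ordinary double-precision floats. The value 0.1 has no exact binary representation, so repeated addition drifts away from the mathematically exact sum; `math.fsum` is shown as one standard-library way to reduce the effect.

```python
from math import fsum

total = 0.0
for _ in range(10):
    total += 0.1  # each addition introduces a tiny rounding error

print(total == 1.0)              # False: the errors have accumulated
print(fsum([0.1] * 10) == 1.0)   # True: fsum tracks the lost low-order bits
```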
The computing industry’s focus has shifted from numerical analysis (in
industry & science) to business applications (information
storage, retrieval, transportation through networks, and
presentation to users)
14. CS3024-FAZ 14
The Need for Efficient Algorithms
Suppose that you have an infinitely fast
computer equipped with an unlimited amount
of free memory.
Do you still have any reason to study
algorithms?
15. CS3024-FAZ 15
Absolutely YES!
You still have to demonstrate that your
solution method terminates and does so
with the correct answer
16. CS3024-FAZ 16
Back to the real world
Computers may be fast, but they are not
infinitely fast
Memory may be cheap, but it is not free.
Bounded resources:
Computing time
Space in memory
These resources must be used wisely, and
efficient algorithms will help you do so
17. CS3024-FAZ 17
Efficiency: An Illustration
Pick two sorting algorithms:
Insertion sort: takes time roughly equal to c1 n^2 to sort
n items, i.e. it grows like n^2
Merge sort: takes time roughly equal to c2 n log2 n to sort n items, i.e. it grows like n log2 n
Typically c1 < c2, but the constant factors are far less significant than the input size n
Insertion sort is usually faster than merge sort
for small input sizes.
Once the input size n becomes large enough,
merge sort’s advantage of log2 n vs n will more
than compensate for the difference in constant factors
18. CS3024-FAZ 18
Efficiency: Concrete example (1)
Array to sort: 10^6 numbers
Computer A: 10^9 instructions/sec; running insertion
sort; written by the craftiest programmer; the code requires 2n^2 instructions
Computer B: 10^7 instructions/sec; running merge
sort; written by an average programmer in a high-level language; the
code requires 50 n log2 n instructions
Comp A: 2(10^6)^2 / 10^9 = 2000 sec
Comp B: 50 · 10^6 · log2 10^6 / 10^7 ≈ 100 sec
19. CS3024-FAZ 19
Efficiency: Concrete example (2)
By using an algorithm whose running time
grows more slowly, even with a poor
compiler, comp B runs 20 times faster than A.
Let’s try to sort 10^7 numbers…
Comp A: insertion sort time ≈ 2.3 days
Comp B: merge sort time < 20 minutes
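The arithmetic in this example can be checked with a short sketch. The instruction counts 2n^2 and 50 n log2 n and the two machine speeds come from the slides; the function names are assumptions.

```python
from math import log2


def insertion_sort_seconds(n, instructions_per_second=1e9):
    """Computer A: 2 n^2 instructions at 10^9 instructions/sec."""
    return 2 * n ** 2 / instructions_per_second


def merge_sort_seconds(n, instructions_per_second=1e7):
    """Computer B: 50 n log2 n instructions at 10^7 instructions/sec."""
    return 50 * n * log2(n) / instructions_per_second


for n in (10 ** 6, 10 ** 7):
    print(n, insertion_sort_seconds(n), merge_sort_seconds(n))
```

For n = 10^7 this gives 200,000 seconds (about 2.3 days) for computer A and roughly 1,160 seconds (under 20 minutes) for computer B.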
21. CS3024-FAZ 21
Measuring an input’s size
Almost all algorithms run longer on larger inputs
It’s logical to investigate an algorithm’s efficiency
as a function of some parameter n indicating the
algorithm’s input size
In most cases, selecting n is straightforward;
e.g. the size of the list for sorting, searching etc.
For the problem of evaluating a polynomial of
degree n, it will be the polynomial’s degree or the
number of its coefficients
22. CS3024-FAZ 22
The Choice of a Parameter
Indicating an Input Size Does Matter
Example: computing the product of two n-by-n matrices
Two natural measures:
The matrix order n
The total number of elements N in the matrices
being multiplied; this measure is also applicable to n-by-m
matrices
23. CS3024-FAZ 23
The Choice can be Influenced by
Operations of the Algorithm
How should we measure an input’s size for a
spell-checking algorithm?
If it examines individual characters of its input: the
number of characters
If it works by processing words: the number of
words
For algorithms involving properties of numbers
(e.g. is integer n prime?)
Size = number of bits b in n’s binary
representation
b = ⌊log2 n⌋ + 1
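A quick check of the bit-count measure, as a sketch. Python integers provide `bit_length()`, which gives b exactly; the floating-point formula is included only to compare it against the definition for small n, since `log2` can be imprecise for very large integers.

```python
from math import floor, log2


def bits_by_formula(n: int) -> int:
    """b = floor(log2 n) + 1, for a positive integer n."""
    return floor(log2(n)) + 1


for n in (1, 8, 255, 256, 1000):
    print(n, bin(n), n.bit_length(), bits_by_formula(n))
```

Note how slowly b grows: doubling n adds only one bit to the input size.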
24. CS3024-FAZ 24
Unit for Measuring Running Time
Can we use some standard units of time
measurement –a second, a millisecond, and so
on– ?
Drawbacks: dependence on the speed of a particular
computer, dependence on the quality of a program,
difficulty of clocking the actual running time of the
program
One possible approach: count the number of
times each of the algorithm’s operations is
executed; this is difficult and usually unnecessary
25. CS3024-FAZ 25
Basic Operation
Identify the most important operation of the
algorithm (basic operation)
The operation contributing the most to the total
running time
Compute the number of times the basic
operation is executed
The established framework for analysis of an
algorithm’s time efficiency: counting the number
of times the algorithm’s basic operation is
executed on input of size n
26. CS3024-FAZ 26
Orders of Growth
Example: gcd(8,12) is a piece of cake;
gcd(7898846643,5612346991236) ???
For large values of n, it is the function’s
order of growth that counts
Ignore the constant multiple
E.g.: ½ n^2 → n^2, 50 n log2 n → n log2 n
Some important functions:
log2 n, n, n log2 n, n^2, n^3, 2^n, n!
How do these functions grow with n?
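One way to get a feel for these functions is simply to tabulate them. The sketch below evaluates each one for a couple of input sizes; the dictionary and function names are assumptions for illustration.

```python
from math import factorial, log2

GROWTH_FUNCTIONS = {
    "log2 n": lambda n: log2(n),
    "n": lambda n: n,
    "n log2 n": lambda n: n * log2(n),
    "n^2": lambda n: n ** 2,
    "n^3": lambda n: n ** 3,
    "2^n": lambda n: 2 ** n,
    "n!": lambda n: factorial(n),
}


def growth_row(n):
    """Evaluate every listed function at the given n."""
    return {name: f(n) for name, f in GROWTH_FUNCTIONS.items()}


for n in (10, 100):
    print(n, growth_row(n))
```

Even at n = 100, 2^n and n! are far beyond anything a computer could count through, while log2 n is still below 7.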
27. CS3024-FAZ 27
Input Size Alone Is Not Enough
Consider sequential search algorithm
//Input: array A[0..n-1], search key K
//Output: index of first element of A that
// matches K or -1 if there is no match
i ← 0
while i < n and A[i] ≠ K do
i ← i + 1
if i < n return i else return -1
The running time of this algorithm can be quite
different for the same list size n
What are the best, the worst, and the average case?
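The sequential search pseudocode above can be sketched in Python with a counter for key comparisons, taken here as the basic operation. Returning the count alongside the index is an addition made for illustration only.

```python
def sequential_search(a, key):
    """Return (index of first match or -1, number of key comparisons made)."""
    comparisons = 0
    i = 0
    while i < len(a):
        comparisons += 1  # one comparison of a[i] with the key
        if a[i] == key:
            return i, comparisons
        i += 1
    return -1, comparisons


data = [7, 3, 9, 3]
print(sequential_search(data, 7))  # best case: match at the first position
print(sequential_search(data, 5))  # worst case: no match, n comparisons
```

For the same n = 4, the count ranges from 1 comparison to 4, depending only on where (or whether) the key occurs.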
28. CS3024-FAZ 28
The Worst-case Efficiency
Cworst(n): Algorithm’s efficiency for the
worst-case input of size n
Algorithm runs the longest among all possible
inputs of that size
Cworst(n) is important: it bounds the algorithm’s running
time from above
It guarantees that for any instance of size n,
the running time will not exceed Cworst(n)
29. CS3024-FAZ 29
The Best-case Efficiency
Cbest(n): Algorithm’s efficiency for the best-case
input of size n
Algorithm runs the fastest among all possible inputs
of that size
Usefulness: for some algorithms a good best-
case performance extends to some types of
inputs close to being the best-case one
E.g.: best-case inputs for insertion sort are already
sorted arrays; this good best-case performance deteriorates only
slightly for almost sorted arrays, so insertion sort
might be the method of choice for applications dealing
with almost sorted arrays
30. CS3024-FAZ 30
The Average-case Efficiency
Neither the worst-case nor the best-case
analysis yields the necessary information
about an algorithm’s behavior on ‘typical’ or
‘random’ input; that is the role of the average-case efficiency
Cavg(n)
To analyze it, we must make some assumptions about
possible inputs of size n
31. CS3024-FAZ 31
Cavg(n) for Sequential Search (1)
Standard assumptions:
The probability of successful search is equal to p (0 ≤
p ≤ 1)
The probability of the first match occurring in the i-th
position is the same for every i
Under these assumptions, the probability of the first match occurring in the i-th
position is p/n for every i, and the number of
comparisons made in that case is i
In the case of an unsuccessful search, the number
of comparisons is n, and the probability of such a search is (1-p)
32. CS3024-FAZ 32
Cavg(n) for Sequential Search (2)
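$$
\begin{aligned}
C_{avg}(n) &= \left[1\cdot\frac{p}{n} + 2\cdot\frac{p}{n} + \dots + i\cdot\frac{p}{n} + \dots + n\cdot\frac{p}{n}\right] + n\cdot(1-p) \\
&= \frac{p}{n}\,[1 + 2 + \dots + i + \dots + n] + n\cdot(1-p) \\
&= \frac{p}{n}\cdot\frac{n(n+1)}{2} + n\cdot(1-p) \\
&= \frac{p\,(n+1)}{2} + n\cdot(1-p)
\end{aligned}
$$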
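As a sanity check on the average-case count for sequential search, the sketch below computes the expected number of comparisons directly from the stated assumptions (a match at position i has probability p/n and costs i comparisons; an unsuccessful search has probability 1-p and costs n) and compares it with the closed form p(n+1)/2 + n(1-p). The function names are assumptions; exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def c_avg_by_definition(n, p):
    """Expected comparisons: sum over match positions plus the unsuccessful case."""
    successful = sum(i * p / n for i in range(1, n + 1))
    unsuccessful = n * (1 - p)
    return successful + unsuccessful


def c_avg_closed_form(n, p):
    """Closed form: p(n+1)/2 + n(1-p)."""
    return p * (n + 1) / 2 + n * (1 - p)


for p in (Fraction(0), Fraction(1, 2), Fraction(1)):
    print(p, c_avg_by_definition(10, p), c_avg_closed_form(10, p))
```

With p = 1 (the key is always present) the average is (n+1)/2, about half the list; with p = 0 it is n, the same as the worst case.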
33. CS3024-FAZ 33
Notes on Cavg(n)
Do we really need Cavg(n)?
Yes: for many important algorithms it is much better than the overly pessimistic Cworst(n) would suggest
Cavg(n) cannot be obtained by taking the
average of Cworst(n) and Cbest(n)
34. CS3024-FAZ 34
--Amortized Efficiency--
Proposed by Robert Tarjan
It applies not to a single run of an
algorithm but rather to a sequence of
operations performed on the same data
structure
In some situations, a single operation can
be expensive, but the total time for an entire
sequence of n such operations is always
significantly better than Cworst(n) of that
single operation multiplied by n
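A standard illustration of this idea, not taken from the slides, is appending to an array that doubles its capacity whenever it is full. A single append may copy every existing element, yet the total copying over n appends stays below 2n. The sketch below only counts the copies; the function name is an assumption.

```python
def total_copy_cost(n_appends: int) -> int:
    """Count element copies made by n appends to a capacity-doubling array."""
    capacity, size, copies = 1, 0, 0
    for _ in range(n_appends):
        if size == capacity:
            copies += size   # expensive step: move every element to a bigger array
            capacity *= 2
        size += 1
    return copies


for n in (8, 9, 1000):
    print(n, total_copy_cost(n))
```

The worst single append costs about n copies, so the naive bound for n appends would be about n^2, far above the actual total.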
35. CS3024-FAZ 35
Exercises (1)
1. Variation of sequential search: return the
number of occurrences of a given search
key in the list. Into which efficiency class does it fall?
2. Suggest how any sorting algorithm can
be augmented in a way to make the
best-case count of its key comparisons
equal to just n-1
36. CS3024-FAZ 36
Exercises (2)
3. According to a well-known legend, the game of
chess was invented many centuries ago in
northwestern India by a sage named Shashi.
When he took his invention to his king, the king
liked the game so much that he offered the
inventor any reward he wanted. Shashi asked
for some grain to be obtained as follows: just a
single grain of wheat was to be placed on the
first square of the chess board, two on the
second, four on the third, eight on the fourth,
and so on, until all 64 squares had been filled.
What would the ultimate result of this algorithm
have been?