2. Introduction
Often the result of analyzing an algorithm is a
summation
Loops directly translate to summations, and
recursion can often be reduced to a summation
Section 3.1 discusses a number of useful
summation formulas and properties; refer to it
as needed throughout the semester
3. Bounding Summations
Being able to determine the upper bound on
a summation is important if we want to
actually use the summation in our analysis
We will investigate four methods of bounding
summations:
Using induction
Bounding the terms
Splitting the summations
Approximation by integrals
4. Bounding Summations Using Induction
We’ve already discussed induction as a method for
bounding recurrences and for solving summations
Induction can also be used to show the bound on a
summation, rather than the exact value of the summation
As an example, we will show the bound for the
following summation:
∑_{k=0}^{n} 3^k = O(3^n)
5. Bounding Summations Using Induction
We want to show ∑_{k=0}^{n} 3^k ≤ c·3^n for some constant c
For n = 0, ∑_{k=0}^{0} 3^k = 1 ≤ c·1 for all c ≥ 1
We assume the bound holds for n, and must prove that it holds for n+1:
∑_{k=0}^{n+1} 3^k = ∑_{k=0}^{n} 3^k + 3^{n+1}
                  ≤ c·3^n + 3^{n+1}
                  = (1/3 + 1/c)·c·3^{n+1}
                  ≤ c·3^{n+1}
as long as (1/3 + 1/c) ≤ 1, or c ≥ 3/2
6. Bounding Summations by Bounding
the Terms
Sometimes a series can be bounded by
bounding the individual terms in the series
We can quickly bound a series using the
largest term of the series, then derive a
series bound from it:
∑_{k=1}^{n} k ≤ ∑_{k=1}^{n} n = n²
7. Bounding Summations by Bounding
the Terms
In general, for a series ∑_{k=1}^{n} a_k, if we let
a_max = max(a_1, …, a_n), then:
∑_{k=1}^{n} a_k ≤ n·a_max
This technique is a weak method for bounding a
summation if the series can instead be bounded
by a geometric series.
8. Bounding Summations by Bounding
the Terms
Suppose we have a series such that a_{k+1}/a_k ≤ r
for some constant r < 1 and all k ≥ 0
In other words, the ratio of consecutive elements
in the series is at most some constant r < 1
If this property holds for all k, then every
element in the series satisfies a_k ≤ a_0·r^k
In this case, we can bound the series using
an infinite decreasing geometric series:
9. Bounding Summations by Bounding
the Terms
∑_{k=0}^{n} a_k ≤ ∑_{k=0}^{∞} a_0·r^k
              = a_0·∑_{k=0}^{∞} r^k
              = a_0/(1 − r)
10. Bounding Summations by Bounding
the Terms
For example, suppose we want to bound the summation
∑_{k=1}^{∞} k/3^k
The first term is just 1/3, and the ratio of consecutive terms is:
((k+1)/3^{k+1}) / (k/3^k) = (1/3)·(k+1)/k ≤ 2/3 for all k ≥ 1
We can now use this value r = 2/3 to create our bound:
∑_{k=1}^{∞} k/3^k ≤ (1/3)·∑_{k=0}^{∞} (2/3)^k
                  = (1/3)·(1/(1 − 2/3))
                  = 1
11. Bounding Summations by Splitting
Summations
Difficult summations can sometimes be “split
apart” into pieces that are easier to solve
individually
In these situations, the summation range is
split and the summation expressed as the
sum of each partition
This technique can be used to ignore a small
number of initial terms, when each term in the
summation is independent of n
12. Bounding Summations Using
Approximation by Integrals
Approximating the summation through the
use of integration provides a convenient
means of obtaining a bound
This technique can be used when the
summation can be expressed as the sum of
some f(k), where f(k) is monotonically
increasing or decreasing
In other words, x ≥ y implies f(x) ≥ f(y)
(monotonically increasing) or f(x) ≤ f(y)
(monotonically decreasing)
13. Bounding Summations Using
Approximation by Integrals
If f(k) is monotonically increasing, it can be
approximated by the integrals:
∫_{m−1}^{n} f(x) dx ≤ ∑_{k=m}^{n} f(k) ≤ ∫_{m}^{n+1} f(x) dx
Similarly, if f(k) is monotonically decreasing, it can
be approximated by the integrals:
∫_{m}^{n+1} f(x) dx ≤ ∑_{k=m}^{n} f(k) ≤ ∫_{m−1}^{n} f(x) dx
14. Bounding Summations Using
Approximation by Integrals
As an example, consider the harmonic function:
H_n = ∑_{k=1}^{n} 1/k
For a lower bound, we obtain:
∑_{k=1}^{n} 1/k ≥ ∫_{1}^{n+1} dx/x = ln(n + 1)
The upper bound is similar:
∑_{k=2}^{n} 1/k ≤ ∫_{1}^{n} dx/x = ln n
so ∑_{k=1}^{n} 1/k ≤ ln n + 1
15. Counting Theory
Attempts to answer the question of “how
many” without enumerating all the
possibilities
E.g., how many permutations of the string
“discovery” are there?
One way to find out is to write ‘em all down
Counting theory lets us calculate the answer without
having to
16. Rules of sum and product
Given a set of items, we can sometimes
count the items using one of these rules:
Rule of sum: the number of ways to choose an
element from one of two disjoint sets is the sum of
the size of the sets
Rule of product: the number of ways to choose
an ordered pair is the number of ways to choose
the first element times the number of ways to
choose the second element
17. Strings
String = a sequence of elements from the
same set
If the string is of length k, we sometimes call it a k-
string
Given a string s, a substring s’ is an ordered
sequence of consecutive elements of s
A k-substring is therefore a substring of length k
Given a set S of size n, how many k-strings
are in the set?
18. How many k-strings are in a set?
If we have n elements, and can pick any
element for any position in the string, then
there’s n choices for the first, n for the
second, etc.
The rule of product applies over the entire string
The answer is therefore n^k
19. Permutations
A permutation is an ordered sequence of all
the elements of set S, with each element
appearing once
For example, if S = { c, a, t }, there are 6
permutations of S:
cat, cta, act, atc, tca, tac
For the entire set S consisting of n elements,
there are n! permutations
20. K-Permutations
A k-permutation consists of k elements from
S, with no element appearing more than once
E.g., if S = { a, b, c, d }, then there are 12 2-
permutations:
ab, ac, ad, ba, bc, bd, ca, cb, cd, da, db, dc
If S has n elements, then the number of k-
permutations in S is (n!)/(n-k)!
21. Combinations
A k-combination of an n-set S is simply a k-
subset of S
Combinations must be distinct, but elements in
the combination are unordered (unlike
permutations)
For every k-combination, there are k!
permutations
Each permutation is a distinct k-permutation of S
The number of k-combinations is therefore just the
number of k-permutations divided by k!:
n!/(k!·(n−k)!)
22. Probability
Probability is defined in terms of a sample
space S, a set whose elements are called
elementary events
Each elementary event is a possible outcome of
an experiment
E.g., flipping two coins can result in one of 4 elementary
events, which makes up the sample space:
S = { HH, HT, TH, TT }
23. Probability
An event is a subset of the sample space S
E.g., the event of obtaining one head and one tail
is the subset { HT, TH }
The event S is called the certain event
The event {} is called the null event
Two events A and B are mutually exclusive if
they cannot occur simultaneously
I.e., A ∩ B = ∅
24. Probability
The probability of an event A is written Pr{A}
A probability distribution is a way to map the
events of S to real numbers, such that these
axioms are met:
Pr{A} >= 0 for any event A
Pr{S} = 1
Pr{A ∪ B} = Pr{A} + Pr{B} for any two mutually
exclusive events A and B
25. Discrete Probability Distributions
A distribution is discrete if it is defined over a
finite or countably infinite sample space
For any event A ⊆ S: Pr{A} = ∑_{s∈A} Pr{s}
A uniform distribution is a distribution such
that all elementary events are equally likely:
Pr{s} = 1/|S|
I.e., picking an element of S at random
26. Continuous Uniform Probability
Distributions
A probability distribution in which not all
subsets of the sample space are considered
to be events
They are defined over a closed interval [a, b],
with each point in the interval being equally
likely
Since the number of points are uncountable, we
cannot satisfy axioms 2 & 3 – the probability of
each “point” is effectively 0
27. Continuous Uniform Probability
Distributions
Given a closed range [a, b], and any closed
interval on that range [c, d] such that a <= c
<= d <= b, the continuous uniform probability
distribution defines the probability of the
event [c, d] to be
Pr{[c, d]} = (d − c)/(b − a)
28. Discrete Random Variables
A discrete random variable X is a function
from a finite or countably infinite sample
space S to the real numbers
It associates a real number with each possible
outcome of an experiment
This lets us work on the probability distribution
X is random in the sense that its value
depends on the outcome of some
experiment, and cannot be predicted with
certainty before the experiment is run
29. Discrete Random Variables
To use discrete random variables, we must
define a probability density function:
This is the probability that X is some particular
value or event
It is simply the sum of the probabilities of all the
individual events represented by the random
variable
Pr{X = x} = ∑_{s∈S : X(s)=x} Pr{s}
30. Discrete Random Variables
Let’s look at rolling 2 6-sided dice:
X is a random variable defining the maximum of
the two values shown on the dice
There are 36 possible elementary events (2 dice,
each has 6 faces)
If we define X=3, meaning the highest value on
either die is three, then Pr{X=3} = 5/36 (five
possible outcomes out of 36 total)
31. Expected Value of a Random Variable
This is the “average” of the values it takes on
Expected value defines the center of the
distribution of the variable, i.e., if we were to run
the experiment an infinite number of times, the
expected value is the mean value of X over all
experiments
E[X] = ∑_x x·Pr{X = x}
32. Variance and Standard Deviation
Variance: Var[X] = E[X²] − (E[X])²
This is a measure of how much the distribution
varies
Standard Deviation: sqrt(Var[X])
33. QuickSort
void QuickSort(ArrayType &A, int begin, int end)
{
    if ( begin < end )
    {
        int q = Partition(A, begin, end);
        QuickSort(A, begin, q-1);
        QuickSort(A, q+1, end);
    }
}
This is a great algorithm if all inputs are equally
likely
That’s not always the case!
We can overcome the problem of worst case input
by introducing some randomness into the algorithm
34. QuickSort
When does the worst case occur in
QuickSort? Why?
It occurs because the partition algorithm does not
partition the array evenly
One way to improve it might be to partition around
the middle element – but even so, there is still a
worst case such that the array is anti-optimally
partitioned
35. Randomized Algorithms
An algorithm is randomized if its behavior is
determined not only by the input, but also by
the output of a random number generator
Let’s assume a random number generator
with a function Random(a, b), that returns a
random number in the range [a, b]
We will use a random number generator to
impose a distribution such that no particular
input elicits its worst case behavior
36. Randomized Algorithms
A randomized strategy is useful when there
are many ways an algorithm can proceed, but
no good way to know which way is “good”; if
many alternatives are good, you pick one
randomly
The benefits of good choices must outweigh the
cost of bad choices
37. Randomized QuickSort
How do we randomize QuickSort?
Create a RandomizedPartition function, then use
that in our main QuickSort function:
void RandomizedQuickSort(ArrayType &A, int begin, int end)
{
    if ( begin < end )
    {
        int q = RandomizedPartition(A, begin, end);
        RandomizedQuickSort(A, begin, q-1);
        RandomizedQuickSort(A, q+1, end);
    }
}
39. Analysis of Randomized QuickSort
How does this change our previous analysis?
We have added a constant factor to the Partition
running time, which can be ignored
However, we have made worst-case behavior
nearly impossible – no particular input can create
it, only an unlucky partitioning
So the analysis doesn’t actually change, however
we have made the average or expected case
much more likely, and the worst case much less
likely
40. Random Numbers
Adding randomness to an algorithm implies
an ability to generate random numbers
Computers are unfortunately not directly
capable of generating truly random
sequences
The approach that is generally taken is
therefore to generate a sequence of “pseudo-
random” numbers that exhibits good random
behavior
Ref: Numerical Recipes in C, Chapter 7, available online at
http://www.ulib.org/webRoot/Books/Numerical_Recipes/
41. Random Numbers
Most languages have a set of library
functions for generating pseudo-random
numbers
System supplied generators typically suffer from a
number of problems due to poor specification and
implementation
The sequence generally repeats with a period no greater
than 32767
The randomness of the sequence is highly dependent
on the implementor’s choice of constants used by the
algorithm, and in many standard implementations the
choice is poor
42. Random Numbers
One fast method of choosing random
numbers is the linear congruential method
Each number in the sequence is determined by a
mathematical operation performed on the
previous choice
I_{j+1} = (a·I_j + c) mod m
m is the modulus, and determines the periodicity
of the generated sequence
a is called the multiplier
c is called the increment
43. Random Numbers
The quality of the generator here is highly
dependent on the choice of m, a, and c
Poor choices will limit the period, and more
importantly, significantly impact the randomness
of the sequence
We can eliminate the explicit mod operation by using
32-bit integers and choosing m = 2^32
The multiplication result would be 64 bits, but since
we’re using 32-bit variables, the high-order bits are
simply truncated
Some good choices for a and c are: a = 1664525,
c = 1013904223
44. A Random Number Generator
Implementation
class Random
{
public:
    static const unsigned long RANDMAX, MULTIPLIER, INCREMENT;

    explicit Random(unsigned long seed = 0)
        : m_seed(seed) {}

    // With a 32-bit unsigned long, overflow supplies the mod 2^32
    unsigned long operator()(void)
    { return (m_seed = MULTIPLIER*m_seed + INCREMENT); }

    unsigned long seed(void) const { return m_seed; }
    void seed(unsigned long value) { m_seed = value; }

private:
    unsigned long m_seed;
};
45. Using the Random Number Generator
This generator will produce a number between 0 and
2^32 − 1
Note that prior to generating numbers, the generator
should be seeded with a non-zero value
If repeatability isn’t required, time() is often used to
generate the seed
To produce a number in an arbitrary range [LO, LO+HI−1],
use the following:
j = LO + (int)((double)HI * rand() / (RAND_MAX + 1.0));
This forces the use of the high-order bits, which are
much more random than the low-order bits