Cis435 week03


Published in: Education

  1. Summations, Probability, and Randomized Algorithms
     Advanced Data Structures & Algorithm Design
  2. Introduction
     • Often the result of analyzing an algorithm is a summation
     • Loops translate directly to summations, and recursion can often be reduced to a summation
     • Section 3.1 discusses a number of useful summation formulas and properties; refer to it as needed throughout the semester
  3. Bounding Summations
     • Being able to determine an upper bound on a summation is important if we want to actually use the summation in our analysis
     • We will investigate four methods of bounding summations:
       • Using induction
       • Bounding the terms
       • Splitting the summations
       • Approximation by integrals
  4. Bounding Summations Using Induction
     • We’ve already discussed induction as a method for bounding recurrences and for solving summations
     • Induction can also be used to show a bound on a summation, rather than its exact value
     • As an example, we will show the following bound:
       $$\sum_{k=0}^{n} 3^k = O(3^n)$$
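  5. Bounding Summations Using Induction
     • We want to show that $\sum_{k=0}^{n} 3^k \le c \cdot 3^n$ for some constant $c$
     • For $n = 0$, $\sum_{k=0}^{0} 3^k = 1 \le c \cdot 1$ for all $c \ge 1$
     • We assume the bound holds for $n$, and must prove that it holds for $n + 1$:
       $$\sum_{k=0}^{n+1} 3^k = \sum_{k=0}^{n} 3^k + 3^{n+1} \le c \cdot 3^n + 3^{n+1} = \left(\frac{1}{3} + \frac{1}{c}\right) c \cdot 3^{n+1} \le c \cdot 3^{n+1}$$
       as long as $(1/3 + 1/c) \le 1$, or $c \ge 3/2$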
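The induction result can be sanity-checked numerically. The sketch below is only an illustration, not part of the slides: the function names are hypothetical, and `n` is limited to 38 so that everything fits in 64-bit unsigned arithmetic. It tests the bound with $c = 3/2$, rewritten as $2 \sum 3^k \le 3 \cdot 3^n$ to avoid fractions.

```cpp
#include <cassert>
#include <cstdint>

// Returns sum_{k=0}^{n} 3^k, computed exactly.
// Assumption: n <= 38, so that the sum and 3 * 3^n both fit in 64 bits.
std::uint64_t sum_powers_of_three(int n) {
    assert(n >= 0 && n <= 38);
    std::uint64_t sum = 0;
    std::uint64_t power = 1;  // holds 3^k
    for (int k = 0; k <= n; ++k) {
        sum += power;
        if (k < n) power *= 3;
    }
    return sum;
}

// Checks sum_{k=0}^{n} 3^k <= (3/2) * 3^n using integers only:
// multiply both sides by 2 to get 2 * sum <= 3 * 3^n.
bool bound_holds(int n) {
    assert(n >= 0 && n <= 38);
    std::uint64_t power = 1;  // becomes 3^n
    for (int k = 0; k < n; ++k) power *= 3;
    return 2 * sum_powers_of_three(n) <= 3 * power;
}
```

Calling `bound_holds(n)` for each `n` in the supported range should return `true`, which agrees with the constant $c = 3/2$ found by the induction.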
  6. Bounding Summations by Bounding the Terms
     • Sometimes a series can be bounded by bounding its individual terms
     • We can quickly bound a series using its largest term, then derive a bound on the series from it:
       $$\sum_{k=1}^{n} k \le \sum_{k=1}^{n} n = n^2$$
  7. Bounding Summations by Bounding the Terms
     • In general, for a series $\sum_{k=1}^{n} a_k$, if we let $a_{\max} = \max_{1 \le k \le n} a_k$, then:
       $$\sum_{k=1}^{n} a_k \le n \cdot a_{\max}$$
     • This technique is a weak method for bounding a summation if the series can instead be bounded by a geometric series
  8. Bounding Summations by Bounding the Terms
     • Suppose we have a series such that $a_{k+1}/a_k \le r$ for some constant $r < 1$ and all $k \ge 0$
     • In other words, the ratio of consecutive terms never exceeds a constant that is smaller than 1
     • If this property holds for all $k$, then every term satisfies $a_k \le a_0 r^k$
     • In this case, we can bound the series using an infinite decreasing geometric series:
  9. Bounding Summations by Bounding the Terms
       $$\sum_{k=0}^{n} a_k \le \sum_{k=0}^{\infty} a_0 r^k = a_0 \sum_{k=0}^{\infty} r^k = a_0 \frac{1}{1 - r}$$
  10. Bounding Summations by Bounding the Terms
     • For example, suppose we want to bound the summation $\sum_{k=1}^{\infty} k/3^k$
     • The first term is just 1/3, and the ratio of consecutive terms is:
       $$\frac{(k+1)/3^{k+1}}{k/3^k} = \frac{1}{3} \cdot \frac{k+1}{k} \le \frac{2}{3}$$
     • We can now use this value of $r$ to create our bound:
       $$\sum_{k=1}^{\infty} \frac{k}{3^k} \le \sum_{k=0}^{\infty} \frac{1}{3} \left(\frac{2}{3}\right)^k = \frac{1}{3} \cdot \frac{1}{1 - 2/3} = 1$$
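To see how loose or tight the geometric bound is, the partial sums of $k/3^k$ can be computed directly. This is a minimal sketch with hypothetical function names; it uses `double`, so results are approximate. The closed form of the infinite sum is $3/4$, which sits comfortably under the bound of 1.

```cpp
#include <cassert>

// Partial sum of k / 3^k for k = 1..n, in floating point.
double partial_sum(int n) {
    assert(n >= 1);
    double sum = 0.0;
    double power = 1.0;
    for (int k = 1; k <= n; ++k) {
        power *= 3.0;      // power == 3^k
        sum += k / power;
    }
    return sum;
}

// Largest ratio a_{k+1} / a_k = (k + 1) / (3k) over k = 1..n.
// The ratio is largest at k = 1, where it equals 2/3.
double max_ratio(int n) {
    assert(n >= 1);
    double largest = 0.0;
    for (int k = 1; k <= n; ++k) {
        const double ratio = (k + 1.0) / (3.0 * k);
        if (ratio > largest) largest = ratio;
    }
    return largest;
}
```

The ratio bound $r = 2/3$ is only attained at the first pair of terms, which is why the geometric bound overshoots the true value.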
  11. Bounding Summations by Splitting Summations
     • Difficult summations can sometimes be “split apart” into pieces that are easier to solve individually
     • In these situations, the summation range is split and the summation is expressed as the sum of each partition
     • This technique can be used to ignore a small number of initial terms, when each term in the summation is independent of n
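As a worked illustration of splitting (an example chosen here, not one taken from the slides), consider $\sum_{k=0}^{\infty} k^2/2^k$. The ratio of consecutive terms is $(k+1)^2/(2k^2)$, which exceeds 1 at $k = 1$, so the geometric bound cannot be applied from the start. From $k = 3$ onward the ratio is at most $8/9$, so we split off the first three terms:

```latex
% Split off the first three terms, then bound the tail geometrically.
% For k >= 3 the ratio of consecutive terms is (k+1)^2 / (2k^2) <= 8/9,
% and the first term of the tail is 3^2 / 2^3 = 9/8.
\begin{align*}
\sum_{k=0}^{\infty} \frac{k^2}{2^k}
  &= \sum_{k=0}^{2} \frac{k^2}{2^k} + \sum_{k=3}^{\infty} \frac{k^2}{2^k} \\
  &\le \frac{3}{2} + \frac{9}{8} \sum_{j=0}^{\infty} \left(\frac{8}{9}\right)^j \\
  &= \frac{3}{2} + \frac{9}{8} \cdot 9 = \frac{93}{8}
\end{align*}
```

The initial terms contribute only a constant, so the whole series is $O(1)$.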
  12. Bounding Summations Using Approximation by Integrals
     • Approximating the summation through the use of integration provides a convenient means of obtaining a bound
     • This technique can be used when the summation can be expressed as the sum of some f(k), where f(k) is monotonically increasing or decreasing
     • In other words, any $x \ge y$ implies $f(x) \ge f(y)$ (monotonically increasing) or $f(x) \le f(y)$ (monotonically decreasing)
  13. Bounding Summations Using Approximation by Integrals
     • If f(k) is monotonically increasing, the summation can be approximated by the integrals:
       $$\int_{m-1}^{n} f(x)\,dx \le \sum_{k=m}^{n} f(k) \le \int_{m}^{n+1} f(x)\,dx$$
     • Similarly, if f(k) is monotonically decreasing, the summation can be approximated by the integrals:
       $$\int_{m}^{n+1} f(x)\,dx \le \sum_{k=m}^{n} f(k) \le \int_{m-1}^{n} f(x)\,dx$$
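  14. Bounding Summations Using Approximation by Integrals
     • As an example, consider the harmonic series $\sum_{k=1}^{n} \frac{1}{k}$
     • For a lower bound, we obtain:
       $$\sum_{k=1}^{n} \frac{1}{k} \ge \int_{1}^{n+1} \frac{dx}{x} = \ln(n + 1)$$
     • The upper bound is similar; we bound the terms from k = 2 and add the first term back:
       $$\sum_{k=2}^{n} \frac{1}{k} \le \int_{1}^{n} \frac{dx}{x} = \ln n, \quad \text{so} \quad \sum_{k=1}^{n} \frac{1}{k} \le \ln n + 1$$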
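The two integral bounds on the harmonic series can be checked numerically. The following is a small sketch with hypothetical function names; it relies on `std::log` being the natural logarithm and on floating-point arithmetic, so it is a sanity check rather than a proof.

```cpp
#include <cassert>
#include <cmath>

// n-th harmonic number H_n = 1 + 1/2 + ... + 1/n.
double harmonic(int n) {
    assert(n >= 1);
    double sum = 0.0;
    for (int k = 1; k <= n; ++k) sum += 1.0 / k;
    return sum;
}

// True when ln(n + 1) <= H_n <= ln(n) + 1, the bounds obtained by
// approximating the sum of the decreasing function 1/x by integrals.
bool integral_bounds_hold(int n) {
    assert(n >= 1);
    const double h = harmonic(n);
    return std::log(n + 1.0) <= h && h <= std::log(static_cast<double>(n)) + 1.0;
}
```

For `n = 1` the upper bound is met with equality, since $H_1 = 1 = \ln 1 + 1$.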
  15. Counting Theory
     • Attempts to answer the question of “how many” without enumerating all the possibilities
     • E.g., how many permutations of the string “discovery” are there?
       • One way to find out is to write ‘em all down
       • Counting theory lets us calculate the answer without having to
  16. Rules of sum and product
     • Given a set of items, we can sometimes count the items using one of these rules:
       • Rule of sum: the number of ways to choose an element from one of two disjoint sets is the sum of the sizes of the sets
       • Rule of product: the number of ways to choose an ordered pair is the number of ways to choose the first element times the number of ways to choose the second element
  17. Strings
     • String = a sequence of elements from the same set
     • If the string is of length k, we sometimes call it a k-string
     • Given a string s, a substring s’ is an ordered sequence of consecutive elements of s
     • A k-substring is therefore a substring of length k
     • Given a set S of size n, how many k-strings can be formed from the elements of S?
  18. How many k-strings are in a set?
     • If we have n elements, and can pick any element for any position in the string, then there are n choices for the first, n for the second, etc.
     • The rule of product applies over the entire string
     • The answer is therefore $n^k$
  19. Permutations
     • A permutation is an ordered sequence of all the elements of set S, with each element appearing once
     • For example, if S = { c, a, t }, there are 6 permutations of S:
       • cat, cta, act, atc, tca, tac
     • For the entire set S consisting of n elements, there are n! permutations
  20. K-Permutations
     • A k-permutation consists of k elements from S, with no element appearing more than once
     • E.g., if S = { a, b, c, d }, then there are 12 2-permutations:
       • ab, ac, ad, ba, bc, bd, ca, cb, cd, da, db, dc
     • If S has n elements, then the number of k-permutations of S is n!/(n-k)!
  21. Combinations
     • A k-combination of an n-set S is simply a k-subset of S
     • Combinations must be distinct, but elements in the combination are unordered (unlike permutations)
     • For every k-combination, there are k! permutations of its elements
     • Each of these permutations is a distinct k-permutation of S
     • The number of k-combinations is therefore just the number of k-permutations divided by k!: n!/(k!(n-k)!)
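The three counting formulas above ($n^k$ strings, $n!/(n-k)!$ k-permutations, $n!/(k!(n-k)!)$ k-combinations) can be sketched as follows. The function names are hypothetical, and the code assumes the results fit in 64 bits; it does not detect overflow.

```cpp
#include <cassert>
#include <cstdint>

// Number of k-strings over a set of n elements: n^k (rule of product).
std::uint64_t count_strings(unsigned n, unsigned k) {
    std::uint64_t result = 1;
    for (unsigned i = 0; i < k; ++i) result *= n;
    return result;
}

// Number of k-permutations of an n-set: n!/(n-k)! = n (n-1) ... (n-k+1).
std::uint64_t count_permutations(unsigned n, unsigned k) {
    assert(k <= n);
    std::uint64_t result = 1;
    for (unsigned i = 0; i < k; ++i) result *= (n - i);
    return result;
}

// Number of k-combinations of an n-set: n!/(k!(n-k)!).
// After step i the running value is C(n-k+i, i), so each division is exact.
std::uint64_t count_combinations(unsigned n, unsigned k) {
    assert(k <= n);
    std::uint64_t result = 1;
    for (unsigned i = 1; i <= k; ++i) result = result * (n - k + i) / i;
    return result;
}
```

For instance, `count_permutations(9, 9)` gives the number of permutations of the nine distinct letters in “discovery”, and `count_permutations(4, 2)` matches the 12 2-permutations listed for S = { a, b, c, d }.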
  22. Probability
     • Probability is defined in terms of a sample space S, a set whose elements are called elementary events
     • Each elementary event is a possible outcome of an experiment
     • E.g., flipping two coins can result in one of 4 elementary events, which make up the sample space:
       • S = { HH, HT, TH, TT }
  23. Probability
     • An event is a subset of the sample space S
       • E.g., the event of obtaining one head and one tail is the subset { HT, TH }
     • The event S is called the certain event
     • The event {} is called the null event
     • Two events A and B are mutually exclusive if they cannot occur simultaneously
       • I.e., $A \cap B = \{\}$
  24. Probability
     • The probability of an event A is written Pr{A}
     • A probability distribution is a way to map the events of S to real numbers, such that these axioms are met:
       1. Pr{A} >= 0 for any event A
       2. Pr{S} = 1
       3. $\Pr\{A \cup B\} = \Pr\{A\} + \Pr\{B\}$ for any two mutually exclusive events A and B
  25. Discrete Probability Distributions
     • A distribution is discrete if it is defined over a finite or countably infinite sample space
     • The probability of an event is then the sum of the probabilities of its elementary events:
       $$\Pr\{A\} = \sum_{s \in A} \Pr\{s\}$$
     • A uniform distribution is a distribution such that all elementary events are equally likely
       • I.e., picking an element at random: $\Pr\{s\} = 1/|S|$
  26. Continuous Uniform Probability Distributions
     • An example of a probability distribution in which not all subsets of the sample space are considered to be events
     • It is defined over a closed interval [a, b], with each point in the interval being equally likely
     • Since the number of points is uncountable, we cannot give every point the same positive probability and still satisfy axioms 2 & 3; the probability of each “point” is 0
  27. Continuous Uniform Probability Distributions
     • Given a closed range [a, b], and any closed interval [c, d] on that range such that a <= c <= d <= b, the continuous uniform probability distribution defines the probability of the event [c, d] to be:
       $$\Pr\{[c, d]\} = \frac{d - c}{b - a}$$
  28. Discrete Random Variables
     • A discrete random variable X is a function from a finite or countably infinite sample space S to the real numbers
     • It associates a real number with each possible outcome of an experiment
     • This lets us work on the probability distribution
     • X is random in the sense that its value depends on the outcome of some experiment, and cannot be predicted with certainty before the experiment is run
  29. Discrete Random Variables
     • To use discrete random variables, we must define a probability density function (often called a probability mass function in the discrete case):
       $$\Pr\{X = x\} = \sum_{\{s \in S \,:\, X(s) = x\}} \Pr\{s\}$$
     • This is the probability that X takes on some particular value x
     • It is simply the sum of the probabilities of all the elementary events that the random variable maps to x
  30. Discrete Random Variables
     • Let’s look at rolling two 6-sided dice:
       • X is a random variable defining the maximum of the two values shown on the dice
       • There are 36 possible elementary events (2 dice, each has 6 faces)
       • If we define X=3, meaning the highest value on either die is three, then Pr{X=3} = 5/36 (five possible outcomes out of 36 total)
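The value Pr{X=3} = 5/36 can be reproduced by enumerating all 36 outcomes. This is a minimal sketch with hypothetical function names, assuming two fair 6-sided dice so that every outcome has probability 1/36.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// counts[x] = number of the 36 equally likely outcomes (a, b) whose
// maximum max(a, b) equals x. Index 0 is unused.
std::array<int, 7> max_of_two_dice_counts() {
    std::array<int, 7> counts{};
    for (int a = 1; a <= 6; ++a)
        for (int b = 1; b <= 6; ++b)
            ++counts[std::max(a, b)];
    return counts;
}

// Pr{X = x} for X = maximum of the two dice.
double probability_max_equals(int x) {
    assert(x >= 1 && x <= 6);
    return max_of_two_dice_counts()[x] / 36.0;
}
```

The five outcomes counted for x = 3 are (1,3), (2,3), (3,3), (3,2), and (3,1).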
  31. Expected Value of a Random Variable
     • This is the “average” of the values it takes on:
       $$E[X] = \sum_{x} x \Pr\{X = x\}$$
     • Expected value defines the center of the distribution of the variable, i.e., if we were to run the experiment an infinite number of times, the expected value is the mean value of X over all experiments
  32. Variance and Standard Deviation
     • Variance: $\mathrm{Var}[X] = E[X^2] - E^2[X]$
       • This is a measure of how much the distribution varies around its expected value
     • Standard Deviation: sqrt(Var[X])
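As an illustration of these definitions, the sketch below computes the expected value, variance, and standard deviation of the maximum of two fair dice by summing over all outcomes. The `Moments` struct and function names are hypothetical. For 6-sided dice the exact values work out to E[X] = 161/36 and Var[X] = 2555/1296.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Mean and variance of a random variable.
struct Moments {
    double mean;
    double variance;
};

// Moments of X = max of two fair dice with the given number of faces.
// Each of the faces * faces outcomes has the same probability p.
Moments max_of_two_dice_moments(int faces) {
    assert(faces >= 1);
    const double p = 1.0 / (static_cast<double>(faces) * faces);
    double ex = 0.0;   // E[X]
    double ex2 = 0.0;  // E[X^2]
    for (int a = 1; a <= faces; ++a) {
        for (int b = 1; b <= faces; ++b) {
            const double x = std::max(a, b);
            ex += x * p;
            ex2 += x * x * p;
        }
    }
    return {ex, ex2 - ex * ex};  // Var[X] = E[X^2] - E^2[X]
}

// Standard deviation is the square root of the variance.
double standard_deviation(const Moments& m) {
    return std::sqrt(m.variance);
}
```

The expected value of about 4.47 is higher than the 3.5 expected from a single die, since taking the maximum favors the larger faces.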
  33. QuickSort
       void QuickSort(ArrayType &A, int begin, int end) {
         if ( begin < end ) {
           int q = Partition(A, begin, end);
           QuickSort(A, begin, q-1);
           QuickSort(A, q+1, end);
         }
       }
     • This is a great algorithm if all inputs are equally likely
     • That’s not always the case!
     • We can overcome the problem of worst case input by introducing some randomness into the algorithm
  34. QuickSort
     • When does the worst case occur in QuickSort? Why?
     • It occurs when the partition algorithm does not partition the array evenly
     • One way to improve it might be to partition around the middle element, but even so, there is still a worst case input for which the array is anti-optimally partitioned
  35. Randomized Algorithms
     • An algorithm is randomized if its behavior is determined not only by the input, but also by the output of a random number generator
     • Let’s assume a random number generator with a function Random(a, b) that returns a random integer in the range a to b, inclusive
     • We will use a random number generator to impose a distribution such that no particular input elicits its worst case behavior
  36. Randomized Algorithms
     • A randomized strategy is useful when there are many ways an algorithm can proceed, but no good way to know which way is “good”; if many alternatives are good, you pick one randomly
     • The benefits of good choices must outweigh the cost of bad choices
  37. Randomized QuickSort
     • How do we randomize QuickSort?
     • Create a RandomizedPartition function, then use that in our main QuickSort function:
       void RandomizedQuickSort(ArrayType &A, int begin, int end) {
         if ( begin < end ) {
           // q is the final position of the randomly chosen pivot
           int q = RandomizedPartition(A, begin, end);
           RandomizedQuickSort(A, begin, q-1);
           RandomizedQuickSort(A, q+1, end);
         }
       }
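  38. Randomized QuickSort
       int RandomizedPartition(ArrayType &A, int begin, int end) {
         // Pick a pivot position at random from begin..end
         int i = Random(begin, end);
         // Swap the chosen element into A[end] before partitioning
         // (temp holds one element, so it takes the element type, not ArrayType)
         auto temp = A[end];
         A[end] = A[i];
         A[i] = temp;
         return Partition(A, begin, end);
       }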
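The slides leave `ArrayType`, `Partition`, and `Random` undefined. The sketch below fills them in with assumptions so the algorithm can be compiled: `ArrayType` becomes `std::vector<int>`, `Partition` is a Lomuto-style partition that pivots on the last element, and `Random(a, b)` is built on the standard `<random>` library. None of these choices is confirmed by the slides.

```cpp
#include <cassert>
#include <random>
#include <utility>
#include <vector>

// Assumed stand-in for the slides' Random(a, b): uniform integer in [a, b].
int Random(int a, int b) {
    static std::mt19937 engine{std::random_device{}()};
    return std::uniform_int_distribution<int>(a, b)(engine);
}

// Assumed Lomuto-style partition around A[end]; returns the pivot's final index.
int Partition(std::vector<int>& A, int begin, int end) {
    const int pivot = A[end];
    int i = begin - 1;  // last index of the "<= pivot" region
    for (int j = begin; j < end; ++j) {
        if (A[j] <= pivot) std::swap(A[++i], A[j]);
    }
    std::swap(A[i + 1], A[end]);
    return i + 1;
}

// Moves a randomly chosen element into the pivot slot, then partitions.
int RandomizedPartition(std::vector<int>& A, int begin, int end) {
    assert(begin <= end);
    std::swap(A[Random(begin, end)], A[end]);
    return Partition(A, begin, end);
}

// Sorts A[begin..end] (inclusive bounds) in place.
void RandomizedQuickSort(std::vector<int>& A, int begin, int end) {
    if (begin < end) {
        const int q = RandomizedPartition(A, begin, end);
        RandomizedQuickSort(A, begin, q - 1);
        RandomizedQuickSort(A, q + 1, end);
    }
}
```

A call such as `RandomizedQuickSort(v, 0, static_cast<int>(v.size()) - 1)` sorts the whole vector. An already sorted input, which is a worst case for a plain last-element pivot, gets no special treatment from the random pivot choice.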
  39. Analysis of Randomized QuickSort
     • How does this change our previous analysis?
     • We have added a constant amount of work to each call to Partition, which can be ignored
     • However, we have made worst-case behavior very unlikely: no particular input can create it, only an unlucky sequence of random pivot choices
     • So the analysis doesn’t actually change; the worst case is still possible, but the average or expected case now applies to every input, not only to inputs that happen to be favorable
  40. Random Numbers
     • Adding randomness to an algorithm implies an ability to generate random numbers
     • Computers are unfortunately not directly capable of generating truly random sequences
     • The approach that is generally taken is therefore to generate a sequence of “pseudo-random” numbers that exhibits good random behavior
     Ref: Numerical Recipes in C, Chapter 7, available online at
  41. Random Numbers
     • Most languages have a set of library functions for generating pseudo-random numbers
     • System supplied generators typically suffer from a number of problems due to poor specification and implementation
       • The sequence generally repeats with a period no greater than 32767
       • The randomness of the sequence is highly dependent on the implementor’s choice of constants used by the algorithm, and in many standard implementations the choice is poor
  42. Random Numbers
     • One fast method of choosing random numbers is the linear congruential method
     • Each number in the sequence is determined by a mathematical operation performed on the previous choice:
       $$I_{j+1} = (a I_j + c) \bmod m$$
       • m is the modulus, and is an upper limit on the period of the generated sequence
       • a is called the multiplier
       • c is called the increment
  43. Random Numbers
     • The quality of the generator here is highly dependent on the choice of m, a, and c
     • Poor choices will limit the period, and more importantly, significantly impact the randomness of the sequence
     • We can eliminate the need for an explicit mod operation by using 32-bit unsigned integers and choosing $m = 2^{32}$
       • The full result of $a I_j + c$ needs more than 32 bits, but since we’re using 32 bit variables, the hi order bits will be truncated, which is the same as reducing mod $2^{32}$
     • Some good choices for a and c are: a = 1664525, c = 1013904223
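  44. A Random Number Generator Implementation
       #include <cstdint>

       class Random {
       public:
         // A fixed-width 32-bit type makes the arithmetic wrap mod 2^32;
         // unsigned long is 64 bits on many platforms and would not wrap there
         static constexpr std::uint32_t RANDMAX = 0xFFFFFFFFu;
         static constexpr std::uint32_t MULTIPLIER = 1664525u;
         static constexpr std::uint32_t INCREMENT = 1013904223u;
         explicit Random(std::uint32_t seed = 0) : m_seed(seed) {}
         // Advances the sequence and returns the next value
         std::uint32_t operator()() { return (m_seed = MULTIPLIER*m_seed+INCREMENT); }
         std::uint32_t seed() const { return m_seed; }
         void seed(std::uint32_t value) { m_seed = value; }
       private:
         std::uint32_t m_seed;
       };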
  45. Using the Random Number Generator
     • This generator will produce a number between 0 and $2^{32} - 1$
     • Note that prior to generating numbers, the generator should be seeded with a non-zero value
     • If repeatability isn’t required, time() is often used to generate the seed
     • To produce an integer in an arbitrary range LO..HI from the C library generator, use the following:
       j = LO + (int) ((double)(HI - LO + 1) * rand() / (RAND_MAX + 1.0));
     • This forces the use of the hi order bits, which are much more random than the lo order bits
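The same high-order-bit scaling can be applied to the 32-bit linear congruential generator from the earlier slides. The sketch below is an illustration under assumptions: `lcg_next` and `scale_to_range` are hypothetical names, and the scaling uses 64-bit integer arithmetic instead of floating point, computing floor(value * count / 2^32).

```cpp
#include <cassert>
#include <cstdint>

// One step of I_{j+1} = (a * I_j + c) mod 2^32 with a = 1664525 and
// c = 1013904223. The conversion to a 32-bit return type does the mod.
std::uint32_t lcg_next(std::uint32_t state) {
    return 1664525u * state + 1013904223u;
}

// Maps a 32-bit value onto an integer in [lo, hi] using its high-order bits.
// With count = hi - lo + 1, the offset floor(value * count / 2^32) lies in
// 0..count-1, and the product always fits in 64 bits.
int scale_to_range(std::uint32_t value, int lo, int hi) {
    assert(lo <= hi);
    const std::uint64_t count =
        static_cast<std::uint64_t>(static_cast<std::int64_t>(hi) - lo + 1);
    const std::uint64_t offset = (static_cast<std::uint64_t>(value) * count) >> 32;
    return static_cast<int>(lo + static_cast<std::int64_t>(offset));
}
```

A die roll could then be drawn with `state = lcg_next(state); int roll = scale_to_range(state, 1, 6);`. Taking `state % 6` instead would depend on the low-order bits, which the slide warns are less random.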