An Introduction




Kanishka Khandelwal-BCSE IV , JU   3/20/2012
Flip a coin




An algorithm which flip coins is called a randomized algorithm.


            Kanishka Khandelwal-BCSE IV , JU   3/20/2012
Making decisions could be complicated.
                  A randomized algorithm is simpler.

Consider the minimum cut problem

Randomized algorithm?
                                         Pick a random edge and contract.
                                         And Continue until two vertices are left




                        Kanishka Khandelwal-BCSE IV , JU   3/20/2012
Making good decisions could be expensive.

                  A randomized algorithm is faster.

Consider a sorting procedure.
Picking an element in the middle makes the procedure very efficient,
but it is expensive (i.e. linear time) to find such an element.


                     5 9 13 11 8         6 7 10

                5 6 7                8             9 13 11 10



                 Picking a random element will do.

                        Kanishka Khandelwal-BCSE IV , JU   3/20/2012
 Avoid worst-case behavior: randomness can
  (probabilistically) guarantee average case behavior
 Efficient approximate solutions to intractable
  problems
 In many practical problems,we need to deal with
  HUGE input,and don’t even have time to read it
  once.But can we still do something useful?




                   Kanishka Khandelwal-BCSE IV , JU   3/20/2012
Deterministic
Input                                                    Output
        Computer

           Random Bits


               www.lavarnd.org
               (doesn’t use lava lamps
               anymore)




          Kanishka Khandelwal-BCSE IV , JU   3/20/2012
 Randomized algorithms make random rather than
  deterministic decisions.
 The main advantage is that no input can reliably
  produce worst-case results because the algorithm runs
  differently each time.
 These algorithms are commonly used in situations
  where no exact and fast algorithm is known.
 Behavior can vary even on a fixed input.




                   Kanishka Khandelwal-BCSE IV , JU   3/20/2012
 Minimum spanning trees

  A linear time randomized algorithm,
  but no known linear time deterministic algorithm.
 Primality testing

  A randomized polynomial time algorithm,
  but it takes thirty years to find a deterministic one.
 Volume estimation of a convex body

  A randomized polynomial time approximation algorithm,
  but no known deterministic polynomial time approximation algorithm.


                          Kanishka Khandelwal-BCSE IV , JU   3/20/2012
Monte Carlo                                      Las Vegas




              Kanishka Khandelwal-BCSE IV , JU   3/20/2012
 Always gives the true answer.
 Running time is random.
 Running time is variable whose expectation is
  bounded(say by a polynomail).
 E.g. Randomized QuickSort Algorithm




                   Kanishka Khandelwal-BCSE IV , JU   3/20/2012
 It may produce incorrect answer!
 We are able to bound its probability.
 By running it many times on independent random
  variables, we can make the failure probability
  arbitrarily small at the expense of running time.
 E.g. Randomized Mincut Algorithm




                    Kanishka Khandelwal-BCSE IV , JU   3/20/2012
 Suppose we want to find a number among n given
 numbers which is larger than or equal to the median.




                  Kanishka Khandelwal-BCSE IV , JU   3/20/2012
Suppose A1 < … < An.
We want Ai, such that i ≥ n/2.

It’s obvious that the best deterministic algorithm needs
   O(n) time to produce the answer.
n may be very large!
Suppose n is 100,000,000,000 !




                   Kanishka Khandelwal-BCSE IV , JU   3/20/2012
 Choose 100 of the numbers with equal probability.
 find the maximum among these numbers.
 Return the maximum.




                   Kanishka Khandelwal-BCSE IV , JU   3/20/2012
 The running time of the given algorithm is O(1).
 The probability of Failure is 1/(2100).
 Consider that the algorithm may return a wrong
  answer but the probability is very smaller than the
  hardware failure or even an earthquake!




                     Kanishka Khandelwal-BCSE IV , JU   3/20/2012
 QUICKSORT




              Kanishka Khandelwal-BCSE IV , JU   3/20/2012
 QuickSort is a simple and efficient approach to
  sorting:
 Select an element m from unsorted array c and divide
  the array into two subarrays:
  csmall - elements smaller than m and
  clarge - elements larger than m.
 Recursively sort the subarrays and combine them
  together in sorted array csorted



                   Kanishka Khandelwal-BCSE IV , JU   3/20/2012
1.    QuickSort(c)
2.    if c consists of a single element
3.       return c
4.    m  c1
5.    Determine the set of elements csmall smaller
      than m
6.    Determine the set of elements clarge larger
      than m
7.    QuickSort(csmall)
8.    QuickSort(clarge)
9.    Combine csmall, m, and clarge into a single
      array, csorted
10.   return csorted

                   Kanishka Khandelwal-BCSE IV , JU   3/20/2012
 Runtime is based on our selection of m:

      -A good selection will split c evenly such that
       |csmall | = |clarge |, then the runtime is O(n log n).

      -For a good selection, the recurrence relation is:
             T(n) = 2T(n/2) + const ·n


  The time it takes to                    Time it takes to split the
  sort two smaller                        array into 2 parts where
  arrays of size n/2                      const is a positive constant
                     Kanishka Khandelwal-BCSE IV , JU   3/20/2012
However, a poor selection will split c unevenly and in the
 worst case, all elements will be greater or less than m
 so that one subarray is full and the other is empty. In
 this case, the runtime is O(n2).

For a poor selection, the recurrence relation is:
             T(n) = T(n-1) + const · n

The time it takes to sort                     Time it takes to split the array
one array containing n-1                      into 2 parts where const is a
elements                                      positive constant
                     Kanishka Khandelwal-BCSE IV , JU   3/20/2012
 QuickSort seems like an ineffecient MergeSort
 To improve QuickSort, we need to choose m to be a
  good ‘splitter.’
 It can be proven that to achieve O(nlogn) running
  time, we don’t need a perfect split, just reasonably
  good one. In fact, if both subarrays are at least of size
  n/4, then running time will be O(n log n).
 This implies that half of the choices of m make good
  splitters.


                    Kanishka Khandelwal-BCSE IV , JU   3/20/2012
 To improve QuickSort, randomly select m.
 Since half of the elements will be good splitters, if we
  choose m at random we will get a 50% chance that m
  will be a good choice.
 This approach will make sure that no matter what
  input is received, the expected running time is small.




                    Kanishka Khandelwal-BCSE IV , JU   3/20/2012
1. RandomizedQuickSort(c)
2. if c consists of a single element
3.     return c
4. Choose element m uniformly at random from c
5. Determine the set of elements csmall smaller
   than m
6. Determine the set of elements clarge larger than
     m
7. RandomizedQuickSort(csmall)
8. RandomizedQuickSort(clarge)
9. Combine csmall, m, and clarge into a single
    array, csorted
10. return csorted

                     Kanishka Khandelwal-BCSE IV , JU   3/20/2012
 Worst case runtime: O(m2)
 Expected runtime: O(m log m).
 Expected runtime is a good measure of the
 performance of randomized algorithms, often more
 informative than worst case runtimes.




                  Kanishka Khandelwal-BCSE IV , JU   3/20/2012
 Making a random choice is fast.
 An adversary is powerless; randomized algorithms
  have no worst case inputs.
 Randomized algorithms are often simpler and faster
  than their deterministic counterparts.




                   Kanishka Khandelwal-BCSE IV , JU   3/20/2012
 In the worst case, a randomized algorithm may be very
  slow.
 There is a finite probability of getting incorrect answer.
 However, the probability of getting a wrong answer can
  be made arbitrarily small by the repeated employment
  of randomness.
 Getting true random numbers is almost impossible.




                    Kanishka Khandelwal-BCSE IV , JU   3/20/2012
Assignments




  Kanishka Khandelwal-BCSE IV , JU   3/20/2012
 Input: a set of 2D points
 Determine the closest pair (and its dist)
 Input points are stored in an array
 Suppose we have a strange storage data structure D :
 When we give a point to D, it stores the point and
  outputs the closest pair of points stored in D
 Our knowledge: Insertion time depends on whether
  the closest pair is changed or not.
 If output is the same: 1 clock tick
 If output is not the same: |D| clock ticks


                    Kanishka Khandelwal-BCSE IV , JU   3/20/2012
 With random insertion order,
 show that the expected total number of clock ticks
 used by D is O(n)




                  Kanishka Khandelwal-BCSE IV , JU   3/20/2012
Suppose you are given a directed graph with n vertices
and m unit-length edges. Consider the problem of
estimating the number of vertices within distance d of
each vertex. Give a fully polynomial approximation
scheme that solves this problem simultaneously for all
vertices for any fixed d.




                 Kanishka Khandelwal-BCSE IV , JU   3/20/2012

Randomized Algorithm

  • 1.
  • 2.
    Flip a coin Analgorithm which flip coins is called a randomized algorithm. Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 3.
    Making decisions couldbe complicated. A randomized algorithm is simpler. Consider the minimum cut problem Randomized algorithm? Pick a random edge and contract. And Continue until two vertices are left Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 4.
    Making good decisionscould be expensive. A randomized algorithm is faster. Consider a sorting procedure. Picking an element in the middle makes the procedure very efficient, but it is expensive (i.e. linear time) to find such an element. 5 9 13 11 8 6 7 10 5 6 7 8 9 13 11 10 Picking a random element will do. Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 5.
     Avoid worst-casebehavior: randomness can (probabilistically) guarantee average case behavior  Efficient approximate solutions to intractable problems  In many practical problems,we need to deal with HUGE input,and don’t even have time to read it once.But can we still do something useful? Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 6.
    Deterministic Input Output Computer Random Bits www.lavarnd.org (doesn’t use lava lamps anymore) Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 7.
     Randomized algorithmsmake random rather than deterministic decisions.  The main advantage is that no input can reliably produce worst-case results because the algorithm runs differently each time.  These algorithms are commonly used in situations where no exact and fast algorithm is known.  Behavior can vary even on a fixed input. Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 8.
     Minimum spanningtrees A linear time randomized algorithm, but no known linear time deterministic algorithm.  Primality testing A randomized polynomial time algorithm, but it takes thirty years to find a deterministic one.  Volume estimation of a convex body A randomized polynomial time approximation algorithm, but no known deterministic polynomial time approximation algorithm. Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 9.
    Monte Carlo Las Vegas Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 10.
     Always givesthe true answer.  Running time is random.  Running time is variable whose expectation is bounded(say by a polynomail).  E.g. Randomized QuickSort Algorithm Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 11.
     It mayproduce incorrect answer!  We are able to bound its probability.  By running it many times on independent random variables, we can make the failure probability arbitrarily small at the expense of running time.  E.g. Randomized Mincut Algorithm Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 12.
     Suppose wewant to find a number among n given numbers which is larger than or equal to the median. Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 13.
    Suppose A1 <… < An. We want Ai, such that i ≥ n/2. It’s obvious that the best deterministic algorithm needs O(n) time to produce the answer. n may be very large! Suppose n is 100,000,000,000 ! Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 14.
     Choose 100of the numbers with equal probability.  find the maximum among these numbers.  Return the maximum. Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 15.
     The runningtime of the given algorithm is O(1).  The probability of Failure is 1/(2100).  Consider that the algorithm may return a wrong answer but the probability is very smaller than the hardware failure or even an earthquake! Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 16.
     QUICKSORT Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 17.
     QuickSort isa simple and efficient approach to sorting:  Select an element m from unsorted array c and divide the array into two subarrays: csmall - elements smaller than m and clarge - elements larger than m.  Recursively sort the subarrays and combine them together in sorted array csorted Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 18.
    1. QuickSort(c) 2. if c consists of a single element 3. return c 4. m  c1 5. Determine the set of elements csmall smaller than m 6. Determine the set of elements clarge larger than m 7. QuickSort(csmall) 8. QuickSort(clarge) 9. Combine csmall, m, and clarge into a single array, csorted 10. return csorted Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 19.
     Runtime isbased on our selection of m: -A good selection will split c evenly such that |csmall | = |clarge |, then the runtime is O(n log n). -For a good selection, the recurrence relation is: T(n) = 2T(n/2) + const ·n The time it takes to Time it takes to split the sort two smaller array into 2 parts where arrays of size n/2 const is a positive constant Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 20.
    However, a poorselection will split c unevenly and in the worst case, all elements will be greater or less than m so that one subarray is full and the other is empty. In this case, the runtime is O(n2). For a poor selection, the recurrence relation is: T(n) = T(n-1) + const · n The time it takes to sort Time it takes to split the array one array containing n-1 into 2 parts where const is a elements positive constant Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 21.
     QuickSort seemslike an ineffecient MergeSort  To improve QuickSort, we need to choose m to be a good ‘splitter.’  It can be proven that to achieve O(nlogn) running time, we don’t need a perfect split, just reasonably good one. In fact, if both subarrays are at least of size n/4, then running time will be O(n log n).  This implies that half of the choices of m make good splitters. Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 22.
     To improveQuickSort, randomly select m.  Since half of the elements will be good splitters, if we choose m at random we will get a 50% chance that m will be a good choice.  This approach will make sure that no matter what input is received, the expected running time is small. Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 23.
    1. RandomizedQuickSort(c) 2. ifc consists of a single element 3. return c 4. Choose element m uniformly at random from c 5. Determine the set of elements csmall smaller than m 6. Determine the set of elements clarge larger than m 7. RandomizedQuickSort(csmall) 8. RandomizedQuickSort(clarge) 9. Combine csmall, m, and clarge into a single array, csorted 10. return csorted Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 24.
     Worst caseruntime: O(m2)  Expected runtime: O(m log m).  Expected runtime is a good measure of the performance of randomized algorithms, often more informative than worst case runtimes. Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 25.
     Making arandom choice is fast.  An adversary is powerless; randomized algorithms have no worst case inputs.  Randomized algorithms are often simpler and faster than their deterministic counterparts. Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 26.
     In theworst case, a randomized algorithm may be very slow.  There is a finite probability of getting incorrect answer.  However, the probability of getting a wrong answer can be made arbitrarily small by the repeated employment of randomness.  Getting true random numbers is almost impossible. Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 27.
    Assignments KanishkaKhandelwal-BCSE IV , JU 3/20/2012
  • 28.
     Input: aset of 2D points  Determine the closest pair (and its dist)  Input points are stored in an array  Suppose we have a strange storage data structure D :  When we give a point to D, it stores the point and outputs the closest pair of points stored in D  Our knowledge: Insertion time depends on whether the closest pair is changed or not.  If output is the same: 1 clock tick  If output is not the same: |D| clock ticks Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 29.
     With randominsertion order, show that the expected total number of clock ticks used by D is O(n) Kanishka Khandelwal-BCSE IV , JU 3/20/2012
  • 30.
    Suppose you aregiven a directed graph with n vertices and m unit-length edges. Consider the problem of estimating the number of vertices within distance d of each vertex. Give a fully polynomial approximation scheme that solves this problem simultaneously for all vertices for any fixed d. Kanishka Khandelwal-BCSE IV , JU 3/20/2012