1.11 Hypergeometric
    Distribution
Hypergeometric Distribution

Suppose we are interested in the number of defectives in
  a sample of size n units drawn from a lot containing N
  units, of which a are defective.
Each object has same chance of being selected, then the
  probability that the first drawing will yield a defective
  unit               = a/N
but for the second drawing, probability

                   a       1
                                if first unit is defective,
                  N        1
                       a
                               if first unit is not defective.
                  N        1
Hypergeometric Distribution
• The trials here are not independent and hence the
  fourth assumption underlying the binomial
  distribution is not fulfilled and therefore, we cannot
  apply binomial distribution here.
• Binomial distribution would have been applied if we
  do sampling with replacement, viz., if each unit
  selected from the sample would have been replaced
  before the next one is drawn.
Hypergeometric Distribution

Sampling without replacement
Number of ways in which x successes (defectives) can be
  chosen is a
             x
Number of ways in which n – x failures (non defectives)
be chosen is N a
             n x
Hence number of ways x successes and n – x failures can
be chosen is a N a
              x    n   x
Hypergeometric Distribution

Number of ways n objects can be chosen from N objects is N
If all the possibilities are equally likely then for sampling n
without replacement the probability of getting “x successes
in n trials” is given by
                                a   N   a
                                x   n   x
          h ( x; n, a , N )                      for x   0 , 1,...., n
                                    N
                                    n
          where x        a, n   x   N       a.
Hypergeometric Distribution

• The solution of the problem of sampling without
  replacement gave birth to the above distribution
  which      we     termed     as     hypergeometric
  distribution.
• The parameters of hypergeometric distribution are
  the sample size n, the lot size (or population size)
  N, and the number of “successes” in the lot a.
• When n is small compared to N, the composition
  of the lot is not seriously affected by drawing the
  sample and the binomial distribution with
  parameters n and p = a/N will yield a good
  approximation.
Hypergeometric Distribution
The difference between the two values is only 0.010.

In general it can be shown that
                h( x; n, a, N) b( x; n, p)
with p = (a/N) when N ∞.

A good rule of thumb is to use the binomial distribution
  as an approximation to the hyper-geometric
  distribution if n/N ≤0.05
The Mean and the Variance of a Probability
              Distribution

   Mean of hypergeometric distribution
                  a            n     sample size
            n                  N      population size
                 N             a     number of success
  Proof:                                        a   N   a
       n                             n          x   n   x
            x .h ( x ; n , a , N )         x.
                                                    N
      x 0                            x 1
                                                    n
The Mean and the Variance of a Probability
              Distribution


   a                 a!               a       (a       1)!               a a       1
    x        x! ( a       x )!        x (x    1)! ( a         x )!       x x       1

                 a        1       N       a
        n                                                    n       a   1     N       a
                 x        1       n   x            a
            a.
                              N                    N                 x   1     n       x
    x 1                                                      x 1
                              n                    n
The Mean and the Variance of a Probability
              Distribution

   Put x – 1= y
                                                                k   n   1
          n 1
     a          a       1           N       a                   m   a   1
     N   y 0        y           n       1       y               r   y

     n                                                          s   N   a

                            k       m           s       m       s
  Use the identity
                                    r       k       r       k
                        r 0
The Mean and the Variance of a Probability
              Distribution

   We get

                  a       N   1
               N          n   1
                  n

                      a
              n
                      N
Variance of hypergeometric distribution

         2      n a (N        a) (N     n)
                          2
                      N       (N   1)
Proof:                                              a   N   a
     n                                  n
              2                                2    x   n   x
2            x .h ( x ; n , a , N )           x .
    x 0
                                                        N
                                        x 1
                                                        n
a    1       N   a
        n          x    1       n   x
2   a         x.
                            N
        x 1
                            n
              n                     a   1   N   a
    a
                   (x   1 1).
    N                               x   1   n   x
            x 1
    n
n       a       2       N       a
              a (a       1)
         2                            .
                 N                        x       2       n       x
                              x 2
                  n
                              n       a       1       N       a
                     a
                     N                x       1       n       x
                          x 1
                     n
Put x – 2 = y in 1st summation and x – 1 = z in 2nd one
n 2         a       2       N           a
            a (a       1)
     2                              .
                  N                         y       n       2           y
                            y 0
                  n
                        n 1 a               1       N       a
                   a
                   N                    z       n       1       z
                        z 0
                   n
                                k       m           s                   m       s
Use the identity
                                        r       k       r                   k
                            r 0

 k   n          2, m        a       2               k       n           1, m        a   1
 r       y, s      N        a                       r       z, s            N       a
a (a    1) N        2             a        N        1
2
        N         n     2         N            n        1
        n                             n
                 n(n    1)                n
    a (a    1)                    a
                 N (N       1)            N

2           2    n a (N          a) (N             n)
    2                       2
                        N        (N       1)

Hypergeometric Distribution

  • 1.
  • 2.
    Hypergeometric Distribution Suppose weare interested in the number of defectives in a sample of size n units drawn from a lot containing N units, of which a are defective. Each object has same chance of being selected, then the probability that the first drawing will yield a defective unit = a/N but for the second drawing, probability a 1 if first unit is defective, N 1 a if first unit is not defective. N 1
  • 3.
    Hypergeometric Distribution • Thetrials here are not independent and hence the fourth assumption underlying the binomial distribution is not fulfilled and therefore, we cannot apply binomial distribution here. • Binomial distribution would have been applied if we do sampling with replacement, viz., if each unit selected from the sample would have been replaced before the next one is drawn.
  • 4.
    Hypergeometric Distribution Sampling withoutreplacement Number of ways in which x successes (defectives) can be chosen is a x Number of ways in which n – x failures (non defectives) be chosen is N a n x Hence number of ways x successes and n – x failures can be chosen is a N a x n x
  • 5.
    Hypergeometric Distribution Number ofways n objects can be chosen from N objects is N If all the possibilities are equally likely then for sampling n without replacement the probability of getting “x successes in n trials” is given by a N a x n x h ( x; n, a , N ) for x 0 , 1,...., n N n where x a, n x N a.
  • 6.
    Hypergeometric Distribution • Thesolution of the problem of sampling without replacement gave birth to the above distribution which we termed as hypergeometric distribution. • The parameters of hypergeometric distribution are the sample size n, the lot size (or population size) N, and the number of “successes” in the lot a. • When n is small compared to N, the composition of the lot is not seriously affected by drawing the sample and the binomial distribution with parameters n and p = a/N will yield a good approximation.
  • 7.
    Hypergeometric Distribution The differencebetween the two values is only 0.010. In general it can be shown that h( x; n, a, N) b( x; n, p) with p = (a/N) when N ∞. A good rule of thumb is to use the binomial distribution as an approximation to the hyper-geometric distribution if n/N ≤0.05
  • 8.
    The Mean andthe Variance of a Probability Distribution Mean of hypergeometric distribution a n sample size n N population size N a number of success Proof: a N a n n x n x x .h ( x ; n , a , N ) x. N x 0 x 1 n
  • 9.
    The Mean andthe Variance of a Probability Distribution a a! a (a 1)! a a 1 x x! ( a x )! x (x 1)! ( a x )! x x 1 a 1 N a n n a 1 N a x 1 n x a a. N N x 1 n x x 1 x 1 n n
  • 10.
    The Mean andthe Variance of a Probability Distribution Put x – 1= y k n 1 n 1 a a 1 N a m a 1 N y 0 y n 1 y r y n s N a k m s m s Use the identity r k r k r 0
  • 11.
    The Mean andthe Variance of a Probability Distribution We get a N 1 N n 1 n a n N
  • 12.
    Variance of hypergeometricdistribution 2 n a (N a) (N n) 2 N (N 1) Proof: a N a n n 2 2 x n x 2 x .h ( x ; n , a , N ) x . x 0 N x 1 n
  • 13.
    a 1 N a n x 1 n x 2 a x. N x 1 n n a 1 N a a (x 1 1). N x 1 n x x 1 n
  • 14.
    n a 2 N a a (a 1) 2 . N x 2 n x x 2 n n a 1 N a a N x 1 n x x 1 n Put x – 2 = y in 1st summation and x – 1 = z in 2nd one
  • 15.
    n 2 a 2 N a a (a 1) 2 . N y n 2 y y 0 n n 1 a 1 N a a N z n 1 z z 0 n k m s m s Use the identity r k r k r 0 k n 2, m a 2 k n 1, m a 1 r y, s N a r z, s N a
  • 16.
    a (a 1) N 2 a N 1 2 N n 2 N n 1 n n n(n 1) n a (a 1) a N (N 1) N 2 2 n a (N a) (N n) 2 2 N (N 1)