Hypergeometric Distribution

Published in: Technology

  1. 1.11 Hypergeometric Distribution
  2. Hypergeometric Distribution: Suppose we are interested in the number of defectives in a sample of $n$ units drawn from a lot containing $N$ units, of which $a$ are defective. If each unit has the same chance of being selected, the probability that the first drawing yields a defective unit is $a/N$, but for the second drawing the probability is
     $$\frac{a-1}{N-1} \text{ if the first unit is defective,} \qquad \frac{a}{N-1} \text{ if the first unit is not defective.}$$
  3. Hypergeometric Distribution:
     • The trials here are not independent, so the fourth assumption underlying the binomial distribution is not fulfilled and the binomial distribution cannot be applied.
     • The binomial distribution would apply if we sampled with replacement, that is, if each unit selected from the lot were replaced before the next one is drawn.
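The dependence between draws can be made concrete with exact fractions. The following is a minimal sketch, assuming a hypothetical lot of N = 20 units with a = 5 defectives; the function name `second_draw_defective` is illustrative and not part of the slides.

```python
from fractions import Fraction

def second_draw_defective(N, a, first_defective):
    """P(second unit is defective | outcome of the first draw), no replacement."""
    # One unit has left the lot; one defective has left only if the first was defective.
    remaining_defectives = a - 1 if first_defective else a
    return Fraction(remaining_defectives, N - 1)

# Hypothetical lot (assumption for illustration): N = 20 units, a = 5 defective.
N, a = 20, 5
p_first = Fraction(a, N)
p_after_defective = second_draw_defective(N, a, True)
p_after_good = second_draw_defective(N, a, False)

# Law of total probability: unconditional chance that the second unit is defective.
p_second = p_first * p_after_defective + (1 - p_first) * p_after_good
print(p_first, p_after_defective, p_after_good, p_second)
```

The two conditional probabilities differ, which is exactly the lack of independence noted above, even though the unconditional probability for the second draw is still a/N.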
  4. Hypergeometric Distribution: Sampling without replacement. The number of ways in which $x$ successes (defectives) can be chosen is $\binom{a}{x}$. The number of ways in which $n - x$ failures (non-defectives) can be chosen is $\binom{N-a}{n-x}$. Hence the number of ways in which $x$ successes and $n - x$ failures can be chosen is $\binom{a}{x}\binom{N-a}{n-x}$.
  5. Hypergeometric Distribution: The number of ways in which $n$ objects can be chosen from $N$ objects is $\binom{N}{n}$. If all the possibilities are equally likely, then for sampling without replacement the probability of getting “$x$ successes in $n$ trials” is given by
     $$h(x; n, a, N) = \frac{\binom{a}{x}\binom{N-a}{n-x}}{\binom{N}{n}} \quad \text{for } x = 0, 1, \ldots, n,$$
     where $x \le a$ and $n - x \le N - a$.
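The counting argument translates directly into code. This is a minimal sketch using Python's standard `math.comb`; the function name `hypergeom_pmf` and the lot values (N = 20, a = 5, n = 10) are assumptions chosen for illustration.

```python
from math import comb

def hypergeom_pmf(x, n, a, N):
    """h(x; n, a, N): probability of x defectives in a sample of n units drawn
    without replacement from a lot of N units that contains a defectives."""
    # Outside the support (x <= a and n - x <= N - a) the probability is zero.
    if x < 0 or x > n or x > a or n - x > N - a:
        return 0.0
    return comb(a, x) * comb(N - a, n - x) / comb(N, n)

# Hypothetical lot (assumption for illustration): N = 20, a = 5, sample size n = 10.
probs = [hypergeom_pmf(x, 10, 5, 20) for x in range(11)]
print(probs)
print(sum(probs))  # should be 1 up to floating-point rounding
```

Values of x above a = 5 get probability zero here, which reflects the side condition x ≤ a.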
  6. Hypergeometric Distribution:
     • The solution to the problem of sampling without replacement gives rise to the above distribution, which is called the hypergeometric distribution.
     • The parameters of the hypergeometric distribution are the sample size $n$, the lot size (or population size) $N$, and the number of “successes” in the lot $a$.
     • When $n$ is small compared to $N$, the composition of the lot is not seriously affected by drawing the sample, and the binomial distribution with parameters $n$ and $p = a/N$ yields a good approximation.
  7. Hypergeometric Distribution: The difference between the two values is only 0.010. In general, it can be shown that $h(x; n, a, N) \to b(x; n, p)$ with $p = a/N$ as $N \to \infty$. A good rule of thumb is to use the binomial distribution as an approximation to the hypergeometric distribution if $n/N \le 0.05$.
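The quality of the binomial approximation can be checked numerically. The sketch below compares the two distributions for a fixed p = a/N = 0.25 and two lot sizes. The parameter values and the helper names (`hypergeom_pmf`, `binom_pmf`, `max_gap`) are assumptions for illustration, not the example the slides refer to.

```python
from math import comb

def hypergeom_pmf(x, n, a, N):
    """h(x; n, a, N) for sampling without replacement."""
    if x < 0 or x > n or x > a or n - x > N - a:
        return 0.0
    return comb(a, x) * comb(N - a, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    """b(x; n, p) for n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def max_gap(n, a, N):
    """Largest absolute difference between h and b over x = 0, ..., n."""
    p = a / N
    return max(abs(hypergeom_pmf(x, n, a, N) - binom_pmf(x, n, p))
               for x in range(n + 1))

# Same p = 0.25 and n = 10, but different lot sizes (assumed values).
gap_small_lot = max_gap(10, 25, 100)    # n/N = 0.10, outside the rule of thumb
gap_large_lot = max_gap(10, 250, 1000)  # n/N = 0.01, inside the rule of thumb
print(gap_small_lot, gap_large_lot)
```

The gap shrinks as N grows with p held fixed, in line with the limit statement above.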
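  8. The Mean and the Variance of a Probability Distribution: Mean of the hypergeometric distribution:
     $$\mu = n \cdot \frac{a}{N},$$
     where $n$ is the sample size, $N$ is the population size, and $a$ is the number of successes. Proof:
     $$\mu = \sum_{x=0}^{n} x \cdot h(x; n, a, N) = \sum_{x=1}^{n} x \cdot \frac{\binom{a}{x}\binom{N-a}{n-x}}{\binom{N}{n}}$$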
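  9. The Mean and the Variance of a Probability Distribution: Since
     $$\binom{a}{x} = \frac{a!}{x!\,(a-x)!} = \frac{a\,(a-1)!}{x\,(x-1)!\,(a-x)!} = \frac{a}{x}\binom{a-1}{x-1},$$
     we have
     $$\mu = \sum_{x=1}^{n} x \cdot \frac{a}{x} \cdot \frac{\binom{a-1}{x-1}\binom{N-a}{n-x}}{\binom{N}{n}} = \frac{a}{\binom{N}{n}} \sum_{x=1}^{n} \binom{a-1}{x-1}\binom{N-a}{n-x}$$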
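  10. The Mean and the Variance of a Probability Distribution: Put $x - 1 = y$:
     $$\mu = \frac{a}{\binom{N}{n}} \sum_{y=0}^{n-1} \binom{a-1}{y}\binom{N-a}{n-1-y}$$
     Use the identity
     $$\sum_{r=0}^{k} \binom{m}{r}\binom{s}{k-r} = \binom{m+s}{k}$$
     with $k = n - 1$, $m = a - 1$, $r = y$, $s = N - a$.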
  11. The Mean and the Variance of a Probability Distribution: We get
     $$\mu = a \cdot \frac{\binom{N-1}{n-1}}{\binom{N}{n}} = n \cdot \frac{a}{N}$$
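The result of the proof can be sanity-checked with exact rational arithmetic. This is a minimal sketch, assuming hypothetical values n = 10, a = 5, N = 20; `vandermonde_holds` and `hypergeom_mean` are illustrative names.

```python
from fractions import Fraction
from math import comb

def vandermonde_holds(m, s, k):
    """Check sum_{r=0}^{k} C(m, r) * C(s, k - r) == C(m + s, k) for given m, s, k."""
    lhs = sum(comb(m, r) * comb(s, k - r) for r in range(k + 1))
    return lhs == comb(m + s, k)

def hypergeom_mean(n, a, N):
    """Mean computed directly from the definition sum_x x * h(x; n, a, N)."""
    total = comb(N, n)
    return sum(Fraction(x * comb(a, x) * comb(N - a, n - x), total)
               for x in range(n + 1))

# Hypothetical parameters (assumption for illustration).
n, a, N = 10, 5, 20
print(vandermonde_holds(a - 1, N - a, n - 1))         # identity as used in the proof
print(hypergeom_mean(n, a, N) == Fraction(n * a, N))  # direct sum vs. n * a / N
```

`Fraction` keeps the comparison exact, so the equality test is not affected by floating-point rounding.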
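  12. Variance of the hypergeometric distribution:
     $$\sigma^2 = \frac{n\,a\,(N-a)\,(N-n)}{N^2\,(N-1)}$$
     Proof: the second moment about the origin is
     $$\mu_2' = \sum_{x=0}^{n} x^2 \cdot h(x; n, a, N) = \sum_{x=1}^{n} x^2 \cdot \frac{\binom{a}{x}\binom{N-a}{n-x}}{\binom{N}{n}}$$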
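  13. Using $\binom{a}{x} = \frac{a}{x}\binom{a-1}{x-1}$,
     $$\mu_2' = a \sum_{x=1}^{n} x \cdot \frac{\binom{a-1}{x-1}\binom{N-a}{n-x}}{\binom{N}{n}} = a \sum_{x=1}^{n} (x - 1 + 1) \cdot \frac{\binom{a-1}{x-1}\binom{N-a}{n-x}}{\binom{N}{n}}$$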
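  14. Splitting the sum,
     $$\mu_2' = a\,(a-1) \sum_{x=2}^{n} \frac{\binom{a-2}{x-2}\binom{N-a}{n-x}}{\binom{N}{n}} + a \sum_{x=1}^{n} \frac{\binom{a-1}{x-1}\binom{N-a}{n-x}}{\binom{N}{n}}$$
     Put $x - 2 = y$ in the first summation and $x - 1 = z$ in the second one.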
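  15. This gives
     $$\mu_2' = a\,(a-1) \sum_{y=0}^{n-2} \frac{\binom{a-2}{y}\binom{N-a}{n-2-y}}{\binom{N}{n}} + a \sum_{z=0}^{n-1} \frac{\binom{a-1}{z}\binom{N-a}{n-1-z}}{\binom{N}{n}}$$
     Use the identity
     $$\sum_{r=0}^{k} \binom{m}{r}\binom{s}{k-r} = \binom{m+s}{k}$$
     with $k = n - 2$, $m = a - 2$, $r = y$, $s = N - a$ in the first summation and $k = n - 1$, $m = a - 1$, $r = z$, $s = N - a$ in the second.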
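  16. Hence
     $$\mu_2' = a\,(a-1) \cdot \frac{\binom{N-2}{n-2}}{\binom{N}{n}} + a \cdot \frac{\binom{N-1}{n-1}}{\binom{N}{n}} = \frac{n\,(n-1)\,a\,(a-1)}{N\,(N-1)} + \frac{n\,a}{N}$$
     and therefore
     $$\sigma^2 = \mu_2' - \mu^2 = \frac{n\,a\,(N-a)\,(N-n)}{N^2\,(N-1)}$$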
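As a cross-check on the algebra, the variance can be computed directly from the probabilities and compared with the closed form. This is a minimal sketch with exact fractions, assuming hypothetical values n = 10, a = 5, N = 20; the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def hypergeom_variance(n, a, N):
    """Variance from the definition: E[X^2] - (E[X])^2, using exact fractions."""
    total = comb(N, n)
    pmf = [Fraction(comb(a, x) * comb(N - a, n - x), total) for x in range(n + 1)]
    mean = sum(x * p for x, p in enumerate(pmf))
    second_moment = sum(x * x * p for x, p in enumerate(pmf))
    return second_moment - mean**2

def closed_form_variance(n, a, N):
    """n * a * (N - a) * (N - n) / (N^2 * (N - 1)), the formula derived above."""
    return Fraction(n * a * (N - a) * (N - n), N**2 * (N - 1))

# Hypothetical parameters (assumption for illustration).
n, a, N = 10, 5, 20
print(hypergeom_variance(n, a, N), closed_form_variance(n, a, N))
```

Both expressions should agree exactly for any valid choice of n, a, and N, since the fractions are never rounded.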
