1. Mean and Variance of the HyperGeometric Distribution Page 1
Al Lehnen Madison Area Technical College 11/30/2011
In a drawing of n distinguishable objects without replacement from a set of N (n < N)
distinguishable objects, a of which have characteristic A (a < N), the probability that
exactly x objects in the draw of n have the characteristic A is given by the number of
different ways the x objects can be chosen from the a available, times the number of
different ways the n-x objects in the draw which don’t have A can be chosen from the
N-a available, divided by the number of different ways n distinguishable objects can be
chosen from a set of N. The resulting probability distribution for the random variable x is
called the hypergeometric distribution. In symbols,
$$
f(x) = \frac{\binom{a}{x}\binom{N-a}{n-x}}{\binom{N}{n}} .
$$
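As a concrete illustration of the formula, the following minimal Python sketch evaluates f(x) with exact rational arithmetic. The helper names `binom` and `hypergeom_pmf` and the sample values N = 10, a = 4, n = 3 are assumptions chosen for illustration and do not come from the text. The helper `binom` returns zero for out-of-range arguments, which is the convention the text adopts for binomial coefficients.

```python
from fractions import Fraction
from math import comb


def binom(k: int, j: int) -> int:
    """Binomial coefficient, taken to be zero when j or k - j is negative."""
    if j < 0 or k - j < 0:
        return 0
    return comb(k, j)


def hypergeom_pmf(x: int, n: int, a: int, N: int) -> Fraction:
    """f(x): probability of exactly x objects with characteristic A in a draw of n."""
    return Fraction(binom(a, x) * binom(N - a, n - x), binom(N, n))


# Illustrative values (not from the text): N = 10 objects, a = 4 with A, draw n = 3.
probabilities = [hypergeom_pmf(x, 3, 4, 10) for x in range(4)]
print(probabilities)  # the four values are 1/6, 1/2, 3/10 and 1/30
```

Using `Fraction` rather than floating point keeps every probability exact, so sums of these values can be compared to 1 without rounding error.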
The binomial coefficient $\binom{k}{j} = \frac{k!}{j!\,(k-j)!}$ is defined to be zero if either j or k-j is
negative, so that the probability of the null event of drawing more objects than those
available is zero. To prove that
$$
\sum_{x=0}^{n} f(x) = \sum_{x=0}^{n} \frac{\binom{a}{x}\binom{N-a}{n-x}}{\binom{N}{n}} = 1 ,
$$
consider the factorization $(B+C)^{N} = (B+C)^{a}\,(B+C)^{N-a}$. From the binomial theorem,
$$
\begin{aligned}
(B+C)^{a}\,(B+C)^{N-a}
&= \sum_{j=0}^{a} \binom{a}{j} B^{a-j} C^{j} \sum_{l=0}^{N-a} \binom{N-a}{l} B^{N-a-l} C^{l} \\
&= \sum_{j=0}^{a} \sum_{l=0}^{N-a} \binom{a}{j}\binom{N-a}{l} B^{N-l-j} C^{l+j}
\end{aligned}
$$
Now use the diagonal rearrangement suggested by the figure below: set $l = n - j$, with the
intercept n running from 0 to N and j running from 0 to a. This generates more than the
$(a+1)(N-a+1)$ terms in the above sum. However, all of the new terms generated vanish
since they have either $l > N-a$ or $l < 0$, so that $\binom{N-a}{l}$ is zero.
$$
(B+C)^{a}\,(B+C)^{N-a} = \sum_{n=0}^{N} \sum_{j=0}^{a} \binom{a}{j}\binom{N-a}{n-j} B^{N-n} C^{n}
$$
Now, for $n > a$, extending the sum over j to n would only add terms which are zero
because of the $\binom{a}{j}$ factor. Similarly, if $n < a$, the terms in the sum over j
from j = n + 1 to j = a are all zero due to the $\binom{N-a}{n-j}$ factor. Thus,
$$
(B+C)^{a}\,(B+C)^{N-a}
= \sum_{n=0}^{N} \sum_{j=0}^{a} \binom{a}{j}\binom{N-a}{n-j} B^{N-n} C^{n}
= \sum_{n=0}^{N} \sum_{j=0}^{n} \binom{a}{j}\binom{N-a}{n-j} B^{N-n} C^{n} .
$$
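The claim that the upper limit of the inner sum can be moved from a to n without changing its value can be checked numerically. The sketch below is only an illustration under assumed values (a = 4, N = 10); the helper names `binom` and `inner_sum` are hypothetical.

```python
from math import comb


def binom(k: int, j: int) -> int:
    """Binomial coefficient, taken to be zero when j or k - j is negative."""
    if j < 0 or k - j < 0:
        return 0
    return comb(k, j)


def inner_sum(n: int, a: int, N: int, upper: int) -> int:
    """Sum over j = 0..upper of C(a, j) * C(N - a, n - j) for a fixed intercept n."""
    return sum(binom(a, j) * binom(N - a, n - j) for j in range(upper + 1))


# Illustrative values (not from the text).
a, N = 4, 10

# For every intercept n, summing j up to a or up to n gives the same coefficient.
same = all(inner_sum(n, a, N, a) == inner_sum(n, a, N, n) for n in range(N + 1))
print(same)
```

The equality covers both cases discussed above: for n > a the extra terms vanish through the first binomial factor, and for n < a the dropped terms vanish through the second.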
But from a second use of the binomial theorem,
$$
(B+C)^{a}\,(B+C)^{N-a}
= \sum_{n=0}^{N} \sum_{j=0}^{n} \binom{a}{j}\binom{N-a}{n-j} B^{N-n} C^{n}
= (B+C)^{N}
= \sum_{n=0}^{N} \binom{N}{n} B^{N-n} C^{n} .
$$
The only way the two sums can be equal for all values of B and C is for
$$
\sum_{j=0}^{n} \binom{a}{j}\binom{N-a}{n-j} = \binom{N}{n} . \qquad (1)
$$
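Equation (1), commonly known as Vandermonde's identity, lends itself to a brute-force check. The following sketch compares both sides over a small grid of parameters; the grid bounds and the helper names `binom` and `vandermonde_lhs` are arbitrary assumptions made for illustration.

```python
from math import comb


def binom(k: int, j: int) -> int:
    """Binomial coefficient, taken to be zero when j or k - j is negative."""
    if j < 0 or k - j < 0:
        return 0
    return comb(k, j)


def vandermonde_lhs(n: int, a: int, N: int) -> int:
    """Left side of Equation (1): sum over j = 0..n of C(a, j) * C(N - a, n - j)."""
    return sum(binom(a, j) * binom(N - a, n - j) for j in range(n + 1))


# Compare with C(N, n) for every 0 <= a <= N and 0 <= n <= N on a small grid.
identity_holds = all(
    vandermonde_lhs(n, a, N) == comb(N, n)
    for N in range(1, 13)
    for a in range(N + 1)
    for n in range(N + 1)
)
print(identity_holds)
```

A finite check of this kind does not replace the proof above, but it guards against sign or index slips when the identity is reused.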
This in turn implies that the hypergeometric probabilities do indeed form a valid
probability distribution, i.e.
$$
\sum_{x=0}^{n} f(x) = \sum_{x=0}^{n} \frac{\binom{a}{x}\binom{N-a}{n-x}}{\binom{N}{n}} = 1 .
$$
The mean or expected value of the hypergeometric random variable is given by
$$
\mu_x = \overline{x} = \sum_{x=0}^{n} x\,f(x) = \binom{N}{n}^{-1} \sum_{x=0}^{n} x \binom{a}{x}\binom{N-a}{n-x} .
$$
Now, using Equation (1),
$$
\begin{aligned}
\sum_{x=0}^{n} x \binom{a}{x}\binom{N-a}{n-x}
&= \sum_{x=1}^{n} \frac{x\,a!}{x!\,(a-x)!}\binom{N-a}{n-x}
 = \sum_{x=1}^{n} \frac{a\,(a-1)!}{(x-1)!\,[(a-1)-(x-1)]!}\binom{(N-1)-(a-1)}{(n-1)-(x-1)} \\
&= a \sum_{x=0}^{n-1} \frac{(a-1)!}{x!\,[(a-1)-x]!}\binom{(N-1)-(a-1)}{(n-1)-x}
 = a \sum_{x=0}^{n-1} \binom{a-1}{x}\binom{(N-1)-(a-1)}{(n-1)-x} \\
&= a \binom{N-1}{n-1}
\end{aligned}
$$
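The index shift used in this step can be sanity-checked by direct summation. The sketch below, with hypothetical helper names `binom` and `weighted_sum` and an arbitrary parameter grid, compares the weighted sum with a times the binomial coefficient of N-1 over n-1.

```python
from math import comb


def binom(k: int, j: int) -> int:
    """Binomial coefficient, taken to be zero when j or k - j is negative."""
    if j < 0 or k - j < 0:
        return 0
    return comb(k, j)


def weighted_sum(n: int, a: int, N: int) -> int:
    """Sum over x = 0..n of x * C(a, x) * C(N - a, n - x)."""
    return sum(x * binom(a, x) * binom(N - a, n - x) for x in range(n + 1))


# Compare against a * C(N - 1, n - 1) on a small, arbitrary grid of parameters.
identity_holds = all(
    weighted_sum(n, a, N) == a * binom(N - 1, n - 1)
    for N in range(1, 13)
    for a in range(N + 1)
    for n in range(N + 1)
)
print(identity_holds)
```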
This gives that
$$
\mu_x = \overline{x} = \sum_{x=0}^{n} x\,f(x)
= \binom{N}{n}^{-1} a \binom{N-1}{n-1}
= a\,\frac{(N-1)!}{(n-1)!\,(N-n)!} \cdot \frac{n!\,(N-n)!}{N!}
= \frac{na}{N} .
$$
Using the notation of the binomial distribution that $p = \frac{a}{N}$, we see that the expected value
of x is the same for both drawing without replacement (the hypergeometric distribution)
and with replacement (the binomial distribution).
$$
\mu_x = \overline{x} = \frac{na}{N} = np \qquad (2)
$$
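Equation (2) can be illustrated by computing the mean directly from the definition. This is a minimal sketch with assumed values n = 5, a = 8, N = 20 (not from the text) and hypothetical helper names `binom` and `hypergeom_mean`.

```python
from fractions import Fraction
from math import comb


def binom(k: int, j: int) -> int:
    """Binomial coefficient, taken to be zero when j or k - j is negative."""
    if j < 0 or k - j < 0:
        return 0
    return comb(k, j)


def hypergeom_mean(n: int, a: int, N: int) -> Fraction:
    """Mean computed from the definition: sum over x of x * f(x)."""
    total = sum(x * binom(a, x) * binom(N - a, n - x) for x in range(n + 1))
    return Fraction(total, comb(N, n))


# Illustrative values (not from the text).
n, a, N = 5, 8, 20
mean = hypergeom_mean(n, a, N)
print(mean, Fraction(n * a, N))  # both equal n*a/N = 2
```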
The variance of the hypergeometric distribution can be computed from the generic
formula that
$$
\sigma_x^{2} = \overline{\left[x - \overline{x}\right]^{2}} = \overline{x^{2}} - \overline{x}^{\,2} .
$$
Again from Equation (1),
$$
\begin{aligned}
\sum_{x=0}^{n} x(x-1) \binom{a}{x}\binom{N-a}{n-x}
&= \sum_{x=2}^{n} \frac{x(x-1)\,a!}{x!\,(a-x)!}\binom{N-a}{n-x}
 = \sum_{x=2}^{n} \frac{a(a-1)\,(a-2)!}{(x-2)!\,[(a-2)-(x-2)]!}\binom{(N-2)-(a-2)}{(n-2)-(x-2)} \\
&= a(a-1) \sum_{x=0}^{n-2} \frac{(a-2)!}{x!\,[(a-2)-x]!}\binom{(N-2)-(a-2)}{(n-2)-x}
 = a(a-1) \sum_{x=0}^{n-2} \binom{a-2}{x}\binom{(N-2)-(a-2)}{(n-2)-x} \\
&= a(a-1) \binom{N-2}{n-2}
\end{aligned}
$$
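The same kind of direct summation confirms this second use of Equation (1). In the sketch below, the helper names `binom` and `factorial_moment_sum` and the parameter grid are assumptions chosen for illustration.

```python
from math import comb


def binom(k: int, j: int) -> int:
    """Binomial coefficient, taken to be zero when j or k - j is negative."""
    if j < 0 or k - j < 0:
        return 0
    return comb(k, j)


def factorial_moment_sum(n: int, a: int, N: int) -> int:
    """Sum over x = 0..n of x * (x - 1) * C(a, x) * C(N - a, n - x)."""
    return sum(
        x * (x - 1) * binom(a, x) * binom(N - a, n - x) for x in range(n + 1)
    )


# Compare against a * (a - 1) * C(N - 2, n - 2) on a small, arbitrary grid.
identity_holds = all(
    factorial_moment_sum(n, a, N) == a * (a - 1) * binom(N - 2, n - 2)
    for N in range(2, 13)
    for a in range(N + 1)
    for n in range(N + 1)
)
print(identity_holds)
```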
So,
$$
\begin{aligned}
\overline{x(x-1)}
&= \binom{N}{n}^{-1} \sum_{x=0}^{n} x(x-1) \binom{a}{x}\binom{N-a}{n-x}
 = \binom{N}{n}^{-1} a(a-1) \binom{N-2}{n-2} \\
&= a(a-1)\,\frac{(N-2)!}{(n-2)!\,(N-n)!} \cdot \frac{n!\,(N-n)!}{N!}
 = \frac{a(a-1)\,n(n-1)}{N(N-1)}
\end{aligned}
$$
and
$$
\overline{x^{2}} = \overline{x(x-1)} + \overline{x}
= \frac{a(a-1)\,n(n-1)}{N(N-1)} + \frac{an}{N}
= \frac{an}{N}\left[\frac{(a-1)(n-1)}{N-1} + 1\right] .
$$
Thus,
$$
\begin{aligned}
\sigma_x^{2} = \overline{x^{2}} - \overline{x}^{\,2}
&= \frac{an}{N}\left[\frac{(a-1)(n-1)}{N-1} + 1\right] - \frac{a^{2}n^{2}}{N^{2}}
 = \frac{an}{N}\left[\frac{N(a-1)(n-1) + N(N-1) - an(N-1)}{N(N-1)}\right] \\
&= \frac{an}{N}\left[\frac{Nan - Na - Nn + N + N^{2} - N - Nan + an}{N(N-1)}\right]
 = \frac{an}{N}\left[\frac{N^{2} - Na - Nn + an}{N(N-1)}\right] \\
&= \frac{an}{N}\left[\frac{N(N-a) - n(N-a)}{N(N-1)}\right]
 = \frac{an}{N}\left[\frac{(N-n)(N-a)}{N(N-1)}\right]
 = \frac{an}{N}\left(\frac{N-a}{N}\right)\left(\frac{N-n}{N-1}\right) \\
&= n\,\frac{a}{N}\left(1 - \frac{a}{N}\right)\left(\frac{N-n}{N-1}\right)
 = np(1-p)\left(\frac{N-n}{N-1}\right)
\end{aligned}
$$
The last factor $\left(\frac{N-n}{N-1}\right)$ is called the “finite population correction” and is the reason that
the variance of the binomial distribution, $np(1-p)$, differs from that of the hypergeometric
distribution. For N large compared to the sample size n, the two distributions are
essentially identical.
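To see how quickly the two distributions approach each other, the sketch below tabulates the finite population correction and the largest pointwise gap between the hypergeometric and binomial probabilities as N grows. The sample size n = 10, the fraction a/N = 0.3, the population sizes and the helper names are all assumptions chosen for illustration.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x: int, n: int, a: int, N: int) -> Fraction:
    """Hypergeometric probability; math.comb already returns 0 when x > a."""
    if x < 0 or n - x < 0:
        return Fraction(0)
    return Fraction(comb(a, x) * comb(N - a, n - x), comb(N, n))


def binomial_pmf(x: int, n: int, p: Fraction) -> Fraction:
    """Binomial probability of x successes in n draws with replacement."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def max_pmf_gap(n: int, a: int, N: int) -> Fraction:
    """Largest absolute difference between the two probabilities over x = 0..n."""
    p = Fraction(a, N)
    return max(
        abs(hypergeom_pmf(x, n, a, N) - binomial_pmf(x, n, p))
        for x in range(n + 1)
    )


n = 10  # illustrative sample size
for N in (20, 200, 2000):
    a = 3 * N // 10  # keeps p = a / N fixed at 0.3
    correction = Fraction(N - n, N - 1)
    print(N, float(correction), float(max_pmf_gap(n, a, N)))
```

With n held fixed, the correction factor moves toward 1 and the gap between the probabilities shrinks as N increases.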