1. The document covers probability axioms and rules including the additive rule, conditional probability, independence, and Bayes' rule. It also defines discrete and continuous random variables and their probability distributions.
2. Important discrete distributions discussed include the Bernoulli distribution for a binary outcome experiment and the binomial distribution for repeated Bernoulli trials.
3. Techniques for counting permutations, combinations, and sequences of events are presented to handle probability problems involving counting.
2. Axioms of Probability
A probability measure P is defined on S by assigning, for each event E, a number P[E] with the following properties:
1. P[E] ≥ 0, for each E.
2. P[S] = 1.
3. If Ei ∩ Ej = ∅ for all i ≠ j, then
P[E1 ∪ E2 ∪ …] = P[E1] + P[E2] + …
3. Finite uniform probability space
Many examples fall into this category:
1. Finite number of outcomes.
2. All outcomes are equally likely.
3. P[E] = n(E)/n(S) = n(E)/N = (no. of outcomes in E)/(total no. of outcomes)
where n(A) = no. of elements of A.
Note:
To handle problems in this case we have to be able to count: count n(E) and n(S).
5. Basic Rule of counting
Suppose we carry out k operations in sequence. Let
n1 = the number of ways the first operation can be performed
ni = the number of ways the ith operation can be performed once the first (i − 1) operations have been completed, i = 2, 3, …, k
Then N = n1 n2 ⋯ nk = the number of ways the k operations can be performed in sequence (a small sketch follows).
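A minimal Python sketch of this rule, with hypothetical counts for the k operations (the numbers are illustrative, not from the slides):

# Basic rule of counting: k = 3 operations performed in sequence.
n_ops = [4, 3, 2]   # hypothetical: n1 = 4, n2 = 3, n3 = 2 ways per operation

N = 1
for n_i in n_ops:
    N *= n_i        # multiply the counts together
print(N)            # 4 * 3 * 2 = 24 ways to perform the sequence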
6. Basic Counting Formulae
1. Permutations: How many ways can you order n objects?
n!
2. Permutations of size k (< n): How many ways can you choose k objects from n objects in a specific order?
nPk = n!/(n − k)! = n(n − 1)⋯(n − k + 1)
7. 3. Combinations of size k (≤ n): A combination of size k chosen from n objects is a subset of size k where the order of selection is irrelevant. How many ways can you choose a combination of size k from n objects (order of selection is irrelevant)?
nCk = (n choose k) = n!/(k!(n − k)!) = n(n − 1)⋯(n − k + 1)/(k(k − 1)⋯2·1)
8. Important Notes
1. In combinations ordering is irrelevant. Different orderings result in the same combination.
2. In permutations order is relevant. Different orderings result in different permutations.
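Python's standard library implements these counting formulae directly; a quick check (assuming Python 3.8+, with hypothetical n and k):

import math

n, k = 10, 3                  # hypothetical values
print(math.factorial(n))      # n!   -- orderings of all n objects
print(math.perm(n, k))        # nPk = n!/(n-k)! = 720 ordered selections
print(math.comb(n, k))        # nCk = n!/(k!(n-k)!) = 120 subsets

# Each combination of size k corresponds to k! different permutations:
assert math.perm(n, k) == math.comb(n, k) * math.factorial(k)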
10. The additive rule
P[A ∪ B] = P[A] + P[B] − P[A ∩ B]
and if A ∩ B = ∅ (so P[A ∩ B] = 0),
P[A ∪ B] = P[A] + P[B]
11. The additive rule for more than two events
P[A1 ∪ A2 ∪ … ∪ An] = Σi P[Ai] − Σ_{i<j} P[Ai ∩ Aj] + Σ_{i<j<k} P[Ai ∩ Aj ∩ Ak] − … + (−1)^{n+1} P[A1 ∩ A2 ∩ … ∩ An]
and if Ai ∩ Aj = ∅ for all i ≠ j, then
P[A1 ∪ A2 ∪ … ∪ An] = Σ_{i=1}^{n} P[Ai]
12. The Rule for complements
For any event E:
P[Ē] = 1 − P[E]
14. The conditional probability of A given B is defined to be:
P[A|B] = P[A ∩ B]/P[B] if P[B] ≠ 0
15. The multiplicative rule of probability
P[A ∩ B] = P[A] P[B|A] if P[A] ≠ 0
         = P[B] P[A|B] if P[B] ≠ 0
and
P[A ∩ B] = P[A] P[B]
if A and B are independent.
This is the definition of independence.
16. The multiplicative rule for more than two events
P[A1 ∩ A2 ∩ … ∩ An] = P[A1] P[A2|A1] P[A3|A1 ∩ A2] ⋯ P[An|A1 ∩ A2 ∩ … ∩ An−1]
18. Definition:
The set of k events A1, A2, …, Ak are called mutually independent if:
P[Ai1 ∩ Ai2 ∩ … ∩ Aim] = P[Ai1] P[Ai2] ⋯ P[Aim]
for every subset {i1, i2, …, im} of {1, 2, …, k}.
i.e. for k = 3, A1, A2, A3 are mutually independent if:
P[A1 ∩ A2] = P[A1] P[A2], P[A1 ∩ A3] = P[A1] P[A3],
P[A2 ∩ A3] = P[A2] P[A3], and
P[A1 ∩ A2 ∩ A3] = P[A1] P[A2] P[A3]
19. Definition:
The set of k events A1, A2, …, Ak are called pairwise independent if:
P[Ai ∩ Aj] = P[Ai] P[Aj] for all i ≠ j.
i.e. for k = 3, A1, A2, A3 are pairwise independent if:
P[A1 ∩ A2] = P[A1] P[A2], P[A1 ∩ A3] = P[A1] P[A3],
P[A2 ∩ A3] = P[A2] P[A3].
It is not necessarily true that P[A1 ∩ A2 ∩ A3] = P[A1] P[A2] P[A3].
20. Bayes Rule for probability
P[A|B] = P[A] P[B|A] / (P[A] P[B|A] + P[Ā] P[B|Ā])
21. A generalization of Bayes Rule
Let A1, A2, …, Ak denote a set of events such that
S = A1 ∪ A2 ∪ … ∪ Ak and Ai ∩ Aj = ∅
for all i ≠ j (a partition of S). Then
P[Ai|B] = P[Ai] P[B|Ai] / (P[A1] P[B|A1] + … + P[Ak] P[B|Ak])
(a numerical sketch follows).
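A short Python sketch of this generalized Bayes rule; the priors and likelihoods below are hypothetical numbers, not from the slides:

# Partition A1, A2, A3 of S with assumed priors P[Ai] and likelihoods P[B|Ai]
prior = [0.5, 0.3, 0.2]           # P[Ai]; must sum to 1
likelihood = [0.01, 0.05, 0.10]   # P[B|Ai] (hypothetical values)

denom = sum(p * l for p, l in zip(prior, likelihood))   # P[B], law of total probability
posterior = [p * l / denom for p, l in zip(prior, likelihood)]  # P[Ai|B]
print(posterior, sum(posterior))  # posterior probabilities sum to 1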
23. A random variable, X, is a numerical quantity whose value is determined by a random experiment.
24. Definition – The probability function, p(x), of a random variable, X.
For any random variable, X, and any real number, x, we define
p(x) = P[X = x]
where {X = x} = the set of all outcomes (an event) with X = x.
For continuous random variables p(x) = 0 for all values of x.
25. Definition – The cumulative distribution function, F(x), of a random variable, X.
For any random variable, X, and any real number, x, we define
F(x) = P[X ≤ x]
where {X ≤ x} = the set of all outcomes (an event) with X ≤ x.
26. Discrete Random Variables
For a discrete random variable X the probability distribution is described by the probability function p(x), which has the following properties:
1. 0 ≤ p(x) ≤ 1
2. Σ_x p(x) = Σ_i p(xi) = 1
3. P[a ≤ X ≤ b] = Σ_{a ≤ x ≤ b} p(x)
28. Continuous random variables
For a continuous random variable X the probability distribution is described by the probability density function f(x), which has the following properties:
1. f(x) ≥ 0
2. ∫_{−∞}^{∞} f(x) dx = 1
3. P[a ≤ X ≤ b] = ∫_a^b f(x) dx
29. Graph: Continuous Random Variable
[Figure: a probability density function f(x); the total area under the curve is ∫_{−∞}^{∞} f(x) dx = 1, and P[a ≤ X ≤ b] = ∫_a^b f(x) dx is the area between a and b.]
30. The distribution function F(x)
This is defined for any random variable, X.
F(x) = P[X ≤ x]
Properties
1. F(-∞) = 0 and F(∞) = 1.
2. F(x) is non-decreasing (i. e. if x1 < x2 then
F(x1) ≤ F(x2) )
3. F(b) – F(a) = P[a < X ≤ b].
31. 4. p(x) = P[X = x] = F(x) − F(x⁻)
5. If p(x) = 0 for all x (i.e. X is continuous) then F(x) is continuous.
Here F(x⁻) = lim_{u↑x} F(u).
32. 6. For Discrete Random Variables
F(x) is a non-decreasing step function with
F(x) = P[X ≤ x] = Σ_{u ≤ x} p(u)
with a jump of p(x) = F(x) − F(x⁻) at x, and F(−∞) = 0 and F(∞) = 1.
[Figure: a step function F(x) rising from 0 to 1 over −1 ≤ x ≤ 4, with jumps of height p(x).]
33. 7. For Continuous Random Variables
F(x) is a non-decreasing continuous function with
F(x) = P[X ≤ x] = ∫_{−∞}^{x} f(u) du
and f(x) = F′(x), F(−∞) = 0 and F(∞) = 1.
[Figure: F(x) rising from 0 to 1, with slope f(x).]
To find the probability density function, f(x), one first finds F(x), then f(x) = F′(x).
36. Suppose that we have an experiment that has two outcomes:
1. Success (S)
2. Failure (F)
These terms are used in reliability testing.
Suppose that p is the probability of success (S) and q = 1 − p is the probability of failure (F).
This experiment is sometimes called a Bernoulli Trial.
Let X = 0 if the outcome is F, and X = 1 if the outcome is S. Then
p(x) = P[X = x] = q if x = 0, p if x = 1
37. The probability distribution with probability function
p(x) = P[X = x] = q if x = 0, p if x = 1
is called the Bernoulli distribution.
[Figure: bar chart with a bar of height q = 1 − p at x = 0 and a bar of height p at x = 1.]
39. We observe a Bernoulli trial (S,F) n times. Let X denote the number of successes in the n trials. Then X has a binomial distribution, i.e.
p(x) = P[X = x] = (n choose x) p^x q^{n−x}, x = 0, 1, 2, …, n
where
1. p = the probability of success (S), and
2. q = 1 − p = the probability of failure (F)
(a sketch of this probability function follows).
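A minimal Python sketch of the binomial probability function, with hypothetical n and p:

from math import comb

def binom_pmf(x, n, p):
    """P[X = x] = (n choose x) p^x q^(n-x), x = 0, 1, ..., n."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.3                                     # hypothetical values
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]
assert abs(sum(pmf) - 1.0) < 1e-12                 # probabilities sum to 1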
40. The Poisson distribution
• Suppose events are occurring randomly and uniformly in time.
• Let X be the number of events occurring in a fixed period of time. Then X will have a Poisson distribution with parameter λ:
p(x) = (λ^x / x!) e^{−λ}, x = 0, 1, 2, 3, 4, …
41. The Geometric distribution
Suppose a Bernoulli trial (S,F) is repeated until a success occurs.
Let X = the trial on which the first success (S) occurs.
The probability function of X is:
p(x) = P[X = x] = (1 − p)^{x−1} p = p q^{x−1}, x = 1, 2, …
42. The Negative Binomial distribution
Suppose a Bernoulli trial (S,F) is repeated until k successes occur.
Let X = the trial on which the kth success (S) occurs.
The probability function of X is:
p(x) = P[X = x] = (x − 1 choose k − 1) p^k q^{x−k}, x = k, k + 1, k + 2, …
43. The Hypergeometric distribution
Suppose we have a population containing N objects. Suppose the elements of the population are partitioned into two groups. Let a = the number of elements in group A and let b = the number of elements in the other group (group B). Note N = a + b.
Now suppose that n elements are selected from the population at random. Let X denote the number of selected elements that come from group A.
The probability distribution of X is
p(x) = P[X = x] = (a choose x)(b choose n − x) / (N choose n)
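The hypergeometric probability function can be computed directly from this formula; a short Python sketch with hypothetical group sizes:

from math import comb

def hypergeom_pmf(x, a, b, n):
    """P[X = x]: x of the n draws come from group A (a elements), n-x from group B (b elements)."""
    return comb(a, x) * comb(b, n - x) / comb(a + b, n)

a, b, n = 5, 15, 6   # hypothetical: N = a + b = 20 objects, n = 6 drawn
total = sum(hypergeom_pmf(x, a, b, n) for x in range(max(0, n - b), min(a, n) + 1))
assert abs(total - 1.0) < 1e-12   # probabilities over the support sum to 1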
45. Continuous random variables
For a continuous random variable X the probability distribution is described by the probability density function f(x), which has the following properties:
1. f(x) ≥ 0
2. ∫_{−∞}^{∞} f(x) dx = 1
3. P[a ≤ X ≤ b] = ∫_a^b f(x) dx
46. Graph: Continuous Random Variable
[Figure: a probability density function f(x) with total area 1; P[a ≤ X ≤ b] = ∫_a^b f(x) dx is the area between a and b.]
47. Continuous Distributions
The Uniform distribution from a to b:
f(x) = 1/(b − a) if a ≤ x ≤ b, and 0 otherwise
[Figure: three rectangular densities of height 1/(b − a) on the interval from a to b.]
54. The Gamma distribution
Let the continuous random variable X have density function:
f(x) = (λ^α / Γ(α)) x^{α−1} e^{−λx} if x ≥ 0, and 0 if x < 0
Then X is said to have a Gamma distribution with parameters α and λ.
55. Graph: The gamma distribution
[Figure: gamma densities for (α = 2, λ = 0.9), (α = 2, λ = 0.6) and (α = 3, λ = 0.6).]
56. Comments
1. The set of gamma distributions is a family of distributions (parameterized by α and λ).
2. Contained within this family are other distributions:
a. The Exponential distribution – in the case α = 1, the gamma distribution becomes the exponential distribution with parameter λ. The exponential distribution arises if we are measuring the lifetime, X, of an object that does not age. It is also used as a distribution for waiting times between events occurring uniformly in time.
b. The Chi-square distribution – in the case α = n/2 and λ = ½, the gamma distribution becomes the chi-square (χ²) distribution with n degrees of freedom. Later we will see that a sum of squares of independent standard normal variates has a chi-square distribution, with degrees of freedom = the number of independent terms in the sum of squares.
(These two special cases are checked numerically in the sketch below.)
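A numerical check of both special cases, assuming numpy and scipy are available (note scipy parameterizes the gamma by shape a = α and scale = 1/λ):

import numpy as np
from scipy import stats

lam, n = 0.6, 5                       # hypothetical λ and degrees of freedom
x = np.linspace(0.1, 10, 50)

# α = 1: gamma density equals the exponential density with parameter λ
assert np.allclose(stats.gamma.pdf(x, a=1, scale=1/lam),
                   stats.expon.pdf(x, scale=1/lam))

# α = n/2, λ = 1/2 (scale = 2): gamma density equals the chi-square density
assert np.allclose(stats.gamma.pdf(x, a=n/2, scale=2),
                   stats.chi2.pdf(x, df=n))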
58. Let X denote a discrete random variable with probability function p(x) (probability density function f(x) if X is continuous); then the expected value of X, E(X), is defined to be:
E(X) = Σ_x x p(x) = Σ_i xi p(xi)
and if X is continuous with probability density function f(x):
E(X) = ∫_{−∞}^{∞} x f(x) dx
59. Expectation of functions
Let X denote a discrete random variable with probability function p(x); then the expected value of g(X), E[g(X)], is defined to be:
E[g(X)] = Σ_x g(x) p(x)
and if X is continuous with probability density function f(x):
E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx
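A small sketch of both definitions in Python (hypothetical examples: X ~ Bernoulli(p) for the discrete case and X ~ Uniform(0,1) for the continuous case; scipy assumed available for the integral):

# Discrete: E[g(X)] = sum of g(x) p(x); here g(x) = (x - p)^2 for Bernoulli X
p = 0.3
E_g_discrete = sum(g_x * p_x for g_x, p_x in [((0 - p)**2, 1 - p),
                                              ((1 - p)**2, p)])
print(E_g_discrete)          # E[(X - p)^2] = pq for a Bernoulli variable

# Continuous: E[g(X)] = integral of g(x) f(x) dx; g(x) = x^2, f(x) = 1 on (0,1)
from scipy import integrate
E_g_cont, _ = integrate.quad(lambda x: x**2 * 1.0, 0, 1)
print(E_g_cont)              # 1/3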
61. The kth moment of X:
μk = E[X^k] = Σ_x x^k p(x) if X is discrete
            = ∫_{−∞}^{∞} x^k f(x) dx if X is continuous
• The first moment of X, μ = μ1 = E(X), is the center of gravity of the distribution of X.
• The higher moments give different information regarding the distribution of X.
62. The kth central moment of X:
μk⁰ = E[(X − μ)^k] = Σ_x (x − μ)^k p(x) if X is discrete
                   = ∫_{−∞}^{∞} (x − μ)^k f(x) dx if X is continuous
64. Definition
Let X denote a random variable. Then the moment generating function of X, mX(t), is defined by:
mX(t) = E[e^{tX}] = Σ_x e^{tx} p(x) if X is discrete
                  = ∫_{−∞}^{∞} e^{tx} f(x) dx if X is continuous
65. Properties
1. mX(0) = 1
2. mX^{(k)}(0) = μk = E[X^k], where mX^{(k)}(0) = the kth derivative of mX(t) at t = 0. Equivalently,
mX(t) = 1 + μ1 t + μ2 t²/2! + μ3 t³/3! + … + μk t^k/k! + …
3. μk = E[X^k] = Σ_x x^k p(x) if X is discrete
               = ∫_{−∞}^{∞} x^k f(x) dx if X is continuous
66. 4. Let X be a random variable with moment generating function mX(t). Let Y = bX + a. Then
mY(t) = m_{bX+a}(t) = E(e^{(bX+a)t}) = e^{at} E(e^{X(bt)}) = e^{at} mX(bt)
5. Let X and Y be two independent random variables with moment generating functions mX(t) and mY(t). Then
m_{X+Y}(t) = E(e^{(X+Y)t}) = E(e^{Xt} e^{Yt}) = E(e^{Xt}) E(e^{Yt}) = mX(t) mY(t)
67. 6. Let X and Y be two random variables with moment generating functions mX(t) and mY(t) and distribution functions FX(x) and FY(y) respectively. If mX(t) = mY(t), then FX(x) = FY(x).
This ensures that the distribution of a random variable can be identified by its moment generating function.
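Property 2 can be checked symbolically; a sketch using sympy (assumed available), differentiating the Poisson m.g.f. from the tables below at t = 0:

import sympy as sp

t, lam = sp.symbols('t lam')
m = sp.exp(lam * (sp.exp(t) - 1))          # Poisson mgf (see the m.g.f. table below)
mu1 = sp.diff(m, t, 1).subs(t, 0)          # first moment: E[X] = lam
mu2 = sp.diff(m, t, 2).subs(t, 0)          # second moment: E[X^2] = lam + lam^2
print(sp.simplify(mu1))                    # mean lam
print(sp.simplify(mu2 - mu1**2))           # variance lam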
68. M.G.F.'s – Continuous distributions
Continuous Uniform: mX(t) = (e^{bt} − e^{at}) / ((b − a)t)
Exponential: mX(t) = λ/(λ − t), for t < λ
Gamma: mX(t) = (λ/(λ − t))^α, for t < λ
χ², n d.f.: mX(t) = (1/(1 − 2t))^{n/2}, for t < 1/2
Normal: mX(t) = e^{tμ + (1/2)t²σ²}
69. M.G.F.'s – Discrete distributions
Discrete Uniform: mX(t) = (e^t/N)(e^{tN} − 1)/(e^t − 1)
Bernoulli: mX(t) = q + pe^t
Binomial: mX(t) = (q + pe^t)^N
Geometric: mX(t) = pe^t/(1 − qe^t)
Negative Binomial: mX(t) = (pe^t/(1 − qe^t))^k
Poisson: mX(t) = e^{λ(e^t − 1)}
70. Note:
The distribution of a random variable X can be described by:
1. The probability function p(x) if X is discrete; the probability density function f(x) if X is continuous.
2. The distribution function:
F(x) = Σ_{u ≤ x} p(u) if X is discrete
     = ∫_{−∞}^{x} f(u) du if X is continuous
3. The moment generating function:
mX(t) = E[e^{tX}] = Σ_x e^{tx} p(x) if X is discrete
                  = ∫_{−∞}^{∞} e^{tx} f(x) dx if X is continuous
71. Summary of Discrete Distributions
Discrete Uniform: p(x) = 1/N, x = 1, 2, …, N; mean (N+1)/2; variance (N²−1)/12; mX(t) = (e^t/N)(e^{tN} − 1)/(e^t − 1)
Bernoulli: p(x) = p if x = 1, q if x = 0; mean p; variance pq; mX(t) = q + pe^t
Binomial: p(x) = (N choose x) p^x q^{N−x}, x = 0, 1, …, N; mean Np; variance Npq; mX(t) = (q + pe^t)^N
Geometric: p(x) = p q^{x−1}, x = 1, 2, …; mean 1/p; variance q/p²; mX(t) = pe^t/(1 − qe^t)
Negative Binomial: p(x) = (x−1 choose k−1) p^k q^{x−k}, x = k, k+1, …; mean k/p; variance kq/p²; mX(t) = (pe^t/(1 − qe^t))^k
Poisson: p(x) = (λ^x/x!) e^{−λ}, x = 0, 1, 2, …; mean λ; variance λ; mX(t) = e^{λ(e^t − 1)}
Hypergeometric: p(x) = (A choose x)(N−A choose n−x)/(N choose n); mean n(A/N); variance n(A/N)(1 − A/N)(N−n)/(N−1); mX(t) not useful
72. Summary of Continuous Distributions
Uniform: f(x) = 1/(b − a), a ≤ x ≤ b; 0 otherwise; mean (a+b)/2; variance (b−a)²/12; mX(t) = (e^{bt} − e^{at})/((b − a)t)
Exponential: f(x) = λe^{−λx}, x ≥ 0; 0, x < 0; mean 1/λ; variance 1/λ²; mX(t) = λ/(λ − t), for t < λ
Gamma: f(x) = (λ^α/Γ(α)) x^{α−1} e^{−λx}, x ≥ 0; 0, x < 0; mean α/λ; variance α/λ²; mX(t) = (λ/(λ − t))^α, for t < λ
χ², n d.f.: f(x) = ((1/2)^{n/2}/Γ(n/2)) x^{n/2−1} e^{−x/2}, x ≥ 0; 0, x < 0; mean n; variance 2n; mX(t) = (1/(1 − 2t))^{n/2}, for t < 1/2
Normal: f(x) = (1/(√(2π)σ)) e^{−(x−μ)²/(2σ²)}; mean μ; variance σ²; mX(t) = e^{tμ + (1/2)t²σ²}
Weibull: f(x) = αβ x^{β−1} e^{−αx^β}, x ≥ 0; 0, x < 0; mean Γ(1/β + 1)/α^{1/β}; variance [Γ(2/β + 1) − Γ(1/β + 1)²]/α^{2/β}; mX(t) not available
75. The joint probability function:
p(x,y) = P[X = x, Y = y]
1. 0 ≤ p(x,y) ≤ 1
2. Σ_x Σ_y p(x,y) = 1
3. P[(X, Y) ∈ A] = Σ_{(x,y) ∈ A} p(x,y)
77. Definition: Two random variables are said to have joint probability density function f(x,y) if
1. 0 ≤ f(x,y)
2. ∫∫ f(x,y) dx dy = 1
3. P[(X, Y) ∈ A] = ∫∫_A f(x,y) dx dy
79. Marginal Distributions (Discrete case):
Let X and Y denote two random variables with joint probability function p(x,y); then
the marginal probability function of X is pX(x) = Σ_y p(x,y)
the marginal probability function of Y is pY(y) = Σ_x p(x,y)
80. Marginal Distributions (Continuous case):
Let X and Y denote two random variables with joint probability density function f(x,y); then
the marginal density of X is fX(x) = ∫ f(x,y) dy
the marginal density of Y is fY(y) = ∫ f(x,y) dx
81. Conditional Distributions (Discrete Case):
Let X and Y denote two random variables with joint probability function p(x,y) and marginal probability functions pX(x), pY(y); then
the conditional probability function of Y given X = x is p_{Y|X}(y|x) = p(x,y)/pX(x)
the conditional probability function of X given Y = y is p_{X|Y}(x|y) = p(x,y)/pY(y)
82. Conditional Distributions (Continuous Case):
Let X and Y denote two random variables with joint probability density function f(x,y) and marginal densities fX(x), fY(y); then
the conditional density of Y given X = x is f_{Y|X}(y|x) = f(x,y)/fX(x)
the conditional density of X given Y = y is f_{X|Y}(x|y) = f(x,y)/fY(y)
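These definitions are easy to verify on a small discrete example; a Python sketch with a hypothetical 2×2 joint probability table (numpy assumed available):

import numpy as np

p = np.array([[0.10, 0.20],
              [0.30, 0.40]])      # hypothetical p(x, y); entries sum to 1
p_X = p.sum(axis=1)               # marginal pX(x) = sum over y
p_Y = p.sum(axis=0)               # marginal pY(y) = sum over x
p_Y_given_X = p / p_X[:, None]    # conditional p(y|x) = p(x, y)/pX(x)
print(p_X, p_Y)
print(p_Y_given_X.sum(axis=1))    # each conditional distribution sums to 1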
84. Let
Q(x1, x2) = (1/(1 − ρ²)) [((x1 − μ1)/σ1)² − 2ρ((x1 − μ1)/σ1)((x2 − μ2)/σ2) + ((x2 − μ2)/σ2)²]
and
f(x1, x2) = (1/(2πσ1σ2√(1 − ρ²))) e^{−Q(x1, x2)/2}
This distribution is called the bivariate Normal distribution.
The parameters are μ1, μ2, σ1, σ2 and ρ.
86. Marginal distributions
1. The marginal distribution of x1 is Normal with mean μ1 and standard deviation σ1.
2. The marginal distribution of x2 is Normal with mean μ2 and standard deviation σ2.
87. Conditional distributions
1. The conditional distribution of x1 given x2 is Normal with:
mean μ_{1|2} = μ1 + ρ(σ1/σ2)(x2 − μ2)
and standard deviation σ_{1|2} = σ1√(1 − ρ²)
2. The conditional distribution of x2 given x1 is Normal with:
mean μ_{2|1} = μ2 + ρ(σ2/σ1)(x1 − μ1)
and standard deviation σ_{2|1} = σ2√(1 − ρ²)
89. Definition: Two random variables X and Y are defined to be independent if
p(x,y) = pX(x) pY(y) if X and Y are discrete
f(x,y) = fX(x) fY(y) if X and Y are continuous
91. Definition
Let X1, X2, …, Xn denote n discrete random variables; then p(x1, x2, …, xn) is the joint probability function of X1, X2, …, Xn if
1. 0 ≤ p(x1, …, xn) ≤ 1
2. Σ_{x1} ⋯ Σ_{xn} p(x1, …, xn) = 1
3. P[(X1, …, Xn) ∈ A] = Σ_{(x1, …, xn) ∈ A} p(x1, …, xn)
92. Definition
Let X1, X2, …, Xk denote k continuous random variables; then f(x1, x2, …, xk) is the joint density function of X1, X2, …, Xk if
1. f(x1, …, xk) ≥ 0
2. ∫⋯∫ f(x1, …, xk) dx1 ⋯ dxk = 1
3. P[(X1, …, Xk) ∈ A] = ∫⋯∫_A f(x1, …, xk) dx1 ⋯ dxk
93. The Multinomial distribution
Suppose that we observe an experiment that has k
possible outcomes {O1, O2, …, Ok } independently n
times.
Let p1, p2, …, pk denote the probabilities of O1, O2, …, Ok respectively.
Let Xi denote the number of times that outcome Oi
occurs in the n repetitions of the experiment.
94. The joint probability function of X1, X2, …, Xk is
p(x1, x2, …, xk) = (n!/(x1! x2! ⋯ xk!)) p1^{x1} p2^{x2} ⋯ pk^{xk} = (n choose x1 x2 … xk) p1^{x1} p2^{x2} ⋯ pk^{xk}
This distribution is called the Multinomial distribution.
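A direct Python sketch of the multinomial probability function, with a hypothetical k = 3 example:

from math import factorial

def multinomial_pmf(xs, ps, n):
    """p(x1,...,xk) = n!/(x1!...xk!) * p1^x1 ... pk^xk, with x1+...+xk = n."""
    coef = factorial(n)
    for x in xs:
        coef //= factorial(x)            # multinomial coefficient stays an integer
    prob = float(coef)
    for x, p in zip(xs, ps):
        prob *= p**x
    return prob

print(multinomial_pmf([2, 1, 1], [0.5, 0.3, 0.2], 4))   # hypothetical counts and probabilities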
95. The Multivariate Normal distribution
Recall the univariate normal distribution:
f(x) = (1/(√(2π)σ)) e^{−(1/2)((x − μ)/σ)²}
and the bivariate normal distribution:
f(x,y) = (1/(2πσxσy√(1 − ρ²))) e^{−(1/(2(1 − ρ²)))[((x − μx)/σx)² − 2ρ((x − μx)/σx)((y − μy)/σy) + ((y − μy)/σy)²]}
96. The k-variate Normal distribution
f(x1, …, xk) = f(x) = (1/((2π)^{k/2} |Σ|^{1/2})) e^{−(1/2)(x − μ)′Σ⁻¹(x − μ)}
where
x = (x1, x2, …, xk)′, μ = (μ1, μ2, …, μk)′
and Σ is the k × k covariance matrix
Σ = [σ11 σ12 … σ1k; σ12 σ22 … σ2k; …; σ1k σ2k … σkk]
98. Definition
Let X1, X2, …, Xq, Xq+1, …, Xk denote k discrete random variables with joint probability function p(x1, x2, …, xq, xq+1, …, xk); then the marginal joint probability function of X1, X2, …, Xq is
p_{12…q}(x1, …, xq) = Σ_{xq+1} ⋯ Σ_{xk} p(x1, …, xk)
99. Definition
Let X1, X2, …, Xq, Xq+1, …, Xk denote k continuous random variables with joint probability density function f(x1, x2, …, xq, xq+1, …, xk); then the marginal joint density function of X1, X2, …, Xq is
f_{12…q}(x1, …, xq) = ∫⋯∫ f(x1, …, xk) dxq+1 ⋯ dxk
101. Definition
Let X1, X2, …, Xq, Xq+1, …, Xk denote k discrete random variables with joint probability function p(x1, x2, …, xq, xq+1, …, xk); then the conditional joint probability function of X1, X2, …, Xq given Xq+1 = xq+1, …, Xk = xk is
p_{1…q|q+1…k}(x1, …, xq | xq+1, …, xk) = p(x1, …, xk) / p_{q+1…k}(xq+1, …, xk)
102. Definition
Let X1, X2, …, Xq, Xq+1, …, Xk denote k continuous random variables with joint probability density function f(x1, x2, …, xq, xq+1, …, xk); then the conditional joint density function of X1, X2, …, Xq given Xq+1 = xq+1, …, Xk = xk is
f_{1…q|q+1…k}(x1, …, xq | xq+1, …, xk) = f(x1, …, xk) / f_{q+1…k}(xq+1, …, xk)
103. Definition – Independence of sets of vectors
Let X1, X2, …, Xq, Xq+1, …, Xk denote k continuous random variables with joint probability density function f(x1, x2, …, xq, xq+1, …, xk); then the variables X1, X2, …, Xq are independent of Xq+1, …, Xk if
f(x1, …, xk) = f_{1…q}(x1, …, xq) f_{q+1…k}(xq+1, …, xk)
A similar definition holds for discrete random variables.
104. Definition – Mutual Independence
Let X1, X2, …, Xk denote k continuous random variables with joint probability density function f(x1, x2, …, xk); then the variables X1, X2, …, Xk are called mutually independent if
f(x1, …, xk) = f1(x1) f2(x2) ⋯ fk(xk)
A similar definition holds for discrete random variables.
106. Definition
Let X1, X2, …, Xn denote n jointly distributed random variables with joint density function f(x1, x2, …, xn); then
E[g(X1, …, Xn)] = ∫⋯∫ g(x1, …, xn) f(x1, …, xn) dx1 ⋯ dxn
108.
1. E[Xi] = ∫⋯∫ xi f(x1, …, xn) dx1 ⋯ dxn = ∫ xi fi(xi) dxi
Thus you can calculate E[Xi] either from the joint distribution of X1, …, Xn or from the marginal distribution of Xi.
2. (The Linearity property)
E[a1X1 + ⋯ + anXn + b] = a1E[X1] + ⋯ + anE[Xn] + b
109.
3. (The Multiplicative property) Suppose X1, …, Xq are independent of Xq+1, …, Xk; then
E[g(X1, …, Xq) h(Xq+1, …, Xk)] = E[g(X1, …, Xq)] E[h(Xq+1, …, Xk)]
In the simple case when k = 2:
E[XY] = E[X] E[Y] if X and Y are independent.
110. Some Rules for Variance
Var(X) = σX² = E[(X − μX)²] = E[X²] − μX²
111. Tchebychev's inequality
P[|X − μ| < kσ] ≥ 1 − 1/k²
Ex:
P[|X − μ| < 2σ] ≥ 3/4
P[|X − μ| < 3σ] ≥ 8/9
P[|X − μ| < 4σ] ≥ 15/16
(a simulation check follows).
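A simulation sketch of these bounds (numpy assumed available; the hypothetical example uses exponential data with μ = σ = 1, a distribution far from normal):

import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=100_000)   # mean 1, standard deviation 1
for k in (2, 3, 4):
    frac = np.mean(np.abs(x - 1.0) < k * 1.0)  # observed P[|X - mu| < k*sigma]
    print(k, frac, ">=", 1 - 1/k**2)           # observed fraction beats the bound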
112.
1. Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
where Cov(X, Y) = E[(X − μX)(Y − μY)]
Note: If X and Y are independent, then Cov(X, Y) = 0
and Var(X + Y) = Var(X) + Var(Y).
113. The correlation coefficient ρXY:
ρXY = Cov(X, Y)/(σX σY) = Cov(X, Y)/√(Var(X) Var(Y))
Properties:
1. If X and Y are independent, then ρXY = 0.
2. −1 ≤ ρXY ≤ 1, and |ρXY| = 1 if there exist a and b such that
P[Y = bX + a] = 1
where ρXY = +1 if b > 0 and ρXY = −1 if b < 0.
114. Some other properties of variance
2. Var(aX + bY) = a²Var(X) + b²Var(Y) + 2ab Cov(X, Y)
3. Var(a1X1 + ⋯ + anXn) = a1²Var(X1) + ⋯ + an²Var(Xn)
+ 2a1a2 Cov(X1, X2) + ⋯ + 2a1an Cov(X1, Xn)
+ 2a2a3 Cov(X2, X3) + ⋯ + 2a2an Cov(X2, Xn)
+ ⋯ + 2an−1an Cov(Xn−1, Xn)
= Σi ai²Var(Xi) + 2 Σ_{i<j} aiaj Cov(Xi, Xj)
115. 4. Variance: Multiplicative Rule for independent random variables
Suppose that X and Y are independent random variables; then:
Var(XY) = Var(X) Var(Y) + μY² Var(X) + μX² Var(Y)
116. Mean and Variance of averages
Let X1, …, Xn be n mutually independent random variables each having mean μ and standard deviation σ (variance σ²), and let
X̄ = (1/n) Σ_{i=1}^{n} Xi
Then E[X̄] = μX̄ = μ
and Var(X̄) = σX̄² = σ²/n.
117. The Law of Large Numbers
Let X1, …, Xn be n mutually independent random variables each having mean μ, and let
X̄ = (1/n) Σ_{i=1}^{n} Xi
Then for any δ > 0 (no matter how small):
P[|X̄ − μ| < δ] = P[μ − δ < X̄ < μ + δ] → 1 as n → ∞
(a simulation sketch follows).
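A simulation sketch of the Law of Large Numbers (numpy assumed available; the hypothetical example uses fair-die rolls, for which μ = 3.5):

import numpy as np

rng = np.random.default_rng(1)
for n in (10, 1_000, 100_000):
    xbar = rng.integers(1, 7, size=n).mean()   # sample mean of n die rolls
    print(n, xbar)                             # xbar concentrates around 3.5 as n grows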
119. Definition
Let X1, X2, …, Xq, Xq+1, …, Xk denote k continuous random variables with joint probability density function f(x1, x2, …, xq, xq+1, …, xk); then the conditional joint density function of X1, X2, …, Xq given Xq+1 = xq+1, …, Xk = xk is
f_{1…q|q+1…k}(x1, …, xq | xq+1, …, xk) = f(x1, …, xk) / f_{q+1…k}(xq+1, …, xk)
120. Definition
Let U = h(X1, X2, …, Xq, Xq+1, …, Xk); then the conditional expectation of U given Xq+1 = xq+1, …, Xk = xk is
E[U | xq+1, …, xk] = ∫⋯∫ h(x1, …, xk) f_{1…q|q+1…k}(x1, …, xq | xq+1, …, xk) dx1 ⋯ dxq
Note this will be a function of xq+1, …, xk.
121. A very useful rule
Let (x1, x2, …, xq, y1, y2, …, ym) = (x, y) denote q + m random variables, and let
U = g(x1, …, xq, y1, …, ym) = g(x, y)
Then
E[U] = E_y[E[U | y]]
and
Var(U) = E_y[Var(U | y)] + Var_y(E[U | y])
(checked by simulation in the sketch below).
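A simulation sketch of this rule (numpy assumed available) for a hypothetical hierarchy: Y ~ Poisson(5), and given Y = y, U ~ Normal(y, 1); then E[U] = E[Y] = 5 and Var(U) = E[Var(U|Y)] + Var(E[U|Y]) = 1 + 5 = 6:

import numpy as np

rng = np.random.default_rng(2)
y = rng.poisson(5, size=1_000_000)     # outer variable Y
u = rng.normal(loc=y, scale=1.0)       # U | Y = y ~ Normal(y, 1)
print(u.mean())                        # ~ E_y[E[U|y]] = 5
print(u.var())                         # ~ E_y[Var(U|y)] + Var_y(E[U|y]) = 6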