Conditional Probability Mass Function
• The probability distribution of a discrete random variable can be characterized by its probability mass function.
• When the probability distribution of the random variable is updated to take account of some additional information, the result is a conditional probability distribution, and such a distribution can be characterized by a conditional probability mass function.
• Conditional PMF is especially appropriate when the experiment is a compound one, in which the second part of
the experiment depends upon the outcome of the first part.
• It has the usual properties of a PMF: its values lie between 0 and 1 and sum to one.
Conditional Probability Mass Function
• We recall that a conditional probability P[A|B] is the probability of an event A, given that we know that some other event B has occurred.
• Except for the case when the two events are independent of each other, the knowledge that B has occurred will change the probability P[A]. In other words, P[A|B] is our new probability in light of the additional knowledge.
• In many practical situations, two random mechanisms are at work and are described by events A and B.
• To compute probabilities for a complex experiment it is usually convenient to use a conditioning argument to
simplify the reasoning.
• For example, say we choose one of two coins and toss it 4 times. We might inquire as to the probability of
observing 2 or more heads. However, this probability will depend upon which coin was chosen, as for example in
the situation where one coin is fair and the other coin is weighted.
• It is therefore convenient to define conditional probability mass functions, PX[k | coin 1 chosen] and PX[k | coin 2 chosen], since once we know which coin is chosen, we can easily specify the PMF.
• In particular, for this example the conditional PMF is a binomial one whose parameter p depends upon which coin is chosen, with k denoting the number of heads. Once the conditional PMFs are known, we have by the law of total probability that the probability of observing k heads for this experiment is given by the PMF:
PX[k] = PX[k | coin 1 chosen] P[coin 1 chosen] + PX[k | coin 2 chosen] P[coin 2 chosen]
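As a minimal sketch (not from the slides), this two-coin computation can be carried out in Python. The specific numbers are invented for illustration: coin 1 is fair (p = 0.5), coin 2 is weighted with p = 0.8, and each coin is chosen with probability 1/2.

```python
from math import comb

# Hypothetical parameters; the slides do not specify them.
p_heads = {"coin 1": 0.5, "coin 2": 0.8}   # P[heads] for each coin
p_coin = {"coin 1": 0.5, "coin 2": 0.5}    # P[coin i chosen]
n = 4                                      # number of tosses

def binomial_pmf(k, n, p):
    """Conditional PMF PX[k | coin chosen]: binomial with n trials, success prob p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Law of total probability: PX[k] = sum over coins of PX[k | coin] * P[coin]
px = {k: sum(binomial_pmf(k, n, p_heads[c]) * p_coin[c] for c in p_coin)
      for k in range(n + 1)}

print(px)
print("P[2 or more heads] =", sum(px[k] for k in range(2, n + 1)))
```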
Conditional Probability Mass Function cont.
• The PMF that is required depends directly on the conditional PMFs (of which there are two).
• The use of conditional PMFs greatly simplifies our task in that given the event, i.e., the coin chosen, the PMF of
the number of heads observed readily follows. Also, in many problems, including this one, it is actually the
conditional PMFs that are specified in the description of the experimental procedure.
• It makes sense, therefore, to define a conditional PMF and study its properties.
• For the most part, the definitions and properties will mirror those of the conditional probability P[A|B], where A and B are events defined on SX,Y.
• Just as we used conditional probabilities to evaluate the likelihood of one event given another, we develop here the concept of conditional distributions, in the form of conditional probability mass functions and probability density functions, to evaluate the behavior of one random variable given knowledge of another.
• Conditional probability P[A|B] is a number that expresses our new knowledge about the occurrence of event A,
when we learn that another event B occurs.
• For the conditional probability mass function, we consider event A to be the observation of a particular value of a random variable: A = {X = x}.
• The conditioning event B contains information about X but not the precise value of X.
• For example, we might learn that X ≤ 33. In general, we learn of the occurrence of an event B that describes some property of X. The occurrence of the conditioning event B changes the probabilities of the event {X = x}.
Conditional Probability Mass Function cont.
We can find the conditional probabilities P [A|B] = P [X = x|B] for all real numbers x. This collection of probabilities is
a function of x.
It is the conditional probability mass function of random variable X, given that B occurred.
Definition 2.19 Conditional PMF
Given the event B, with P[B] > 0, the conditional probability mass function of X is
PX|B (x) = P [X = x|B]
About notation:
The name of a PMF is the letter P with a subscript containing the name of the random variable. For a conditional
PMF, the subscript contains the name of the random variable followed by a vertical bar followed by a statement of
the conditioning event.
The argument of the function is usually the lowercase letter corresponding to the variable name. The argument is a
dummy variable. It could be any letter, so that PX|B(x) is the same function as PX|B(u).
Sometimes we write the function with no specified argument at all, PX|B(·).
In some applications, we begin with a set of conditional PMFs, PX|Bi(x), i = 1, 2, . . . , m, where B1, B2, . . . , Bm is an event space.
We then use the law of total probability to find the PMF PX (x):
P [A] = ∑i=1,m P [A|Bi ] P [Bi ]
where: P(A|B) = P(A ∩ B) / P(B)
Conditional Probability Mass Function cont.
Theorem 2.16
A random variable X resulting from an experiment with event space B1, B2, . . . , Bm has PMF
PX (x) = ∑i=1,m PX|Bi (x) P [Bi ]
Proof: The theorem follows directly from Theorem 1.10 (Law of Total Probability), P [A] = ∑i=1,m P [A|Bi ] P [Bi ], with A denoting the event {X = x}.
When a conditioning event B ⊂ SX, the PMF PX (x) determines both the probability of B and the conditional PMF:
PX|B (x) = P [X = x, B] / P [B]
Now either the event {X = x} is contained in the event B or it is not.
If x ∈ B, then {X = x} ∩ B = {X = x} and P[X = x, B] = PX (x). Otherwise, if x ∉ B, then {X = x} ∩ B = ∅ and P[X = x, B] = 0.
The next theorem uses Equation PX|B (x) = P [X = x, B] / P [B] to calculate the conditional PMF.
Conditional Probability Mass Function cont.
Theorem 2.17
PX|B (x) = PX (x) / P[B] for x ∈ B, and PX|B (x) = 0 otherwise.
The theorem states that when we learn that the outcome satisfies x ∈ B, the probabilities of all x ∉ B become zero in our conditional model, and the probabilities of all x ∈ B are proportionally higher than they were before we learned that x ∈ B.
PX|B(x) is a perfectly respectable PMF.
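A minimal sketch of Theorem 2.17 in Python; the PMF and conditioning event below are invented for illustration (X uniform on {1, ..., 6} and B = {X ≤ 3}, echoing the kind of event mentioned earlier):

```python
def conditional_pmf(pmf, B):
    """Theorem 2.17: PX|B(x) = PX(x)/P[B] for x in B, and 0 otherwise.

    pmf: dict mapping each sample value x to PX(x)
    B:   set of sample values defining the conditioning event
    """
    p_B = sum(p for x, p in pmf.items() if x in B)  # P[B] = sum of PX(x) over x in B
    if p_B == 0:
        raise ValueError("conditioning event B must have P[B] > 0")
    return {x: (p / p_B if x in B else 0.0) for x, p in pmf.items()}

# Hypothetical example: X uniform on {1, ..., 6}, B = {X <= 3}
pmf = {x: 1 / 6 for x in range(1, 7)}
pmf_given_B = conditional_pmf(pmf, {1, 2, 3})
print(pmf_given_B)                # each x in B now has probability 1/3
print(sum(pmf_given_B.values()))  # 1.0 (up to floating point), as any PMF must
```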
Because the conditioning event B tells us that all possible outcomes are in B, we rewrite Theorem 2.1, which states:
For a discrete random variable X with PMF PX (x) and range SX :
(a) For any x, PX(x) ≥ 0
(b) ∑x∈SX PX (x) = 1
(c) For any event B ⊂ SX , the probability that X is in the set B is P[B] =∑x∈B PX(x)
Using B in place of SX, we obtain the next theorem.
Conditional Probability Mass Function cont.
Theorem 2.18
(a) For any x ∈ B, PX|B(x) ≥ 0
(b) ∑x∈B PX|B (x) = 1
(c) For any event C ⊂ B, P[C|B], the conditional probability that X is in the set C, is P[C|B] =∑x∈C PX|B (x)
Therefore, we can compute averages of the conditional random variable X|B and averages of functions of X|B in the
same way that we compute averages of X.
The only difference is that we use the conditional PMF PX|B(·) in place of PX (·).
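Continuing the hypothetical sketch above, property (c) of Theorem 2.18 can be checked for a sub-event C ⊂ B:

```python
# Hypothetical sub-event C = {1, 2}, a subset of B = {1, 2, 3}
C = {1, 2}
p_C_given_B = sum(pmf_given_B[x] for x in C)  # P[C|B] = sum of PX|B(x) over x in C
print(p_C_given_B)                            # 2/3 for the uniform example above
```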
Conditional Probability Mass Function cont.
Definition 2.20
Conditional Expected Value
The conditional expected value of random variable X given condition B is
E [X|B] = μX|B =∑x∈B x PX|B(x)
When we are given a family of conditional probability models PX|Bi (x) for an event space B1, B2, . . . , Bm, we can compute the expected value E[X] in terms of the conditional expected values E[X|Bi].
Theorem 2.19 For a random variable X resulting from an experiment with event space B1, B2, . . . , Bm,
E [X] = ∑i=1,m E [X|Bi ] P [Bi ]
Proof
Since E[X] = ∑x x PX (x), we can use Theorem 2.16, PX (x) = ∑i=1,m PX|Bi (x) P [Bi ], to write
E [X] = ∑x x ∑i=1,m PX|Bi (x) P [Bi ] = ∑i=1,m P [Bi ] ∑x x PX|Bi (x) = ∑i=1,m P [Bi ] E [X|Bi ]
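A quick numerical check of Theorem 2.19 using the hypothetical two-coin sketch from earlier (it reuses the names p_heads, p_coin, n, and px defined there):

```python
# Conditional means for the two-coin example: E[X | coin i] = n * p_i (binomial mean)
e_given = {c: n * p for c, p in p_heads.items()}

# Theorem 2.19: E[X] = sum over i of E[X | Bi] * P[Bi]
e_x = sum(e_given[c] * p_coin[c] for c in p_coin)
print(e_x)  # 0.5 * (4 * 0.5) + 0.5 * (4 * 0.8) = 2.6

# Cross-check against the PMF obtained earlier by total probability
print(sum(k * pk for k, pk in px.items()))  # also 2.6
```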
Conditional Probability Mass Function cont.
For a derived random variable Y = g(X), we have the equivalent of Theorem 2.10:
E [Y ] = μY = ∑x∈SX g(x) PX (x)
Theorem 2.20 The conditional expected value of Y = g(X) given condition B is
E [Y | B] = E [g(X) | B] = ∑x∈B g(x) PX|B (x)
The function g(xi) = EY|X[Y|xi] is the mean of the conditional PMF PY|X[yj|xi]. Alternatively, it is known as the conditional mean.
This terminology is widespread and so we will adhere to it, although we should keep in mind that it is meant to denote the usual mean of the conditional PMF.
It is also of interest to determine the expectation of other quantities besides Y with respect to the conditional PMF. This is symbolized by EY|X[g(Y)|xi] and is called the conditional expectation of g(Y). For example, if g(Y) = Y², then it becomes the conditional expectation of Y² or, equivalently, the conditional second moment.
Lastly, we should be aware that the conditional mean is the optimal predictor of a random variable based on observation of a second random variable.
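As a minimal sketch, the conditional second moment can be computed from the conditional PMF of the hypothetical uniform-die example above, following Theorem 2.20 with g(x) = x²:

```python
# Theorem 2.20 with g(x) = x**2: the conditional second moment E[X^2 | B].
# Summing over all x is equivalent to summing over x in B, since PX|B is 0 outside B.
g = lambda x: x ** 2
e_g_given_B = sum(g(x) * p for x, p in pmf_given_B.items())
print(e_g_given_B)  # (1 + 4 + 9) / 3 = 14/3 for the uniform example with B = {X <= 3}
```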
Conditional Probability Mass Function cont.
It follows that the conditional variance and conditional standard deviation conform to Definition 2.16,
Var[X] = E[(X − μX)²]
and Definition 2.17,
σX = √Var[X]
with X|B replacing X:
σX|B = √Var[X|B]
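Continuing the same hypothetical example, the conditional mean, variance, and standard deviation of X|B:

```python
from math import sqrt

# Conditional mean, variance, and standard deviation of X|B for the uniform-die sketch
mu_given_B = sum(x * p for x, p in pmf_given_B.items())  # E[X|B] = 2.0
var_given_B = sum((x - mu_given_B) ** 2 * p for x, p in pmf_given_B.items())
print(mu_given_B, var_given_B, sqrt(var_given_B))        # 2.0, 2/3, ~0.816
```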
To conclude: a conditional PMF is especially useful when the experiment is a compound one, in which the second part of the experiment depends upon the outcome of the first part.
The conditional PMF has the usual properties of a PMF: its values lie between 0 and 1 and sum to one.
Chapter Summary
With all of the concepts and formulas introduced in this chapter, there is a high probability that the beginning
student will be confused at this point.
Part of the problem is that we are dealing with several different mathematical entities including random variables,
probability functions, and parameters.
Before plugging numbers or symbols into a formula, it is good to know what the entities are.
The random variable X transforms outcomes of an experiment to real numbers. Note that X is the name of the
random variable.
A possible observation is x, which is a number. SX is the range of X, the set of all possible observations x.
The PMF PX (x) is a function that contains the probability model of the random variable X.
The PMF gives the probability of observing any x. PX (·) contains our information about the randomness of X.
Chapter Summary
The expected value E[X] = μX and the variance Var[X] are numbers that describe the entire probability model.
Mathematically, each is a property of the PMF PX (·). The expected value is a typical value of the random variable.
The variance describes the dispersion of sample values about the expected value.
A function of a random variable Y = g(X) transforms the random variable X into a different random variable Y.
For each observation X = x, g(·) is a rule that tells you how to calculate y = g(x), a sample value of Y.
Although PX (·) and g(·) are both mathematical functions, they serve different purposes here. PX (·) describes the
randomness in an experiment.
On the other hand, g(·) is a rule for obtaining a new random variable from a random variable you have observed.
The Conditional PMF PX|B(x) is the probability model that we obtain when we gain partial knowledge of the outcome
of an experiment.
The partial knowledge is that the outcome x ∈ B ⊂ SX . The conditional probability model has its own expected value,
E[X|B], and its own variance, Var[X|B].
Thank you.
