3. TABLE OF CONTENTS
01 Random Variables
02 Distributions of Random Variables
03 Discrete Distributions
04 Cumulative Distribution Function
05 Joint Distributions
06 Conditioning and Independence
4. General Overview
This chapter is concerned with the definitions of random variables, distribution functions, probability functions, density functions, and the development of the concepts necessary for carrying out calculations for a probability model using these entities.
This chapter also discusses the concept of the conditional distribution of one random variable, given the values of others.
Conditional distributions of random variables provide the framework for discussing what it means to say that variables are related, which is important in many applications of probability and statistics.
6. • The previous chapter explained how to construct probability models, including a sample space S and a probability measure P.
• Once we have a probability model, we may define random variables for that probability model.
• Intuitively, a random variable assigns a numerical value to each possible outcome in the sample space.
• For example, if the sample space is {rain, snow, clear}, then we might define a random variable X such that X = 3 if it rains, X = 6 if it snows, and X = −2.7 if it is clear.
7. Definition 2.1.1
A random variable is a function from the sample space
S to the set R1 of all real numbers.
9. Example 1
• The weather random variable could be written formally as
X : {rain, snow, clear} → R1
by X(rain) = 3, X(snow) = 6, and X(clear) = −2.7.

s      rain   snow   clear
X(s)   3      6      −2.7
10. Example 2
• If the sample space corresponds to flipping three different coins, then we could let
• X be the total number of heads showing,
• Y be the total number of tails showing,
• Z = 0 if there is exactly one head, and otherwise Z = 17, etc.

s      HHH  HHT  HTH  HTT  THH  THT  TTH  TTT
X(s)   3    2    2    1    2    1    1    0
Y(s)   0    1    1    2    1    2    2    3
Z(s)   17   17   17   0    17   0    0    17
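As a quick illustration (a sketch not taken from the text), the table above can be checked by enumerating the sample space in Python; the variable names mirror the example:

```python
from itertools import product

# Enumerate the sample space of three coin flips: HHH, HHT, ..., TTT.
S = ["".join(flips) for flips in product("HT", repeat=3)]

X = {s: s.count("H") for s in S}                    # total number of heads
Y = {s: s.count("T") for s in S}                    # total number of tails
Z = {s: 0 if s.count("H") == 1 else 17 for s in S}  # 0 iff exactly one head

for s in S:
    print(s, X[s], Y[s], Z[s])
```

Each random variable is literally a function (here, a lookup table) on the sample space, exactly as in Definition 2.1.1.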
11. Example 3
• If the sample space corresponds to rolling two fair dice, then we could let X be the square of the number showing on the first die, let Y be the square of the number showing on the second die, let Z be the sum of the two numbers showing, let W be the square of the sum of the two numbers showing, let R be the sum of the squares of the two numbers showing, etc.
12. Example 4
Constants as Random Variables
• As a special case, every constant value c is also a random variable, defined by c(s) = c for all s ∈ S. Thus, 5 is a random variable, as is 3 or −21.6.
13. Example 5 Indicator Functions
• One special kind of random variable is worth mentioning.
• If A is any event, then we can define the indicator function of A, written I_A, to be the random variable
I_A(s) = 1 if s ∈ A, and I_A(s) = 0 if s ∉ A,
• which is equal to 1 on A, and is equal to 0 on A^c.
14. Example
• When a student calls a university help desk for technical support, he/she will either immediately be able to speak to someone (S, for success) or will be placed on hold (F, for failure).
• Define X to indicate whether (X = 1) or not (X = 0) the student can immediately speak to someone.
15. • Given random variables X and Y, we can perform the usual arithmetic operations on them.
• Thus, for example, Z = X^2 is another random variable, defined by Z(s) = X^2(s) = (X(s))^2 = X(s) × X(s).
• Similarly, if W = XY^3, then W(s) = X(s) × Y(s) × Y(s) × Y(s), etc.
• Also, if Z = X + Y, then Z(s) = X(s) + Y(s), etc.
16. Example
• Consider rolling a fair six-sided die, so that S = {1, 2, 3, 4, 5, 6}.
• Let X be the number showing, so that X(s) = s for s ∈ S.
• Let Y be three more than the number showing, so that Y(s) = s + 3.
• Let Z = X^2 + Y.
• Then Z(s) = (X(s))^2 + Y(s) = s^2 + s + 3.
• So Z(1) = 5, Z(2) = 9, etc.
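The pointwise arithmetic above can be sketched directly in Python (an illustration, not from the text); each random variable is a function of the outcome s:

```python
# Sample space for one fair six-sided die.
S = [1, 2, 3, 4, 5, 6]

def X(s): return s                  # the number showing
def Y(s): return s + 3              # three more than the number showing
def Z(s): return X(s) ** 2 + Y(s)   # Z = X^2 + Y, so Z(s) = s^2 + s + 3

values = [Z(s) for s in S]
print(values)  # [5, 9, 15, 23, 33, 45]
```

Note that Z is defined by combining X and Y outcome by outcome, which is exactly what Z = X^2 + Y means.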
17. • We write X = Y to mean that X(s) = Y(s) for all s ∈ S.
• Similarly, we write X ≤ Y to mean that X(s) ≤ Y(s) for all s ∈ S, and X ≥ Y to mean that X(s) ≥ Y(s) for all s ∈ S.
• For example, we write X ≤ c to mean that X(s) ≤ c for all s ∈ S.
18. • Again consider rolling a fair six-sided die, with S = {1, 2, 3, 4, 5, 6}.
• For s ∈ S, let X(s) = s, and let Y = X + I_{{6}}. This means that
Y(s) = X(s) + I_{{6}}(s) = s if s ≤ 5, and Y(s) = 7 if s = 6.
• Hence, Y(s) = X(s) for 1 ≤ s ≤ 5. But it is not true that Y = X, because Y(6) ≠ X(6).
• On the other hand, it is true that Y ≥ X.
19. Definition 2
• Any random variable whose only possible values are 0 and 1 is called a Bernoulli random variable.
20. Summary 1
01 A random variable is a function from the sample space to the set of real numbers.
02 The function could be constant, correspond to counting some random quantity that arises, or be any other sort of function.
22. Exercise 1
• Let S = {high, middle, low}.
• Define random variables X, Y, and Z by X(high) = −12, X(middle) = −2, X(low) = 3, Y(high) = 0, Y(middle) = 0, Y(low) = 1, Z(high) = 6, Z(middle) = 0, Z(low) = 4.
Determine whether each of the following relations is true or false.
(a) X < Y
(b) X ≤ Y
(c) Y < Z
(d) Y ≤ Z
(e) XY < Z
(f) XY ≤ Z
23. Exercise 2
• Let S = {1, 2, 3, 4, 5}.
(a) Define two different (i.e., nonequal) nonconstant random variables, X and Y, on S.
(b) For the random variables X and Y that you have chosen, let Z = X + Y^2. Compute Z(s) for all s ∈ S.
24. Exercise 3
• Consider rolling a fair six-sided die, so that S = {1, 2, 3, 4, 5, 6}.
• Let X(s) = s and Y(s) = s^3 + 2.
• Let Z = XY.
• Compute Z(s) for all s ∈ S.
26. • Because random variables are defined to be functions of the outcome s, and because the outcome s is assumed to be random (i.e., to take on different values with different probabilities), it follows that the value of a random variable will itself be random (as the name implies).
• Specifically, if X is a random variable, then what is the probability that X will equal some particular value x?
• Well, X = x precisely when the outcome s is chosen such that X(s) = x.
29. • We see from the previous example that, if B is any subset of the real numbers, then P(X ∈ B) = P({s ∈ S : X(s) ∈ B}).
• Furthermore, to understand X well requires knowing the probabilities P(X ∈ B) for different subsets B.
• That is the motivation for the following definition.
30. Definition
• If X is a random variable, then the distribution of X is the collection of probabilities P(X ∈ B) for all subsets B of the real numbers.
32.
Outcome s   rain   snow   clear
X(s)        3      6      −2.7
P(s)        0.4    0.15   0.45

What is the distribution of X?
Well, if B is any subset of the real numbers, then P(X ∈ B) should count 0.4 if 3 ∈ B, plus 0.15 if 6 ∈ B, plus 0.45 if −2.7 ∈ B.
We can formally write all this information at once by saying that
P(X ∈ B) = 0.4 I_B(3) + 0.15 I_B(6) + 0.45 I_B(−2.7),
where again I_B(x) = 1 if x ∈ B, and I_B(x) = 0 if x ∉ B.
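The indicator-function formula above translates almost word for word into code. The following is an illustrative sketch (the names `dist_X` and `prob_X_in` are my own, not from the text):

```python
# Distribution of the weather random variable X, stored as value -> probability.
dist_X = {3: 0.4, 6: 0.15, -2.7: 0.45}

def prob_X_in(B):
    """P(X in B): add P(X = x) for each possible value x lying in B."""
    return sum(p for x, p in dist_X.items() if x in B)

print(prob_X_in({3, 6}))      # 0.4 + 0.15 = 0.55
print(prob_X_in({-2.7}))      # 0.45
print(prob_X_in({1.0, 2.5}))  # 0.0
```

The membership test `x in B` plays the role of the indicator I_B(x).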
33.
Outcome s   rain   snow   clear
Y(s)        5      7      5
P(s)        0.4    0.15   0.45

P(Y = 5) = P({rain, clear}) = 0.4 + 0.45 = 0.85.
Therefore, if B is any subset of the real numbers, then
P(Y ∈ B) = 0.15 I_B(7) + 0.85 I_B(5).
34. • While the above examples show that it is possible to keep track of P(X ∈ B) for all subsets B of the real numbers, they also indicate that it is rather cumbersome to do so.
• Fortunately, there are simpler functions available to help us keep track of probability distributions, including cumulative distribution functions, probability functions, and density functions.
• We discuss these next.
35. Summary 2
01 The distribution of a random variable X is the collection of probabilities P(X ∈ B) of X belonging to various sets.
02 The probability P(X ∈ B) is determined by calculating the probability of the set of outcomes s such that X(s) ∈ B, i.e., P(X ∈ B) = P({s ∈ S : X(s) ∈ B}).
37. Exercise 1
• Consider flipping two independent fair coins.
• Let X be the number of heads that appear.
• Compute P(X = x) for all real numbers x.
38. Exercise 2
• Suppose we flip three fair coins, and let X be the number of heads showing.
(a) Compute P(X = x) for every real number x.
(b) Write a formula for P(X ∈ B), for any subset B of the real numbers.
39. Exercise 3
• Suppose we roll two fair six-sided dice, and let Y be the sum of the two numbers showing.
(a) Compute P(Y = y) for every real number y.
(b) Write a formula for P(Y ∈ B), for any subset B of the real numbers.
42. • Random variables satisfying (2.3.1) are simple in some sense because we can understand them completely just by understanding their probabilities of being equal to particular values x.
• Indeed, by simply listing out all the possible values x such that P(X = x) > 0, we obtain a second, equivalent definition, as follows.
43. Equivalent definition
• A random variable X is discrete if there is a finite or countable sequence x_1, x_2, . . . of distinct real numbers, and a corresponding sequence p_1, p_2, . . . of nonnegative real numbers, such that P(X = x_i) = p_i for all i, and Σ_i p_i = 1.   (1.3.2)
44. Definition: Probability function
• For a discrete random variable X, its probability function is the function p_X : R1 → [0, 1] defined by
p_X(x) = P(X = x).
45. • Hence, if x_1, x_2, . . . are the distinct values such that P(X = x_i) = p_i for all i, with Σ_i p_i = 1, then
p_X(x) = p_i if x = x_i for some i, and p_X(x) = 0 otherwise.
• Clearly, all the information about the distribution of X is contained in its probability function, but only if we know that X is a discrete random variable.
• Finally, we note that Theorem 1.5.1 immediately implies the following.
46. Theorem 1.5.1 (Law of total probability, conditioned version)
• Let A_1, A_2, . . . be events that form a partition of the sample space S, each of positive probability.
• Let B be any event.
• Then
P(B) = P(A_1)P(B | A_1) + P(A_2)P(B | A_2) + · · · .
47. Theorem 2.3.1 (Law of total probability, discrete random variable version)
• Let X be a discrete random variable, and let A be some event. Then
P(A) = Σ_{x ∈ R1} P(X = x) P(A | X = x).
49. • Consider flipping a fair coin. Let Z = 1 if the coin is heads, and Z = 3 if the coin is tails. Let W = Z^2 + Z.
(a) What is the probability function of Z?
(b) What is the probability function of W?
50. • Consider flipping two fair coins. Let X = 1 if the first coin is heads, and X = 0 if the first coin is tails. Let Y = 1 if the second coin is heads, and Y = 5 if the second coin is tails.
• Let Z = XY. What is the probability function of Z?
53. • Let c be some fixed real number.
• Then, as already discussed, c is also a random variable (in fact, a constant random variable).
• In this case, clearly c is discrete, with probability function p_c satisfying p_c(c) = 1 and p_c(x) = 0 for x ≠ c.
• Because c is always equal to a particular value (namely, c) with probability 1, the distribution of c is sometimes called a point mass or point distribution or degenerate distribution.
55. • Consider flipping a coin that has probability θ of coming up heads and probability 1 − θ of coming up tails, where 0 < θ < 1.
• Let X = 1 if the coin is heads, while X = 0 if the coin is tails.
• Then p_X(1) = P(X = 1) = θ,
• while p_X(0) = P(X = 0) = 1 − θ.
• The random variable X is said to have the Bernoulli(θ) distribution;
• we write this as X ∼ Bernoulli(θ).
• We read: X follows the Bernoulli distribution.
56. • Bernoulli distributions arise anytime we have a response variable that takes only two possible values, and we label one of these outcomes as 1 and the other as 0.
• For example, 1 could correspond to success and 0 to failure of some quality test applied to an item produced in a manufacturing process. In this case, θ is the proportion of manufactured items that will pass the test.
• Alternatively, we could be randomly selecting an individual from a population and recording a 1 when the individual is female and a 0 if the individual is male. In this case, θ is the proportion of females in the population.
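As an illustration (not part of the text), a Bernoulli(θ) variable can be simulated and its probability function written down in a few lines; the helper names `bernoulli` and `p_X` are my own choices:

```python
import random

def bernoulli(theta, rng=random):
    """One draw of X ~ Bernoulli(theta): 1 with probability theta, else 0."""
    return 1 if rng.random() < theta else 0

def p_X(x, theta):
    """Probability function of X ~ Bernoulli(theta)."""
    if x == 1:
        return theta
    if x == 0:
        return 1 - theta
    return 0.0

random.seed(0)
draws = [bernoulli(0.7) for _ in range(10_000)]
print(sum(draws) / len(draws))  # long-run frequency close to theta = 0.7
```

The empirical frequency of 1s approaches θ as the number of draws grows, which is the intuition behind the parameter.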
58. • A student has to sit for an exit exam. The probability she passes the exam is 0.7.
• Let X be the random variable defined as follows:
• X = 1 if the student passes,
• X = 0 otherwise.
• What is the distribution of X?
60. • Consider flipping n coins, each of which has (independent) probability θ of coming up heads, and probability 1 − θ of coming up tails. (Again, 0 < θ < 1.)
• Let X be the total number of heads showing.
• By (1.4.2), we see that for x = 0, 1, 2, . . . , n,
p_X(x) = P(X = x) = C(n, x) θ^x (1 − θ)^(n − x).
61. • The Bernoulli(θ) distribution corresponds to the special case of the Binomial(n, θ) distribution when n = 1, namely, Bernoulli(θ) = Binomial(1, θ).
• The following figure contains the plots of several Binomial(20, θ) probability functions.
[Figure: probability functions of Binomial(20, 1/2) and Binomial(20, 1/5).]
62. • The binomial distribution is applicable to any situation involving n independent performances of a random system; for each performance, we are recording whether a particular event has occurred, called a success, or has not occurred, called a failure.
• If we denote the event in question by A and put θ = P(A), we have that the number of successes in the n performances is distributed Binomial(n, θ).
• For example, we could be testing light bulbs produced by a manufacturer, and θ is the probability that a bulb works when we test it. Then the number of bulbs that work in a batch of n is distributed Binomial(n, θ).
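The binomial probability function above is easy to evaluate numerically. A minimal sketch (my own helper name `binomial_pmf`, using Python's `math.comb` for C(n, x)):

```python
from math import comb

def binomial_pmf(x, n, theta):
    """p_X(x) = C(n, x) theta^x (1 - theta)^(n - x) for X ~ Binomial(n, theta)."""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)

# The probabilities over x = 0, ..., n sum to 1.
print(sum(binomial_pmf(x, 20, 0.5) for x in range(21)))  # ~1.0
# Bernoulli(theta) is the n = 1 special case.
print(binomial_pmf(1, 1, 0.3))  # 0.3
```

The second call checks the identity Bernoulli(θ) = Binomial(1, θ) stated on the previous slide.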
64. Example 1
• Consider a coin that comes up heads with probability θ = 0.25.
• If the coin is flipped 5 times, what is the resulting distribution for the number of heads showing?
65. Example 2
• 80 persons have to pass through a security gate.
• Let p be the probability that the alarm is activated, with p = 0.2129.
• Let X be the random variable corresponding to the number of persons activating the alarm.
• What is the probability that exactly 5 persons will raise the alarm?
67. • Consider repeatedly flipping a coin that has probability θ of coming up heads and probability 1 − θ of coming up tails, where again 0 < θ < 1.
• Let X be the number of tails that appear before the first head.
• Then for k ≥ 0, X = k if and only if the coin shows exactly k tails followed by a head.
• The probability of this is equal to (1 − θ)^k θ. (In particular, the probability of getting an infinite number of tails before the first head is equal to lim_{k→∞} (1 − θ)^k = 0, so X is never equal to infinity.)
• Hence, p_X(k) = (1 − θ)^k θ, for k = 0, 1, 2, 3, . . . .
• The random variable X is said to have the Geometric(θ) distribution;
• we write this as X ∼ Geometric(θ).
69. • The geometric distribution applies whenever we are counting the number of failures until the first success for independent performances of a random system where the occurrence of some event is considered a success.
• For example, the number of light bulbs tested that work until the first bulb that does not (a working bulb is considered a "failure" for the test).
• We note that some books instead define the geometric distribution to be the number of coin flips up to and including the first head, which is simply equal to one plus the random variable defined here.
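A one-line sketch of the geometric probability function, for illustration (the helper name `geometric_pmf` is my own):

```python
def geometric_pmf(k, theta):
    """p_X(k) = (1 - theta)^k * theta: k tails (failures) before the first head."""
    return (1 - theta) ** k * theta

# Probability of exactly two tails before the first head with a fair coin:
print(geometric_pmf(2, 0.5))  # (1/2)^2 * (1/2) = 0.125
```

With the alternative convention mentioned above (count flips up to and including the first head), the argument would simply shift by one.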
70. • We roll a die 3 times.
• What is the probability that a 6 shows up for the first time at the third roll?
72. • Generalizing the previous example, consider again repeatedly flipping a coin that has probability θ of coming up heads and probability 1 − θ of coming up tails.
• Let r be a positive integer, and let Y be the number of tails that appear before the rth head.
• Then for k ≥ 0, Y = k if and only if the coin shows exactly r − 1 heads (and k tails) on the first r − 1 + k flips, and then shows a head on the (r + k)-th flip.
• The probability of this is equal to
p_Y(k) = C(r − 1 + k, r − 1) θ^(r − 1) (1 − θ)^k θ = C(r − 1 + k, r − 1) θ^r (1 − θ)^k
for k = 0, 1, 2, 3, . . . .
73. • The random variable Y is said to have the Negative-Binomial(r, θ) distribution; we write this as Y ∼ Negative-Binomial(r, θ).
• The special case r = 1 corresponds to the Geometric(θ) distribution.
• So Negative-Binomial(1, θ) = Geometric(θ).
[Figure: probability functions of Negative-Binomial(2, 1/2) and Negative-Binomial(10, 1/2).]
74. • The Negative-Binomial(r, θ) distribution applies whenever we are counting the number of failures until the rth success for independent performances of a random system where the occurrence of some event is considered a success.
• For example, the number of working light bulbs tested until the third bulb that does not work follows the negative-binomial distribution.
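The formula from slide 72 can be sketched as follows (illustrative only; `negative_binomial_pmf` is my own name, and `math.comb` supplies C(r − 1 + k, r − 1)):

```python
from math import comb

def negative_binomial_pmf(k, r, theta):
    """p_Y(k) = C(r - 1 + k, r - 1) theta^r (1 - theta)^k:
    k failures before the r-th success."""
    return comb(r - 1 + k, r - 1) * theta**r * (1 - theta) ** k

# r = 1 reduces to the Geometric(theta) distribution:
print(negative_binomial_pmf(3, 1, 0.5))  # (1/2)^3 * (1/2) = 0.0625
```

Setting r = 1 makes the binomial coefficient equal to 1, recovering the geometric probability function, as the slide states.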
75. Example
An oil company conducts a geological study that indicates that an exploratory oil well should have a 20% chance of striking oil.
1. What is the probability that the first strike comes on the third well drilled?
2. What is the probability that the third strike comes on the seventh well drilled?
77. • We say that a random variable Y has the Poisson(λ) distribution, and write Y ∼ Poisson(λ), if
p_Y(y) = P(Y = y) = (λ^y / y!) e^(−λ)
for y = 0, 1, 2, 3, . . . .
• We note that since (from calculus) Σ_{y=0}^∞ λ^y / y! = e^λ,
• it is indeed true (as it must be) that Σ_{y=0}^∞ P(Y = y) = 1.
79. Motivation
• We motivate the Poisson distribution as follows.
• Suppose X ∼ Binomial(n, θ), i.e., X has the Binomial(n, θ) distribution.
• Then for 0 ≤ x ≤ n,
P(X = x) = C(n, x) θ^x (1 − θ)^(n − x).
• If we set θ = λ/n for some λ > 0, then this becomes
P(X = x) = C(n, x) (λ/n)^x (1 − λ/n)^(n − x) = [n(n − 1) · · · (n − x + 1) / x!] (λ/n)^x (1 − λ/n)^(n − x).
80. • Let us now consider what happens if we let n → ∞, while keeping x fixed at some nonnegative integer.
• In that case,
n(n − 1) · · · (n − x + 1) / n^x → 1.
• Indeed,
n(n − 1) · · · (n − x + 1) / n^x = 1 · (1 − 1/n)(1 − 2/n) · · · (1 − (x − 1)/n).
82. • Intuitively, we can phrase this result as follows.
• If we flip a very large number of coins n, and each coin has a very small probability θ = λ/n of coming up heads, then the probability that the total number of heads will be x is approximately given by
(λ^x / x!) e^(−λ).
83. • The figure displays the accuracy of this estimate when we are approximating the Binomial(100, 1/10) distribution by the Poisson(λ) distribution, where λ = nθ = 100(1/10) = 10.
[Figure: probability functions of Poisson(10) and Binomial(100, 1/10).]
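The accuracy shown in the figure can be checked numerically. A small sketch (illustrative; the helper names are mine), comparing the two probability functions at a few points:

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, theta):
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)

def poisson_pmf(y, lam):
    return lam**y * exp(-lam) / factorial(y)

# Compare Binomial(100, 1/10) against its Poisson(10) approximation.
for x in (5, 10, 15):
    print(x, round(binomial_pmf(x, 100, 0.1), 4), round(poisson_pmf(x, 10), 4))
```

The two columns agree to roughly two decimal places across the whole range, matching the visual impression of the figure.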
84. • The Poisson distribution is a good model for counting random occurrences of an event when there are many possible occurrences, but each occurrence has very small probability.
• Examples include
a. the number of house fires in a city on a given day,
b. the number of radioactive events recorded by a Geiger counter,
c. the number of phone calls arriving at a switchboard,
d. the number of hits on a popular World Wide Web page on a given day, etc.
86. • Suppose that an urn contains M white balls and N − M black balls.
• Suppose we draw n ≤ N balls from the urn in such a fashion that each subset of n balls has the same probability of being drawn.
• Because there are C(N, n) such subsets, this probability is 1 / C(N, n).
• One way of accomplishing this is to thoroughly mix the balls in the urn and then draw a first ball.
• Accordingly, each ball has probability 1/N of being drawn.
• Then, without replacing the first ball, we thoroughly mix the balls in the urn and draw a second ball.
• So each ball in the urn has probability 1/(N − 1) of being drawn.
87. • We then have that any two balls, say the ith and jth balls, are both drawn with probability
P(ith and jth balls are drawn)
= P(ith ball drawn first) P(jth ball drawn second | ith ball drawn first) + P(jth ball drawn first) P(ith ball drawn second | jth ball drawn first)
= (1/N)(1/(N − 1)) + (1/N)(1/(N − 1)) = 2 / (N(N − 1)) = 1 / C(N, 2).
• This type of sampling is called sampling without replacement.
88. • Given that we take a sample of n, let X denote the number of white balls obtained.
• Note that we must have X ≥ 0 and X ≥ n − (N − M), because at most N − M of the balls could be black.
• Hence, X ≥ max(0, n + M − N).
• Furthermore, X ≤ n and X ≤ M, because there are only M white balls.
• Hence, X ≤ min(n, M).
• So suppose max(0, n + M − N) ≤ x ≤ min(n, M).
• What is the probability that x white balls are obtained?
• In other words, what is P(X = x)?
89. • To evaluate this, we know that we need to count the number of subsets of n balls that contain x white balls.
• Using the combinatorial principles (ch. 1), we see that this number is given by
C(M, x) C(N − M, n − x).
• So
P(X = x) = C(M, x) C(N − M, n − x) / C(N, n).
• The random variable X is said to have the Hypergeometric(N, M, n) distribution.
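The counting formula above can be sketched directly (an illustration; `hypergeometric_pmf` is my own name, and `math.comb` returns 0 when the lower index exceeds the upper, which handles impossible values of x gracefully):

```python
from math import comb

def hypergeometric_pmf(x, N, M, n):
    """P(X = x) = C(M, x) C(N - M, n - x) / C(N, n):
    x white balls when drawing n without replacement from M white, N - M black."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# Example: N = 10 balls, M = 4 white, draw n = 3 without replacement.
print(hypergeometric_pmf(2, 10, 4, 3))  # C(4,2) C(6,1) / C(10,3) = 36/120 = 0.3
```

Summing over x = max(0, n + M − N), ..., min(n, M) gives 1, as required of a probability function.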
91. • Obviously, the hypergeometric distribution will apply to any context wherein we are sampling without replacement from a finite set of N elements and where each element of the set either has a characteristic or does not.
• For example, if we randomly select people to participate in an opinion poll so that each set of n individuals in a population of N has the same probability of being selected, then the number of people who respond yes to a particular question is distributed Hypergeometric(N, M, n), where M is the number of people in the entire population who would respond yes.
92. Summary 3
01 A random variable X is discrete if Σ_x P(X = x) = 1, i.e., if all its probability comes from being equal to particular values.
02 A discrete random variable X takes on only a finite, or countable, number of distinct values.
03 Important discrete distributions include the degenerate, Bernoulli, binomial, geometric, negative-binomial, Poisson, and hypergeometric distributions.
95. Exercise
• We toss a fair coin.
• Let X = 1 if the coin shows heads and X = 0 otherwise.
• What distribution does X follow?
• We toss the coin 10 times and let Y be the random variable that counts the number of heads. What distribution does Y follow?
97. Exercise 3
A die is rolled 3 times.
1. What is the probability that no 6 occurs?
98. Exercise 4
We roll a die 3 times.
1. Let X be the number of 5s that appear.
a. What is the distribution of X?
b. What is the probability of getting three 5s?
c. What is the probability of getting two 5s?
2. Let Y be the sum of the numbers showing.
a. Does Y follow the binomial distribution?
b. What is the probability P(Y = 15)?
3. What is the probability of getting the first 5 at the third roll?
4. What is the probability of getting the sum 15 three times, in three repetitions of the experiment?
100. Definition
• Given a random variable X, its cumulative distribution function (or distribution function, or cdf) is the function F_X : R1 → [0, 1] defined by
F_X(x) = P(X ≤ x).
101. CDF Properties
Let F_X be the cumulative distribution function of a random variable X. Then
• 0 ≤ F_X(x) ≤ 1 for all x,
• F_X(x) ≤ F_X(y) whenever x ≤ y (i.e., F_X is increasing),
• lim_{x→∞} F_X(x) = 1,
• lim_{x→−∞} F_X(x) = 0.
102. Theorem
• Let X be a discrete random variable with probability function p_X. Then its cumulative distribution function F_X satisfies
F_X(x) = Σ_{y ≤ x} p_X(y).
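This theorem translates into a short sketch (illustrative only; `cdf_from_pmf` is my own name). It uses the two-coin example that follows, where p_X(0) = 1/4, p_X(1) = 1/2, p_X(2) = 1/4:

```python
def cdf_from_pmf(pmf):
    """Return F_X with F_X(x) = sum of p_X(y) over values y <= x."""
    def F(x):
        return sum(p for y, p in pmf.items() if y <= x)
    return F

# Number of heads in two fair coin flips.
F = cdf_from_pmf({0: 0.25, 1: 0.5, 2: 0.25})
print(F(-1), F(0), F(1.5), F(2))  # 0 0.25 0.75 1.0
```

Note that F is a step function: it is flat between the possible values and jumps by p_X(y) at each value y, which is the characteristic shape of a discrete cdf.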
103. Example 1
We toss a fair coin twice. Let X be the number of heads.
What is F_X(x)?
107. Example 2
• Suppose we roll one fair six-sided die, so that S = {1, 2, 3, 4, 5, 6} and P(s) = 1/6 for all s ∈ S.
• Let X be the random variable given by the number showing divided by 6:
X(s) = s/6 for all s ∈ S.
• What is F_X(x)?
115. Exercise 1
• Consider rolling one fair six-sided die, so that S = {1, 2, 3, 4, 5, 6} and P(s) = 1/6 for all s ∈ S. Let X be the number showing on the die, so that X(s) = s for s ∈ S.
• Let Y = X^2. Compute the cumulative distribution function F_Y(y) = P(Y ≤ y), for all y ∈ R1.
117. • In many real scenarios, we need to analyze several variables at the same time.
• For this, we will show how to deal with two random variables; the treatment generalizes to more than two variables.
118. Definition (Joint probability function)
• Let X and Y be discrete random variables.
• Then their joint probability function, p_{X,Y}, is a function from R2 to R1, defined by
p_{X,Y}(x, y) = P(X = x, Y = y).
120. Theorem (marginal probability function)
• Let X and Y be two discrete random variables, with joint probability function p_{X,Y}. Then the probability function p_X of X can be computed as
p_X(x) = Σ_y p_{X,Y}(x, y).
• Similarly, the probability function p_Y of Y can be computed as
p_Y(y) = Σ_x p_{X,Y}(x, y).
121. Example 1
• Find the marginal probability functions of X and Y.

        Y = 0   Y = 1   Y = 2
X = 0   1/6     1/4     1/8
X = 1   1/8     1/6     1/6
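The marginalization theorem can be applied to the table above with a short sketch (illustrative; the dict layout and names are my own, with `fractions.Fraction` used to keep the arithmetic exact):

```python
from fractions import Fraction

# Joint probability function from the table above: keys are (x, y) pairs.
joint = {
    (0, 0): Fraction(1, 6), (0, 1): Fraction(1, 4), (0, 2): Fraction(1, 8),
    (1, 0): Fraction(1, 8), (1, 1): Fraction(1, 6), (1, 2): Fraction(1, 6),
}

p_X, p_Y = {}, {}
for (x, y), p in joint.items():
    p_X[x] = p_X.get(x, 0) + p  # sum over y for each fixed x
    p_Y[y] = p_Y.get(y, 0) + p  # sum over x for each fixed y

print(p_X)  # p_X(0) = 13/24, p_X(1) = 11/24
print(p_Y)  # p_Y(0) = 7/24, p_Y(1) = 5/12, p_Y(2) = 7/24
```

Each marginal sums the joint probabilities along one row or column, and both marginals themselves sum to 1.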
122. Example 2
• Suppose the joint probability function of X and Y is given by
p_{X,Y}(x, y) = 1/7 if (x, y) = (5, 0),
               1/7 if (x, y) = (5, 3),
               1/7 if (x, y) = (5, 4),
               3/7 if (x, y) = (8, 0),
               1/7 if (x, y) = (8, 4),
               0 otherwise.
126. Definition
• If X and Y are random variables, then the joint distribution of X and Y is the collection of probabilities P((X, Y) ∈ B), for all subsets B ⊆ R2 of pairs of real numbers.
127. Definition Joint cdf
• Let X and Y be random variables.
• Then their joint cumulative distribution function is the function F_{X,Y} : R2 → [0, 1] defined by
F_{X,Y}(x, y) = P(X ≤ x, Y ≤ y)
(the comma means "and" here, so that F_{X,Y}(x, y) is the probability that X ≤ x and Y ≤ y).
131. • What is the probability P(a < X ≤ b, c < Y ≤ d)?
132. Theorem
• Let X and Y be any random variables, with joint cumulative distribution function F_{X,Y}.
• Suppose a ≤ b and c ≤ d. Then
P(a < X ≤ b, c < Y ≤ d) = F_{X,Y}(b, d) − F_{X,Y}(a, d) − F_{X,Y}(b, c) + F_{X,Y}(a, c).
135. Theorem (marginal distributions)
• Let X and Y be two random variables, with joint cumulative distribution function F_{X,Y}. Then the cumulative distribution function F_X of X satisfies
F_X(x) = lim_{y→∞} F_{X,Y}(x, y)
for all x ∈ R1.
• Similarly, the cumulative distribution function F_Y of Y satisfies
F_Y(y) = lim_{x→∞} F_{X,Y}(x, y)
for all y ∈ R1.
136. Summary
01 It is often important to keep track of the joint probabilities of two random variables, X and Y.
02 Their joint cumulative distribution function is given by F_{X,Y}(x, y) = P(X ≤ x, Y ≤ y).
03 If X and Y are discrete, then their joint probability function is given by p_{X,Y}(x, y) = P(X = x, Y = y).
138. • Consider an urn containing:
a. 4 white balls,
b. 3 red balls,
c. 3 black balls.
• We pick 5 balls without replacement.
• What is the probability of getting 2 white and 2 red balls?
140. • Let X be the random variable "number of red balls" and Y be the "number of white balls".
a. Find P(X = 2, Y = 2).
b. Find the joint probability function of X and Y.
c. Does Σ_{x,y} p_{X,Y}(x, y) = 1?
d. Find P(X = Y).
e. Find P(X > Y).
f. Find P(X = 3).
g. Find P(Y = 2).
151. Definition
• Let X and Y be random variables, and suppose that P(X = x) > 0.
• The conditional distribution of Y, given that X = x, is the probability distribution assigning probability
P(Y ∈ B, X = x) / P(X = x)
to each event Y ∈ B.
• In particular, it assigns probability
P(a < Y ≤ b, X = x) / P(X = x)
to the event that a < Y ≤ b.
152. Example
• Suppose the joint probability function of X and Y is given by
p_{X,Y}(x, y) = 1/7 if (x, y) = (5, 0),
               1/7 if (x, y) = (5, 3),
               1/7 if (x, y) = (5, 4),
               3/7 if (x, y) = (8, 0),
               1/7 if (x, y) = (8, 4),
               0 otherwise.
154. Definition
• Suppose X and Y are two discrete random variables.
• Then the conditional probability function of Y, given X, is the function p_{Y|X} defined by
p_{Y|X}(y | x) = p_{X,Y}(x, y) / p_X(x) = p_{X,Y}(x, y) / Σ_z p_{X,Y}(x, z),
• defined for all y ∈ R1 and all x with p_X(x) > 0.
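Applied to the joint probability function from the example above (slide 152), the definition can be sketched as follows (illustrative; the helper name `conditional_pf` is my own):

```python
from fractions import Fraction

# Joint probability function from the example: X takes values 5, 8; Y takes 0, 3, 4.
joint = {(5, 0): Fraction(1, 7), (5, 3): Fraction(1, 7), (5, 4): Fraction(1, 7),
         (8, 0): Fraction(3, 7), (8, 4): Fraction(1, 7)}

def conditional_pf(joint, x):
    """p_{Y|X}(y | x) = p_{X,Y}(x, y) / p_X(x), defined when p_X(x) > 0."""
    p_x = sum(p for (a, _), p in joint.items() if a == x)  # marginal p_X(x)
    return {y: p / p_x for (a, y), p in joint.items() if a == x}

print(conditional_pf(joint, 8))  # p_{Y|X}(0|8) = 3/4, p_{Y|X}(4|8) = 1/4
print(conditional_pf(joint, 5))  # uniform: 1/3 on each of y = 0, 3, 4
```

Dividing by the marginal renormalizes the row of the joint table so the conditional probabilities sum to 1.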
158. Definition
• Let X and Y be two random variables.
• Then X and Y are independent if, for all subsets B1 and B2 of the real numbers,
P(X ∈ B1, Y ∈ B2) = P(X ∈ B1) P(Y ∈ B2).
• That is, the events "X ∈ B1" and "Y ∈ B2" are independent events.
159. Theorem
• Let X and Y be two random variables. Then X and Y are independent if and only if
P(a ≤ X ≤ b, c ≤ Y ≤ d) = P(a ≤ X ≤ b) P(c ≤ Y ≤ d)   (2.8.1)
whenever a ≤ b and c ≤ d.
160. Theorem
• Let X and Y be two discrete random variables.
• X and Y are independent if and only if their joint probability function p_{X,Y} satisfies
p_{X,Y}(x, y) = p_X(x) p_Y(y)
for all x, y ∈ R1.
161. Example
Are X and Y independent?

        Y = 0   Y = 1   Y = 2
X = 0   1/6     1/4     1/8
X = 1   1/8     1/6     1/6
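The factorization criterion can be checked mechanically for the table above; here is an illustrative sketch (the variable names are mine), comparing every joint entry with the product of the marginals:

```python
from fractions import Fraction
from itertools import product

joint = {
    (0, 0): Fraction(1, 6), (0, 1): Fraction(1, 4), (0, 2): Fraction(1, 8),
    (1, 0): Fraction(1, 8), (1, 1): Fraction(1, 6), (1, 2): Fraction(1, 6),
}

p_X = {x: sum(p for (a, _), p in joint.items() if a == x) for x in (0, 1)}
p_Y = {y: sum(p for (_, b), p in joint.items() if b == y) for y in (0, 1, 2)}

# Independence requires p_{X,Y}(x, y) = p_X(x) p_Y(y) for every pair (x, y).
independent = all(joint[(x, y)] == p_X[x] * p_Y[y]
                  for x, y in product((0, 1), (0, 1, 2)))
print(independent)  # False: e.g., p_{X,Y}(0, 0) = 1/6 but p_X(0) p_Y(0) = 91/576
```

A single failing pair is enough to rule out independence; here the very first entry already fails.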
162. • Suppose now that we have n random variables X_1, . . . , X_n.
• The random variables are independent if and only if the collection of events {a_i ≤ X_i ≤ b_i} are independent, whenever a_i ≤ b_i, for all i = 1, 2, . . . , n.
• Generalizing the previous theorem, we have the following result.
163. • Let X_1, . . . , X_n be a collection of discrete random variables.
• X_1, . . . , X_n are independent if and only if their joint probability function p_{X_1,...,X_n} satisfies
p_{X_1,...,X_n}(x_1, . . . , x_n) = p_{X_1}(x_1) · · · p_{X_n}(x_n)
for all x_1, . . . , x_n ∈ R1.
164. Independent and identically distributed random variables
• A collection X_1, . . . , X_n of random variables is independent and identically distributed (or i.i.d.) if the collection is independent and if, furthermore, each of the n variables has the same distribution.
• The i.i.d. sequence X_1, . . . , X_n is also referred to as a sample from the common distribution.
165. • In particular, if a collection X_1, . . . , X_n of random variables is i.i.d. and discrete, then each of the probability functions p_{X_i} is the same, so that p_{X_1}(x) = · · · = p_{X_n}(x) = p(x), for all x ∈ R1.
• Furthermore, from the independence theorem, it follows that
p_{X_1,...,X_n}(x_1, . . . , x_n) = p_{X_1}(x_1) · · · p_{X_n}(x_n) = p(x_1) · · · p(x_n)
for all x_1, . . . , x_n ∈ R1.
168. • Consider k random variables X_1, . . . , X_k following the multinomial distribution, written (X_1, . . . , X_k) ∼ Multinomial(n, θ_1, . . . , θ_k), in the following case:
• The experiment consists of n repeated trials.
• Each trial has a discrete number of possible outcomes.
• On any given trial, the probability that a particular outcome i will occur is constant, θ_i.
• The trials are independent; that is, the outcome on one trial does not affect the outcome on other trials.
170. Examples
• Consider a bowl of chips of which a proportion θ_i of the chips are labelled i (for i = 1, 2, 3).
• If we randomly draw a chip from the bowl and observe its label s, then P(s = i) = θ_i.
• We draw 3 chips with replacement and consider the random variable X_i: the number of chips labelled i.
• Then (X_1, X_2, X_3) ∼ Multinomial(3, θ_1, θ_2, θ_3).
171. Examples
• Consider a population of students at a university of which a proportion θ_1 live on campus (denoted by s = 1), a proportion θ_2 live off-campus with their parents (denoted by s = 2), and a proportion θ_3 live off-campus independently (denoted by s = 3).
• If we randomly draw a student from this population and determine s for that student, then P(s = i) = θ_i.
• Let X_i be the number of sampled students with s = i.
172. • In a random sample of 10 people, what is the probability that 6 have type O, 2 have type A, 1 has type B, and 1 has type AB?
• X_1: # O, X_2: # A, X_3: # B, X_4: # AB.
• P(X_1 = 6, X_2 = 2, X_3 = 1, X_4 = 1) = [10! / (6! 2! 1! 1!)] × 0.44^6 × 0.42^2 × 0.10^1 × 0.04^1

Blood type    O      A      B      AB
Probability   0.44   0.42   0.10   0.04
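The multinomial calculation above can be sketched as a general function (illustrative only; `multinomial_pmf` is my own name):

```python
from math import factorial

def multinomial_pmf(counts, probs):
    """P(X_1 = x_1, ..., X_k = x_k) = n!/(x_1! ... x_k!) * p_1^x_1 ... p_k^x_k."""
    n = sum(counts)
    coef = factorial(n)
    for x in counts:
        coef //= factorial(x)       # multinomial coefficient n!/(x_1! ... x_k!)
    prob = float(coef)
    for x, p in zip(counts, probs):
        prob *= p ** x
    return prob

# Blood types in a sample of 10: six O, two A, one B, one AB.
print(multinomial_pmf([6, 2, 1, 1], [0.44, 0.42, 0.10, 0.04]))  # ~0.0129
```

With k = 2 the function reduces to the binomial probability function, since the second count and probability are determined by the first.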
175. • An urn contains 8 red balls, 3 yellow, and 9 white. 6 balls are randomly selected with replacement.
• What is the probability that 2 are red, 1 is yellow, and 3 are white?
176. • Suppose a card is drawn randomly from an ordinary deck of playing cards and then put back in the deck. This exercise is repeated five times. What is the probability of drawing 1 spade, 1 heart, 1 diamond, and 2 clubs?
References
Michael J. Evans and Jeffrey S. Rosenthal, Probability and Statistics: The Science of Uncertainty, Second Edition, University of Toronto.