Advanced
Probability and
Statistics
CPIT701
Dr. Asma Cherif
Discrete Random
Variables and
Probability
Distribution
Chapter 2
TABLE OF CONTENTS
01 Random Variables
02 Distributions of Random Variables
03 Discrete Distributions
04 Cumulative Distribution Function
05 Joint Distributions
06 Conditioning and Independence
General Overview
This chapter is concerned with the definitions of random
variables, distribution functions, probability functions,
density functions, and the development of the concepts
necessary for carrying out calculations for a probability
model using these entities.
It also discusses the concept of the conditional
distribution of one random variable, given the values of
others.
Conditional distributions of random variables provide
the framework for discussing what it means to say
that variables are related, which is important in many
applications of probability and statistics.
Random
Variables
01
● The previous chapter explained how to construct
probability models, including a sample space S and a
probability measure P.
● Once we have a probability model, we may define
random variables for that probability model.
● Intuitively, a random variable assigns a numerical
value to each possible outcome in the sample space.
● For example, if the sample space is {rain, snow, clear},
then we might define a random variable X such that X =
3 if it rains, X = 6 if it snows, and X = βˆ’2.7 if it is clear.
Definition 2.1.1
A random variable is a function from the sample space
S to the set R1 of all real numbers.
Examples
Example 1
● The weather random variable could be written
formally as
X : {rain, snow, clear} β†’ R1
by X(rain) = 3, X(snow) = 6, and X(clear) = βˆ’2.7.
π‘₯𝑖 rain snow clear
𝑋(π‘₯𝑖) 3 6 -2.7
Example 2
● If the sample space corresponds to flipping three
different coins, then we could let
● X be the total number of heads showing,
● Y be the total number of tails showing,
● Z = 0 if there is exactly one head, and otherwise Z = 17,
etc.
π‘₯𝑖 HHH HHT HTH HTT THH THT TTH TTT
𝑋(π‘₯𝑖) 3 2 2 1 2 1 1 0
Y(π‘₯𝑖) 0 1 1 2 1 2 2 3
𝑍(π‘₯𝑖) 17 17 17 0 17 0 0 17
Example 3
● If the sample space corresponds to rolling two fair
dice, then we could let X be the square of the number
showing on the first die, let Y be the square of the
number showing on the second die, let Z be the sum of
the two numbers showing, let W be the square of the
sum of the two numbers showing, let R be the sum of
the squares of the two numbers showing, etc.
Example 4
Constants as Random Variables
● As a special case, every constant value c is also a random
variable, by saying that c(s) = c for all s ∈ S. Thus, 5 is a random
variable, as are 3 and −21.6.
Example 5 Indicator Functions
● One special kind of random variable is worth mentioning.
● If A is any event, then we can define the indicator function of A,
written 𝐼𝐴, to be the random variable
I_A(s) = 1 if s ∈ A
I_A(s) = 0 if s ∉ A
● which is equal to 1 on A, and is equal to 0 on A^C.
Example
● When a student calls a university help desk for technical support,
he/she will either immediately be able to speak to someone (S, for
success) or will be placed on hold (F, for failure).
● Define X to be the random variable that indicates whether (1) or not
(0) the student can immediately speak to someone.
● Given random variables X and Y, we can perform the usual
arithmetic operations on them.
● Thus, for example, Z = X² is another random variable, defined
by Z(s) = X²(s) = (X(s))² = X(s) × X(s).
● Similarly, if W = XY³, then W(s) = X(s) × Y(s) × Y(s) × Y(s), etc.
● Also, if Z = X + Y, then Z(s) = X(s) + Y(s), etc.
Example
● Consider rolling a fair six-sided die, so that S = {1, 2, 3, 4, 5, 6}.
● Let X be the number showing, so that X(s) = s for s ∈ S.
● Let Y be three more than the number showing, so that Y(s) = s +
3.
● Let Z = X² + Y.
● Then Z(s) = (X(s))² + Y(s) = s² + s + 3.
● So Z(1) = 5, Z(2) = 9, etc.
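The computation Z(s) = s² + s + 3 can be checked numerically (a sketch, not from the slides; function names follow the definitions above):

```python
S = [1, 2, 3, 4, 5, 6]

def X(s):
    return s           # the number showing

def Y(s):
    return s + 3       # three more than the number showing

def Z(s):
    return X(s) ** 2 + Y(s)

for s in S:
    print(s, Z(s))     # Z(1) = 5, Z(2) = 9, ..., Z(6) = 45
```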
● We write X = Y to mean that X(s) = Y(s) for all s ∈ S.
● Similarly, we write X ≀ Y to mean that X(s) ≀ Y(s) for all s
∈ S, and X β‰₯ Y to mean that X(s) β‰₯ Y(s) for all s ∈ S.
● For example, we write X ≀ c to mean that X(s) ≀ c for all s
∈ S.
● Again consider rolling a fair six-sided die, with S = {1, 2, 3, 4, 5,
6}.
● For s ∈ S, let X(s) = s, and let Y = X + I_{6}. This means that
Y(s) = X(s) + I_{6}(s) = s if s ≤ 5, and 7 if s = 6.
● Hence, Y(s) = X(s) for 1 ≤ s ≤ 5. But it is not true that Y = X,
because Y(6) ≠ X(6).
● On the other hand, it is true that Y ≥ X.
Definition 2
● Any random variable whose only possible values are 0
and 1 is called a Bernoulli random variable.
Summary 1
01 A random variable is a function from the sample space to the set of real numbers.
02 The function could be constant, correspond to counting some random quantity that arises, or be any other sort of function.
Exercises
Exercise 1
● Let S = {high, middle, low}.
● Define random variables X, Y, and Z by X(high) = βˆ’12, X(middle)
= βˆ’2, X(low) = 3, Y(high) = 0, Y(middle) = 0, Y(low) = 1, Z(high) =
6, Z(middle) = 0, Z(low) = 4.
Determine whether each of the following relations is true or false.
(a) X < Y
(b) X ≀ Y
(c) Y < Z
(d) Y ≀ Z
(e) XY < Z
(f) XY ≀ Z
Exercise 2
● Let S = {1, 2, 3, 4, 5}.
(a) Define two different (i.e., nonequal) nonconstant random
variables, X and Y, on S.
(b) For the random variables X and Y that you have chosen, let Z =
X + Y². Compute Z(s) for all s ∈ S.
Exercise 3
● Consider rolling a fair six-sided die, so that S = {1, 2, 3, 4, 5, 6}.
● Let X(s) = s, and Y(s) = s³ + 2.
● Let Z = XY.
● Compute Z(s) for all s ∈ S.
Distributions of
Random
Variables
02
● Because random variables are defined to be functions of the
outcome s, and because the outcome s is assumed to be
random (i.e., to take on different values with different
probabilities), it follows that the value of a random variable will
itself be random (as the name implies).
● Specifically, if X is a random variable, then what is the probability
that X will equal some particular value x?
● Well, X = x precisely when the outcome s is chosen such that
X(s) = x.
Examples
π‘₯𝑖 rain snow clear
𝑋(π‘₯𝑖) 3 6 -2.7
𝑃(𝑋 = π‘₯𝑖) 0.4 0.15 0.45
P(X ∈ {3, 6}) =
P(X < 5) = P(X = 3) + P(X = βˆ’2.7) = 0.4 + 0.45 = 0.85
P(X = 3) + P(X = 6) = 0.4 + 0.15 = 0.55
● We see from the previous example that, if B is any subset of the
real numbers, then P(X ∈ B) = P({s ∈ S : X(s) ∈ B}).
● Furthermore, to understand X well requires knowing the
probabilities P(X ∈ B) for different subsets B.
● That is the motivation for the following definition.
Definition
● If X is a random variable, then the distribution of X
is the collection of probabilities P(X ∈ B) for all
subsets B of the real numbers.
Examples
π‘₯𝑖 rain snow clear
𝑋(π‘₯𝑖) 3 6 -2.7
𝑃(𝑋 = π‘₯𝑖) 0.4 0.15 0.45
What is the distribution of X?
Well, if B is any subset of the real numbers, then P(X ∈ B) should count
0.4 if 3 ∈ B, plus 0.15 if 6 ∈ B, plus 0.45 if βˆ’2.7 ∈ B.
We can formally write all this information at once by saying that
P(X ∈ B) = 0.4 IB(3) + 0.15 IB(6) + 0.45 IB(βˆ’2.7),
where again IB(x) = 1 if x ∈ B, and IB(x) = 0 if x βˆ‰ B
π’šπ‘– rain snow clear
π‘Œ(𝑦𝑖) 5 7 5
𝑃(π‘Œ = 𝑦𝑖) 0.4 0.15 0.45
P(Y = 5) = P({rain, clear}) = 0.4 + 0.45 = 0.85.
Therefore, if B is any subset of the real numbers, then
P(Y ∈ B) = 0.15 IB(7) + 0.85 IB(5)
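The indicator-function representation of a distribution can be sketched directly in code (illustrative only; P_X below encodes the weather variable X from the example):

```python
def I(B, x):
    # indicator: 1 if x is in the set B, else 0
    return 1 if x in B else 0

def P_X(B):
    # distribution of X: P(X in B) = 0.4 I_B(3) + 0.15 I_B(6) + 0.45 I_B(-2.7)
    return 0.4 * I(B, 3) + 0.15 * I(B, 6) + 0.45 * I(B, -2.7)

print(P_X({3, 6}))      # ≈ 0.55
print(P_X({3, -2.7}))   # P(X < 5) ≈ 0.85
```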
● While the above examples show that it is possible to keep track
of P(X ∈ B) for all subsets B of the real numbers, they also
indicate that it is rather cumbersome to do so.
● Fortunately, there are simpler functions available to help us keep
track of probability distributions, including cumulative distribution
functions, probability functions, and density functions.
● We discuss these next
Summary 2
01 The distribution of a random variable X is the collection of probabilities P(X ∈ B) of X belonging to various sets.
02 The probability P(X ∈ B) is determined by calculating the probability of the set of response values s such that X(s) ∈ B, i.e., P(X ∈ B) = P({s ∈ S : X(s) ∈ B}).
Exercises
Exercise 1
● Consider flipping two independent fair coins.
● Let X be the number of heads that appear.
● Compute P(X = x) for all real numbers x.
Exercise 2
● Suppose we flip three fair coins, and let X be the number of
heads showing.
(a) Compute P(X = x) for every real number x.
(b) Write a formula for P(X ∈ B), for any subset B of the real
numbers.
Exercise 3
● Suppose we roll two fair six-sided dice, and let Y be the sum of
the two numbers showing.
(a) Compute P(Y = y) for every real number y.
(b) Write a formula for P(Y ∈ B), for any subset B of the real
numbers.
Discrete
Distributions
03
Definition
● A random variable X is discrete if
Σ_{x ∈ R1} P(X = x) = 1        (2.3.1)
● Random variables satisfying (2.3.1) are simple in some sense
because we can understand them completely just by
understanding their probabilities of being equal to particular
values x.
● Indeed, by simply listing out all the possible values x such that
P(X = x) > 0, we obtain a second, equivalent definition, as
follows.
Equivalent definition
● A random variable X is discrete if there is a finite or
countable sequence x1, x2, . . . of distinct real numbers,
and a corresponding sequence p1, p2, . . . of nonnegative
real numbers, such that P(X = x_i) = p_i for all i, and
Σ_i p_i = 1.        (2.3.2)
Definition:
Probability function
● For a discrete random variable X, its probability
function is the function 𝑝𝑋 ∢ 𝑅1 β†’ [0, 1] defined by
𝑝𝑋 (π‘₯) = 𝑃(𝑋 = π‘₯).
● Hence, if x1, x2, . . . are the distinct values such that P(X = x_i) = p_i
for all i with Σ_i p_i = 1, then
p_X(x) = p_i  if x = x_i for some i
p_X(x) = 0    otherwise
● Clearly, all the information about the distribution of X is
contained in its probability function, but only if we know that X is
a discrete random variable.
● Finally, we note that Theorem 1.5.1 immediately implies the
following.
Theorem 1.5.1
(Law of total probability, conditioned
version)
● Let A1, A2, . . . be events that form a partition of the
sample space S, each of positive probability.
● Let B be any event.
● Then
P(B) = P(A1)P(B | A1) + P(A2)P(B | A2) + Β· Β· Β· .
Reminder of Theorem
2.3.1
(Law of total probability, discrete random variable
version)
● Let X be a discrete random variable, and let A be
some event. Then
P(A) = Σ_{x ∈ R1} P(X = x) P(A | X = x)
Exercises
● Consider flipping a fair coin. Let Z = 1 if the coin is heads, and Z
= 3 if the coin is tails. Let W = Z² + Z.
(a) What is the probability function of Z?
(b) What is the probability function of W?
● Consider flipping two fair coins. Let X = 1 if the first coin is
heads, and X = 0 if the first coin is tails. Let Y = 1 if the second
coin is heads, and Y = 5 if the second coin is tails.
● Let Z = XY. What is the probability function of Z?
Important
Discrete
Distributions
Degenerate
Distributions
● Let c be some fixed real number.
● Then, as already discussed, c is also a random variable (in fact,
c is a constant random variable).
● In this case, clearly c is discrete, with probability function 𝑝𝑐
satisfying that 𝑝𝑐(𝑐) = 1, and 𝑝𝑐(π‘₯) = 0 for π‘₯ β‰  𝑐.
● Because c is always equal to a particular value (namely, c) with
probability 1, the distribution of c is sometimes called a point
mass or point distribution or degenerate distribution.
The Bernoulli
Distribution
● Consider flipping a coin that has probability ΞΈ of coming up
heads and probability 1βˆ’ΞΈ of coming up tails, where 0 < ΞΈ < 1.
● Let X = 1 if the coin is heads, while X = 0 if the coin is tails.
● Then 𝑝𝑋(1) = 𝑃(𝑋 = 1) = πœƒ,
● while 𝑝𝑋(0) = 𝑃(𝑋 = 0) = 1 βˆ’ πœƒ .
● The random variable X is said to have the Bernoulli(ΞΈ)
distribution;
● We write this as X ∼ Bernoulli(ΞΈ).
● We read: X follows the Bernoulli distribution.
● Bernoulli distributions arise anytime we have a response variable
that takes only two possible values, and we label one of these
outcomes as 1 and the other as 0.
● For example, 1 could correspond to success and 0 to failure of
some quality test applied to an item produced in a manufacturing
process.
● In this case, ΞΈ is the proportion of manufactured items that will
pass the test.
● Alternatively, we could be randomly selecting an individual from
a population and recording a 1 when the individual is female and
a 0 if the individual is a male. In this case, ΞΈ is the proportion of
females in the population.
Examples
● A student has to sit for an exit exam. The probability
she passes the exam is 0.7.
● Let X be the random variable defined as follows
● X= 1 if the student passes
● X=0 otherwise
● What is the distribution of X?
The Binomial
Distribution
● Consider flipping n coins, each of which has (independent)
probability ΞΈ of coming up heads, and probability 1 βˆ’ΞΈ of coming
up tails. (Again, 0 < ΞΈ < 1.)
● Let X be the total number of heads showing.
● By (1.4.2), we see that for x = 0, 1, 2, . . . , n,
p_X(x) = P(X = x) = C(n, x) θ^x (1 − θ)^{n−x}
where C(n, x) = n!/(x!(n−x)!) is the binomial coefficient.
● The random variable X is said to have the Binomial(n, θ) distribution;
we write this as X ∼ Binomial(n, θ).
● The Bernoulli(ΞΈ ) distribution corresponds to the special
case of the Binomial(n, ΞΈ ) distribution when n = 1,
namely, Bernoulli(ΞΈ ) = Binomial(1, ΞΈ ).
● The following figure contains the plots of several
Binomial(20, ΞΈ ) probability functions.
[Figure: probability functions of Binomial(20, 1/2) and Binomial(20, 1/5)]
● The binomial distribution is applicable to any situation involving n
independent performances of a random system; for each
performance, we are recording whether a particular event has
occurred, called a success, or has not occurred, called a failure.
● If we denote the event in question by A and put ΞΈ = P(A), we
have that the number of successes in the n performances is
distributed Binomial(n, ΞΈ ).
● For example, we could be testing light bulbs produced by a
manufacturer, and ΞΈ is the probability that a bulb works when we
test it. Then the number of bulbs that work in a batch of n is
distributed Binomial(n, ΞΈ ).
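The Binomial(n, θ) probability function can be evaluated with a few lines of Python (a sketch; `binomial_pmf` is a name chosen here, not from the slides):

```python
from math import comb

def binomial_pmf(x, n, theta):
    # P(X = x) for X ~ Binomial(n, theta)
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

# e.g., the probability of exactly 18 working bulbs in a batch of 20
# when each bulb works with probability 0.9
print(binomial_pmf(18, 20, 0.9))
```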
Examples
Example 1
● Consider a coin that comes up heads with probability θ = 0.25.
● If the coin is flipped 5 times, what is the resulting
distribution for the number of heads showing?
Example 2
● 80 persons have to pass through a security gate.
● Let p be the probability that the alarm is activated, with p =
0.2129.
● Let X be the random variable corresponding to the
number of persons activating the alarm.
● What is the probability that only 5 persons will raise
the alarm?
The Geometric
Distribution
● Consider repeatedly flipping a coin that has probability ΞΈ of
coming up heads and probability 1 βˆ’ ΞΈ of coming up tails, where
again 0 < ΞΈ < 1.
● Let X be the number of tails that appear before the first head.
● Then for k β‰₯ 0, X = k if and only if the coin shows exactly k tails
followed by a head.
● The probability of this is equal to (1 − θ)^k θ. (In particular, the
probability of getting an infinite number of tails before the first
head is equal to (1 − θ)^∞ θ = 0, so X is never equal to infinity.)
● Hence, p_X(k) = (1 − θ)^k θ, for k = 0, 1, 2, 3, . . . .
● The random variable X is said to have the Geometric(ΞΈ)
distribution;
● we write this as X ∼ Geometric(ΞΈ ).
[Figure: probability functions of Geometric(1/2) and Geometric(1/5)]
● The geometric distribution applies whenever we are counting the
number of failures until the first success for independent
performances of a random system where the occurrence of
some event is considered a success.
● For example, the number of light bulbs tested that work until the
first bulb that does not (a working bulb is considered a β€œfailure”
for the test).
● We note that some books instead define the geometric
distribution to be the number of coin flips up to and including the
first head, which is simply equal to one plus the random variable
defined here.
● We roll a die 3 times.
● What is the probability that a 6 shows up for the first time on
the third roll?
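The die example above can be checked with the Geometric(θ) probability function (an illustrative sketch; `geometric_pmf` is a name chosen here):

```python
def geometric_pmf(k, theta):
    # P(X = k): exactly k failures (tails) before the first success (head)
    return (1 - theta)**k * theta

# first 6 on the third roll = two non-sixes, then a six
print(geometric_pmf(2, 1/6))   # 25/216, about 0.116
```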
The Negative-
Binomial
Distribution
● Generalizing the previous example, consider again repeatedly
flipping a coin that has probability ΞΈ of coming up heads and
probability 1 βˆ’ ΞΈ of coming up tails.
● Let r be a positive integer, and let Y be the number of tails that
appear before the rth head.
● Then for k β‰₯ 0, Y = k if and only if the coin shows exactly r βˆ’ 1
heads (and k tails) on the first r βˆ’ 1 + k flips, and then shows a
head on the (r + k)-th flip.
● The probability of this is equal to
p_Y(k) = C(r−1+k, r−1) θ^{r−1} (1 − θ)^k θ = C(r−1+k, r−1) θ^r (1 − θ)^k
for k = 0, 1, 2, 3, . . . .
● The random variable Y is said to have the Negative-Binomial(r, ΞΈ
) distribution; we write this as Y ∼ Negative-Binomial(r, θ ).
● The special case r = 1 corresponds to the Geometric(ΞΈ )
distribution.
● So Negative-Binomial(1, ΞΈ ) = Geometric(ΞΈ ).
[Figure: probability functions of Negative-Binomial(2, 1/2) and Negative-Binomial(10, 1/2)]
● The Negative-Binomial(r, ΞΈ ) distribution applies whenever we
are counting the number of failures until the r th success for
independent performances of a random system where the
occurrence of some event is considered a success.
● For example, the number of light bulbs tested that work until the
third bulb that does not follows the negative-binomial distribution.
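A sketch of the Negative-Binomial(r, θ) probability function (names are illustrative, not from the slides):

```python
from math import comb

def neg_binomial_pmf(k, r, theta):
    # P(Y = k): exactly k failures before the r-th success
    return comb(r - 1 + k, r - 1) * theta**r * (1 - theta)**k

# r = 1 reduces to the geometric distribution
print(neg_binomial_pmf(3, 1, 0.5), 0.5**4)  # both 0.0625
```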
Example
An oil company conducts a geological study that indicates
that an exploratory oil well should have a 20% chance of
striking oil.
1. What is the probability that the first strike comes on
the third well drilled?
2. What is the probability that the third strike comes on
the seventh well drilled?
The Poisson
Distribution
● We say that a random variable Y has the Poisson(Ξ») distribution,
and write Y ∼ Poisson(λ), if
π‘π‘Œ (𝑦) = 𝑃(π‘Œ = 𝑦) =
Ξ»
𝑦
𝑦!
π‘’βˆ’Ξ»
for y = 0, 1, 2, 3, . . . .
● We note that since (from calculus) 𝑦=0
∞ λ
𝑦
𝑦!
= π‘’πœ†,
● It is indeed true (as it must be) that 𝑦=0
∞
𝑃 π‘Œ = 𝑦 = 1.
[Figure: probability functions of Poisson(2) and Poisson(10)]
Motivation
● We motivate the Poisson distribution as follows.
● Suppose X ∼ Binomial(n, ΞΈ ), i.e., X has the Binomial(n, ΞΈ )
distribution.
● Then for 0 ≤ x ≤ n,
P(X = x) = C(n, x) θ^x (1 − θ)^{n−x}
● If we set θ = λ/n for some λ > 0, then this becomes
P(X = x) = C(n, x) (λ/n)^x (1 − λ/n)^{n−x}
         = [n(n−1)···(n−x+1) / x!] (λ/n)^x (1 − λ/n)^{n−x}
● Let us now consider what happens if we let n → ∞, while
keeping x fixed at some nonnegative integer.
● In that case,
n(n−1)···(n−x+1) / n^x → 1
● Indeed,
n(n−1)···(n−x+1) / n^x = 1 · (1 − 1/n)(1 − 2/n) ··· (1 − (x−1)/n)
● We also have
(1 − λ/n)^{n−x} = (1 − λ/n)^n (1 − λ/n)^{−x} → e^{−λ} · 1 = e^{−λ}
● So
lim_{n→∞} P(X = x) = (λ^x / x!) e^{−λ}
for x = 0, 1, 2, 3, . . . .
𝑃(𝑋 = π‘₯)
𝑛 Γ— 𝑛 βˆ’ 1 … Γ— (𝑛 βˆ’ π‘₯ + 1)
π‘₯!
(
Ξ»
𝑛
)π‘₯
1 βˆ’
Ξ»
𝑛
π‘›βˆ’π‘₯
● Intuitively, we can phrase this result as follows.
● If we flip a very large number of coins n, and each coin has a
very small probability θ = λ/n of coming up heads, then the
probability that the total number of heads will be x is
approximately given by (λ^x / x!) e^{−λ}.
● Figure displays the accuracy of this estimate when we are
approximating the Binomial(100, 1/10) distribution by the
Poisson(Ξ») distribution where Ξ» = nΞΈ = 100(1/10) = 10.
[Figure: Poisson(10) approximation to the Binomial(100, 1/10) probability function]
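The approximation in the figure can be reproduced numerically (a sketch; the function names are chosen here, not taken from the slides):

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, theta):
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

def poisson_pmf(x, lam):
    # P(Y = x) for Y ~ Poisson(lam)
    return lam**x * exp(-lam) / factorial(x)

# Compare Binomial(100, 1/10) with Poisson(10) at a few points
for x in (5, 10, 15):
    print(x, round(binomial_pmf(x, 100, 0.1), 4), round(poisson_pmf(x, 10), 4))
```

The two columns agree to roughly two decimal places, as the figure suggests.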
● The Poisson distribution is a good model for counting random
occurrences of an event when there are many possible
occurrences, but each occurrence has very small probability.
● Examples include
a. the number of house fires in a city on a given day,
b. the number of radioactive events recorded by a Geiger counter,
c. the number of phone calls arriving at a switchboard,
d. the number of hits on a popular World Wide Web page on a given day, etc.
The
Hypergeometric
Distribution
● Suppose that an urn contains M white balls and N βˆ’ M black
balls.
● Suppose we draw n ≀ N balls from the urn in such a fashion that
each subset of n balls has the same probability of being drawn.
● Because there are C(N, n) such subsets, this probability is 1/C(N, n).
● One way of accomplishing this is to thoroughly mix the balls in
the urn and then draw a first ball.
● Accordingly, each ball has probability 1/N of being drawn.
● Then, without replacing the first ball, we thoroughly mix the balls
in the urn and draw a second ball.
● So each ball in the urn has probability 1/(N βˆ’1) of being drawn.
● We then have that any two balls, say the ith and jth balls,
have probability
P(balls i and j are drawn)
= P(ball i is drawn first) P(ball j is drawn second | ball i is drawn first)
+ P(ball j is drawn first) P(ball i is drawn second | ball j is drawn first)
= (1/N)(1/(N−1)) + (1/N)(1/(N−1)) = 2/(N(N−1)) = 1/C(N, 2)
● This type of sampling is called sampling without
replacement.
● Given that we take a sample of n, let X denote the number of
white balls obtained.
● Note that we must have X β‰₯ 0 and X β‰₯ n βˆ’ (N βˆ’ M) because at
most N βˆ’ M of the balls could be black.
● Hence, X β‰₯ max(0, n + M βˆ’ N).
● Furthermore, X ≀ n and X ≀ M because there are only M white
balls.
● Hence, X ≀ min(n, M).
● So suppose max(0, n + M βˆ’ N) ≀ x ≀ min(n, M).
● What is the probability that x white balls are obtained?
● In other words, what is P(X = x)?
● To evaluate this, we know that we need to count the number of
subsets of n balls that contain x white balls.
● Using the combinatorial principles of Chapter 1, we see that this number
is given by
C(M, x) C(N−M, n−x)
● So
P(X = x) = C(M, x) C(N−M, n−x) / C(N, n)
● The random variable X is said to have the Hypergeometric(N,
M, n) distribution.
[Figure: probability functions of Hypergeometric(20, 10, 10) and Hypergeometric(20, 10, 5)]
● Obviously, the hypergeometric distribution will apply to any
context wherein we are sampling without replacement from a
finite set of N elements and where each element of the set either
has a characteristic or does not.
● For example, if we randomly select people to participate in an
opinion poll so that each set of n individuals in a population of N
has the same probability of being selected, then the number of
people who respond yes to a particular question is distributed
Hypergeometric(N, M, n), where M is the number of people in
the entire population who would respond yes.
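The Hypergeometric(N, M, n) probability function translates directly into code (a sketch; `hypergeometric_pmf` is an illustrative name):

```python
from math import comb

def hypergeometric_pmf(x, N, M, n):
    # P(X = x) white balls when drawing n balls without replacement
    # from an urn of N balls, M of which are white
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

print(hypergeometric_pmf(2, 20, 10, 5))  # a point of the Hypergeometric(20, 10, 5) plot
```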
Summary 3
01 A random variable X is discrete if Σ_x P(X = x) = 1, i.e., if all its probability comes from being equal to particular values.
02 A discrete random variable X takes on only a finite, or countable, number of distinct values.
03 Important discrete distributions include the degenerate, Bernoulli, binomial, geometric, negative-binomial, Poisson, and hypergeometric distributions.
Exercises
Exercise 1
Let Y ∼ Binomial(10, θ ).
● Compute P(Y = 10).
● Compute P(Y = 5).
● Compute P(Y = 0).
Exercise
● We toss a fair coin.
● Let X = 1 if the coin is heads and 0 otherwise.
● What is the distribution of X?
● We toss the coin 10 times and let Y be the r.v. that
counts the number of heads. What is the distribution of Y?
Exercise 2
● Let Z ∼ Geometric(ΞΈ ). Compute P(5 ≀ Z ≀ 9).
Exercise 3
A die is rolled 3 times
1. What is the probability that no 6 occurs?
Exercise 4
We roll a die 3 times.
1. Let X be the number of 5s that appear.
a. What is the distribution of X?
b. What is the probability of getting three 5s?
c. What is the probability of getting two 5s?
2. Let Y be the sum of the numbers showing.
a. Does Y follow the binomial distribution?
b. What is the probability P(Y = 15)?
3. What is the probability of getting the first five on the third roll?
4. What is the probability of getting a sum of 15 three times?
Cumulative
Distribution
function
04
Definition
● Given a random variable X, its cumulative distribution function
(or distribution function, or cdf) is the function
F_X : R1 → [0, 1]
F_X(x) = P(X ≤ x)
CDF Properties
Let F_X be the cumulative distribution function of a random
variable X. Then
● 0 ≤ F_X(x) ≤ 1 for all x
● F_X(x) ≤ F_X(y) whenever x ≤ y (i.e., F_X is increasing)
● lim_{x→∞} F_X(x) = 1
● lim_{x→−∞} F_X(x) = 0
Theorem
● Let X be a discrete random variable with
probability function 𝑝𝑋, then its cumulative
distribution function 𝐹𝑋 satisfies
𝐹𝑋 π‘₯ =
𝑦≀π‘₯
𝑝𝑋(𝑦)
Example 1
We toss a fair coin twice. Let X be the number of heads
What is 𝐹𝑋 π‘₯ ?
● S={HH, HT, TH,TT}
● X ~ binomial(2,1/2)
s      : TT    HT, TH    HH
X(s)   : 0       1        2
p_X(x) : 1/4    1/2      1/4
● 𝐹𝑋 π‘₯ = 𝑃(𝑋 ≀ π‘₯)
● 𝐹𝑋 π‘₯ = 0 if π‘₯ < 0
● 𝐹𝑋 π‘₯ = 1 if π‘₯ β‰₯ 2
● 𝐹𝑋 π‘₯ =
1
4
if 0 ≀ π‘₯ < 1
● 𝐹𝑋 π‘₯ =
1
4
+
1
2
=
3
4
if 1 ≀ π‘₯ < 2
s TT HT,TH HH
𝑋(𝑠) 0 1 2
pX(π‘₯) 1
4
1
2
1
4
In summary:
F_X(x) = 0    if x < 0
       = 1/4  if 0 ≤ x < 1
       = 3/4  if 1 ≤ x < 2
       = 1    if x ≥ 2
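The step-function cdf of this example can be computed from the probability function (an illustrative sketch, not part of the slides):

```python
# p_X for X = number of heads in two fair coin flips
p_X = {0: 1/4, 1: 1/2, 2: 1/4}

def F_X(x):
    # F_X(x) = sum of p_X(y) over all values y <= x
    return sum(p for v, p in p_X.items() if v <= x)

print(F_X(-1), F_X(0.5), F_X(1.5), F_X(3))  # 0, 1/4, 3/4, 1
```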
Example 2
● Suppose we roll one fair six-sided die, so that S = {1, 2, 3, 4, 5, 6}
and p(s) = 1/6 for all s ∈ S.
● Let X be the number showing divided by 6, i.e., X(s) = s/6 for all s ∈ S.
● What is F_X(x)?
s      :  1    2    3    4    5    6
X(s)   : 1/6  1/3  1/2  2/3  5/6   1
p_X(x) : 1/6  1/6  1/6  1/6  1/6  1/6
F_X(x) = P(X ≤ x) = Σ_{y ≤ x} p_X(y)
Since X(y) = y/6 for all y ∈ S, the event X ≤ x occurs exactly when
y/6 ≤ x, i.e., y ≤ 6x. Hence
F_X(x) = Σ_{y ∈ S, y ≤ 6x} p_X(y) = Σ_{y ∈ S, y ≤ 6x} 1/6 = (1/6) |{y ∈ S : y ≤ 6x}|
𝐹𝑋 π‘₯ =
1
6
|𝑦 ∈ 𝑆: 𝑦 ≀ 6π‘₯|
π‘₯ β‰₯ 1 β‡’ 𝑦 ≀ 6 β‡’ 𝑦 ∈ 𝑆: 𝑦 ≀ 6 = 1,2,3,4,5,6 β‡’ 𝐹𝑋 π‘₯ =
6
6
= 1
5
6
≀ π‘₯ < 1 β‡’ 𝑦 < 6 β‡’ 𝑦 ∈ 𝑆: 𝑦 < 6 = 1,2,3,4,5 β‡’ 𝐹𝑋 π‘₯ =
5
6
4
6
≀ π‘₯ <
5
6
β‡’ 𝑦 < 5 β‡’ 𝑦 ∈ 𝑆: 𝑦 < 5 = 1,2,3,4 β‡’ 𝐹𝑋 π‘₯ =
4
6
=
2
3
𝐹𝑋 π‘₯ =
1
6
|𝑦 ∈ 𝑆: 𝑦 ≀ 6π‘₯|
3
6
≀ π‘₯ <
4
6
β‡’ 𝑦 < 4 β‡’ 𝑦 ∈ 𝑆: 𝑦 < 4 = 1,2,3 β‡’ 𝐹𝑋(π‘₯) =
3
6
=
1
2
2
6
≀ π‘₯ <
3
6
β‡’ 𝑦 < 3 β‡’ 𝑦 ∈ 𝑆: 𝑦 < 3 = 1,2 β‡’ 𝐹𝑋(π‘₯) =
2
6
=
1
3
1
6
≀ π‘₯ <
2
6
β‡’ 𝑦 < 2 β‡’ 𝑦 ∈ 𝑆: 𝑦 < 2 = 1 β‡’ 𝐹𝑋(π‘₯) =
1
6
𝐹𝑋 π‘₯ =
1
6
|𝑦 ∈ 𝑆: 𝑦 ≀ 6π‘₯|
π‘₯ <
1
6
β‡’ 𝑦 < 1 β‡’ 𝑦 ∈ 𝑆: 𝑦 < 1 = βˆ… β‡’ 𝐹𝑋(π‘₯) = 0
π‘₯
𝐹𝑋(π‘₯)
1
6
1
3
2
3
1
5
6
1
2
1
6
1
3
1
2
2
3
1
Useful formula
● For all π‘Ž ≀ 𝑏, we have
𝑃(π‘Ž < 𝑋 ≀ 𝑏) = 𝐹𝑋(𝑏) βˆ’ 𝐹𝑋(π‘Ž)
Exercises
Exercise 1
● Consider rolling one fair six-sided die, so that S = {1, 2, 3, 4, 5,
6}, and P(s) = 1/6 for all s ∈ S. Let X be the number showing on
the die, so that X(s) = s for s ∈ S.
● Let π‘Œ = 𝑋2. Compute the cumulative distribution function
πΉπ‘Œ (𝑦) = P(Y ≀ y), for all 𝑦 ∈ 𝑅1 .
Joint
Distributions
05
● In many real scenarios, we need to analyze several variables
at the same time.
● For this, we will show how to deal with two random variables;
the treatment can be generalized to more than two variables.
Definition (Joint probability
function)
● Let X and Y be discrete random variables.
● Then their joint probability function, 𝑝𝑋,π‘Œ , is a
function from 𝑅2 to 𝑅1, defined by
𝑝𝑋,π‘Œ (π‘₯, 𝑦) = 𝑃(𝑋 = π‘₯ , π‘Œ = 𝑦)
Example
● Find P(X = 0, Y ≤ 1)
Y=0 Y=1 Y=2
X=0 1/6 1/4 1/8
X=1 1/8 1/6 1/6
Theorem (marginal
probability function)
● Let X and Y be two discrete random variables, with joint probability
function 𝑝𝑋,π‘Œ . Then the probability function 𝑝𝑋 of X can be
computed as
𝑝𝑋 π‘₯ =
𝑦
𝑃𝑋,π‘Œ(π‘₯, 𝑦)
● Similarly, the probability function π‘π‘Œ of Y can be computed as
π‘π‘Œ 𝑦 =
π‘₯
𝑃𝑋,π‘Œ(π‘₯, 𝑦)
Example 1
● Find the marginal probability function of X and Y
Y=0 Y=1 Y=2
X=0 1/6 1/4 1/8
X=1 1/8 1/6 1/6
Example 2
● Suppose the joint probability function of X and Y is given by
𝑝𝑋,π‘Œ π‘₯, 𝑦 =
1
7
π‘₯ = 5, 𝑦 = 0
1
7
π‘₯ = 5, 𝑦 = 3
1
7
π‘₯ = 5, 𝑦 = 4
3
7
π‘₯ = 8, 𝑦 = 0
1
7
π‘₯ = 8, 𝑦 = 4
0 π‘œπ‘‘β„Žπ‘’π‘Ÿπ‘€π‘–π‘ π‘’
1. 𝑝𝑋 (5) ?
2. 𝑝𝑋 (8) ?
3. π‘π‘Œ (4) ?
1. p_X(5) = Σ_y p_{X,Y}(5, y) = p_{X,Y}(5, 0) + p_{X,Y}(5, 3) + p_{X,Y}(5, 4) = 3/7

       Y=0   Y=3   Y=4 | p_X
X=5    1/7   1/7   1/7 | 3/7
X=8    3/7    0    1/7 | 4/7
p_Y    4/7   1/7   2/7 |
Definition
● If X and Y are random variables, then the joint
distribution of X and Y is the collection of
probabilities P((X, Y) ∈ B), for all subsets 𝐡 βŠ† 𝑅2
of
pairs of real numbers.
Definition Joint cdf
● Let X and Y be random variables.
● Then their joint cumulative distribution function is
the function 𝐹𝑋,π‘Œ ∢ 𝑅2 β†’ [0, 1] defined by
𝐹𝑋,π‘Œ (π‘₯, 𝑦) = 𝑃(𝑋 ≀ π‘₯, π‘Œ ≀ 𝑦)
(the comma means β€œand” here, so that 𝐹𝑋,π‘Œ (π‘₯, 𝑦) is the probability that X ≀ x and Y
≀ y.)
Example
Y=0 Y=1 Y=2
X=0 1/6 1/4 1/8
X=1 1/8 1/6 1/6
Example
1. Find πΉπ‘‹π‘Œ(0.5,1)
Y=0 Y=1 Y=2
X=0 1/6 1/4 1/8
X=1 1/8 1/6 1/6
● What is the probability P(a < X ≤ b, c < Y ≤ d)?
Theorem
● Let X and Y be any random variables, with
joint cumulative distribution function 𝐹𝑋,π‘Œ .
● Suppose a ≀ b and c ≀ d. Then
𝑃(π‘Ž < 𝑋 ≀ 𝑏 , 𝑐 < π‘Œ ≀ 𝑑)
= 𝐹𝑋,π‘Œ (𝑏, 𝑑) βˆ’ 𝐹𝑋,π‘Œ(π‘Ž, 𝑑) βˆ’ 𝐹𝑋,π‘Œ(𝑏, 𝑐) + 𝐹𝑋,π‘Œ(π‘Ž, 𝑐).
𝑃(π‘Ž < 𝑋 ≀ 𝑏 , 𝑐 < π‘Œ ≀ 𝑑)
= 𝐹𝑋,π‘Œ (𝑏, 𝑑) βˆ’ 𝐹𝑋,π‘Œ(π‘Ž, 𝑑) βˆ’ 𝐹𝑋,π‘Œ(𝑏, 𝑐) + 𝐹𝑋,π‘Œ(π‘Ž, 𝑐).
Theorem (marginal distributions)
● Let X and Y be two random variables, with joint cumulative
distribution function FX,Y . Then the cumulative
distribution function FX of X satisfies
𝐹𝑋 (π‘₯) = lim
π‘¦β†’βˆž
𝐹𝑋,π‘Œ (π‘₯, 𝑦)
for all x ∈ R1.
● Similarly, the cumulative distribution function FY of Y
satisfies πΉπ‘Œ (𝑦) = lim
π‘₯β†’βˆž
𝐹𝑋,π‘Œ (π‘₯, 𝑦)
for all y ∈ R1.
Summary 4
01 It is often important to keep track of the joint probabilities of two random variables, X and Y.
02 Their joint cumulative distribution function is given by F_{X,Y}(x, y) = P(X ≤ x, Y ≤ y).
03 If X and Y are discrete, then their joint probability function is given by p_{X,Y}(x, y) = P(X = x, Y = y).
Exercises
● Consider an urn containing:
a. 4 white balls
b. 3 red balls
c. 3 black balls
● We pick 5 balls without replacement.
● What is the probability of getting 2 white and 2 red
balls?
● Let X be the random variable β€œnumber of red balls” and
Y be the β€œnumber of white balls”
a. Find P(X=2, Y=2)
b. Find the joint probability function of X and Y
c. Does Σ_{(x,y)} p_{X,Y}(x, y) = 1?
d. Find P(X=Y)
e. Find P(X>Y)
f. Find P(X=3)
g. Find P(Y=2)
       Y=0     Y=1     Y=2     Y=3     Y=4
X=0     0       0      6/252  12/252   3/252
X=1     0     12/252  54/252  36/252   3/252
X=2   3/252   36/252  54/252  12/252    0
X=3   3/252   12/252   6/252    0       0
● Does Σ_{(x,y)} p_{X,Y}(x, y) = 1?
● Find P(X=Y)
● Find P(X>Y)
● Find P(X=3)
● Find P(Y=2)
● Find 𝐹𝑋,π‘Œ 2,1
● Find 𝐹𝑋,π‘Œ(3,2)
Find 𝐹𝑋,π‘Œ 2,1
● Find 𝐹𝑋,π‘Œ(3,2)
Y=0 Y=1 Y=2 Y=3 Y=4
X=0 0 0 6
252
12
252
3
252
X=1 0 12
252
54
252
36
252
3
252
X=2 3
252
36
252
54
252
12
252
0
X=3 3
252
12
252
6
252
0 0
Conditioning
and
Independence
06
Definition
● Let X and Y be random variables, and suppose that P(X = x) > 0.
● The conditional distribution of Y, given that X = x, is the probability
distribution assigning probability
𝑃(π‘Œ ∈ 𝐡 , 𝑋 = π‘₯)
P(X = x)
to each event Y ∈ B.
● In particular, it assigns probability
𝑃(π‘Ž < π‘Œ ≀ 𝑏 , 𝑋 = π‘₯)
𝑃(𝑋 = π‘₯)
to the event that a < Y ≀ b.
Example
● Suppose the joint probability function of X and Y is given by
𝑝𝑋,π‘Œ π‘₯, 𝑦 =
1
7
π‘₯ = 5, 𝑦 = 0
1
7
π‘₯ = 5, 𝑦 = 3
1
7
π‘₯ = 5, 𝑦 = 4
3
7
π‘₯ = 8, 𝑦 = 0
1
7
π‘₯ = 8, 𝑦 = 4
0 π‘œπ‘‘β„Žπ‘’π‘Ÿπ‘€π‘–π‘ π‘’
● We compute P(Y = 4 | X = 8) as
P(Y = 4 | X = 8) = P(Y = 4, X = 8) / P(X = 8) = (1/7) / (1/7 + 3/7) = 1/4
● Similarly,
P(Y = 4 | X = 5) = P(Y = 4, X = 5) / P(X = 5) = (1/7) / (1/7 + 1/7 + 1/7) = 1/3
Definition
● Suppose X and Y are two discrete random
variables.
● Then the conditional probability function of Y, given
X, is the function p_{Y|X} defined by

p_{Y|X}(y | x) = p_{X,Y}(x, y) / p_X(x) = p_{X,Y}(x, y) / Σ_z p_{X,Y}(x, z)

● defined for all y ∈ R1 and all x with p_X(x) > 0.
Example
Find P(Y = 1 | X = 0)

      Y=0   Y=1   Y=2
X=0   1/6   1/4   1/8
X=1   1/8   1/6   1/6
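As a sketch of the computation (not a substitute for working it by hand): read off the row X = 0, normalize by its total, and pick out y = 1.

```python
from fractions import Fraction

# Row X = 0 of the table: p_{X,Y}(0, y) for y = 0, 1, 2.
row0 = {0: Fraction(1, 6), 1: Fraction(1, 4), 2: Fraction(1, 8)}

p_X0 = sum(row0.values())   # p_X(0) = 1/6 + 1/4 + 1/8 = 13/24
answer = row0[1] / p_X0     # p_{Y|X}(1 | 0) = (1/4) / (13/24) = 6/13
```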
Independence
of Random
Variables
● Recall from Definition 1.5.2 that two events A and B are
independent if P(A ∩ B) = P(A) P(B).
● We wish to have a corresponding definition of independence for
random variables X and Y. Intuitively, independence of X and Y
means that X and Y have no influence on each other, i.e., that
the values of X make no change to the probabilities for Y (and
vice versa).
● The idea of the formal definition is that X and Y give rise to
events, of the form "a < X ≤ b" or "Y ∈ B", and we want all such
events involving X to be independent of all such events involving
Y.
Definition
● Let X and Y be two random variables.
● Then X and Y are independent if, for all subsets B1
and B2 of the real numbers,
P(X ∈ B1, Y ∈ B2) = P(X ∈ B1)P(Y ∈ B2).
● That is, the events "X ∈ B1" and "Y ∈ B2" are
independent events.
Theorem
● Let X and Y be two random variables. Then X and Y
are independent if and only if
𝑃(π‘Ž ≀ 𝑋 ≀ 𝑏, 𝑐 ≀ π‘Œ ≀ 𝑑) = 𝑃(π‘Ž ≀ 𝑋 ≀ 𝑏)𝑃(𝑐 ≀ π‘Œ ≀ 𝑑) (2.8.1)
whenever a ≀ b and c ≀ d.
Theorem
● Let X and Y be two discrete random variables.
● X and Y are independent if and only if their joint
probability function p_{X,Y} satisfies
p_{X,Y}(x, y) = p_X(x) p_Y(y)
for all x, y ∈ R1.
Example
Are X and Y independent?
      Y=0   Y=1   Y=2
X=0   1/6   1/4   1/8
X=1   1/8   1/6   1/6
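By the theorem, it suffices to compare each cell with the product of the marginals. A small sketch over the table above:

```python
from fractions import Fraction
from itertools import product

joint = {
    (0, 0): Fraction(1, 6), (0, 1): Fraction(1, 4), (0, 2): Fraction(1, 8),
    (1, 0): Fraction(1, 8), (1, 1): Fraction(1, 6), (1, 2): Fraction(1, 6),
}
p_X = {x: sum(joint[(x, y)] for y in range(3)) for x in range(2)}
p_Y = {y: sum(joint[(x, y)] for x in range(2)) for y in range(3)}

# Independent iff p(x, y) = p_X(x) p_Y(y) for every cell.
independent = all(joint[(x, y)] == p_X[x] * p_Y[y]
                  for x, y in product(range(2), range(3)))
# Here independent is False: p(0, 0) = 1/6, but p_X(0) p_Y(0) = (13/24)(7/24)
```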
● Suppose now that we have n random variables X1, . . . , Xn.
● The random variables are independent if and only if the
collection of events {ai ≤ Xi ≤ bi} are independent, whenever
ai ≤ bi, for all i = 1, 2, . . . , n.
● Generalizing previous Theorem, we have the following result.
● Let X1, . . . , Xn be a collection of discrete random
variables.
● X1, . . . , Xn are independent if and only if their joint
probability function p_{X1,...,Xn} satisfies

p_{X1,...,Xn}(x1, . . . , xn) = p_{X1}(x1) · · · p_{Xn}(xn)

for all x1, . . . , xn ∈ R1.
Independent and identically
distributed
random variables
● A collection X1, . . . , Xn of random variables is
independent and identically distributed (or i.i.d.) if
the collection is independent and if, furthermore,
each of the n variables has the same distribution.
● The i.i.d. sequence X1, . . . , Xn is also referred to as a
sample from the common distribution.
● In particular, if a collection X1, . . . , Xn of random variables is i.i.d.
and discrete, then each of the probability functions p_{Xi} is the
same, so that p_{X1}(x) = · · · = p_{Xn}(x) = p(x) for all x ∈ R1.
● Furthermore, from the independence theorem above, it follows that

p_{X1,...,Xn}(x1, . . . , xn) = p_{X1}(x1) · · · p_{Xn}(xn) = p(x1) · · · p(xn)

for all x1, . . . , xn ∈ R1.
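For instance, three i.i.d. rolls of a fair die have a joint probability function that factors into three copies of the common pmf. A quick sketch:

```python
from fractions import Fraction
from itertools import product

p = {x: Fraction(1, 6) for x in range(1, 7)}   # common pmf of one fair die

def joint(x1, x2, x3):
    # p_{X1,X2,X3}(x1, x2, x3) = p(x1) p(x2) p(x3) for an i.i.d. sample
    return p[x1] * p[x2] * p[x3]

# The factored joint pmf is still a probability function: it sums to 1.
assert sum(joint(*xs) for xs in product(p, repeat=3)) == 1
```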
Multinomial
Distributions
● Consider k random variables X1, . . . , Xk. They follow the
multinomial distribution in the following case:
● The experiment consists of n repeated trials.
● Each trial has a discrete number of possible outcomes.
● On any given trial, the probability θi of a particular
outcome i is the same across trials.
● The trials are independent; that is, the outcome on one
trial does not affect the outcome on other trials.
● The joint probability function is

p_{X1,...,Xk}(x1, . . . , xk) = n! / (x1! x2! · · · xk!) θ1^x1 · · · θk^xk

whenever each xi ∈ {0, . . . , n} and x1 + · · · + xk = n.
● In this case, we write (X1, . . . , Xk) ∼ Multinomial(n, θ1, . . . , θk).
Examples
● Consider a bowl of chips of which a proportion θi of the
chips are labelled i (for i = 1, 2, 3).
● If we randomly draw a chip from the bowl and observe
its label s, then P(s = i) = θi.
● We draw 3 chips with replacement and let Xi be the
number of chips labelled i.
● Then (X1, X2, X3) ∼ Multinomial(3, θ1, θ2, θ3).
Examples
● Consider a population of students at a university of
which a proportion θ1 live on campus (denoted by s =
1), a proportion θ2 live off-campus with their parents
(denoted by s = 2), and a proportion θ3 live off-campus
independently (denoted by s = 3).
● If we randomly draw a student from this population
and determine s for that student, then P(s = i) = θi.
● Let Xi be the number of students in the sample with s = i.
● Blood types occur in a certain population with the following
probabilities:

Blood type    O     A     B     AB
probability   0.44  0.42  0.10  0.04

● In a random sample of 10 people, what is the
probability that 6 have O, 2 have A, 1 has B, and 1 has
AB?
● Let X1 = # O, X2 = # A, X3 = # B, X4 = # AB. Then

P(X1 = 6, X2 = 2, X3 = 1, X4 = 1) = 10! / (6! 2! 1! 1!) × 0.44^6 × 0.42^2 × 0.10^1 × 0.04^1 ≈ 0.0129
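The multinomial probability function is easy to evaluate directly; a minimal sketch (the helper name is illustrative) that reproduces the blood-type calculation:

```python
from math import factorial, prod

def multinomial_pmf(counts, thetas):
    # n! / (x1! ... xk!) * theta1^x1 ... thetak^xk, with n = x1 + ... + xk
    coef = factorial(sum(counts))
    for x in counts:
        coef //= factorial(x)
    return coef * prod(t ** x for t, x in zip(thetas, counts))

# Blood-type example: 10 people, 6 type O, 2 type A, 1 type B, 1 type AB.
p = multinomial_pmf([6, 2, 1, 1], [0.44, 0.42, 0.10, 0.04])
# p is approximately 0.0129
```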
Exercises
● An urn contains 8 red balls, 3 yellow, 9 white. 6 balls
are randomly selected with replacement.
● What is the probability that 2 are red, 1 is yellow, and 3 are
white?
● Suppose a card is drawn randomly from an ordinary
deck of playing cards and then put back in the deck.
This is repeated five times. What is the
probability of drawing 1 spade, 1 heart, 1 diamond, and
2 clubs?
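Both exercises fit the multinomial pattern above; a sketch of how they can be checked numerically (helper name illustrative):

```python
from math import factorial, prod

def multinomial_pmf(counts, thetas):
    # n! / (x1! ... xk!) * theta1^x1 ... thetak^xk, with n = x1 + ... + xk
    coef = factorial(sum(counts))
    for x in counts:
        coef //= factorial(x)
    return coef * prod(t ** x for t, x in zip(thetas, counts))

# Urn: 8 red, 3 yellow, 9 white out of 20 balls; 6 draws with replacement.
urn = multinomial_pmf([2, 1, 3], [8/20, 3/20, 9/20])             # about 0.131

# Cards: 5 draws with replacement; each suit has probability 1/4.
cards = multinomial_pmf([1, 1, 1, 2], [0.25, 0.25, 0.25, 0.25])  # 60/1024
```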
References
Michael J. Evans and Jeffrey S. Rosenthal, Probability and
Statistics: The Science of Uncertainty, Second Edition,
University of Toronto.

More Related Content

What's hot

Series solutions at ordinary point and regular singular point
Series solutions at ordinary point and regular singular pointSeries solutions at ordinary point and regular singular point
Series solutions at ordinary point and regular singular pointvaibhav tailor
Β 
Advanced Microeconomics - Lecture Slides
Advanced Microeconomics - Lecture SlidesAdvanced Microeconomics - Lecture Slides
Advanced Microeconomics - Lecture SlidesYosuke YASUDA
Β 
Lecture 3 qualtifed rules of inference
Lecture 3 qualtifed rules of inferenceLecture 3 qualtifed rules of inference
Lecture 3 qualtifed rules of inferenceasimnawaz54
Β 
Continuity and Uniform Continuity
Continuity and Uniform ContinuityContinuity and Uniform Continuity
Continuity and Uniform ContinuityDEVTYPE
Β 
Optimization Approach to Nash Euilibria with Applications to Interchangeability
Optimization Approach to Nash Euilibria with Applications to InterchangeabilityOptimization Approach to Nash Euilibria with Applications to Interchangeability
Optimization Approach to Nash Euilibria with Applications to InterchangeabilityYosuke YASUDA
Β 
Lesson 8: Basic Differentiation Rules (Section 21 slides)
Lesson 8: Basic Differentiation Rules (Section 21 slides)Lesson 8: Basic Differentiation Rules (Section 21 slides)
Lesson 8: Basic Differentiation Rules (Section 21 slides)Matthew Leingang
Β 
Predicates and Quantifiers
Predicates and Quantifiers Predicates and Quantifiers
Predicates and Quantifiers Istiak Ahmed
Β 
Power Series - Legendre Polynomial - Bessel's Equation
Power Series - Legendre Polynomial - Bessel's EquationPower Series - Legendre Polynomial - Bessel's Equation
Power Series - Legendre Polynomial - Bessel's EquationArijitDhali
Β 
First order non-linear partial differential equation & its applications
First order non-linear partial differential equation & its applicationsFirst order non-linear partial differential equation & its applications
First order non-linear partial differential equation & its applicationsJayanshu Gundaniya
Β 
Bivariate Discrete Distribution
Bivariate Discrete DistributionBivariate Discrete Distribution
Bivariate Discrete DistributionArijitDhali
Β 
PRESENTACIΓ“N DEL INFORME MATEMÁTICA UPTAEB UNIDAD 1
PRESENTACIΓ“N DEL INFORME MATEMÁTICA UPTAEB UNIDAD 1PRESENTACIΓ“N DEL INFORME MATEMÁTICA UPTAEB UNIDAD 1
PRESENTACIΓ“N DEL INFORME MATEMÁTICA UPTAEB UNIDAD 1GenesisAcosta7
Β 
3.2 implicit equations and implicit differentiation
3.2 implicit equations and implicit differentiation3.2 implicit equations and implicit differentiation
3.2 implicit equations and implicit differentiationmath265
Β 
2 random variables notes 2p3
2 random variables notes 2p32 random variables notes 2p3
2 random variables notes 2p3MuhannadSaleh
Β 
Partial differential equations
Partial differential equationsPartial differential equations
Partial differential equationsaman1894
Β 
Function an old french mathematician said[1]12
Function an old french mathematician said[1]12Function an old french mathematician said[1]12
Function an old french mathematician said[1]12Mark Hilbert
Β 

What's hot (20)

Finite difference &amp; interpolation
Finite difference &amp; interpolationFinite difference &amp; interpolation
Finite difference &amp; interpolation
Β 
Series solutions at ordinary point and regular singular point
Series solutions at ordinary point and regular singular pointSeries solutions at ordinary point and regular singular point
Series solutions at ordinary point and regular singular point
Β 
Advanced Microeconomics - Lecture Slides
Advanced Microeconomics - Lecture SlidesAdvanced Microeconomics - Lecture Slides
Advanced Microeconomics - Lecture Slides
Β 
Lecture 3 qualtifed rules of inference
Lecture 3 qualtifed rules of inferenceLecture 3 qualtifed rules of inference
Lecture 3 qualtifed rules of inference
Β 
Continuity and Uniform Continuity
Continuity and Uniform ContinuityContinuity and Uniform Continuity
Continuity and Uniform Continuity
Β 
Optimization Approach to Nash Euilibria with Applications to Interchangeability
Optimization Approach to Nash Euilibria with Applications to InterchangeabilityOptimization Approach to Nash Euilibria with Applications to Interchangeability
Optimization Approach to Nash Euilibria with Applications to Interchangeability
Β 
Ch05 4
Ch05 4Ch05 4
Ch05 4
Β 
Lesson 8: Basic Differentiation Rules (Section 21 slides)
Lesson 8: Basic Differentiation Rules (Section 21 slides)Lesson 8: Basic Differentiation Rules (Section 21 slides)
Lesson 8: Basic Differentiation Rules (Section 21 slides)
Β 
Predicates and Quantifiers
Predicates and Quantifiers Predicates and Quantifiers
Predicates and Quantifiers
Β 
Power Series - Legendre Polynomial - Bessel's Equation
Power Series - Legendre Polynomial - Bessel's EquationPower Series - Legendre Polynomial - Bessel's Equation
Power Series - Legendre Polynomial - Bessel's Equation
Β 
First order non-linear partial differential equation & its applications
First order non-linear partial differential equation & its applicationsFirst order non-linear partial differential equation & its applications
First order non-linear partial differential equation & its applications
Β 
Ch05 3
Ch05 3Ch05 3
Ch05 3
Β 
Partitions
PartitionsPartitions
Partitions
Β 
Bivariate Discrete Distribution
Bivariate Discrete DistributionBivariate Discrete Distribution
Bivariate Discrete Distribution
Β 
PRESENTACIΓ“N DEL INFORME MATEMÁTICA UPTAEB UNIDAD 1
PRESENTACIΓ“N DEL INFORME MATEMÁTICA UPTAEB UNIDAD 1PRESENTACIΓ“N DEL INFORME MATEMÁTICA UPTAEB UNIDAD 1
PRESENTACIΓ“N DEL INFORME MATEMÁTICA UPTAEB UNIDAD 1
Β 
3.2 implicit equations and implicit differentiation
3.2 implicit equations and implicit differentiation3.2 implicit equations and implicit differentiation
3.2 implicit equations and implicit differentiation
Β 
2 random variables notes 2p3
2 random variables notes 2p32 random variables notes 2p3
2 random variables notes 2p3
Β 
Prob review
Prob reviewProb review
Prob review
Β 
Partial differential equations
Partial differential equationsPartial differential equations
Partial differential equations
Β 
Function an old french mathematician said[1]12
Function an old french mathematician said[1]12Function an old french mathematician said[1]12
Function an old french mathematician said[1]12
Β 

Similar to Probability and statistics - Discrete Random Variables and Probability Distribution

Lecture 1.2 quadratic functions
Lecture 1.2 quadratic functionsLecture 1.2 quadratic functions
Lecture 1.2 quadratic functionsnarayana dash
Β 
Finance Enginering from Columbia.pdf
Finance Enginering from Columbia.pdfFinance Enginering from Columbia.pdf
Finance Enginering from Columbia.pdfCarlosLazo45
Β 
Mayo Slides: Part I Meeting #2 (Phil 6334/Econ 6614)
Mayo Slides: Part I Meeting #2 (Phil 6334/Econ 6614)Mayo Slides: Part I Meeting #2 (Phil 6334/Econ 6614)
Mayo Slides: Part I Meeting #2 (Phil 6334/Econ 6614)jemille6
Β 
Econometrics 2.pptx
Econometrics 2.pptxEconometrics 2.pptx
Econometrics 2.pptxfuad80
Β 
Qt random variables notes
Qt random variables notesQt random variables notes
Qt random variables notesRohan Bhatkar
Β 
Probability[1]
Probability[1]Probability[1]
Probability[1]indu thakur
Β 
A Characterization of the Zero-Inflated Logarithmic Series Distribution
A Characterization of the Zero-Inflated Logarithmic Series DistributionA Characterization of the Zero-Inflated Logarithmic Series Distribution
A Characterization of the Zero-Inflated Logarithmic Series Distributioninventionjournals
Β 
Quantitative Techniques random variables
Quantitative Techniques random variablesQuantitative Techniques random variables
Quantitative Techniques random variablesRohan Bhatkar
Β 
Random variables
Random variablesRandom variables
Random variablesMenglinLiu1
Β 
Conditional Expectations Liner algebra
Conditional Expectations Liner algebra Conditional Expectations Liner algebra
Conditional Expectations Liner algebra AZIZ ULLAH SURANI
Β 
Probability distribution for Dummies
Probability distribution for DummiesProbability distribution for Dummies
Probability distribution for DummiesBalaji P
Β 
Communication Theory - Random Process.pdf
Communication Theory - Random Process.pdfCommunication Theory - Random Process.pdf
Communication Theory - Random Process.pdfRajaSekaran923497
Β 
Problem_Session_Notes
Problem_Session_NotesProblem_Session_Notes
Problem_Session_NotesLu Mao
Β 
Fuzzy random variables and Kolomogrov’s important results
Fuzzy random variables and Kolomogrov’s important resultsFuzzy random variables and Kolomogrov’s important results
Fuzzy random variables and Kolomogrov’s important resultsinventionjournals
Β 
this materials is useful for the students who studying masters level in elect...
this materials is useful for the students who studying masters level in elect...this materials is useful for the students who studying masters level in elect...
this materials is useful for the students who studying masters level in elect...BhojRajAdhikari5
Β 

Similar to Probability and statistics - Discrete Random Variables and Probability Distribution (20)

Ch5
Ch5Ch5
Ch5
Β 
Lecture 1.2 quadratic functions
Lecture 1.2 quadratic functionsLecture 1.2 quadratic functions
Lecture 1.2 quadratic functions
Β 
CH6.pdf
CH6.pdfCH6.pdf
CH6.pdf
Β 
Statistics Exam Help
Statistics Exam HelpStatistics Exam Help
Statistics Exam Help
Β 
U unit7 ssb
U unit7 ssbU unit7 ssb
U unit7 ssb
Β 
Finance Enginering from Columbia.pdf
Finance Enginering from Columbia.pdfFinance Enginering from Columbia.pdf
Finance Enginering from Columbia.pdf
Β 
Mayo Slides: Part I Meeting #2 (Phil 6334/Econ 6614)
Mayo Slides: Part I Meeting #2 (Phil 6334/Econ 6614)Mayo Slides: Part I Meeting #2 (Phil 6334/Econ 6614)
Mayo Slides: Part I Meeting #2 (Phil 6334/Econ 6614)
Β 
Econometrics 2.pptx
Econometrics 2.pptxEconometrics 2.pptx
Econometrics 2.pptx
Β 
Qt random variables notes
Qt random variables notesQt random variables notes
Qt random variables notes
Β 
Probability[1]
Probability[1]Probability[1]
Probability[1]
Β 
Probable
ProbableProbable
Probable
Β 
A Characterization of the Zero-Inflated Logarithmic Series Distribution
A Characterization of the Zero-Inflated Logarithmic Series DistributionA Characterization of the Zero-Inflated Logarithmic Series Distribution
A Characterization of the Zero-Inflated Logarithmic Series Distribution
Β 
Quantitative Techniques random variables
Quantitative Techniques random variablesQuantitative Techniques random variables
Quantitative Techniques random variables
Β 
Random variables
Random variablesRandom variables
Random variables
Β 
Conditional Expectations Liner algebra
Conditional Expectations Liner algebra Conditional Expectations Liner algebra
Conditional Expectations Liner algebra
Β 
Probability distribution for Dummies
Probability distribution for DummiesProbability distribution for Dummies
Probability distribution for Dummies
Β 
Communication Theory - Random Process.pdf
Communication Theory - Random Process.pdfCommunication Theory - Random Process.pdf
Communication Theory - Random Process.pdf
Β 
Problem_Session_Notes
Problem_Session_NotesProblem_Session_Notes
Problem_Session_Notes
Β 
Fuzzy random variables and Kolomogrov’s important results
Fuzzy random variables and Kolomogrov’s important resultsFuzzy random variables and Kolomogrov’s important results
Fuzzy random variables and Kolomogrov’s important results
Β 
this materials is useful for the students who studying masters level in elect...
this materials is useful for the students who studying masters level in elect...this materials is useful for the students who studying masters level in elect...
this materials is useful for the students who studying masters level in elect...
Β 

Recently uploaded

BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
Β 
β€œOh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
β€œOh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...β€œOh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
β€œOh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
Β 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
Β 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
Β 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
Β 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
Β 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
Β 
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)β€”β€”β€”β€”IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)β€”β€”β€”β€”IMP.OF KSHARA ...KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)β€”β€”β€”β€”IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)β€”β€”β€”β€”IMP.OF KSHARA ...M56BOOKSTORE PRODUCT/SERVICE
Β 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfSumit Tiwari
Β 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
Β 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
Β 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
Β 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting DataJhengPantaleon
Β 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
Β 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
Β 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppCeline George
Β 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
Β 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
Β 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
Β 

Recently uploaded (20)

BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
Β 
β€œOh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
β€œOh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...β€œOh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
β€œOh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
Β 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
Β 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
Β 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
Β 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
Β 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
Β 
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)β€”β€”β€”β€”IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)β€”β€”β€”β€”IMP.OF KSHARA ...KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)β€”β€”β€”β€”IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)β€”β€”β€”β€”IMP.OF KSHARA ...
Β 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Β 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
Β 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Β 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
Β 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
Β 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
Β 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
Β 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website App
Β 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
Β 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
Β 
Model Call Girl in Tilak Nagar Delhi reach out to us at πŸ”9953056974πŸ”
Model Call Girl in Tilak Nagar Delhi reach out to us at πŸ”9953056974πŸ”Model Call Girl in Tilak Nagar Delhi reach out to us at πŸ”9953056974πŸ”
Model Call Girl in Tilak Nagar Delhi reach out to us at πŸ”9953056974πŸ”
Β 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
Β 

Probability and statistics - Discrete Random Variables and Probability Distribution

  • 3. TABLE OF CONTENTS 02 Distributions of Random Variables Random Variables Discrete Distributions 01 03 Cumulative Distribution Function 04 Joint Distributions 05 Conditioning and Independence 04
  • 4. General Overview This chapter also discusses the concept of the conditional distribution of one random variable, given the values of others. This chapter is concerned with the definitions of random variables, distribution functions, probability functions, density functions, and the development of the concepts necessary for carrying out calculations for a probability model using these entities. Conditional distributions of random variables provide the framework for discussing what it means to say that variables are related, which is important in many applications of probability and statistics.
  • 6. ● The previous chapter explained how to construct probability models, including a sample space S and a probability measure P. ● Once we have a probability model, we may define random variables for that probability model. ● Intuitively, a random variable assigns a numerical value to each possible outcome in the sample space. ● For example, if the sample space is {rain, snow, clear}, then we might define a random variable X such that X = 3 if it rains, X = 6 if it snows, and X = βˆ’2.7 if it is clear.
  • 7. Definition 2.1.1 A random variable is a function from the sample space S to the set R1 of all real numbers.
  • 9. Example 1 ● The weather random variable could be written formally as X : {rain, snow, clear} β†’ R1 by X(rain) = 3, X(snow) = 6, and X(clear) = βˆ’2.7. π‘₯𝑖 rain snow clear 𝑋(π‘₯𝑖) 3 6 -2.7
  • 10. Example 3 ● If the sample space corresponds to flipping three different coins, then we could let ● X be the total number of heads showing, ● Y be the total number of tails showing, ● Z = 0 if there is exactly one head, and otherwise Z = 17, etc. π‘₯𝑖 HHH HHT HTH HTT THH THT TTH TTT 𝑋(π‘₯𝑖) 3 2 2 1 2 1 1 0 Y(π‘₯𝑖) 0 1 1 2 1 2 2 3 𝑍(π‘₯𝑖) 17 17 17 0 17 0 0 17
  • 11. Example 3 ● If the sample space corresponds to rolling two fair dice, then we could let X be the square of the number showing on the first die, let Y be the square of the number showing on the second die, let Z be the sum of the two numbers showing, let W be the square of the sum of the two numbers showing, let R be the sum of the squares of the two numbers showing, etc.
  • 12. Example 4 Constants as Random Variables ● As a special case, every constant value c is also a random variable, by saying that c(s) = c for all s ∈ S. Thus, 5 is a random variable, as is 3 or βˆ’21.6
  • 13. Example 5 Indicator Functions ● One special kind of random variable is worth mentioning. ● If A is any event, then we can define the indicator function of A, written 𝐼𝐴, to be the random variable 𝐼𝐴(𝑠) = 1 𝑖𝑓 𝑠 ∈ 𝐴 0 𝑖𝑓 𝑠 βˆ‰ 𝐴 ● which is equal to 1 on A, and is equal to 0 on 𝐴𝐢 .
  • 14. Example ● When a student calls a university help desk for technical support, he/she will either immediately be able to speak to someone (S, for success) or will be placed on hold (F, for failure). ● Define X by X which indicates whether (1) or not (0) the student can immediately speak to someone.
  • 15. ● Given random variables X and Y, we can perform the usual arithmetic operations on them. ● Thus, for example, 𝑍 = 𝑋2 is another random variable, defined by 𝑍(𝑠) = 𝑋2(𝑠) = 𝑋 𝑠 2 = 𝑋(𝑠) Γ— 𝑋(𝑠). ● Similarly, if W = π‘‹π‘Œ3 , then π‘Š(𝑠) = 𝑋(𝑠) Γ— π‘Œ(𝑠) Γ— π‘Œ(𝑠) Γ— π‘Œ(𝑠), etc. ● Also, if Z = X + Y, then Z(s) = X(s) + Y(s), etc.
  • 16. Example ● Consider rolling a fair six-sided die, so that S = {1, 2, 3, 4, 5, 6}. ● Let X be the number showing, so that X(s) = s for s ∈ S. ● Let Y be three more than the number showing, so that Y(s) = s + 3. ● Let 𝑍 = 𝑋2 + π‘Œ. ● Then 𝑍(𝑠) = 𝑋 𝑠 2 + π‘Œ(𝑠) = 𝑠2 + 𝑠 + 3. ● So Z(1) = 5, Z(2) = 9, etc.
  • 17. ● We write X = Y to mean that X(s) = Y(s) for all s ∈ S. ● Similarly, we write X ≀ Y to mean that X(s) ≀ Y(s) for all s ∈ S, and X β‰₯ Y to mean that X(s) β‰₯ Y(s) for all s ∈ S. ● For example, we write X ≀ c to mean that X(s) ≀ c for all s ∈ S.
  • 18. ● Again consider rolling a fair six-sided die, with S = {1, 2, 3, 4, 5, 6}. ● For s ∈ S, let X(s) = s, and let π‘Œ = 𝑋 + 𝐼 6 . This means that π‘Œ 𝑠 = 𝑋 𝑠 + 𝐼{6}(𝑠) = 𝑠 𝑠 ≀ 5 7 𝑠 = 6 ● Hence, Y(s) = X(s) for 1 ≀ s ≀ 5. But it is not true that Y = X, because π‘Œ 6 β‰  𝑋(6). ● On the other hand, it is true that Y β‰₯ X
  • 19. Definition 2 ● Any random variable whose only possible values are 0 and 1 is called a Bernoulli random variable.
  • 20. 01 02 A random variable is a function from the state space to the set of real numbers. The function could be constant, or correspond to counting some random quantity that arises, or any other sort of function. Summary 1
  • 22. Exercise 1 ● Let S = {high, middle, low}. ● Define random variables X, Y, and Z by X(high) = βˆ’12, X(middle) = βˆ’2, X(low) = 3, Y(high) = 0, Y(middle) = 0, Y(low) = 1, Z(high) = 6, Z(middle) = 0, Z(low) = 4. Determine whether each of the following relations is true or false. (a) X < Y (b) X ≀ Y (c) Y < Z (d) Y ≀ Z (e) XY < Z (f) XY ≀ Z
  • 23. Exercise 2 ● Let S = {1, 2, 3, 4, 5}. (a) Define two different (i.e., nonequal) nonconstant random variables, X and Y, on S. (b) For the random variables X and Y that you have chosen, let Z = X +π‘Œ2. Compute Z(s) for all s ∈ S.
  • 24. Exercise3 ● Consider rolling a fair six-sided die, so that S = {1, 2, 3, 4, 5, 6}. ● Let X(s) = s, and π‘Œ(𝑠) = 𝑠3 + 2. ● Let Z = XY. ● Compute Z(s) for all s ∈ S.
  • 26. ● Because random variables are defined to be functions of the outcome s, and because the outcome s is assumed to be random (i.e., to take on different values with different probabilities), it follows that the value of a random variable will itself be random (as the name implies). ● Specifically, if X is a random variable, then what is the probability that X will equal some particular value x? ● Well, X = x precisely when the outcome s is chosen such that X(s) = x.
  • 28.
s:        rain   snow   clear
X(s):      3      6     −2.7
P(X = s): 0.4    0.15   0.45
P(X < 5) = P(X = 3) + P(X = −2.7) = 0.4 + 0.45 = 0.85
P(X ∈ {3, 6}) = P(X = 3) + P(X = 6) = 0.4 + 0.15 = 0.55
  • 29. ● We see from the previous example that, if B is any subset of the real numbers, then P(X ∈ B) = P({s ∈ S : X(s) ∈ B}). ● Furthermore, to understand X well requires knowing the probabilities P(X ∈ B) for different subsets B. ● That is the motivation for the following definition.
  • 30. Definition ● If X is a random variable, then the distribution of X is the collection of probabilities P(X ∈ B) for all subsets B of the real numbers.
  • 32.
s:        rain   snow   clear
X(s):      3      6     −2.7
P(X = s): 0.4    0.15   0.45
What is the distribution of X? Well, if B is any subset of the real numbers, then P(X ∈ B) should count 0.4 if 3 ∈ B, plus 0.15 if 6 ∈ B, plus 0.45 if −2.7 ∈ B. We can formally write all this information at once by saying that P(X ∈ B) = 0.4 I_B(3) + 0.15 I_B(6) + 0.45 I_B(−2.7), where again I_B(x) = 1 if x ∈ B, and I_B(x) = 0 if x ∉ B.
  • 33.
s:        rain   snow   clear
Y(s):      5      7      5
P(Y = s): 0.4    0.15   0.45
P(Y = 5) = P({rain, clear}) = 0.4 + 0.45 = 0.85. Therefore, if B is any subset of the real numbers, then P(Y ∈ B) = 0.15 I_B(7) + 0.85 I_B(5).
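As a small illustrative sketch (the dictionary and the helper `prob_in` are ours, not from the text), the point masses of the weather variable X can be stored and queried directly:

```python
# Point masses of X from the weather example: value -> probability.
dist_X = {3: 0.4, 6: 0.15, -2.7: 0.45}

def prob_in(dist, in_B):
    """P(X in B): sum p(x) over every value x whose membership test passes."""
    return sum(p for x, p in dist.items() if in_B(x))

# P(X < 5) = P(X = 3) + P(X = -2.7) = 0.85
p_less_5 = prob_in(dist_X, lambda x: x < 5)
# P(X in {3, 6}) = P(X = 3) + P(X = 6) = 0.55
p_in_set = prob_in(dist_X, lambda x: x in {3, 6})
```

Representing B as a membership test mirrors the indicator-function formula P(X ∈ B) = 0.4 I_B(3) + 0.15 I_B(6) + 0.45 I_B(−2.7).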
  • 34. ● While the above examples show that it is possible to keep track of P(X ∈ B) for all subsets B of the real numbers, they also indicate that it is rather cumbersome to do so. ● Fortunately, there are simpler functions available to help us keep track of probability distributions, including cumulative distribution functions, probability functions, and density functions. ● We discuss these next
  • 35. Summary 2 01 The distribution of a random variable X is the collection of probabilities P(X ∈ B) of X belonging to various sets. 02 The probability P(X ∈ B) is determined by calculating the probability of the set of response values s such that X(s) ∈ B, i.e., P(X ∈ B) = P({s ∈ S : X(s) ∈ B}).
  • 37. Exercise 1 ● Consider flipping two independent fair coins. ● Let X be the number of heads that appear. ● Compute P(X = x) for all real numbers x.
  • 38. Exercise 2 ● Suppose we flip three fair coins, and let X be the number of heads showing. (a) Compute P(X = x) for every real number x. (b) Write a formula for P(X ∈ B), for any subset B of the real numbers.
  • 39. Exercise 3 ● Suppose we roll two fair six-sided dice, and let Y be the sum of the two numbers showing. (a) Compute P(Y = y) for every real number y. (b) Write a formula for P(Y ∈ B), for any subset B of the real numbers.
  • 41. Definition ● A random variable X is discrete if ∑_{x ∈ R¹} P(X = x) = 1. (2.3.1)
  • 42. ● Random variables satisfying (2.3.1) are simple in some sense because we can understand them completely just by understanding their probabilities of being equal to particular values x. ● Indeed, by simply listing out all the possible values x such that P(X = x) > 0, we obtain a second, equivalent definition, as follows.
  • 43. Equivalent definition ● A random variable X is discrete if there is a finite or countable sequence x1, x2, . . . of distinct real numbers, and a corresponding sequence p1, p2, . . . of nonnegative real numbers, such that P(X = x_i) = p_i for all i, and ∑_i p_i = 1. (2.3.2)
  • 44. Definition: Probability function ● For a discrete random variable X, its probability function is the function 𝑝𝑋 ∢ 𝑅1 β†’ [0, 1] defined by 𝑝𝑋 (π‘₯) = 𝑃(𝑋 = π‘₯).
  • 45. ● Hence, if x1, x2, . . . are the distinct values such that P(X = x_i) = p_i for all i with ∑_i p_i = 1, then p_X(x) = p_i if x = x_i for some i, and p_X(x) = 0 otherwise. ● Clearly, all the information about the distribution of X is contained in its probability function, but only if we know that X is a discrete random variable. ● Finally, we note that Theorem 1.5.1 immediately implies the following.
  • 46. Theorem 1.5.1 (Law of total probability, conditioned version) ● Let A1, A2, . . . Be events that form a partition of the sample space S, each of positive probability. ● Let B be any event. ● Then P(B) = P(A1)P(B | A1) + P(A2)P(B | A2) + Β· Β· Β· .
  • 47. Reminder of Theorem 2.3.1 (Law of total probability, discrete random variable version) ● Let X be a discrete random variable, and let A be some event. Then P(A) = ∑_{x ∈ R¹} P(X = x) P(A | X = x).
  • 49. ● Consider flipping a fair coin. Let Z = 1 if the coin is heads, and Z = 3 if the coin is tails. Let π‘Š = 𝑍2 + 𝑍. (a) What is the probability function of Z? (b) What is the probability function of W?
  • 50. ● Consider flipping two fair coins. Let X = 1 if the first coin is heads, and X = 0 if the first coin is tails. Let Y = 1 if the second coin is heads, and Y = 5 if the second coin is tails. ● Let Z = XY. What is the probability function of Z?
  • 53. ● Let c be some fixed real number. ● Then, as already discussed, c is also a random variable (in fact, c is a constant random variable). ● In this case, clearly c is discrete, with probability function 𝑝𝑐 satisfying that 𝑝𝑐(𝑐) = 1, and 𝑝𝑐(π‘₯) = 0 for π‘₯ β‰  𝑐. ● Because c is always equal to a particular value (namely, c) with probability 1, the distribution of c is sometimes called a point mass or point distribution or degenerate distribution.
  • 55. ● Consider flipping a coin that has probability θ of coming up heads and probability 1 − θ of coming up tails, where 0 < θ < 1. ● Let X = 1 if the coin is heads, while X = 0 if the coin is tails. ● Then p_X(1) = P(X = 1) = θ, ● while p_X(0) = P(X = 0) = 1 − θ. ● The random variable X is said to have the Bernoulli(θ) distribution; ● we write this as X ∼ Bernoulli(θ), ● read as “X follows the Bernoulli(θ) distribution.”
  • 56. ● Bernoulli distributions arise anytime we have a response variable that takes only two possible values, and we label one of these outcomes as 1 and the other as 0. ● For example, 1 could correspond to success and 0 to failure of some quality test applied to an item produced in a manufacturing process. ● In this case, ΞΈ is the proportion of manufactured items that will pass the test. ● Alternatively, we could be randomly selecting an individual from a population and recording a 1 when the individual is female and a 0 if the individual is a male. In this case, ΞΈ is the proportion of females in the population.
  • 58. ● A student has to sit for an exit exam. The probability she passes the exam is 0.7. ● Let X be the random variable defined as follows ● X= 1 if the student passes ● X=0 otherwise ● What is the distribution of X?
  • 60. ● Consider flipping n coins, each of which has (independent) probability θ of coming up heads, and probability 1 − θ of coming up tails. (Again, 0 < θ < 1.) ● Let X be the total number of heads showing. ● By (1.4.2), we see that for x = 0, 1, 2, . . . , n, p_X(x) = P(X = x) = C(n, x) θ^x (1 − θ)^(n−x), where C(n, x) is the binomial coefficient. ● The random variable X is said to have the Binomial(n, θ) distribution; we write this as X ∼ Binomial(n, θ).
  • 61. ● The Bernoulli(θ) distribution corresponds to the special case of the Binomial(n, θ) distribution when n = 1, namely, Bernoulli(θ) = Binomial(1, θ). ● The following figure contains the plots of several Binomial(20, θ) probability functions. [Figure: probability functions of Binomial(20, 1/2) and Binomial(20, 1/5).]
  • 62. ● The binomial distribution is applicable to any situation involving n independent performances of a random system; for each performance, we are recording whether a particular event has occurred, called a success, or has not occurred, called a failure. ● If we denote the event in question by A and put ΞΈ = P(A), we have that the number of successes in the n performances is distributed Binomial(n, ΞΈ ). ● For example, we could be testing light bulbs produced by a manufacturer, and ΞΈ is the probability that a bulb works when we test it. Then the number of bulbs that work in a batch of n is distributed Binomial(n, ΞΈ ).
  • 64. Example 1 ● Consider a coin that comes up heads with probability 0.25. ● If the coin is flipped 5 times, what is the resulting distribution for the number of heads showing?
  • 65. Example 2 ● 80 persons have to pass through a security gate. ● Let p be the probability that the alarm is activated, with p = 0.2129. ● Let X be the random variable corresponding to the number of persons activating the alarm. ● What is the probability that exactly 5 persons will raise the alarm?
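A minimal sketch of the binomial probability function applied to the two examples above (the function name is ours; Example 2's θ = 0.2129 is taken from the slide):

```python
from math import comb

def binomial_pmf(x, n, theta):
    """P(X = x) for X ~ Binomial(n, theta)."""
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

# Example 1: the full distribution for 5 flips with P(head) = 0.25.
dist = {x: binomial_pmf(x, 5, 0.25) for x in range(6)}

# Example 2: P(exactly 5 of 80 persons activate the alarm).
p_alarm = binomial_pmf(5, 80, 0.2129)
```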
  • 67. ● Consider repeatedly flipping a coin that has probability θ of coming up heads and probability 1 − θ of coming up tails, where again 0 < θ < 1. ● Let X be the number of tails that appear before the first head. ● Then for k ≥ 0, X = k if and only if the coin shows exactly k tails followed by a head. ● The probability of this is equal to (1 − θ)^k θ. (In particular, the probability of getting an infinite number of tails before the first head is equal to (1 − θ)^∞ θ = 0, so X is never equal to infinity.) ● Hence, p_X(k) = (1 − θ)^k θ, for k = 0, 1, 2, 3, . . . . ● The random variable X is said to have the Geometric(θ) distribution; ● we write this as X ∼ Geometric(θ).
  • 68. [Figure: probability functions of Geometric(1/2) and Geometric(1/5).]
  • 69. ● The geometric distribution applies whenever we are counting the number of failures until the first success for independent performances of a random system where the occurrence of some event is considered a success. ● For example, the number of light bulbs tested that work until the first bulb that does not (a working bulb is considered a β€œfailure” for the test). ● We note that some books instead define the geometric distribution to be the number of coin flips up to and including the first head, which is simply equal to one plus the random variable defined here.
  • 70. ● We roll a die 3 times. ● What is the probability that a 6 shows up for the first time at the third roll?
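This question is exactly a Geometric(1/6) computation; a quick sketch (the function name is ours):

```python
def geometric_pmf(k, theta):
    """P(X = k): k failures before the first success, X ~ Geometric(theta)."""
    return (1 - theta)**k * theta

# First 6 at the third roll: two non-6 rolls, then a 6.
p = geometric_pmf(2, 1/6)   # (5/6)**2 * (1/6) = 25/216
```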
  • 72. ● Generalizing the previous example, consider again repeatedly flipping a coin that has probability θ of coming up heads and probability 1 − θ of coming up tails. ● Let r be a positive integer, and let Y be the number of tails that appear before the rth head. ● Then for k ≥ 0, Y = k if and only if the coin shows exactly r − 1 heads (and k tails) on the first r − 1 + k flips, and then shows a head on the (r + k)-th flip. ● The probability of this is equal to p_Y(k) = C(r − 1 + k, r − 1) θ^(r−1) (1 − θ)^k θ = C(r − 1 + k, r − 1) θ^r (1 − θ)^k, for k = 0, 1, 2, 3, . . . .
  • 73. ● The random variable Y is said to have the Negative-Binomial(r, θ) distribution; we write this as Y ∼ Negative-Binomial(r, θ). ● The special case r = 1 corresponds to the Geometric(θ) distribution. ● So Negative-Binomial(1, θ) = Geometric(θ). [Figure: probability functions of Negative-Binomial(2, 1/2) and Negative-Binomial(10, 1/2).]
  • 74. ● The Negative-Binomial(r, θ) distribution applies whenever we are counting the number of failures until the rth success for independent performances of a random system where the occurrence of some event is considered a success. ● For example, the number of light bulbs tested that work until the third bulb that does not work follows the negative-binomial distribution.
  • 75. Example An oil company conducts a geological study that indicates that an exploratory oil well should have a 20% chance of striking oil. 1. What is the probability that the first strike comes on the third well drilled? 2. What is the probability that the third strike comes on the seventh well drilled?
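Both questions fit the negative-binomial form above; a sketch (the function name is ours), where question 1 is the r = 1 (geometric) case:

```python
from math import comb

def neg_binomial_pmf(k, r, theta):
    """P(Y = k): k failures before the r-th success, Y ~ Negative-Binomial(r, theta)."""
    return comb(r - 1 + k, r - 1) * theta**r * (1 - theta)**k

# 1. First strike on the third well: 2 dry wells before the 1st strike.
p_first = neg_binomial_pmf(2, 1, 0.2)    # 0.8**2 * 0.2
# 2. Third strike on the seventh well: 4 dry wells before the 3rd strike.
p_third = neg_binomial_pmf(4, 3, 0.2)    # comb(6, 2) * 0.2**3 * 0.8**4
```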
  • 77. ● We say that a random variable Y has the Poisson(λ) distribution, and write Y ∼ Poisson(λ), if p_Y(y) = P(Y = y) = (λ^y / y!) e^(−λ) for y = 0, 1, 2, 3, . . . . ● We note that since (from calculus) ∑_{y=0}^{∞} λ^y / y! = e^λ, ● it is indeed true (as it must be) that ∑_{y=0}^{∞} P(Y = y) = 1.
  • 78. [Figure: probability functions of Poisson(2) and Poisson(10).]
  • 79. Motivation ● We motivate the Poisson distribution as follows. ● Suppose X ∼ Binomial(n, θ), i.e., X has the Binomial(n, θ) distribution. ● Then for 0 ≤ x ≤ n, P(X = x) = C(n, x) θ^x (1 − θ)^(n−x). ● If we set θ = λ/n for some λ > 0, then this becomes P(X = x) = C(n, x) (λ/n)^x (1 − λ/n)^(n−x) = [n(n − 1) · · · (n − x + 1) / x!] (λ/n)^x (1 − λ/n)^(n−x).
  • 80. ● Let us now consider what happens if we let n → ∞, while keeping x fixed at some nonnegative integer. ● In that case, n(n − 1) · · · (n − x + 1) / n^x → 1. ● Indeed, n(n − 1) · · · (n − x + 1) / n^x = 1 · (1 − 1/n)(1 − 2/n) · · · (1 − (x − 1)/n), and each factor tends to 1.
  • 81. ● We also have (1 − λ/n)^(n−x) = (1 − λ/n)^n (1 − λ/n)^(−x) → e^(−λ) · 1 = e^(−λ). ● So lim_{n→∞} P(X = x) = (λ^x / x!) e^(−λ) for x = 0, 1, 2, 3, . . . .
  • 82. ● Intuitively, we can phrase this result as follows. ● If we flip a very large number n of coins, and each coin has a very small probability θ = λ/n of coming up heads, then the probability that the total number of heads will be x is approximately given by (λ^x / x!) e^(−λ).
  • 83. ● The figure displays the accuracy of this estimate when we are approximating the Binomial(100, 1/10) distribution by the Poisson(λ) distribution, where λ = nθ = 100(1/10) = 10. [Figure: probability functions of Poisson(10) and Binomial(100, 1/10).]
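The comparison shown in the figure can be reproduced numerically; a sketch (function names are ours):

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, theta):
    """P(X = x) for X ~ Binomial(n, theta)."""
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

def poisson_pmf(x, lam):
    """P(Y = x) for Y ~ Poisson(lam)."""
    return lam**x * exp(-lam) / factorial(x)

# Poisson(10) approximating Binomial(100, 1/10), as in the figure.
pairs = {x: (binomial_pmf(x, 100, 0.1), poisson_pmf(x, 10)) for x in range(21)}
```

Near the mode x = 10 the two probability functions agree to within about 0.01.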
  • 84. ● The Poisson distribution is a good model for counting random occurrences of an event when there are many possible occurrences, but each occurrence has very small probability. ● Examples include a. the number of house fires in a city on a given day, b. the number of radioactive events recorded by a Geiger counter, c. the number of phone calls arriving at a switchboard, d. the number of hits on a popular World Wide Web page on a given day, etc.
  • 86. ● Suppose that an urn contains M white balls and N − M black balls. ● Suppose we draw n ≤ N balls from the urn in such a fashion that each subset of n balls has the same probability of being drawn. ● Because there are C(N, n) such subsets, this probability is 1 / C(N, n). ● One way of accomplishing this is to thoroughly mix the balls in the urn and then draw a first ball. ● Accordingly, each ball has probability 1/N of being drawn. ● Then, without replacing the first ball, we thoroughly mix the balls in the urn and draw a second ball. ● So each ball in the urn has probability 1/(N − 1) of being drawn.
  • 87. ● We then have that any two balls, say the ith and jth balls, have probability ● P(balls i and j are drawn) = P(ball i is drawn first) P(ball j is drawn second | ball i is drawn first) + P(ball j is drawn first) P(ball i is drawn second | ball j is drawn first) = (1/N)(1/(N − 1)) + (1/N)(1/(N − 1)) = 2 / (N(N − 1)) = 1 / C(N, 2). ● This type of sampling is called sampling without replacement.
  • 88. ● Given that we take a sample of n, let X denote the number of white balls obtained. ● Note that we must have X β‰₯ 0 and X β‰₯ n βˆ’ (N βˆ’ M) because at most N βˆ’ M of the balls could be black. ● Hence, X β‰₯ max(0, n + M βˆ’ N). ● Furthermore, X ≀ n and X ≀ M because there are only M white balls. ● Hence, X ≀ min(n, M). ● So suppose max(0, n + M βˆ’ N) ≀ x ≀ min(n, M). ● What is the probability that x white balls are obtained? ● In other words, what is P(X = x)?
  • 89. ● To evaluate this, we know that we need to count the number of subsets of n balls that contain x white balls. ● Using the combinatorial principles (ch1), we see that this number is given by C(M, x) C(N − M, n − x). ● So P(X = x) = C(M, x) C(N − M, n − x) / C(N, n). ● The random variable X is said to have the Hypergeometric(N, M, n) distribution.
  • 90. [Figure: probability functions of Hypergeometric(20, 10, 10) and Hypergeometric(20, 10, 5).]
  • 91. ● Obviously, the hypergeometric distribution will apply to any context wherein we are sampling without replacement from a finite set of N elements and where each element of the set either has a characteristic or does not. ● For example, if we randomly select people to participate in an opinion poll so that each set of n individuals in a population of N has the same probability of being selected, then the number of people who respond yes to a particular question is distributed Hypergeometric(N, M, n), where M is the number of people in the entire population who would respond yes.
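A sketch of the hypergeometric probability function (the function name is ours), checking that it sums to 1 over the support derived above:

```python
from math import comb

def hypergeometric_pmf(x, N, M, n):
    """P(X = x): x white balls when n balls are drawn without replacement
    from N balls of which M are white."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# Support runs from max(0, n + M - N) to min(n, M).
N, M, n = 20, 10, 5
support = range(max(0, n + M - N), min(n, M) + 1)
total = sum(hypergeometric_pmf(x, N, M, n) for x in support)
```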
  • 92. Summary 3 01 A random variable X is discrete if ∑_x P(X = x) = 1, i.e., if all its probability comes from being equal to particular values. 02 A discrete random variable X takes on only a finite, or countable, number of distinct values. 03 Important discrete distributions include the degenerate, Bernoulli, binomial, geometric, negative-binomial, Poisson, and hypergeometric distributions.
  • 94. Exercise 1 Let Y ∼ Binomial(10, ΞΈ ). ● Compute P(Y = 10). ● Compute P(Y = 5). ● Compute P(Y = 0).
  • 95. Exercise ● We toss a fair coin. ● Let X = 1 if the coin is heads and 0 otherwise. ● What distribution does X follow? ● We toss the coin 10 times and consider Y, the r.v. that counts the number of heads. What distribution does Y follow?
  • 96. Exercise 2 ● Let Z ∼ Geometric(ΞΈ ). Compute P(5 ≀ Z ≀ 9).
  • 97. Exercise 3 A die is rolled 3 times. 1. What is the probability that no 6 occurs?
  • 98. Exercise 4 We roll a die 3 times. 1. Let X be the number of 5s that appear. a. What is the distribution of X? b. What is the probability of getting three 5s? c. What is the probability of getting two 5s? 2. Let Y be the sum of the numbers showing. a. Does Y follow the binomial distribution? b. What is the probability P(Y = 15)? 3. What is the probability of getting the first 5 at the third roll? 4. What is the probability of getting the sum of 15 three times?
  • 100. Definition ● Given a random variable X, its cumulative distribution function (also called the distribution function, or cdf) is the function F_X : R¹ → [0, 1] defined by F_X(x) = P(X ≤ x).
  • 101. CDF Properties Let F_X be the cumulative distribution function of a random variable X. Then ● 0 ≤ F_X(x) ≤ 1 for all x ● F_X(x) ≤ F_X(y) whenever x ≤ y (i.e., F_X is increasing) ● lim_{x→∞} F_X(x) = 1 ● lim_{x→−∞} F_X(x) = 0
  • 102. Theorem ● Let X be a discrete random variable with probability function p_X; then its cumulative distribution function F_X satisfies F_X(x) = ∑_{y ≤ x} p_X(y).
  • 103. Example 1 We toss a fair coin twice. Let X be the number of heads. What is F_X(x)?
  • 104. ● S = {HH, HT, TH, TT} ● X ∼ Binomial(2, 1/2)
s:       TT   HT, TH   HH
X(s):     0      1      2
p_X(x):  1/4    1/2    1/4
  • 105. ● F_X(x) = P(X ≤ x) ● F_X(x) = 0 if x < 0 ● F_X(x) = 1 if x ≥ 2 ● F_X(x) = 1/4 if 0 ≤ x < 1 ● F_X(x) = 1/4 + 1/2 = 3/4 if 1 ≤ x < 2
  • 106. F_X(x) = 0 if x < 0; 1/4 if 0 ≤ x < 1; 3/4 if 1 ≤ x < 2; 1 if x ≥ 2.
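The theorem F_X(x) = ∑_{y ≤ x} p_X(y) translates directly into code; a sketch (names are ours) reproducing the two-coin example:

```python
def cdf(pmf, x):
    """F_X(x) = P(X <= x): sum p_X(y) over all values y <= x."""
    return sum(p for v, p in pmf.items() if v <= x)

# X = number of heads in two fair coin flips.
pmf_X = {0: 1/4, 1: 1/2, 2: 1/4}
steps = [cdf(pmf_X, x) for x in (-1, 0.5, 1.5, 2)]   # [0, 0.25, 0.75, 1.0]
```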
  • 107. Example 2 ● Suppose we roll one fair six-sided die so that S = {1, 2, 3, 4, 5, 6}, and P(s) = 1/6 for all s ∈ S. ● Let X be the random variable associated to the number showing divided by 6: X(s) = s/6 for all s ∈ S. ● What is F_X(x)?
  • 108. Here X(y) = y/6 and p_X(y/6) = 1/6 for each y ∈ S. Then F_X(x) = P(X ≤ x), and X(y) ≤ x means y/6 ≤ x, i.e., y ≤ 6x. Hence F_X(x) = ∑_{y ∈ S, y ≤ 6x} p_X(y/6) = ∑_{y ∈ S, y ≤ 6x} 1/6 = (1/6) |{y ∈ S : y ≤ 6x}|.
  • 109. Using F_X(x) = (1/6) |{y ∈ S : y ≤ 6x}|: if x ≥ 1, then {y ∈ S : y ≤ 6x} = {1, 2, 3, 4, 5, 6}, so F_X(x) = 6/6 = 1; if 5/6 ≤ x < 1, then {y ∈ S : y ≤ 6x} = {1, 2, 3, 4, 5}, so F_X(x) = 5/6; if 4/6 ≤ x < 5/6, then {y ∈ S : y ≤ 6x} = {1, 2, 3, 4}, so F_X(x) = 4/6 = 2/3.
  • 110. Similarly: if 3/6 ≤ x < 4/6, then {y ∈ S : y ≤ 6x} = {1, 2, 3}, so F_X(x) = 3/6 = 1/2; if 2/6 ≤ x < 3/6, then {y ∈ S : y ≤ 6x} = {1, 2}, so F_X(x) = 2/6 = 1/3; if 1/6 ≤ x < 2/6, then {y ∈ S : y ≤ 6x} = {1}, so F_X(x) = 1/6.
  • 111. Finally, if x < 1/6, then {y ∈ S : y ≤ 6x} = ∅, so F_X(x) = 0.
  • 112. [Figure: graph of F_X, a step function that jumps by 1/6 at each of x = 1/6, 2/6, 3/6, 4/6, 5/6, 1.]
  • 113. Useful formula ● For all π‘Ž ≀ 𝑏, we have 𝑃(π‘Ž < 𝑋 ≀ 𝑏) = 𝐹𝑋(𝑏) βˆ’ 𝐹𝑋(π‘Ž)
  • 115. Exercise 1 ● Consider rolling one fair six-sided die, so that S = {1, 2, 3, 4, 5, 6}, and P(s) = 1/6 for all s ∈ S. Let X be the number showing on the die, so that X(s) = s for s ∈ S. ● Let π‘Œ = 𝑋2. Compute the cumulative distribution function πΉπ‘Œ (𝑦) = P(Y ≀ y), for all 𝑦 ∈ 𝑅1 .
  • 117. ● In many real scenarios, we need to analyze several variables at the same time. ● For this, we will show how to deal with two random variables; the approach generalizes to more than two variables.
  • 118. Definition (Joint probability function) ● Let X and Y be discrete random variables. ● Then their joint probability function, 𝑝𝑋,π‘Œ , is a function from 𝑅2 to 𝑅1, defined by 𝑝𝑋,π‘Œ (π‘₯, 𝑦) = 𝑃(𝑋 = π‘₯ , π‘Œ = 𝑦)
  • 119. Example ● Find P(X = 0, Y ≤ 1).
        Y=0   Y=1   Y=2
X=0     1/6   1/4   1/8
X=1     1/8   1/6   1/6
  • 120. Theorem (marginal probability function) ● Let X and Y be two discrete random variables, with joint probability function p_{X,Y}. Then the probability function p_X of X can be computed as p_X(x) = ∑_y p_{X,Y}(x, y). ● Similarly, the probability function p_Y of Y can be computed as p_Y(y) = ∑_x p_{X,Y}(x, y).
  • 121. Example 1 ● Find the marginal probability functions of X and Y.
        Y=0   Y=1   Y=2
X=0     1/6   1/4   1/8
X=1     1/8   1/6   1/6
  • 122. Example 2 ● Suppose the joint probability function of X and Y is given by p_{X,Y}(x, y) = 1/7 if (x, y) = (5, 0); 1/7 if (x, y) = (5, 3); 1/7 if (x, y) = (5, 4); 3/7 if (x, y) = (8, 0); 1/7 if (x, y) = (8, 4); and 0 otherwise.
  • 123. 1. p_X(5)? 2. p_X(8)? 3. p_Y(4)?
  • 124. 1. p_X(5) = ∑_y p_{X,Y}(5, y) = p_{X,Y}(5, 0) + p_{X,Y}(5, 3) + p_{X,Y}(5, 4) = 3/7
  • 125.
        Y=0   Y=3   Y=4   p_X
X=5     1/7   1/7   1/7   3/7
X=8     3/7    0    1/7   4/7
p_Y     4/7   1/7   2/7
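Marginals are just row and column sums of the joint probability function; a sketch using exact fractions (names are ours; the joint values are those of Example 2):

```python
from fractions import Fraction as F

# Joint probability function: {(x, y): p_{X,Y}(x, y)}.
joint = {(5, 0): F(1, 7), (5, 3): F(1, 7), (5, 4): F(1, 7),
         (8, 0): F(3, 7), (8, 4): F(1, 7)}

def marginal_X(joint, x):
    """p_X(x) = sum over y of p_{X,Y}(x, y)."""
    return sum(p for (xv, _), p in joint.items() if xv == x)

def marginal_Y(joint, y):
    """p_Y(y) = sum over x of p_{X,Y}(x, y)."""
    return sum(p for (_, yv), p in joint.items() if yv == y)
```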
  • 126. Definition ● If X and Y are random variables, then the joint distribution of X and Y is the collection of probabilities P((X, Y) ∈ B), for all subsets 𝐡 βŠ† 𝑅2 of pairs of real numbers.
  • 127. Definition Joint cdf ● Let X and Y be random variables. ● Then their joint cumulative distribution function is the function 𝐹𝑋,π‘Œ ∢ 𝑅2 β†’ [0, 1] defined by 𝐹𝑋,π‘Œ (π‘₯, 𝑦) = 𝑃(𝑋 ≀ π‘₯, π‘Œ ≀ 𝑦) (the comma means β€œand” here, so that 𝐹𝑋,π‘Œ (π‘₯, 𝑦) is the probability that X ≀ x and Y ≀ y.)
  • 128.
  • 130. Example 1. Find F_{X,Y}(0.5, 1).
        Y=0   Y=1   Y=2
X=0     1/6   1/4   1/8
X=1     1/8   1/6   1/6
  • 131. ● What is the probability 𝑃(π‘Ž < 𝑋 ≀ 𝑏, 𝑐 < π‘Œ ≀ 𝑑)
  • 132. Theorem ● Let X and Y be any random variables, with joint cumulative distribution function 𝐹𝑋,π‘Œ . ● Suppose a ≀ b and c ≀ d. Then 𝑃(π‘Ž < 𝑋 ≀ 𝑏 , 𝑐 < π‘Œ ≀ 𝑑) = 𝐹𝑋,π‘Œ (𝑏, 𝑑) βˆ’ 𝐹𝑋,π‘Œ(π‘Ž, 𝑑) βˆ’ 𝐹𝑋,π‘Œ(𝑏, 𝑐) + 𝐹𝑋,π‘Œ(π‘Ž, 𝑐).
  • 133.
  • 135. Theorem (marginal distributions) ● Let X and Y be two random variables, with joint cumulative distribution function FX,Y . Then the cumulative distribution function FX of X satisfies 𝐹𝑋 (π‘₯) = lim π‘¦β†’βˆž 𝐹𝑋,π‘Œ (π‘₯, 𝑦) for all x ∈ R1. ● Similarly, the cumulative distribution function FY of Y satisfies πΉπ‘Œ (𝑦) = lim π‘₯β†’βˆž 𝐹𝑋,π‘Œ (π‘₯, 𝑦) for all y ∈ R1.
  • 136. Summary 01 It is often important to keep track of the joint probabilities of two random variables, X and Y. 02 If X and Y are discrete, then their joint probability function is given by p_{X,Y}(x, y) = P(X = x, Y = y). 03 Their joint cumulative distribution function is given by F_{X,Y}(x, y) = P(X ≤ x, Y ≤ y).
  • 138. ● Consider an urn containing: a. 4 white balls b. 3 red balls c. 3 black balls ● We pick 5 balls without replacement. ● What is the probability of getting 2 white and 2 red balls?
  • 139.
  • 140. ● Let X be the random variable “number of red balls” and Y be the “number of white balls.” a. Find P(X = 2, Y = 2) b. Find the joint probability function of X and Y c. Does ∑_{x,y} p_{X,Y}(x, y) = 1? d. Find P(X = Y) e. Find P(X > Y) f. Find P(X = 3) g. Find P(Y = 2)
  • 141.
        Y=0      Y=1      Y=2      Y=3      Y=4
X=0      0        0      6/252   12/252    3/252
X=1      0     12/252   54/252   36/252    3/252
X=2    3/252   36/252   54/252   12/252     0
X=3    3/252   12/252    6/252     0        0
  • 142. ● Does ∑_{x,y} p_{X,Y}(x, y) = 1?
  • 147. ● Find F_{X,Y}(2, 1) ● Find F_{X,Y}(3, 2)
  • 148. Find F_{X,Y}(2, 1), using the joint probability table above.
  • 149. ● Find F_{X,Y}(3, 2), using the joint probability table above.
  • 151. Definition ● Let X and Y be random variables, and suppose that P(X = x) > 0. ● The conditional distribution of Y, given that X = x, is the probability distribution assigning probability P(Y ∈ B, X = x) / P(X = x) to each event Y ∈ B. ● In particular, it assigns probability P(a < Y ≤ b, X = x) / P(X = x) to the event that a < Y ≤ b.
  • 152. Example ● Suppose the joint probability function of X and Y is given by p_{X,Y}(x, y) = 1/7 if (x, y) = (5, 0); 1/7 if (x, y) = (5, 3); 1/7 if (x, y) = (5, 4); 3/7 if (x, y) = (8, 0); 1/7 if (x, y) = (8, 4); and 0 otherwise.
  • 153. ● We compute P(Y = 4 | X = 8) as P(Y = 4 | X = 8) = P(Y = 4, X = 8) / P(X = 8) = (1/7) / (1/7 + 3/7) = 1/4. ● Similarly, P(Y = 4 | X = 5) = P(Y = 4, X = 5) / P(X = 5) = (1/7) / (1/7 + 1/7 + 1/7) = 1/3.
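The same two conditional probabilities can be computed mechanically from the joint probability function; a sketch (names are ours):

```python
from fractions import Fraction as F

# Joint probability function of the example: {(x, y): p_{X,Y}(x, y)}.
joint = {(5, 0): F(1, 7), (5, 3): F(1, 7), (5, 4): F(1, 7),
         (8, 0): F(3, 7), (8, 4): F(1, 7)}

def cond_Y_given_X(joint, y, x):
    """p_{Y|X}(y | x) = p_{X,Y}(x, y) / p_X(x)."""
    p_x = sum(p for (xv, _), p in joint.items() if xv == x)
    return joint.get((x, y), F(0)) / p_x

p1 = cond_Y_given_X(joint, 4, 8)   # 1/4
p2 = cond_Y_given_X(joint, 4, 5)   # 1/3
```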
  • 154. Definition ● Suppose X and Y are two discrete random variables. ● Then the conditional probability function of Y, given X, is the function p_{Y|X} defined by p_{Y|X}(y | x) = p_{X,Y}(x, y) / p_X(x) = p_{X,Y}(x, y) / ∑_z p_{X,Y}(x, z), ● defined for all y ∈ R¹ and all x with p_X(x) > 0.
  • 155. Example Find P(Y = 1 | X = 0).
        Y=0   Y=1   Y=2
X=0     1/6   1/4   1/8
X=1     1/8   1/6   1/6
  • 157. ● Recall from Definition 1.5.2 that two events A and B are independent if P(A ∩ B) = P(A) P(B). ● We wish to have a corresponding definition of independence for random variables X and Y. Intuitively, independence of X and Y means that X and Y have no influence on each other, i.e., that the values of X make no change to the probabilities for Y (and vice versa). ● The idea of the formal definition is that X and Y give rise to events, of the form β€œa < X ≀ b” or β€œY ∈ B,” and we want all such events involving X to be independent of all such events involving Y.
  • 158. Definition ● Let X and Y be two random variables. ● Then X and Y are independent if, for all subsets B1 and B2 of the real numbers, P(X ∈ B1, Y ∈ B2) = P(X ∈ B1)P(Y ∈ B2). ● That is, the events β€œX ∈ B1” and β€œY ∈ B2” are independent events.
  • 159. Theorem ● Let X and Y be two random variables. Then X and Y are independent if and only if 𝑃(π‘Ž ≀ 𝑋 ≀ 𝑏, 𝑐 ≀ π‘Œ ≀ 𝑑) = 𝑃(π‘Ž ≀ 𝑋 ≀ 𝑏)𝑃(𝑐 ≀ π‘Œ ≀ 𝑑) (2.8.1) whenever a ≀ b and c ≀ d.
  • 160. Theorem ● Let X and Y be two discrete random variables. ● X and Y are independent if and only if their joint probability function 𝑝𝑋,π‘Œ satisfies 𝑝𝑋,π‘Œ (x, y) = 𝑝𝑋 (x) π‘π‘Œ (y) for all x, y ∈ R1.
  • 161. Example Are X and Y independent?
        Y=0   Y=1   Y=2
X=0     1/6   1/4   1/8
X=1     1/8   1/6   1/6
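The theorem gives a mechanical test: X and Y are independent iff every joint entry factors as the product of its marginals. A sketch checking the table from this example (names are ours):

```python
from fractions import Fraction as F

joint = {(0, 0): F(1, 6), (0, 1): F(1, 4), (0, 2): F(1, 8),
         (1, 0): F(1, 8), (1, 1): F(1, 6), (1, 2): F(1, 6)}

def independent(joint):
    """True iff p_{X,Y}(x, y) == p_X(x) * p_Y(y) for every pair (x, y)."""
    xs = {x for x, _ in joint}
    ys = {y for _, y in joint}
    pX = {x: sum(joint.get((x, y), F(0)) for y in ys) for x in xs}
    pY = {y: sum(joint.get((x, y), F(0)) for x in xs) for y in ys}
    return all(joint.get((x, y), F(0)) == pX[x] * pY[y]
               for x in xs for y in ys)

result = independent(joint)   # False: p(0,0) = 1/6, but pX(0) pY(0) = (13/24)(7/24)
```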
  • 162. ● Suppose now that we have n random variables 𝑋1, . . . , 𝑋𝑛. ● The random variables are independent if and only if the collection of events {π‘Žπ‘– ≀ 𝑋𝑖 ≀ 𝑏𝑖 } are independent, whenever π‘Žπ‘– ≀ 𝑏𝑖 , for all i = 1, 2, . . . , n. ● Generalizing previous Theorem, we have the following result.
  • 163. ● Let 𝑋1, . . . , 𝑋𝑛 be a collection of random discrete variables. ● 𝑋1, . . . , 𝑋𝑛 are independent if and only if their joint probability function 𝑝𝑋1,…,𝑋𝑛 satisfies 𝑝𝑋1,…,𝑋𝑛 (π‘₯1, . . . , π‘₯𝑛) = 𝑝𝑋1 (π‘₯1) Β· Β· Β· 𝑝𝑋𝑛 (π‘₯𝑛) for all π‘₯1, . . . , π‘₯𝑛 ∈ 𝑅1 .
  • 164. Independent and identically distributed random variables ● A collection 𝑋1, . . . , 𝑋𝑛 of random variables is independent and identically distributed (or i.i.d.) if the collection is independent and if, furthermore, each of the n variables has the same distribution. ● The i.i.d. sequence 𝑋1, . . . , 𝑋𝑛 is also referred to as a sample from the common distribution.
  • 165. ● In particular, if a collection X1, . . . , Xn of random variables is i.i.d. and discrete, then each of the probability functions p_{X_i} is the same, so that p_{X1}(x) = · · · = p_{Xn}(x) = p(x), for all x ∈ R¹. ● Furthermore, from the independence theorem above, it follows that p_{X1,...,Xn}(x1, . . . , xn) = p_{X1}(x1) · · · p_{Xn}(xn) = p(x1) · · · p(xn) for all x1, . . . , xn ∈ R¹.
  • 167.
  • 168. ● Consider k random variables X1, . . . , Xk that follow the multinomial distribution, written (X1, . . . , Xk) ∼ Multinomial(n, θ1, . . . , θk), in the following case: ● The experiment consists of n repeated trials. ● Each trial has a discrete number of possible outcomes. ● On any given trial, the probability that a particular outcome i will occur is a constant θi. ● The trials are independent; that is, the outcome on one trial does not affect the outcome on other trials.
  • 169. p_{X1,...,Xk}(x1, . . . , xk) = C(n; x1, x2, . . . , xk) θ1^{x1} · · · θk^{xk} = [n! / (x1! x2! · · · xk!)] θ1^{x1} · · · θk^{xk}, whenever each xi ∈ {0, . . . , n} and x1 + · · · + xk = n. In this case, we write (X1, . . . , Xk) ∼ Multinomial(n, θ1, . . . , θk).
  • 170. Examples ● Consider a bowl of chips of which a proportion θi of the chips are labelled i (for i = 1, 2, 3). ● If we randomly draw a chip from the bowl and observe its label s, then P(s = i) = θi. ● We draw 3 chips with replacement and consider the random variable Xi: the number of chips labeled i. ● Then (X1, X2, X3) ∼ Multinomial(3, θ1, θ2, θ3).
  • 171. Examples ● Consider a population of students at a university of which a proportion θ1 live on campus (denoted by s = 1), a proportion θ2 live off-campus with their parents (denoted by s = 2), and a proportion θ3 live off-campus independently (denoted by s = 3). ● If we randomly draw a student from this population and determine s for that student, then P(s = i) = θi. ● Let Xi be the number of students in the sample with s = i.
  • 172. ● In a random sample of 10 people, what is the probability that 6 have type O, 2 have type A, 1 has type B, and 1 has type AB? ● X1: # O, X2: # A, X3: # B, X4: # AB ● P(X1 = 6, X2 = 2, X3 = 1, X4 = 1) = [10! / (6! 2! 1! 1!)] × 0.44^6 × 0.42^2 × 0.10^1 × 0.04^1
Blood type:   O     A     B     AB
Probability: 0.44  0.42  0.10  0.04
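A sketch of the multinomial probability function applied to the blood-type numbers above (the function name is ours):

```python
from math import factorial

def multinomial_pmf(counts, thetas):
    """P(X1 = x1, ..., Xk = xk) for (X1,...,Xk) ~ Multinomial(n, theta_1,...,theta_k)."""
    n = sum(counts)
    coef = factorial(n)                  # n! / (x1! x2! ... xk!)
    for x in counts:
        coef //= factorial(x)
    prob = float(coef)
    for x, th in zip(counts, thetas):    # multiply by each theta_i ** x_i
        prob *= th**x
    return prob

# P(6 type O, 2 type A, 1 type B, 1 type AB) in a sample of 10.
p = multinomial_pmf([6, 2, 1, 1], [0.44, 0.42, 0.10, 0.04])
```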
  • 173.
  • 175. ● An urn contains 8 red balls, 3 yellow, and 9 white. 6 balls are randomly selected with replacement. ● What is the probability that 2 are red, 1 is yellow, and 3 are white?
  • 176. ● Suppose a card is drawn randomly from an ordinary deck of playing cards, and then put back in the deck. This exercise is repeated five times. What is the probability of drawing 1 spade, 1 heart, 1 diamond, and 2 clubs?
  • 177. THANKS References Michael J. Evans and Jeffrey S. Rosenthal: Probability and Statistics, The Science of Uncertainty, Second Edition, University of Toronto

Editor's Notes

  1. More formally, we have the following definition.
  2. 1/6
  3. Now suppose that A1, A2, . . . are events that form a partition of the sample space S. This means that A1, A2, . . . are disjoint and, furthermore, that their union is equal to S, i.e., A1 βˆͺ A2 βˆͺ Β· Β· Β· = S. We have the following basic theorem that allows us to decompose the calculation of the probability of B into the sum of the probabilities of the sets Ai ∩ B. Often these are easier to compute.
  4. (1 + (c/n))^n β†’ e^c
  5. https://stattrek.com/probability-distributions/multinomial.aspx