Analysis of Algorithms
The Mathematics of Analysis of Algorithms
Andres Mendez-Vazquez
September 10, 2015
Outline
1 Induction
Basic Induction
Structural Induction
2 Series
Properties
Important Series
Bounding the Series
3 Probability
Intuitive Formulation
Axioms
Independence
Unconditional and Conditional Probability
Posterior (Conditional) Probability
Random Variables
Types of Random Variables
Cumulative Distribution Function
Properties of the PMF/PDF
Expected Value and Variance
Indicator Random Variable
Basic Induction
Principle of Mathematical Induction
Let P(n) be a property that is defined for integers n, and let a be a fixed
integer.
Suppose the following two statements are true
1 P (a) is true.
2 For all integers k ≥ a, if P (k) is true then P (k + 1) is true.
Then the statement
For all integers n ≥ a, P (n) is true. (1)
We have the following method for Mathematical Induction
Consider a statement of the form
For all integers n ≥ a, P (n) is true. (2)
To prove such a statement
Perform the following two steps
Step 1 (Basis step)
Show that P(a) is true - we normally use a = 1.
Then
Step 2 (Inductive step)
Show that for all integers k ≥ a, if P(k) is true, then P(k + 1) is true.
Inductive hypothesis
Suppose that P(k) is true, where k is any particular but arbitrarily chosen
integer with k ≥ a.
Then, you can prove that
P (k + 1) is true.
Example
Proposition
For all integers n ≥ 8, n¢ can be obtained using 3¢ and 5¢ coins.
Show that P(8) is true
P(8) is true because 8¢ can be obtained using one 3¢ coin and one 5¢ coin.
Show that for all integers k ≥ 8, if P(k) is true then P(k + 1) is also true
We can do the following.
Example
Inductive hypothesis
Suppose that k is any integer with k ≥ 8 such that k¢ can be obtained using 3¢ and 5¢ coins.
Case 1 - There is a 5¢ coin among those making the change for k¢
In this case, replace one 5¢ coin with two 3¢ coins. Thus, we get the change for (k + 1)¢.
Case 2 - There is no 5¢ coin among those making the change for k¢
Because k ≥ 8 and only 3¢ coins were used, at least three 3¢ coins must have been used.
Remove three of those coins and replace them with two 5¢ coins.
The result will be (k + 1)¢.
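Following the inductive argument literally gives a constructive procedure. Below is a minimal Python sketch (the function name and structure are my own, not from the slides) that starts from the base case 8 = 3 + 5 and applies the two replacement rules one step at a time:

```python
def three_five_change(n):
    """Return (threes, fives) with 3*threes + 5*fives == n, for n >= 8."""
    if n < 8:
        raise ValueError("the claim only holds for n >= 8")
    threes, fives = 1, 1           # base case: P(8), since 8 = 3 + 5
    for _ in range(8, n):          # inductive step: k cents -> (k + 1) cents
        if fives > 0:              # Case 1: swap one 5c coin for two 3c coins
            fives -= 1
            threes += 2
        else:                      # Case 2: only 3c coins, so at least three of them
            threes -= 3            # swap three 3c coins for two 5c coins
            fives += 2
    return threes, fives

# quick check of the invariant over a range of values
for n in range(8, 60):
    t, f = three_five_change(n)
    assert 3 * t + 5 * f == n
```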
We can go further...
Recursively defined sets and structures
Assume S is a set. We use two steps to define the elements of S.
Basis Step
Specify an initial collection of elements.
Recursive Step
Give a rule for forming new elements from those already known to be in S.
Thus
Let S be a set that has been defined recursively
To prove that every object in S satisfies a certain property:
1 Show that each object in the BASE for S satisfies the property.
2 Show that for each rule in the RECURSION, if the rule is applied to
objects in S that satisfy the property, then the objects defined by the
rule also satisfy the property.
Example: Binary trees recursive definition
Recall that the set B of binary trees over an alphabet A is defined as follows
1 Basis: $\langle\,\rangle \in B$, where $\langle\,\rangle$ denotes the empty tree.
2 Recursive definition: If $L, R \in B$ and $x \in A$ then $\langle L, x, R \rangle \in B$.
Now define the function $f : B \to \mathbb{N}$ by
$f(\langle\,\rangle) = 0$
$f(\langle L, x, R \rangle) = 1$ if $L = R = \langle\,\rangle$, and $f(L) + f(R)$ otherwise.
Theorem
Let T in B be a binary tree. Then f(T) yields the number of leaves of T.
Proof
By structural induction on T
Basis: the empty tree has no leaves, so $f(\langle\,\rangle) = 0$ is correct.
Induction
Let L, R be trees in B and $x \in A$.
Now
Suppose that f(L) and f(R) denote the number of leaves of L and R, respectively.
Proof
Case 1
If $L = R = \langle\,\rangle$, then $\langle L, x, R \rangle = \langle \langle\,\rangle, x, \langle\,\rangle \rangle$ has one leaf, namely x, so $f(\langle \langle\,\rangle, x, \langle\,\rangle \rangle) = 1$ is correct.
Case 2
If L and R are not both empty, then the number of leaves of the tree $\langle L, x, R \rangle$ is equal to the number of leaves of L plus the number of leaves of R:
$f(\langle L, x, R \rangle) = f(L) + f(R)$ (3)
Q.E.D.
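A minimal sketch of f in Python, encoding the empty tree as None and ⟨L, x, R⟩ as a tuple (the encoding and names are assumptions of this sketch, not from the slides):

```python
def leaves(tree):
    """The function f from the slides: counts the leaves of a binary tree."""
    if tree is None:                       # f(<>) = 0
        return 0
    left, _, right = tree
    if left is None and right is None:     # both children empty: this node is a leaf
        return 1
    return leaves(left) + leaves(right)    # otherwise, sum over the subtrees

# <<,a,>, b, <,c,>>: the leaves are a and c
t = ((None, "a", None), "b", (None, "c", None))
assert leaves(t) == 2
```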
Introduction
When an algorithm contains an iterative control construct such as a while or for loop
It is possible to express its running time as a series:
$\sum_{j=1}^{n} j$ (4)
Thus, what is the objective of using these series?
To be able to find bounds for the complexities of algorithms.
Definition of Series
Definition
Given a sequence $a_1, a_2, ..., a_n$ of numbers, where n is a non-negative integer, we can say that
$a_1 + a_2 + \cdots + a_n = \sum_{k=1}^{n} a_k$ (5)
In the case of infinite series
$a_1 + a_2 + \cdots = \sum_{k=1}^{\infty} a_k = \lim_{n \to \infty} \sum_{k=1}^{n} a_k$ (6)
Here, we have the concepts of convergence and divergence, which I will leave for you to study.
Linearity
For any real number c and any finite sequences $a_1, a_2, ..., a_n$ and $b_1, b_2, ..., b_n$
$\sum_{k=1}^{n} [c a_k + b_k] = c \sum_{k=1}^{n} a_k + \sum_{k=1}^{n} b_k$ (7)
For More
Please take a look at page 1146 of Cormen’s book.
Telescopic Sum
Observation
In certain sums, each term is a difference of two quantities.
You sometimes see that all the terms cancel except the first and the last.
Example
Imagine that each term in the sum has the following structure:
$\frac{1}{k} - \frac{1}{k+1} = \frac{(k+1) - k}{k(k+1)} = \frac{1}{k(k+1)}$ (8)
What is the result of the following sum?
$\sum_{k=1}^{n} \frac{1}{k(k+1)}$ (9)
Telescopic Sum
Definition
For any sequence $a_0, a_1, ..., a_n$,
$\sum_{k=1}^{n} (a_k - a_{k-1}) = a_n - a_0$ (10)
Similarly
$\sum_{k=0}^{n-1} (a_k - a_{k+1}) = a_0 - a_n$ (11)
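Answering the question in (9): each term is 1/k − 1/(k+1), so the sum telescopes to 1 − 1/(n+1) = n/(n+1). A quick exact check in Python:

```python
from fractions import Fraction

def telescoped(n):
    """Sum of 1/(k(k+1)) for k = 1..n, computed exactly."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))

# everything cancels except the first and last terms: 1 - 1/(n+1)
for n in (1, 5, 100):
    assert telescoped(n) == Fraction(n, n + 1)
```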
Arithmetic series
Summing over the set {1, 2, 3, ..., n}
We can prove that
$\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$ (12)
Proof
Basis: if n = 1, then $\frac{1 \times 2}{2} = 1$.
Proof
Induction: assume that the claim is true for n. Then
$\sum_{i=1}^{n+1} i = \sum_{i=1}^{n} i + (n + 1) = \frac{n(n+1)}{2} + (n + 1) = \frac{n(n+1) + 2(n+1)}{2} = \frac{(n+1)(n+2)}{2}$
Series of Squares and Cubes
Series of Squares
$\sum_{k=0}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}$ (13)
Series of Cubes
$\sum_{k=0}^{n} k^3 = \frac{n^2 (n+1)^2}{4}$ (14)
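A direct check of the closed forms (12), (13) and (14) against brute-force summation:

```python
# Sanity check of the arithmetic, square and cube series closed forms.
for n in range(200):
    ks = range(n + 1)
    assert sum(ks) == n * (n + 1) // 2                            # (12)
    assert sum(k**2 for k in ks) == n * (n + 1) * (2*n + 1) // 6  # (13)
    assert sum(k**3 for k in ks) == (n * (n + 1) // 2) ** 2       # (14)
```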
Geometric Series
Definition
For a real $x \ne 1$, we have that
$\sum_{k=0}^{n} x^k = 1 + x + x^2 + \cdots + x^n$ (15)
It is called the geometric series
It is possible to prove that
$\sum_{k=0}^{n} x^k = \frac{x^{n+1} - 1}{x - 1}$ (16)
Proof
Multiply both sides of (Eq. 15) by x:
$x \sum_{k=0}^{n} x^k = x + x^2 + x^3 + \cdots + x^{n+1}$ (17)
Proof
Subtract (Eq. 17) from (Eq. 15):
$\sum_{k=0}^{n} x^k - x \sum_{k=0}^{n} x^k = 1 - x^{n+1}$ (18)
Finally
$\sum_{k=0}^{n} x^k = \frac{1 - x^{n+1}}{1 - x} = \frac{x^{n+1} - 1}{x - 1}$ (19)
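A quick exact check of the closed form (16) for a few rational values of x ≠ 1:

```python
from fractions import Fraction

def geometric(x, n):
    """Direct summation of the finite geometric series."""
    return sum(x**k for k in range(n + 1))

for x in (Fraction(1, 2), Fraction(3), Fraction(-2, 5)):
    for n in range(20):
        assert geometric(x, n) == (x**(n + 1) - 1) / (x - 1)   # (16)
```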
Infinite Geometric Series
When the summation is infinite and |x| < 1
$\sum_{k=0}^{\infty} x^k = \frac{1}{1 - x}$ (20)
Proof
Given that
$\sum_{k=0}^{\infty} x^k = \lim_{n \to \infty} \sum_{k=0}^{n} x^k = \lim_{n \to \infty} \frac{1 - x^{n+1}}{1 - x} = \frac{1}{1 - x}$ (21)
For more on the series
Please take a look at Cormen’s book, Appendix A.
This is quite useful for Analysis of Algorithms
Important
The most basic way to evaluate a series is to use mathematical induction.
Example
Prove that, for some constant c,
$\sum_{k=0}^{n} 3^k \le c\, 3^n$ (22)
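A sketch of how the induction goes (the constant c = 3/2 is one workable choice, an assumption of this sketch rather than something fixed by the slide):

```latex
% Basis: \sum_{k=0}^{0} 3^k = 1 \le c \cdot 3^0 holds whenever c \ge 1.
% Inductive step: assume \sum_{k=0}^{n} 3^k \le c\,3^n. Then
\begin{align*}
\sum_{k=0}^{n+1} 3^k
  = \sum_{k=0}^{n} 3^k + 3^{n+1}
  \le c\,3^n + 3^{n+1}
  = \Big(\frac{1}{3} + \frac{1}{c}\Big) c\,3^{n+1}
  \le c\,3^{n+1},
\end{align*}
% where the last step needs 1/3 + 1/c \le 1, i.e. c \ge 3/2.
```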
Fast Bounding of Series
A quick upper bound on the arithmetic series
$\sum_{k=1}^{n} k \le \sum_{k=1}^{n} n = n^2$ (23)
In general, for a series $\sum_{k=1}^{n} a_k$
If $a_{\max} = \max_{1 \le k \le n} a_k$, then
$\sum_{k=1}^{n} a_k \le n \cdot a_{\max}$ (24)
Another fast way of bounding a finite series
Suppose that $\frac{a_{k+1}}{a_k} \le r$ for all $k \ge 0$, where $0 < r < 1$.
A More Elegant Method
Thus, we have
$a_k \le a_0 r^k$ (25)
Thus, we can use an infinite decreasing geometric series:
$\sum_{k=0}^{n} a_k \le \sum_{k=0}^{\infty} a_0 r^k = a_0 \sum_{k=0}^{\infty} r^k = a_0 \cdot \frac{1}{1 - r}$
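As a concrete illustration (this particular series is my example, not the slide's): for $a_k = k/3^k$ the ratio $a_{k+1}/a_k = (k+1)/(3k)$ is at most $2/3$ for every $k \ge 1$, so the method bounds the whole series by $a_1/(1 - 2/3) = 1$. A quick exact check:

```python
from fractions import Fraction

# a_k = k / 3^k has a_{k+1}/a_k = (k+1)/(3k) <= 2/3 for every k >= 1,
# so the geometric bound gives  sum_{k>=1} a_k <= a_1 / (1 - 2/3) = 1.
a = lambda k: Fraction(k, 3**k)
assert all(a(k + 1) / a(k) <= Fraction(2, 3) for k in range(1, 60))
partial = sum(a(k) for k in range(1, 60))
assert partial < 1          # the exact sum is 3/4, safely below the bound
```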
Approximation by Integrals
When a summation has the form $\sum_{k=m}^{n} f(k)$, where f(k) is a monotonically increasing function,
$\int_{m-1}^{n} f(x)\, dx \le \sum_{k=m}^{n} f(k) \le \int_{m}^{n+1} f(x)\, dx$ (26)
For example
For the harmonic series, f(x) = 1/x is monotonically decreasing, so the inequalities in (26) reverse. The lower bound gives
$\ln(n + 1) = \int_{1}^{n+1} \frac{1}{x}\, dx \le \sum_{k=1}^{n} \frac{1}{k}$ (27)
In addition
$\sum_{k=2}^{n} \frac{1}{k} \le \int_{1}^{n} \frac{1}{x}\, dx = \ln n$, and hence $\sum_{k=1}^{n} \frac{1}{k} \le 1 + \ln n$ (28)
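A quick numeric sanity check of the bounds (27)-(28) on the harmonic numbers:

```python
import math

# H_n = sum_{k=1}^{n} 1/k should sit between ln(n+1) and 1 + ln(n).
for n in (1, 10, 1000):
    h = sum(1 / k for k in range(1, n + 1))
    assert math.log(n + 1) <= h <= 1 + math.log(n)
```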
Gerolamo Cardano: Gambling out of Darkness
Gambling
Gambling shows humanity's millennia-old interest in quantifying the ideas of probability, but exact mathematical descriptions arose much later.
Gerolamo Cardano (16th century)
While gambling he developed the following rule!!!
Equal conditions
“The most fundamental principle of all in gambling is simply equal conditions, e.g. of opponents, of bystanders, of money, of situation, of the dice box and of the dice itself. To the extent to which you depart from that equity, if it is in your opponent’s favour, you are a fool, and if in your own, you are unjust.”
Gerolamo Cardano’s Definition
Probability
“If therefore, someone should say, I want an ace, a deuce, or a trey, you know that there are 27 favourable throws, and since the circuit is 36, the rest of the throws in which these points will not turn up will be 9; the odds will therefore be 3 to 1.”
Meaning
Probability as a ratio of favorable to all possible outcomes!!! As long as all events are equiprobable...
Thus, we get
$P(\text{all favourable throws}) = \frac{\text{number of all favourable throws}}{\text{number of all throws}}$ (29)
Intuitive Formulation
Empiric Definition
Intuitively, the probability of an event A could be defined as:
$P(A) = \lim_{n \to \infty} \frac{N(A)}{n}$
where N(A) is the number of times that event A happens in n trials.
Example
Imagine you throw three dice; then:
The total number of outcomes is $6^3$.
If we have the event A = {all three numbers are equal}, then |A| = 6.
Then, we have that $P(A) = \frac{6}{6^3} = \frac{1}{36}$.
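A small Monte Carlo sketch of the empiric definition on the three-dice example; the estimate should hover around 1/36 ≈ 0.0278 (trial count and seed are arbitrary choices):

```python
import random

random.seed(0)
trials = 200_000
# N(A)/n for A = "all three dice show the same number"
hits = sum(
    1 for _ in range(trials)
    if len({random.randint(1, 6) for _ in range(3)}) == 1
)
print(hits / trials, 1 / 36)
```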
Axioms of Probability
Axioms
Given a sample space S of events, we have that
1 $0 \le P(A) \le 1$
2 $P(S) = 1$
3 If $A_1, A_2, ..., A_n$ are mutually exclusive events (i.e. $P(A_i \cap A_j) = 0$ for $i \ne j$), then:
$P(A_1 \cup A_2 \cup \cdots \cup A_n) = \sum_{i=1}^{n} P(A_i)$
Set Operations
We are using
Set Notation
Thus
What Operations?
Example
Setup
Throw a biased coin twice:
HH .36  HT .24
TH .24  TT .16
We have the following event
At least one head!!! Can you tell me which outcomes are part of it?
What about this one?
Tail on first toss.
We need to count!!!
We have four main methods of counting
1 Ordered samples of size r with replacement
2 Ordered samples of size r without replacement
3 Unordered samples of size r without replacement
4 Unordered samples of size r with replacement
Ordered samples of size r with replacement
Definition
The number of possible sequences $(a_{i_1}, ..., a_{i_r})$ drawn from n different items is
$n \times n \times \cdots \times n = n^r$ (30)
Example
If you throw three dice, you have $6 \times 6 \times 6 = 216$ outcomes.
Ordered samples of size r without replacement
Definition
The number of possible sequences $(a_{i_1}, ..., a_{i_r})$ drawn from n different items is
$n \times (n - 1) \times \cdots \times (n - (r - 1)) = \frac{n!}{(n - r)!}$ (31)
Example
The number of different numbers that can be formed if no digit can be repeated. For example, with 4 digits and numbers of size 3, there are $4 \times 3 \times 2 = 24$.
Unordered samples of size r without replacement
Definition
Actually, we want the number of possible unordered sets.
However
We have $\frac{n!}{(n-r)!}$ collections where we care about the order. Thus
$\frac{n!/(n-r)!}{r!} = \frac{n!}{r!\,(n-r)!} = \binom{n}{r}$ (32)
Unordered samples of size r with replacement
Definition
We want to find an unordered collection $\{a_{i_1}, ..., a_{i_r}\}$ with replacement.
Use a digit trick for that
Look at the board.
Thus
$\binom{n + r - 1}{r}$ (33)
How?
Change encoding by adding more signs
Imagine all the non-decreasing strings of three numbers over {1, 2, 3}. We have:
Old String    New String
111           1+0, 1+1, 1+2 = 123
112           1+0, 1+1, 2+2 = 124
113           1+0, 1+1, 3+2 = 125
122           1+0, 2+1, 2+2 = 134
123           1+0, 2+1, 3+2 = 135
133           1+0, 3+1, 3+2 = 145
222           2+0, 2+1, 2+2 = 234
223           2+0, 2+1, 3+2 = 235
233           2+0, 3+1, 3+2 = 245
333           3+0, 3+1, 3+2 = 345
This is a bijection with the 3-element subsets of {1, ..., 5}, giving $\binom{5}{3} = 10$.
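The four counting schemes map directly onto itertools; a quick check of formulas (30)-(33), with n = 5 and r = 3 as arbitrary illustrative values:

```python
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)
from math import comb, factorial

n, r = 5, 3
items = range(n)

assert len(list(product(items, repeat=r))) == n**r                       # (30)
assert len(list(permutations(items, r))) == factorial(n) // factorial(n - r)  # (31)
assert len(list(combinations(items, r))) == comb(n, r)                   # (32)
assert len(list(combinations_with_replacement(items, r))) == comb(n + r - 1, r)  # (33)
```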
Independence
Definition
Two events A and B are independent if and only if
P(A, B) = P(A ∩ B) = P(A)P(B)
Do you have any example?
Any idea?
Example
We have two dice
Thus, we have all pairs (i, j) with i, j = 1, 2, ..., 6.
We have the following events
A = {first die shows 1, 2 or 3}
B = {first die shows 3, 4 or 5}
C = {the sum of the two faces is 9}
So, we can do
Look at the board!!! Independence between A, B, and C (a sketch follows below).
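A minimal exhaustive check of pairwise independence for these events; for this particular triple the check reports that no pair satisfies P(X ∩ Y) = P(X)P(Y):

```python
from fractions import Fraction
from itertools import product

space = list(product(range(1, 7), repeat=2))   # all 36 pairs (i, j)

def P(event):
    return Fraction(sum(1 for i, j in space if event(i, j)), len(space))

A = lambda i, j: i in (1, 2, 3)
B = lambda i, j: i in (3, 4, 5)
C = lambda i, j: i + j == 9

for X, Y, name in ((A, B, "A,B"), (A, C, "A,C"), (B, C, "B,C")):
    joint = P(lambda i, j: X(i, j) and Y(i, j))
    print(name, "independent:", joint == P(X) * P(Y))
```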
We can use it to derive the Binomial Distribution
WHAT?????
First, we use a sequence of n Bernoulli Trials
We have this
“Success” has probability p.
“Failure” has probability 1 − p.
Examples
Toss a coin independently n times.
Examine components produced on an assembly line.
Now
We take S = all $2^n$ ordered sequences of length n, with components 0 (failure) and 1 (success).
Thus, taking a sample ω
$\omega = \underbrace{1 1 \cdots 1}_{k}\,\underbrace{0 \cdots 0}_{n-k}$, i.e. k 1’s followed by n − k 0’s.
We have then
$P(\omega) = P(A_1 \cap A_2 \cap \ldots \cap A_k \cap A_{k+1}^c \cap \ldots \cap A_n^c) = P(A_1) P(A_2) \cdots P(A_k) P(A_{k+1}^c) \cdots P(A_n^c) = p^k (1 - p)^{n-k}$
Important
The number of such samples is the number of sets with k elements... or
$\binom{n}{k}$
Did you notice?
We do not care where the 1’s and 0’s are
Thus all these probabilities are equal to $p^k (1 - p)^{n-k}$.
Thus, we are looking to sum the probabilities over all those combinations of 1’s and 0’s:
$\sum_{\omega_k \text{ with } k \text{ 1's}} p(\omega_k)$
Then
$\sum_{\omega_k \text{ with } k \text{ 1's}} p(\omega_k) = \binom{n}{k} p^k (1 - p)^{n-k}$
Proving this is a probability
Sum of these probabilities is equal to 1
$\sum_{k=0}^{n} \binom{n}{k} p^k (1 - p)^{n-k} = (p + (1 - p))^n = 1$
The other requirement is simple
$0 \le \binom{n}{k} p^k (1 - p)^{n-k} \le 1 \quad \forall k$
This is known as
The Binomial probability function!!!
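A short sketch of the binomial probability function and the two checks above (n = 10 and p = 0.3 are arbitrary choices for illustration):

```python
from math import comb

def binom_pmf(k, n, p):
    """P(exactly k successes in n Bernoulli trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]
assert all(0 <= q <= 1 for q in pmf)     # each term is a probability
assert abs(sum(pmf) - 1) < 1e-12         # (p + (1 - p))^n = 1
```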
Different Probabilities
Unconditional
This is the probability of an event A prior to the arrival of any evidence; it is denoted by P(A). For example:
P(Cavity) = 0.1 means that “in the absence of any other information, there is a 10% chance that the patient has a cavity”.
Conditional
This is the probability of an event A given some evidence B; it is denoted P(A|B). For example:
P(Cavity|Toothache) = 0.8 means that “there is an 80% chance that the patient has a cavity given that he has a toothache”.
Posterior Probabilities
Relation between conditional and unconditional probabilities
Conditional probabilities can be defined in terms of unconditional probabilities:
$P(A|B) = \frac{P(A, B)}{P(B)}$
which generalizes to the chain rule P(A, B) = P(B)P(A|B) = P(A)P(B|A).
Law of Total Probability
If $B_1, B_2, ..., B_n$ is a partition into mutually exclusive events and A is an event, then
$P(A) = \sum_{i=1}^{n} P(A \cap B_i)$. A special case: $P(A) = P(A, B) + P(A, \bar{B})$.
In addition, this can be rewritten as $P(A) = \sum_{i=1}^{n} P(A|B_i) P(B_i)$.
Example
Three cards are drawn from a deck
Find the probability of not obtaining a heart.
We have
52 cards, 39 of them not a heart.
Define
$A_i$ = {card i is not a heart}. Then, by the chain rule, $P(A_1 \cap A_2 \cap A_3) = \frac{39}{52} \cdot \frac{38}{51} \cdot \frac{37}{50} \approx 0.41$.
Independence and Conditional
From here, we have that...
If A and B are independent, then P(A|B) = P(A) and P(B|A) = P(B).
Conditional independence
A and B are conditionally independent given C if and only if
$P(A|B, C) = P(A|C)$
Example: P(WetGrass|Season, Rain) = P(WetGrass|Rain).
Bayes Theorem
One Version
P(A|B) =
P(B|A)P(A)
P(B)
Where
P(A) is the prior probability or marginal probability of A. It is
"prior" in the sense that it does not take into account any information
about B.
P(A|B) is the conditional probability of A, given B. It is also called
the posterior probability because it is derived from or depends upon
the specified value of B.
P(B|A) is the conditional probability of B given A. It is also called
the likelihood.
P(B) is the prior or marginal probability of B, and acts as a
normalizing constant.
General Form of the Bayes Rule
Definition
If $A_1, A_2, ..., A_n$ is a partition into mutually exclusive events and B is any event, then:
$P(A_i|B) = \frac{P(B|A_i) P(A_i)}{P(B)} = \frac{P(B|A_i) P(A_i)}{\sum_{j=1}^{n} P(B|A_j) P(A_j)}$
where
$P(B) = \sum_{j=1}^{n} P(B \cap A_j) = \sum_{j=1}^{n} P(B|A_j) P(A_j)$
Example
Setup
Throw two unbiased dice independently. Let
1 A = {sum of the faces = 8}
2 B = {faces are equal}
Then calculate P(B|A)
Look at the board: A = {(2,6), (3,5), (4,4), (5,3), (6,2)} and A ∩ B = {(4,4)}, so P(B|A) = 1/5.
Another Example
We have the following
Two coins are available, one unbiased and the other two-headed.
Assume
That you have a probability of 3/4 of choosing the unbiased one.
Events
A = {head comes up}
B1 = {unbiased coin chosen}
B2 = {biased coin chosen}
If a head comes up, find the probability that the two-headed coin was chosen (a sketch follows below).
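A minimal sketch of the computation this slide asks for, using the general Bayes rule with the partition {B1, B2}:

```python
from fractions import Fraction

# Priors and likelihoods for the two-coin example.
pB1, pB2 = Fraction(3, 4), Fraction(1, 4)       # choose unbiased vs two-headed
pA_B1, pA_B2 = Fraction(1, 2), Fraction(1, 1)   # P(head | coin chosen)

pA = pA_B1 * pB1 + pA_B2 * pB2                  # law of total probability
posterior = pA_B2 * pB2 / pA                    # Bayes rule: P(B2 | A)
print(posterior)                                # 2/5
```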
Random Variables
Definition
In many experiments, it is easier to deal with a summary variable than with the original probability structure.
Example
In an opinion poll, we ask 50 people whether they agree or disagree with a certain issue.
Suppose we record a “1” for agree and a “0” for disagree.
The sample space for this experiment has $2^{50}$ elements. Why?
Suppose we are only interested in the number of people who agree.
Define the variable X = number of “1”s recorded out of 50.
It is easier to deal with this sample space (it has only 51 elements).
Thus...
It is necessary to define a function, the random variable, as follows
X : S → R
Graphically
[Figure: X mapping an event A in the sample space S to a set of values on the real line]
69 / 97
Random Variables
How?
How is the probability function of the random variable defined from the
probability function of the original sample space?
Suppose the sample space is S = {s1, s2, ..., sn}
Suppose the range of the random variable X is {x1, x2, ..., xm}
Then, we observe X = xj if and only if the outcome of the random
experiment is an s ∈ S such that X(s) = xj, or
P(X = xj) = P({s ∈ S | X(s) = xj})
70 / 97
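A minimal sketch of this construction in code (a hypothetical toy example, not from the slides): the pmf of X is obtained by summing the probabilities of all outcomes that map to each value.

# Derive the pmf of a random variable X from a finite sample space.
# Hypothetical toy example: two independent tosses of a coin with P(H) = 0.6.
from itertools import product

p_head = 0.6
# Sample space: all H/T sequences of length 2, each with its probability.
space = {
    "".join(seq): (p_head if seq[0] == "H" else 1 - p_head)
                * (p_head if seq[1] == "H" else 1 - p_head)
    for seq in product("HT", repeat=2)
}

def X(outcome: str) -> int:
    """Random variable: the number of heads in the outcome."""
    return outcome.count("H")

# P(X = x) = sum of P(s) over all outcomes s in S with X(s) = x.
pmf = {}
for s, p in space.items():
    pmf[X(s)] = pmf.get(X(s), 0.0) + p

print(pmf)  # ≈ {0: 0.16, 1: 0.48, 2: 0.36}, up to float rounding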
Example
Setup
Throw a coin 10 times, and let R be the number of heads.
Then
S = all sequences of length 10 with components H and T
We have for
ω =HHHHTTHTTH ⇒ R (ω) = 6
71 / 97
Example
Setup
Let R be the number of heads in two independent tosses of a coin.
The probability of a head is 0.6.
What are the probabilities?
Ω = {HH, HT, TH, TT}
Thus, we can calculate
P (R = 0) , P (R = 1) , P (R = 2)
72 / 97
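Working these out directly from the four outcomes (these values also match the bars on the “Example: Discrete Function” slide below):
P (R = 0) = (0.4)² = 0.16, P (R = 1) = 2(0.6)(0.4) = 0.48, P (R = 2) = (0.6)² = 0.36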
Outline
1 Induction
Basic Induction
Structural Induction
2 Series
Properties
Important Series
Bounding the Series
3 Probability
Intuitive Formulation
Axioms
Independence
Unconditional and Conditional Probability
Posterior (Conditional) Probability
Random Variables
Types of Random Variables
Cumulative Distribution Function
Properties of the PMF/PDF
Expected Value and Variance
Indicator Random Variable
73 / 97
Types of Random Variables
Discrete
A discrete random variable can assume only a countable number of values.
Continuous
A continuous random variable can assume a continuous range of values.
74 / 97
Properties
Probability Mass Function (PMF) and Probability Density Function (PDF)
The pmf/pdf of a random variable X assigns a probability to each
possible value of X.
Properties of the pmf and pdf
Some properties of the pmf:
Σ_x p(x) = 1 and P(a ≤ X ≤ b) = Σ_{k=a}^{b} p(k).
In a similar way for the pdf:
∫_{−∞}^{∞} p(x) dx = 1 and P(a < X < b) = ∫_a^b p(t) dt.
75 / 97
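A quick numerical check of the pmf properties on the two-toss example above:

# Verify Σ_x p(x) = 1 and an interval probability for the two-toss pmf.
pmf = {0: 0.16, 1: 0.48, 2: 0.36}

total = sum(pmf.values())
assert abs(total - 1.0) < 1e-9  # the probabilities sum to 1

# P(0 ≤ R ≤ 1) as a sum of pmf values over the interval.
p_interval = sum(p for r, p in pmf.items() if 0 <= r <= 1)
print(total, p_interval)  # ≈ 1.0 and 0.64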
Outline
1 Induction
Basic Induction
Structural Induction
2 Series
Properties
Important Series
Bounding the Series
3 Probability
Intuitive Formulation
Axioms
Independence
Unconditional and Conditional Probability
Posterior (Conditional) Probability
Random Variables
Types of Random Variables
Cumulative Distribution Function
Properties of the PMF/PDF
Expected Value and Variance
Indicator Random Variable
76 / 97
Cumulative Distribution Function I
Cumulative Distribution Function
With every random variable, we associate a function called the
Cumulative Distribution Function (CDF), which is defined as follows:
F_X(x) = P(X ≤ x)
With properties:
F_X(x) ≥ 0
F_X(x) is a non-decreasing function of x.
Example
If X is discrete, its CDF can be computed as follows:
F_X(x) = P(X ≤ x) = Σ_{x_k ≤ x} P(X = x_k).
77 / 97
Example: Discrete Function
[Figure: bar plot of the pmf with heights 0.16, 0.48, 0.36, alongside the
corresponding staircase CDF rising to 1]
78 / 97
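Reading the staircase CDF off those bar heights (assuming, as the values suggest, that this figure plots the two-toss example with P(H) = 0.6):
F_R(x) = 0 for x < 0, 0.16 for 0 ≤ x < 1, 0.64 for 1 ≤ x < 2, and 1 for x ≥ 2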
Cumulative Distribution Function II
Continuous Function
If X is continuous, its CDF can be computed as follows:
F(x) = ∫_{−∞}^{x} f(t) dt.
Remark
Based on the fundamental theorem of calculus, we have the following
equality:
p(x) = dF(x)/dx
Note
In the continuous case, this particular p(x) is known as the Probability
Density Function (PDF).
79 / 97
Example: Continuous Function
Setup
A number X is chosen at random between a and b.
X has a uniform distribution:
f_X(x) = 1/(b − a) for a ≤ x ≤ b
f_X(x) = 0 for x < a and x > b
We have
F_X(x) = P {X ≤ x} = ∫_{−∞}^{x} f_X(t) dt (34)
P {a < X ≤ b} = ∫_a^b f_X(t) dt (35)
80 / 97
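Carrying out the integral in (34) for the uniform density gives the closed form of the CDF:
F_X(x) = 0 for x < a, F_X(x) = (x − a)/(b − a) for a ≤ x ≤ b, F_X(x) = 1 for x > b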
Graphically
[Figure: the uniform density on [a, b] and its CDF, which rises linearly from 0 to 1]
81 / 97
Outline
1 Induction
Basic Induction
Structural Induction
2 Series
Properties
Important Series
Bounding the Series
3 Probability
Intuitive Formulation
Axioms
Independence
Unconditional and Conditional Probability
Posterior (Conditional) Probability
Random Variables
Types of Random Variables
Cumulative Distribution Function
Properties of the PMF/PDF
Expected Value and Variance
Indicator Random Variable
82 / 97
Properties of the PMF/PDF
Conditional PMF/PDF
We have the conditional pdf:
p(y|x) = p(x, y)/p(x).
From this, we have the general chain rule
p(x1, x2, ..., xn) = p(x1|x2, ..., xn) p(x2|x3, ..., xn) · · · p(xn).
Independence
If X and Y are independent, then:
p(x, y) = p(x) p(y).
83 / 97
Properties of the PMF/PDF
Law of Total Probability
p(y) = Σ_x p(y|x) p(x).
84 / 97
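For instance, in the two-coin setup of the earlier Bayes example (same assumed values: equal priors, two-headed biased coin), the law gives the unconditional probability of a head:
P(H) = P(H|B1) P(B1) + P(H|B2) P(B2) = (1/2)(1/2) + (1)(1/2) = 3/4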
Outline
1 Induction
Basic Induction
Structural Induction
2 Series
Properties
Important Series
Bounding the Series
3 Probability
Intuitive Formulation
Axioms
Independence
Unconditional and Conditional Probability
Posterior (Conditional) Probability
Random Variables
Types of Random Variables
Cumulative Distribution Function
Properties of the PMF/PDF
Expected Value and Variance
Indicator Random Variable
85 / 97
Expectation
Something Notable
You have the random variables R1, R2 representing the length of a call
and how much you pay for an international call:
if 0 ≤ R1 ≤ 3 (minutes), R2 = 10 (cents)
if 3 < R1 ≤ 6 (minutes), R2 = 20 (cents)
if 6 < R1 ≤ 9 (minutes), R2 = 30 (cents)
We have then the probabilities
P {R2 = 10} = 0.6, P {R2 = 20} = 0.25, P {R2 = 30} = 0.15.
If we observe N calls and N is very large
We can say that about 0.6N calls have R2 = 10, for a total cost of
10 × 0.6N = 6N cents.
86 / 97
Expectation
Similarly
{R2 = 20} =⇒ about 0.25N calls and total cost 5N.
{R2 = 30} =⇒ about 0.15N calls and total cost 4.5N.
We have then
The total cost is 6N + 5N + 4.5N = 15.5N, or on average 15.5 cents per
call.
The average
[10 (0.6N) + 20 (0.25N) + 30 (0.15N)] / N = 10 (0.6) + 20 (0.25) + 30 (0.15)
= Σ_y y P {R2 = y}
87 / 97
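A one-line check of this average in code:

# Expected cost per call: Σ_y y · P{R2 = y} over the possible costs.
pmf_cost = {10: 0.60, 20: 0.25, 30: 0.15}
expected_cost = sum(y * p for y, p in pmf_cost.items())
print(expected_cost)  # 15.5 cents per call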
Expected Value
Definition
Discrete random variable X: E(X) = Σ_x x p(x).
Continuous random variable Y: E(Y) = ∫_{−∞}^{+∞} y p(y) dy.
Extension to a function g (x)
E(g(X)) = Σ_x g(x) p(x) (discrete case).
E(g(X)) = ∫_{−∞}^{+∞} g(x) p(x) dx (continuous case).
88 / 97
Linear Property
Linearity property of the Expected Value
E(af (X) + bg(Y )) = aE(f (X)) + bE(g(Y )) (36)
Example for a discrete distribution
E [aX + b] = Σ_x (ax + b) p(x|θ)
= a Σ_x x p(x|θ) + b Σ_x p(x|θ)
= aE [X] + b
89 / 97
Example
Imagine the following
We have the following density, defined piecewise:
1 f (x) = e^{−x} for x ≥ 0
2 f (x) = 0 for x < 0
Find
The expected value E [X].
90 / 97
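The slide leaves the computation as an exercise; a worked solution via integration by parts:
E [X] = ∫₀^∞ x e^{−x} dx = [−x e^{−x}]₀^∞ + ∫₀^∞ e^{−x} dx = 0 + 1 = 1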
Variance
Definition
Var(X) = E((X − µ)²) where µ = E(X)
Standard Deviation
The standard deviation is simply σ = √Var(X).
91 / 97
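As a concrete check, take the two-toss example above (R = number of heads, P(H) = 0.6) and use the equivalent shortcut Var(X) = E[X²] − (E[X])², which follows from expanding the definition and applying linearity:
E [R] = 0 (0.16) + 1 (0.48) + 2 (0.36) = 1.2
E [R²] = 0 (0.16) + 1 (0.48) + 4 (0.36) = 1.92
Var(R) = 1.92 − (1.2)² = 0.48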
Outline
1 Induction
Basic Induction
Structural Induction
2 Series
Properties
Important Series
Bounding the Series
3 Probability
Intuitive Formulation
Axioms
Independence
Unconditional and Conditional Probability
Posterior (Conditional) Probability
Random Variables
Types of Random Variables
Cumulative Distribution Function
Properties of the PMF/PDF
Expected Value and Variance
Indicator Random Variable
92 / 97
Indicator Random Variable
Definition
The indicator of an event A is a random variable IA defined as follows:
I_A(ω) = 1 if ω ∈ A, 0 if ω ∉ A (37)
Thus
IA = 1 if A occurs.
IA = 0 if A does not occur.
Why is this useful?
Because we can count events!!!
93 / 97
Useful Property
The Expected value of an indicator function
E [IA] = 0 × P (IA = 0) + 1 × P (IA = 1) = P (IA = 1) = P (A) (38)
Why is this useful?
Because we can use this to count things when a probability is involved!!!
94 / 97
Method of Indicators
It is possible to see a random variable as a sum of indicator functions
R = I_{A_1} + ... + I_{A_n} (39)
Then
E [R] = Σ_{j=1}^{n} E[I_{A_j}] = Σ_{j=1}^{n} P(A_j) (40)
Hopefully
It is easier to calculate P (Aj) than to evaluate E [R] directly.
95 / 97
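A minimal simulation sketch of the method (a hypothetical Monte Carlo check, not from the slides): the expected number of 1's in n die rolls is a sum of n indicator expectations, each equal to 1/6.

# Indicator method: E[number of 1's in n rolls] = Σ_j P(A_j) = n/6.
import random

n, trials = 12, 200_000
random.seed(0)

total = 0
for _ in range(trials):
    rolls = [random.randint(1, 6) for _ in range(n)]
    # R = I_{A_1} + ... + I_{A_n}, where A_j = {the j-th roll is a 1}.
    total += sum(1 for r in rolls if r == 1)

print(total / trials, n / 6)  # empirical mean vs. the exact value 2.0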
Example
A single unbiased die is tossed independently n times
Let R1 be the number of 1's.
Let R2 be the number of 2's.
Find E [R1R2]
We can express each variable as
R1 = I_{A_1} + ... + I_{A_n}
R2 = I_{B_1} + ... + I_{B_n}
Thus
E [R1R2] = Σ_{i=1}^{n} Σ_{j=1}^{n} E[I_{A_i} I_{B_j}] (41)
96 / 97
Next
Case 1 (i ≠ j): IAi and IBj are independent
E[I_{A_i} I_{B_j}] = E[I_{A_i}] E[I_{B_j}] = P (Ai) P (Bj) = 1/6 × 1/6 = 1/36 (42)
Case 2 (i = j): Ai and Bi are disjoint
Thus, I_{A_i} I_{B_i} = I_{A_i ∩ B_i} = 0.
Thus, summing over the n(n − 1) pairs with i ≠ j,
E [R1R2] = n(n − 1)/36 (43)
97 / 97
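A quick simulation sketch to sanity-check (43) (again a hypothetical Monte Carlo, not from the slides):

# Monte Carlo check of E[R1 · R2] = n(n − 1)/36 for n die rolls.
import random

n, trials = 10, 200_000
random.seed(1)

acc = 0
for _ in range(trials):
    rolls = [random.randint(1, 6) for _ in range(n)]
    r1 = rolls.count(1)  # number of 1's
    r2 = rolls.count(2)  # number of 2's
    acc += r1 * r2

print(acc / trials, n * (n - 1) / 36)  # empirical vs. the exact value 2.5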

More Related Content

Viewers also liked

Artificial Intelligence 06.01 introduction bayesian_networks
Artificial Intelligence 06.01 introduction bayesian_networksArtificial Intelligence 06.01 introduction bayesian_networks
Artificial Intelligence 06.01 introduction bayesian_networksAndres Mendez-Vazquez
 
PROFESSIONAL FINANCIAL MANAGEMENT FOR YOUR BUSINESS
PROFESSIONAL FINANCIAL MANAGEMENT FOR YOUR BUSINESSPROFESSIONAL FINANCIAL MANAGEMENT FOR YOUR BUSINESS
PROFESSIONAL FINANCIAL MANAGEMENT FOR YOUR BUSINESSRobert A Harris
 
(Updated Version) Paper into 3rd International Conference (UN SDSN/ ICSDP) on...
(Updated Version) Paper into 3rd International Conference (UN SDSN/ ICSDP) on...(Updated Version) Paper into 3rd International Conference (UN SDSN/ ICSDP) on...
(Updated Version) Paper into 3rd International Conference (UN SDSN/ ICSDP) on...Energy for One World
 
Proposed Zero Emissions Multi Use Park in San Jose
Proposed Zero Emissions Multi Use Park in San JoseProposed Zero Emissions Multi Use Park in San Jose
Proposed Zero Emissions Multi Use Park in San JoseDean Stanford
 
Biljna i zivotinjska celija
Biljna i zivotinjska celijaBiljna i zivotinjska celija
Biljna i zivotinjska celijaTanja Jovanović
 
Module 5 Engaging Students in Learning – Promoting Reflective Inquiry
Module 5 Engaging Students in Learning – Promoting Reflective InquiryModule 5 Engaging Students in Learning – Promoting Reflective Inquiry
Module 5 Engaging Students in Learning – Promoting Reflective InquiryMELINDA TOMPKINS
 
Neural Networks: Principal Component Analysis (PCA)
Neural Networks: Principal Component Analysis (PCA)Neural Networks: Principal Component Analysis (PCA)
Neural Networks: Principal Component Analysis (PCA)Mostafa G. M. Mostafa
 
Work energy power 2 reading assignment -revision 2
Work energy power 2 reading assignment -revision 2Work energy power 2 reading assignment -revision 2
Work energy power 2 reading assignment -revision 2sashrilisdi
 
Preparation Data Structures 05 chain linear_list
Preparation Data Structures 05 chain linear_listPreparation Data Structures 05 chain linear_list
Preparation Data Structures 05 chain linear_listAndres Mendez-Vazquez
 
Periodic table (1)
Periodic table (1)Periodic table (1)
Periodic table (1)jghopwood
 
Preparation Data Structures 07 stacks
Preparation Data Structures 07 stacksPreparation Data Structures 07 stacks
Preparation Data Structures 07 stacksAndres Mendez-Vazquez
 
03 Probability Review for Analysis of Algorithms
03 Probability Review for Analysis of Algorithms03 Probability Review for Analysis of Algorithms
03 Probability Review for Analysis of AlgorithmsAndres Mendez-Vazquez
 
13 Machine Learning Supervised Decision Trees
13 Machine Learning Supervised Decision Trees13 Machine Learning Supervised Decision Trees
13 Machine Learning Supervised Decision TreesAndres Mendez-Vazquez
 

Viewers also liked (20)

Artificial Intelligence 06.01 introduction bayesian_networks
Artificial Intelligence 06.01 introduction bayesian_networksArtificial Intelligence 06.01 introduction bayesian_networks
Artificial Intelligence 06.01 introduction bayesian_networks
 
Endavour 2
Endavour 2Endavour 2
Endavour 2
 
Wood1
Wood1Wood1
Wood1
 
PROFESSIONAL FINANCIAL MANAGEMENT FOR YOUR BUSINESS
PROFESSIONAL FINANCIAL MANAGEMENT FOR YOUR BUSINESSPROFESSIONAL FINANCIAL MANAGEMENT FOR YOUR BUSINESS
PROFESSIONAL FINANCIAL MANAGEMENT FOR YOUR BUSINESS
 
(Updated Version) Paper into 3rd International Conference (UN SDSN/ ICSDP) on...
(Updated Version) Paper into 3rd International Conference (UN SDSN/ ICSDP) on...(Updated Version) Paper into 3rd International Conference (UN SDSN/ ICSDP) on...
(Updated Version) Paper into 3rd International Conference (UN SDSN/ ICSDP) on...
 
Proposed Zero Emissions Multi Use Park in San Jose
Proposed Zero Emissions Multi Use Park in San JoseProposed Zero Emissions Multi Use Park in San Jose
Proposed Zero Emissions Multi Use Park in San Jose
 
BIO-Data-2016-May
BIO-Data-2016-MayBIO-Data-2016-May
BIO-Data-2016-May
 
Biljna i zivotinjska celija
Biljna i zivotinjska celijaBiljna i zivotinjska celija
Biljna i zivotinjska celija
 
Un monde de plateformes
Un monde de plateformesUn monde de plateformes
Un monde de plateformes
 
Module 5 Engaging Students in Learning – Promoting Reflective Inquiry
Module 5 Engaging Students in Learning – Promoting Reflective InquiryModule 5 Engaging Students in Learning – Promoting Reflective Inquiry
Module 5 Engaging Students in Learning – Promoting Reflective Inquiry
 
Neural Networks: Principal Component Analysis (PCA)
Neural Networks: Principal Component Analysis (PCA)Neural Networks: Principal Component Analysis (PCA)
Neural Networks: Principal Component Analysis (PCA)
 
What's the Matter?
What's the Matter?What's the Matter?
What's the Matter?
 
Waves
WavesWaves
Waves
 
Work energy power 2 reading assignment -revision 2
Work energy power 2 reading assignment -revision 2Work energy power 2 reading assignment -revision 2
Work energy power 2 reading assignment -revision 2
 
Drainage
DrainageDrainage
Drainage
 
Preparation Data Structures 05 chain linear_list
Preparation Data Structures 05 chain linear_listPreparation Data Structures 05 chain linear_list
Preparation Data Structures 05 chain linear_list
 
Periodic table (1)
Periodic table (1)Periodic table (1)
Periodic table (1)
 
Preparation Data Structures 07 stacks
Preparation Data Structures 07 stacksPreparation Data Structures 07 stacks
Preparation Data Structures 07 stacks
 
03 Probability Review for Analysis of Algorithms
03 Probability Review for Analysis of Algorithms03 Probability Review for Analysis of Algorithms
03 Probability Review for Analysis of Algorithms
 
13 Machine Learning Supervised Decision Trees
13 Machine Learning Supervised Decision Trees13 Machine Learning Supervised Decision Trees
13 Machine Learning Supervised Decision Trees
 

Similar to 02 The Math for Analysis of Algorithms

Analysis Of Algorithms Ii
Analysis Of Algorithms IiAnalysis Of Algorithms Ii
Analysis Of Algorithms IiSri Prasanna
 
Math 189 Exam 1 Solutions
Math 189 Exam 1 SolutionsMath 189 Exam 1 Solutions
Math 189 Exam 1 SolutionsTyler Murphy
 
Mathematical Statistics Assignment Help
Mathematical Statistics Assignment HelpMathematical Statistics Assignment Help
Mathematical Statistics Assignment HelpExcel Homework Help
 
Geometry unit 2.5.ppt
Geometry unit 2.5.pptGeometry unit 2.5.ppt
Geometry unit 2.5.pptMark Ryder
 
Discrete-Chapter 05 Inference and Proofs
Discrete-Chapter 05 Inference and ProofsDiscrete-Chapter 05 Inference and Proofs
Discrete-Chapter 05 Inference and ProofsWongyos Keardsri
 
Heuristics for counterexamples to the Agrawal Conjecture
Heuristics for counterexamples to the Agrawal ConjectureHeuristics for counterexamples to the Agrawal Conjecture
Heuristics for counterexamples to the Agrawal ConjectureAmshuman Hegde
 
Discrete Math Ch5 counting + proofs
Discrete Math Ch5 counting + proofsDiscrete Math Ch5 counting + proofs
Discrete Math Ch5 counting + proofsAmr Rashed
 
HEC 222 LECTURE 1.pptx
HEC 222 LECTURE 1.pptxHEC 222 LECTURE 1.pptx
HEC 222 LECTURE 1.pptxRaymond685379
 
Principle of mathematical induction
Principle of mathematical inductionPrinciple of mathematical induction
Principle of mathematical inductionKriti Varshney
 
Study on a class of recursive functions
Study on a class of recursive functionsStudy on a class of recursive functions
Study on a class of recursive functionsLuca Polidori
 
Stochastic Processes Homework Help
Stochastic Processes Homework HelpStochastic Processes Homework Help
Stochastic Processes Homework HelpExcel Homework Help
 
1.3.2 Inductive and Deductive Reasoning
1.3.2 Inductive and Deductive Reasoning1.3.2 Inductive and Deductive Reasoning
1.3.2 Inductive and Deductive Reasoningsmiller5
 
Cahit 8-equitability of coronas cn○k1
Cahit 8-equitability of coronas cn○k1Cahit 8-equitability of coronas cn○k1
Cahit 8-equitability of coronas cn○k1eSAT Publishing House
 

Similar to 02 The Math for Analysis of Algorithms (20)

Induction.pdf
Induction.pdfInduction.pdf
Induction.pdf
 
Analysis Of Algorithms Ii
Analysis Of Algorithms IiAnalysis Of Algorithms Ii
Analysis Of Algorithms Ii
 
Math 189 Exam 1 Solutions
Math 189 Exam 1 SolutionsMath 189 Exam 1 Solutions
Math 189 Exam 1 Solutions
 
Mathematical Statistics Assignment Help
Mathematical Statistics Assignment HelpMathematical Statistics Assignment Help
Mathematical Statistics Assignment Help
 
Mathematical Statistics Assignment Help
Mathematical Statistics Assignment HelpMathematical Statistics Assignment Help
Mathematical Statistics Assignment Help
 
Geometry unit 2.5.ppt
Geometry unit 2.5.pptGeometry unit 2.5.ppt
Geometry unit 2.5.ppt
 
Discrete-Chapter 05 Inference and Proofs
Discrete-Chapter 05 Inference and ProofsDiscrete-Chapter 05 Inference and Proofs
Discrete-Chapter 05 Inference and Proofs
 
Heuristics for counterexamples to the Agrawal Conjecture
Heuristics for counterexamples to the Agrawal ConjectureHeuristics for counterexamples to the Agrawal Conjecture
Heuristics for counterexamples to the Agrawal Conjecture
 
Discrete Math Ch5 counting + proofs
Discrete Math Ch5 counting + proofsDiscrete Math Ch5 counting + proofs
Discrete Math Ch5 counting + proofs
 
Kumera2.docx
Kumera2.docxKumera2.docx
Kumera2.docx
 
HEC 222 LECTURE 1.pptx
HEC 222 LECTURE 1.pptxHEC 222 LECTURE 1.pptx
HEC 222 LECTURE 1.pptx
 
Slides ensae-2016-9
Slides ensae-2016-9Slides ensae-2016-9
Slides ensae-2016-9
 
Chapter5.pptx
Chapter5.pptxChapter5.pptx
Chapter5.pptx
 
Principle of mathematical induction
Principle of mathematical inductionPrinciple of mathematical induction
Principle of mathematical induction
 
1112 ch 11 day 12
1112 ch 11 day 121112 ch 11 day 12
1112 ch 11 day 12
 
Study on a class of recursive functions
Study on a class of recursive functionsStudy on a class of recursive functions
Study on a class of recursive functions
 
Stochastic Processes Homework Help
Stochastic Processes Homework HelpStochastic Processes Homework Help
Stochastic Processes Homework Help
 
1.3.2 Inductive and Deductive Reasoning
1.3.2 Inductive and Deductive Reasoning1.3.2 Inductive and Deductive Reasoning
1.3.2 Inductive and Deductive Reasoning
 
Cahit 8-equitability of coronas cn○k1
Cahit 8-equitability of coronas cn○k1Cahit 8-equitability of coronas cn○k1
Cahit 8-equitability of coronas cn○k1
 
Slides ensae 9
Slides ensae 9Slides ensae 9
Slides ensae 9
 

More from Andres Mendez-Vazquez

01.04 orthonormal basis_eigen_vectors
01.04 orthonormal basis_eigen_vectors01.04 orthonormal basis_eigen_vectors
01.04 orthonormal basis_eigen_vectorsAndres Mendez-Vazquez
 
01.03 squared matrices_and_other_issues
01.03 squared matrices_and_other_issues01.03 squared matrices_and_other_issues
01.03 squared matrices_and_other_issuesAndres Mendez-Vazquez
 
05 backpropagation automatic_differentiation
05 backpropagation automatic_differentiation05 backpropagation automatic_differentiation
05 backpropagation automatic_differentiationAndres Mendez-Vazquez
 
01 Introduction to Neural Networks and Deep Learning
01 Introduction to Neural Networks and Deep Learning01 Introduction to Neural Networks and Deep Learning
01 Introduction to Neural Networks and Deep LearningAndres Mendez-Vazquez
 
25 introduction reinforcement_learning
25 introduction reinforcement_learning25 introduction reinforcement_learning
25 introduction reinforcement_learningAndres Mendez-Vazquez
 
Neural Networks and Deep Learning Syllabus
Neural Networks and Deep Learning SyllabusNeural Networks and Deep Learning Syllabus
Neural Networks and Deep Learning SyllabusAndres Mendez-Vazquez
 
Introduction to artificial_intelligence_syllabus
Introduction to artificial_intelligence_syllabusIntroduction to artificial_intelligence_syllabus
Introduction to artificial_intelligence_syllabusAndres Mendez-Vazquez
 
Ideas about a Bachelor in Machine Learning/Data Sciences
Ideas about a Bachelor in Machine Learning/Data SciencesIdeas about a Bachelor in Machine Learning/Data Sciences
Ideas about a Bachelor in Machine Learning/Data SciencesAndres Mendez-Vazquez
 
20 k-means, k-center, k-meoids and variations
20 k-means, k-center, k-meoids and variations20 k-means, k-center, k-meoids and variations
20 k-means, k-center, k-meoids and variationsAndres Mendez-Vazquez
 

More from Andres Mendez-Vazquez (20)

2.03 bayesian estimation
2.03 bayesian estimation2.03 bayesian estimation
2.03 bayesian estimation
 
05 linear transformations
05 linear transformations05 linear transformations
05 linear transformations
 
01.04 orthonormal basis_eigen_vectors
01.04 orthonormal basis_eigen_vectors01.04 orthonormal basis_eigen_vectors
01.04 orthonormal basis_eigen_vectors
 
01.03 squared matrices_and_other_issues
01.03 squared matrices_and_other_issues01.03 squared matrices_and_other_issues
01.03 squared matrices_and_other_issues
 
01.02 linear equations
01.02 linear equations01.02 linear equations
01.02 linear equations
 
01.01 vector spaces
01.01 vector spaces01.01 vector spaces
01.01 vector spaces
 
06 recurrent neural_networks
06 recurrent neural_networks06 recurrent neural_networks
06 recurrent neural_networks
 
05 backpropagation automatic_differentiation
05 backpropagation automatic_differentiation05 backpropagation automatic_differentiation
05 backpropagation automatic_differentiation
 
Zetta global
Zetta globalZetta global
Zetta global
 
01 Introduction to Neural Networks and Deep Learning
01 Introduction to Neural Networks and Deep Learning01 Introduction to Neural Networks and Deep Learning
01 Introduction to Neural Networks and Deep Learning
 
25 introduction reinforcement_learning
25 introduction reinforcement_learning25 introduction reinforcement_learning
25 introduction reinforcement_learning
 
Neural Networks and Deep Learning Syllabus
Neural Networks and Deep Learning SyllabusNeural Networks and Deep Learning Syllabus
Neural Networks and Deep Learning Syllabus
 
Introduction to artificial_intelligence_syllabus
Introduction to artificial_intelligence_syllabusIntroduction to artificial_intelligence_syllabus
Introduction to artificial_intelligence_syllabus
 
Ideas 09 22_2018
Ideas 09 22_2018Ideas 09 22_2018
Ideas 09 22_2018
 
Ideas about a Bachelor in Machine Learning/Data Sciences
Ideas about a Bachelor in Machine Learning/Data SciencesIdeas about a Bachelor in Machine Learning/Data Sciences
Ideas about a Bachelor in Machine Learning/Data Sciences
 
Analysis of Algorithms Syllabus
Analysis of Algorithms  SyllabusAnalysis of Algorithms  Syllabus
Analysis of Algorithms Syllabus
 
20 k-means, k-center, k-meoids and variations
20 k-means, k-center, k-meoids and variations20 k-means, k-center, k-meoids and variations
20 k-means, k-center, k-meoids and variations
 
18.1 combining models
18.1 combining models18.1 combining models
18.1 combining models
 
17 vapnik chervonenkis dimension
17 vapnik chervonenkis dimension17 vapnik chervonenkis dimension
17 vapnik chervonenkis dimension
 
A basic introduction to learning
A basic introduction to learningA basic introduction to learning
A basic introduction to learning
 

Recently uploaded

chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learningmisbanausheenparvam
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLDeelipZope
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineeringmalavadedarshan25
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidNikhilNagaraju
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girlsssuser7cb4ff
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx959SahilShah
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2RajaP95
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 

Recently uploaded (20)

chaitra-1.pptx fake news detection using machine learning
chaitra-1.pptx  fake news detection using machine learningchaitra-1.pptx  fake news detection using machine learning
chaitra-1.pptx fake news detection using machine learning
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCL
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineering
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girls
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 

02 The Math for Analysis of Algorithms

  • 1. Analysis of Algorithms The Mathematics of Analysis of Algorithms Andres Mendez-Vazquez September 10, 2015 1 / 97
  • 2. Outline 1 Induction Basic Induction Structural Induction 2 Series Properties Important Series Bounding the Series 3 Probability Intuitive Formulation Axioms Independence Unconditional and Conditional Probability Posterior (Conditional) Probability Random Variables Types of Random Variables Cumulative Distributive Function Properties of the PMF/PDF Expected Value and Variance Indicator Random Variable 2 / 97
  • 3. Outline 1 Induction Basic Induction Structural Induction 2 Series Properties Important Series Bounding the Series 3 Probability Intuitive Formulation Axioms Independence Unconditional and Conditional Probability Posterior (Conditional) Probability Random Variables Types of Random Variables Cumulative Distributive Function Properties of the PMF/PDF Expected Value and Variance Indicator Random Variable 3 / 97
  • 4. Basic Induction Principle of Mathematical Induction Let P(n) be a property that is defined for integers n, and let a be a fixed integer. Suppose the following two statements are true 1 P (a) is true. 2 For all integers k ≥ a, if P (k) is true then P (k + 1) is true. Then the statement For all integers n ≥ a, P (n) is true. (1) 4 / 97
  • 5. Basic Induction Principle of Mathematical Induction Let P(n) be a property that is defined for integers n, and let a be a fixed integer. Suppose the following two statements are true 1 P (a) is true. 2 For all integers k ≥ a, if P (k) is true then P (k + 1) is true. Then the statement For all integers n ≥ a, P (n) is true. (1) 4 / 97
  • 6. Basic Induction Principle of Mathematical Induction Let P(n) be a property that is defined for integers n, and let a be a fixed integer. Suppose the following two statements are true 1 P (a) is true. 2 For all integers k ≥ a, if P (k) is true then P (k + 1) is true. Then the statement For all integers n ≥ a, P (n) is true. (1) 4 / 97
  • 7. Basic Induction Principle of Mathematical Induction Let P(n) be a property that is defined for integers n, and let a be a fixed integer. Suppose the following two statements are true 1 P (a) is true. 2 For all integers k ≥ a, if P (k) is true then P (k + 1) is true. Then the statement For all integers n ≥ a, P (n) is true. (1) 4 / 97
  • 8. We have the following method for Mathematical Induction Consider a statement of the form For all integers n ≥ a, P (n) is true. (2) To prove such a statement Perform the following two steps Step 1 (Basis step) Show that P(a) is true - we normally use a = 1. 5 / 97
  • 9. We have the following method for Mathematical Induction Consider a statement of the form For all integers n ≥ a, P (n) is true. (2) To prove such a statement Perform the following two steps Step 1 (Basis step) Show that P(a) is true - we normally use a = 1. 5 / 97
  • 10. We have the following method for Mathematical Induction Consider a statement of the form For all integers n ≥ a, P (n) is true. (2) To prove such a statement Perform the following two steps Step 1 (Basis step) Show that P(a) is true - we normally use a = 1. 5 / 97
  • 11. Then Step 2 (Inductive step) Show that for all integers k ≥ a, if P(k) is true, then P(k + 1) is true. Inductive hypothesis Suppose that P(k) is true, where k is any particular but arbitrarily chosen integer with k ≥ a. Then, you can prove that P (k + 1) is true. 6 / 97
  • 12. Then Step 2 (Inductive step) Show that for all integers k ≥ a, if P(k) is true, then P(k + 1) is true. Inductive hypothesis Suppose that P(k) is true, where k is any particular but arbitrarily chosen integer with k ≥ a. Then, you can prove that P (k + 1) is true. 6 / 97
  • 13. Then Step 2 (Inductive step) Show that for all integers k ≥ a, if P(k) is true, then P(k + 1) is true. Inductive hypothesis Suppose that P(k) is true, where k is any particular but arbitrarily chosen integer with k ≥ a. Then, you can prove that P (k + 1) is true. 6 / 97
  • 14. Example Proposition For all integers n ≥ 8, n¢ can be obtained using 3¢ and 5¢ coins. Show that P(8) is true P(8) is true because 8¢ can be obtained using one coin 3¢ and another coin of 5¢. Show that for all integers k ≥ 8, if P(k) is true then P(k + 1) is also true We can do the following. 7 / 97
  • 15. Example Proposition For all integers n ≥ 8, n¢ can be obtained using 3¢ and 5¢ coins. Show that P(8) is true P(8) is true because 8¢ can be obtained using one coin 3¢ and another coin of 5¢. Show that for all integers k ≥ 8, if P(k) is true then P(k + 1) is also true We can do the following. 7 / 97
  • 16. Example Proposition For all integers n ≥ 8, n¢ can be obtained using 3¢ and 5¢ coins. Show that P(8) is true P(8) is true because 8¢ can be obtained using one coin 3¢ and another coin of 5¢. Show that for all integers k ≥ 8, if P(k) is true then P(k + 1) is also true We can do the following. 7 / 97
Example
Inductive hypothesis
Suppose that k is any integer with k ≥ 8 such that k¢ can be obtained using 3¢ and 5¢ coins.
Case 1 - There is a 5¢ coin among those making the change for k¢
In this case, replace the 5¢ coin by two 3¢ coins. Thus, we get the change for (k + 1)¢.
Case 2 - There is no 5¢ coin among those making the change for k¢
Because k ≥ 8, at least three 3¢ coins must have been used. Remove three of them and replace them with two 5¢ coins. The result will be (k + 1)¢.
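A minimal sketch of this argument in Python (the function name and the loop limit are this note's assumptions, not part of the slides): starting from the base representation of 8¢, it applies the two replacement rules from the inductive step to build a representation for every n up to a chosen bound.

    def change(n):
        # Returns (threes, fives) with 3*threes + 5*fives == n, for n >= 8.
        assert n >= 8
        threes, fives = 1, 1  # base case: 8 = 3 + 5
        for _ in range(8, n):
            if fives > 0:      # Case 1: swap one 5c for two 3c (+1 cent)
                fives -= 1
                threes += 2
            else:              # Case 2: no 5c left, so threes >= 3; swap three 3c for two 5c
                threes -= 3
                fives += 2
        return threes, fives

    # Sanity check the construction for a range of values.
    for n in range(8, 100):
        t, f = change(n)
        assert 3 * t + 5 * f == n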
We can go further...
Recursively defined sets and structures
Assume S is a set. We use two steps to define the elements of S.
Basis Step
Specify an initial collection of elements.
Recursive Step
Give a rule for forming new elements from those already known to be in S.
Thus
Let S be a set that has been defined recursively
To prove that every object in S satisfies a certain property:
1 Show that each object in the BASE for S satisfies the property.
2 Show that for each rule in the RECURSION, if the rule is applied to objects in S that satisfy the property, then the objects defined by the rule also satisfy the property.
Example: Binary trees recursive definition
Recall that the set B of binary trees over an alphabet A is defined as follows
1 Basis: the empty tree $\Lambda \in B$.
2 Recursive definition: If $L, R \in B$ and $x \in A$ then $\langle L, x, R \rangle \in B$.
Now define the function $f : B \to \mathbb{N}$ by
$f(\Lambda) = 0$
$f(\langle L, x, R \rangle) = 1$ if $L = R = \Lambda$, and $f(\langle L, x, R \rangle) = f(L) + f(R)$ otherwise.
Theorem
Let T in B be a binary tree. Then f(T) yields the number of leaves of T.
Proof
By structural induction on T
Basis: The empty tree has no leaves, so $f(\Lambda) = 0$ is correct.
Induction
Let L, R be trees in B, $x \in A$.
Now suppose that f(L) and f(R) denote the number of leaves of L and R, respectively.
Proof
Case 1
If $L = R = \Lambda$, then $\langle L, x, R \rangle = \langle \Lambda, x, \Lambda \rangle$ has one leaf, namely x, so $f(\langle \Lambda, x, \Lambda \rangle) = 1$ is correct.
Case 2
If L and R are not both empty, then the number of leaves of the tree $\langle L, x, R \rangle$ is equal to the number of leaves of L plus the number of leaves of R:
$f(\langle L, x, R \rangle) = f(L) + f(R)$ (3)
Q.E.D.
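A small sketch of this definition in Python (the tuple encoding below is an assumption of this note): the empty tree is None and a nonempty tree is a triple (L, x, R), mirroring the recursive definition of f.

    def leaves(T):
        # f(empty) = 0: the empty tree has no leaves.
        if T is None:
            return 0
        L, x, R = T
        # f(<empty, x, empty>) = 1: a node with two empty subtrees is a leaf.
        if L is None and R is None:
            return 1
        # Otherwise f(<L, x, R>) = f(L) + f(R).
        return leaves(L) + leaves(R)

    # The tree <<_, 'b', _>, 'a', <_, 'c', _>> has two leaves: 'b' and 'c'.
    print(leaves(((None, 'b', None), 'a', (None, 'c', None))))  # 2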
Introduction
When an algorithm contains an iterative control construct such as a while or for loop
It is possible to express its running time as a series:
$\sum_{j=1}^{n} j$ (4)
Thus, what is the objective of using these series?
To be able to find bounds for the complexities of algorithms.
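For instance (an illustrative sketch, not from the slides), the inner loop below runs j times on iteration j of the outer loop, so the total number of executions of its body is exactly the series above:

    count = 0
    n = 10
    for j in range(1, n + 1):       # outer loop: j = 1, ..., n
        for _ in range(j):          # inner loop body runs j times
            count += 1
    print(count, n * (n + 1) // 2)  # both print 55: count equals sum_{j=1}^{n} j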
Definition of Series
Definition
Given a sequence $a_1, a_2, ..., a_n$ of numbers, where n is a nonnegative integer, we can say that
$a_1 + a_2 + ... + a_n = \sum_{k=1}^{n} a_k$ (5)
In the case of infinite series
$a_1 + a_2 + ... = \sum_{k=1}^{\infty} a_k = \lim_{n \to \infty} \sum_{k=1}^{n} a_k$ (6)
Here, we have concepts of convergence and divergence that I will allow you to study.
Linearity
For any real number c and any finite sequences $a_1, a_2, ..., a_n$ and $b_1, b_2, ..., b_n$
$\sum_{k=1}^{n} (c a_k + b_k) = c \sum_{k=1}^{n} a_k + \sum_{k=1}^{n} b_k$ (7)
For More
Please take a look at page 1146 of Cormen’s book.
Telescopic Sum
Observation
In certain sums each term is a difference of two quantities. You sometimes see that all the terms cancel except the first and the last.
Example
Imagine that each term in the sum has the following structure:
$\frac{1}{k} - \frac{1}{k+1} = \frac{(k+1) - k}{k(k+1)} = \frac{1}{k(k+1)}$ (8)
What is the result of the following sum?
$\sum_{k=1}^{n} \frac{1}{k(k+1)}$ (9)
Telescopic Sum
Definition
For any sequence $a_0, a_1, ..., a_n$,
$\sum_{k=1}^{n} (a_k - a_{k-1}) = a_n - a_0$ (10)
Similarly
$\sum_{k=0}^{n-1} (a_k - a_{k+1}) = a_0 - a_n$ (11)
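Applying (11) with $a_k = \frac{1}{k+1}$, the sum in (9) collapses to $1 - \frac{1}{n+1} = \frac{n}{n+1}$. A quick numerical check (an illustrative sketch):

    from fractions import Fraction

    n = 20
    s = sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))
    print(s, Fraction(n, n + 1))  # both are 20/21: the sum telescopes to n/(n+1)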
Arithmetic series
Summing over the set {1, 2, 3, ..., n}
We can prove that
$\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$ (12)
Proof
Basis: If n = 1 then $\frac{1 \times 2}{2} = 1$.
Proof
Induction, assume that it is true for n:
$\sum_{i=1}^{n+1} i = \sum_{i=1}^{n} i + (n + 1) = \frac{n(n+1)}{2} + n + 1 = \frac{n(n+1) + 2(n+1)}{2} = \frac{(n+1)(n+2)}{2}$
Series of Squares and Cubes
Series of Squares
$\sum_{k=0}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}$ (13)
Series of Cubes
$\sum_{k=0}^{n} k^3 = \frac{n^2(n+1)^2}{4}$ (14)
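A quick sanity check of (12)-(14) (an illustrative sketch):

    n = 100
    assert sum(range(1, n + 1)) == n * (n + 1) // 2
    assert sum(k**2 for k in range(n + 1)) == n * (n + 1) * (2 * n + 1) // 6
    assert sum(k**3 for k in range(n + 1)) == n**2 * (n + 1)**2 // 4
    print("formulas (12)-(14) verified for n =", n)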
Geometric Series
Definition
For a real $x \neq 1$, we have that
$\sum_{k=0}^{n} x^k = 1 + x + x^2 + ... + x^n$ (15)
It is called the geometric series.
It is possible to prove that
$\sum_{k=0}^{n} x^k = \frac{x^{n+1} - 1}{x - 1}$ (16)
Proof
Now multiply both sides of (Eq. 15) by x:
$x \sum_{k=0}^{n} x^k = x + x^2 + x^3 + ... + x^{n+1}$ (17)
Proof
Subtract (Eq. 17) from (Eq. 15):
$\sum_{k=0}^{n} x^k - x \sum_{k=0}^{n} x^k = 1 - x^{n+1}$ (18)
Finally
$\sum_{k=0}^{n} x^k = \frac{1 - x^{n+1}}{1 - x} = \frac{x^{n+1} - 1}{x - 1}$ (19)
Infinite Geometric Series
When the summation is infinite and |x| < 1
$\sum_{k=0}^{\infty} x^k = \frac{1}{1 - x}$ (20)
Proof
Given that
$\sum_{k=0}^{\infty} x^k = \lim_{n \to \infty} \sum_{k=0}^{n} x^k = \lim_{n \to \infty} \frac{1 - x^{n+1}}{1 - x} = \frac{1}{1 - x}$ (21)
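Numerically, the partial sums in (21) approach $\frac{1}{1-x}$ quickly (an illustrative sketch with x = 1/2):

    x = 0.5
    partial = 0.0
    for n in range(10):
        partial += x**n          # add x^n to the partial sum
    print(partial, 1 / (1 - x))  # 1.998... vs the limit 2.0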
For more on the series
Please take a look at Cormen’s book - Appendix A.
This is quite useful for Analysis of Algorithms
Important
The most basic way to evaluate a series is to use mathematical induction.
Example
Prove that, for some constant c > 0,
$\sum_{k=0}^{n} 3^k \le c \cdot 3^n$ (22)
(By (16), the sum equals $\frac{3^{n+1} - 1}{2} \le \frac{3}{2} \cdot 3^n$, so any $c \ge \frac{3}{2}$ works.)
Fast Bounding of Series
A quick upper bound on the arithmetic series
$\sum_{k=1}^{n} k \le \sum_{k=1}^{n} n = n^2$ (23)
In general, for a series $\sum_{k=1}^{n} a_k$
If $a_{max} = \max_{1 \le k \le n} a_k$, then
$\sum_{k=1}^{n} a_k \le n \cdot a_{max}$ (24)
Another fast way of bounding a finite series
Suppose that $\frac{a_{k+1}}{a_k} \le r$ for all $k \ge 0$, where $0 < r < 1$.
A More Elegant Method
Thus, we have
$a_k \le a_0 r^k$ (25)
Thus, we can use an infinite decreasing geometric series:
$\sum_{k=0}^{n} a_k \le \sum_{k=0}^{\infty} a_0 r^k = a_0 \sum_{k=0}^{\infty} r^k = a_0 \cdot \frac{1}{1 - r}$
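As a worked instance (an illustrative choice of series, not from the slides): for $a_k = \frac{k+1}{3^k}$ the ratio is $\frac{a_{k+1}}{a_k} = \frac{k+2}{3(k+1)} \le \frac{2}{3}$, so with $a_0 = 1$ and $r = \frac{2}{3}$ the method bounds the sum by $\frac{1}{1 - 2/3} = 3$:

    s = sum((k + 1) / 3**k for k in range(100))  # converges to 9/4 = 2.25
    print(s, "<=", 1 / (1 - 2 / 3))              # 2.25 <= 3.0, as the bound predicts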
Approximation by integrals
When a summation has the form $\sum_{k=m}^{n} f(k)$, where f(k) is a monotonically increasing function:
$\int_{m-1}^{n} f(x)\, dx \le \sum_{k=m}^{n} f(k) \le \int_{m}^{n+1} f(x)\, dx$ (26)
For example
Since $f(x) = \frac{1}{x}$ is monotonically decreasing, the inequalities in (26) are reversed. This gives
$\ln(n+1) = \int_{1}^{n+1} \frac{1}{x}\, dx \le \sum_{k=1}^{n} \frac{1}{k}$ (27)
In addition, bounding every term but the first,
$\sum_{k=2}^{n} \frac{1}{k} \le \int_{1}^{n} \frac{1}{x}\, dx = \ln n$, so $\sum_{k=1}^{n} \frac{1}{k} \le \ln n + 1$ (28)
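A quick check of (27)-(28) (an illustrative sketch):

    import math

    n = 1000
    H = sum(1 / k for k in range(1, n + 1))  # harmonic number H_n
    print(math.log(n + 1), "<=", H, "<=", math.log(n) + 1)
    # 6.908... <= 7.485... <= 7.907...: both bounds hold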
Gerolamo Cardano: Gambling out of Darkness
Gambling
Gambling shows humanity's millennia-old interest in quantifying the ideas of probability, but exact mathematical descriptions arose much later.
Gerolamo Cardano (16th century)
While gambling he developed the following rule!!!
Equal conditions
“The most fundamental principle of all in gambling is simply equal conditions, e.g. of opponents, of bystanders, of money, of situation, of the dice box and of the dice itself. To the extent to which you depart from that equity, if it is in your opponent’s favour, you are a fool, and if in your own, you are unjust.”
Gerolamo Cardano’s Definition
Probability
“If therefore, someone should say, I want an ace, a deuce, or a trey, you know that there are 27 favourable throws, and since the circuit is 36, the rest of the throws in which these points will not turn up will be 9; the odds will therefore be 3 to 1.”
Meaning
Probability as a ratio of favorable to all possible outcomes!!! As long as all events are equiprobable...
Thus, we get
$P(\text{All favourable throws}) = \frac{\text{Number of all favourable throws}}{\text{Number of all throws}}$ (29)
Intuitive Formulation
Empiric Definition
Intuitively, the probability of an event A could be defined as:
$P(A) = \lim_{n \to \infty} \frac{N(A)}{n}$
where N(A) is the number of times that event A happens in n trials.
Example
Imagine you have three dice. Then:
The total number of outcomes is $6^3$.
If we have the event A = all numbers are equal, then |A| = 6.
Then, we have that $P(A) = \frac{6}{6^3} = \frac{1}{36}$.
Axioms of Probability
Axioms
Given a sample space S of events, we have that
1 $0 \le P(A) \le 1$
2 $P(S) = 1$
3 If $A_1, A_2, ..., A_n$ are mutually exclusive events (i.e. $P(A_i \cap A_j) = 0$), then:
$P(A_1 \cup A_2 \cup ... \cup A_n) = \sum_{i=1}^{n} P(A_i)$
Set Operations
We are using set notation. Thus: what operations can we use on events?
Example
Setup: Throw a biased coin twice.
HH .36 | HT .24 | TH .24 | TT .16
We have the following event
At least one head!!! Can you tell me which outcomes are part of it?
What about this one? Tail on first toss.
We need to count!!!
We have four main methods of counting
1 Ordered samples of size r with replacement
2 Ordered samples of size r without replacement
3 Unordered samples of size r without replacement
4 Unordered samples of size r with replacement
Ordered samples of size r with replacement
Definition
The number of possible sequences $(a_{i_1}, ..., a_{i_r})$ drawn from n different items is
$n \times n \times ... \times n = n^r$ (30)
Example
If you throw three dice you have $6 \times 6 \times 6 = 216$ outcomes.
Ordered samples of size r without replacement
Definition
The number of possible sequences $(a_{i_1}, ..., a_{i_r})$ drawn from n different items is
$n \times (n-1) \times ... \times (n - (r - 1)) = \frac{n!}{(n-r)!}$ (31)
Example
The number of different numbers that can be formed if no digit can be repeated. For example, if you have 4 digits and you want numbers of size 3: $4 \times 3 \times 2 = 24$.
Unordered samples of size r without replacement
Definition
Actually, we want the number of possible unordered sets.
However
We have $\frac{n!}{(n-r)!}$ collections where we care about the order. Thus
$\frac{n!/(n-r)!}{r!} = \frac{n!}{r!(n-r)!} = \binom{n}{r}$ (32)
Unordered samples of size r with replacement
Definition
We want to find an unordered set $\{a_{i_1}, ..., a_{i_r}\}$ with replacement.
Use a digit trick for that
Look at the Board.
Thus
$\binom{n + r - 1}{r}$ (33)
How? Change encoding by adding more signs
Imagine all the strings of three numbers with {1, 2, 3}. We have:
Old String -> New String
111 -> 1+0,1+1,1+2 = 123
112 -> 1+0,1+1,2+2 = 124
113 -> 1+0,1+1,3+2 = 125
122 -> 1+0,2+1,2+2 = 134
123 -> 1+0,2+1,3+2 = 135
133 -> 1+0,3+1,3+2 = 145
222 -> 2+0,2+1,2+2 = 234
223 -> 2+0,2+1,3+2 = 235
233 -> 2+0,3+1,3+2 = 245
333 -> 3+0,3+1,3+2 = 345
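A quick check of all four counting formulas with Python's standard library (an illustrative sketch with n = 3 symbols and samples of size r = 3):

    from itertools import (product, permutations, combinations,
                           combinations_with_replacement)
    from math import comb, factorial

    n, r = 3, 3
    items = [1, 2, 3]
    assert len(list(product(items, repeat=r))) == n**r                # (30): 27
    assert len(list(permutations(items, r))) == factorial(n) // factorial(n - r)  # (31): 6
    assert len(list(combinations(items, r))) == comb(n, r)            # (32): 1
    assert len(list(combinations_with_replacement(items, r))) == comb(n + r - 1, r)  # (33): 10, matching the table above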
Independence
Definition
Two events A and B are independent if and only if
$P(A, B) = P(A \cap B) = P(A)P(B)$
Do you have any example? Any idea?
Example
We have two dice
Thus, we have all pairs (i, j) such that i, j = 1, 2, 3, ..., 6.
We have the following events
A = {First die 1, 2 or 3}
B = {First die 3, 4 or 5}
C = {The sum of the two faces is 9}
So, we can do
Look at the board!!! Independence between A, B, C.
We can use it to derive the Binomial Distribution
WHAT?????
First, we use a sequence of n Bernoulli Trials
We have this
“Success” has a probability p.
“Failure” has a probability 1 − p.
Examples
Toss a coin independently n times.
Examine components produced on an assembly line.
Now
We take S = all $2^n$ ordered sequences of length n, with components 0 (failure) and 1 (success).
Thus, taking a sample ω
$\omega = \underbrace{11 \cdots 1}_{k}\underbrace{0 \cdots 0}_{n-k}$, i.e. k 1’s followed by n − k 0’s.
We have then
$P(\omega) = P(A_1 \cap A_2 \cap ... \cap A_k \cap A_{k+1}^c \cap ... \cap A_n^c) = P(A_1)P(A_2) \cdots P(A_k)P(A_{k+1}^c) \cdots P(A_n^c) = p^k (1-p)^{n-k}$
Important
The number of such samples is the number of sets with k elements.... or... $\binom{n}{k}$
Did you notice?
We do not care where the 1’s and 0’s are
Thus all the probabilities are equal to $p^k (1-p)^{n-k}$.
Thus, we are looking to sum the probabilities over all those combinations of 1’s and 0’s:
$\sum_{\omega_k \text{ with } k \text{ 1's}} p(\omega_k)$
Then
$\sum_{\omega_k \text{ with } k \text{ 1's}} p(\omega_k) = \binom{n}{k} p^k (1-p)^{n-k}$
Proving this is a probability
Sum of these probabilities is equal to 1
$\sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k} = (p + (1 - p))^n = 1$
The other is simple
$0 \le \binom{n}{k} p^k (1-p)^{n-k} \le 1 \quad \forall k$
This is known as
The Binomial probability function!!!
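A minimal sketch of this probability function in Python (the function name is this note's, not the slides'):

    from math import comb

    def binomial_pmf(k, n, p):
        # P(exactly k successes in n Bernoulli trials with success probability p)
        return comb(n, k) * p**k * (1 - p)**(n - k)

    n, p = 10, 0.3
    print(sum(binomial_pmf(k, n, p) for k in range(n + 1)))  # 1.0 (up to rounding)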
Different Probabilities
Unconditional
This is the probability of an event A prior to the arrival of any evidence; it is denoted by P(A).
For example: P(Cavity) = 0.1 means that “in the absence of any other information, there is a 10% chance that the patient has a cavity”.
Conditional
This is the probability of an event A given some evidence B; it is denoted P(A|B).
For example: P(Cavity|Toothache) = 0.8 means that “there is an 80% chance that the patient has a cavity given that he has a toothache”.
Posterior Probabilities
Relation between conditional and unconditional probabilities
Conditional probabilities can be defined in terms of unconditional probabilities:
$P(A|B) = \frac{P(A, B)}{P(B)}$
which generalizes to the chain rule $P(A, B) = P(B)P(A|B) = P(A)P(B|A)$.
Law of Total Probability
If $B_1, B_2, ..., B_n$ is a partition of mutually exclusive events and A is an event, then
$P(A) = \sum_{i=1}^{n} P(A \cap B_i)$
A special case: $P(A) = P(A, B) + P(A, B^c)$.
In addition, this can be rewritten as $P(A) = \sum_{i=1}^{n} P(A|B_i)P(B_i)$.
Example
Three cards are drawn from a deck
Find the probability of not obtaining a heart.
We have 52 cards
39 of them are not hearts.
Define $A_i$ = {Card i is not a heart}
Then?
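By the chain rule, $P(A_1 \cap A_2 \cap A_3) = P(A_1)P(A_2|A_1)P(A_3|A_1, A_2) = \frac{39}{52} \cdot \frac{38}{51} \cdot \frac{37}{50}$. A quick check (an illustrative sketch):

    from fractions import Fraction

    p = Fraction(39, 52) * Fraction(38, 51) * Fraction(37, 50)
    print(p, float(p))  # 703/1700 ~ 0.4135: probability of no heart in 3 cards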
Independence and Conditional
From here, we have that, if A and B are independent...
$P(A|B) = P(A)$ and $P(B|A) = P(B)$.
Conditional independence
A and B are conditionally independent given C if and only if
$P(A|B, C) = P(A|C)$
Example: P(WetGrass|Season, Rain) = P(WetGrass|Rain).
Bayes Theorem
One Version
$P(A|B) = \frac{P(B|A)P(A)}{P(B)}$
Where
P(A) is the prior probability or marginal probability of A. It is "prior" in the sense that it does not take into account any information about B.
P(A|B) is the conditional probability of A, given B. It is also called the posterior probability because it is derived from or depends upon the specified value of B.
P(B|A) is the conditional probability of B given A. It is also called the likelihood.
P(B) is the prior or marginal probability of B, and acts as a normalizing constant.
General Form of the Bayes Rule
Definition
If $A_1, A_2, ..., A_n$ is a partition of mutually exclusive events and B any event, then:
$P(A_i|B) = \frac{P(B|A_i)P(A_i)}{P(B)} = \frac{P(B|A_i)P(A_i)}{\sum_{i=1}^{n} P(B|A_i)P(A_i)}$
where
$P(B) = \sum_{i=1}^{n} P(B \cap A_i) = \sum_{i=1}^{n} P(B|A_i)P(A_i)$
Example
Setup
Throw two unbiased dice independently. Let
1 A = {sum of the faces = 8}
2 B = {faces are equal}
Then calculate P(B|A)
Look at the board.
Another Example
We have the following
Two coins are available, one unbiased and the other two-headed.
Assume
That you have a probability of $\frac{3}{4}$ of choosing the unbiased one.
Events
A = {head comes up}
B1 = {Unbiased coin chosen}
B2 = {Two-headed coin chosen}
If a head comes up, find the probability that the two-headed coin was chosen.
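Bayes' rule gives $P(B_2|A) = \frac{P(A|B_2)P(B_2)}{P(A|B_1)P(B_1) + P(A|B_2)P(B_2)} = \frac{1 \cdot 1/4}{1/2 \cdot 3/4 + 1 \cdot 1/4} = \frac{2}{5}$. Checking with a sketch:

    from fractions import Fraction

    p_b1, p_b2 = Fraction(3, 4), Fraction(1, 4)    # prior on which coin is chosen
    p_a_b1, p_a_b2 = Fraction(1, 2), Fraction(1)   # P(head | coin)
    p_a = p_a_b1 * p_b1 + p_a_b2 * p_b2            # total probability of a head
    print(p_a_b2 * p_b2 / p_a)                     # 2/5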
Random Variables
Definition
In many experiments, it is easier to deal with a summary variable than with the original probability structure.
Example
In an opinion poll, we ask 50 people whether they agree or disagree with a certain issue.
Suppose we record a “1” for agree and “0” for disagree.
The sample space for this experiment has $2^{50}$ elements. Why?
Suppose we are only interested in the number of people who agree.
Define the variable X = number of “1”s recorded out of 50.
It is easier to deal with this sample space (it has only 51 elements).
Thus...
It is necessary to define a function, the random variable, as follows:
$X : S \to \mathbb{R}$
Graphically: [diagram: X maps outcomes in the sample space S to the real line]
Random Variables
How?
How is the probability function of the random variable defined from the probability function of the original sample space?
Suppose the sample space is $S = \{s_1, s_2, ..., s_n\}$.
Suppose the range of the random variable X is $\{x_1, x_2, ..., x_m\}$.
Then, we observe $X = x_i$ if and only if the outcome of the random experiment is an $s_j \in S$ such that $X(s_j) = x_i$, or
$P(X = x_i) = P(\{s_j \in S \mid X(s_j) = x_i\})$
Example
Setup
Throw a coin 10 times, and let R be the number of heads. Then
S = all sequences of length 10 with components H and T.
We have, for ω = HHHHTTHTTH, R(ω) = 6.
Example
Setup
Let R be the number of heads in two independent tosses of a coin, where the probability of a head is .6.
What are the probabilities?
Ω = {HH, HT, TH, TT}
Thus, we can calculate P(R = 0), P(R = 1), P(R = 2).
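By independence, $P(R = 0) = (.4)^2 = .16$, $P(R = 1) = 2(.6)(.4) = .48$, $P(R = 2) = (.6)^2 = .36$. A quick enumeration (an illustrative sketch):

    from itertools import product

    p = {'H': 0.6, 'T': 0.4}
    dist = {0: 0.0, 1: 0.0, 2: 0.0}
    for omega in product('HT', repeat=2):       # the four outcomes in Omega
        r = omega.count('H')                    # R(omega) = number of heads
        dist[r] += p[omega[0]] * p[omega[1]]    # independent tosses multiply
    print(dist)  # {0: 0.16, 1: 0.48, 2: 0.36}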
Types of Random Variables
Discrete
A discrete random variable can assume only a countable number of values.
Continuous
A continuous random variable can assume a continuous range of values.
Properties
Probability Mass Function (PMF) and Probability Density Function (PDF)
The pmf/pdf of a random variable X assigns a probability to each possible value of X.
Properties of the pmf and pdf
Some properties of the pmf: $\sum_x p(x) = 1$ and $P(a \le X \le b) = \sum_{k=a}^{b} p(k)$.
In a similar way for the pdf: $\int_{-\infty}^{\infty} p(x)\, dx = 1$ and $P(a < X < b) = \int_a^b p(t)\, dt$.
Cumulative Distribution Function I
Cumulative Distribution Function
With every random variable, we associate a function called the Cumulative Distribution Function (CDF), which is defined as follows:
F_X(x) = P(X ≤ x)
With properties:
F_X(x) ≥ 0
F_X(x) is a non-decreasing function of x.
Example
If X is discrete with values x_1, x_2, ..., its CDF can be computed as follows:
F_X(x) = P(X ≤ x) = Σ_{x_k ≤ x} P(X = x_k).
77 / 97
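A minimal sketch of the discrete case, again using the two-toss pmf (the helper name cdf is illustrative):

pmf = {0: 0.16, 1: 0.48, 2: 0.36}

def cdf(x, pmf):
    # Accumulate the mass of all values x_k not exceeding x.
    return sum(p for xk, p in pmf.items() if xk <= x)

print(cdf(-1, pmf), cdf(0, pmf), cdf(1.5, pmf), cdf(2, pmf))  # 0 0.16 0.64 1.0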
Cumulative Distribution Function II
Continuous Case
If X is continuous with density f, its CDF can be computed as follows:
F(x) = ∫_{-∞}^{x} f(t) dt.
Remark
Based on the fundamental theorem of calculus, we have the following equality:
p(x) = dF/dx (x)
Note
In the continuous case, this p(x) is the Probability Density Function (PDF); in the discrete case, the analogous object is the Probability Mass Function (PMF).
79 / 97
Example: Continuous Function
Setup
A number X is chosen at random between a and b.
X has a uniform distribution:
f_X(x) = 1/(b − a) for a ≤ x ≤ b
f_X(x) = 0 for x < a and x > b
We have
F_X(x) = P{X ≤ x} = ∫_{-∞}^{x} f_X(t) dt (34)
P{a < X ≤ b} = ∫_{a}^{b} f_X(t) dt (35)
Carrying out the integral in (34) gives F_X(x) = (x − a)/(b − a) for a ≤ x ≤ b.
80 / 97
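A numeric sketch of this example: approximating the integral in (34) by a Riemann sum and comparing against the closed form (x − a)/(b − a). The endpoints a = 2, b = 5 are illustrative choices.

a, b = 2.0, 5.0  # illustrative endpoints

def f_X(t):
    # Uniform density on [a, b].
    return 1.0 / (b - a) if a <= t <= b else 0.0

def F_X(x, steps=100000):
    # Midpoint Riemann sum from a to x; the density is 0 below a,
    # so this equals the integral from -infinity to x.
    if x <= a:
        return 0.0
    h = (x - a) / steps
    return sum(f_X(a + (i + 0.5) * h) for i in range(steps)) * h

x = 3.5
print(F_X(x), (x - a) / (b - a))  # both approximately 0.5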
Properties of the PMF/PDF
Conditional PMF/PDF
We have the conditional pdf:
p(y|x) = p(x, y) / p(x).
From this, we have the general chain rule:
p(x_1, x_2, ..., x_n) = p(x_1 | x_2, ..., x_n) p(x_2 | x_3, ..., x_n) ... p(x_n).
Independence
If X and Y are independent, then:
p(x, y) = p(x) p(y).
83 / 97
Properties of the PMF/PDF
Law of Total Probability
p(y) = Σ_x p(y|x) p(x).
84 / 97
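A sketch verifying the conditional pmf and the law of total probability on a tiny joint table; the joint values p_xy are illustrative, chosen only so that they sum to 1.

p_xy = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

def p_x(x):
    # Marginal of X: sum the joint over y.
    return sum(p for (xi, _), p in p_xy.items() if xi == x)

def p_y_given_x(y, x):
    # Conditional pmf p(y|x) = p(x, y) / p(x).
    return p_xy[(x, y)] / p_x(x)

# p(y) = sum_x p(y|x) p(x) should match the marginal computed directly.
for y in (0, 1):
    total = sum(p_y_given_x(y, x) * p_x(x) for x in (0, 1))
    direct = sum(p for (_, yi), p in p_xy.items() if yi == y)
    print(y, total, direct)  # the two columns agree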
Expectation
Something Notable
You have the random variables R_1, R_2 representing how long a call lasts and how much you pay for an international call:
if 0 ≤ R_1 ≤ 3 (minutes), R_2 = 10 (cents)
if 3 < R_1 ≤ 6 (minutes), R_2 = 20 (cents)
if 6 < R_1 ≤ 9 (minutes), R_2 = 30 (cents)
We have then the probabilities
P{R_2 = 10} = 0.6, P{R_2 = 20} = 0.25, P{R_2 = 30} = 0.15.
If we observe N calls and N is very large
We can say that about 0.6N calls have R_2 = 10, contributing 10 × 0.6N = 6N cents to the total cost.
86 / 97
Expectation
Similarly
{R_2 = 20} ⇒ 0.25N calls and total cost 5N cents.
{R_2 = 30} ⇒ 0.15N calls and total cost 4.5N cents.
The total cost
The total cost is 6N + 5N + 4.5N = 15.5N, or on average 15.5 cents per call.
The average
[10(0.6N) + 20(0.25N) + 30(0.15N)] / N = 10(0.6) + 20(0.25) + 30(0.15) = Σ_y y P{R_2 = y}
87 / 97
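The same 15.5-cent average, written as the weighted sum Σ_y y P{R_2 = y} in a two-line sketch:

pmf_R2 = {10: 0.6, 20: 0.25, 30: 0.15}  # the call-cost distribution above
average = sum(y * p for y, p in pmf_R2.items())
print(average)  # 15.5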
Expected Value
Definition
Discrete random variable X: E(X) = Σ_x x p(x).
Continuous random variable Y: E(Y) = ∫_{-∞}^{+∞} y p(y) dy.
Extension to a function g(x)
E(g(X)) = Σ_x g(x) p(x) (discrete case).
E(g(X)) = ∫_{-∞}^{+∞} g(x) p(x) dx (continuous case).
88 / 97
Linear Property
Linearity property of the Expected Value
E(a f(X) + b g(Y)) = a E(f(X)) + b E(g(Y)) (36)
Example for a discrete distribution
E[aX + b] = Σ_x (ax + b) p(x|θ)
          = a Σ_x x p(x|θ) + b Σ_x p(x|θ)
          = a E[X] + b
89 / 97
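A numeric illustration of E[aX + b] = aE[X] + b, reusing the call-cost pmf; the constants a_coef and b_coef are arbitrary illustrative choices.

pmf = {10: 0.6, 20: 0.25, 30: 0.15}
a_coef, b_coef = 2.0, 7.0

E_X = sum(x * p for x, p in pmf.items())
E_aXb = sum((a_coef * x + b_coef) * p for x, p in pmf.items())
print(E_aXb, a_coef * E_X + b_coef)  # both 38.0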
Example
Imagine the following
We have a random variable X with density
f(x) = e^{−x} for x ≥ 0
f(x) = 0 for x < 0
Find
The expected value of X.
90 / 97
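The deck leaves the computation as an exercise; for reference, a worked version using integration by parts (standard calculus, not taken from the slides):

E[X] = \int_0^{\infty} x e^{-x}\, dx
     = \left[-x e^{-x}\right]_0^{\infty} + \int_0^{\infty} e^{-x}\, dx
     = 0 + \left[-e^{-x}\right]_0^{\infty}
     = 1.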
Variance
Definition
Var(X) = E((X − µ)²), where µ = E(X).
Standard Deviation
The standard deviation is simply σ = √Var(X).
91 / 97
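A frequently used consequence of this definition, obtained by expanding the square and applying the linearity property from (36) (a standard identity, included for completeness):

Var(X) = E[(X - \mu)^2]
       = E[X^2 - 2\mu X + \mu^2]
       = E[X^2] - 2\mu E[X] + \mu^2
       = E[X^2] - \mu^2.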
Indicator Random Variable
Definition
The indicator of an event A is a random variable I_A defined as follows:
I_A(ω) = 1 if ω ∈ A, 0 if ω ∉ A (37)
Thus
I_A = 1 if A occurs.
I_A = 0 if A does not occur.
Why is this useful?
Because we can count events!!!
93 / 97
Useful Property
The expected value of an indicator function
E[I_A] = 0 × P(I_A = 0) + 1 × P(I_A = 1) = P(I_A = 1) = P(A) (38)
Why is this useful?
Because we can use this to count things when a probability is involved!!!
94 / 97
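A Monte Carlo sketch of (38): the sample mean of an indicator approximates P(A). Here A is the illustrative event "a fair die shows at least 5", so P(A) = 1/3.

import random

random.seed(0)
n_trials = 100000
# Sum of indicator values = number of times A occurs.
hits = sum(1 for _ in range(n_trials) if random.randint(1, 6) >= 5)
print(hits / n_trials)  # close to 1/3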
Method of Indicators
It is possible to see random variables as a sum of indicator functions
R = I_{A_1} + ... + I_{A_n} (39)
Then
E[R] = Σ_{j=1}^{n} E[I_{A_j}] = Σ_{j=1}^{n} P(A_j) (40)
Hopefully
It is easier to calculate P(A_j) than to evaluate E[R] directly.
95 / 97
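A sketch contrasting the two routes for R = number of heads in n biased tosses: the direct route sums over the binomial pmf of R, while the indicator route with A_j = "toss j is heads" gives E[R] = Σ_j P(A_j) = np in one step. The values n = 10, p = 0.6 are illustrative.

import math

n, p = 10, 0.6
# Direct computation from the binomial pmf of R:
direct = sum(k * math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))
# Method of indicators: E[R] = sum_j P(A_j) = n * p.
print(direct, n * p)  # both 6.0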
Example
A single unbiased die is tossed independently n times
Let R_1 be the number of 1's.
Let R_2 be the number of 2's.
Find E[R_1 R_2]
We can express each variable as a sum of indicators, where A_i is the event that toss i shows a 1 and B_j is the event that toss j shows a 2:
R_1 = I_{A_1} + ... + I_{A_n}
R_2 = I_{B_1} + ... + I_{B_n}
Thus
E[R_1 R_2] = Σ_{i=1}^{n} Σ_{j=1}^{n} E[I_{A_i} I_{B_j}] (41)
96 / 97
Next
Case 1: i ≠ j
I_{A_i} and I_{B_j} are independent, so
E[I_{A_i} I_{B_j}] = E[I_{A_i}] E[I_{B_j}] = P(A_i) P(B_j) = 1/6 × 1/6 = 1/36 (42)
Case 2: i = j
A_i and B_i are disjoint (a single toss cannot show both a 1 and a 2).
Thus, I_{A_i} I_{B_i} = I_{A_i ∩ B_i} = 0.
Thus, since there are n(n − 1) ordered pairs with i ≠ j,
E[R_1 R_2] = n(n − 1)/36 (43)
97 / 97
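A Monte Carlo check of (43), simulating n independent die tosses per trial and averaging R_1 R_2; the trial count is an illustrative choice.

import random

random.seed(0)
n, trials = 6, 200000
total = 0
for _ in range(trials):
    rolls = [random.randint(1, 6) for _ in range(n)]
    total += rolls.count(1) * rolls.count(2)  # R_1 * R_2 for this trial
print(total / trials, n * (n - 1) / 36)  # both close to 0.833...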