Random Variables and Probability Distributions Lecture 3
1. Introduction to Data Analytics
Lecture: Random Variables and Probability distributions – 3
NPTEL MOOC
By
Prof. Nandan Sudarsanam, DoMS, IIT-M and
Prof. B. Ravindran, CS&E, IIT-M
2. Common Distributions
• Normal
• Bell shaped curve
• PDF:
• Mean, variance, CDF
• Height, weight, etc.
• Many things after removal of outliers
• Binomial Approximation
• Central Limit Theorem (CLT)
• Sampling distributions
PDF: $f(x, \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
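As a small illustration of the PDF and CDF listed above, the following sketch evaluates both with the Python standard library only. The function names `normal_pdf` and `normal_cdf` and the height figures (mean 170 cm, standard deviation 10 cm) are assumptions made for this example and are not taken from the lecture.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x), expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


# Hypothetical numbers: heights with mean 170 cm and standard deviation 10 cm.
mu, sigma = 170.0, 10.0

# The density is highest at the mean (the top of the bell curve).
peak_density = normal_pdf(mu, mu, sigma)

# Probability of falling within one standard deviation of the mean.
within_one_sd = normal_cdf(mu + sigma, mu, sigma) - normal_cdf(mu - sigma, mu, sigma)

print(f"density at the mean: {peak_density:.4f}")
print(f"P(within one sd of the mean): {within_one_sd:.4f}")
```

The second printed value should be close to the familiar figure of roughly 68% for any choice of mean and standard deviation.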
3. Common Distributions
• Normal Distribution: total annual household income, used to explain outlier removal:
[Figure: two histograms of total annual household income. Vertical axes: No. of Households (up to 45,000 in one panel, up to 1,00,000 in the other). Horizontal axes: Income (up to 4,00,000 Rupees in one panel, up to 90,000 Rupees in the other).]
4. Binomial Approximation
• Review of PDF, mean and variance
• PDF
• Mean = np
• Variance = np(1-p)
• Construct a normal distribution with the above mean and variance, and use it to answer distribution-related questions (for example, the probability that the count falls in a given range).
PDF: $P(X = k) = \binom{n}{k} \, p^k (1-p)^{n-k}$
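The approximation described on this slide can be sketched as follows: compute an exact binomial probability, then compare it with the value from a normal distribution that has mean np and variance np(1-p). The values n = 100, p = 0.5 and the threshold 55 are hypothetical, and the continuity correction (using 55.5 rather than 55) is a common refinement assumed here, not something stated in the slides.

```python
import math


def binomial_pmf(k, n, p):
    """Exact probability of k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


# Hypothetical parameters chosen for illustration.
n, p = 100, 0.5
threshold = 55

# Mean and standard deviation of the matching normal distribution.
mean = n * p
sd = math.sqrt(n * p * (1.0 - p))

# Exact answer: add up the binomial probabilities for k = 0..threshold.
exact = sum(binomial_pmf(k, n, p) for k in range(threshold + 1))

# Normal approximation; the +0.5 is a continuity correction (an assumption here).
approx = normal_cdf(threshold + 0.5, mean, sd)

print(f"exact  P(X <= {threshold}) = {exact:.4f}")
print(f"approx P(X <= {threshold}) = {approx:.4f}")
```

The two printed values should agree closely for these parameters. The agreement generally degrades when n is small or p is near 0 or 1.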
5. Central Limit Theorem
• The aggregation of a sufficiently large number of independent random variables results in a random variable which will be approximately normal.
• Example
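One way to see the theorem at work is a small simulation. This is a sketch under assumed settings rather than the lecture's own example: it adds up 30 independent uniform random numbers many times and checks that the sums behave like a normal variable. The seed, the number of terms and the number of trials are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable


def sum_of_uniforms(n_terms):
    """Sum of n_terms independent Uniform(0, 1) draws."""
    return sum(random.random() for _ in range(n_terms))


n_terms = 30      # how many independent variables are aggregated
n_trials = 20000  # how many sums are simulated

sums = [sum_of_uniforms(n_terms) for _ in range(n_trials)]

# Each Uniform(0, 1) draw has mean 1/2 and variance 1/12,
# so the sum has mean n/2 and variance n/12.
theory_mean = n_terms / 2
theory_sd = (n_terms / 12) ** 0.5

sample_mean = statistics.fmean(sums)
sample_sd = statistics.stdev(sums)

# For a normal variable, roughly 68% of values lie within one sd of the mean.
frac_within_one_sd = sum(abs(s - theory_mean) <= theory_sd for s in sums) / n_trials

print(f"mean: simulated {sample_mean:.3f}, theory {theory_mean:.3f}")
print(f"sd:   simulated {sample_sd:.3f}, theory {theory_sd:.3f}")
print(f"fraction within one sd: {frac_within_one_sd:.3f}")
```

Although each individual draw is flat (uniform) rather than bell shaped, the fraction of sums within one standard deviation should come out near the normal value of about 0.68.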
7. Sampling distribution
• Sampling distribution
[Figure: original distribution and distribution of sample means]
• What is its shape?
• What is its mean?
• What is its standard deviation?
• Can there be a distribution for sample standard deviations?
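The questions above can be explored with a simulation. The sketch below assumes an exponential original distribution (mean 1, standard deviation 1), a sample size of 40 and 5000 repeated samples; all of these are illustrative choices, not values from the lecture. It relies on the standard results that the sample mean has the same mean as the original distribution and a standard deviation of sigma / sqrt(n).

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

sample_size = 40   # observations per sample (assumed)
n_samples = 5000   # number of repeated samples (assumed)

# Original distribution: exponential with rate 1, which is strongly skewed.
population_mean = 1.0
population_sd = 1.0

sample_means = []
sample_sds = []
for _ in range(n_samples):
    sample = [random.expovariate(1.0) for _ in range(sample_size)]
    sample_means.append(statistics.fmean(sample))
    sample_sds.append(statistics.stdev(sample))

# Mean of the sampling distribution: expected to match the population mean.
mean_of_means = statistics.fmean(sample_means)

# Standard deviation of the sampling distribution (the standard error):
# expected to be close to sigma / sqrt(n).
sd_of_means = statistics.stdev(sample_means)
expected_se = population_sd / sample_size**0.5

# Sample standard deviations also vary from sample to sample,
# so they have a distribution of their own.
spread_of_sds = statistics.stdev(sample_sds)

print(f"mean of sample means: {mean_of_means:.3f} (population mean {population_mean})")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {expected_se:.3f})")
print(f"sd of sample sds:     {spread_of_sds:.3f}")
```

A histogram of `sample_means` would be expected to look roughly bell shaped even though the original distribution is skewed, which ties the shape question back to the Central Limit Theorem.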