Probability and Statistics part 3.pdf
- 2. PROBABILITY AND STATISTICS
• Combinations and Permutations
• Sets
• Probability
• Laws of Probability
• Measures of Central Tendency
• Measures of Dispersion
• Expected Values
• Probability Density Functions
• Probability Distribution Functions
• Probability Distributions
• Sums of Random Variables
• Hypothesis Testing
• Linear Regression
© 2018 Professional Publications, Inc. 51
Lesson Overview
- 3. PROBABILITY AND STATISTICS
random variable
assigns a real value to each possible
sample point in sample space
discrete random variables
The random variable can take on only a
finite (or countably infinite) number of
possible values.
continuous random variables
The random variable can take on any
value over an interval on the real number
line.
Probability Density Functions
- 4. PROBABILITY AND STATISTICS
probability density function (PDF)
• assigns a probability to each numerical
output of the random variable (RV)
• In the case of a continuous RV, the PDF
gives the probability density at that point.
• In the case of a discrete RV, the PDF is a
sum of impulses, each impulse with a
magnitude equal to the probability of
that numerical outcome.
probability that X will occur between
a and b
• $P(a \le X \le b) = \int_a^b f(x)\,dx$
• Discrete probability mass functions
can be modeled with probability
density functions where the PDF is
impulsive.
Probability Density Functions
- 5. PROBABILITY AND STATISTICS
cumulative distribution function
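• probability that X will be less than or
equal to a certain value
• $F(x) = P(X \le x)$
• $F(x) = \int_{-\infty}^{x} f(t)\,dt$ for a continuous RV
• $F(a) \le F(b)$ if $a \le b$
• $F(-\infty) = 0$ and $F(\infty) = 1$
• $P(a < X \le b) = F(b) - F(a)$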
• Discrete cumulative distribution
functions give a staircase appearance,
a step occurring at each of the finite
possible outcomes.
Probability Distribution Functions
- 6. PROBABILITY AND STATISTICS
An engineer spends 30% of the time at
home and 40% at work, 6 mi away. The
remaining 30% is spent commuting
between home and work in a straight line
at a steady velocity. Calculate the
probability that at a randomly chosen time
the engineer is at least 2 mi from home.
Example: Probability Distribution Functions
- 7. PROBABILITY AND STATISTICS
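Solution
The PDF has an impulse of 0.3 at x = 0
(home), an impulse of 0.4 at x = 6 mi
(work), and a uniform density of
0.3/6 = 0.05 per mile in between
(commuting).
Integrate the PDF from 2 to ∞.
$$P(X \ge 2) = \int_2^{\infty} f(x)\,dx = \int_2^6 0.05\,dx + 0.4 = 0.05x\,\Big|_2^6 + 0.4 = 0.05(6 - 2) + 0.4 = 0.6$$
Therefore the probability is 60% that the
engineer will be at least 2 mi away.
Example: Probability Distribution Functions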
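The commuting example mixes two impulses (home and work) with a uniform density along the route. The sketch below is one way to check the arithmetic. The function name `prob_at_least` and its parameters are illustrative assumptions, not part of the original slides.

```python
def prob_at_least(d, p_home=0.30, p_work=0.40, work_distance=6.0):
    """P(X >= d), where X is the engineer's distance from home in miles."""
    # Time not spent at home or at work is spread uniformly over the route.
    p_commute = 1.0 - p_home - p_work
    density = p_commute / work_distance  # 0.05 per mile for the slide's numbers
    if d <= 0:
        return 1.0  # every location is at least 0 mi from home
    if d > work_distance:
        return 0.0  # the engineer is never farther away than work
    # Uniform part from d to the workplace, plus the impulse at the workplace.
    return density * (work_distance - d) + p_work


print(round(prob_at_least(2.0), 4))  # 0.6
```

The impulse at home never contributes for d > 0, which is why only the commuting density and the 40% spent at work appear in the result.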
- 8. PROBABILITY AND STATISTICS
Let X represent the percentage of
material understood by a random student
in this course. If you were the teacher,
would you want the cumulative
distribution to look as it does in figure (a)
or in figure (b)?
Poll: Probability Distribution Functions
- 9. PROBABILITY AND STATISTICS
Solution
Remember that the probability density
function is the derivative of the
cumulative distribution function.
The answer is (b).
Poll: Probability Distribution Functions
- 10. PROBABILITY AND STATISTICS
binomial distribution
• Given n trials, the binomial distribution
provides the likelihood that there will be x
successes.
• n is the number of trials
p is the probability of success
q is the probability of failure, where $q = 1 - p$
• Probability mass function for a
binomial distribution is then:
$P_n(x) = \frac{n!}{x!\,(n-x)!}\,p^x q^{n-x}$
• Variance of a binomial distribution is
$\sigma^2 = npq$
Probability Distributions
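The binomial probability mass function and its variance can be checked numerically. This is a minimal sketch using only the standard library. The helper names and the choice of a fair coin tossed four times are illustrative assumptions.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1.0 - p  # probability of failure
    return comb(n, x) * p**x * q**(n - x)


def binomial_variance(n, p):
    """Variance of the number of successes, n*p*q."""
    return n * p * (1.0 - p)


# Illustrative case: a fair coin (p = 0.5) tossed n = 4 times.
pmf = [binomial_pmf(x, 4, 0.5) for x in range(5)]
print(pmf)  # [0.0625, 0.25, 0.375, 0.25, 0.0625]

# Variance computed directly from the PMF should match n*p*q.
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))
print(variance, binomial_variance(4, 0.5))  # 1.0 1.0
```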
- 11. PROBABILITY AND STATISTICS
A coin is tossed four times. What are the
probabilities of getting heads 0 times, 1
time, 2 times, 3 times, and 4 times?
Example: Probability Distributions
- 12. PROBABILITY AND STATISTICS
Solution
$$P_n(x) = \frac{n!}{x!\,(n-x)!}\,p^x q^{n-x}$$
$$P_4(0\ \text{heads}) = \frac{4!}{0!\,(4-0)!}\,(0.50)^0 (0.50)^4 = 0.0625$$
$$P_4(1) = \frac{4!}{1!\,(4-1)!}\,(0.50)^1 (0.50)^3 = 0.25$$
$$P_4(2) = \frac{4!}{2!\,(4-2)!}\,(0.50)^2 (0.50)^2 = 0.375$$
$$P_4(3) = \frac{4!}{3!\,(4-3)!}\,(0.50)^3 (0.50)^1 = 0.25$$
$$P_4(4) = \frac{4!}{4!\,(4-4)!}\,(0.50)^4 (0.50)^0 = 0.0625$$
Example: Probability Distributions
- 13. PROBABILITY AND STATISTICS
Five percent of all people have red hair. In
a random selection of seven people, most
nearly what is the probability that exactly
three have red hair?
(A) 0.0013
(B) 0.0036
(C) 0.013
(D) 0.036
Example: Probability Distributions
- 14. PROBABILITY AND STATISTICS
Solution
$$P_n(x) = \frac{n!}{x!\,(n-x)!}\,p^x q^{n-x}$$
$$P_7(3) = \frac{7!}{3!\,(7-3)!}\,(0.05)^3 (0.95)^{7-3} = 0.00356 \quad (0.0036)$$
The answer is (B).
Example: Probability Distributions
- 15. PROBABILITY AND STATISTICS
normal or Gaussian distribution
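• frequently occurring natural
distribution when multiple random
parameters all affect the outcome
• probability density function:
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$
where μ is the mean and σ is the standard deviation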
Probability Distributions
- 16. PROBABILITY AND STATISTICS
unit normal table
• The probability density function, f(x),
is difficult to evaluate.
• Instead, it is common to use a unit
normal table to find values for the
corresponding cumulative distribution
function, F(x).
• A unit normal table is normalized for a
mean of zero and a standard deviation
of one.
• To convert the existing distribution
into a unit normal distribution, compute
$Z = \frac{x - \mu}{\sigma}$
• Z is the standard normal variable, not
the actual measurement of the
random variable, x.
• Use this value of Z in the unit normal
table to find the probability.
Probability Distributions
- 17. PROBABILITY AND STATISTICS
• f (x) is probability density of one particular value
• F(x) is probability of all values less than x
• R(x) is probability of all values greater than x
• 2R(x) is probability of all values greater than x or less than –x
• W(x) is probability of all values between –x and x
Probability Distributions
Unit Normal Distribution
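The unit normal table columns listed above can be approximated with the error function from the standard library. This is a sketch only: the function names mirror the table headings f, F, R, and W, and the closed form F(x) = ½[1 + erf(x/√2)] is a standard identity rather than something stated on the slides.

```python
from math import erf, exp, pi, sqrt


def f(x):
    """Unit normal probability density at x."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def F(x):
    """Probability of all values less than x."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def R(x):
    """Probability of all values greater than x."""
    return 1.0 - F(x)


def W(x):
    """Probability of all values between -x and x (for x >= 0)."""
    return 1.0 - 2.0 * R(x)


for x in (1.0, 2.0, 3.0):
    print(f"x={x}: f={f(x):.4f} F={F(x):.4f} R={R(x):.4f} 2R={2 * R(x):.4f} W={W(x):.4f}")
```

Printed values should agree with a unit normal table to the table's precision.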
- 18. PROBABILITY AND STATISTICS
A normal distribution has a mean of 16
and a standard deviation of 4. What is
the total probability of values greater
than 4?
Example: Probability Distributions
- 19. PROBABILITY AND STATISTICS
Solution
Standardize the value of interest:
$z = \frac{x - \mu}{\sigma} = \frac{4 - 16}{4} = -3$
Find the probability that z is greater than
–3. Due to the symmetry of the
distribution, R(–z) = F(z). From the unit
normal distribution table, probability
R(−3) = F(3) = 0.9987.
Example: Probability Distributions
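The table lookup in this example can be cross-checked with `statistics.NormalDist` from the Python standard library. The sketch follows the same two steps as the solution (standardize, then use symmetry). The function name `prob_greater_than` is an illustrative assumption.

```python
from statistics import NormalDist


def prob_greater_than(x, mean, std_dev):
    """P(X > x) for a normal distribution, via the unit normal variable z."""
    z = (x - mean) / std_dev          # standardize: z = (x - mu) / sigma
    return NormalDist().cdf(-z)       # symmetry: R(z) = F(-z)


p = prob_greater_than(4.0, mean=16.0, std_dev=4.0)
print(round(p, 4))  # 0.9987
```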
- 20. PROBABILITY AND STATISTICS
central limit theorem
• Let $Y = \frac{1}{N}\sum_{i=1}^{N} X_i$
• The central limit theorem states that
given a sum of independent, identically
distributed random variables, $X_i$, with a
mean of $\mu$ and a standard deviation of $\sigma$,
then $\mu_Y = \mu$ and $\sigma_Y = \sigma/\sqrt{N}$
• The means of X and Y are therefore
the same, and the standard deviation
of Y is the standard deviation of X
divided by the square root of N.
• Most surprisingly, as N goes to infinity,
the distribution of Y becomes normal
(Gaussian), no matter what the
distribution of X is (provided its variance
is finite).
Probability Distributions
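The scaling of the standard deviation by the square root of N can be seen in a small simulation. The sketch below assumes X is uniform on (0, 1), which has mean 0.5 and standard deviation 1/√12. The sample size, number of trials, and seed are arbitrary illustrative choices.

```python
import random
from math import sqrt
from statistics import pstdev


def simulate_sample_means(n, trials, seed=0):
    """Return `trials` averages, each taken over n draws from Uniform(0, 1)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [sum(rng.random() for _ in range(n)) / n for _ in range(trials)]


N = 25
SIGMA_X = 1.0 / sqrt(12.0)  # standard deviation of Uniform(0, 1)
ys = simulate_sample_means(N, trials=20_000)

observed_mean = sum(ys) / len(ys)
observed_std = pstdev(ys)
predicted_std = SIGMA_X / sqrt(N)  # central limit theorem prediction

print(f"mean of Y: {observed_mean:.4f} (theory 0.5000)")
print(f"std of Y:  {observed_std:.4f} (theory {predicted_std:.4f})")
```

A histogram of `ys` would look close to a bell curve even though each individual X is uniformly distributed.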
- 21. PROBABILITY AND STATISTICS
t-distribution
estimating statistics of normal
distribution when sample size is small
• arises when using sample mean and
variance as estimates for normal
distribution
• Student's t-test used for testing
significance of difference between
two sample means
• as number of degrees of freedom
increases, t-distribution approaches
normal distribution
Probability Distributions
- 22. PROBABILITY AND STATISTICS
t-distribution
Convert the distribution to unit normal
distribution.
For n degrees of freedom, find $t_{\alpha,n}$ that
leads to probability α.
Calculate probability using the normal
distribution columns.
• α is total probability of values greater
than $t_{\alpha,n}$
• 1 – α is total probability of values less
than $t_{\alpha,n}$
• 2α is total probability of values greater
than $t_{\alpha,n}$ or less than $-t_{\alpha,n}$
• 1 – 2α is total probability of values
between $-t_{\alpha,n}$ and $t_{\alpha,n}$
Probability Distributions
- 23. PROBABILITY AND STATISTICS
A t-distribution with four degrees of
freedom has a mean of 10 and a standard
deviation of 5. If 5% of the population is
greater than a value, what is that value?
Example: Probability Distributions
- 24. PROBABILITY AND STATISTICS
Solution
From the NCEES t-distribution table for
α = 0.05 and ν = 4, $t_{\alpha,\nu} = 2.132$.
$$t = \frac{x - \mu}{s/\sqrt{n}}$$
$$x = \mu + t\,\frac{s}{\sqrt{n}} = 10 + (2.132)\,\frac{5}{\sqrt{4}} = 15.33$$
Example: Probability Distributions
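The rearranged t relation in this example is a one-line calculation. In the sketch below the critical value 2.132 is taken from the table as in the solution, since the standard library has no t-distribution quantile function. The function and constant names are illustrative assumptions.

```python
from math import sqrt


def value_at_t(t, mean, std_dev, n):
    """Solve t = (x - mean) / (std_dev / sqrt(n)) for x."""
    return mean + t * std_dev / sqrt(n)


T_ALPHA_005_NU_4 = 2.132  # table value for alpha = 0.05 and 4 degrees of freedom

x = value_at_t(T_ALPHA_005_NU_4, mean=10.0, std_dev=5.0, n=4)
print(round(x, 2))  # 15.33
```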
- 25. PROBABILITY AND STATISTICS
• Estimating the actual mean and
variance of a population based on a
sample set will never be exact.
• Confidence interval gives a range of
possible values and the probability
that the true statistic will lie within
that range.
• Denote upper and lower limits of the
confidence interval by UCL and LCL.
• By the central limit theorem, the
probability distribution of the sample
mean tends towards normal as more
samples are taken.
Probability Distributions
- 26. PROBABILITY AND STATISTICS
• The sample mean can be used as an
estimate of the true mean of the
population,
• If the variance of the population X is
unknown, the sample variance can be
used in its place with the t-distribution.
Probability Distributions
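A two-sided confidence interval for the mean is commonly written as LCL, UCL = X̄ ∓ z·σ/√n, where z is the unit normal value that leaves α/2 in each tail. The sketch below assumes the population standard deviation is known (or n is large). For a small sample with unknown variance, z would be replaced by a t-table value. The function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def confidence_limits(sample_mean, sigma, n, confidence=0.95):
    """Return (LCL, UCL) for the population mean, assuming known sigma."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-tail critical value
    half_width = z * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


# Hypothetical data: sample mean 50, sigma 10, 100 samples, 95% confidence.
lcl, ucl = confidence_limits(50.0, 10.0, 100)
print(round(lcl, 2), round(ucl, 2))  # 48.04 51.96
```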
- 27. PROBABILITY AND STATISTICS
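Given that
$Y = a_1 X_1 + a_2 X_2 + \cdots + a_n X_n$
Expected value of Y is scaled sum of
expectations of individual random
variables
$\mu_Y = E[Y] = a_1 E[X_1] + a_2 E[X_2] + \cdots + a_n E[X_n]$
Variance of Y is scalar squared sum of
variances of individual random variables
$\sigma_Y^2 = a_1^2 \sigma_1^2 + a_2^2 \sigma_2^2 + \cdots + a_n^2 \sigma_n^2$
provided that individual random variables
are statistically independent.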
Sums of Random Variables
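The two rules for a weighted sum of random variables translate directly into code. This is a minimal sketch. The function name and the example coefficients, means, and variances are hypothetical, and the variance rule holds only under the independence assumption stated above.

```python
def linear_combination_stats(coeffs, means, variances):
    """Mean and variance of Y = sum(a_i * X_i) for independent X_i."""
    mu_y = sum(a * m for a, m in zip(coeffs, means))
    var_y = sum(a * a * v for a, v in zip(coeffs, variances))  # coefficients are squared
    return mu_y, var_y


# Hypothetical case: Y = 2*X1 - 3*X2.
mu_y, var_y = linear_combination_stats(
    coeffs=[2.0, -3.0], means=[1.0, 4.0], variances=[0.5, 2.0]
)
print(mu_y, var_y)  # -10.0 20.0
```

Note that the negative coefficient lowers the mean but still adds to the variance, because it enters squared.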
- 28. PROBABILITY AND STATISTICS
hypothesis testing
the process of making a decision with a
specified level of confidence about a
statistical parameter being evaluated
common hypothesis tests
• Determine whether the average value
taken from n samples could have come
from a certain type of distribution.
• Determine whether the sample variance
taken from n samples could have come
from a certain type of distribution.
Hypothesis Testing
- 29. PROBABILITY AND STATISTICS
null hypothesis, H0
assumption being tested
alternative hypothesis, H1
must be true if H0 is not true
four possibilities when making the
decision:
• decide H0 when H0 is true
• decide H0 when H0 is false – type II
error
• decide H1 when H1 is true
• decide H1 when H1 is false – type I
error
Hypothesis Testing
- 30. PROBABILITY AND STATISTICS
hypothesis testing for mean value
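• For n samples taken, decide on a
desired confidence level, $C = 1 - \alpha$,
where α is the probability of error. (A
one-tail or two-tail estimate must be
used depending on hypothesis.)
• Calculate the standard normal
variable $Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$,
where $\mu_0$ is the mean under the null
hypothesis.
• If Z exceeds the critical value
($Z > Z_\alpha$ for a one-tail test,
$|Z| > Z_{\alpha/2}$ for a two-tail test),
then with confidence C,
decide against the null hypothesis.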
Hypothesis Testing
- 31. PROBABILITY AND STATISTICS
Two years ago, the average annual salary
of engineers in a region was $58,500.
Recently, a group of 100 engineers from
the region was found to have a mean
salary of $63,450 with a standard
deviation of $3,500. Using a significance
level of α = 0.05, can we conclude
that the average salary has increased?
Example: Hypothesis Testing
- 32. PROBABILITY AND STATISTICS
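Solution
H0 is µ = $58,500 and H1 is µ > $58,500.
H0 is not accepted if $t_0 > t_{\alpha,n-1}$ = 1.645
(the large-sample value from the NCEES
Handbook for α = 0.05).
$$t_0 = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} = \frac{\$63{,}450 - \$58{,}500}{\$3{,}500/\sqrt{100}} = 14.14$$
Because 14.14 > 1.645, H0 is rejected:
the average salary has increased.
Example: Hypothesis Testing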
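The one-tail test in this example reduces to computing the test statistic and comparing it with the critical value. The sketch below uses the numbers from the problem statement and the critical value 1.645 quoted in the solution. The function and variable names are illustrative assumptions.

```python
from math import sqrt


def mean_test_statistic(sample_mean, mu0, sample_std, n):
    """t0 = (sample mean - hypothesized mean) / (sample std / sqrt(n))."""
    return (sample_mean - mu0) / (sample_std / sqrt(n))


T_CRITICAL = 1.645  # one-tail critical value for alpha = 0.05, as quoted in the solution

t0 = mean_test_statistic(63_450.0, 58_500.0, 3_500.0, 100)
reject_h0 = t0 > T_CRITICAL  # one-tail test for H1: mu > mu0

print(round(t0, 2), reject_h0)  # 14.14 True
```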
- 33. PROBABILITY AND STATISTICS
method of least squares
a way to find the straight line that best
fits a set of data points
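• calculate these seven quantities
$\sum x_i,\ \sum x_i^2,\ \left(\sum x_i\right)^2,\ \sum y_i,\ \sum y_i^2,\ \left(\sum y_i\right)^2,\ \sum x_i y_i$
• calculate the mean x- and y-values
$\bar{x} = \frac{\sum x_i}{n}, \quad \bar{y} = \frac{\sum y_i}{n}$
These values will be used to find the
slope and y-intercept of the line.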
Linear Regression
- 34. PROBABILITY AND STATISTICS
method of least squares (continued)
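• calculate the slope of the line
$m = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$
• calculate the y-intercept
$b = \bar{y} - m\bar{x}$
The equation of the line that best fits the
data is $y = mx + b$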
Linear Regression
- 35. PROBABILITY AND STATISTICS
goodness of fit
• determined by the sample correlation
coefficient
$R = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$
• when R = 1.0, the fit is perfect (the data
points themselves are in a straight line)
• when R > 0.85, the fit is acceptable
Linear Regression
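The least-squares steps and the correlation coefficient can be sketched together. The code below assumes the usual definitions S_xx = Σx² − (Σx)²/n, S_yy = Σy² − (Σy)²/n, and S_xy = Σxy − (Σx)(Σy)/n, which the slides do not spell out. The function name and the data points are hypothetical.

```python
from math import sqrt


def least_squares(xs, ys):
    """Return (slope, intercept, R) of the best-fit straight line."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    s_xx = sum_x2 - sum_x**2 / n
    s_yy = sum_y2 - sum_y**2 / n
    s_xy = sum_xy - sum_x * sum_y / n

    slope = s_xy / s_xx
    intercept = sum_y / n - slope * sum_x / n  # y-bar minus slope times x-bar
    r = s_xy / sqrt(s_xx * s_yy)               # sample correlation coefficient
    return slope, intercept, r


# Hypothetical data lying close to the line y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, r = least_squares(xs, ys)
print(round(slope, 2), round(intercept, 2), round(r, 3))  # 1.99 0.05 0.999
```

The slope S_xy/S_xx is algebraically the same as the ratio of sums form, with numerator and denominator both divided by n.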
- 36. PROBABILITY AND STATISTICS
You have learned
• how to calculate combinations and
permutations
• about the laws of probability
• about the mean, mode, median, and
other measures of central tendency
• about standard deviation, variance,
and other measures of dispersion
• about probability density and
distribution functions
• about binomial distribution and
normal distribution
• about the central limit theorem
• about unit normal and t-distribution
tables and how to use them
• about hypothesis testing
• about linear regression and the
method of least squares
Learning Objectives
- 37. PROBABILITY AND STATISTICS
• Combinations and Permutations
• Sets
• Probability
• Laws of Probability
• Measures of Central Tendency
• Measures of Dispersion
• Expected Values
• Probability Density Functions
• Probability Distribution Functions
• Probability Distributions
• Sums of Random Variables
• Hypothesis Testing
• Linear Regression
Lesson Overview