2. Basic Statistics: Review
What is Statistics?
The art of abstracting the Real World via sampling and deriving general
“estimates” that describe the Real World at a certain degree of confidence.
Fall 2021/ ElDessouki 104
[Diagram: Real World → sample → Sample Data & Data Reduction → Descriptive Measures for Real World (at a deg. of confidence) → Math. Model → Decision Making & Design]
. TTENG 441 Traffic Engineering
3. Basic Statistics: Review
When do we need Statistics?
When we cannot measure all the data values for the
population.
Before starting: What do we need to address?
Sample size (how many measurements are sufficient?)
What confidence should I have in the results?
Which statistical distribution (math model) best
describes the observed data?
Did a traffic engineering solution significantly affect the state of the
Real World? (before & after analysis)
5. Basic Statistics:
Common Statistical Estimators
Measures of Central Tendency:
Mean:
\[ \bar{x} = \frac{\sum_{i=1}^{N} x_i}{N} \]
where:
$\bar{x}$ = sample mean
$x_i$ = observation (i)
$N$ = number of observations
Median:
The middle value of all the sample data (i.e. 50% of the data are above this value).
Mode:
The value that occurs most frequently.
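As a minimal sketch, the three measures of central tendency can be computed with Python's standard `statistics` module. The speed values are hypothetical and are not taken from the course data.

```python
import statistics

# Hypothetical spot-speed observations (mph); illustrative only.
speeds = [40, 45, 45, 50, 55, 60, 62]

sample_mean = statistics.mean(speeds)      # sum of x_i divided by N
sample_median = statistics.median(speeds)  # middle value of the sorted data
sample_mode = statistics.mode(speeds)      # most frequently occurring value

print("mean:", sample_mean)
print("median:", sample_median)
print("mode:", sample_mode)
```

For a sample with several equally frequent values, `statistics.mode` returns the first one encountered, so the mode should be interpreted with care on small samples.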
6. Basic Statistics:
Common Statistical Estimators
Measures of Dispersion:
Variance:
\[ S^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1} \]
where:
$S^2$ = sample variance
$\bar{x}$ = sample mean
$x_i$ = observation (i)
$N$ = number of observations
Standard Deviation:
\[ S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}} \]
where:
$S$ = standard deviation
$S^2$ = sample variance
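The sample variance and standard deviation can be sketched directly from their definitions and cross-checked against Python's `statistics` module. The function name `sample_variance` and the speed values are illustrative assumptions.

```python
import math
import statistics

# Hypothetical spot-speed observations (mph); illustrative only.
speeds = [40, 45, 45, 50, 55, 60, 62]


def sample_variance(values):
    """Return S^2 = sum((x_i - mean)^2) / (N - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


s2 = sample_variance(speeds)  # sample variance
s = math.sqrt(s2)             # standard deviation is the square root of the variance

# The library functions use the same (N - 1) sample form.
print(s2, statistics.variance(speeds))
print(s, statistics.stdev(speeds))
```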
7. Basic Statistics:
Common Statistical Estimators
Measures of Dispersion:
Coefficient of Variation:
The ratio between the standard deviation and the mean.
\[ C_{var} = \frac{STD}{\bar{x}} \]
where:
$C_{var}$ = coefficient of variation
$STD$ = standard deviation
$\bar{x}$ = sample mean
Skewness:
Describes the asymmetry in the data sample.
\[ \text{Skewness} = \frac{\text{mean} - \text{mode}}{STD} \]
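A short sketch of the coefficient of variation and the mean-minus-mode skewness measure, using the same ratio definitions as the slide. The data values are hypothetical.

```python
import statistics

# Hypothetical spot-speed observations (mph); illustrative only.
speeds = [40, 45, 45, 50, 55, 60, 62]

mean = statistics.mean(speeds)
mode = statistics.mode(speeds)
std = statistics.stdev(speeds)  # sample standard deviation (N - 1 form)

c_var = std / mean              # coefficient of variation
skewness = (mean - mode) / std  # positive when the mean lies above the mode

print(f"C_var = {c_var:.3f}")
print(f"Skewness = {skewness:.3f}")
```

A positive skewness here indicates a longer tail toward the higher speeds; a value near zero suggests a roughly symmetric sample.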
9. Basic Statistics:
Common Statistical Estimators
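MS Excel Functions:
Mean = AVERAGE(data range)
Mode = MODE(data range)
Median = MEDIAN(data range)
Variance = VAR(data range)
Standard Deviation = STDEV(data range)
Skewness = SKEW(data range)
Note: VAR and STDEV use the sample form (dividing by N - 1); SKEW uses a moment-based formula rather than a mean-minus-mode ratio.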
10. Basic Statistics:
Useful MS Excel Functions
For Plotting Frequency Diagram:
For a sample of speed observations do the following:
Delete the lowest value and the highest value from the sample, which are treated here as “outliers”.
Define the range of the data using the functions:
=MIN(data range) & =MAX(data range)
Divide that range into equal intervals.
Count the values greater than the lower limit of each interval using the following function:
=COUNTIF(data range, ">value")
Subtract the count for the next interval's lower limit from the count for this interval's lower limit to get the frequency for that interval.
The sum of all frequencies should be the number of observations (N).
Define the midpoint of the interval as (x) and the frequency of the interval as (y).
Plot using the column chart type.
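The procedure above can be sketched in plain Python. The function name `frequency_table` and the speed values are hypothetical. One assumption differs from the spreadsheet recipe: the sketch counts values greater than or equal to each lower limit, so that the minimum observation is included and the frequencies sum to N.

```python
def frequency_table(observations, n_intervals):
    """Return (interval midpoint, frequency) pairs for a frequency diagram."""
    # Drop the lowest and highest observations, as the procedure describes.
    trimmed = sorted(observations)[1:-1]
    low, high = min(trimmed), max(trimmed)
    width = (high - low) / n_intervals

    # Lower limit of each equal-width interval.
    lower_limits = [low + k * width for k in range(n_intervals)]

    # Cumulative counts of values at or above each lower limit (assumption: >=).
    at_or_above = [sum(1 for v in trimmed if v >= limit) for limit in lower_limits]
    at_or_above.append(0)  # nothing lies above the last interval

    # Frequency of an interval = its count minus the next interval's count.
    return [
        (limit + width / 2, at_or_above[k] - at_or_above[k + 1])
        for k, limit in enumerate(lower_limits)
    ]


# Hypothetical speed sample (mph); 31 and 72 are dropped as the extremes.
speeds = [31, 40, 42, 44, 46, 47, 48, 51, 52, 53, 56, 58, 60, 72]
table = frequency_table(speeds, 4)
for midpoint, frequency in table:
    print(midpoint, frequency)
```

The (midpoint, frequency) pairs correspond to the (x, y) values of the column chart. The sketch assumes the trimmed sample has at least two distinct values, so the interval width is not zero.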
11. Basic Statistics:
Useful MS Excel Functions
Plotting Cumulative Frequency % Diagram:
For the same sample do the following:
Define a percentile sequence (y) from 0% to 100% in 5%
increments.
For each percentile value (y) in the sequence, determine the
corresponding observation value (x):
=PERCENTILE(data range, percentile value (y))
Plot (x, y) using the XY line chart type.
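A minimal sketch of the same steps in Python follows. The helper names `percentile` and `cumulative_curve` are hypothetical, and the linear interpolation between ranked observations is an assumption intended to mirror the usual spreadsheet percentile behaviour.

```python
def percentile(sorted_data, fraction):
    """Interpolated observation value at a fraction between 0.0 and 1.0."""
    position = fraction * (len(sorted_data) - 1)  # fractional rank, 0-based
    lower = int(position)
    upper = min(lower + 1, len(sorted_data) - 1)
    weight = position - lower
    return sorted_data[lower] + weight * (sorted_data[upper] - sorted_data[lower])


def cumulative_curve(data, step_percent=5):
    """Return (x, y) points: observation value x at cumulative percentage y."""
    ordered = sorted(data)
    return [
        (percentile(ordered, y / 100), y)
        for y in range(0, 101, step_percent)
    ]


# Hypothetical speed sample (mph), already trimmed of its extremes.
speeds = [40, 42, 44, 46, 47, 48, 51, 52, 53, 56, 58, 60]
curve = cumulative_curve(speeds)
for x, y in curve:
    print(f"{y:3d}%  {x:.2f}")
```

The point at y = 50% is the median and the point at y = 85% is the 85th-percentile speed.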
12. Basic Statistics:
Normal Distribution and Its Applications
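The Normal Distribution:
One of the most common statistical distributions is the normal distribution,
also known as the “Bell Curve”.
The normal distribution is a “continuous distribution”, i.e. it is used for
continuous variables such as speed, time, temperature, etc.
Probability density function, f(x):
\[ f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}} \]
where $\mu$ is the true mean and $\sigma$ is the standard deviation.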
13. Basic Statistics: Normal Distribution and Its
Applications
The Standard Normal
Distribution:
A normalized form of the normal
distribution, used to simplify the
integration of the probability density
function.
The true variables are normalized
and converted to an equivalent (z)
value as follows:
\[ z = \frac{x - \mu}{\sigma} \]
14. Basic Statistics:
Normal Distribution and Its Applications
The Standard Normal Distribution (Cont.):
Then, the integral of the probability density function, F(z), can be read
from the standard tables for the z value.
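As an alternative to table lookup, F(z) can be evaluated numerically. The sketch below uses the error function from Python's `math` module; the function names and the example mean and standard deviation are assumptions for illustration.

```python
import math


def z_value(x, mean, sd):
    """Convert an observation x to its standard normal equivalent z."""
    return (x - mean) / sd


def standard_normal_cdf(z):
    """F(z): probability that a standard normal variable is <= z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical example: speeds with mean 50 mph and standard deviation 5 mph.
z = z_value(60, 50, 5)
print(f"z = {z:.2f}, F(z) = {standard_normal_cdf(z):.4f}")
```

The result is the fraction of observations expected at or below the given value, which is the quantity the standard z tables list.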
15. Basic Statistics:
Normal Distribution and Its Applications
Characteristics of the Standard Normal Distribution:
Mean = Median= Mode
Area under the curve (probability) distributed as shown below:
16. Basic Statistics:
Normal Distribution and Its Applications
Characteristics of the Standard Normal Distribution (cont.):
The distribution of the observations is as follows:
68.3% of the observations are within ±1.00 S.D. of the mean
95.0% of the observations are within ±1.96 S.D. of the mean
95.5% of the observations are within ±2.00 S.D. of the mean
99.7% of the observations are within ±3.00 S.D. of the mean
Note: For practicality, it is usually assumed that 99.7% ≈ 100%.
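These percentages can be checked numerically. For a normal distribution, the fraction of observations within k standard deviations of the mean equals erf(k / sqrt(2)); the helper name `fraction_within` is an illustrative choice.

```python
import math


def fraction_within(k):
    """Fraction of a normal population lying within +/- k S.D. of the mean."""
    return math.erf(k / math.sqrt(2.0))


for k in (1.00, 1.96, 2.00, 3.00):
    print(f"within +/-{k:.2f} S.D.: {100 * fraction_within(k):.2f}%")
```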
17. Basic Statistics:
Standard Error, True Mean & Sample Size
Standard Error:
The standard error (E) of the sample mean ($\bar{X}$) is a
function of the sample size and the standard deviation of
the population (the sample SD can be used instead):
\[ E = \frac{\sigma}{\sqrt{N}} \]
where:
$\sigma$ = the standard deviation for the population (the SD for the sample can be used instead)
$N$ = the sample size
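A small sketch of the standard error calculation, with the sample SD standing in for the population value as the slide permits. The numbers are hypothetical.

```python
import math


def standard_error(sd, n):
    """E = SD / sqrt(N): standard error of the sample mean."""
    return sd / math.sqrt(n)


# Hypothetical study: SD of 8 mph measured over 64 observations.
e = standard_error(8.0, 64)
print(f"E = {e:.2f} mph")

# Quadrupling the sample size only halves the standard error.
print(f"E with 4x the sample = {standard_error(8.0, 256):.2f} mph")
```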
18. Basic Statistics:
Standard Error, True Mean & Sample Size
True Mean: $\mu$
The sample mean ($\bar{X}$) is assumed to follow a normal distribution
around the true mean ($\mu$), with a standard deviation equal to the
standard error (E). Hence:
\[ \mu = \bar{X} \pm 1.00\,E \]  (at 68.3% degree of confidence)
\[ \mu = \bar{X} \pm 1.96\,E \]  (at 95% degree of confidence)
\[ \mu = \bar{X} \pm 3.00\,E \]  (at 99.7% degree of confidence)
where:
$E$ = the standard error of the sample mean
$\bar{X}$ = the sample mean
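The interval for the true mean can be sketched as below. The function name `true_mean_interval` and the sample values are hypothetical; the multipliers 1.00, 1.96 and 3.00 are the ones used in the slide.

```python
def true_mean_interval(sample_mean, standard_error, z):
    """Return (lower, upper) bounds of the true mean for a given z multiplier."""
    half_width = z * standard_error
    return sample_mean - half_width, sample_mean + half_width


# Hypothetical sample: mean speed 50 mph with a standard error of 2 mph.
for z in (1.00, 1.96, 3.00):
    lower, upper = true_mean_interval(50.0, 2.0, z)
    print(f"z = {z:.2f}: true mean between {lower:.2f} and {upper:.2f} mph")
```

A wider interval is the price of a higher degree of confidence for the same sample.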
19. Basic Statistics:
Standard Error, True Mean & Sample Size
Sample Size:
For a given allowable error (err) and a specific degree
of confidence, the sample size (N) can be determined as
follows:
\[ N = \left( \frac{1.00 \cdot SD}{err} \right)^2 \]  (at 68.3% degree of confidence)
\[ N = \left( \frac{1.96 \cdot SD}{err} \right)^2 \]  (at 95% degree of confidence)
\[ N = \left( \frac{3.00 \cdot SD}{err} \right)^2 \]  (at 99.7% degree of confidence)
where:
$SD$ = the standard deviation of the sample
$err$ = the maximum allowable error in the true mean
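A minimal sketch of the sample size calculation follows. Rounding up to the next whole observation is an assumption added here, since a fractional sample is not possible; the input values are hypothetical.

```python
import math


def sample_size(sd, err, z):
    """Smallest whole N satisfying N >= (z * SD / err)^2."""
    return math.ceil((z * sd / err) ** 2)


# Hypothetical study: SD of 5 mph, allowable error of 1 mph in the mean.
for z in (1.00, 1.96, 3.00):
    print(f"z = {z:.2f}: N = {sample_size(5.0, 1.0, z)}")
```

Halving the allowable error roughly quadruples the required sample size, because err enters the formula squared.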
20. Basic Statistics: Poisson Distribution
Poisson Distribution:
The Poisson distribution is known in traffic engineering as the “counting” distribution. It
has the clear physical meaning of a number of events (x) occurring in a specified counting
interval of duration (t), and is a one-parameter distribution with:
\[ P(x) = \frac{m^x e^{-m}}{x!} \]
Where:
P(x) = probability of (x) events occurring in a period (t)
m = average number of events occurring in a period (t)
Example:
If the average number of accidents is 12 accidents per year, then m = 12 acc/yr = 1 acc/month.
What is the probability of:
1 accident per month (x = 1)
2 accidents per month (x = 2)
0 accidents per month (x = 0)
21. Basic Statistics: Poisson Distribution
Example:
Suppose the average number of accidents is 12 accidents per year.
What is the probability of:
a. 1 accident occurring per month?
b. 2 accidents occurring per month?
c. a month passing with no accidents?
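The three cases can be worked through with a short sketch of the Poisson formula P(x) = m^x e^(-m) / x!. The function name `poisson_probability` is an illustrative choice; the rate m = 1 accident per month follows from 12 accidents per year.

```python
import math


def poisson_probability(x, m):
    """P(x) = m^x * e^(-m) / x! for an average of m events per period."""
    return (m ** x) * math.exp(-m) / math.factorial(x)


m = 12 / 12  # 12 accidents per year = 1 accident per month

p_one = poisson_probability(1, m)   # (a) exactly 1 accident in a month
p_two = poisson_probability(2, m)   # (b) exactly 2 accidents in a month
p_none = poisson_probability(0, m)  # (c) a month with no accidents

print(f"P(1) = {p_one:.4f}")
print(f"P(2) = {p_two:.4f}")
print(f"P(0) = {p_none:.4f}")
```

With m = 1, cases (a) and (c) both reduce to e^(-1), and case (b) is half of that value.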
22. Basic Statistics:
Poisson Distribution - Time Headways & Gaps:
What is the meaning of a probability of X = 0?
It implies that no events occur in that time period.
Hence, the probability of a time gap of (t) sec occurring in a flow of (V) veh/hr is
the probability of 0 vehicles arriving (i.e. X = 0) during (t), i.e. the probability of a
time headway (h) that is greater than or equal to that gap (h >= t).
The average number of arrivals in a time period (t) is:
m = V * t / 3600
Then the probability of time gaps >= t is:
\[ P(h \ge t) = \frac{m^0 e^{-m}}{0!} = e^{-m} = e^{-V t / 3600} \]
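A brief sketch of the gap probability follows, assuming Poisson arrivals as in the derivation above. The flow and gap values are hypothetical.

```python
import math


def gap_probability(flow_vph, gap_sec):
    """P(h >= t) = exp(-V * t / 3600) for flow V (veh/hr) and gap t (sec)."""
    mean_arrivals = flow_vph * gap_sec / 3600  # m: average arrivals during t
    return math.exp(-mean_arrivals)


# Hypothetical case: 600 veh/hr and a required gap of 6 seconds (m = 1).
p = gap_probability(600, 6)
print(f"P(h >= 6 s) = {p:.4f}")
```

Raising either the flow or the required gap lowers the probability exponentially, which is why acceptable gaps become rare quickly as traffic volume grows.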
23. Basic Statistics:
Poisson Distribution - Time Headways & Gaps (cont.):
Estimation of the number of passing gaps:
\[ \text{Number of gaps} \ge t = V \cdot P(h \ge t) = V \cdot e^{-V t / 3600} \]
Assuming that the follow-up passing vehicle will need a time gap
equal to that of the lead vehicle, then:
\[ \text{Number of passing gaps} = V \cdot \left( e^{-V t / 3600} + e^{-2 V t / 3600} + e^{-3 V t / 3600} + e^{-4 V t / 3600} + \dots \right) \]
Example:
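As an illustration with hypothetical values (V = 600 veh/hr, t = 6 sec, not taken from the slides), the two expressions can be evaluated as sketched below. The series is geometric with ratio q = e^(-V*t/3600), so its sum equals q / (1 - q); the code compares that closed form with a truncated sum. Function names are illustrative, and the closed form assumes V and t are both greater than zero.

```python
import math


def number_of_gaps(flow_vph, gap_sec):
    """Expected gaps >= t per hour: V * exp(-V * t / 3600)."""
    return flow_vph * math.exp(-flow_vph * gap_sec / 3600)


def passing_gaps_truncated(flow_vph, gap_sec, terms=50):
    """V * (q + q^2 + q^3 + ...) summed over a finite number of terms."""
    q = math.exp(-flow_vph * gap_sec / 3600)
    return flow_vph * sum(q ** k for k in range(1, terms + 1))


def passing_gaps_closed_form(flow_vph, gap_sec):
    """Same series summed exactly as a geometric series: V * q / (1 - q)."""
    q = math.exp(-flow_vph * gap_sec / 3600)
    return flow_vph * q / (1 - q)


V, t = 600, 6  # hypothetical flow (veh/hr) and required gap (sec)
print(f"gaps >= t per hour: {number_of_gaps(V, t):.1f}")
print(f"passing gaps (50 terms): {passing_gaps_truncated(V, t):.1f}")
print(f"passing gaps (closed form): {passing_gaps_closed_form(V, t):.1f}")
```

The k-th term of the series counts headways long enough for k vehicles in a row, which is why the total exceeds the simple count of gaps >= t.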