Statistics review

Areas of Statistics
Descriptive statistics
 methods concerned with
collecting, describing, and
analyzing a set of data
without drawing
conclusions (or inferences)
about a larger group
Inferential statistics
 methods concerned
with the analysis of a
subset of data leading
to predictions or
inferences about the
entire set of data

Key Definitions
 Parameters are numerical measures
that describe the population or universe
of interest. Usually denoted by Greek
letters: μ (mu), σ (sigma), ρ (rho), λ
(lambda), τ (tau), θ (theta), α (alpha), and
β (beta).
 Statistics are numerical measures of a
sample
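
To make the parameter/statistic distinction concrete, here is a minimal Python sketch. The population (1,000 simulated heights) and the sample size of 30 are assumptions made up for illustration; they do not come from the slides.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1,000 simulated heights in cm (illustrative only).
population = [random.gauss(165, 8) for _ in range(1000)]

# Parameter: a measure computed from the entire population (mu).
mu = statistics.fmean(population)

# Statistic: the same measure computed from a sample of that population (x-bar).
sample = random.sample(population, 30)
x_bar = statistics.fmean(sample)

print(f"parameter mu    = {mu:.2f}")
print(f"statistic x-bar = {x_bar:.2f}")
```

The statistic usually lands near the parameter without matching it exactly, which is why inferential methods are needed to reason from one to the other.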

Nominal Level of Measurement
 The nominal level of measurement is
characterized by data that consists of
names, labels, or categories only. The
data cannot be arranged in an ordering
scheme.
 Example: gender, civil status, nationality,
religion, etc.

Ordinal Level of Measurement
 The ordinal level of measurement
involves data that may be arranged in
some order, but differences between data
values either cannot be determined or are
meaningless.
 Example: good, better or best speakers; 1
star, 2 star, 3 star movie; employee rank

Interval Level of Measurement
 The interval level of measurement is like
the ordinal level, with the additional
property that meaningful amounts of
differences between data can be
determined. However, there is no
inherent (natural) zero starting point.
 Example: body temperature, year (1955,
1843, 1776, 1123, etc.)

Ratio Level of Measurement
 The ratio level of measurement is the
interval level modified to include an inherent
zero starting point. For values at this
level, differences and ratios are
meaningful.
 Example: weights of plastic, lengths of
videos, distances traveled
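
One way to see the difference between the interval and ratio levels is to check whether a ratio survives a change of units. The sketch below is an illustration under assumed values (10 and 20 degrees Celsius, 5 and 10 kg); the helper names `c_to_f` and `kg_to_lb` are hypothetical.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

def kg_to_lb(kg):
    """Convert kilograms to pounds (approximate conversion factor)."""
    return kg * 2.20462

# Interval level: temperature has no natural zero, so ratios are not meaningful.
t1, t2 = 10.0, 20.0
ratio_c = t2 / t1                  # looks like "twice as hot" in Celsius
ratio_f = c_to_f(t2) / c_to_f(t1)  # the ratio changes in Fahrenheit

# Ratio level: weight has a natural zero, so ratios survive a unit change.
w1, w2 = 5.0, 10.0
ratio_kg = w2 / w1
ratio_lb = kg_to_lb(w2) / kg_to_lb(w1)

print(ratio_c, ratio_f)    # the two ratios differ
print(ratio_kg, ratio_lb)  # the two ratios agree (up to rounding)
```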

Median
 Divides the ordered observations into two equal
parts
 If the number of observations is odd, the
median is the middle number.
 If the number of observations is even, the
median is the average of the 2 middle
numbers.
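
The odd/even rule above can be sketched in Python as follows. The function name `median` and the two small data sets are illustrative assumptions; the standard library's `statistics.median` applies the same rule and is used only as a cross-check.

```python
import statistics

def median(values):
    """Sort the data, then take the middle value (odd count)
    or the average of the two middle values (even count)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_data = [7, 3, 9, 1, 5]       # sorted: 1 3 5 7 9
even_data = [7, 3, 9, 1, 5, 11]  # sorted: 1 3 5 7 9 11

print(median(odd_data))   # 5
print(median(even_data))  # 6.0
print(median(even_data) == statistics.median(even_data))  # True
```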

Measures of Location
 A Measure of Location summarizes a data set
by giving a value within the range of the data
values that describes its location relative to the
entire data set arranged according to magnitude
(called an array).
Some Common Measures:
 Minimum, Maximum
 Percentiles, Deciles, Quartiles

Maximum and Minimum
 Minimum is the smallest value in the
data set, denoted as MIN.
 Maximum is the largest value in the
data set, denoted as MAX.

Percentiles
 Numerical measures that give the
position of a data value
relative to the entire data set.
 Divide an array (raw data arranged
in increasing or decreasing order
of magnitude) into 100 equal parts.
 The jth percentile, denoted as Pj, is
the data value in the data set
that separates the bottom j% of the
data from the top (100-j)%.

Deciles
 Divide an array into ten equal
parts, each part having ten
percent of the distribution of
the data values, denoted by Dj.
 The 1st decile is the 10th
percentile; the 2nd decile is the
20th percentile, and so on.

Quartiles
 Divide an array into four equal parts,
each part having 25% of the distribution
of the data values, denoted by Qj.
 The 1st quartile is the 25th percentile;
the 2nd quartile is the 50th percentile
(the median); and the 3rd quartile is
the 75th percentile.
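
The relationships among percentiles, deciles, and quartiles can be checked with Python's `statistics.quantiles`. This is a sketch under assumptions: the scores are made up, the helper `P` is hypothetical, and the slides do not say which interpolation rule to use, so the library default (the "exclusive" method) stands in. Other textbooks and software may give slightly different cut points.

```python
import statistics

# Hypothetical exam scores, made up for illustration.
scores = [52, 55, 58, 60, 61, 63, 65, 66, 68, 70,
          71, 73, 74, 76, 78, 80, 83, 85, 88, 94]

# statistics.quantiles returns the n - 1 cut points that split
# the sorted data into n parts.
percentiles = statistics.quantiles(scores, n=100)  # P1 .. P99
deciles = statistics.quantiles(scores, n=10)       # D1 .. D9
quartiles = statistics.quantiles(scores, n=4)      # Q1, Q2, Q3

def P(j):
    """Return the jth percentile (1 <= j <= 99)."""
    return percentiles[j - 1]

print("Q1, Q2, Q3:", quartiles)
print("D1 vs P10:", deciles[0], P(10))
print("Q1 vs P25:", quartiles[0], P(25))
print("Q2 vs median:", quartiles[1], statistics.median(scores))
```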

Measures of Variation
 A measure of variation is a single
value that is used to describe the
spread of the distribution

Two Types of Measures of
Dispersion
Absolute Measures of Dispersion:
 Range
 Inter-quartile Range
 Variance
 Standard Deviation
Relative Measure of Dispersion:
 Coefficient of Variation
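
The slides list these measures without formulas, so the sketch below uses the usual textbook definitions as an assumption: range = MAX - MIN, inter-quartile range = Q3 - Q1, and coefficient of variation = (SD / mean) x 100%. The data values are made up for illustration, and the quartiles follow the default rule of `statistics.quantiles`.

```python
import statistics

# Hypothetical measurements, made up for illustration.
data = [12, 15, 11, 18, 20, 14, 16, 13]

# Absolute measures: expressed in the units of the data (or squared units).
data_range = max(data) - min(data)           # MAX - MIN
q1, _, q3 = statistics.quantiles(data, n=4)  # Q1, Q2, Q3
iqr = q3 - q1                                # inter-quartile range
variance = statistics.variance(data)         # sample variance (n - 1)
sd = statistics.stdev(data)                  # sample standard deviation

# Relative measure: unit-free, so it can compare data sets
# measured in different units or on different scales.
cv = sd / statistics.mean(data) * 100        # coefficient of variation, in %

print("range:", data_range)
print("IQR:", iqr)
print("variance:", variance)
print("SD:", sd)
print("CV (%):", cv)
```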

Variance
 important measure of variation
 shows variation about the mean
Population variance
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
Sample variance
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
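
A short Python sketch of the two variance definitions: the population variance divides the sum of squared deviations by N, and the sample variance divides it by n - 1. The function names and the data set are illustrative assumptions; `statistics.pvariance` and `statistics.variance` serve only as cross-checks.

```python
import statistics

def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    N = len(values)
    mu = sum(values) / N
    return sum((x - mu) ** 2 for x in values) / N

def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)

# Illustrative data: mean is 5, squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_variance(data))  # 4.0  (32 / 8)
print(sample_variance(data))      # 32 / 7, slightly larger

# Cross-check against the standard library.
print(statistics.pvariance(data), statistics.variance(data))
```

For the same data, the sample formula always gives the larger value because it divides by n - 1 rather than N.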

Standard Deviation (SD)
 most important measure of variation
 square root of Variance
 has the same units as the original data
Population SD
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$
Sample SD
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
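
Since the standard deviation is the square root of the variance, a sketch needs only one extra step. The function names and data below are illustrative assumptions; `statistics.pstdev` and `statistics.stdev` are used as cross-checks.

```python
import math
import statistics

def population_sd(values):
    """Square root of the population variance (divide by N)."""
    N = len(values)
    mu = sum(values) / N
    return math.sqrt(sum((x - mu) ** 2 for x in values) / N)

def sample_sd(values):
    """Square root of the sample variance (divide by n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("sample SD needs at least two values")
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))

# Illustrative data: mean is 5, population variance is 4.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_sd(data))  # 2.0, in the same units as the data
print(sample_sd(data))      # sqrt(32 / 7)

# Cross-check against the standard library.
print(statistics.pstdev(data), statistics.stdev(data))
```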

Remarks on Standard Deviation
 If there is a large amount of variation,
then on average, the data values will be
far from the mean. Hence, the SD will be
large.
 If there is only a small amount of
variation, then on average, the data
values will be close to the mean. Hence,
the SD will be small.

Guiding Principle
 The larger the value of the
measure, the more dispersed (more
varied) the observations are.
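
The remarks and guiding principle can be illustrated by comparing two data sets that share a mean but differ in spread. Both data sets are made up for this sketch.

```python
import statistics

# Two hypothetical data sets with the same mean (50).
tight = [48, 49, 50, 51, 52]  # values close to the mean
wide = [10, 30, 50, 70, 90]   # values far from the mean

for name, values in (("tight", tight), ("wide", wide)):
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    print(f"{name}: mean = {mean}, SD = {sd:.2f}")
```

The mean alone cannot tell the two sets apart. The larger SD of the second set reflects its more dispersed observations.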
