WHY WE NEED STATISTICS
First, statistics are used for purposes of description.
Second, we can use statistics to make inferences, which are logical
deductions about events that cannot be observed directly.
3.
WHY WE NEED STATISTICS
• Data gathering and analysis might be considered analogous to criminal investigation and
prosecution (Cox, 2006; Regenwetter, 2006; Tukey)
• First comes the detective work of gathering and displaying clues, or what the statistician John
Tukey calls exploratory data analysis.
• Then comes a period of confirmatory data analysis, when the clues are evaluated against rigid
statistical rules.
• Descriptive statistics are methods used to provide a concise description of a collection of
quantitative information.
• Inferential statistics are methods used to make inferences from observations of a small
group of people known as a sample to a larger group of individuals known as a population.
PROPERTIES
OF SCALES
Magnitude- It is the property of “moreness”.
Equal Intervals- A scale has equal intervals if the difference
between two points at any place on the scale has the same meaning as
the difference between two other points that differ by the same
number of scale units.
Absolute 0- It is obtained when nothing of the property
being measured exists.
7.
NOMINAL
-Eye Color (e.g. Blue, Brown, Green)
-Nationality (e.g. German, Filipino, Lebanese)
-Personality Type (e.g. Introvert, Extrovert)
-Employment Status (e.g. Unemployed, Part-time, Retired)
-Type of Smartphone Owned (e.g. iPhone, Samsung, Google Pixel)
INTERVAL
-Temperature in Degrees Fahrenheit or Celsius (but not Kelvin)
-IQ Score
-Income Categorized as Ranges (P30-39k, P40-49k,
P50-59k, and so on)
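The note that Kelvin does not belong with the interval examples follows from the absolute 0 property, and it can be checked numerically. The sketch below is only an illustration: the two temperatures and the helper name `celsius_to_kelvin` are assumptions, not part of the slides. It shows that differences mean the same thing on both scales, while a ratio such as "twice as hot" only makes sense on the scale with an absolute 0.

```python
def celsius_to_kelvin(c):
    """Convert degrees Celsius to kelvins (hypothetical helper for this sketch)."""
    return c + 273.15

# Two illustrative temperatures, chosen arbitrarily.
a_c, b_c = 10.0, 20.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Equal intervals: a 10-degree difference is the same on both scales.
diff_c = b_c - a_c
diff_k = b_k - a_k

# No absolute 0 on Celsius: the ratio 20/10 = 2 does not mean "twice as hot".
ratio_c = b_c / a_c
# Kelvin has an absolute 0, so this ratio is the physically meaningful one.
ratio_k = b_k / a_k

print(f"difference: {diff_c:.2f} C vs {diff_k:.2f} K")
print(f"ratio: {ratio_c:.3f} on Celsius vs {ratio_k:.3f} on Kelvin")
```

The differences agree, but the ratios do not, which is why Celsius and Fahrenheit stop at the interval level.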
10.
RATIO
-Weight in grams
-Number of employees at a company
-Speed in miles per hour
-Length in centimeters
-Age in years
-Income in dollars
11.
FREQUENCY DISTRIBUTIONS
• The frequency distribution displays scores on a variable or a measure to reflect how
frequently each value was obtained.
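A minimal sketch of building a frequency distribution, assuming a small set of made-up scores (the data and the function name `frequency_distribution` are illustrative only). It counts how often each value occurs and prints the result as a simple table with a text bar for each value.

```python
from collections import Counter

def frequency_distribution(scores):
    """Return (value, frequency) pairs sorted by value."""
    counts = Counter(scores)
    return sorted(counts.items())

# Hypothetical scores on a 5-point measure.
scores = [3, 5, 4, 5, 2, 4, 5, 3, 5, 4]

print("score  frequency")
for value, freq in frequency_distribution(scores):
    # One asterisk per occurrence gives a rough text histogram.
    print(f"{value:>5}  {freq:>9}  {'*' * freq}")
```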
14.
PERCENTILE RANKS
• Percentile ranks replace simple ranks when we want to adjust for the
number of scores in a group.
• A percentile rank answers the question, “What percent of the scores fall
below a particular score (Xi)?”
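The question above translates directly into a short calculation: count the scores below Xi, divide by the total number of scores, and multiply by 100. The sketch below assumes ten made-up test scores; the function name `percentile_rank` is illustrative.

```python
def percentile_rank(scores, x):
    """Percentage of scores that fall strictly below x."""
    below = sum(1 for s in scores if s < x)
    return 100 * below / len(scores)

# Hypothetical test scores for a group of ten people.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

# Six of the ten scores are below 85, so the percentile rank of 85 is 60.
print(percentile_rank(scores, 85))
```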
15.
PERCENTILES
• Percentiles are the specific scores or points within a distribution.
• Percentiles divide the total frequency for a set of observations into
hundredths.
• Instead of indicating what percentage of scores fall below a particular score,
as percentile ranks do, percentiles indicate the particular score, below which
a defined percentage of scores falls.
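Going from a percentage to a score can be sketched as follows. Conventions for locating a percentile differ between textbooks and software; this example assumes the nearest-rank rule (take the score at position ceil(p × N / 100) in the ordered list), which is one common choice and not necessarily the one these slides intend. The data and the function name `percentile` are illustrative.

```python
import math

def percentile(scores, p):
    """Nearest-rank percentile: the score at rank ceil(p * N / 100), for 0 < p <= 100."""
    if not 0 < p <= 100:
        raise ValueError("p must be in the interval (0, 100]")
    ordered = sorted(scores)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]

# Hypothetical test scores for a group of ten people.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

print(percentile(scores, 75))  # score at the 75th percentile under this rule
print(percentile(scores, 50))  # score at the 50th percentile under this rule
```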
16.
PERCENTILE RANK VS PERCENTILE
Percentile Rank: This is a percentage indicating how a particular value (like a test score) compares to the rest of the
scores in a distribution. For example, a percentile rank of 80 means that 80% of the scores in the distribution are
lower than that specific score.
Percentile: This is a value itself that divides a dataset into 100 equal parts. The nth percentile represents the value
below which n% of the data points fall. For example, the 75th percentile is the value that separates the bottom 75%
of the data from the top 25%.
Example: Imagine a class of 100 students took a test.
• If Sarah scores at the 80th percentile, it means that 80% of the students scored below her. Her percentile rank is
80.
• If the 75th percentile score was 85, it means that 75% of the students scored below 85. 85 is the value at the 75th
percentile.
MEASURES OF CENTRAL TENDENCY
1. MEAN
The arithmetic average score in a distribution is called the mean.
To calculate the mean, we total the scores and divide the sum by
the number of cases, or N. The capital Greek letter sigma (Σ)
means summation.
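The calculation described above, mean = ΣX / N, can be sketched in a few lines. The scores below are illustrative and the function name `mean` is an assumption for this example.

```python
def mean(scores):
    """Arithmetic average: the sum of the scores divided by the number of cases N."""
    return sum(scores) / len(scores)

# Illustrative scores: the sum is 20 and N is 5.
scores = [2, 5, 1, 8, 4]
print(mean(scores))
```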
19.
MEASURES OF CENTRAL TENDENCY
2. MEDIAN
The median, defined as the middle score in a distribution, is determined by ordering the scores in
a list by magnitude, in either ascending or descending order.
To calculate the median, first arrange the data set in ascending order. If the number of data points
is odd, the median is the middle value. If the number of data points is even, the median is the
average of the two middle values.
Example: Data set: 2, 5, 1, 8, 4
Arrange in ascending order: 1, 2, 4, 5, 8
The median is the middle value, which is 4.
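The same steps can be sketched in code, covering both the odd and the even case. The function name `median` is illustrative, and the six-value data set is an added example for the even case.

```python
def median(scores):
    """Middle score of the ordered data; average of the two middle scores if N is even."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of data points: the single middle value.
        return ordered[mid]
    # Even number of data points: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([2, 5, 1, 8, 4]))     # odd case from the example above
print(median([2, 5, 1, 8, 4, 9]))  # even case: the two middle values are 4 and 5
```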
20.
MEASURES OF CENTRAL TENDENCY
3. MODE
The most frequently occurring score in a distribution of scores is the mode.
As an example, determine the mode for the following scores obtained by another TRW job
applicant, Bruce. The scores reflect the number of words Bruce word-processed in seven
1-minute trials:
43 34 45 51 42 31 51
The mode is 51, the only score that occurs more than once.
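Finding the mode amounts to counting how often each score occurs and keeping the most frequent one. The sketch below uses Bruce's seven scores; the function name `modes` is illustrative, and it returns a list because a distribution can have more than one mode.

```python
from collections import Counter

def modes(scores):
    """Return every score that occurs with the highest frequency, in ascending order."""
    counts = Counter(scores)
    highest = max(counts.values())
    return sorted(value for value, freq in counts.items() if freq == highest)

# Bruce's words-per-minute scores from the seven trials.
bruce_scores = [43, 34, 45, 51, 42, 31, 51]
print(modes(bruce_scores))
```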
21.
MEASURES OF VARIABILITY
Variability is an indication of how scores in a distribution are scattered or dispersed.
1. The range of a distribution is equal to the difference between the highest and the lowest
scores.
2. Quartiles are the dividing points that split a distribution of test scores into four parts.
3. The interquartile range is a measure of variability equal to the difference between Q3
and Q1.
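The three measures above can be computed together. This sketch reuses Bruce's seven word-processing scores and relies on `statistics.quantiles` from the Python standard library with its default method; other textbooks and packages locate quartiles slightly differently, so the quartile values should be read as one convention among several. The function name `describe_variability` is illustrative.

```python
from statistics import quantiles

def describe_variability(scores):
    """Return the range, the three quartile cut points, and the interquartile range."""
    score_range = max(scores) - min(scores)
    # n=4 asks for the three cut points that split the data into four parts.
    q1, q2, q3 = quantiles(scores, n=4)
    return {"range": score_range, "q1": q1, "q2": q2, "q3": q3, "iqr": q3 - q1}

# Bruce's words-per-minute scores from the seven trials.
bruce_scores = [43, 34, 45, 51, 42, 31, 51]
summary = describe_variability(bruce_scores)
print(summary)
```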
24.
SKEWNESS
• Skewness is an indication of how asymmetrically the measurements in a distribution are distributed.
1. A distribution has a positive skew when relatively few of the scores fall at the high end of
the distribution.
2. A distribution has a negative skew when relatively few of the scores fall at the low end of
the distribution.
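One way to put a number on the direction of skew is the moment coefficient of skewness (the third central moment divided by the cubed standard deviation). This is a common index but only one of several, and the slides do not name a specific formula, so treat it as an assumption. The data sets and the function name `skewness` are illustrative.

```python
def skewness(scores):
    """Moment coefficient of skewness: positive for a long high-end tail, negative for a long low-end tail."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n  # second central moment (variance)
    m3 = sum((x - mean) ** 3 for x in scores) / n  # third central moment
    if m2 == 0:
        return 0.0  # all scores identical: no spread, so no skew
    return m3 / m2 ** 1.5

few_high_scores = [1, 2, 2, 3, 10]  # few scores at the high end: positive skew
few_low_scores = [1, 8, 9, 9, 10]   # few scores at the low end: negative skew

print(skewness(few_high_scores))
print(skewness(few_low_scores))
```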
NORMS
• Norms refer to the performances by defined
groups on particular tests.
• Age-Related Norms- Certain tests have different
normative groups for particular age groups.
• A norm-referenced test compares each person
with a norm.
• A criterion-referenced test describes the
specific types of skills, tasks, or knowledge that
the test taker can demonstrate, such as
mathematical skills.
Editor's Notes
#11 A frequency distribution of data can be shown in a table or graph.
#15 Percentile rank = (Number of data points below the score / Total number of data points) * 100