2. Measurement: the act of assigning numbers
or symbols to characteristics of things
(people, events, whatever) according to
rules.
3. Scale: a set of numbers (or other symbols) whose
properties model empirical properties of the
objects to which the numbers are assigned.
Categories of Scale according to type of variable:
1. Continuous scale - a scale used to measure a continuous
variable
Scale measuring the time spent playing ML
2. Discrete scale - a scale used to measure a discrete
variable
Scale measuring two discrete groups: previously
hospitalized and never hospitalized
4. Error refers to the collective influence of all of the
factors on a test score or measurement beyond
those specifically measured by the test or
measurement
Ex. score of 25 on some test of anxiety should not be
thought of as a precise measure of anxiety
5. Nominal scale: classification or categorization based on one
or more distinguishing characteristics
Categories are mutually exclusive and exhaustive
Nominal values cannot be meaningfully added, subtracted,
ranked, or averaged
Example: Mental Disorders in DSM
6. Individual test items may also employ nominal scaling, including
yes/no responses.
For example, consider the following test items:
Instructions: Answer either yes or no.
1. Are you actively contemplating suicide? __________
2. Are you currently under professional care for a
psychiatric disorder? _______
3. Have you ever been convicted of a felony? _______
7. In each case, a yes or no response results in the placement into one
of a set of mutually exclusive groups: suicidal or not, under care for
psychiatric disorder or not, and felon or not. Arithmetic operations
that can legitimately be performed with nominal data include
counting for the purpose of determining how many cases fall into
each category and a resulting determination of proportion or
percentages.
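The counting described above can be sketched in a few lines. This is only an illustration: the ten yes/no answers are hypothetical, and the variable names are assumptions rather than anything from the slides.

```python
from collections import Counter

# Hypothetical yes/no answers from ten testtakers to a single nominal item.
responses = ["no", "no", "yes", "no", "no", "no", "yes", "no", "no", "no"]

# Counting cases per category is a legitimate operation on nominal data.
counts = Counter(responses)
total = sum(counts.values())

# Proportions and percentages follow directly from the counts.
proportions = {category: n / total for category, n in counts.items()}
percentages = {category: 100 * p for category, p in proportions.items()}

for category in counts:
    print(category, counts[category], f"{percentages[category]:.0f}%")
```

Note that nothing here adds, ranks, or averages the categories themselves; only the number of cases in each category is used.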
8. Ordinal scale: permits classification and rank
ordering on some characteristic
Ranks imply nothing about how much
greater one ranking is than another
Example: 1 as least important, 10
as most important
9. Interval scale: contains equal intervals between numbers.
Each unit on the scale is exactly equal to any other
unit on the scale.
Contains no absolute zero point
Example: IQ Tests
10. Example: IQ Tests
The difference in intellectual ability represented by IQs of 80 and
100, for example, is thought to be similar to that existing between IQs
of 100 and 120. However, if an individual were to achieve an IQ of 0
(something that is not even possible, given the way most intelligence
tests are structured), that would not be an indication of zero (the total
absence of) intelligence. Because interval scales contain no absolute
zero point, a presumption inherent in their use is that no testtaker
possesses none of the ability or trait (or whatever) being measured.
11. Ratio scale: has equal intervals between the numbers on the
scale as well as a true or absolute zero point.
All mathematical operations can
meaningfully be performed because there
exist equal intervals between the numbers
on the scale as well as a true or absolute
zero point.
12. In psychology, ratio-level measurement is employed in some types
of tests and test items, perhaps most notably those involving
assessment of neurological functioning.
An example is a timed test of perceptual-motor ability that requires the
testtaker to assemble a jigsaw-like puzzle. The time taken to
successfully complete the puzzle is the measure that is recorded.
Because there is a true zero point on this scale (or, 0 seconds), it is
meaningful to say that a testtaker who completes the assembly in 30
seconds has taken half the time of a testtaker who completed it in 60
seconds.
In this example, it is meaningful to speak of a true zero point on the
scale—but in theory only. Why? Just think . . .
13. No testtaker could ever obtain a score of zero on this assembly task.
Stated another way, no testtaker, not even The Flash (a comic-book
superhero whose power is the ability to move at superhuman speed),
could assemble the puzzle in zero seconds.
14. On Pain
Nominal level- Are you currently in pain? Yes or No
Ordinal level- How bad is the pain right now? None,
Mild, Moderate, Severe
Interval/Ratio level- 0-10 Numerical Scale
0--1--2--3--4--5--6--7--8--9--10
0 = No Pain; 10 = Worst Pain Imaginable
15. The ordinal level of measurement is most frequently
used in psychology. As Kerlinger (1973, p. 439) put
it: “Intelligence, aptitude, and personality test
scores are, basically and strictly speaking, ordinal.
These tests indicate with more or less accuracy not
the amount of intelligence, aptitude, and personality
traits of individuals, but rather the rank-order
positions of the individuals.”
16.
17. Suppose you have magically changed places with
the professor teaching this course and that you
have just administered an examination that
consists of 100 multiple-choice items (where 1
point is awarded for each correct answer).
The distribution of scores for the 25 students
enrolled in your class could theoretically range
from 0 (none correct) to 100 (all correct).
18. A distribution may be defined as a set of test
scores arrayed for recording or study.
A raw score is a straightforward, unmodified
accounting of performance that is usually
numerical.
35, 60, 65, 81, 95, 98
19. In a frequency distribution, all scores are listed alongside the
number of times each score occurred.
Two types: simple frequency distribution and grouped frequency
distribution
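A minimal sketch of both kinds of frequency distribution follows. The twelve scores are hypothetical, and the class-interval width of 10 is an assumption chosen only for illustration.

```python
from collections import Counter

# Hypothetical raw scores (not taken from the slides).
scores = [55, 62, 62, 67, 71, 71, 71, 78, 84, 84, 90, 96]

# Simple frequency distribution: each score with the number of times it occurred.
simple = Counter(scores)

# Grouped frequency distribution: scores are tallied into class intervals.
width = 10  # assumed interval width
grouped = Counter()
for score in scores:
    lower = (score // width) * width
    grouped[f"{lower}-{lower + width - 1}"] += 1

for score in sorted(simple, reverse=True):
    print(score, simple[score])
for interval in sorted(grouped, reverse=True):
    print(interval, grouped[interval])
```

The grouped version trades detail for compactness: the exact scores inside each interval are no longer visible.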
21. A graph is a diagram or chart composed of
lines, points, bars, or other symbols that
describe and illustrate data.
22.
23.
24. One way to describe a distribution of test scores.
A measure of central tendency is a statistic that
indicates the average or midmost score between the
extreme scores in a distribution.
25. Mean: denoted by the symbol $\bar{X}$ (and pronounced “X
bar”)
Equal to the sum of the observations
divided by the number of observations
Most appropriate measure of central
tendency for interval or ratio data when
the distributions are believed to be
approximately normal.
$$\bar{X} = \frac{\sum X}{N}$$
26. Median: defined as the middle score in a distribution
Determined by ordering the scores in a list by magnitude,
in either ascending or descending order.
Cuts the distribution into two equal parts
The median is an appropriate measure of central tendency for ordinal,
interval, and ratio data. The median may be a particularly useful
measure of central tendency in cases where relatively few scores fall
at the high end of the distribution or relatively few scores fall at the
low end of the distribution.
When the N scores are ordered, the median is at position $\frac{N + 1}{2}$
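As a quick illustration of the two measures of central tendency above, the sketch below uses the six raw scores listed earlier in the deck (35, 60, 65, 81, 95, 98) and Python's standard `statistics` module. The variable names are assumptions.

```python
from statistics import mean, median

# The six raw scores shown earlier in the deck.
scores = [35, 60, 65, 81, 95, 98]

# Mean: sum of the observations divided by the number of observations.
x_bar = mean(scores)

# Median: middle score of the ordered list. With an even number of scores,
# it is the average of the two middle scores (here 65 and 81).
middle = median(scores)

# Position of the median in the ordered list, (N + 1) / 2.
position = (len(scores) + 1) / 2

print(f"mean = {x_bar:.2f}, median = {middle}, median position = {position}")
```

A position of 3.5 means the median lies halfway between the third and fourth ordered scores.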
29. Skewness: the nature and extent to
which symmetry is absent.
Skewness is an indication of how the measurements in a
distribution are distributed.
1. Positively Skewed
A distribution has a positive skew when relatively few of the
scores fall at the high end of the distribution. Positively skewed
examination results may indicate that the test was too difficult.
More items that were easier would have been desirable in order
to better discriminate at the lower end of the distribution of test
scores.
2. Negatively Skewed
A distribution has a negative skew when relatively few of the
scores fall at the low end of the distribution. Negatively skewed
examination results may indicate that the test was too easy. In
this case, more items of a higher level of difficulty would make it
possible to better discriminate between scores at the upper end
of the distribution.
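The slides do not give a formula for skewness, so the sketch below assumes one common definition, the moment coefficient (third central moment divided by the second central moment raised to the power 1.5). The two score lists are hypothetical.

```python
from statistics import fmean

def skewness(scores):
    """Moment coefficient of skewness (an assumed, commonly used definition)."""
    m = fmean(scores)
    m2 = fmean([(x - m) ** 2 for x in scores])  # second central moment
    m3 = fmean([(x - m) ** 3 for x in scores])  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical results from a test that was too difficult:
# most scores are low and relatively few fall at the high end.
hard_test = [20, 25, 25, 30, 30, 30, 35, 40, 55, 90]

# Mirror image: a test that was too easy, with few scores at the low end.
easy_test = [100 - x for x in hard_test]

print(f"hard test skewness = {skewness(hard_test):.2f}")  # positive
print(f"easy test skewness = {skewness(easy_test):.2f}")  # negative
```

The sign of the coefficient matches the verbal definitions above: positive when the long tail is at the high end, negative when it is at the low end.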
30. The term skewed carries with it
negative implications for many
students. We suspect that skewed is
associated with abnormal, perhaps
because the skewed distribution
deviates from the symmetrical or so-
called normal distribution. However,
the presence or absence of symmetry
in a distribution (skewness) is simply
one characteristic by which a
distribution can be described.
31. Kurtosis refers to the steepness of a distribution in its
center
Kurtosis is a measure of whether the data
are peaked or flat relative to a normal
distribution.
Leptokurtic (relatively peaked); mesokurtic (somewhere in the middle); platykurtic (relatively flat)
32. High kurtosis - characterized by a
high peak and “fatter” tails
compared to a normal distribution.
Lower kurtosis values - indicate a
distribution with a rounded peak
and thinner tails.
According to the original definition,
the normal bell-shaped curve would
have a kurtosis value of 3. In other
methods of computing kurtosis, a
normal distribution would have
kurtosis of 0, with positive values
indicating higher kurtosis and
negative values indicating lower
kurtosis.
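The two conventions mentioned above differ only by a constant of 3. The sketch below assumes the moment-based definition (fourth central moment divided by the squared second central moment), under which a normal curve has a value of 3; the five data points are hypothetical.

```python
from statistics import fmean

def kurtosis(scores, excess=False):
    """Moment-based kurtosis; excess=True subtracts 3 so a normal curve is 0."""
    m = fmean(scores)
    m2 = fmean([(x - m) ** 2 for x in scores])  # second central moment
    m4 = fmean([(x - m) ** 4 for x in scores])  # fourth central moment
    value = m4 / m2 ** 2
    return value - 3 if excess else value

# Hypothetical, evenly spread scores: flatter than a normal curve.
flat = [1, 2, 3, 4, 5]

print(f"original definition: {kurtosis(flat):.2f}")
print(f"excess kurtosis:     {kurtosis(flat, excess=True):.2f}")
```

A value below 3 on the original scale, or below 0 on the excess scale, points to a relatively flat (platykurtic) distribution.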
33.
34. Range: equal to the difference between
the highest and the lowest
scores
Extreme scores can radically
alter the value of the range.
35. Variance is equal to the
arithmetic mean of the
squares of the differences
between the scores in a
distribution and their mean
Standard deviation is a
measure of variability equal
to the square root of the
average squared deviations
about the mean; equal to the
square root of the variance.
$$s^2 = \frac{\sum (X - \bar{X})^2}{N}$$
$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$$
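A small sketch of the range, variance, and standard deviation, using the six raw scores listed earlier in the deck. It assumes the definition given on this slide, in which the sum of squared deviations is divided by N; in Python's `statistics` module that corresponds to `pvariance` and `pstdev` (the `variance` and `stdev` functions divide by N − 1 instead).

```python
from statistics import pstdev, pvariance

# The six raw scores shown earlier in the deck.
scores = [35, 60, 65, 81, 95, 98]

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# Variance: mean of the squared differences between each score and the mean.
variance = pvariance(scores)

# Standard deviation: square root of the variance.
sd = pstdev(scores)

print(f"range = {score_range}, variance = {variance:.2f}, sd = {sd:.2f}")
```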
36.
37. Development of the concept of a normal
curve began in the middle of the eighteenth
century with the work of Abraham DeMoivre
and, later, the Marquis de Laplace.
At the beginning of the nineteenth century,
Karl Friedrich Gauss made some substantial
contributions.
Through the early nineteenth century,
scientists referred to it as the “Laplace-
Gaussian curve.”
Karl Pearson is credited with being the first
to refer to the curve as the normal curve
38. Theoretically, the normal curve is a bell-shaped,
smooth, mathematically defined curve that is
highest at its center. From the center it tapers on
both sides approaching the X-axis asymptotically
(meaning that it approaches, but never touches, the
axis).
In theory, the distribution of the normal curve
ranges from negative infinity to positive infinity.
The curve is perfectly symmetrical, with no
skewness. If you folded it in half at the mean, one
side would lie exactly on top of the other. Because it
is symmetrical, the mean, the median, and the
mode all have the same exact value.
39. It is bell shaped
It is bilaterally symmetrical
It has tails that approach but never touch the baseline, and
thus its limits extend to ± infinity (±∞)
It is unimodal.
It has a mean, median, and mode that coincide at the center
of the distribution
When the normal curve has a mean of 0 and a standard
deviation of 1, it is called the standard normal distribution.
40.
41.
42. Descriptive Uses
Generating the standard scores
Inferential Uses of the Normal Curve Model
(a) Estimating population parameters
(b) Testing hypotheses about differences
between means
43. Individuals who are mentally retarded or gifted share the
burden of deviance from the norm, in both a developmental
and a statistical sense. In terms of mental ability as
operationalized by tests of intelligence, performance that is
approximately two standard deviations from the mean (or, IQ
of 70–75 or lower or IQ of 125–130 or higher) is one key
element in identification.
Success at life’s tasks, or its absence, also plays a defining role,
but the primary classifying feature of both gifted and retarded
groups is intellectual deviance. These individuals are out of
sync with more average people, simply by their difference from
what is expected for their age and circumstance. (Robinson et
al., 2000, p. 1413)
44. Robinson et al. (2000) convincingly demonstrated that
knowledge of the areas under the normal curve can be quite
useful to the interpreter of test data.
This knowledge can tell us not only something about where the
score falls among a distribution of scores but also something
about a person and perhaps even something about the people
who share that person’s life.
This knowledge might also convey something about how
impressive, average, or lackluster the individual is with
respect to a particular discipline or ability. For example,
consider a high-school student whose score on a national, well-
respected spelling test is close to 3 standard deviations above
the mean.
45.
46.
47.
48. A standard score is a raw score that
has been converted from one scale to
another scale, where the latter scale
has some arbitrarily set mean and
standard deviation.
49. With a standard score, the position of a
testtaker’s performance relative to other
testtakers is readily apparent.
Standard scores provide a convenient
context for comparing scores on
different tests.
50. First for consideration is the type of standard score scale that may
be thought of as the zero plus or minus one scale.
This is so because it has a mean set at 0 and a standard deviation
set at 1. Raw scores converted into standard scores on this scale are
more popularly referred to as z scores.
51. z score (zero plus or minus one scale)
Results from the conversion of a raw score into a
number indicating how many standard deviation
units the raw score is below or above the mean of
the distribution.
52.
53. Raw score of 10, mean of 6,
standard deviation of 3
z = (10 − 6) / 3 ≈ 1.33
Raw score of 7, mean of 5.5, standard
deviation of 1.5
z = (7 − 5.5) / 1.5 = 1
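The two conversions above can be checked with a short function. The function name and argument names are assumptions; the arithmetic is simply the raw score minus the mean, divided by the standard deviation.

```python
def z_score(raw, mean, sd):
    """Number of standard deviation units a raw score lies above or below the mean."""
    return (raw - mean) / sd

# The two examples from the slide.
z_first = z_score(10, 6, 3)
z_second = z_score(7, 5.5, 1.5)

print(f"z = {z_first:.2f}")
print(f"z = {z_second:.2f}")
```

Although the first raw score is higher, the comparison that matters is in standard deviation units: the first testtaker stands somewhat further above the mean of their own distribution than the second.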
54. T score: fifty plus or minus ten scale
Devised by W. A. McCall (1922, 1939)
and named a T score in honor of his
professor E. L. Thorndike
Composed of a scale that ranges from 5
standard deviations below the mean to 5
standard deviations above the mean.
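Because the T scale has a mean of 50 and a standard deviation of 10, a z score can be converted with one line of arithmetic. The function name below is an assumption; this is a sketch, not a scoring procedure from the slides.

```python
def t_score(z):
    """Convert a z score to the fifty plus or minus ten (T score) scale."""
    return 50 + 10 * z

# The ends of the scale, 5 standard deviations below and above the mean,
# along with the mean itself and a score 1 standard deviation above it.
for z in (-5, 0, 1, 5):
    print(z, t_score(z))
```

One practical consequence is that T scores within this range are never negative, unlike z scores.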
55.
56. Stanine: a contraction of the
words standard and
nine
A standard score with a mean of 5 and a
standard deviation of
approximately 2
Stanine scoring may be familiar to many
students from achievement tests
administered in elementary and secondary
school, where test scores are often
represented as stanines. Stanines are
different from other standard scores in that
they take on whole values from 1 to 9, which
represent a range of performance that is
half of a standard deviation in width.
57. Deviation IQ: mean set at 100 and a standard
deviation set at 15
Approximately 95% of deviation
IQs range from 70 to 130, that is,
from 2 standard deviations below
to 2 standard deviations above the mean.
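The "approximately 95%" figure can be checked against the normal curve model. The sketch below assumes deviation IQs follow a normal distribution with mean 100 and standard deviation 15, and uses `NormalDist` from Python's standard library.

```python
from statistics import NormalDist

# Assumed model: deviation IQs are normally distributed, mean 100, SD 15.
iq = NormalDist(mu=100, sigma=15)

# Area under the curve between 70 and 130.
share_70_to_130 = iq.cdf(130) - iq.cdf(70)

# The same boundaries expressed in standard deviation units.
z_low = (70 - iq.mean) / iq.stdev
z_high = (130 - iq.mean) / iq.stdev

print(f"z from {z_low:.0f} to {z_high:.0f}: {100 * share_70_to_130:.1f}% of scores")
```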
58.
59. Central to psychological testing and
assessment are inferences (deduced
conclusions) about how some things (such as
traits, abilities, or interests) are related to
other things (such as behavior).
An understanding of the concept of
correlation and an ability to compute a
coefficient of correlation is therefore central
to the study of tests and measurement.
60. Correlation is an expression of the degree and
direction of correspondence between two things.
Coefficient of correlation
A coefficient of correlation (or correlation coefficient) is a number
that provides us with an index of the strength of the relationship
between two things.
A coefficient of correlation (r) expresses a linear relationship
between two (and only two) variables, usually continuous in
nature. It reflects the degree of associated variation between
variable X and variable Y.
The coefficient of correlation is the numerical index that
expresses this relationship: It tells us the extent to which X and
Y are “co-related.”
61. The meaning of a correlation coefficient is
interpreted by its sign and magnitude.
If a correlation coefficient were asked to supply information
about its sign, it would respond with
“plus” (for a positive correlation),
“minus” (for a negative correlation), or
“none” (in the rare instance that the correlation coefficient
was exactly equal to zero).
If asked to supply information about its magnitude, it
would respond with a number anywhere at all between −1
and +1.
62. Example:
The following data represent the number of hours 12
different students watched television during the
weekend and the scores of each student who took a test
the following Monday.
Hours, x        0   1   2   3   3   5   5   5   6   7   7  10
Test score, y  96  85  82  74  95  68  76  84  58  65  75  50
63. 1. Positive Correlation
If two variables simultaneously increase or
simultaneously decrease, then those two variables are
said to be positively (or directly) correlated.
The height and weight of normal, healthy children ranging in
age from birth to 10 years tend to be positively or directly
correlated. As children get older, their height and their weight
generally increase simultaneously.
A positive correlation also exists when two variables
simultaneously decrease. For example, the less a student
prepares for an examination, the lower that student’s score on
the examination.
64. 2. Negative (or inverse) correlation
occurs when one variable increases
while the other variable decreases.
For example, there tends to be an inverse
relationship between the number of times
students use cell phones during class to
text, tweet, or check Facebook and their test
performance.
65. 3. Zero correlation
Occurs when absolutely no relationship exists between the
two variables. Some might consider
“perfectly no correlation” to be a third variety of
perfect correlation; that is, a perfect
noncorrelation. After all, just as it is nearly
impossible in psychological work to identify two
variables that have a perfect correlation, so it is
nearly impossible to identify two variables that
have a zero correlation. Most of the time, two
variables will be fractionally correlated.
66. The most widely used of all is the Pearson r, also
known as the Pearson correlation coefficient and the
Pearson product-moment coefficient of correlation.
Devised by Karl Pearson, r can be the statistical tool
of choice when the relationship between the
variables is linear and when the two variables being
correlated are continuous.
Other correlational techniques can be employed with
data that are discontinuous and where the
relationship is nonlinear. The formula for the
Pearson r takes into account the relative position of
each test score or measurement with respect to the
mean of the distribution.
67. A number of formulas can be used to calculate a Pearson
r.
One formula requires that we
o convert each raw score to a standard score and then
o multiply each pair of standard scores.
o A mean for the sum of the products is calculated, and
that mean is the value of the Pearson r.
The range of the correlation coefficient is −1 to +1. If x and y
have a strong positive linear correlation, r is close to +1. If x
and y have a strong negative linear correlation, r is close to
−1. If there is no linear correlation or a weak linear
correlation, r is close to 0.
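The standard-score formula above can be sketched as follows, using the television-hours and test-score data from the earlier example slide. The function name is an assumption, and the standard scores are computed with the standard deviation that divides by N (`pstdev`), which is what makes the mean of the products equal to r.

```python
from statistics import fmean, pstdev

# Data from the example slide: hours of television and Monday test scores.
hours = [0, 1, 2, 3, 3, 5, 5, 5, 6, 7, 7, 10]
scores = [96, 85, 82, 74, 95, 68, 76, 84, 58, 65, 75, 50]

def pearson_r(xs, ys):
    """Pearson r as the mean of the products of paired standard scores."""
    mx, my = fmean(xs), fmean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    zx = [(x - mx) / sx for x in xs]  # step 1: convert raw scores to z scores
    zy = [(y - my) / sy for y in ys]
    products = [a * b for a, b in zip(zx, zy)]  # step 2: multiply each pair
    return fmean(products)  # step 3: the mean of the products is r

r = pearson_r(hours, scores)
print(f"r = {r:.3f}")
```

For these data r comes out strongly negative: more hours of television went with lower test scores.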
68. Pearson r            Qualitative Description
±1                   Perfect
±0.75 to <±1         Very High
±0.50 to <±0.75      Moderately High
±0.25 to <±0.50      Moderately Low
>±0.0 to <±0.25      Very Low
0                    No Correlation
70. Coefficient of Determination
The value obtained for the coefficient of
correlation can be further interpreted
by deriving from it what is called a
coefficient of determination, or r². It is
an indication of how much variance is
shared by the X- and the Y-variables.
71. Coefficient of Determination
Calculation of r²:
square the correlation coefficient and multiply by 100; the
result is equal to the percentage of the variance accounted
for.
For example, if you calculated r to be .9, then r² would be equal
to .81.
The number .81 tells us that 81% of the variance is
accounted for by the X- and Y-variables. The remaining
variance, equal to 100(1 − r²), or 19%, could presumably
be accounted for by chance, error, or otherwise
unmeasured or unexplainable factors.
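The arithmetic in this example is short enough to sketch directly. The function name is an assumption; it returns the squared coefficient, the percentage of variance accounted for, and the remaining percentage.

```python
def variance_accounted_for(r):
    """Return r squared, the percentage of shared variance, and the remainder."""
    r_squared = r ** 2
    shared_pct = 100 * r_squared
    remaining_pct = 100 * (1 - r_squared)
    return r_squared, shared_pct, remaining_pct

r_squared, shared_pct, remaining_pct = variance_accounted_for(0.9)
print(f"r squared = {r_squared:.2f}")
print(f"accounted for: {shared_pct:.0f}%, remaining: {remaining_pct:.0f}%")
```

Because r is squared, the sign of the correlation drops out: r = −.9 accounts for the same share of variance as r = +.9.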
72. One commonly used alternative statistic is
variously called a rank-order correlation
coefficient, a rank-difference correlation
coefficient, or simply Spearman’s rho.
Developed by Charles Spearman, a British
psychologist, this coefficient of correlation is
frequently used when the sample size is
small (fewer than 30 pairs of
measurements) and especially when both
sets of measurements are in ordinal (or
rank-order) form.
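The slides do not show a formula for Spearman's rho. The sketch below assumes the familiar rank-difference formula, rho = 1 − 6Σd² / (n(n² − 1)), which applies when there are no tied ranks. The two sets of ranks are hypothetical.

```python
def spearman_rho(ranks_x, ranks_y):
    """Rank-difference correlation for two sets of ranks with no ties (assumed formula)."""
    n = len(ranks_x)
    # d is the difference between the two ranks assigned to the same case.
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_x, ranks_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical ranks given to five testtakers by two raters.
rater_a = [1, 2, 3, 4, 5]
rater_b = [2, 1, 4, 3, 5]

rho = spearman_rho(rater_a, rater_b)
print(f"rho = {rho:.2f}")
```

With tied ranks this shortcut is no longer exact, and a different computation would be needed.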
73.
74. Scatterplot: also known as a bivariate distribution, a scatter diagram, or a scattergram
A scatterplot is a simple graphing of the coordinate
points for values of the X-variable (placed along the
graph’s horizontal axis) and the Y-variable (placed
along the graph’s vertical axis).
Scatterplots are useful because they provide a quick
indication of the direction and magnitude of the
relationship, if any, between the two variables.
To distinguish positive from negative correlations, note the
direction of the curve.
To estimate the strength or magnitude of the correlation, note
the degree to which the points form a straight line.
75. [Figure: four scatterplots of y against x. Panels: Negative Linear Correlation (as x increases, y tends to decrease); No Correlation; Positive Linear Correlation (as x increases, y tends to increase); Nonlinear Correlation.]
76. [Figure: four scatterplots of y against x. Panel titles: Very high negative correlation; Moderately low positive correlation; Very high positive correlation; Very low positive correlation. Values of r as extracted: 0.91, 0.88, 0.42, 0.07.]
Editor's Notes
Back to the Basics
Discrete vs Continuous
Discrete variable - a variable that can assume a finite or, at most, countably infinite number of values; usually measured by counting or enumeration.
Students in a class
Continuous variable - a variable which can assume infinitely many values corresponding to a line interval
Time spent playing ML
303.00 identified alcohol intoxication, and the number 307.00 identified stuttering. But these numbers were used exclusively for classification purposes and could not be meaningfully added, subtracted, ranked, or averaged. Hence, the middle number between these two diagnostic codes, 305.00, did not identify an intoxicated stutterer.
Mutually exclusive – two categories cannot occur at the same time
Exhaustive – one of the categories must occur
(RPM Classification: Intellectually superior, Definitely above average, Intellectually average, Definitely below average, Intellectually impaired.) Temperature, time, date
Length, area, population
In this chapter we discuss the various ways in which test data can be described or
converted to make those data more manageable and understandable. Some of the techniques
we’ll describe, such as the computation of an average, can be used if data are assumed to
be interval- or ratio-level in nature but not if they are ordinal- or nominal-level. Other
techniques, such as those involving the creation of graphs or tables, may be used with
ordinal- or even nominal-level data.
(the abscissa or X-axis); (the ordinate or Y-axis)
One way to describe a distribution of test scores is by a measure of central tendency.
Many methods exist for measuring kurtosis.
an indication of how scores in a distribution are scattered or dispersed
Highest minus lowest scores
Confidence Intervals
T-Tests, ANOVA
Two students, taking an exam on different sections
Sten: Mean of 5.5 and SD of 2
DETERMINING THE RAW SCORE (OR SCORES) ASSOCIATED WITH A GIVEN PERCENTAGE AREA UNDER THE NORMAL CURVE
OBTAINING THE PROPORTION (OR PERCENTAGE) OF THE AREA UNDER THE NORMAL CURVE BETWEEN TWO SCORE VALUES
OBTAINING THE PERCENTILE RANKS OF SCORES USING THE NORMAL CURVE
Karl Pearson’s name has become synonymous with
correlation. History records, however, that it was
actually Sir Francis Galton who should be credited
with developing the concept of correlation (Magnello &
Spies, 1984). Galton experimented with many formulas
to measure correlation, including one he labeled r.
Pearson, a contemporary of Galton’s, modified Galton’s
r, and the rest, as they say, is history. The Pearson r
eventually became the most widely used measure of
correlation.
Charles Spearman is best known as the
developer of the Spearman rho statistic and
the Spearman-Brown prophecy formula,
which is used to “prophesize” the accuracy
of tests of different sizes. Spearman is also
credited with being the father of a statistical
method called factor analysis, discussed
later in this text.