The Standard Normal Distribution and z Scores
Keren Su/Corbis
Chapter Learning Objectives
After reading this chapter, you should be able to do the following:
1. Identify the characteristics of the standard normal distribution.
2. Demonstrate the use of the z transformation.
3. Determine the percent of a population above a point, below a point, and between two points on the horizontal axis of a normal distribution.
4. Calculate z scores using Excel.
5. Describe alternative standard scores.
6. Demonstrate the use of the modified standard score.
Introduction
The data that describe characteristics of groups come from either samples or populations, as explained in the first two chapters. By way of reminder, recall that populations include all possible members of any specified group. All university students, all psychology majors, all residents of Orange County, and all left-handed male tennis players in their 20s are each descriptions of a population. We rely on Greek letters, such as µ for the mean and σ for the standard deviation, to distinguish population parameters from the statistics that describe samples. (The word parameter indicates a characteristic of a population.) Remove one or more individuals from any population, and the resulting group is a sample.
As we were describing populations, we noted that some are “normally distributed.” These characteristics indicate normality: (a) data distributions are symmetrical, (b) all the measures of central tendency have very similar values, and (c) the value of the standard deviation is about one-sixth of the range.

Data normality does not simply mean that the frequency distribution will appear as a bell-shaped curve; it means that predictable proportions of the entire population will occur in specified regions of the distribution, and this holds for all normal data distributions. For example, the region under a normal curve from the mean of the population to one standard deviation below the mean always includes 34.13% of the area under the curve. Because normal distributions are symmetrical, the region from the mean to one standard deviation above the mean also includes 34.13%, so the region from −1σ to +1σ includes about 68.26% of the area under the curve in any normally distributed population. As long as the data are normally distributed, those percentages hold true. Since many mental characteristics are normally distributed, researchers can know a good deal about such a characteristic without actually gathering the data and doing the analysis. Whether the characteristic is intelligence, achievement motivation, anxiety, or any other normally distributed characteristic, the proportion of the distribution within one standard deviation of the mean will be the same:
· If a particular intelligence scale has µ = 100 and σ = 15, about 68% of any general population will have intelligence scores between 85 and 115.
· Likewise, if an achievement motivation scale has µ = 40 and σ = 8, about 2/3 of any population will have achievement motivation scores from 32 to 48.
· And for an anxiety measure with µ = 25 and σ = 5, about 68% of any general population will have scores between 20 and 30.
The consistency in the way so many characteristics are distributed affords a good deal of interpretive power. Anyone who needs information about the likelihood of individuals scoring in certain areas of a distribution has an advantage when data are normally distributed. In addition to the 68% of any general population likely to score between −1σ and +1σ,
· the region from µ to +2σ includes about 47.72% of the population, so about 95% (2 × 47.72 = 95.44) of the people in any general population will have intelligence scores between 70 (100 − 30) and 130 (100 + 30).
· the region from −3σ to +3σ (49.87% on each side of the mean) includes nearly everyone in any normally distributed population (2 × 49.87 = 99.74%).
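The proportions quoted above can be checked numerically. What follows is a minimal sketch, assuming Python's standard-library statistics.NormalDist class (the chapter itself works from a printed table and Excel); the helper name percent_within is illustrative only.

```python
from statistics import NormalDist

# The standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def percent_within(k):
    """Percent of a normal population within k standard deviations of the mean."""
    return 100 * (std_normal.cdf(k) - std_normal.cdf(-k))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s) of the mean: {percent_within(k):.2f}%")

# The same proportions apply to any normal distribution, e.g. an
# intelligence scale with mu = 100 and sigma = 15 (values from the text).
iq = NormalDist(mu=100, sigma=15)
percent_70_to_130 = 100 * (iq.cdf(130) - iq.cdf(70))
```

Because the chapter doubles rounded table values (for example, 2 × 34.13 = 68.26), its figures can differ from the computed ones in the last decimal place; the computed values are about 68.27%, 95.45%, and 99.73%.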
These observations emphasize that, sometimes, isolated bits of data can be quite informative. When a 12-year-old with an intelligence score of 170 pops up on YouTube, it is immediately apparent that this is a very unusual child. An intelligence score of that magnitude is about 4.667σ (170 − 100 = 70; 70 ÷ 15 = 4.667) beyond the mean of the general population. If the region from −3σ to +3σ includes more than 99% of the population, the region from −4.667σ to +4.667σ must include all but the most extreme scores. We obtain an even better context for how common (or uncommon) particular measures may be when we can determine the precise probability of their occurrence.
Test Ch 3: The Standard Normal Distribution and z Scores
1. The z transformation changes any raw score into a z score so
that it fits the standard normal distribution.
· a. TRUE
· b. FALSE
2. Calculating z scores doesn’t alter the distribution; it just
makes them fit a distribution where the mean is 0 and the
standard deviation is 1.0.
· a. FALSE
· b. TRUE
3. We can apply the z transformation to sample data even when
there is reason to believe that the population from which the
sample was drawn is not normally distributed.
· a. TRUE
· b. FALSE

4. Raw scores can be determined if the mean and standard
deviation are available.
· a. FALSE
· b. TRUE
5. Individual scores in the standard normal distribution are
called z scores.
· a. TRUE
· b. FALSE
3.1 A Primer in Probability
Scholars, data analysts, and in fact people on the whole are rarely interested in outcomes that occur every time. If everyone had an intelligence score of 170, no one would pay any attention to someone with such a score. The fact that we know it to be uncommon is what piques our curiosity.
If we are not interested in events that always occur, neither do we closely follow events that never occur. If no one had ever had an intelligence score of 170, probably no one would wonder about what such a score means for the person who has it. The things that occur some of the time, however, intrigue us. The “some of the time” indicates that the event has some probability, or likelihood, of occurrence.
· What is the probability that those newlyweds will divorce?
· How likely is Germany to win the World Cup?
· What is the probability that an earthquake will occur on a particular day for someone who lives near the San Andreas Fault?
· What is the probability of an IRS audit for one taxpayer?
Joseph Sohm/Visions of America/Corbis
The probability that something will occur, such as how likely it is that our favorite baseball team will win the World Series, intrigues us and is an important component in the decision-making process.

Because all of the items listed have happened in the past and because their occurrence is important to at least someone, people are interested in the probability of those occurrences whether or not they use the language of probability. When stated numerically, probability values range from 0 to 1.0. Something with a probability of zero (p = 0) never occurs. On the other hand, p = 1.0 indicates that the event occurs every time, and p = 0.5 indicates that the event occurs 50% of the time.
As that last point indicates, percentages can be converted to probability values. Dividing the percentage of times an event occurs by 100 indicates the associated probability of the event. Returning to the intelligence scores, we see that because about 68% of the population has intelligence scores between 85 and 115, the probability (p) that someone selected at random from the general population will have a score somewhere between 85 and 115 is 0.68 (68.26/100, rounded to two decimal places).
What is the probability that someone selected at random from the general population will have an intelligence score of 100 or lower? Because 100 is the mean for intelligence scores, and because 50% of the population occurs at the mean or below, p = 0.5.
What is the probability that someone selected at random will have an intelligence score higher than 115? First, we noted earlier that 34.13% of the population falls between the mean, µ, and one standard deviation above the mean (+1σ) in any normally distributed population. In terms of intelligence score values, that is the region between scores of 100 and 115. Since 50% of any normally distributed population will occur at the mean and above, if we subtract from 50% that portion between the mean and one standard deviation above the mean, the remainder will be the portion of the distribution above 115: 50% − 34.13% = 15.87%; that is, 15.87% of all intelligence scores in a normally distributed population will occur above 115. Dividing by 100 (15.87/100 = 0.1587) and rounding the result to two decimal places produces the probability p = 0.16.
By the same logic, because a score of 85 is one standard deviation below the mean, the probability is p = 0.16 that someone selected at random from the population will score below 85. If we combine the two outcomes, the probability is p = 0.32 that someone from the population will score either below 85 or above 115.
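The tail probabilities worked out above can be reproduced in a few lines. This is a sketch only, assuming Python's standard-library statistics.NormalDist and the intelligence-scale values from the text (µ = 100, σ = 15); the variable names are illustrative.

```python
from statistics import NormalDist

# Intelligence scale described in the text: mu = 100, sigma = 15.
iq = NormalDist(mu=100, sigma=15)

# cdf(x) gives the proportion of the population at or below x.
p_at_or_below_mean = iq.cdf(100)   # half the population
p_above_115 = 1 - iq.cdf(115)      # upper tail beyond +1 sigma
p_below_85 = iq.cdf(85)            # lower tail beyond -1 sigma
p_either_tail = p_above_115 + p_below_85

print(round(p_above_115, 2), round(p_below_85, 2), round(p_either_tail, 2))
```

Rounded to two decimal places, the three probabilities match the values in the text: 0.16, 0.16, and 0.32.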
Consider the number line shown in Figure 3.1.
Figure 3.1: Standard deviations for intelligence scores
The number line shows the portion of scores that fall within two standard deviations above and below the mean. If M = 100, we can know the probability of someone scoring below 85 or above 115.
If this number line represents all intelligence scores ranging from two standard deviations below to two standard deviations above the mean, we can see the percentages of the population that will have scores in the designated areas. Using the percentages and dividing by 100 indicates the probability of a score in any of the designated areas.
Recall that the lowest probability for any value is zero (p = 0). If p = 0, then the event or outcome never occurs. There is no such thing as a negative probability.
3.2 The Standard Normal Distribution
Not all populations are normally distributed. Home sales are usually reported in terms of the median price of a home, and salary data are likewise reported as median values. Those cases use the medians because the related populations are very unlikely to be normally distributed and, as a measure of central tendency, medians are less affected by extreme values than are means. A few very high salaries or home values create positive skew in the resulting distribution. In contrast, when it comes to, say, mental characteristics such as intelligence, achievement motivation, problem-solving ability, verbal aptitude, reading comprehension, and so on, population data are often normally distributed.
Hello Lovely/Corbis

When evaluating information about people’s characteristics, keep in mind that data are often normally distributed.
Although there are many normal distributions all having the same proportions, each has different descriptive values. An intelligence test might have µ = 100 and σ = 15 points. A nationally administered reading test might have a mean of 60 and a standard deviation of 8. These different parameters can make it difficult to compare one individual’s performance across multiple measures. As one author noted regarding scores from the Wechsler Intelligence Scale for Children (WISC), “A raw score of 5 on one [sub]test will not have the same meaning as a raw score 5 on another [sub]test” (Brock, 2010).
One way to resolve this interpretation problem is to convert the scores from different distributions into a common metric, or measurement system. If researchers alter scores from different distributions so that they both fit the same distribution, they can compare the scores directly to determine, for example, on which test an individual scored highest. Such comparisons are one of the purposes of the standard normal distribution.
The standard normal distribution looks like all other normal distributions: from the mean to +1 standard deviation includes 34.13% of the distribution, for example. What separates it from the others is that in the standard normal distribution, the mean is always 0, and the standard deviation is always 1.0 (Figure 3.2). Other distributions may have fixed values for their means and standard deviations, but here µ is always 0 and σ is always 1.0.
Figure 3.2: The standard normal distribution
In the standard normal distribution, the mean is always 0, and the standard deviation is always 1.0.
The Standard Normal, or z, Distribution
Although various normal distributions have different means and standard deviations, they all mirror each other in terms of how much of their populations occur in particular regions. The standard normal distribution’s advantage is that the proportions of the whole that occur in the various regions of the distribution have been calculated. That means that if data from any normal distribution are made to conform to the standard normal distribution, we can answer questions about what is likely to occur in virtually any area of the distribution, such as how likely it is to score 2.5 standard deviations below the mean on a particular test, or what percentage of the entire population will likely occur between two specified points. All such questions can be answered by adapting normal data to the characteristics of the standard normal distribution.
Individual scores in the standard normal distribution are called z scores, which is why the standard normal distribution is often called “the z distribution.” The formula used to turn scores from any normal distribution into scores that conform to the standard normal distribution is the z transformation:
Formula 3.1
z = (x − M) / s
where z is a score in the standard normal distribution, x is the score from the original distribution (often called a “raw” score), M is the mean of the scores from the original distribution, and s is the standard deviation of the scores from the original distribution.
Because normality is characteristic of only very large groups, samples will rarely be normal. However, we can apply the z transformation to sample data when there is reason to believe that the population from which the sample was drawn is normally distributed. This is what Formula 3.1 reflects. The M and s indicate that the data involved are sample data. In those situations where an analyst has access to population data (a social worker has all the data for those served by Head Start in a particular county, for example), µ replaces M and σ replaces s in the formula. With either sample or population data, the transformation is from data that can have any mean and standard deviation to a distribution where the mean will always equal 0 and the standard deviation will always equal 1.0.
To turn raw scores into z scores, perform the following steps:
1. Determine the mean and standard deviation for the data set.
2. Subtract the mean of the data set from each score to be transformed.
3. Divide the difference by the standard deviation of the data set.
For example, consider a psychologist interested in the level of apathy among potential voters regarding mental health issues that affect the community. Scores on the Summary of WHo’s Apathetic Test (the SoWHAT for short), an apathy measure, are gathered for 10 registered voters:
5, 6, 9, 11, 15, 15, 17, 20, 22, 25
What’s the z score for someone who has an apathy score of 11?
· Verify that for these 10 scores, M = 14.5 and s = 6.737.
· The z score equivalent for an apathy score of 11 is
z = (x − M) / s = (11 − 14.5) / 6.737 = −0.5195
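The three steps of the z transformation can be applied to the SoWHAT scores in code. The sketch below assumes Python's standard-library statistics module, whose stdev function uses the sample (n − 1) formula, consistent with the s = 6.737 reported above; the function name z_score is illustrative.

```python
from statistics import mean, stdev

# The ten SoWHAT apathy scores from the text.
scores = [5, 6, 9, 11, 15, 15, 17, 20, 22, 25]

# Step 1: mean and (sample) standard deviation of the data set.
M = mean(scores)
s = stdev(scores)  # divides by n - 1, i.e., treats the data as a sample

def z_score(x, m, sd):
    """Steps 2 and 3: subtract the mean, then divide by the standard deviation."""
    return (x - m) / sd

z_11 = z_score(11, M, s)
print(M, round(s, 3), round(z_11, 4))
```

The printed values agree with the worked example: M = 14.5, s = 6.737, and z = −0.5195 for a raw score of 11.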
An apathy score of 11 translates into a z score of −0.5195. Because the mean of the z distribution is 0 and the standard deviation in the z distribution is 1.0, where would a score of −0.5195 occur on the horizontal axis of the data distribution? It would be a little over half a standard deviation below the mean, right? Figure 3.3 shows the z distribution and the point about where a raw score of 11 occurs in this distribution once it is transformed into a z score.
It is important to know that the z transformation does not make data normal. Calculating z scores does not alter the shape of the distribution; it just rescales the scores to fit a distribution where the mean is 0 and the standard deviation is 1.0. An evaluation of skew and kurtosis must support the assumption that the data are normal before the analyst uses the z transformation.
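A short experiment makes the point that z scores rescale data without changing their shape. This is a sketch under stated assumptions: the skewed data set is hypothetical, and the skew index used here (the average cubed z score) is only one of several definitions in use.

```python
from statistics import mean, stdev

def z_transform(data):
    """Convert every score to a z score using the sample mean and standard deviation."""
    m, sd = mean(data), stdev(data)
    return [(x - m) / sd for x in data]

def skewness(data):
    """Simple skew index: the average cubed z score (positive means a long right tail)."""
    return mean([z ** 3 for z in z_transform(data)])

# Hypothetical, positively skewed scores (one extreme high value).
skewed = [1, 2, 2, 3, 3, 3, 4, 20]
z = z_transform(skewed)

print(round(mean(z), 6), round(stdev(z), 6))               # close to 0 and 1
print(round(skewness(skewed), 4), round(skewness(z), 4))   # same skew before and after
```

The z scores have a mean of 0 and a standard deviation of 1, yet the skew index is the same before and after the transformation, so skewed data remain skewed.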
With a mean of 0 in the standard normal distribution, half of all z scores (all the scores below the mean) are going to be negative. A raw score of 11 from the SoWHAT data is lower than the mean, which was M = 14.5, so it has a negative z value (−0.5195).
Try It!: #1
How many standard deviations from the mean of the distribution is a z score of 1.5?
Besides indicating by its sign whether the z score is above or below the mean, the value of the z score indicates how far from the mean the z score is in standard deviations. If a score had a z value of 1.0, it would indicate that the score is one standard deviation above the mean. The z score for the raw score of 11 was −0.5195, indicating that it is just over half a standard deviation below the mean. This ease of interpretation is one of the great values of z scores: the sign of the score indicates whether the associated raw score was above or below the mean, and the value of the score indicates how far from the mean the raw score falls, in standard deviation units (Fischer and Milfont, 2010).
Figure 3.3: Location of a score on the z distribution
Half of all z scores will fall below the mean, resulting in a negative value. A score of z = −0.5195 is slightly more than one-half a standard deviation below the mean.
Comparing Scores from Different Instruments
Consider another application of the standard normal distribution. A counselor has intelligence and reading scores for the same person and wishes to know on which measure the individual scored higher. Table 3.1 shows the data for the two tests. On the intelligence test, the individual scored 105, and on the reading test, the individual scored 62.
Table 3.1: Reading and intelligence test results
Test            Mean    Standard deviation
Intelligence    100     15
Reading         60      8
If the counselor transforms both scores to make them fit the standard normal distribution, they can be compared directly.
The z for the intelligence score is
z = (x − M) / s = (105 − 100) / 15 = 0.333
The z for the reading test score is
z = (x − M) / s = (62 − 60) / 8 = 0.250
The intelligence score of 105 and the reading score of 62 are difficult to compare because they belong to different distributions with different means and standard deviations. When both are transformed to fit the standard normal distribution, an analyst can directly compare scores. The larger z value for intelligence makes it clear that the individual scored higher in intelligence than in reading.
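The counselor's comparison can be expressed as a short computation. This sketch reuses the means and standard deviations from Table 3.1; the function and variable names are illustrative.

```python
def z_score(x, m, sd):
    """z transformation: distance from the mean in standard deviation units."""
    return (x - m) / sd

# Raw scores and test parameters from Table 3.1.
z_intelligence = z_score(105, 100, 15)
z_reading = z_score(62, 60, 8)

higher = "intelligence" if z_intelligence > z_reading else "reading"
print(round(z_intelligence, 3), round(z_reading, 3), higher)
```

Once both scores are on the z scale, a plain numeric comparison identifies the measure on which the individual scored higher relative to the respective mean.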
Expanding the Use of the z Distribution
Because the standard normal distribution is a normal distribution, we know that predictable proportions of its population will occur in specific areas. As we noted earlier, however, those proportions are known in great detail for the z distribution because this population is so often used to answer detailed questions about the likelihood of particular outcomes. Table 3.2 indicates how much of the entire population is above or below all of the most commonly occurring values of z. So, by transforming scores from other distributions to fit the z distribution, we can use what we know about this population to answer questions about scores from any normal distribution.
Not all tables for z values are alike. Probably as a matter of the developer’s preference, some tables indicate the percentage of the population below a point. Some indicate the percentage between a point and the mean of the distribution. Some indicate the probability of scoring in a particular area, and so on. This particular table indicates the proportion of the population between the specified value of z and the mean of the distribution. (Table 3.2 is listed as Table B.1 in Appendix B.)
Table 3.2: The z table
z      0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0    0.0000  0.0040  0.0080  0.0120  0.0160  0.0199  0.0239  0.0279  0.0319  0.0359
0.1    0.0398  0.0438  0.0478  0.0517  0.0557  0.0596  0.0636  0.0675  0.0714  0.0753
0.2    0.0793  0.0832  0.0871  0.0910  0.0948  0.0987  0.1026  0.1064  0.1103  0.1141
0.3    0.1179  0.1217  0.1255  0.1293  0.1331  0.1368  0.1406  0.1443  0.1480  0.1517
0.4    0.1554  0.1591  0.1628  0.1664  0.1700  0.1736  0.1772  0.1808  0.1844  0.1879
0.5    0.1915  0.1950  0.1985  0.2019  0.2054  0.2088  0.2123  0.2157  0.2190  0.2224
0.6    0.2257  0.2291  0.2324  0.2357  0.2389  0.2422  0.2454  0.2486  0.2517  0.2549
0.7    0.2580  0.2611  0.2642  0.2673  0.2704  0.2734  0.2764  0.2794  0.2823  0.2852
0.8    0.2881  0.2910  0.2939  0.2967  0.2995  0.3023  0.3051  0.3078  0.3106  0.3133
0.9    0.3159  0.3186  0.3212  0.3238  0.3264  0.3289  0.3315  0.3340  0.3365  0.3389
1.0    0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577  0.3599  0.3621
1.1    0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790  0.3810  0.3830
1.2    0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980  0.3997  0.4015
1.3    0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147  0.4162  0.4177
1.4    0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292  0.4306  0.4319
1.5    0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418  0.4429  0.4441
1.6    0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525  0.4535  0.4545
1.7    0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616  0.4625  0.4633
1.8    0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693  0.4699  0.4706
1.9    0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756  0.4761  0.4767
2.0    0.4772  0.4778  0.4783  0.4788  0.4793  0.4798  0.4803  0.4808  0.4812  0.4817
2.1    0.4821  0.4826  0.4830  0.4834  0.4838  0.4842  0.4846  0.4850  0.4854  0.4857
2.2    0.4861  0.4864  0.4868  0.4871  0.4875  0.4878  0.4881  0.4884  0.4887  0.4890
2.3    0.4893  0.4896  0.4898  0.4901  0.4904  0.4906  0.4909  0.4911  0.4913  0.4916
2.4    0.4918  0.4920  0.4922  0.4925  0.4927  0.4929  0.4931  0.4932  0.4934  0.4936
2.5    0.4938  0.4940  0.4941  0.4943  0.4945  0.4946  0.4948  0.4949  0.4951  0.4952
2.6    0.4953  0.4955  0.4956  0.4957  0.4959  0.4960  0.4961  0.4962  0.4963  0.4964
2.7    0.4965  0.4966  0.4967  0.4968  0.4969  0.4970  0.4971  0.4972  0.4973  0.4974
2.8    0.4974  0.4975  0.4976  0.4977  0.4977  0.4978  0.4979  0.4979  0.4980  0.4981
2.9    0.4981  0.4982  0.4982  0.4983  0.4984  0.4984  0.4985  0.4985  0.4986  0.4986
3.0    0.4987  0.4987  0.4987  0.4988  0.4988  0.4989  0.4989  0.4989  0.4990  0.4990
Source: StatSoft. (2011). Electronic Statistics Textbook. Tulsa, OK: StatSoft. Retrieved from http://www.statsoft.com/textbook/distribution-tables/#z

The z value for a SoWHAT score of 11 was rounded to four decimal places for the sake of illustration. The table lists z values to just two decimals, so from this point forward, round z values to two decimals when using the table. Rounding makes the z value for a raw score of 11 equal to −0.52.
To interpret the z score, read the whole numbers and the tenths (the tenths are the first value to the right of the decimal) vertically down the left margin of the table. For the hundredths (the second value to the right of the decimal), move from left to right across the columns at the top of the table.
1. Read down the left margin to the line indicating 0.5.
2. Read across the top to the column indicating 0.02.
3. The table value where row and column intersect is 0.1985. This value is the proportion (out of a total of 1.0) of any normally distributed population that will occur between z = 0.52 and the population’s mean.
4. To determine the percentage of the distribution between z = −0.52 and the mean, multiply the table value by 100: 100 × 0.1985 = 19.85% of the distribution is between −0.52 and the population mean.
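The table lookup in the steps above can also be imitated in code. This is a minimal sketch, assuming Python's standard-library statistics.NormalDist; the helper name table_value is hypothetical, and the printed table remains the chapter's reference.

```python
from statistics import NormalDist

def table_value(z):
    """Proportion of a normal population between z and the mean.

    The cumulative proportion below |z| minus the half of the distribution
    that lies below the mean, rounded to four decimals like the z table.
    """
    return round(NormalDist().cdf(abs(z)) - 0.5, 4)

proportion = table_value(-0.52)   # the sign is ignored, as with the printed table
percentage = 100 * proportion
print(proportion, round(percentage, 2))
```

Taking the absolute value of z mirrors the way the printed table is used: only positive z values are listed, and the sign indicates only which side of the mean the region falls on.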
Note that all the z values in Table 3.2 are positive. Our z score from the SoWHAT score was actually negative (z = −0.52). Since the mean of the standard normal distribution is z = 0, the z value for any score below the mean will be negative. However, the negative values pose no problem because all normal distributions are symmetrical, so the proportion of a normal population between z = −0.52 and the mean will be the same as that between z = 0.52 and the mean. We simply look up the proportion for the appropriate value of z, remembering that when z is negative, it is a proportion to the left of the mean rather than to the right.
Try It!: #2
Table 3.2 has table values only for positive z scores. How do we interpret the value when z turns out to be negative?
To state all this as a principle, because normal distributions are symmetrical, z scores with the same absolute value (the same numbers without regard to the sign) include the same proportions between their values and the mean of the distribution. For this reason, the z table indicates only the proportions for half the distribution. In the case of Table 3.2, that half is the positive (right) half of the distribution.
Because 50% of the distribution occurs on either side of the mean, if 19.85% of the distribution is from z = −0.52 back to the mean, the balance of the left (negative) half of the distribution must occur below a z score of −0.52. That percentage is 50 − 19.85 = 30.15%, as the number line illustrates:
Working in the other direction: if the question is what percentage of the population will score 11 or lower on the SoWHAT, the answer is 50 − 19.85 = 30.15%.
If instead someone asks what the probability of scoring at or below 11 (30.15%) is, we must turn the percentage back into a probability: 30.15 / 100 = 0.3015, or p = 0.3015 of scoring at or below 11.
Try It!: #3
What is the largest possible value for z?
Note that the language above is “11 or lower” and “at or below.” The characteristics of the normal curve allow us to determine the percentage between points, but not at a discrete point. Technically, a particular point has no width and so no associated percentage.
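The "at or below" probability can be obtained directly from the cumulative proportion, and the same sketch shows why a single point carries no percentage. It assumes Python's standard-library statistics.NormalDist; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Proportion at or below z = -0.52 (a SoWHAT score of 11, after rounding z).
p_at_or_below = std_normal.cdf(-0.52)

# Same answer by the table route: 0.5 minus the proportion between z and the mean.
between_z_and_mean = std_normal.cdf(0.52) - 0.5
p_table_route = 0.5 - between_z_and_mean

# An interval with no width has no area: the proportion "between" z and itself.
p_exactly_at = std_normal.cdf(-0.52) - std_normal.cdf(-0.52)

print(round(p_at_or_below, 4), round(p_table_route, 4), p_exactly_at)
```

Both routes give p = 0.3015 when rounded to four decimals, and the zero-width interval has a proportion of exactly 0.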
Converting z Scores to Percentage
Now that we have learned how to transform scores from other distributions to fit the z distribution, we will take a further look at how we can convert scores on opposite sides of the mean and scores with the same sign to percentages.
Two Scores on Opposite Sides of the Mean
If 5 and 25 are the most extreme apathy scores gathered in the sample of SoWHAT scores, we might ask what percentage of the entire distribution will score between 5 and 25. Because those were the lowest and highest scores, the answer should be 100%, correct? Remember that the collected data were a sample:
5, 6, 9, 11, 15, 15, 17, 20, 22, 25
Although everyone in the sample scored between 5 and 25, it is entirely possible, even probable, that someone in the larger population will have a more extreme score. Using the z distribution, we can determine how probable by following these steps:
1. Convert both 5 and 25 into z scores.
2. Determine the table values for both z scores.
3. Turn the table values into percentages.
4. Add the percentages together.
The z score formula is
z = (x − M) / s
Allowing that the subscript to each z indicates the raw score and that M = 14.5 and s = 6.737 from the sample data produces the following calculations:
z_5 = (5 − 14.5) / 6.737 = −1.410, or −1.41,
for which the table value is 0.4207,
which corresponds to a percentage of 42.07% (0.4207 × 100).
z_25 = (25 − 14.5) / 6.737 = 1.559, or 1.56,
which has a table value of 0.4406.
Expressed as a percentage, the value is 44.06% (0.4406 × 100).
Adding the two percentages together to determine the total percentage between them produces the following:
42.07 + 44.06 = 86.13% from 5 to 25.
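The four steps for scores on opposite sides of the mean can be sketched as follows. The code assumes Python's standard-library statistics.NormalDist in place of the printed table, and a hypothetical helper, table_value, that rounds to four decimals the way Table 3.2 does; M and s are the sample values from the text.

```python
from statistics import NormalDist

M, s = 14.5, 6.737  # sample mean and standard deviation from the text

def table_value(z):
    """Proportion between z and the mean, rounded to four decimals like the z table."""
    return round(NormalDist().cdf(abs(z)) - 0.5, 4)

# Step 1: convert both raw scores to z scores, rounded to two decimals.
z_5 = round((5 - M) / s, 2)
z_25 = round((25 - M) / s, 2)

# Steps 2 and 3: table values, expressed as percentages.
pct_below_mean = 100 * table_value(z_5)
pct_above_mean = 100 * table_value(z_25)

# Step 4: the scores straddle the mean, so the two regions are added.
pct_between = pct_below_mean + pct_above_mean
pct_outside = 100 - pct_between

print(z_5, z_25, round(pct_between, 2), round(pct_outside, 2))
```

Rounding each table value before adding reproduces the chapter's arithmetic (42.07 + 44.06 = 86.13); working with unrounded proportions can shift the last decimal place slightly.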
Clearly, this total does not equal 100%. The results indicate that in the population for which these data are a sample, about 13.87% (100 − 86.13) will score either lower than 5 or higher than 25. Figure 3.4 indicates this result.
Figure 3.4: Areas under the normal curve below z = −1.41 and beyond z = 1.56
In this distribution, z values that fall below −1.41 or above +1.56 (raw scores below 5 or above 25) are considered extreme scores, comprising only about 13.87% of the population.
The answer to this problem underscores two important concepts. First, remember that we are dealing with sample data, and the sample will never exactly duplicate a population. The second, more subtle point is that there is no point at which we can be confident that no one will produce a more extreme score. The curve represents this fact by extending the tails (the endpoints of the curve) outward in either direction along the horizontal axis. Although the gap between tail and axis narrows constantly, the tails never touch the axis (the 50-cent word is that the tails are “asymptotic” to the horizontal axis). The implication is that no finite range of z values will ever account for 100% of the distribution.
z Scores with the Same Sign
The previous example raised the question about the percentage of the distribution between z scores on opposite sides of the mean: two z scores where one was positive (z = 1.56) and the other negative (z = −1.41). Perhaps the researcher has a question about the percentage of the distribution between SoWHAT scores of 15 and 20. When M = 14.5, both of these raw scores are higher than the mean and both will result in positive z values. When two z scores have the same sign, determining the percentage of the distribution between them requires that we complete the following steps:
1. Calculate z scores for the raw scores.
2. Determine the table values for each z.
3. Subtract the smaller proportion from the larger.
4. Convert the result into a percentage by multiplying by 100.
z = (x − M) / s
z_15 = (15 − 14.5) / 6.737 = 0.0742,
or 0.07, for which the table value is 0.0279.
The 0.0279 is the proportion of the distribution between z = 0.07 and the mean of the distribution. For a raw score of 20,
z_20 = (20 − 14.5) / 6.737 = 0.8164,
or 0.82, which corresponds to a table value of 0.2939.
This is the proportion of the distribution between z = 0.82 and the mean of the distribution.
When the z scores are on opposite sides of the mean, as they were in our first example, determining the proportion of the distribution between them was a simple matter of adding the two table values. When both z scores are on the same side of the distribution, however, their table values overlap. To determine the proportion between two values of z with the same sign, take the proportion between the larger (absolute) value and the mean minus the proportion from the smaller (absolute) value to the mean: 0.2939 − 0.0279 = 0.2660. Multiplying that by 100 produces the percentage: 100 × 0.2660 = 26.6% of the distribution will score between 15 and 20.
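When cumulative proportions are available instead of a between-z-and-mean table, one subtraction handles both the same-sign and the opposite-sign cases. The sketch below assumes Python's standard-library statistics.NormalDist; percent_between is an illustrative name, and M and s are the sample values from the text.

```python
from statistics import NormalDist

M, s = 14.5, 6.737  # sample mean and standard deviation from the text
std_normal = NormalDist()

def percent_between(x_low, x_high, m, sd):
    """Percent of a normal population between two raw scores.

    Each raw score is converted to a z score rounded to two decimals (as when
    using the table); the cumulative proportion below the lower z is then
    subtracted from the cumulative proportion below the higher z.
    """
    z_low = round((x_low - m) / sd, 2)
    z_high = round((x_high - m) / sd, 2)
    return 100 * (std_normal.cdf(z_high) - std_normal.cdf(z_low))

pct_15_to_20 = percent_between(15, 20, M, s)
print(round(pct_15_to_20, 2))
```

The result agrees with the table-based answer of about 26.6%. Because the cumulative proportion already includes everything below a point, the overlap that the table method removes by subtraction never arises.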
Figure 3.5 illustrates this result.
Figure 3.5: Areas under the curve between z = 0.07 and z = 0.82
The percentage of scores between two z values with the same sign is determined by calculating the difference between the smaller z score table value and the larger one, then multiplying the result by 100.
When trying to answer a question about the percentage of the distribution in a particular area, drawing a simple diagram like Figure 3.5 helps make the question less abstract.
Apply It! Attention to Detail
gerenme/iStock/Thinkstock
A psychological services company administers a test that measures the respondent’s attention to detail. The company’s clients are employers in a variety of organizations that require people with good analytical skills. Respondents who score in the lowest ranges of the scale are indifferent to potentially important details. Those who score in the highest ranges tend to fixate on details that may be unimportant to an outcome. Individuals who meet the qualification on this particular test score in the range from 3.80 to 4.30. Data for those who have taken the test in the past indicate that M = 4.00 and s = 0.120. For researchers, the initial question is, “Of those who take the test, what proportion are rejected because either they are inattentive to important details or they become focused on the wrong details?” In terms of the z distribution, the equivalent questions are the following:
a. What proportion of those who took the test in the past failed to meet the minimum qualification for attention to relevant detail? In other words, what proportion scored lower than 3.80?
b. What proportion of test-takers scored higher than 4.30?
Regarding question (a), to determine the value of z, the following apply:
x = 3.80
M = 4.00
s = 0.120
Since
z = (x − M)/s = (3.80 − 4.00)/0.120
z_3.80 = −1.67
The z score table (Table 3.2) indicates that a proportion of 0.4525 of the entire population will fall between this z score and the mean of the distribution. However, the researchers’ interest is in the proportion below this point. Therefore,
0.5 − 0.4525 = 0.0475
In other words, a proportion of 0.0475 occurs below x = 3.80. Stated as a percentage, 4.75% of the candidates will score below 3.80 on the test.
For the proportion above 4.30,
z_4.30 = (4.30 − 4.00)/0.120 = 2.50
Table 3.2 indicates that
· this z score corresponds to a proportion of 0.4938, indicating that, as a percentage, 49.38% of the population occurs between a score of 4.30 and the mean of the distribution, and
· the percentage above this point is 50 − 49.38 = 0.62; that is, 0.62% of those who take the test score at 4.30 or beyond.
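Both proportions, and their sum, can be reproduced without the table. This is a minimal sketch assuming Python 3.8 or later and assuming the past scores are normally distributed with M = 4.00 and s = 0.120; the variable names are illustrative.

```python
from statistics import NormalDist

# Past test-takers: M = 4.00, s = 0.120 (treated here as a normal population).
scores = NormalDist(mu=4.00, sigma=0.120)

below_min = scores.cdf(3.80)        # proportion scoring under 3.80
above_max = 1.0 - scores.cdf(4.30)  # proportion scoring over 4.30
rejected = below_min + above_max    # total proportion outside 3.80 to 4.30

print(f"below 3.80: {below_min:.4f}")
print(f"above 4.30: {above_max:.4f}")
print(f"rejected:   {rejected:.4f}")
```

The results land close to the table-based 0.0475 and 0.0062. Small differences arise because the table requires z to be rounded to two decimals (−1.67), whereas the code carries the unrounded value.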
Apply It! boxes written by Shawn Murphy
Comparing Data from Different Tests
Earlier chapters discussed how test scores from two different instruments with different means and standard deviations can be compared. Perhaps a juvenile gang member under court-ordered counseling is required to complete two different assessments: one measuring aggression and one social alienation. The gang member scores 39 on the aggression test and 15 on the alienation test. Table 3.3 shows the means and standard deviations of the two tests.
Table 3.3: Test results for aggression and social alienation
Test                  Mean      Standard deviation
Aggression measure    32.554    5.824
Social alienation     12.917    2.674
The gang member scored higher than average on both aggression and social alienation. For which measure is the score more extreme?
Because the two tests have different means and standard deviations, comparing the raw scores directly is not helpful. However, employing the z transformation allows both scores to fit a distribution where the mean is 0 and the standard deviation is 1.0. Although the raw scores may not reveal much, the z scores can be directly compared. Recall that
z = (x − M)/s
Calculating z for the aggression score produces:
z_39 = (39 − 32.554)/5.824 = 1.107
Then calculate the z for social alienation:
z_15 = (15 − 12.917)/2.674 = 0.779
Doug Menuez/Photodisc/Thinkstock
Using z scores enables researchers to better understand test results measuring aggression and social alienation in juvenile gang members.
Interpreting Multiple z Values
Since the question is which of the juvenile’s two test scores is the more extreme, we have no need for table values, only the value of z. As both z values are positive, the larger one is farther from the mean, so the z for aggression is more extreme than that for social alienation. Performing the z transformation allows us to note that the aggression value is 1.107 standard deviations from the mean of the distribution. Alienation, meanwhile, is just 0.779 standard deviations from its mean. Practically, the results suggest that, relative to others who took each test, this individual is more aggressive than alienated. As long as raw scores, means, and standard deviations are available, researchers can use z to make direct comparisons of very different qualities, in this case, aggression and social alienation in the same individual.
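The comparison can be expressed in a few lines of code. The following is a minimal sketch in Python; the function name z_score and the variable names are our own, and the means and standard deviations are those in Table 3.3.

```python
def z_score(x, mean, sd):
    """Return the number of standard deviations x lies from the mean."""
    return (x - mean) / sd

z_aggression = z_score(39, 32.554, 5.824)
z_alienation = z_score(15, 12.917, 2.674)

# The score with the larger absolute z is the more extreme one.
if abs(z_aggression) > abs(z_alienation):
    more_extreme = "aggression"
else:
    more_extreme = "social alienation"

print(round(z_aggression, 3), round(z_alienation, 3), more_extreme)
```

Comparing absolute values rather than signed values keeps the comparison correct when one or both scores fall below the mean.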
Another Comparison
Psychologist Lewis Terman developed the Stanford-Binet test, which measures children’s intelligence. Suppose a psychologist is similarly interested in giftedness among children. Because unusual verbal ability often seems to accompany superior intelligence in gifted children, the psychologist measures both characteristics for a group of subjects. One particular subject scores 140 on intelligence and 55.0 on verbal ability. Table 3.4 lists the descriptive data for each test.
Table 3.4: Test results for intelligence and verbal ability
Test                      Mean    Standard deviation
Intelligence              100     15
Verbal ability measure    40      5.451
As in the previous example, the researcher must convert scores into z scores before they can be directly compared.
For the intelligence score, the z score is calculated as:
z = (x − M)/s
z_140 = (140 − 100)/15 = 2.667
For the verbal ability measure, the z score is calculated as:
z = (x − M)/s
z_55 = (55 − 40)/5.451 = 2.752
The z scores indicate that both test scores are about the same distance from their respective means. This makes it difficult to glance at the raw scores and know which is higher. But because both have been transformed into z scores, the two measures now belong to a common distribution, and the researcher can see that the verbal ability score (z = 2.752) is slightly higher than the intelligence score (z = 2.667).

Determining How Much of the Distribution Occurs Under Particular Areas of the Curve
If we draw a distribution and clarify what is at issue, questions about how much of the distribution is above a point, below a point, or between two points do not require researchers to observe formal rules. For the sake of order and clarity, however, the flowchart in Figure 3.6 provides some direction for answering different questions a researcher might ask about proportions within a distribution.
Figure 3.6: Flowchart to address questions pertaining to a distribution
Use the steps illustrated in the flowchart to resolve questions about the proportions within a population.
Try It!: #4
Figuratively speaking, how does the z transformation allow you to compare apples to oranges?
The list of steps must seem like a great deal to remember. In fact, the better course when confronted with a z score problem is to sketch out a distribution to produce something like Figures 3.3 and 3.4. The visual displays help clarify the question and suggest the steps needed to answer it.
3.3 z Scores, Percentile Ranks, and Other Standard Scores
Our task to this point has been to transform raw scores into z scores and then to percentages or proportions of the distribution in specified areas. If the percentages are already available, but neither the raw data nor the related descriptive statistics are, Table 3.2 (the z table) allows us to work backward to determine the z value, even without the mean and standard deviation for the data. Let us assume that published data indicate that only 1% of the population has intelligence scores above 140. What z score does this represent?

1. Because Table 3.2 lists proportions, the first step is to turn the percentage into a proportion: 1% is 1/100, which is the same as a proportion of 0.01.
2. Recall that Table 3.2 indicates the proportion of a normal population between a particular value of z and the mean for half (0.5) of the distribution. Therefore, we need a z value that includes all but that most extreme 0.01, which will be the z value for a proportion of 0.50 − 0.01 = 0.49. The z value for a proportion of 0.49 includes 49% of the distribution between it and the mean, which means it excludes the highest 1% of the distribution.
3. Table 3.2 does not list a proportion of exactly 0.49, but it does list 0.4901, which is very close. Reading leftward from the proportion to the margin and also vertically to the column heading, the associated z value for 0.4901 is 2.33. If data were gathered for intelligence scores, z = 2.33 excludes close to the top 0.01, or 1%.
To state this more directly, when viewed as z scores, any intelligence score where z > 2.33 is somewhere among the top 1% of all intelligence scores. Figure 3.7 illustrates this proportion.
Figure 3.7: The value of z associated with a particular proportion
A normal distribution curve that shows where the highest 1% of scores fall within a given population.
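Working backward from a proportion to a z value is the inverse of the table lookup. As a hedged illustration, assuming Python 3.8 or later, the standard library's NormalDist.inv_cdf performs that reverse lookup; the variable names are ours.

```python
from statistics import NormalDist

top_share = 0.01  # the most extreme 1% of scores

# inv_cdf takes the proportion BELOW the point, so pass 1 - 0.01 = 0.99.
z_cutoff = NormalDist().inv_cdf(1.0 - top_share)
print(round(z_cutoff, 2))
```

Note that inv_cdf expects the cumulative proportion from the far left of the curve (0.99 here), not the mean-to-z proportion (0.49) that Table 3.2 lists.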
Converting z Scores to Percentile Ranks
Chapter 2 introduced percentile scores. Recall that percentiles indicate the point below which a specified percentage of the group occurs. For example, 73% of the distribution occurs at or below the point defined by the 73rd percentile, and so on. Because researchers can use the table values associated with z scores to determine the percentage of the distribution occurring below a point, it is not difficult to take one more step and turn that percentage into a percentile score. For example, because

· z = 1.0 includes 34.13% between that point and the mean and
· that part of the distribution from the mean downward is 50%, then
· 34.13% + 50% = 84.13% of scores are at or below z = 1.0; therefore, z = 1.0 occurs at the 84th percentile.
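The same conversion can be sketched in code. This example assumes Python 3.8 or later; percentile_rank is a hypothetical helper name, and the cumulative proportion it uses already includes the 50% that lies below the mean.

```python
from statistics import NormalDist

def percentile_rank(z):
    """Percentage of a normal distribution at or below the given z score."""
    return 100.0 * NormalDist().cdf(z)

print(round(percentile_rank(1.0), 2))
```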
Although percentile scores can be easily determined from the table values that are associated with z scores, note an important difference between percentile scores and z scores. The z score is one of several standard scores. Standard scores are all equal-interval scores: the interval between consecutive integers is constant, which means that in terms of data scale, standard scores are interval scale. The increase in whatever is measured from z = −1.5 to z = −1.0 is the same as it is from z = 0.3 to z = 0.8. The increase is 0.5 in either case.
This interval scale does not apply to percentile scores. Because these scores indicate the percentage of scores below a point rather than reflecting a direct measure of some characteristic, the distances between consecutive scores differ widely in various parts of the distribution. Most of the data in any normal distribution are in the middle portion, where scores have the greatest frequency. The frequency with which scores occur diminishes as scores become more distant from the mean, something reflected in the curves in frequency distributions that are vertically highest in the middle and then decline as they extend outward to the two tails. Note the comparison between percentiles and z scores in Figure 3.8.
Figure 3.8: z scores and percentile scores
A comparison of z scores and percentiles for a normal distribution shows that most scores cluster near the 50th percentile, in the middle of the distribution. Meanwhile, the frequency of scores above the 99th and below the 1st percentiles is low in a normal distribution.
As a result of high frequency in the middle of the distribution, in any normal distribution the difference between consecutive percentile scores is always much smaller near the middle of the distribution (between the 50th and 51st percentiles, for example) than between consecutive percentile scores in the tails (between the 10th and 11th, or the 90th and 91st percentiles, for example). This characteristic has important implications. The difference between scoring at the 50th and 51st percentile on something like the Beck Depression Inventory is almost inconsequential compared to the difference between the 90th and 91st percentile, a much greater difference. Percentile scores are ordinal scale, whereas z scores are interval scale.
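To make the unequal spacing concrete, the sketch below measures the distance, in z units, between neighboring percentiles. It assumes Python 3.8 or later and a normal distribution; z_gap is a hypothetical helper name.

```python
from statistics import NormalDist

def z_gap(lower_pct, upper_pct):
    """Distance in z units between two percentile ranks (given as 0-100)."""
    nd = NormalDist()
    return nd.inv_cdf(upper_pct / 100) - nd.inv_cdf(lower_pct / 100)

middle_gap = z_gap(50, 51)  # near the center of the distribution
tail_gap = z_gap(90, 91)    # out in the upper tail

print(f"50th to 51st: {middle_gap:.3f} z units")
print(f"90th to 91st: {tail_gap:.3f} z units")
```

The one-percentile step in the tail spans more than twice as many z units as the step at the center, which is why percentile scores cannot be treated as equal-interval.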
Converting z to Other Standard Scores
Part of the appeal of the z score is that it enables researchers to readily determine relative performance. A positive z value indicates that the individual has scored in the upper half of the distribution. Someone who scores one standard deviation beyond the mean, as we noted earlier, has scored at the 84th percentile, and so on. The z scores belong to a family of measures called “standard scores.” They have in common these characteristics: (a) a fixed mean and standard deviation and (b) equal intervals between consecutive data points.
Another standard score is the t score. It is used in the place of z scores when those reporting them prefer not to report negative scores, which of course are half of all possible z values. After calculating the z value, a researcher can easily change it to a t score. In fact this is true for any score that has a fixed mean and standard deviation, whether it is a standard score like t or, for example, a Graduate Record Exam (GRE) score (see Table 3.5), which also has a fixed mean and standard deviation.
Table 3.5: Comparison of t scores and GRE scores
                        Mean    Standard deviation
t scores                50      10
Graduate Record Exam    500     100
Try It!: #5
What makes a score a standard score?
Either score can be derived from z. To convert from z to t, for example, simply multiply z by 10 and add 50. So, if z = 1.75, then
t = (10 × 1.75) + 50 = 17.5 + 50 = 67.5
z = 1.75 is the same as t = 67.5.
For GRE, we would multiply z by 100 and add 500:
GRE = (100 × 1.75) + 500 = 175.0 + 500 = 675
z = 1.75 is the same as GRE = 675 (and as t = 67.5).
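Both conversions are the same linear rescaling with different constants, which a short sketch makes explicit. The function name from_z is our own; the means and standard deviations are those listed in Table 3.5.

```python
def from_z(z, mean, sd):
    """Convert a z score to a scale with the given fixed mean and sd."""
    return sd * z + mean

z = 1.75
t_score = from_z(z, mean=50, sd=10)      # t scale: mean 50, sd 10
gre_score = from_z(z, mean=500, sd=100)  # GRE scale from Table 3.5
print(t_score, gre_score)
```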
Although more common in educational than in psychological testing and research, normal curve equivalent scores (NCE) and “stanine” scores (standard nine-point scale) are also examples of standard scores. Like z and t, each is equal-interval, and both have fixed means and standard deviations.
3.4 Using Excel to Perform the z Score Transformation
monkeybusinessimages/iStock/Thinkstock
Evaluating achievement motivation scores can give researchers valuable information about the relationship between poverty and achievement in schools.
The z score transformation is a fairly simple formula. As a result, to program it into Excel and transform an entire data set into z scores is not difficult. In fact, the application offers several ways to do this, but this chapter will explore just one. It involves programming the z score transformation formula directly into the data sheet.
A researcher interested in the relationship between poverty and achievement motivation among secondary-school-aged young people gathers data from a group of students whose families qualify for free and reduced-price lunches at school. The achievement motivation scores are as follows:
4, 5, 7, 7, 8, 9, 9, 9, 10, 13

To use Excel to transform those data into their z score equivalents, follow these steps:
1. List the data in Excel in Column B, with the label “Ach Mot” in B1.
2. Enter the 10 scores into cells B2 to B11.
3. In cell B12, enter the formula =average(B2:B11). (Note: Virtually all spreadsheets, including Excel, have shortcuts for the more common calculations, such as means and standard deviations. A user can enter the formula, as we have done here, or use a shortcut. Shortcut procedures vary, however, depending on the operating system and the version of Excel. Excel for Mac, for example, allows users to enter the data, position the cursor where they desire the statistic’s value to appear, and then double-click the name of the desired statistic under the Formula tab.)
a. The equal sign indicates to Excel that a formula follows.
b. The command average will provide the arithmetic mean.
c. When several cells are to be included in the function, they are placed in parentheses ( ). When the cells are consecutive, the colon (:) indicates that all cells from B2 to B11 are to be included in the function.
4. Press Enter.
5. In cell A12, enter the label “mean =.” The value in cell B12 will be 8.1, the mean of the achievement motivation scores.
6. In cell B13, enter the formula =stdev(B2:B11). Note that stdev is the Excel abbreviation for “sample standard deviation.” Newer versions of Excel, including Excel for Mac, also offer the equivalent function stdev.s.
7. Press Enter.
8. In cell A13, enter the label “std dev =.” The value in cell B13 will be 2.558211, the standard deviation of the scores.
9. In cell C1, enter the label “equiv z.”
10. In cell C2, enter the formula =(B2-8.1)/2.558 and press Enter. Consistent with the z score transformation, this formula subtracts the mean from the raw score in cell B2 and then divides the result by the standard deviation, 2.558, which we rounded to three decimals.
11. Repeat that operation for all the other scores as shown next:
a. With the cursor in cell C2, click and drag the cursor down from C2 to C11 so that cells C2 to C11 are highlighted.
b. In the Editing section at the top of the page near the right side is a Fill command with a down-arrow at the left (for Macs, the command is on the left side, below the Home tab). Click the down-arrow to the right of the Fill command, then click Down. This action will repeat the formula in C2 for the other nine cells, adjusting for the different test scores in each cell.
Figure 3.9 shows how the spreadsheet will look after Step 11, with the z score equivalents of all the original achievement motivation scores displayed to the right of the original scores.
Figure 3.9: Raw scores transformed to z scores in Excel
Excel converts raw scores to z scores using a simple formula.
Source: Microsoft Excel. Used with permission from Microsoft.
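For readers who work outside a spreadsheet, the same transformation can be sketched in Python. This is an illustrative alternative, not the chapter's Excel procedure; it uses only the standard library, and the variable names are ours.

```python
from statistics import mean, stdev

ach_mot = [4, 5, 7, 7, 8, 9, 9, 9, 10, 13]

m = mean(ach_mot)   # plays the role of =average(B2:B11)
s = stdev(ach_mot)  # sample standard deviation, like =stdev(B2:B11)

# One z score per raw score, the counterpart of column C in the sheet.
z_scores = [(x - m) / s for x in ach_mot]

for x, z in zip(ach_mot, z_scores):
    print(f"{x:>3}  {z:+.3f}")
```

Unlike the spreadsheet formula, which uses the standard deviation rounded to 2.558, this sketch carries the unrounded value, so results may differ slightly in the later decimal places.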
3.5 Using z Scores to Determine Other Measures
Occasionally, a researcher has access to z scores and the mean and standard deviation but not the original raw scores. Formula 3.1 used the raw score (x), the mean (M), and the standard deviation (s) to determine the value of z, but actually, any three of the values in the formula can be used to determine the value of the fourth. Just as Formula 3.1 uses x, M, and s to determine z, we could use z, M, and s to derive x. Altering Formula 3.1 to determine the value of something other than z involves a little algebra but is not difficult.
Determining the Raw Score
To determine the raw score, follow these steps:
1. Because z = (x − M)/s, swap the terms before and after the equal sign so that (x − M)/s = z.
2. To eliminate the s in the denominator of the first term, multiply both sides by s so that it disappears from the first term and emerges in the second: x − M = sz.
3. To isolate x, add M to both sides of the equation so that x = sz + M.
Returning to the Excel problem, if the z scores and descriptive statistics are available, we can determine the raw score for which z = −1.603 as follows:
If M = 8.10, s = 2.558, and x = z × s + M, substituting the values we have produces x = (−1.603)(2.558) + 8.10 = 3.9995, which rounds to 4.0.
Checking the earlier data reveals that 4 was indeed the raw score for which z = −1.603.
Determining the Standard Deviation
If the raw scores, the mean, and z are available, but s is lacking, begin with z = (x − M)/s, so (x − M)/s = z. Taking the reciprocal of each half of the equation, which means inverting the terms so that (x − M)/s becomes s/(x − M) and z/1 becomes 1/z, gives us s/(x − M) = 1/z. Multiplying both sides by (x − M) yields s = (x − M)/z.
Using the data from the Excel problem again, for the first participant, x = 4, M = 8.10, and z = −1.603. According to the adjusted formula,
s = (x − M)/z
therefore, substituting the values we have produces the following:
s = (4 − 8.1)/(−1.603) = 2.5577
which rounds to 2.558, the standard deviation value for the original data set.
Determining the Mean
It would be very unusual for a researcher to have the z scores, the standard deviation, and the original achievement motivation score but not have the mean. However, just to complete the set, the mean can be determined from the other three values as follows:
Because
z = (x − M)/s
if both halves of the equation are multiplied by s, then s appears in the first term and disappears from the second. The result is
sz = x − M
If M is then added to both sides, M appears in the first term and is eliminated in the second. The result is
sz + M = x
If sz is then subtracted from both sides, it is eliminated from the first term and added to the second. The result is
M = x − sz
For the first participant, the achievement motivation score x = 4.0, the z score = −1.603, and s = 2.558. The mean for the test can be determined as follows:
M = x − sz, or
M = 4 − (2.558 × −1.603) = 8.10, which was the original mean.
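The three rearrangements of Formula 3.1 can be collected in one place. The sketch below is illustrative only; the function names are hypothetical, and each body mirrors one of the rearranged formulas above.

```python
def raw_from_z(z, mean, sd):
    """x = sz + M"""
    return z * sd + mean

def sd_from_z(x, mean, z):
    """s = (x - M)/z; undefined when z = 0, because then x equals M."""
    return (x - mean) / z

def mean_from_z(x, sd, z):
    """M = x - sz"""
    return x - sd * z

# First participant in the Excel example: x = 4, M = 8.10, s = 2.558, z = -1.603.
print(round(raw_from_z(-1.603, 8.10, 2.558), 4))
print(round(sd_from_z(4, 8.10, -1.603), 4))
print(round(mean_from_z(4, 2.558, -1.603), 2))
```

Because z and s were each rounded to three decimals, the recovered values match the originals only after rounding, just as in the worked examples.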
Maintaining Fixed Means and Standard Deviation
One of the characteristics of widely used standardized tests is that their mean and standard deviation values remain the same over time. The major intelligence tests, for example, have a fixed mean of 100 and a standard deviation of 15, even though the Stanford-Binet and Wechsler tests have been revised several times. When a test is revised and updated, do the means and standard deviations likewise change? In fact, they do. Flynn and Weiss (2007) documented significant increases in intelligence scores over a 70-year period, but to make the scores comparable over time, psychologists use what are called modified standard scores. The modified standard score allows those working with the test to gather data that have any mean and any standard deviation and then adjust them so that they conform to predetermined values. This process follows these steps:
1. Gather data with the new instrument.
2. Determine the equivalent z scores for test-takers’ raw scores.
3. Apply the formula.
Formula 3.2 is used to modify a score so that the mean and standard deviation for the population of scores take on specified values.
Formula 3.2
MSS = (s_spec)(z) + M_spec
where
MSS = the modified standard score,
s_spec = the specified standard deviation, and
M_spec = the specified mean.
Note that this formula is the same one used to transform z scores into t scores. By way of an example, perhaps a psychologist has developed what she has labeled the Brief Intelligence Test (BIT). To compare results to tests her colleagues have used traditionally, she wants the BIT’s descriptive characteristics to conform to those of the more established tests. For eight participants, the BIT scores are as follows:
22, 25, 26, 29, 29, 32, 32, 35
Of course, no one norms an intelligence test on only eight people. The potential for what we will later call sampling error is too great. Still, to illustrate the process, we will assume the sample scores are valid.
Verify that M = 28.75 and s = 4.268.
For the participant with an intelligence score of 22, the corresponding z value is
z = (x − M)/s = (22 − 28.75)/4.268 = −1.582
To determine that participant’s score on an instrument with a mean of 100 and a standard deviation of 15, the psychologist will apply the formula:
MSS = (s_spec)(z) + M_spec
= (15)(−1.582) + 100
= 76.276 (carrying the unrounded value of z; with z rounded to −1.582, the result is 76.27)
Although the original BIT score was 22, using the z transformation and modified standard score procedures makes the BIT score conform to the mean and standard deviation of a more established test. Among scores for which the mean is 100 and the standard deviation is 15, the modified standard score for the BIT score of 22 is 76.276.
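The three-step process can be sketched end to end for the eight BIT scores. This is a minimal illustration under the chapter's assumption that the sample scores are valid; the function name modified_standard_scores is hypothetical, and the sample standard deviation is used, as in the rest of the chapter.

```python
from statistics import mean, stdev

def modified_standard_scores(raw_scores, spec_mean, spec_sd):
    """Rescale raw scores to a specified mean and sd (Formula 3.2)."""
    m = mean(raw_scores)
    s = stdev(raw_scores)  # sample standard deviation
    return [spec_sd * ((x - m) / s) + spec_mean for x in raw_scores]

bit = [22, 25, 26, 29, 29, 32, 32, 35]
mss = modified_standard_scores(bit, spec_mean=100, spec_sd=15)
print([round(v, 3) for v in mss])
```

After the adjustment, the eight modified scores themselves have a mean of 100 and a sample standard deviation of 15, which is the point of the procedure.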
Writing Up Statistics
Although z scores are an important part of data analysis, like the raw scores that researchers gather in their work, z scores often do not appear in research reports. Reports list the means and standard deviations, but often omit the raw scores and their z score transformation. In a study of the weight-gain side effect that anti-psychotic drugs might have on adolescents, Overbeek (2012), however, used a combination of height and weight to determine body-mass-index (BMI) scores for each subject, and then transformed the BMI scores into easier-to-interpret z scores. A z score near 0 indicated that given the subject’s height, weight was probably appropriate. A positive z score indicated that the individual might be overweight, a negative z score indicated underweight, and so on. Overbeek also used z scores to index the weight-gain data over the course of the study.
Brown (2012), too, used z scores. In his study, they offered a way to counter the effect that grade inflation has on college students’ class rankings. He posited that class rankings are less informative than they once were because weaker students in departments where coursework is easier are ranked ahead of students who have a higher level of academic aptitude but compete in more demanding programs. Brown’s solution was to use the z transformation within departments to indicate how much above or below the mean students were in their individual programs.
Summary and Resources
Chapter Summary
Normal distributions are unimodal and symmetrical, and their standard deviations tend to be about one-sixth of the range. Although not all data are normally distributed, many of the mental characteristics that psychologists and social scientists measure are normal. Because the proportions of the population that occur in specified ranges remain constant in normally distributed populations, we can have some confidence about how scores will be arrayed even before we view a display of the data.
What the standard normal distribution, or z distribution, does is capitalize on the consistency in normally distributed populations by offering one distribution by which all other normal populations can be referenced. In this distribution, where the mean is always 0 and the standard deviation is 1.0 (Objective 1), table values indicate the proportions of the population likely to occur anywhere along its range. By transforming raw scores (Objective 2) from any normal population so that they fit this z distribution, we can take advantage of how well the characteristics of this distribution are known and answer important questions about data from any population (Objective 3) in terms of z:
· For example, when someone scores at a particular level, we can ask what proportion of the entire population is likely to score below (or above) that point.
· When most of the people in a particular group score between two points, we can ask what proportion of the entire population will score between (or outside) those points.

Because the z score transformation is a relatively simple formula, programming Excel to produce the z equivalents for any set of scores (Objective 4) is simple and can be helpful with large data sets.
The z is one of several standard scores in fairly common use. Whether z or some other, all standard scores indicate how distant one individual’s score is from the mean of the distribution. Rather than providing an absolute measure of some characteristic, standard scores are normative, meaning that they indicate the level of what is measured relative to others in the same population. Those who prefer not to deal in negative values (which characterize half of the z distribution) can employ t scores. In all material respects, t is the same as z, except that the mean is 50 and the standard deviation is 10.
The modified standard score (Objective 6) enhances standard scores’ ability to communicate an individual’s standing relative to a population. Researchers often use standard scores to report the data from standardized tests, but these tests are revised from time to time, which can affect the test means and standard deviations. To ensure stability, the modified standard score uses the z transformation as a way to maintain constant descriptive characteristics, even as the instrument used to measure it, or even the characteristic measured, changes with time.
In the incremental nature of statistics books, each chapter prefaces the next. Chapters 1–3 are a prelude to Chapter 4. With all our effort to label, display, and describe data sets, the focus in the discussion of z scores and the other topics has been primarily about analyzing the performance of individuals. Behavioral scientists, however, are generally much more interested in asking questions about groups. Analyzing how those in a sample compare to those in the entire population is the focus of Chapter 4. It will do so by expanding discussion of the z distribution.
The math and the logic involved in Chapter 4 will be much the same. If the discussion in this chapter makes sense, the material in Chapter 4 will not be difficult. Still, it is a good idea to review the Chapter 3 material and recalculate the sample problems, as repetition has value.
Key Terms
modified standard scores
percentile
probability
standard normal distribution
standard scores
t score
z score
z transformation
Review Questions
Answers to the odd-numbered questions are provided in Appendix A.
1. A researcher is interested in people’s resistance to change. For the dogmatism scale (DS), data for 10 participants are as follows:
28, 28, 29, 29, 32, 33, 35, 36, 39, 42
a. What is the z score for someone who has a DS score of 28?
b. Will a z score for a raw score of 35 be positive or negative? How do you know?
c. How many standard deviations is a score of 28 from the mean?
d. What will be the z value of a raw score of 33.1?
e. Since there are no scores below 28, does it make any sense to calculate z for a raw score of 25, for example? Shouldn’t such a score have a zero probability of occurring?
2. Examining the relationship between recreational activity and level of optimism among senior citizens, a psychologist develops the Recreation Activity Test (RAT). Scores for 8 participants are as follows:
11, 11, 14, 14, 14, 17, 18, 22
a. What is the z score for someone with RAT = 23?
b. Why isn’t 0 the answer to 2a, since none of the participants scored 23?
c. What is the z score for someone with RAT = 15.125?
d. Explain the answer to 2c.
e. How does z allow one to compare tests with entirely different means and standard deviations?
3. One participant has RAT = 11. The same individual is administered a Consistent Approval Test (CAT) and scores 45. The CAT data, including that participant’s score, are as follows:
42, 45, 48, 49, 55, 58, 62, 64
a. Which score is higher, the RAT or the CAT?
b. Why isn’t the answer to 3a automatically CAT, since it has the higher mean value?
4. Researchers developed the ANxious, Gnawing Stress Test (ANGST) to measure emotional stability among law-enforcement professionals. A random sample of police patrol officers yielded the following scores:
54, 58, 61, 64, 75, 81, 82, 85
a. What proportion of the population will score 81 or higher?
b. What proportion will score 60 or higher?
c. If x > 75 is the cutoff for “highly stressed,” what is the probability that someone, selected at random, will be highly stressed?
d. What is the probability of scoring between 60 and 81?
5. Using the data in Question 4, what percentage of the population will score lower than 55?
6. What is the t equivalent to a z score for someone with an ANGST score of 64?
a. What is the mean of the t distribution?
b. Why is t sometimes preferred over z?
c. If z = 2.5, what is t?
7. Refer to the data in Questions 3 and 4: For an individual who scores 60 on RAT and 78 on ANGST, which is the higher score?
8. In any standard normal distribution, determine the following:
a. What percentage of scores will occur below z = 0?
b. What is the probability of a positive value of z?
c. What percentage of scores will occur between ±1.96 z?
9. If someone scores z = 1.0, what is the corresponding percentile rank?
10. What percentile rank is z = 0? What measure of central tendency represents the 50th percentile?
11. A psychologist wishes to maintain a mean of 25 and a standard deviation of 5 for a test developed to measure compulsive behavior. On a revised test, the scores are as follows:
14, 17, 19, 19, 22, 27, 28, 29
a. What is the modified standard score for the person who scored 17 on the revised instrument?
b. What is the probability of scoring 17 or lower according to the eight scores?
12. Given the data in Question 11,
a. What is the z equivalent of a raw score of 28?
b. What is the probability of scoring somewhere from 14 to 29?
13. Draw a normal distribution and identify where z = −1.17 and z = +2.53 are located. What percentage of the population occurs between these two z values?
Answers to Try It! Questions
1. A z of 1.5 indicates that the associated raw score is 1.5 standard deviations (the denominator in z) from the mean.
2. Because the standard normal distribution (the z distribution) is normal, the distribution is symmetrical. The proportion of the distribution between a value of z and the mean will be the same for a negative z as it is for a positive z with the same numerical value.
3. Do not refer to Table 3.2 for help with this one. The table’s highest score is z = 3.09, but in fact z has no upper limit. In theory, the tails in the z distribution never actually touch the horizontal axis of the graph, which means that there exists always at least the possibility of scores higher (or lower) than any already measured.
4. One of the values of the z transformation is that scores that have any descriptive characteristics can be recalibrated so that they fit a distribution where the mean is 0 and the standard deviation is 1.0. By doing so, scores from any variety of sources can be compared directly after converting them to z scores. The only requirement is that they be normally distributed.
5. Standard scores are equal-interval scores with a fixed mean and standard deviation, thereby allowing the magnitude of the score to indicate how an individual compares to all others for whom scores are available.
4
Applying z to Groups

Victor Faile/Corbis
Chapter Learning Objectives:
After reading this chapter, you should be able to do the following:
1. Describe the distribution of sample means.
2. Explain the central limit theorem.
3. Analyze the relationship between sample size and confidence in normality.
4. Calculate and explain z test results.
5. Explain statistical significance.
6. Calculate and explain confidence intervals.
7. Explain how decision errors can affect statistical analysis.
8. Calculate the z test using Excel.
Introduction
As we noted at the end of Chapter 3, researchers are generally more interested in groups than in individuals. Individuals can be highly variable, and what occurs with one is not necessarily a good indicator of what to expect from someone else. What occurs in groups, on the other hand, can be very helpful in understanding the nature of the entire population. A Google search indicates that the suicide rate is higher among dentists than it is among those of many other professions. If we wanted to experiment with some therapy designed to relieve depressive symptoms among dentists, we would be more confident observing how a group of 50 dentists responds than in examining results from just one. This chapter will use the material from the first three chapters to begin analyzing people in groups.
Noting that many of the characteristics that interest behavioral scientists are normally distributed in a population implies that some characteristics are not. Since samples can never exactly emulate their populations, it may not be clear in the midst of a particular study when data are normally distributed. This uncertainty potentially poses a problem: we may wish to use the z transformation and Table B.1 of Appendix B in our analysis, but Table B.1 is based on the normality assumption. If the data are not normal, where does that leave the related analysis?
This is a self-assessment and will not affect your grade. You
may only take this pre-test once.
Test Ch 4: Applying z to Groups
1. The potential for sampling error diminishes as the size of the
sample grows.
· a. TRUE
· b. FALSE
2. Populations based on samples are more inclined to normality
than populations based on individual scores.
· a. FALSE
· b. TRUE
3. Determining normality requires all of the scores in a
population.
· a. TRUE
· b. FALSE
4. The z test produces a z value based on individual scores
rather than on sample means.
· a. TRUE
· b. FALSE
5. A statistically significant result is one that is unlikely to have
occurred by chance.
· a. FALSE
· b. TRUE
4.1 Distribution of Sample Means
iStockphoto/Thinkstock
A population is all members of a defined group, such as all voters in a county.
What options do researchers have if they are suspicious about data normality? One important answer is the distribution of sample means, so named because the scores that constitute the distribution are the means of samples rather than individual scores.
Note that the descriptor population means all possible members of a defined group. Recall that the frequency distribution, the bell-shaped curve representing the population, was a figure based on the individual measures sampled one subject at a time. In discussing the frequency distribution, we assumed that we would measure each individual on some trait, and then plot each individual score. Instead of selecting each individual in a population one at a time, suppose a researcher
1. selects a group with a specified size;
2. calculates the sample mean (M) for each group;
3. plots the value of M (rather than the value of each score) in a frequency distribution;
4. and continues doing this until the population is exhausted.
How would plotting group scores rather than individual ones affect the distribution? Would the end result still be a population? The answer to the second question is yes: because every member is included, it is still a population. Whether a population is measured individually or as members of a group is incidental, as long as all are included.
Perhaps researchers are interested in language development among young children and wish to measure mean length of utterance (MLU) in a county population. Whether the researchers measure and plot MLU for each child in a county’s Head Start program or plot the mean MLU for every group of 25 in the program, the result is population data for Head Start learners for that county.
The Central Limit Theorem
The answer to the question “how would the distribution be affected?” is a little more involved, but it is important to nearly everything we do in statistical analysis. It involves what is called the central limit theorem:
If a population is sampled an infinite number of times using sample size n and the mean (M) of each sample is determined, then the multiple M measures will take on the characteristics of a normal distribution, whether or not the original population of individuals is normal.
Take a minute to absorb this. A population of an infinite number of sample means drawn from one population will reflect a normal distribution whatever the nature of the original distribution. A healthy skepticism prompts at least two questions: 1) How would we prove whether this is true since no one can gather an infinite number of samples? and 2) Why does sampling in groups rather than as individuals affect normality?
Although prove is too strong a word, we can at least provide evidence for the effect of the central limit theorem using an example. Perhaps a psychologist is working with 10 people on their resistance to change, their level of dogmatism. Technically, because 10 constitutes the entire group, the population is N = 10. Recall that N refers to the number in a population. Even with a small population we cannot have an infinite number of samples, of course, but for the sake of the illustration we will assume that
· dogmatism scores are available for each of the 10 people;
· the data are interval scale;
· the scores range from 1 to 10; and
· each person receives a different score.
So with N = 10, the scores are 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. Figure 4.1 depicts a frequency distribution of those 10 scores.
The distribution in Figure 4.1 is not normal. With R = 10 − 1 = 9 and s = 3.028 (a calculation worth checking), the distribution is extremely platykurtic (i.e., flatter than normal); the range is less than 3 times the value of the standard deviation rather than the approximately 6 times associated with normal distributions. There is either no mode or there are 10 modes, neither of which suggests normality. We can illustrate the workings of the central limit theorem with a procedure Diekhoff (1992) used. We will use samples of n = 2, and make the example manageable by using one sample for each possible combination of scores in samples of n = 2 from the population, rather than an infinite number of samples.
Figure 4.1: A frequency distribution for the scores 1 through 10: Each score occurring once
A frequency distribution of ten scores, each with a different value. This type of distribution, which is not normal, is highly platykurtic.
Table 4.1 lists all the possible combinations of two scores from values 1–10. Ninety combinations of the 10 dogmatism scores are possible (10 first scores × 9 remaining second scores, with each order counted separately). The larger the sample size, the more readily it demonstrates the tendency toward normality, but all combinations of (for example) three scores would result in a very large table.
Table 4.1: All possible combinations of the integers 1–10
1, 2     2, 1     3, 1     4, 1     5, 1     6, 1     7, 1     8, 1     9, 1     10, 1
1, 3     2, 3     3, 2     4, 2     5, 2     6, 2     7, 2     8, 2     9, 2     10, 2
1, 4     2, 4     3, 4     4, 3     5, 3     6, 3     7, 3     8, 3     9, 3     10, 3
1, 5     2, 5     3, 5     4, 5     5, 4     6, 4     7, 4     8, 4     9, 4     10, 4
1, 6     2, 6     3, 6     4, 6     5, 6     6, 5     7, 5     8, 5     9, 5     10, 5
1, 7     2, 7     3, 7     4, 7     5, 7     6, 7     7, 6     8, 6     9, 6     10, 6
1, 8     2, 8     3, 8     4, 8     5, 8     6, 8     7, 8     8, 7     9, 7     10, 7
1, 9     2, 9     3, 9     4, 9     5, 9     6, 9     7, 9     8, 9     9, 8     10, 8
1, 10    2, 10    3, 10    4, 10    5, 10    6, 10    7, 10    8, 10    9, 10    10, 9
For each possible pair of scores, if we calculate a mean and plot the value in a frequency distribution as a test of the central limit theorem, the result is Figure 4.2. Because the entire distribution is based on sample means, Figure 4.2 is a distribution of sample means. Strictly speaking, this distribution is not normal, but although based on precisely the same data, Figure 4.2’s distribution is a good deal more normal than the distribution in Figure 4.1.
Figure 4.2: A frequency distribution of the means of all possible pairs of scores 1 through 10
A distribution of the means of each possible pair of scores with values between 1 and 10. This distribution is not normal, but has more normality than the distribution shown in Figure 4.1.
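The construction behind Figure 4.2 can be reproduced with a short script. The following is a minimal Python sketch, not part of the original text; the variable names are illustrative, and it assumes the 90 pairs are all ordered pairs of two different scores, as listed in Table 4.1.

```python
from collections import Counter
from itertools import permutations

scores = range(1, 11)  # the ten dogmatism scores, 1 through 10

# Every ordered pair of two different scores, as in Table 4.1 (10 * 9 = 90 pairs).
pairs = list(permutations(scores, 2))
sample_means = [(a + b) / 2 for a, b in pairs]

# Count how often each sample mean occurs; these counts are the bar heights
# of a frequency distribution like Figure 4.2.
frequencies = Counter(sample_means)
for mean in sorted(frequencies):
    print(f"{mean:4.1f} {'*' * frequencies[mean]}")

# Mean of the distribution of sample means.
grand_mean = sum(sample_means) / len(sample_means)
print(len(pairs), grand_mean)
```

The text histogram peaks in the middle (ten pairs average 5.5) and thins toward the tails (only two pairs average 1.5, and two average 9.5), which is the "more normal" shape the figure shows.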
Mean of the Distribution of Sample Means
The symbol used for a population mean to this point, µ, is actually the symbol for a population mean formed from one score at a time. To distinguish between the mean of the population of individual scores and the mean of the population of sample means, we’ll subscript µ with an M: µM. This symbol indicates a population mean (µ) based on sample means (M).
Try It!: #1
Why is there less variability in the distribution of sample means than in a distribution of individual scores?
With a distribution of just 90 sample means, Figure 4.2 shows nothing like an infinite number, of course, but it is instructive nevertheless. The mean of the scores 1 through 10 is 5.5 (µ = 5.5). Study Figure 4.2 for a moment. What is the mean of that distribution? The mean of our distribution of sample means is also 5.5: µM = 5.5. The point is this: When the same data are used to create two distributions, one a population based on individual scores and the other a distribution of sample means, the two population means will have the same value, or, symbolically stated: µ = µM.
Describing the distribution as “normal” is a stretch, but Figure 4.2 is certainly more normal than Figure 4.1. For one thing, rather than the perfectly flat distribution that occurs when all the scores have the same frequency, some scores have greater frequency than others. The sample means near the middle of the distribution in Figure 4.2 occur more frequently than the sample means at either the extreme right or left.
Why are extreme scores less likely than scores near the middle of the distribution? It is because many combinations of scores can produce the mean values in the middle of the distribution, but comparatively few combinations can produce the values in the tails of the distribution. With repetitive sampling, the mean scores that can be produced by multiple combinations increase in frequency and the more extreme scores occur only occasionally, which the next section illustrates.
Variability in the Distribution of Sample Means
In the original distribution of 10 scores (Figure 4.1), what is the probability that someone could randomly select one score (x) that happens to have a value of 1? Because there are 10 scores, and just one score of 1, the probability is p = 1/10 = 0.1, right? By the same token, what is the probability of selecting x = 10? It is the same, p = 0.1.
Moving to the distribution based on 90 sample means (Figure 4.2), what is the probability of selecting a sample of n = 2 that will have M = 1.0? Is there any probability of selecting two scores out of the 10 that will have M = 1.0? Because there is only one value of 1, there is no way to select two values with M = 1.0. As soon as a score of 1 is averaged with any other score in the group, all of which are greater than one, the result is M > 1. That is why the lowest possible mean score in Figure 4.2 is 1.5, which can only occur when 1 and 2 are in the same sample.
The same thing occurs in the upper end of the distribution. The probability of selecting a group of n = 2 with M = 10 is also zero (p = 0) because all other scores have lower values than 10. For the 90 possible combinations, the highest possible mean score is 9.5, which can occur only when the 10 and the 9 happen to be in the same sample.
The point is that variability in group scores is always less than the variability in individual scores. A related point is that the impact of the most extreme scores in a distribution diminishes when they are included in samples with less extreme scores. Applied, these principles mean that a researcher examining, for example, problem-solving ability among a group of subjects can afford to be less concerned about the impact of one extremely low or one extremely high score as the size of the group increases. Larger group sizes minimize the effect of extreme scores.
Standard Error of the Mean
Recall that sigma, σ, indicates a population’s standard deviation. Specifically, σ indicates a standard deviation from a population of individual scores. The symbol for the standard deviation in a distribution of sample means is σM and, as the subscript M suggests, it measures variability among the sample means. The formal name for σM is the standard error of the mean.
In the language of statistics, error as in standard error of the mean refers to unexplained variability. As we move through the different procedures, we will calculate other standard errors, which all have this in common: all are measures of unexplained data variability.
Earlier, we noted that whether charting the distribution of individual scores or the distribution of sample means, the means of the two distributions will always be equal: µ = µM. Is it logical to expect the same from the measures of variability; in other words, will σ = σM? The fact that the distribution of individual scores always has more variability than the distribution of sample means answers this question. Symbolically speaking, σ > σM, something that the dogmatism data show.
The standard deviation of the 10 original scores (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) is σ = 2.872. Note that this instance and the calculation of the standard error of the mean just below deal with populations and the formula must, therefore, employ N, rather than n − 1. Elsewhere in the book, however, the formula will always be n − 1.
The standard error of the mean can be calculated by taking the standard deviation of the mean scores of each of those 90 samples which constituted the distribution of sample means. The calculation is a little laborious, and happily, not a pattern that must be followed later, but the value is σM = 1.915.
So, as predicted, σ has a larger value than σM. The smaller value for σM reflects the way less extreme scores moderate the more extreme scores when they occur in the same sample.
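The "laborious" calculation is easy to hand to a computer. The sketch below is an illustration rather than the book's own procedure: it assumes the 90 samples are the ordered pairs of Table 4.1 and uses Python's `statistics.pstdev`, which applies the population formula (dividing by N rather than n − 1).

```python
from itertools import permutations
from statistics import pstdev

scores = list(range(1, 11))  # the ten dogmatism scores

# Population standard deviation of the individual scores (divides by N).
sigma = pstdev(scores)

# Means of all 90 ordered pairs of different scores, then their
# population standard deviation: the standard error of the mean.
sample_means = [(a + b) / 2 for a, b in permutations(scores, 2)]
sigma_m = pstdev(sample_means)

print(round(sigma, 3), round(sigma_m, 3))
```

Rounded to three decimals, the two results reproduce the values quoted above, σ = 2.872 and σM = 1.915.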
Sampling Error
Although the standard error of the mean does not per se refer to a mistake, another kind of error, sampling error, does. In inferential statistics, samples are important for what they reveal about populations. However, information from the sample is helpful for drawing inferences only when the sample accurately represents the population. The degree to which the sample does not represent the population is the degree of sampling error.
Samples reflect the population with the greatest fidelity when two prerequisites are met: 1) the sample must be relatively large, and 2) the sample must be based on random selection.
The safety of large samples is explained by the law of large numbers. According to this mathematical principle, as a proportion of the whole, errors diminish as the number of data points increases. The potential for serious sampling error diminishes as the size of the sample grows. The text earlier referred to this principle in noting that the distorting effect of extreme scores diminishes as sample size grows.
Random selection refers to a situation where every member of the population has an equal probability of being selected. Random selection contrasts with what are called convenience samples, samples that are used intact because they are handy. A sociology professor who uses the students in a particular section of his class is relying on a convenience sample, which is one type of nonrandom sample.
A random sample of n = 5 could be created from the 10 people being treated for dogmatic behavior by assigning each person a number, placing the 10 numbers into a paper bag, shaking the bag well, and without looking, drawing out five numbers.
The result would be a randomly selected sample. When randomly selected, samples differ from populations only by chance. They will differ, of course, but the differences are less and less important as sample size grows.
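The paper-bag procedure has a direct software equivalent. As a minimal sketch (the ID numbers 1 through 10 are hypothetical labels, not data from the text), Python's `random.sample` draws without replacement so that every person has the same chance of selection:

```python
import random

people = list(range(1, 11))  # hypothetical ID numbers for the 10 people

# Draw five distinct IDs; every person is equally likely to be chosen,
# which is what "shaking the bag and drawing without looking" achieves.
sample = random.sample(people, k=5)
print(sorted(sample))
```

Each run produces a different sample, which is the chance variation between sample and population that the paragraph above describes.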

National Atlas
Systematic sampling error produces results drastically different from the actual outcome. Such a sampling error occurred before the 1936 presidential election, when a study predicted a win by Alf Landon. The election results, displayed in the map, were drastically different.
If the sample should fail to capture some important characteristic of the population other than its size, the problem is sampling error. The important characteristic might be the mean, for example, and when M ≠ µ, sampling error has occurred. In fact, some sampling error always occurs because a sample can never exactly duplicate all the descriptive characteristics of the population, but sampling error will usually be minor if samples are relatively large and randomly selected.
Statistical analysis procedures tolerate minor, random sampling error, but systematic sampling error is another matter. Systematic sampling error occurs when the same mistake is made time after time.
In 1936, the publishers of Literary Digest, a prominent publication of the time, decided to predict the outcome of that year’s presidential election in the United States. To ensure that sample size would not pose a problem, they sent out millions of postcards to registered voters. Literary Digest seemed to at least have met the requirement for a relatively large sample, because the Harris and Gallup polling organizations typically obtain very accurate results with a few thousand, and sometimes just a few hundred, responses. Unfortunately, the publishers decided to rely on telephone books and automobile registrations to locate poll recipients. Consider the historical setting. At the height of the Great Depression, two indicators of relative prosperity identified voters: a telephone in the home and a currently registered car. The study proved disastrous for the magazine’s reputation. Poll results indicated that Alf Landon would win, but of course Franklin Roosevelt was elected in a landslide to a second term, carrying every state in the union except Maine and Vermont.
The problem was systematic sampling error. The voters were consistently and nonrandomly selected from groups not representative of the entire population. If they had been randomly selected, chances are that with the large sample size, the study would have predicted the election results accurately, but the sample size alone was not enough to compensate for the error.
4.2 The z Test
The distribution of sample means is a distribution based not on individual scores but on the means of samples of the same size repeatedly drawn from a population. The central limit theorem assures that such a distribution will be normal. Consequently, if the z score formula from Chapter 3 is adjusted to accommodate groups rather than individual scores, Table B.1 answers all the same questions about groups that it did about individuals in Chapter 3.
Recall that the z score formula (3.1) had the following form:
$$z = \frac{x - M}{s}$$
If the following substitutions are made:
· M for x, so that the focus is on a sample mean rather than on an individual score;
· µM for M, to shift from the sample mean to the mean of the distribution of sample means;
· σM for s, so that the measure of variability is for the distribution of sample means rather than the sample;
then the result is the z test:
Formula 4.1
$$z = \frac{M - \mu_M}{\sigma_M}$$
The z test produces a z value for groups rather than individual scores. Just as it did for individual scores, z indicates how distant a particular sample mean is from the mean of the distribution of sample means.
Note the similarities in Formulas 3.1 and 4.1:
· Both formulas produce values of z.
· Both numerators call for subtractions that result in difference scores.
· Both denominators measure data variability.
Calculating the z Test
When calculating z scores, as shown in Chapter 3, everything that is needed (x, M, and s) can be determined from the sample:
$$z = \frac{x - M}{s}$$
Values needed for the z test, however, are often not as easy to determine. Because µM = µ, one of those two parameters must be provided. The standard error of the mean (σM) can also present a problem. No one wishing to complete a z test is going to have the mean scores for that infinite number of samples that make up the distribution of sample means. So calculating the standard deviation of those means, which is what the standard error of the mean represents, is not an option. It is possible, however, to determine the value of the standard error of the mean without having to calculate a standard deviation for an indeterminate number of scores. If a researcher does not have σM (the standard error of the mean) but does know the population standard deviation (σ), σM can be found as follows:
Formula 4.2
$$\sigma_M = \frac{\sigma}{\sqrt{N}}$$
where
σM = the standard error of the mean
σ = the population standard deviation
N = the number in the group
So for a group of 100 with σ = 15, σM is
$$\sigma_M = \frac{\sigma}{\sqrt{N}} = \frac{15}{\sqrt{100}} = \frac{15}{10} = 1.5$$
This approach affords only a partial solution to the standard error of the mean, however, because it still requires at least σ.
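Formula 4.2 is simple enough to express as a one-line function. The sketch below is illustrative only; the function name `standard_error` and the extra group sizes are assumptions added to show how σM shrinks as the group grows.

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Formula 4.2: the population standard deviation divided by sqrt(N)."""
    return sigma / math.sqrt(n)

# The text's example: sigma = 15 and a group of 100.
print(standard_error(15, 100))

# Hypothetical group sizes, showing that quadrupling N halves sigma_M.
for n in (25, 100, 400):
    print(n, standard_error(15, n))
```

Because N sits under a square root, each fourfold increase in group size cuts the standard error only in half.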
Chapter 5 will explain a way around the problem of determining σM, but in the meantime, consider the following example. A marriage and family counselor has access to some national data on the frequency of negative verbal comments exchanged between couples in troubled marriages. The counselor finds the following:
· Couples in troubled marriages tend to have 11 negative exchanges per week, with a standard deviation of 4.755.
· A study of 45 couples who have filed for divorce in the area where the counselor has her practice reveals that the mean number of negative comments per week is 12.865.
· Given the national data, the counselor wants to know the probability that a randomly selected group of couples from that population will have as many negative exchanges as the counselor’s clients, or more.
Although the counselor’s question is about groups rather than individuals, the problem is much like a z score problem. The counselor knows the following: µ = 11.0 and therefore (because µM = µ) µM = 11.0, σ = 4.755, N = 45, and M = 12.865.
Jeffrey Hamilton/DigitalVision/Thinkstock
Marriage and family counselors can use existing data to help them learn more about their clients.
The standard error of the mean is
$$\sigma_M = \frac{\sigma}{\sqrt{N}} = \frac{4.755}{\sqrt{45}} = 0.709$$
And z is
$$z = \frac{M - \mu_M}{\sigma_M} = \frac{12.865 - 11.0}{0.709} = 2.630$$
Comparing M to µM indicates that the counselor’s group has a higher number of negative verbal exchanges per week than the number nationally among couples with troubled marriages: 12.865 is a higher value than 11.0. In the following section, we will discuss what else can be determined from the analysis.
Interpreting the Value from the z Test
The result from the marriage and family counselor’s group in the previous section is a value of z just like those that were calculated in Chapter 3, except that the value indicates how much a sample mean (M) differs from the mean of a population of samples (µM), instead of how an individual (x) differs from either a sample mean (M) or a population mean (µ). The Table B.1 value indicates that 0.4957 out of 0.5 occurs between a value for z = 2.63 and the mean of the distribution. So among groups of this size drawn from the population of couples with troubled marriages, 49.57% will have a mean number of negative verbal exchanges somewhere between the level of this group (12.865 per week) and the mean of the national population, 11.0 per week. But the question concerns the probability that a group of clients selected at random would have 12.865 negative comments per week, or more. Because half of the distribution lies above the mean and 49.57% falls between the mean and 12.865, just 0.43% (50% − 49.57%) will have 12.865 negative comments per week or more. Stated as a probability, p = 0.0043 that a group of individuals in troubled marriages will have 12.865 negative exchanges per week or more.
Figure 4.3 depicts this result.
Figure 4.3: The probability of selecting a sample with M = 12.865 or higher from a population with µM = 11.0
The probability of selecting a sample with M = 12.865 or higher is indicated by determining the z equivalent of a sample with M = 12.865 and then determining the proportion of the distribution at that point and higher in the population. The proportion is indicated in red.
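The counselor's calculation can be checked without a printed table. The following Python sketch is an illustration rather than the book's method: it substitutes the standard library's `statistics.NormalDist` for Table B.1, and the function name `z_test` is an assumption. Because it does not round σM to 0.709 first, the third decimal of z may differ slightly from the hand calculation.

```python
import math
from statistics import NormalDist

def z_test(sample_mean: float, mu: float, sigma: float, n: int) -> float:
    """Formula 4.1, with the standard error taken from Formula 4.2."""
    sigma_m = sigma / math.sqrt(n)
    return (sample_mean - mu) / sigma_m

# Values from the counselor example.
z = z_test(sample_mean=12.865, mu=11.0, sigma=4.755, n=45)

# Proportion of the standard normal distribution at or above z
# (the shaded upper tail in Figure 4.3).
p_upper = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, upper-tail p = {p_upper:.4f}")
```

The result agrees with the table-based answer of z ≈ 2.63 and p ≈ 0.0043.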
Note some important differences between this z and those calculated in Chapter 3.
· The difference between the mean of the population (µM = 11.0) and the sample mean (M = 12.865) is really quite modest, but the z value (z = 2.630) is comparatively extreme. Recall that z = ±2 includes about 95% of the distribution, and z = 2.630 is substantially beyond that.
· The reason for the rather large value of z is the quite small standard error of the mean, 0.709. That value reminds us that variability in populations based on groups is small compared to variability based on individual scores, and it does not take much of a difference between the sample mean (M) and the mean of the distribution of sample means (µM) to produce an extreme value of z.
Another z Test
To evaluate the impact of group therapy on a group of juvenile offenders, a psychologist constructs a study in which the level of social alienation among juvenile offenders attending court-mandated counseling sessions is compared to the level of social alienation measured in a national sample. For a group of 15 juveniles who are the psychologist’s clients, social alienation (SA) after 6 months of counseling has M = 13.554. Nationally, SA is µ = 14.500 with σ = 2.734. What percentage of all randomly selected groups of juvenile offenders will have mean levels of SA 13.554 or lower? The psychologist knows that µ, and therefore µM, = 14.500, σ = 2.734, N = 15, and M = 13.554. First, calculate the standard error of the mean:
$$\sigma_M = \frac{\sigma}{\sqrt{N}} = \frac{2.734}{\sqrt{15}} = 0.706$$
Then determine the value of z:
$$z = \frac{M - \mu_M}{\sigma_M} = \frac{13.554 - 14.500}{0.706} = -1.340$$
The table value for z = −1.340 is 0.4099 (because the distribution is symmetrical, it is the same value listed for z = 1.34).
The researcher wants to know what percentage of all juvenile offender groups will have mean SA scores 13.554 or lower. Since the national mean SA for juvenile offenders is 14.500, we know that 50% of all groups will be 14.500 or lower. The proportion with 13.554 or lower is 0.5 minus the table value for z = −1.340: 0.5 − 0.4099 = 0.0901. Multiplying that proportion by 100 will indicate the percentage at or below that point: 0.0901 × 100 = 9.01%.
It looks as though the psychologist’s therapy sessions are reasonably effective. This group manifests a level of social alienation considerably lower than what is represented in the national population of juvenile offenders. Is the result just a chance outcome? How could the researcher know?
4.3 Statistical Significance
Like the z score problems in Chapter 3, the z test is a ratio of the difference (M − µM in the numerator) compared to data variability (σM in the denominator). A large ratio indicates that the score (in the z score problem) or the sample mean (in the case of the z test) is quite distant from the mean to which it is compared.
With increasing values of z, is there a point at which the sample mean (M) becomes so different from the mean of the distribution of sample means (µM) that a researcher should conclude that the sample mean is more characteristic of some distribution other than the one to which it is compared? In the z test, when the sample is more characteristic of some other population rather than the one to which it is compared, it is statistically significant. To put it another way, a statistically significant result is one that is extreme enough (sufficiently distant from that to which it is compared) that it is unlikely to have occurred by chance.
In the first z test problem, we proceeded as though the sample of those who had filed for divorce was a subgroup of all couples with troubled marriages. What if the sample is actually more characteristic of some other distribution, say, a population of couples for whom divorce is imminent? Can large values of z reflect the fact that the sample actually represents a population different from the one to which it was compared?
Consider another example before we answer this question. Those in a college honors program are probably adults. If researchers are interested in studying intelligence, would it be reasonable to expect that the members of this group represent what is characteristic of all adults? From the standpoint of age (and in the absence of child prodigies), those honors students are probably all adults, but in terms of intelligence, they probably are not typical. Perhaps they are more representative of the population of intellectually gifted adults than of adults in general.
The individuals in every sample belong to many different populations. The couples on the verge of divorce belong to
· the population of married people;
· the population of adults;
· the population of adults in the particular state;
· the population of adults in the particular county;
· the population of couples with troubled marriages, and so on.
One of the questions the z test helps answer is whether a particular sample is most characteristic of the population to which it is compared, or whether the sample is more like some other population. The magnitude of the z value is the key to the answer.
Statistical Significance and Probability
In the case of the z test, an outcome is statistically significant w
hen it meets these conditions:
·
It is so unlike the population to which it is compared that statist
ically, at least, itrepresents some other population.
·
The random selection of a sample from the population with the
particular value of μMwould almost always result in a less extre
me difference between M and µM.
So, at what point is an outcome nonrandom? Fisher (1925), who
created the term statisticallysignificant, made the answer a matt
er of probability. If the probability that an outcome (in ourcase,
the value of z) occurred by chance is p = 0.05 or less, the outco
me is probably notrandom; it is statistically significant.Try It!:
#2
What does the term statisticallysignificant mean?
Although p = 0.05 is probably the most common, other probability levels have also been used to indicate statistical significance. Journal articles report statistical testing done at p = 0.01, p = 0.001, and occasionally even p = 0.1. It is up to the person doing the analysis to state the level chosen to indicate statistical significance (before conducting the test, by the way).
Because the z test and the z score table give the probability of an occurrence (in addition to the percentage of the population above a point, below a point, and between points), we can also use the table to determine whether an outcome is statistically significant. In the first z test we completed, we compared the mean number of negative verbal exchanges in a sample of couples on the verge of divorce to the mean level of negative exchanges among those identified as the population of couples with “troubled” marriages and found that z = 2.630. The table value indicates that the probability of randomly selecting a sample of couples that would have M = 12.865 or more negative verbal exchanges per week was p = 0.0043. At less than p = 0.05, that outcome is unlikely to have occurred by chance. It is statistically significant.
The second z test dealt with how the group of juveniles in therapy compared to a national population of juvenile offenders. For that problem, z = −1.340, and the table value for that z was 0.4099. Subtracting the table value from the 0.5 in the lower half of the distribution (0.5 − 0.4099), we determined that the probability of a group scoring M = 13.554 or lower was p = 0.0901 (shown in Figure 4.4).
Figure 4.4: The probability of selecting a sample with social alienation scores of M = 13.554 or lower from a population with mean SA of µM = 14.5
Visual representation of the probability of selecting a random sample of individuals with social alienation scores of 13.554 or lower, when the population mean is 14.5.
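The two tail probabilities reported for these z tests can be reproduced without the printed table. The sketch below assumes Python 3.8 or later, whose standard library includes statistics.NormalDist; the function names table_value and tail_probability are illustrative and do not come from the chapter.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def table_value(z):
    """Proportion between the mean and z (the kind of value Table B.1 lists)."""
    return STD_NORMAL.cdf(abs(z)) - 0.5


def tail_probability(z):
    """Proportion at or beyond z, in the tail on z's side of the mean."""
    return 0.5 - table_value(z)


# z values from the two z tests discussed in the text
for z in (2.63, -1.34):
    print(f"z = {z:+.2f}  table value = {table_value(z):.4f}  "
          f"tail p = {tail_probability(z):.4f}")
```

Rounded to four decimals, the tail probabilities should agree with the p = 0.0043 and p = 0.0901 given above.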
Determining Significance Without the Table
Remember that z = ±1.0 includes about 68% of the z distribution, so the probability of randomly selecting an outcome that occurs in the z = ±1.0 area is p = 0.68. Nothing in that region is going to be statistically significant because those z values indicate results that are characteristic of the distribution as a whole. The uncharacteristic events are the significant ones, and Fisher’s standard of p = 0.05 indicates that the key is a z value that excludes only the most extreme 5% of the distribution.
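The 68% figure can be checked numerically. This is a minimal sketch, again assuming Python's standard statistics.NormalDist; central_proportion is a hypothetical helper name introduced only for illustration.

```python
from statistics import NormalDist


def central_proportion(z):
    """Proportion of a standard normal distribution between -|z| and +|z|."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(abs(z)) - std_normal.cdf(-abs(z))


inside = central_proportion(1.0)   # area within one standard deviation of the mean
outside = 1 - inside               # area in the two tails combined
print(f"within z = ±1.0: {inside:.2f}, outside: {outside:.2f}")
```

Roughly a third of outcomes fall outside z = ±1.0, far more than the 5% that Fisher's standard treats as extreme.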
Recall that normal distributions are symmetrical. That 5% exclusion means that the most extreme 2.5% of outcomes in the lower tail and the most extreme 2.5% of outcomes in the upper tail are the regions that include statistically significant outcomes. Because Table B.1 provides proportions for only the upper half of the distribution, the z value that includes all but the most extreme 2.5% of outcomes in that half will be the point at which results become statistically significant. Each half holds 50% of the distribution, so if 2.5% needs to be the percentage excluded, 47.5% is the percentage included. As a proportion, 47.5% is expressed as 0.475.
· From Table B.1, find the z value that includes 0.475 from that point to the mean of the distribution.
· Because z = 1.96 includes 0.475 of the distribution, ± that value will include 0.95 of the distribution (2 × 0.475 = 0.95).
· Any time a z test produces a z value of 1.96 or greater in either direction (z ≤ −1.96 or z ≥ 1.96), the result is statistically significant at p = 0.05.
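The steps above can be sketched in code: find the z value that leaves 2.5% in the upper tail, then compare a z test result against it. This assumes Python's standard statistics.NormalDist and a cutoff that counts both tails; ALPHA, z_critical, and is_significant are illustrative names, not terms from the chapter.

```python
from statistics import NormalDist

ALPHA = 0.05  # Fisher's conventional significance level

# Two tails of 2.5% each, so find the z value with 97.5% of the distribution below it.
z_critical = NormalDist().inv_cdf(1 - ALPHA / 2)

# Proportion between the mean and the critical value (the 0.475 looked up in the table)
included_one_side = NormalDist().cdf(z_critical) - 0.5


def is_significant(z, critical=z_critical):
    """True when z is at least as extreme as the critical value in either direction."""
    return abs(z) >= critical


print(f"critical z = {z_critical:.2f}, proportion included per side = {included_one_side:.3f}")
print(is_significant(2.63), is_significant(-1.34))
```

Under this both-tails cutoff, the first z test above (z = 2.63) qualifies as significant and the second (z = −1.34) does not.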
For example, if the registrar at a university had a group of students applying for admission to a graduate program, and the admissions test scores for that group resulted in a value of, say, z = 1.98 compared to all applicants, the registrar would know straight away that those students have scores significantly greater than those to whom they were compared. Such a difference is not likely to be an artifact of sampling variability.
Apply It! Confidence in the Claim
A parent is looking at private high schools for his child. A particular high school claims that last year, its students scored above average on the math and verbal sections of the SAT. The parent, who knows something about statistical analysis, decides to test this claim.
Jack Hollingsworth/Thinkstock
