Beating the Odds: Comparisons of Elementary Schools Based on
Actual vs. Predicted Standardized Achievement Scores
Gregory J. Marchant, PhD
Sharon E. Paulson, PhD
Ball State University
This study compared elementary schools, based on percentage of students passing a state
achievement test, with those performing above and below expectations when controlling for
race, income, and cognitive ability. Results indicated that, when simply looking at percent
passing, the top schools were predominantly White, affluent, with high cognitive ability
students. However, after controlling for the demographic characteristics of students, the
top schools were more diverse in race, income, and ability. This raised questions about
evaluations of the “best” schools and suggested the possibility of bias based on race and
socioeconomic status.

Key Words: accountability, achievement tests, demographics
Standardized achievement testing historically has been considered very useful for
monitoring school progress and educational accountability (Elliott, 1994; Sanders & Horn,
1995); this approach has become increasingly pervasive in the United States, where successive
federal administrations have established educational policies that base school performance on
students “passing” state-level standardized achievement tests. The No Child Left Behind Act of
2001 (NCLB) assumes that standardized achievement test scores are the most effective way to
measure students’ academic progress and schools’ effectiveness, based on the criteria of
efficiency, uniformity, validity, and reliability (NCLB, 2002). Unfortunately, these scores are
used to compare or to rank schools based on the means of students’ performance at each school.
No consideration is given to the multitude of other factors that might affect students’ test scores,
particularly those outside the control of schools.
Educational and psychological research has shown that several demographic factors
inherent in the students themselves (e.g., socio-economic status [such as poverty], parent
education, race, and cognitive skills) exert an enormous influence on school success (Coleman et
al., 1966; Heyneman, 2005; Lee & Wong, 2004), but these factors are seldom sufficiently
stressed by current educational evaluations and policies. Demographic variables are therefore not
irrelevant to school accountability (Thrupp, 2001); although these variables are outside the
control of schools, that does not mean they should be set aside when explaining academic
achievement. In fact, ignoring these factors does a disservice to the influence of families and
communities on their children. This is not simply the “soft prejudice of low expectations” as the
Bush administration suggested (Marchant & Paulson, 2001, 2005). The problem is not just that
the demographic characteristics lead to low expectations; the problem is that the background,
home, and community factors actually have a negative impact on school learning.
Student demographic characteristics have been shown to account for the majority of the
variance in achievement comparisons of students at the school, district, and state levels (Paulson
& Marchant, 2009). The impact of these influences was demonstrated when private school
advantages in mathematics scores disappeared and even reversed, in most cases, when
demographic differences were controlled (Lubienski & Lubienski, 2006). Consequently, research
attempts to determine school effectiveness must take into account the inherent characteristics of
the students, given that demographics and background variables exert a significant role in
predicting academic achievement. For this reason, it becomes necessary to use alternative models
that produce more valid estimates of the separate effects of school factors and demographics on
student achievement.
Given the unbridled use of standardized test scores to pass judgment on schools and their
teachers, achievement testing has become an important focus in educational research. The past
10 to 15 years have witnessed a rapid growth in research that uses complex models to determine
the relative effects of numerous factors on students’ academic achievement. Using a broad range
of theoretical and methodological approaches, studies have produced models that discriminate
school factors from other variables in determining how students accomplish academic goals.
Value-added evaluation models attempt to control for student background variables, usually by
considering previous achievement, when determining school or teacher specific contributions to
students’ standardized achievement outcomes (Doran & Fleischman, 2005; McCaffrey,
Lockwood, Koretz, Louis & Hamilton, 2004; Raudenbush, 2004). The difference between
achievement predicted by past performance and current achievement is the value-added by the
school or teacher. Currently, however, most states do not use value-added or growth models to
evaluate their schools because such models are relatively complex and difficult to implement. Instead,
most states continue to base the evaluation of their schools on the simple percentage of students
“passing” the statewide achievement test.
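In symbols, the value-added logic described above can be sketched as follows (the notation is ours, not any state's official formula): for a school or teacher s,

\[
VA_{s} \;=\; A_{s}^{\text{observed}} \;-\; \hat{A}_{s}\!\left(A_{s}^{\text{prior}}\right),
\]

where the second term is the achievement predicted from past performance alone. A positive value of VA indicates students performing above what their prior scores would predict; a negative value indicates the reverse.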
The purpose of this study was to show that, even when schools are compared on simple mean
levels of performance (i.e., pass rates), adjusting scores based on the demographic characteristics of a school's
students can have a profound impact on school evaluations. In this study, a “modified” value-
added model with available school level data was developed and compared to the characteristics
of schools that were identified as high achieving based on the percentage of students passing a
statewide standardized achievement (ISTEP+) test. Comparison was made with the schools that
performed above expectations based on the demographic characteristics of their students
(hereafter referred to as demographics adjusted performance [DAP]). The characteristics of the
highest and lowest achieving schools based on percentage of passing and the highest and lowest
DAP schools were compared. It was expected that schools performing above expectations
(highest DAP schools) would not be the same as the highest performing schools based simply on
the percentage of students passing the test. If true, school evaluations, whether based on simple
pass rates or complex value-added models, must take into account the factors that cannot be
controlled by schools before making judgments on the instructional quality of the schools.
Method

School level data were retrieved from the Indiana Department of Education Website for
all of the elementary schools in Indiana for the school year 2002-2003. This year was chosen
because it was the last year the state of Indiana required the reporting of a Cognitive Skills Index
(CSI) from the items on the state’s achievement test (ISTEP+). The CSI compares the student’s
cognitive ability with that of students who are the same age, without regard to grade placement.
The CSI is a normalized standard score thought to be roughly equivalent to a measure of mental
aptitude such as an IQ score (with a mean of 100 and a standard deviation of 15). In addition to the
CSI, the data for each school included the percentage of students passing both the English and
mathematics sections of the third-grade ISTEP+, the percentage of students participating in the
free lunch program, and the percentage of Black students. There were 790
elementary schools with complete data.
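For readers who wish to reproduce the approach with comparable school-level files, the data can be pictured as one row per school containing the variables just described. The sketch below uses hypothetical file and column names (not the Indiana Department of Education's own labels):

import pandas as pd

# Hypothetical column names standing in for the variables described above.
columns = [
    "school_id",       # unique school identifier
    "pct_pass_both",   # percent passing both English and math sections of the ISTEP+
    "pct_free_lunch",  # percent of students in the free lunch program
    "pct_black",       # percent of Black students
    "mean_csi",        # average Cognitive Skills Index (mean 100, SD 15)
]

# Load the school-level file and keep only schools with complete records
# (790 elementary schools in the year studied).
schools = pd.read_csv("indiana_elementary_2002_2003.csv", usecols=columns).dropna()
print(schools.describe())  # quick check of the distribution of each variable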
Calculating Demographics Adjusted Performance
Using regression models, students’ family income, race, and intellectual ability were used
to predict the percentage of students who would be expected to pass the Indiana Statewide
Testing for Educational Progress Plus (ISTEP+) at each school in Indiana. Although there is no
perfect way to account for all inherent student differences, controlling for just these three
student factors allowed for the calculation of a school's DAP; that is, the difference between the
actual percentage of students passing the test and the expected percentage. First, a regression
equation was generated from the three demographic factors to predict schools’ actual
performance on the ISTEP+. The beta weights for each predictor variable represent their
respective contributions to the schools’ pass rates on the test. Then, using the beta weights from
this regression equation, schools’ expected pass rates (expected based on their individual
demographic data) were calculated. Based on a school's student income (percent of students in
the free lunch program), percentage of Black students, average CSI, and the three-way
interaction, the percentage of students who would be expected to pass both parts of the ISTEP+
test was predicted. To compare schools that would be considered high achieving based simply on
the percentage of students passing the ISTEP+ and schools with high DAP (actual pass rates
exceed pass rates expected based on demographics of the students), the top 20 and bottom 20
schools for each type of performance were identified.
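A minimal sketch of this procedure, assuming the hypothetical school-level data frame introduced earlier and using ordinary least squares, is given below. (The analysis reports beta weights; for the purpose of generating expected pass rates, an unstandardized fit yields the same predictions, so the sketch works with raw coefficients.)

import numpy as np
import pandas as pd

# School-level data frame (hypothetical file and column names; see earlier sketch).
schools = pd.read_csv("indiana_elementary_2002_2003.csv").dropna()

# Design matrix: intercept, the three demographic predictors, and their
# three-way interaction term.
X = schools[["pct_free_lunch", "pct_black", "mean_csi"]].copy()
X["interaction"] = X["pct_free_lunch"] * X["pct_black"] * X["mean_csi"]
X.insert(0, "intercept", 1.0)
y = schools["pct_pass_both"].to_numpy()

# Fit the regression predicting actual pass rates from student demographics.
coefs, *_ = np.linalg.lstsq(X.to_numpy(), y, rcond=None)

# Expected pass rate for each school, given its demographic makeup.
schools["expected_pass"] = X.to_numpy() @ coefs

# Demographics adjusted performance (DAP): actual minus expected pass rate.
schools["dap"] = schools["pct_pass_both"] - schools["expected_pass"]

# Top and bottom 20 schools on raw pass rates and on DAP.
top20_pass    = schools.nlargest(20, "pct_pass_both")
bottom20_pass = schools.nsmallest(20, "pct_pass_both")
top20_dap     = schools.nlargest(20, "dap")
bottom20_dap  = schools.nsmallest(20, "dap")

A school with a positive DAP passes more students than its demographics alone would predict; a negative DAP indicates the reverse.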
Demographics Adjusted Performance Results
Bivariate correlations revealed that family income, race, and cognitive ability each
predicted ISTEP+ pass rates (p < .001): income, r = -.32; race, r = -.24; CSI, r = .48. A
multiple regression was used to determine the total variability in school achievement predicted
by all three factors and by all three factors plus their three-way interaction term. The effect of
each predictor variable is shown in the following model:
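In symbolic form (the labels b0 through b4 are our notation for the estimated regression weights, whose fitted values follow from the regression just described):

\[
\widehat{\text{Pass}}_{s} \;=\; b_{0} \;+\; b_{1}\,\text{Lunch}_{s} \;+\; b_{2}\,\text{Black}_{s} \;+\; b_{3}\,\text{CSI}_{s} \;+\; b_{4}\,\big(\text{Lunch}_{s} \times \text{Black}_{s} \times \text{CSI}_{s}\big),
\]

where, for school s, Lunch is the percent of students in the free lunch program, Black is the percent of Black students, and CSI is the school's average Cognitive Skills Index.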
As noted, schools’ expected pass rates were calculated using the beta weights from the
equation, and for each school, the difference between the actual percentage of students passing
the ISTEP+ and the predicted percentage was calculated to generate a DAP score. Results
showed that income, race, and cognitive ability significantly predicted ISTEP+ pass rates (R² =
.52), and the interaction term significantly added to the equation (R² = .54; R² change = .02). In
addition, semi-partial correlations revealed that each of the student factors made a unique
contribution to the equation (p < .001).
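Building on the objects defined in the regression sketch above (the design matrix X and the outcome y), the incremental contribution of the interaction term can be illustrated by comparing the variance explained with and without it:

import numpy as np

def r_squared(design: np.ndarray, outcome: np.ndarray) -> float:
    """Proportion of variance in the outcome explained by an OLS fit."""
    fit, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    residuals = outcome - design @ fit
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((outcome - outcome.mean()) ** 2))
    return 1.0 - ss_res / ss_tot

# Main-effects-only model versus the full model with the three-way interaction.
X_main = X[["intercept", "pct_free_lunch", "pct_black", "mean_csi"]].to_numpy()
X_full = X.to_numpy()

r2_main   = r_squared(X_main, y)   # reported in the study as .52
r2_full   = r_squared(X_full, y)   # reported as .54
r2_change = r2_full - r2_main      # reported as .02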
Descriptive School Comparisons
The average percentage of students passing the ISTEP+ for the top 20 schools was 91
percent (see Table 1). Five of these schools did not have any students in the free lunch program,
and the average for the 20 schools was 5 percent. Only two schools had more than 10 percent of
students in the free lunch program. Three of the schools did not have any Black students, and the
average for the 20 schools was 2 percent. Only one school had over 5 percent Black students. All
of the schools had above average Cognitive Skills Index scores, with an average of 115 (one
standard deviation above the mean). The list contained eight private Catholic schools. On
average, these schools had 12 percent more students pass the ISTEP+ than predicted.
Table 1

Comparison of top and bottom 20 Indiana elementary schools based on percent passing the third-grade ISTEP+ and based on demographics adjusted performance (DAP)

                                      Percent passing           DAP
                                      Top 20   Bottom 20   Top 20   Bottom 20
Passing state test (%)                  91        17          77        26
Free lunch program eligible (%)          5        62          49        35
Black students (%)                       2        48          22        21
Private schools (%)                     40        10           5        30
Cognitive Skills Index (IQ)            115        95         100       102
The average percentage of students passing the ISTEP+ for the bottom 20 schools was 17
percent, with two schools reporting no students passing both achievement test sections. All but
three schools had 50 percent or more of the students in the free lunch program, with the average
across the 20 schools being 62 percent. Half of the schools had over 50 percent Black students,
with the average for the 20 schools being 48 percent. Only three schools scored at or above
average on the CSI, with an average score of 95. The list contained two private Catholic schools.
On average, these schools had 23 percent fewer students pass the ISTEP+ than predicted.
In contrast to the list of schools based simply on the percentage passing the ISTEP+, the
top 20 and the bottom 20 schools based on comparisons of predicted and obtained achievement
percentages (DAP scores) were relatively similar demographically. In fact, a multiple regression
predicting DAP scores using income, race, and CSI was not significant for these schools.
The 20 highest DAP schools had 48 percent of their students in the free lunch program
compared to 35 percent from the 20 lowest DAP schools. Nineteen percent of the top 20 DAP
schools’ students were Black, compared to 21 percent in the bottom 20 schools. For the CSI, the
top 20 DAP schools averaged one point below the mean of 100, and the bottom 20 DAP schools
scored an average of two points above the mean. Only one of the top 20 DAP schools was a
private Catholic school, but six of the bottom 20 DAP schools were private Catholic schools.
The top 20 DAP schools had an average of 28 percent more students pass both sections of the
ISTEP+ than would be expected based on their demographic makeup. The bottom 20 DAP
schools had an average of 29 percent fewer students pass both sections of the test than expected.
Discussion

State and national evaluations of the quality of schools are based primarily on pass rates
on the state achievement test, such as the ISTEP+. The percentage of students passing the tests
contributes to a school’s designation as an Indiana four-star school and to No Child Left
Behind’s Adequate Yearly Progress. Although some studies have reported that demographic data
accounted for about 20 to 30 percent of the variation among elementary schools (Thompson,
2004), this study showed that over half of the variance in schools' percentage of students
passing could be attributed to factors over which schools have no control. This finding is
consistent with a study using NAEP data that found 53 percent of the variance in school-level
achievement to be due to race and income (Marchant, Ordonez-Morales, & Paulson, 2010). Another study, comparing third-grade
students’ scores on a performance-based writing test and a multiple-choice test of language
skills, found that demographics contributed over 75 percent of the school-level variance on the
multiple-choice test, but only 40 percent of the school-level variance on the performance-based
test (Heck & Crislip, 2001).
When student demographic factors are controlled, a different picture of quality emerges.
In our study, only two of the highest achieving schools based simply on percent passing the
achievement test also appeared among the top 20 DAP schools. However, eight of the schools
with the lowest pass percentages were in the bottom 20 DAP schools. These results suggested
that if student demographics are ignored when comparing schools, truly successful schools are
more likely to be misidentified than those schools that are struggling.
Although stereotypes often contain a shred of truth, they usually do a better job of
masking the truth. The notion that the best schools are full of highly intelligent affluent White
children was supported by the data reporting only the percentage of students passing the state
test. Almost half of these schools were private schools. However, one might argue that, although
these schools score well, they may not be doing the most to meet their students’ potential. The
schools performing above expectations were more diverse. Some had a majority of their students
participating in the free lunch program or had a majority of students who were Black. Only one
out of these top twenty schools was a private school. As demonstrated in previous research, these
private and Catholic institutions do not necessarily produce better results when the backgrounds
of their students are considered (Lubienski & Lubienski, 2006).
This study suggested that any effort to identify educational quality by simply looking at
test pass rates would be biased based on race and SES. Efforts to identify “good” schools must
take into account the nature of the students attending. When these factors are considered, the
“good” schools are less stereotypical and more diverse.
References

Coleman, J. S., Campbell, E. Q., Hobson, C. J., McPartland, J., Mood, A. M., Weinfeld, F. D.,
et al. (1966). Equality of educational opportunity. Washington, DC: U.S. Department of
Health, Education and Welfare, Office of Education.
Doran, H., & Fleischman, S. (2005). Research matters/challenges of value-added assessment.
Educational Leadership, 63(3), 85-87.
Elliott, S. E. (1994). Creating meaningful performance assessments: Fundamental concepts.
Performance assessment: CEC Mini-Library. Reston, VA: Council for Exceptional
Children. (ERIC Document Reproduction Center No. ED375566)
Heck, R. H., & Crislip, M. (2001). Direct and indirect writing assessments: Examining
issues of equity and utility. Educational Evaluation and Policy Analysis, 23, 19-36.
Heyneman, S. P. (2005, November). Student background and student achievement: What is the
right question? American Journal of Education, 112, 1-9.
Lee, J., & Wong, K. K. (2004). The impact of accountability on racial and socioeconomic equity:
Considering both school resources and achievement outcomes. American Educational
Research Journal, 41, 797–832.
Lubienski, S. T., & Lubienski, C. (2006). School sector and academic achievement: A multilevel
analysis of NAEP mathematics data. American Educational Research Journal, 43, 651–
Marchant, G. J., Ordonez-Morales, O., & Paulson, S. E. (2010, May). The contribution of
student demographics to achievement scores at varying levels of aggregation. Paper
presented at the annual meeting of the American Educational Research Association,
Marchant, G. J., & Paulson, S. E. (2005, January 21). The relationship of high school graduation
exams to graduation rates and SAT scores. Education Policy Analysis Archives, 13(6).
Retrieved from http://epaa.asu.edu/epaa/v13n6/
Marchant, G. J., & Paulson, S. E. (2001). State comparisons of SAT Scores: Who’s your test
taker? NASSP Bulletin, 85(627), 62-74.
McCaffrey, D. F., Lockwood, J. R., Koretz, D., Louis, T. A., & Hamilton, L. (2004). Models for
value-added modeling of teacher effects. Journal of Educational and Behavioral
Statistics, 29, 67-101.
No Child Left Behind Act of 2001, 20 U.S.C. § 6311 (2002).
Paulson, S. E., & Marchant, G.J. (2009). Background variables, levels of aggregation, and
standardized test scores. Education Policy Analysis Archives, 17. Retrieved from
Raudenbush, S. W. (2004). What are value-added models estimating and what does this imply
for statistical practice? Journal of Educational and Behavioral Statistics, 29, 121-129.
Sanders, W., & Horn, S. (1995). Educational assessment reassessed: The usefulness of
standardized and alternative measures of student achievement as indicators for the
assessment of educational outcomes. Education Policy Analysis Archives, 3(6). Retrieved
Thompson, B. R. (2004). Equitable measurement of school effectiveness. Urban Education, 39,
Thrupp, M. (2001). Sociological and political concerns about school effectiveness research:
Time for a new research agenda. School Effectiveness and School Improvement, 12 (1), 7-
Gregory J. Marchant is Professor of Educational Psychology in Teachers College, Ball State
University, Muncie, IN.

Oscar Ordonez-Morales is a doctoral student in educational psychology in Teachers College,
Ball State University, Muncie, IN.

Sharon E. Paulson is Professor of Educational Psychology in Teachers College, Ball State
University, Muncie, IN.