Assessment of Teachers' Beliefs About Classroom Use of Critical-Thinking Activities
10.1177/0013164404267281
EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT
TORFF AND WARBURTON
ASSESSMENT OF TEACHERS' BELIEFS ABOUT
CLASSROOM USE OF CRITICAL-THINKING ACTIVITIES
BRUCE TORFF
Hofstra University
EDWARD C. WARBURTON
University of California at Santa Cruz
This article reports five studies in which a scale for assessing teachers' beliefs about
classroom use of critical-thinking (CT) activities was developed and its scores evaluated
for reliability and validity. The Critical Thinking Belief Appraisal (CTBA) is based on a
four-factor "advantage effect" model: the theoretical premise that teachers' CT-related
decision making is associated with their beliefs about the effectiveness of (a) high-CT
activities for high-advantage learners, (b) high-CT activities for low-advantage learners,
(c) low-CT activities for high-advantage learners, and (d) low-CT activities for
low-advantage learners. Results indicated that the scale produced scores with high reliability;
a stable factor structure; and satisfactory discriminant, construct, and predictive validity.
The studies supported the theoretical and practical utility of the construct and measure of
teachers' beliefs about classroom use of CT activities.
Keywords: critical thinking; classroom instruction; teachers' beliefs; disadvantaged
learners; advantage effects
Success in adult life depends on, among other things, the capacity for critical
thinking (CT): purposeful and goal-directed cognitive skills or strategies
that increase the likelihood of a desired outcome (Halpern, 2002). Success
in school also depends on CT skills, perhaps now more than ever, as high-stakes
tests increasingly present CT-rich tasks such as writing essays and explaining
mathematics responses (Yeh, 2002). Accordingly, extensive bodies
of literature focus on CT (e.g., Browne & Keeley, 2001; Ennis, 1987;
Halpern, 2002; Resnick, 1987) and applications of CT in education (e.g.,
Henderson, 2001; O'Tuel & Bullard, 1993; Pogrow, 1990, 1994; Raths,
Wasserman, Jonas, & Rothstein, 1986; Torff, 2003).
Educational and Psychological Measurement, Vol. 65 No. 1, February 2005 155-179
DOI: 10.1177/0013164404267281
© 2005 Sage Publications
This article describes the development of a measurement instrument
designed to explore related phenomena. The goal is to identify differences
among teachers in beliefs about the conditions under which they deem it
effective to engage students in CT-rich and/or CT-lean activities. In what follows,
a theoretical model of CT-use beliefs is presented, accompanied by a
scale based on this model, the Critical Thinking Belief Appraisal (CTBA).
The article reports a series of five studies in which the construct and scale
were developed and the reliability and validity of the scale's scores were
evaluated.
The Impact of Perceived Learner
Advantages on Teachers' CT-Use Beliefs
Increasing attention in educational psychology focuses on teachers' beliefs,
which have been shown to influence their decision making concerning when
to employ CT-rich activities (e.g., debate) or comparatively CT-lean ones
(e.g., direct instruction) (Anning, 1988; Brickhouse, 1990; Calderhead,
1996; Nespor, 1987; Richardson, 1996). In particular, teachers tend to regard
CT-rich activities as more appropriate for high-advantage learners (i.e., high
in academic ability, prior knowledge, and motivation) than low-advantage
learners (Metz, 1978; Oakes, 1990; Page, 1990; Raudenbush, Rowan, &
Cheong, 1993; Zohar, Degani, & Vaaknin, 2001). These findings are consistent
with the theory that teachers' CT-related beliefs are associated with their
perceptions of learners' level of advantages (i.e., an "advantage effect"):
When learners are perceived to be high in advantages, CT activities are
thought to be effective and classroom CT-use levels are high; conversely,
when learners are perceived to be disadvantaged, CT activities are viewed as
ineffective and CT-use levels are comparatively low.
To further investigate the advantage effect, a new scale to assess teachers'
CT-use beliefs was developed. To date, little research has been conducted to
validate either the construct "CT-use beliefs" or the scores of measurement
instruments designed to tap these beliefs. Advantage-effect research has
included examinations of the reliability of scores from the assessments
employed but little evaluation of the validity of the scores or the constructs on
which the assessments were based (Raudenbush et al., 1993; Zohar et al.,
2001). With the discriminant validity of the CT-use construct as yet unexamined,
it remains unclear to what extent the advantage effect is an artifact
of hidden factors (such as CT ability, CT disposition, or need for social
approval, factors we discuss below). Similarly, the predictive validity of the
construct "CT-use beliefs" remains unexplored, so it is unknown to what extent
the advantage effect is manifested in teachers' classroom practices as
well as their espoused beliefs. Given the mixed results of research on the
correspondence of beliefs to behavior in educational research (Fang, 1996;
Wilcox-Herzog, 2002), examination of the predictive validity of a measure of
CT-use beliefs seems warranted.
Development of the CTBA
Operationalizing the CT Level in Classroom Activities
Development of the CTBA began with the drafting of a series of 20
prompts: brief descriptions of classroom activities set in secondary-level
academic subjects (English, languages other than English, mathematics, science,
and social studies). Based on the theoretical premise that teachers' CT-related
decision making entails beliefs about both CT-rich and CT-lean activities,
equal numbers of high-CT and low-CT prompts were included. Below
are examples of each.
High CT: A social studies class is studying the Treaty of Versailles signed
at the end of World War I. The teacher assigns learners to write "letters from
the future" to President Wilson arguing why the United States should or
should not support the treaty.
Low CT: A social studies class is studying the Industrial Revolution. The
teacher provides learners with a list of inventions, explains the impact of
these inventions during this period, and describes how they continue to influence
the modern world.
Operationalizing Learner Advantages
Based on the theoretical proposition that CT-use beliefs are influenced by
teachers' judgments of learners' advantage level, development of the CTBA
entailed procedures to operationalize teachers' perceptions of "high-advantage"
and "low-advantage" learners. We devised a contextualized assessment
scheme drawing on teachers' conceptions of the particular characteristics
they take into consideration when judging a learner (or group of learners) to be
high advantage or low advantage. Three such "advantage characteristics"
were nominated: ability (learners' capacity for intellectual or academic
achievement when dealing with the specific topic to which a given prompt
refers), prior knowledge (the extent of learners' knowledge about the specific
topic to which a given prompt refers before learners participate in additional
activities), and motivation (how much interest and attention learners demonstrate
when dealing with the specific topic to which a given prompt refers)
(Archer & McCarthy, 1988; Dweck, 1986; Givvin, Stipek, Salmon, &
MacGyvers, 2001; Madon et al., 1998; Moje & Wade, 1997; Nolen &
Nichols, 1994; Pintrich & Schunk, 1996; Tollefson, 2000). We nominated
these characteristics not as factors underlying CT use but as indications of
teachers' judgments of learners as high advantage or low advantage. This
method appears to have been effective, given that factor-analytic and
internal-consistency results presented below strongly supported ability, prior
knowledge, and motivation collectively as indicators of teachers' perception
of learner advantages, but not as independent factors.
For item selection purposes, the three-characteristic design yielded an initial
120-item pool in which respondents used 6-point Likert-type scales to
rate the effectiveness of each of the 20 prompts for 6 different groups of
learners: high ability, low ability, high prior knowledge, low prior knowledge,
high motivation, and low motivation. The use of three advantage characteristics
reduces response bias caused by leading questions. Were the scale
to present respondents with a prompt and then ask them to rate its effectiveness
with "high-advantage" or "low-advantage" learners, attention would be
called to the importance of advantage level, and respondents would naturally
begin to ask themselves whether advantage level ought to make a difference.
Instead, the final version of the scale (created through item selection in Study
1 below) follows each prompt with either a high- or low-advantage item for
each characteristic. For example, Prompt 1 is followed by a low-ability item
(but not a high-ability one), a low-prior-knowledge item (but not a
high-prior-knowledge one), and a high-motivation item (but not a low-motivation
one).
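The size of the initial pool follows directly from crossing the 20 prompts with the three advantage characteristics at two levels each. A minimal Python sketch of that enumeration is given here; the identifiers (prompt labels, dictionary keys) are hypothetical and chosen only for illustration, since the source does not name them.

```python
from itertools import product

# Hypothetical labels: the source describes 20 prompts but does not name them.
PROMPTS = [f"prompt_{i:02d}" for i in range(1, 21)]
CHARACTERISTICS = ["ability", "prior_knowledge", "motivation"]
LEVELS = ["high", "low"]

# Each prompt is rated once for each of the 6 learner groups
# (3 advantage characteristics x 2 levels).
item_pool = [
    {"prompt": p, "characteristic": c, "level": lvl}
    for p, c, lvl in product(PROMPTS, CHARACTERISTICS, LEVELS)
]

print(len(item_pool))  # 120
```

The final scale keeps only one level per characteristic for each retained prompt, which is why 12 prompts yield 36 items rather than 72.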
Study 1
To select the most effective items for use in subsequent studies, we completed
three procedures. First, in preliminary pilot testing, the criterion of
ambiguity was applied to evaluate the extent to which the 20 prompts actually
reflect high-CT and low-CT activities. Second, the criterion of irrelevance
was applied to eliminate any item that failed to discriminate between groups
known to differ in beliefs about classroom use of CT activities. Third, the
criterion of internal consistency was applied to delete any item that met all other
criteria but had low internal consistency reliability relative to the items with
which it was expected to be associated.
Preliminary testing. In preliminary pilot testing to assess the extent to
which the prompts successfully reflect high-CT and low-CT activities, the
prompts were presented to 20 university professors in the School of Education
and Allied Human Services at Hofstra University. Participants were
asked to rate each prompt as high CT or low CT. Given the deliberately
polarized nature of the prompts, it is no surprise that 100% of the 400 judgments
made by the participants correctly classified the prompts as high CT or
low CT.
Method
Participants in Study 1 were practicing secondary teachers (N = 40) on
Long Island, New York. Participating teachers were nominated by administrators
at 40 randomly selected secondary schools. The administrators were
shown written descriptions of CT-rich and CT-lean lessons and then were
asked to nominate one teacher judged to strongly favor one of the levels of CT
use. The result was two groups of 20 teachers known by supervisor nomination
to differ in CT use (CT-inclined and CT-averse). Among the CT-inclined
teachers, the 8 males and 12 females had an average age of 39.9 years and an
average of 13.1 years of teaching experience, ranging from 8 to 29 years. They included
5 teachers of English, 3 of languages other than English, 4 of mathematics, 5
of science, and 3 of social studies. Among the CT-averse teachers, the 9 males
and 11 females had an average age of 41.4 years and an average of 12.4 years of teaching
experience, ranging from 7 to 30 years. They included 4 teachers of English, 2
of languages other than English, 5 of mathematics, 4 of science, and 5 of
social studies. No administrator or teacher asked to take part declined to do
so. Participants were not compensated for participating. The 120-item pool
was administered to participating teachers at the schools at which they were
employed. They were instructed that the items had no correct answers and
that responses were confidential.
Results and Discussion
Item selection. Selection of prompts and advantage-characteristic items
for use in subsequent studies was made according to the following criteria:
(a) prompts yielding a statistically significant main effect in multivariate
analysis of group membership, (b) items with a strong association to the main
effect (i.e., statistically significant univariate F values), (c) balanced inclusion
of prompts (high/low CT) and advantage-characteristic items (high/low
ability, high/low prior knowledge, and high/low motivation), and (d) distribution
across the five secondary subjects (English, languages other than
English, mathematics, science, and social studies).
For each of the 20 prompts, a one-way between-subjects multivariate
analysis of variance (MANOVA) was performed on the six dependent variables
(high ability, low ability, high prior knowledge, low prior knowledge,
high motivation, and low motivation). The independent variable was group
membership (CT-inclined or CT-averse). Results of evaluation of assumptions
of normality of sampling distributions, linearity, and homogeneity of
variance/covariance matrices were satisfactory. There were no outliers at
alpha = .001. Twelve prompts were retained that discriminated between
groups at the conservative alpha level of .0025 (.05/20). With the use of
Wilks's Lambda, the selected prompts yielded F values (6, 34) ranging from
12.11 to 77.96 (ps < .0025). Univariate analyses were performed as post hoc
procedures to follow up on statistically significant multivariate effects (see
Table 1). The selected prompts contained 36 items that maximally discriminated
between the two groups, with univariate F values (1, 38) ranging from
3.35 to 51.9. Univariate effect sizes (eta-squared statistics) ranged from .11
to .61 with an average of .29.
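The conservative alpha level of .0025 is a Bonferroni-style adjustment: the familywise level of .05 is divided by the 20 prompt-level MANOVAs. The short Python sketch below reproduces that arithmetic and shows how a retention decision could be expressed; the function names and the example p values are hypothetical and are not taken from the study.

```python
def bonferroni_alpha(familywise_alpha: float, n_tests: int) -> float:
    """Per-test alpha that keeps the familywise error rate at the stated level."""
    return familywise_alpha / n_tests

def retained(p_values, alpha):
    """Indices of tests whose p value falls below the adjusted alpha."""
    return [i for i, p in enumerate(p_values) if p < alpha]

per_prompt_alpha = bonferroni_alpha(0.05, 20)
print(round(per_prompt_alpha, 4))  # 0.0025

# Hypothetical p values for three prompts (illustration only).
example_p = [0.0004, 0.0100, 0.0020]
print(retained(example_p, per_prompt_alpha))  # [0, 2]
```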
The resulting 36-item scale is balanced with 6 high-CT prompts and 6
low-CT ones. Each prompt is followed by three items (6-point Likert-type
scales), one each for ability, prior knowledge, and motivation. The scale is
also balanced for the three advantage characteristics, with each one appearing
12 times: 6 referring to high-advantage learners and 6 referring to
low-advantage ones.
As expected, analysis of variance (ANOVA) procedures revealed that the
two groups produced divergent scores on the 36-item scale, with CT-inclined
teachers favoring high-CT prompts and CT-averse teachers preferring low-CT
ones. On high-CT prompts for high-advantage learners, the difference
between CT-inclined teachers (M = 5.38, SD = 0.56) and CT-averse teachers
(M = 4.18, SD = 0.83) was statistically significant, with F(1, 38) = 26.11, p <
.0001 (eta-squared = .43). On high-CT prompts for low-advantage learners, a
statistically significant difference was found between CT-inclined teachers
(M = 3.67, SD = 0.97) and CT-averse ones (M = 2.47, SD = 0.94), with F(1,
38) = 14.76, p < .0001 (eta-squared = .30). On low-CT prompts for
high-advantage learners, the difference between CT-inclined teachers (M = 3.12,
SD = 1.1) and CT-averse ones (M = 4.41, SD = 0.83) was statistically significant,
with F(1, 38) = 13.26, p < .001 (eta-squared = .27). Finally, on low-CT
prompts for low-advantage learners, a statistically significant difference was
found between CT-inclined teachers (M = 2.25, SD = 0.90) and CT-averse
ones (M = 4.36, SD = 0.67), with F(1, 38) = 68.59, p < .0001 (eta-squared =
.65).
Internal consistency. Tests of internal consistency indicated that the 36
selected items exhibited an acceptable degree of interrelatedness (where such
interrelatedness is expected given the theoretical distinctions drawn between
high versus low CT use and high versus low learner advantages). The overall
alpha level for the scale was .89 based on averages for the items expected to
be associated. Satisfactory levels of internal consistency were obtained
among the items measuring (a) high-CT prompts for high-advantage learners
(alpha = .91), (b) high-CT prompts for low-advantage learners (alpha = .79),
(c) low-CT prompts for high-advantage learners (alpha = .96), and (d) low-CT
prompts for low-advantage learners (alpha = .92).
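The alpha values reported above are internal-consistency coefficients. Assuming they are Cronbach's alpha computed in the standard way (the text does not spell out the formula), the calculation for one subscale can be sketched in Python as follows. The ratings are hypothetical 6-point responses invented for illustration; they are not data from the study.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item ratings."""
    k = len(rows[0])                                     # number of items
    item_variances = [variance(col) for col in zip(*rows)]
    total_variance = variance([sum(r) for r in rows])    # variance of summed scores
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical 6-point ratings from five respondents on three items.
ratings = [
    [5, 6, 5],
    [4, 4, 5],
    [2, 3, 2],
    [6, 6, 6],
    [3, 3, 4],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 2))  # 0.96
```

Alpha rises as the items covary more strongly relative to their individual variances, which is why it is read here as evidence that the items within each subscale hang together.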
Internal consistency computations also supported the efficacy of the three
advantage characteristics (ability, prior knowledge, and motivation) as methodological
tools for operationalizing teachers' perception of learners' advantage
level. For high-CT prompts, acceptable levels of correlation were
obtained as follows (all correlations in this paragraph, p < .05): between
high-ability and high-prior-knowledge items (.91), between high-ability
and high-motivation items (.90), between high-prior-knowledge and
high-motivation items (.97), between low-ability and low-prior-knowledge items
(.74), between low-ability and low-motivation items (.74), and between
low-prior-knowledge and low-motivation items (.90). Similarly, for low-CT
prompts, satisfactory levels of correlation were obtained between high-ability
and high-prior-knowledge items (.94), high-ability and high-motivation
items (.96), high-prior-knowledge and high-motivation items (.86),
low-ability and low-prior-knowledge items (.96), low-ability and low-motivation
items (.91), and low-prior-knowledge and low-motivation items (.90).
Study 2
In Study 2, exploratory factor analysis was employed to identify factors
underlying teachers' CT-use beliefs. With the assessment of a larger and
more homogeneous population of practicing secondary teachers, it was
expected that the salient pattern and structure coefficients of the items would
represent a tendency among teachers to support CT-rich and CT-lean activities
according to learners' perceived advantages.
Method
Practicing secondary teachers (N = 381) in New York, Connecticut, and
Massachusetts participated in Study 2. They were selected at random from
faculty at 39 randomly selected secondary schools. Among the participating
teachers, 199 were female with a mean age of 37.7 years, and 182 were male
with a mean age of 38.6 years. In keeping with teacher-expertise benchmarks
set out by Berliner (1992), participants had a minimum of 5 years of teaching
experience, which ranged from 7 to 35 years with an average of 13.7 years.
Participants included 23 teachers of business, 49 of English, 25
of fine arts, 27 of health, 42 of languages other than English, 55 of mathematics,
48 of music, 47 of science, and 65 of social studies. They were not compensated
for participating, and no teachers asked to participate declined to do
so. All participants completed the 36-item CTBA at the schools at which they
were employed. Verbal instructions emphasized that there were no correct
answers and that responses were confidential.
Results and Discussion
Principal axis factoring was used to evaluate the number of factors, presence
of outliers, absence of multicollinearity, and factorability of the correlation
matrices. (Use of a principal components method yielded highly similar
results.) The Kaiser-Meyer-Olkin measure of sampling adequacy was .73.
There were no outliers at alpha = .001. When oblique factor rotation was
attempted, the highest correlation obtained (between factors interpreted as
low CT/high advantage and low CT/low advantage) was .25. Because this
correlation fell below the benchmark of .32 for inclusion in nonorthogonal
factor rotation (Tabachnick & Fidell, 2001), varimax rotation was chosen.
The number of candidate factors was determined by eigenvalues greater
than one, examination of the scree plot, and a parallel analysis. Ten factors
yielded eigenvalues greater than one, and examination of the scree plot suggested
a possible range of 4 to 6 factors. Because the eigenvalue-greater-than-one
rule tends to overestimate the number of factors, and the scree plot
entails some inherent subjectivity, the parallel analysis was employed as a
generally more accurate assessment of the number of factors (Henson,
Capraro, & Capraro, 2001; Henson & Roberts, 2001). The parallel analysis
was conducted using the syntax for SPSS provided by Thompson and Daniel
(1996). This analysis produced a set of 6 factors; that is, there were 6 factors
for which the eigenvalue in the actual data exceeded the associated
eigenvalue in the random data generated for parallel analysis. These 6 factors
accounted for 70% of the variance. However, only the first 4 factors provided
theoretically relevant and interpretable information about teachers' CT-related
beliefs. These factors accounted for a high percentage of the variance
(62%) and could be readily interpreted in accordance with the theoretical
aims of the instrument, that is, to assess teachers' beliefs about high-CT and
low-CT activities depending upon learners' advantage level (high versus
low). In contrast, the 5th and 6th factors explained only a small portion of
additional variance (8%) and were not readily interpretable given the CTBA's
theoretical design. Hence, the 5th and 6th factors were excluded from further
analyses in Study 2.
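The retention rule described above (keep the factors whose observed eigenvalue exceeds the corresponding eigenvalue from random data) can be sketched in Python with NumPy. This is a minimal illustration, not the SPSS syntax the authors used: the function name, the choice to average eigenvalues over 200 random data sets, the use of the unreduced correlation matrix, and the simulated ratings are all assumptions made for the example.

```python
import numpy as np

def parallel_analysis(data, n_iter=200, seed=0):
    """Count leading observed eigenvalues that exceed the mean eigenvalues
    obtained from random normal data of the same shape."""
    data = np.asarray(data, dtype=float)
    n_obs, n_vars = data.shape
    rng = np.random.default_rng(seed)

    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

    # Eigenvalues from uncorrelated random data with matching dimensions.
    random_eigs = np.empty((n_iter, n_vars))
    for i in range(n_iter):
        noise = rng.standard_normal((n_obs, n_vars))
        random_eigs[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    threshold = random_eigs.mean(axis=0)

    # Retain factors until an observed eigenvalue fails to beat its threshold.
    n_factors = 0
    for obs, thr in zip(observed, threshold):
        if obs <= thr:
            break
        n_factors += 1
    return n_factors

# Simulated example (not study data): 300 respondents, 6 items, 2 underlying factors.
rng = np.random.default_rng(1)
factors = rng.standard_normal((300, 2))
loadings = np.array([[0.8, 0.0]] * 3 + [[0.0, 0.8]] * 3)
items = factors @ loadings.T + 0.6 * rng.standard_normal((300, 6))
n_retained = parallel_analysis(items)
print(n_retained)
```

As in the study, the count returned by such a procedure is only a candidate number of factors; interpretability still decides how many are kept.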
Table 2 presents pattern/structure coefficients, communalities, and percentages
of variance and covariance, with interpretative labels for each of the
four remaining factors suggested in bold type. As indicated by squared mean
correlations, all four factors were internally consistent and well defined by
the items; the lowest of the squared mean correlations for factors from items
was .95. The reverse was also true: The items were well defined by this factor
solution. Communality values were moderate to high. With a cutoff of .40 for
inclusion of an item in interpretation of a factor (Tabachnick & Fidell, 2001),
all 36 items loaded on one, and only one, of the four factors.
As noted, the pattern/structure coefficients and the distribution of items
suggest that the items represented teachers' reported tendency to support
high-CT and low-CT activities depending upon learners' advantage level
(high versus low). The items were equally distributed across the four factors
(nine items each), and all of the items on the four factors were positively
weighted. The CTBA yielded four scale scores with satisfactory reliability:
(a) high-CT activities for high-advantage learners (M = 4.75, SD = .88,
alpha = .88), (b) high-CT activities for low-advantage learners (M = 3.19,
SD = .76, alpha = .78), (c) low-CT activities for high-advantage learners (M =
4.14, SD = .90, alpha = .88), and (d) low-CT activities for low-advantage
learners (M = 2.91, SD = .79, alpha = .83).
Study 3
To replicate the factor structure obtained in Study 2, a third study was conducted
in which the CTBA was administered to a group of preservice secondary
teachers. As a group with negligible experience in teaching, preservice
teachers may hold CT-use beliefs that differ from those of seasoned in-service
teachers. Moreover, preservice teachers comprise a self-selected population likely
to be targeted in future CTBA research.
Method
Participants in Study 3 included randomly selected undergraduate-level
preservice teachers (N = 308) at three postsecondary institutions on Long
Island, New York (Hofstra University, Adelphi University, and Dowling
College). Among the participants, 161 were female and 147 were male, with a
mean age of 21.2 years. They included 1 preservice teacher of business, 56 of
English, 25 of fine arts, 27 of health, 38 of languages other than English, 46 of
mathematics, 30 of music, 31 of science, and 54 of social studies. No participants
were compensated, and none declined when asked to take part. All
participants completed the 36-item CTBA, with verbal instructions emphasizing
that there were no correct answers and that responses were confidential. In
Study 3, a factor analysis was conducted in the same manner as in Study 2
because the two studies examined populations that may plausibly be expected
to differ (in-service and preservice teachers), raising the need to employ a
consistent set of data-analytic procedures.
Results and Discussion
Factor-analytic results obtained in Study 3 were similar to those of Study
2. In Study 3, principal axis factoring and principal components analysis
again yielded similar results. The principal axis factoring produced a
Kaiser-Meyer-Olkin measure of sampling adequacy of .72 and no outliers at alpha =
.001. Oblique rotation yielded low correlations (the highest of which was
.19), so varimax rotation was employed. In Study 3, 10 factors had eigenvalues
greater than one, and examination of the scree plot suggested a possible
range of 4 to 6 factors. Parallel analysis produced 6 factors that accounted
for 72% of the variance. However, as in Study 2, only 4 factors provided useful
measures of teachers' CT-related beliefs. The first 4 factors accounted for
63% of the variance and could be readily interpreted in accordance with the
theoretical design of the CTBA, whereas the 5th and 6th factors explained
only a small additional portion of the variance (9%) and provided little
theoretically relevant or interpretable information about teachers' CT-related
beliefs. Hence, only the first 4 factors were included in further analyses in
Study 3.
Pattern/structure coefficients, communalities, and percentages of variance
and covariance are presented in Table 3 with interpretative labels suggested
for each of the four remaining factors in bold type. Analysis of the
squared mean correlations indicated that all four factors had satisfactory
internal consistency and were well defined by the items (.93 to .98); conversely,
the items were well defined by this factor solution. Moderate to high
communality values were obtained. All 36 items loaded on one of the four
factors (with a cutoff of .40 for inclusion of an item in interpretation of a
factor).
The pattern/structure coefficients and the distribution of items indicated
that the four-factor model represented teachers' tendency to support high-CT
and low-CT activities according to learners' advantage level (high versus
low), as in Study 2. The 36 items were equally distributed across the four factors
(9 items each), and all of the items on the four factors were positively
weighted. The resulting four scale scores had satisfactory reliabilities: (a)
high-CT activities for high-advantage learners (M = 4.94, SD = .68, alpha =
.84), (b) high-CT activities for low-advantage learners (M = 3.79, SD = .91,
alpha = .82), (c) low-CT activities for high-advantage learners (M = 3.55,
SD = .99, alpha = .92), and (d) low-CT activities for low-advantage learners
(M = 2.78, SD = .83, alpha = .84).
Study 4
To investigate the discriminant validity of scores yielded by the CTBA, we
conducted a fourth study to determine the extent to which the scale tapped
constructs that were distinguishable from three constructs hypothesized to be
associated with CT-use beliefs. The first construct was CT ability: an
individual's capacity to engage successfully in CT-rich tasks (Ennis, 1987;
Facione, Facione, & Giancarlo, 2000; Kuhn, 1999). To explore the possibility
that individuals with a high level of CT ability favor the use of such skills
in the classroom, we administered a test of CT ability, the California Critical
Thinking Skills Test Form 2000 (CCTST) (Facione et al., 2000), along with
the CTBA. The CCTST includes 34 multiple-choice items with four to six
response options. As an ability measure, the CCTST has correct and incorrect
answers, unlike the other assessment instruments employed in this
study (including the CTBA), which are opinion measures with no correct
answers.
The second construct considered for the purposes of assessing the
discriminant validity of the CTBA was CT disposition: an individual's
propensity to engage in CT in everyday situations and professional contexts.
Individuals who are inclined to think critically may seek situations that
require it (Perkins, Jay, & Tishman, 1993). To investigate the possibility that
teachers with high CT disposition favor CT activities in the classroom, we
administered the Need for Cognition Scale (NCS) (Cacioppo & Petty, 1982;
Cacioppo, Petty, Feinstein, & Jarvis, 1996). The NCS has 18 items scored on
4-point scales, with 9 items worded for reverse scoring.
The third construct was need for social approval: an individual's inclination
to behave in ways that he or she perceives to be agreeable to others.
Schooling that is rich in CT may be viewed by members of our society as
preferable to forms of direct instruction, which are frequently regarded as
tedious and ineffective (Blumenfeld, Hicks, & Krajcik, 1996; Kagan &
Tippins, 1991; Putnam & Borko, 1997; Woolfolk Hoy & Murphy, 2001). To
explore the possibility that the CTBA operates as a proxy for need for social
approval, we administered the Marlowe-Crowne Social Desirability Scale
(MCSDS) (Crowne & Marlowe, 1964). The MCSDS includes 33 true-false
items, with 14 items worded for reverse scoring.
Method
Participants were drawn randomly from preservice secondary teachers (N =
100) in the School of Education and Allied Human Services at Hofstra
University on Long Island, New York. The 59 females and 41 males had a mean
age of 20.8 years and included 1 prospective teacher of business, 21 of English,
10 of health, 15 of languages other than English, 17 of mathematics, 13
of science, and 23 of social studies. Participants were not compensated, and
none declined when asked to take part. All participants completed the CTBA,
CCTST, NCS, and MCSDS. Verbal instructions indicated that (a) there were
no correct answers for the CTBA, NCS, or MCSDS; (b) there were correct
answers on the CCTST; and (c) responses on all instruments were confidential.
Results and Discussion
High alpha coefficients provided evidence for the reliabilities of the
CCTST (.82), NCS (.90), and MCSDS (.94) with this group of participants.
Table 4 presents correlations between these measures and the four sets of
scores produced by the CTBA (employing true factor scores, not
unit-weighted scores based on salient items). As expected, near-zero correlations
(not statistically significant at the .05 level) were found between the MCSDS
and the four CTBA score sets, indicating that the CTBA tapped a construct
distinguishable from need for social approval. Similarly, correlations were
very small between the NCS and all four CTBA score sets, demonstrating that
the CT-use construct was distinct from participants' CT disposition.
Small or near-zero correlations were found between the CCTST and two
score sets of the CTBA, low-CT prompts for high-advantage learners and
low-CT prompts for low-advantage learners. Slightly larger correlations
were found between the CCTST and the remaining two CTBA score sets:
high-CT prompts for high-advantage learners (r = .28, p < .05) and high-CT
prompts for low-advantage learners (r = .31, p < .05). Neither of these correlation
coefficients reached the .32 level, which was used to judge a meaningful
contribution to the variance (Tabachnick & Fidell, 2001). However, the
obtained correlations indicate that participants' CT ability was slightly associated
(9.7% of the variance) with their beliefs about the effectiveness of
high-CT prompts for both high-advantage and low-advantage learners.
These results suggest that the CTBA tapped a construct that was, perhaps,
related to but distinguishable from CT ability.
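The variance figures in this discussion come from squaring the correlation coefficient: r squared is the proportion of variance two measures share, so the .32 benchmark corresponds to roughly 10% overlap. The Python sketch below applies this to the rounded correlations quoted in the text; the function name is hypothetical. Because the inputs are rounded to two decimals, the result for r = .31 comes out at about 9.6%, close to the 9.7% reported, which was presumably computed from an unrounded coefficient.

```python
def shared_variance(r: float) -> float:
    """Proportion of variance shared by two measures correlated at r."""
    return r ** 2

# Rounded coefficients as quoted in the text.
for label, r in [
    ("benchmark", 0.32),
    ("CCTST with high-CT/high-advantage", 0.28),
    ("CCTST with high-CT/low-advantage", 0.31),
]:
    print(f"{label}: r = {r:.2f}, r^2 = {shared_variance(r):.3f}")
```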
Study 5
A fifth study was conducted to investigate the predictive validity of scores
produced by the CTBA. At issue is the strength of association between
CTBA results and teachers' observed classroom practice.
Method
Participating teachers (N = 72) were randomly selected from faculty ros-
ters at 35 schools that were randomly selected from a list of all secondary
TORFF AND WARBURTON 173
Table 4
Correlation Coefficients for Critical Thinking Belief Appraisal (CTBA), Marlowe-Crowne
Social Desirability Scale, Need for Cognition Scale, and the California Critical Thinking
Skills Test (CCTST) in Study 4 (N = 100)
CTBA Score Marlowe-Crowne Need for Cognition CCTST
HICT Ă HIADV .04 .12 .29*
HICT Ă LOADV .07 .02 .31*
LOCT Ă HIADV .02 .10 .06
LOCT Ă LOADV .05 .03 .18
Note. HICT = high critical thinking; HIADV = high advantage; LOADV = low advantage; LOCT= low critical
thinking.
*p < .05; otherwise p > .10.
20. schools on Long Island, New York. The 38 women and 34 men included 16
teachers of English, 10 of languages other than English, 15 of mathematics,
17 of science, and 14 of social studies. All had 5 or more years of teaching
experience (Berliner, 1992). Participants' ages ranged from 27 to 56 years
with a mean of 37.7. Teaching experience ranged from 5 to 33 years with an
average of 10.6.
A three-part data collection strategy was used. First, observers visited the
classrooms of participating secondary teachers and rated their use of CT
activities ("observed CT use"). Second, we asked the participating teachers
to identify the observed classes as low or high with respect to each of the three
advantage characteristics (ability, prior knowledge, and motivation). Third,
we asked teachers to complete the CTBA. The scale's predictive validity was
then evaluated by calculating the correlation between observed CT use and
the classroom-matched items on the CTBA (i.e., the items that correspond to
the configuration of advantage characteristics that teachers identified as
describing the observed class).
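The matching step described above can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' procedure: the record layout, the field names (`observed_ct`, `class_profile`, `ctba_items`), and all numbers are hypothetical, and the item scores are invented for the example. Each teacher's CTBA items are filtered to those matching the advantage profile reported for the observed class, averaged, and then correlated with observed CT use.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return sxy / sqrt(sxx * syy)


def matched_ctba_score(teacher):
    """Mean of the CTBA items whose advantage profile matches the observed class."""
    profile = teacher["class_profile"]  # (ability, prior knowledge, motivation)
    return mean(teacher["ctba_items"][profile])


HIGH = ("high", "high", "high")
LOW = ("low", "low", "low")

# Hypothetical records, invented purely for illustration.
teachers = [
    {"observed_ct": 4.1, "class_profile": HIGH, "ctba_items": {HIGH: [5, 6, 5], LOW: [2, 3, 2]}},
    {"observed_ct": 2.0, "class_profile": LOW, "ctba_items": {HIGH: [5, 5, 6], LOW: [2, 2, 3]}},
    {"observed_ct": 3.2, "class_profile": HIGH, "ctba_items": {HIGH: [4, 4, 5], LOW: [3, 3, 3]}},
    {"observed_ct": 2.9, "class_profile": LOW, "ctba_items": {HIGH: [6, 5, 5], LOW: [4, 3, 4]}},
]

observed = [t["observed_ct"] for t in teachers]
matched = [matched_ctba_score(t) for t in teachers]
r = pearson_r(observed, matched)
print(f"correlation between observed CT use and matched CTBA scores: r = {r:.2f}")
```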
Teachers' observed CT use was assessed by two raters using a procedure
developed in prior research (Torff, 2003). Raters were enrolled in a graduate-
level program in secondary education and had earned highest grades in a CT-
in-education course. They received additional instruction in measurement of
classroom CT use (i.e., how to identify high-CT and low-CT activities) but
had no specific knowledge of the research design or hypothesis.
To tune the rating process and evaluate interrater reliability prior to data
collection, raters met for 2 hours twice weekly for 4 weeks (a total of eight
meetings). Initially they reviewed lessons on videotape and discussed with
the experimenter how to rate teachers' use of CT. Then, to test the level of
interrater reliability, raters scored a second set of videotapes separately with-
out discussion. Raters produced suitable levels of interrater reliability after
three iterations of the training process (88% agreement). In data collection
for Study 5, interrater reliability of 85% was sufficient to allow further
analysis of the data.
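Percent agreement of the kind reported above can be computed as the share of rating occasions on which two raters gave the same score. The sketch below assumes exact-match agreement, which the text does not specify; the function name, the "NT" code for a "no teaching" rating, and the ratings themselves are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of paired ratings on which two raters agree exactly."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical minute-by-minute ratings; "NT" stands in for "no teaching".
rater_a = [1, 1, 2, 5, 5, 4, "NT", 3]
rater_b = [1, 2, 2, 5, 5, 4, "NT", 3]

print(f"agreement: {percent_agreement(rater_a, rater_b):.1f}%")
```

Exact-match agreement does not correct for agreement expected by chance, so it tends to be more lenient than chance-corrected indices.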
When scheduling classroom observations, we asked participating teachers
to host the raters in periods featuring teacher-led activities and not a special
event (e.g., a guest presentation) or a lesson in which the teacher's role
was minimized (e.g., an examination). The two raters visited classrooms
together but made separate assessments without access to each other's ratings.
During each instructional period (40 to 45 minutes in length), raters
made assessments of CT use once per minute using a 5-point Likert-type
scale (5 = a great deal, 4 = a lot, 3 = some, 2 = a little, 1 = not much), or gave a
rating of "no teaching" if instruction had not been attempted in the previous minute.
Ratings were entered on specially prepared score sheets as summaries
of the previous minute of classroom activity. Ratings were made simulta-
neously with the use of a clock visible to both raters.
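One way to turn a period's minute-by-minute ratings into a single observed CT-use score is sketched below. The aggregation rule is an assumption, since the text does not state how ratings were combined: here the two raters' scores are averaged for each minute, minutes marked "no teaching" by either rater are dropped, and the remaining minutes are averaged. The function name, the use of `None` for "no teaching", and the ratings are hypothetical.

```python
from statistics import mean

NO_TEACHING = None  # hypothetical placeholder for a "no teaching" minute


def class_ct_score(rater_a, rater_b):
    """Average two raters' per-minute ratings, skipping "no teaching" minutes."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must rate the same number of minutes")
    per_minute = [
        (a + b) / 2
        for a, b in zip(rater_a, rater_b)
        if a is not NO_TEACHING and b is not NO_TEACHING
    ]
    if not per_minute:
        raise ValueError("no rated minutes of instruction")
    return mean(per_minute)


# Six hypothetical minutes on the 1-5 scale; the third minute had no teaching.
rater_a = [1, 1, NO_TEACHING, 3, 5, 5]
rater_b = [1, 2, NO_TEACHING, 3, 5, 4]

print(f"observed CT use for the period: {class_ct_score(rater_a, rater_b):.2f}")
```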
As examples of the scoring procedure, consider three instances of classroom
instruction in social studies classes. In one instance, a teacher described
the events of the Vietnam War. With learners simply watching and listening,
the raters agreed that CT use was very low (1). In a second instance, the
teacher asked learners to participate in a debate about whether the United
States should have refused to sign a treaty banning land mines. Because
learners were charged to reason and argue as political scientists do, raters
agreed that CT use was very high (rated a 5). In a third instance, classroom
activity was temporarily delayed when the teacher reviewed medical documents
provided by a student; in this case, raters concurred that "no teaching"
was appropriate. As measures of teachers' classroom behavior, rating procedures
such as these are far from ideal (Medley & Coker, 1987; Wilkerson, Manatt,
Rogers, & Maughm, 2000). However, measurement of teachers' use of CT on
a minute-by-minute basis provided ongoing assessment that was sensitive to
changes in classroom activities (Fischer & Bidell, 1998; Granott & Parziale,
2002).
Following the classroom observations, participating teachers were asked
to rate the observed class as high or low in each of the three advantage charac-
teristics. Teachers were not asked to complete the CTBA until 4 weeks later
to reduce response bias based on their recall of the activities employed during
the observed class and the advantage-characteristic identification they pro-
duced following the class.
Results and Discussion
The 72 teachers were sorted into six groups according to teachers' identification
of the observed classroom as high or low in specific advantage characteristics
(for example, high ability, high prior knowledge, and low motivation).
Teachers' average scores on the classroom-matched items on the
CTBA were then compared to the observed CT-use ratings.
Overall, a strong degree of association (r = .72) was found between
observed CT use and the classroom-matched items. High correlations were
found between observed CT use and classroom-matched CTBA scores
among teachers who identified learners as follows: (a) high ability, high prior
knowledge, and low motivation (n = 12, r = .79), (b) high ability, high prior
knowledge, and high motivation (n = 20, r = .68), (c) low ability, low prior
knowledge, and low motivation (n = 16, r = .70), and (d) high ability, low
prior knowledge, and high motivation (n = 15, r = .71). Despite the small
sample size in two of the six groups, the correlations between observed CT
use and classroom-matched CTBA scores in all groups (ranging from r = .64
to r = .80) suggest that the CTBA produced scores with satisfactory predictive
validity.
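The caution about small group sizes can be illustrated by attaching approximate confidence intervals to the group correlations. The article reports no such intervals; the sketch below is an added illustration that applies the standard Fisher z transformation to the four (n, r) pairs reported above, with 1.96 as the conventional normal critical value for a 95% interval. The function name is hypothetical.

```python
from math import atanh, sqrt, tanh


def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation via Fisher's z transform."""
    if n <= 3:
        raise ValueError("the Fisher interval needs more than 3 observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = atanh(r)                      # transform r to the z scale
    half_width = z_crit / sqrt(n - 3)  # standard error on the z scale is 1/sqrt(n-3)
    return tanh(z - half_width), tanh(z + half_width)


# Group sizes and correlations as reported in the text for groups (a) to (d).
groups = [("a", 12, 0.79), ("b", 20, 0.68), ("c", 16, 0.70), ("d", 15, 0.71)]

for label, n, r in groups:
    low, high = fisher_ci(r, n)
    print(f"group ({label}): n = {n}, r = {r:.2f}, approx. 95% CI [{low:.2f}, {high:.2f}]")
```

Because the standard error on the z scale depends only on n, smaller groups yield wider intervals at any given r.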
General Discussion
Scale Construction and Reliability
The CTBA was developed in Study 1, in which a pool of items with strong
face validity for measuring beliefs about classroom CT use was generated
and revised against criteria concerning ambiguity, irrelevance, and internal consistency.
Study 2 was conducted to explore the factor structure of scores on the
36 selected items, with results supporting a four-factor model (high-CT
activities for high-advantage learners, high-CT activities for low-advantage
learners, low-CT activities for high-advantage learners, and low-CT activi-
ties for low-advantage learners). This factor structure was replicated in Study
3. Strong dominance of the four-factor set, stable pattern/structure coeffi-
cients, and high internal consistency reliabilities indicated that the shared
variation of the items reliably assessed a common set of factors. Moreover,
this result was unlikely to be an artifact of response biases, because instruc-
tions and item wording were designed to minimize acquiescence and self-
presentation, and correlations between CT-use scores and social desirability
were weak.
Validity
In the series of studies reported in this article, the CTBA produced scores
with favorable construct, content, discriminant, and predictive validity. The
empirical method of deriving the scale used in Study 1 enhances the confi-
dence that can be placed in the construct and content validity of obtained
scores. In Study 4, the scale produced scores with acceptable discriminant
validity. Study 5 provided evidence for the predictive validity of the scores
produced by the scale. Taken together, the results of the five studies support
the theoretical utility of the CTBA for assessing teachers' CT-use beliefs.
Practical Utility
The CTBA has practical utility as a research tool. Research on CT-use
beliefs using interview protocols (e.g., Zohar et al., 2001) or questionnaires
in which respondents self-report their classroom practices (e.g., Raudenbush
et al., 1993) employs methodologies effective only with practicing teachers.
The CTBA seemingly can be used with preservice populations, allowing
researchers to investigate the origin and development of CT-use beliefs.
Issues that remain unexplored include (a) the extent to which prospective
teachers hold similar beliefs relative to individuals who do not become teach-
ers; (b) the extent to which CT-use beliefs change during preservice training;
(c) the extent to which CT-use beliefs change as teachers gain classroom
experience, in-service training, and teaching expertise; and (d) what kinds of
preservice and in-service interventions are maximally effective in inducing
favorable teacher change. Finally, future research should examine the stabil-
ity of the obtained factor structure in other samples (perhaps of varying
diversity) using confirmatory factor analytic methods.
References
Anning, A. (1988). Teachers' theories about children's learning. In J. Calderhead (Ed.),
Teachers' professional learning (pp. 128-145). London: Falmer.
Archer, J., & McCarthy, B. (1988). Personal biases in student assessment. Educational Research,
30, 142-145.
Berliner, D. (1992). The nature of expertise in teaching. In F. Oser, A. Dick, & J. Patry (Eds.), Ef-
fective and responsible teaching (pp. 227-248). San Francisco: Jossey-Bass.
Blumenfeld, P., Hicks, L., & Krajcik, J. (1996). Teaching educational psychology through in-
structional planning. Educational Psychologist, 31, 51-62.
Brickhouse, N. W. (1990). Teachers' beliefs about the nature of science and their relationship to
classroom practice. Journal of Teacher Education, 41, 53-62.
Browne, M., & Keeley, K. (2001). Asking the right questions: A guide to critical thinking (6th
ed.). Upper Saddle River, NJ: Merrill/Prentice Hall.
Cacioppo, J., & Petty, R. (1982). The need for cognition. Journal of Personality and Social Psy-
chology, 42, 116-131.
Cacioppo, J. T., Petty, R. E., Feinstein, J. A., & Jarvis, W. B. (1996). Dispositional differences in
cognitive motivation: The life and times of individuals varying in need for cognition.
Psychological Bulletin, 119, 197-253.
Calderhead, J. (1996). Teachers: Beliefs and knowledge. In D. Berliner & R. Calfee (Eds.),
Handbook of educational psychology (pp. 709-725). New York: Macmillan.
Crowne, D., & Marlowe, D. (1964). The approval motive. New York: John Wiley.
Dweck, C. (1986). Motivational processes affecting learning. American Psychologist, 41, 1040-
1048.
Ennis, R. (1987). A taxonomy of critical-thinking dispositions and abilities. In J. Baron & R.
Sternberg (Eds.), Teaching thinking skills: Theory and practice (pp. 9-26). New York:
Freeman.
Facione, P., Facione, N., & Giancarlo, C. (2000). The California critical thinking skills test.
Millbrae, CA: California Academic Press.
Fang, Z. (1996). A review of research on teacher beliefs and practices. Educational Research, 38,
47-65.
Fischer, K. W., & Bidell, T. R. (1998). Dynamic development of psychological structures in ac-
tion and thought. In R. M. Lerner (Ed.), Handbook of child psychology: Vol. 1. Theoretical
models of human development (5th ed., pp. 467-561). New York: John Wiley.
Givvin, K. B., Stipek, D. J., Salmon, J. M., & MacGyvers, V. L. (2001). In the eyes of the beholder:
Students' and teachers' judgments of students' motivation. Teaching and Teacher Education,
17, 321-331.
Granott, N., & Parziale, J. (Eds.). (2002). Microdevelopment: Transition processes in develop-
ment and learning. Cambridge: Cambridge University Press.
Halpern, D. (2002). Thought and knowledge (4th ed.). Mahwah, NJ: Lawrence Erlbaum.
Henderson, J. (2001). Reflective teaching: Professional artistry through inquiry (3rd ed.). Upper
Saddle River, NJ: Merrill/Prentice Hall.
Henson, R. K., Capraro, R. M., & Capraro, M. M. (2001, November). Reporting practices and
use of exploratory factor analysis in educational research journals. Paper presented at the
annual meeting of the Mid-South Educational Research Association, Little Rock, AR.
Henson, R. K., & Roberts, J. K. (2001, February). A meta-analytic review of exploratory factor
analysis reporting practices in published research. Paper presented at the annual meeting of
the Southwest Educational Research Association, New Orleans, LA. (ERIC Document Re-
production Service No. ED 449 227)
Kagan, D., & Tippins, D. (1991). How student teachers describe their pupils. Teaching and
Teacher Education, 7, 455-466.
Kuhn, D. (1999). A developmental model of critical thinking. Educational Researcher, 28, 16-
26.
Madon, S., Jussim, L., Keiper, S., Eccles, J., Smith, A., & Paolumbo, P. (1998). The accuracy and
power of sex, social class, and ethnic stereotypes: A naturalistic study in person perception.
Personality and Social Psychology Bulletin, 24, 1304-1318.
Medley, D., & Coker, H. (1987). How valid are principals' judgments of teacher effectiveness?
Phi Delta Kappan, 69, 138-140.
Metz, M. H. (1978). Classrooms and corridors: The crisis of authority in desegregated second-
ary schools. Berkeley: University of California Press.
Moje, E. B., & Wade, S. E. (1997). What case discussions reveal about teacher thinking. Teach-
ing and Teacher Education, 13, 691-712.
Nespor, J. (1987). The role of beliefs in the practice of teaching. Journal of Curriculum Studies,
19, 317-328.
Nolen, S. B., & Nichols, J. G. (1994). A place to begin (again) in research on student motivation:
Teachers' beliefs. Teaching and Teacher Education, 10, 57-69.
Oakes, J. (1990). Multiplying inequalities: The effects of race, social class, and tracking on op-
portunities to learn math and science. Santa Monica, CA: RAND.
O'Tuel, F., & Bullard, R. (1993). Developing higher-order thinking in the content areas. Pacific
Grove, CA: Critical Thinking Books and Software.
Page, R. N. (1990). The lower-track curriculum in a college-preparatory high school. Curriculum
Inquiry, 20, 249-281.
Perkins, D. N., Jay, E., & Tishman, S. (1993). New conceptions of thinking: From ontology to
education. Educational Psychologist, 28, 67-85.
Pintrich, P., & Schunk, D. (1996). Motivation in education. Upper Saddle River, NJ: Prentice
Hall.
Pogrow, S. (1990). Challenging at-risk learners: Findings from the HOTS program. Phi Delta
Kappan, 71, 389-397.
Pogrow, S. (1994). Helping learners who "just don't understand." Educational Leadership, 52,
62-66.
Putnam, R., & Borko, H. (1997). Teacher learning: Implications of new views of cognition. In B.
Biddle, T. Good, & I. Goodson (Eds.), The international handbook of teachers and teaching
(Vol. 2, pp. 1223-1296). Dordrecht, the Netherlands: Kluwer.
Raudenbush, S. W., Rowan, B., & Cheong, Y. F. (1993). Higher order instructional goals in secondary
schools: Class, teacher, and school influences. American Educational Research Journal,
30, 523-553.
Raths, L., Wasserman, S., Jonas, A., & Rothstein, A. (1986). Teaching for thinking. New York:
Teachers College Press.
Resnick, L. (1987). Education and learning to think. Washington, DC: National Academy Press.
Richardson, V. (1996). The role of attitudes and beliefs in learning to teach. In J. Sikula (Ed.),
Handbook of research in teacher education (2nd ed., pp. 102-119). New York: Macmillan.
Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics (4th ed.). New York:
Allyn & Bacon.
Thompson, B., & Daniel, L. G. (1996). Factor analytic evidence for the construct validity of
scores: A historical overview and some guidelines. Educational and Psychological Mea-
surement, 56, 197-208.
Tollefson, N. (2000). Classroom applications of cognitive theories of motivation. Educational
Psychology Review, 12, 63-83.
Torff, B. (2003). Developmental changes in teachers' use of higher-order thinking and content
knowledge. Journal of Educational Psychology, 95, 563-569.
Wilcox-Herzog, A. (2002). Is there a link between teachers' beliefs and behaviors? Early Education
and Development, 13, 81-106.
Wilkerson, D., Manatt, R., Rogers, M., & Maughm, R. (2000). Validation of learner, principal,
and self-ratings in 360 degree feedback for teacher evaluation. Journal of Personnel Evalua-
tion in Education, 14, 179-192.
Woolfolk Hoy, A., & Murphy, K. (2001). Teaching educational psychology to the intuitive mind.
In B. Torff & R. Sternberg (Eds.), Understanding and teaching the intuitive mind: Student
and teacher learning (pp. 145-186). Mahwah, NJ: Lawrence Erlbaum.
Yeh, S. (2002). Tests worth teaching to: Constructing state-mandated tests that emphasize criti-
cal thinking. Educational Researcher, 30, 12-17.
Zohar, A., Degani, A., & Vaakin, E. (2001). Teachers' beliefs about low-achieving students and
higher-order thinking. Teaching and Teacher Education, 17, 469-485.