Canadian Journal of School Psychology
http://cjs.sagepub.com
Assessing Teamwork and Collaboration in High School Students: A Multimethod Approach
Lijuan Wang, Carolyn MacCann, Xiaohua Zhuang, Ou Lydia Liu and Richard D. Roberts
Canadian Journal of School Psychology 2009; 24; 108
DOI: 10.1177/0829573509335470
The online version of this article can be found at:
http://cjs.sagepub.com/cgi/content/abstract/24/2/108
Published by:
http://www.sagepublications.com
On behalf of:
Canadian Association of School Psychologists
at University of Sydney on January 13, 2010
http://cjs.sagepub.com
Downloaded from
108
Canadian Journal of
School Psychology
Volume 24 Number 2
June 2009 108-124
© 2009 SAGE Publications
10.1177/0829573509335470
http://cjsp.sagepub.com
hosted at
http://online.sagepub.com
Assessing Teamwork
and Collaboration in High
School Students
A Multimethod Approach
Lijuan Wang
University of Notre Dame
Carolyn MacCann
University of New South Wales
Xiaohua Zhuang
Rutgers University
Ou Lydia Liu
Richard D. Roberts
Educational Testing Service
Abstract: Various policy papers assert that teamwork is an essential skill for the 21st-
century workforce. However, outside of organizational psychology research with adult
populations, there are few reliable assessments of this construct with suitable validity
evidence for test scores. To redress this issue, self-report, situational judgment, and
teacher-report assessments of teamwork were developed for high school students.
Various multivariate techniques were used to determine the structure of the scales,
including factor and latent class analysis. Measures showed reasonable reliability and
satisfactory validity evidence: Self-report, situational judgment, and teacher-report
measures intercorrelated, and these measures also related to academic achievement.
The advantages and disadvantages of each methodology are discussed, as are possible
uses of this assessment system (e.g., evaluation of school-based programs that infuse
curricula with modules on teamwork).
Résumé: Various policy papers assert that teamwork is essential to the 21st-century
workforce. However, outside of organizational psychology research with adults, few
reliable assessments of this construct have been made with sufficient validity. To
remedy this situation, we developed self-report, situational judgment, and teacher-report
measures of teamwork for high school students. Various multivariate techniques were
used to determine the structure of the measures, including factor and latent class
analyses. The measures prove reasonably reliable and their validity is satisfactory.
The self-report, situational judgment, and teacher-report measures intercorrelate, and
all of these measures are also related to academic achievement. We discuss the pros and
cons of each methodology, as well as possible uses of this assessment system (e.g., the
evaluation of school programs whose curricula include modules based on teamwork).
Keywords: teamwork; situational judgment test; teacher ratings; academic achievement;
latent class analysis; structural equation modeling
Authors’ Note: This research was supported, in part, by a Ford Partnership for Advanced Studies research
grant. Numerous individuals at Educational Testing Service were involved in test construction, data
collection, and analyses; thanks to them all. We also thank those involved in the Ford PAS program, who
gave willingly of their time. The views expressed are those of the authors and do not represent the views
of Ford PAS or any of the authors’ institutional affiliations. Correspondence concerning this article should
be addressed to Richard D. Roberts, R&D, MS16-R, Educational Testing Service, Rosedale Road,
Princeton, NJ, 08541; e-mail: RRoberts@ets.org.
Teamwork has been touted as one of the major skills comprising workforce
readiness in the 21st century (e.g., Barton, 2007) and has become an essential
process in education, with teachers frequently assigning projects that require student
collaboration (Ahles & Bosworth, 2004). Despite the perceived importance of team-
work in secondary education, reliable assessments of teamwork at the high school
level are scarce, with existing assessments primarily targeting business organiza-
tions or college students (Loughry, Ohland, & Moore, 2007; Morgeson, Reider, &
Campion, 2005; O’Neil, Wang, & Lee, 2003).
The current study aims to develop a teamwork assessment that (a) targets high
school students, (b) uses multiple methods of measurement (self-report, other-report,
and situational judgment test [SJT] procedures), (c) results in reliable measures, and
(d) yields scores with demonstrable validity evidence. These instruments could be
used to identify students’ teamwork skills, to design intervention programs around
the assessment, and to provide career guidance for students. In addition, information
derived from the assessment procedures may inform the development of teamwork-
training curricula in the high school.
Individual Differences in Teamwork: Conceptual Models
Although models of teamwork differ in the details, conceptual correspondences
between the components of teamwork models suggest five general content areas (e.g.,
O’Neil et al., 2003; Stevens & Campion, 1994). These content areas are (a) task-
related process skills, (b) cooperation with other team members, (c) influencing
team members through support and encouragement, (d) resolution of conflicts
among team members via negotiation strategies, and (e) guidance and mentorship of
other team members. As process skills appear to have a clear cognitive load (and
hence might be most appropriately measured with knowledge assessments), we limit
our definition of teamwork to the four latter content areas in the current study. That is,
teamwork is defined as (a) cooperation with others, (b) influence through support and
encouragement (hereafter referred to as advocate), (c) resolving conflict/negotiating,
and (d) guiding others. A multimethod teamwork assessment system is developed to
cover these content domains.
Assessing Teamwork
Individual differences in teamwork are commonly assessed with self- or other-
reports (e.g., Loughry et al., 2007) and less commonly with SJTs (e.g., Stevens &
Campion, 1999). There are practical concerns with each method of assessment in
isolation: SJTs are difficult to score, self-report ratings may produce response distor-
tion, and other-report ratings may be susceptible to halo effects. Thus, using multiple
methods represents an innovative approach to teamwork assessment, ensuring that
potential measurement issues are limited to one part of the assessment system. In
addition, the relationship between different methods of measurement (self-reports,
SJTs, and teacher-reports) can be examined as an important methodological issue.
Establishing Validity Evidence for Teamwork Scores
Several different types of validity evidence for the teamwork assessment system
are examined. First, the multiple measures should converge to assess the same con-
struct (teamwork) as evidence of convergent validity. Second, limited group differ-
ences (i.e., gender or ethnicity differences) in teamwork scores are expected as group
membership is conceptually unrelated to teamwork, constituting a form of discrimi-
nant validity evidence. Third, teamwork scores should predict students’ grades, as
evidence of criterion validity. Fourth, latent class analyses of test takers’ responses
to the SJT should converge with expert opinion, as evidence for the validity of the
expert scoring rubric. Fifth, developmental and learning trends in teamwork are
expected as a further form of validity evidence. Further background to these claims
follows.
Convergent validity evidence. Scores on the three different measurement methods
should converge, as all are based on the same definition of teamwork. Positive cor-
relations between the different measurement methods are evidence of convergent
validity.
Discriminant validity evidence. The distinctiveness of teamwork scores from
ethnicity and gender constitutes evidence of discriminant validity. Although there is
little literature examining group differences in teamwork, related literatures suggest
that noncognitive assessments reduce adverse impact (e.g., McDaniel, Morgeson,
Finnegan, Campion, & Braverman, 2001). Teamwork may relate to socialization
processes in much the same ways as similar constructs, such as social and emotional
competencies (Zeidner, Matthews, & Roberts, 2009). However, if the teamwork
assessments are purely measures of socialization styles or types of experience
(which differ across groups), then such measures are not truly assessing teamwork
but rather social interaction norms. For this reason, we consider the absence of group
differences on the teamwork assessments to be a form of discriminant validity.
Test-criterion evidence. Test-criterion relations of the teamwork assessments are
evaluated against students’ grades. Given that students’ learning and achievement may
relate to social demands of the classroom, cognitive demands of mastering academic
material, and teachers stressing collaborative approaches to learning, teamwork skills
gain in importance as components of academic success (Ahles & Bosworth, 2004).
Teamwork scores should thus predict school grades. However, it might be the case that
teamwork measures predict grades especially in classes where teamwork is pivotal
(e.g., music, where ensembles or group performances are common) versus those where
it may sometimes be downplayed (e.g., history, where individual assessments are more
common than team projects).
Validity evidence for the SJTs. Although expert scoring is recommended for SJTs
(e.g., McDaniel & Nguyen, 2001), experts may disagree, criteria for teamwork exper-
tise are not obvious, and multiple correct answers to situational items may be possi-
ble. For these reasons, latent class analysis (LCA) is used as a procedure for ensuring
the validity of expert scoring of the teamwork SJT. The LCA identifies qualitatively
different groups of cases based on consistencies in response patterns. An LCA of SJT
items can determine whether there are distinct groups of test takers showing different
patterns of response on the SJT. If discrete groups of test takers have higher or lower
expert-derived scores, this provides evidence that expert scoring is valid.
Development trends in teamwork. Development and learning trends for teamwork
are also examined (i.e., higher teamwork scores are expected for older students as a
product of education and socialization).
Aims of the Study
This study serves three primary purposes: (a) develop a multiple method assess-
ment system to measure teamwork in high school students, (b) provide preliminary
reliability and validity evidence for the teamwork assessment system by examining
each measure in isolation and then by examining the relation between the three meth-
ods (i.e., convergent validity evidence), and (c) provide additional validity evidence
for the assessments by examining relationships between teamwork scores and age,
ethnicity, gender, and academic achievement.
Method
Participants
Participants were 159 high school students (51.6% female adolescents) partici-
pating in Ford Partnership for Advanced Studies (PAS) courses, whose teachers
indicated a willingness to participate in the teamwork study. The Ford PAS is a set
of curriculum modules linking classroom education to real-world employment appli-
cations, through developing links between schools and local colleges, universities,
and businesses. The learning process is based on inquiry and collaborative, project-
based learning experiences and aims to teach students four essential workplace
skills: teamwork, critical thinking, problem solving, and communication (see
http://www.fordpas.org). Participants’ mean age was 16.10 years (SD = 1.03). Participants’
self-identified ethnicity was 64.2% African American, 18.9% White, 3.1% Hispanic,
and 13.8% American Indian, Asian, or Other. Although this sample is not representa-
tive of U.S. high school students, it is consistent with the student population gener-
ally participating in the Ford PAS program.
Measures
Teamwork
A self-report rating scale, SJT, and behaviorally anchored teacher-rating scale were
developed to assess the four content domains of teamwork. All three measures were
developed by specialists and reviewed by two expert panels: (a) content experts (cur-
riculum developers and educators who have taught teamwork curricula) and (b) fairness
and sensitivity experts (i.e., individuals trained to meet guidelines according to estab-
lished standards designed to reduce bias and promote legal and public defensibility).
Self-report teamwork assessment. There were 57 items developed to assess Cooperation
(15 items), Advocate (12 items), Negotiate (17 items), and Guiding Others
(13 items). Students responded to these items on a 6-point scale from never to always
(see Table 1 for sample items).
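The scoring of a self-report subscale can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the 6-point scale is coded 1 (never) to 6 (always), that reverse-scored items are flipped before scoring, and that a subscale score is the mean of its items. The item ids and responses are hypothetical.

```python
from statistics import mean

# Assumed numeric coding of the 6-point scale: 1 = "never" ... 6 = "always".
SCALE_MIN, SCALE_MAX = 1, 6


def reverse_score(rating):
    """Flip a rating so that 1 becomes 6, 2 becomes 5, and so on."""
    return SCALE_MIN + SCALE_MAX - rating


def subscale_score(ratings, reverse_keyed=()):
    """Average one student's ratings on a subscale.

    ratings: dict mapping a (hypothetical) item id to a 1-6 rating.
    reverse_keyed: ids of items that must be flipped before averaging.
    """
    adjusted = [
        reverse_score(r) if item in reverse_keyed else r
        for item, r in ratings.items()
    ]
    return mean(adjusted)


# Hypothetical responses to three items; "n7" stands in for a
# reverse-scored item such as "Dislike people challenging views".
student = {"n1": 5, "n2": 4, "n7": 2}
print(subscale_score(student, reverse_keyed={"n7"}))
```

Averaging rather than summing keeps subscale scores on the original 1 to 6 metric, which makes subscales with different numbers of items directly comparable.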
SJT assessment. Curricula materials and student exercises supplied by Ford PAS
were used to develop scenarios and response options. Two scenarios were developed
for each of the four components, such that the SJT represented all four content com-
ponents (however, due to limited testing time, we did not attempt to develop enough
items to examine multiple teamwork factors). For each scenario, students rated the
effectiveness of four response options on a 5-point scale, from very ineffective to
very effective. A panel of three assessment specialists in educational and psycho-
logical testing decided on the best response (with final decision based on the group
reaching consensus), and the test taker’s rating of this response was used to score
each item. A sample item is described below.
Table 1
Factor Loadings of the Revised Self-Report Teamwork Scale

| Item Content | Loading on Own Factor | Cross-Loading | M (SD) |
|---|---|---|---|
| Cooperate | | | |
| I enjoy bringing team members togetherᵃ | .75 | | 4.45 (1.29) |
| Sharing ideas | .74 | | 4.58 (1.20) |
| Acknowledging peers’ accomplishments | .71 | | 4.71 (1.19) |
| Helping team | .67 | | 4.77 (1.19) |
| Valuing different perspectives | .67 | | 4.40 (1.26) |
| Providing feedback | .60 | | 4.16 (1.18) |
| Exchanging creative ideas | .60 | | 4.84 (1.26) |
| Cooperating with students | .53 | | 4.72 (1.12) |
| Enjoying team activities | .46 | | 4.43 (1.33) |
| Inspired by others | .46 | | 4.09 (1.25) |
| Contributing team’s goals | .43 | | 4.50 (1.22) |
| Respecting peer opinions | .41 | .34 | 4.76 (1.11) |
| Advocate/Guide | | | |
| I like to be in charge of groups or projectsᵃ | .70 | | 3.75 (1.42) |
| Help others see things my way | .65 | | 4.03 (1.12) |
| Convincing peers | .57 | | 3.80 (1.19) |
| Believe good leaders | .56 | | 4.68 (1.36) |
| Comfortable providing criticism | .44 | | 3.80 (1.46) |
| Persuading peers attentively | .43 | | 4.30 (1.29) |
| Influencing peers | .41 | | 3.61 (1.44) |
| Suggesting solutions | .30 | | 3.99 (1.23) |
| Constructively argue | .23 | | 4.15 (1.27) |
| Negotiate | | | |
| I am a good listenerᵃ | .72 | | 4.91 (1.11) |
| Open to varying opinions | .65 | | 4.64 (1.23) |
| Take others’ interests into account | .55 | | 4.32 (1.28) |
| Adaptable in team | .53 | | 4.16 (1.15) |
| Flexible in team | .51 | | 4.68 (1.16) |
| Find best solution | .49 | | 4.38 (1.37) |
| Dislike people challenging viewsᵇ | .38 | | 3.92 (1.29) |
| Understanding team member differences | .31 | .30 | 5.33 (1.04) |
| Consider team first | .29 | | 3.98 (1.36) |

Note: For better readability, factor loadings that are less than .20 are not listed.
ᵃ Items are complete examples.
ᵇ Reverse-scored item.
You are part of a study group that has been assigned a large presentation for class. As
you are all dividing up the workload, it becomes clear that both you and another member
of the group are interested in researching the same aspect of the topic. Your colleague
already has a great deal of experience in this area, but you have been extremely
excited about working on this part of the project for several months. Rate the following
approach to dealing with this situation:
(a) Flip a coin to determine who gets to work on that particular aspect of the assignment.
(b) Insist that, for the good of the group, you should work on that aspect of the
assignment because your interest in the area means you will do a particularly good job.
(c) Compromise your preferences for the good of the group and allow your friend to work
on that aspect of the assignment [best response by expert judgment].
(d) Suggest to the other group member that you both share the research for that aspect
of the assignment and also share the research on another less-desirable topic.
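The SJT scoring rule described above (each item is scored by the test taker's rating of the expert-keyed response) can be sketched as follows. The expert key is taken from the note to Figure 1; the data layout, the function names, and the use of a mean across the eight keyed ratings as the total score are assumptions made for illustration.

```python
from statistics import mean

# Expert-keyed best option for Scenarios 1-8, as listed in the note to Figure 1.
EXPERT_KEY = ["c", "d", "d", "c", "d", "d", "c", "d"]


def sjt_item_scores(ratings):
    """Pick out the rating a student gave to each keyed option.

    ratings: list of eight dicts, one per scenario, mapping option
    letters "a"-"d" to an effectiveness rating from 1 to 5.
    """
    return [scenario[key] for scenario, key in zip(ratings, EXPERT_KEY)]


def sjt_total(ratings):
    """Mean of the eight keyed ratings (the averaging step is an assumption)."""
    return mean(sjt_item_scores(ratings))


# Hypothetical student who rates every keyed option 4 and all other options 2.
student = [
    {opt: (4 if opt == key else 2) for opt in "abcd"} for key in EXPERT_KEY
]
print(sjt_item_scores(student))
print(sjt_total(student))
```

Under this rule only 8 of the 32 ratings contribute to the score, which is why the later analyses distinguish the eight key items from the full set of responses.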
Teacher-rating scale. Teachers evaluated each student’s level of teamwork
against ten behaviorally anchored items. A sample item follows:
When working on a group goal or project, this student
(1) (2) (3) (4) (5)
Anchors, from low to high: ignores or does not notice others’ ideas or suggestions;
listens to others’ contributions; always listens to others and respects their contributions.
Self-Reported Grades
Students reported their grades in reading/language arts, math, science, social science,
art, and music from the previous semester. A total of 151 grades were reported
for reading, 150 for math, 146 for science, 135 for social science, 49 for art,
and 41 for music.
Procedure
Participants were tested in class (20 students per class) with teachers reporting on
their students’ performance during this time. All tests were administered in paper-
and-pencil format and were self-paced. The entire student protocol lasted 45 min. All
protocols were approved by the Educational Testing Service human ethics review
committee.
Data Analysis Steps
Exploratory factor analyses (EFAs). Separate EFAs using principal factor analy-
sis with promax rotation were conducted for the student self-report scale, SJT (eight
key items that experts selected as the best), and teacher-report scale. Problematic
items (loadings <.30 or cross-loadings to multiple factors) were removed to reduce
the item pool. Parallel analysis was conducted to determine the number of factors for
each scale, using a SAS macro program that allows parallel analysis of ordinal data
such as rating scale items (Liu & Rijmen, 2008).
Confirmatory factor analyses (CFAs). The CFAs were then used to compare
structural models for the self-report scale (to determine whether highly correlating
latent factors are sufficiently different from each other to constitute separate factors).
The following rules of thumb were used to evaluate fit statistics: (a) acceptable fit:
root mean square error of approximation (RMSEA) ≤ .08, comparative fit index
(CFI) ≥ .90; (b) good fit: RMSEA ≤ .05, CFI ≥ .95 (e.g., Hu & Bentler, 1999; Marsh,
Hau, & Wen, 2004). However, Hu and Bentler (1995) also suggest that the rule of
thumb is not absolute because it does not work equally well with various types of
indices, sample sizes, estimators, or distributions. Both EFAs and CFAs were con-
ducted with Mplus (Muthén & Muthén, 1998-2007).
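The rules of thumb above can be written as a small classifier. This is only a sketch: requiring both indices to meet a cutoff is an assumption (the text lists the cutoffs without saying how to combine them), and the function name is hypothetical.

```python
def classify_fit(rmsea, cfi):
    """Apply the RMSEA/CFI rules of thumb quoted in the text.

    Returns "good", "acceptable", or "poor". Both indices must meet a
    cutoff for the label to apply, which is an assumption of this sketch.
    """
    if rmsea <= 0.05 and cfi >= 0.95:
        return "good"
    if rmsea <= 0.08 and cfi >= 0.90:
        return "acceptable"
    return "poor"


# One-factor SJT model reported in the Results (CFI = .96, RMSEA = .049).
print(classify_fit(rmsea=0.049, cfi=0.96))
# Initial four-factor self-report model (CFI = .56, RMSEA = .08).
print(classify_fit(rmsea=0.08, cfi=0.56))
```

As the cited caveat suggests, such cutoffs are guides rather than tests, so a model that misses one threshold narrowly may still be worth interpreting.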
LCA. The LCA was applied to the eight expert-ratified SJT item responses, to
see whether the obtained latent classes converge with expert scoring. Only the eight
key items were used in the LCA, as the small sample size and nested item-scenario
data structure meant that a model based on all 32 responses would not be
reliable. Akaike information criterion (AIC), Bayesian information criterion (BIC),
and profile plots were used to select the number of classes.
Correlational analyses. Correlations among self-report, teacher-report, and SJT
factor scores were calculated to determine whether the scores showed convergent
validity evidence. Correlations of teamwork factor scores with demographic vari-
ables and course grades were also calculated, providing further validity evidence.
Results
Scale Dimensionality
Self-report scale. A four-factor CFA based on the theoretical assignment of items
to factors fit data poorly (e.g., CFI = .56; RMSEA = .08), suggesting that a four-
factor solution was not the best fit to the data. The initial EFA was then conducted
on the correlation matrix, using a four-factor solution suggested by parallel analysis.
However, results indicated some potentially problematic items due to negative factor
loadings, items that did not load saliently (>.30) on any factor, or items that showed
multiple loadings (>.40) on different factors. In addition, these four factors did not
correspond to the four subscales originally postulated. A sequential deletion method
was then used to eliminate problematic items. Items with negative loadings were
first eliminated, and then items with low loadings and multiple loadings were
eliminated. On the basis of item analyses, 27 items (13 of which were reverse-coded)
were dropped. An EFA was conducted on these discarded items, and no clear factor
pattern was identified. These 27 items were excluded from further analysis.¹ The
remaining items were then reanalyzed using EFA. Results from the scree plot and
parallel analysis indicate a three-factor solution. Factor loadings from this analysis
are shown in Table 1.
The reduced self-report scale consisted of 30 items and captured three factors:
(a) Cooperation (12 items), (b) Advocate/Guide (9 items), and (c) Negotiation (9 items).
Cooperation includes tendencies to bring ideas together, seek solutions, and provide
feedback to team members. Advocate/Guide refers to tendencies to direct others,
provide appropriate suggestions and criticism, and persuade others (i.e., the factor
primarily represents advocating content, although elements of guiding others, which
did not emerge as a separate factor, also appear here). Negotiation includes students’
tendency to listen, to adapt to change while there are conflicts, and the ability to
solve conflicts. Cronbach’s alphas for the three retained factors were acceptable (i.e.,
.88, .80, and .78, respectively). Table 1 also provides the mean and standard devia-
tion of each item.
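For readers unfamiliar with the reliability coefficient reported above, Cronbach's alpha can be computed from raw item ratings as in the following sketch. The ratings are made up for illustration and are not study data; the function name is hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item ratings.

    rows: list of equal-length lists, one list of item ratings per respondent.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose to one column per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Made-up ratings from four respondents on three items (not study data).
toy = [[4, 5, 4], [2, 3, 3], [5, 5, 6], [3, 4, 3]]
print(round(cronbach_alpha(toy), 2))
```

Alpha rises with both the average inter-item correlation and the number of items, so subscales of different lengths are not strictly comparable on this coefficient alone.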
For CFA, items were assigned to one of the three dimensions based on the factor
loading matrix from the EFA shown in Table 1. Fit indices from a CFA based on
these three factors (with no cross-loadings) are shown in Table 2. The RMSEA indi-
cates good fit, although the CFI is slightly below conventional estimates (at .85
rather than .90). Correlations among the latent variables were relatively high: .66
(Cooperation, Advocate/Guide), .79 (Cooperation, Negotiation), and .59 (Advocate/
Guide, Negotiation). For this reason, we fitted three alternative two-factor models to
test if the three factors were statistically distinct: (a) Cooperation and Advocate/
Guide combine to form one factor (i.e., correlation between them equals 1.00 and
correlations of each factor with the Negotiation factor were also constrained to be
equal); (b) Cooperation and Negotiation combine to form one factor; and (c) Advocate/
Guide and Negotiation combine to form one factor. Results of the likelihood ratio
tests shown in Table 2 indicate that combining any two factors into one significantly
lowers model fit.
SJT. Scree and parallel analysis of the eight SJT items indicated a one-factor solu-
tion. The Cronbach alpha was .71. Fit indices from a one-factor CFA model indi-
cated good fit (CFI = .96, RMSEA = .049).
Unrestricted LCA models with one to four classes were fitted to the eight key SJT
items with raw data used as input. Table 3 displays the goodness-of-fit indices for
these LCA models. The AIC indicates a three-class solution, BIC a two-class solu-
tion. Profile plots (mean ratings of the 32 item responses for each class) were gener-
ated for both the two-class LCA model (Figure 1a) and the three-class LCA model
(Figure 1b). The plots show that Class 1 and Class 3 in the three-class model
were not visually distinguishable. However, the two classes from the two-class model
discriminated very clearly. Combining the results from AIC, BIC, and profile dis-
crimination abilities, a two-class model was selected to represent the dimensionality
of the SJT scale (one dimension, two classes). From Figure 1a, we can also see that
students in Class 1 rated all of the eight expert-endorsed response options more
highly than those in Class 2. This result provides a form of validity evidence for
the expert keys. There was also more variation across reactions/items within each
scenario for Class 1 than Class 2 (i.e., Class-1 students can better differentiate reac-
tions within each scenario than Class-2 students). Therefore, we label Class-1 as the
high teamwork skill group and Class-2 as the low teamwork skill group.
Teacher-report scale. The scree and parallel analysis with EFA showed that the
teacher evaluation scale was unidimensional, with the first factor explaining 83%
of the variance of the 10 items. The Cronbach alpha of the scale was .98. The mean
of the composite scores of these 10 items was 3.35 (SD = 1.14).
Table 2
Fit Indices from the Confirmatory Factor Models for the Self-Report Teamwork Scores

| Fit Indices | Three-Factor Model | Correlation fixed to 1.00: F1, F2 | Correlation fixed to 1.00: F1, F3 | Correlation fixed to 1.00: F2, F3 |
|---|---|---|---|---|
| Chi-square | 651.5 | 743.8 | 716.1 | 741.6 |
| df | 402 | 404 | 404 | 404 |
| CFI | .84 | .79 | .81 | .80 |
| RMSEA | .06 | .07 | .07 | .07 |
| Likelihood ratio test | | 92.3/2 | 64.6/2 | 90.1/2 |

Note: F1 = Cooperation; F2 = Advocate/Guide; F3 = Negotiation; CFI = comparative fit index; RMSEA =
root mean square error of approximation.
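The likelihood ratio entries and RMSEA values in Table 2 can be checked from the reported chi-square statistics, as in the sketch below. It assumes N = 159 (the sample size given in the Method section) and the common (N - 1) form of the RMSEA point estimate; the authors' software may compute these quantities slightly differently. The closed-form p value used here applies only to a chi-square difference with exactly 2 degrees of freedom.

```python
import math

N = 159  # sample size reported in the Method section (assumed to apply here)

# (chi-square, df) values copied from Table 2.
three_factor = (651.5, 402)
two_factor = {"F1+F2": (743.8, 404), "F1+F3": (716.1, 404), "F2+F3": (741.6, 404)}


def rmsea(chi2, df, n):
    """Point estimate of RMSEA, using the common (n - 1) denominator."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def chi2_diff_p_df2(delta_chi2):
    """Upper-tail p value of a chi-square statistic with exactly 2 df.

    For 2 df the survival function has the closed form exp(-x / 2).
    """
    return math.exp(-delta_chi2 / 2)


print(round(rmsea(*three_factor, N), 2))
for name, (chi2, df) in two_factor.items():
    delta = chi2 - three_factor[0]
    assert df - three_factor[1] == 2  # closed form above is only valid for 2 df
    print(name, round(delta, 1), chi2_diff_p_df2(delta) < 0.001)
```

Each constrained model differs from the three-factor model by 2 degrees of freedom, so a chi-square difference as large as those in the table corresponds to a very small p value, consistent with the conclusion that merging any two factors worsens fit.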
Table 3
Goodness-of-Fit Indices for One- through Four-Class LCA Models for the SJT Teamwork Scores

| Latent Class | Log Likelihood | Number of Parameters | AIC | BIC |
|---|---|---|---|---|
| 1 | −1541.1 | 31 | 3144.2 | 3238.5 |
| 2 | −1428.4 | 63 | 2982.9 | 3174.6 |
| 3 | −1382.9 | 95 | 2955.9 | 3245.0 |
| 4 | −1361.4 | 127 | 2976.7 | 3363.2 |

Note: AIC = Akaike information criterion; BIC = Bayesian information criterion.
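The information criteria in Table 3 follow from the log likelihoods and parameter counts, as the sketch below illustrates. The AIC values recomputed this way agree with the table to rounding. The BIC depends on the sample size; N = 159 is an assumption here, and the resulting values are within a few points of the tabled ones, which may reflect a slightly smaller effective sample in the original analysis. The preferred number of classes is the same either way.

```python
import math

# (number of classes, log likelihood, number of parameters) from Table 3.
MODELS = [(1, -1541.1, 31), (2, -1428.4, 63), (3, -1382.9, 95), (4, -1361.4, 127)]
N = 159  # assumed effective sample size; the study may have used fewer cases


def aic(loglik, k):
    """Akaike information criterion: -2 log L + 2k."""
    return -2 * loglik + 2 * k


def bic(loglik, k, n):
    """Bayesian information criterion: -2 log L + k ln(n)."""
    return -2 * loglik + k * math.log(n)


# Lower values indicate a better trade-off between fit and complexity.
best_aic = min(MODELS, key=lambda m: aic(m[1], m[2]))[0]
best_bic = min(MODELS, key=lambda m: bic(m[1], m[2], N))[0]
print(best_aic, best_bic)
```

Because the BIC penalty per parameter, ln(n), is larger than the AIC penalty of 2 whenever n exceeds about 7, the BIC tends to favor the more parsimonious solution, as it does here.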
Figure 1
Mean Profile Plots With the Latent Class Analyses (Two vs. Three Classes)
[Panel a: two-class model (Class 1, Class 2). Panel b: three-class model (Class 1, Class 2,
Class 3). In both panels the x axis shows response options a to d for each of the eight
scenarios (1-8) and the y axis shows the mean rating (1 to 5).]
Note: The y axis represents rated means of effectiveness of each reaction across latent classes. The x axis
represents reaction (a, b, c, and d) and scenario numbers (1-8). For Scenarios 1-8, the most effective
reactions rated by experts are c, d, d, c, d, d, c, and d.
Relationships between Three Evaluation Methods
Table 4 displays the correlations of the five teamwork scales (Cooperation, Advocate/
Guide, Negotiation, SJT, and teacher report). The SJT assessment was significantly
correlated with all three self-report scores at around the same magnitude (as might be
expected as scenarios covered the same teamwork subcomponents). The SJT and the
teacher report also shared a moderate and positive relationship. Teacher-report scores
were significantly correlated with two of the student self-report scores: Cooperation
and Advocate/Guide.
To further investigate the relationship between the three assessment methods,
self-report and teacher-report scores were also compared across the two SJT latent
classes, and the results are given in Table 5. The high-teamwork-skill class scored
higher on the three self-report scales than the low-teamwork class, although no sig-
nificant differences between two classes were observed for the teacher-report scale.
Table 4
Correlations between all Teamwork Scores

| Teamwork Score | Cooperation (Self-Report) | Advocate/Guide (Self-Report) | Negotiation (Self-Report) | SJT | Teacher Report |
|---|---|---|---|---|---|
| Cooperation | .88 | | | | |
| Advocate/guide | .66** | .80 | | | |
| Negotiation | .79** | .59** | .78 | | |
| SJT | .52** | .47** | .60** | .71 | |
| Teacher report | .19* | .32** | .14 | .33** | .98 |

Note: Cronbach’s alpha reliability shown on diagonal. SJT = situational judgment test.
*p < .05. **p < .01.
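The significance flags in Table 4 can be approximated with Fisher's z transformation, as sketched below for the teacher-report row. This is an approximation rather than the authors' test, and it assumes that all 159 participants contributed to every correlation, which the source does not state.

```python
import math
from statistics import NormalDist


def fisher_z_p(r, n):
    """Two-sided p value for H0: rho = 0, via Fisher's z transformation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


N = 159  # assumes every pair of scores is available for all participants

# Teacher-report correlations with the three self-report factors (Table 4).
for label, r in [("Cooperation", 0.19), ("Advocate/guide", 0.32), ("Negotiation", 0.14)]:
    p = fisher_z_p(r, N)
    stars = "**" if p < 0.01 else "*" if p < 0.05 else ""
    print(f"{label}: r = {r:.2f}{stars}")
```

With a sample of this size, correlations of roughly .16 or larger reach the .05 level, so the nonsignificant .14 for Negotiation and the significant .19 for Cooperation fall on either side of that boundary.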
Table 5
Teamwork Scores Compared by Latent Classes

| Teamwork Score | High Teamwork Skills (n = 64), M (SD) | Low Teamwork Skills (n = 95), M (SD) | t |
|---|---|---|---|
| Cooperation | 4.74 (0.74) | 4.32 (0.81) | 3.38** |
| Advocate/guide | 4.14 (0.81) | 3.88 (0.78) | 2.00* |
| Negotiation | 4.59 (0.68) | 4.36 (0.78) | 2.01* |
| SJT | 4.27 (0.35) | 3.50 (0.44) | 11.85** |
| Teacher report | 3.43 (1.05) | 3.28 (1.25) | 0.81 |

Note: SJT = situational judgment test.
*p < .05. **p < .01.
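A t statistic of the kind shown in Table 5 can be computed from the group means, standard deviations, and sizes alone. The sketch below uses the unequal-variance (Welch) form, which is an assumption: the source does not say which form was used, and because the table reports rounded means and SDs, the rows need not reproduce exactly.

```python
import math


def welch_t(m1, sd1, n1, m2, sd2, n2):
    """t statistic for two independent groups with unequal variances."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (m1 - m2) / se


# Cooperation row of Table 5: high-teamwork class vs. low-teamwork class.
t = welch_t(4.74, 0.74, 64, 4.32, 0.81, 95)
print(round(t, 2))
```

The same function can be applied to the age comparison between the two latent classes, again only as an approximate check on the reported value.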
Relationships between Teamwork Scores and Demographic Variables
No significant gender or ethnic differences were found for the self-report subscales,
teacher-report scores, or SJT scores. Age was positively correlated with the three self-report
scores (r = .31 to .35, p < .01) and the SJT score (r = .32, p < .01) but not significantly with
the teacher-report score. Participants in the high-teamwork latent class were significantly
older than participants in the low-teamwork latent class (M = 16.31, SD = 0.96 vs.
M = 15.91, SD = 1.07; t = 2.42, p < .05). Age significantly predicted all teamwork mea-
sures when controlling for the number of Ford PAS modules a student had undertaken
(partial r = .25 to .30 for self-reports, .28 for the SJT, and .18 for the teacher report,
p < .05 in all cases). However, the number of Ford PAS modules undertaken was not a
significant predictor of any teamwork measure after controlling for the age variable.
Relationships between Teamwork Scores and Course Grades
The SJT scores did not correlate significantly with students’ grades. The teacher-
report correlated significantly with math, science, and social studies grades (r = .21, .30,
and .27, respectively, p < .01). Cooperation correlated moderately with science and
music grades (r = .18 and .38, p < .05), whereas Advocate/Guide correlated positively
with science (r = .32, p < .01), social science (r = .19, p < .05), and music grades (r = .40,
p < .01). Negotiation shared a positive correlation with music grades only (r = .50,
p < .01). A grades composite was calculated by taking the means of the different course
grades. Only Advocate/Guide scores of the self-report scale and total teacher-report
score were significantly correlated with the grades composite (r = .25, p < .01).
Discussion
Scores on the three teamwork assessments (a) related to each other, (b) were
unrelated to ethnicity or gender, (c) increased with age, and (d) related to school
grades. Generally, the aims of the study were met: A multiple-method assessment
system was developed for high school students, and scores from this assessment
suite showed evidence of convergent and criterion-related validity. Although all
teamwork assessments were related, there were some distinguishing features of the
different methods, with related implications for the use and purposes of such assess-
ments. These issues are discussed below.
Relationships among Teamwork Measures
The strongest relationship between self- and teacher-reports was for
Advocate/Guide, perhaps because this factor is more obvious to external observers
(Cooperation and Negotiation appear less open to observation). The SJT was less strongly related
to the teacher-report than the self-report scales. Plausibly, if educators wish to
measure constructs that are not obviously and frequently observable in students’
behaviors, they may need to supplement teacher-ratings with reports from observers who
know the student well (e.g., family members, peers). Generally, these results support
the construct validity of the scores but indicate that different measurement methods
may capture different aspects of the teamwork construct.
Developmental Trajectory of Teamwork over Late Adolescence
This study found significant positive correlations of age with self-report and SJT
teamwork scores. As these students were involved in teamwork courses run by Ford
PAS, results may be due to learning effects (as students in higher grades often
take more courses than students in lower grades). However, correlations remained
significant after controlling for the number of Ford PAS modules taken, indicating that
increases are due to maturation rather than course exposure. This outcome has
important implications for using such assessments for program evaluation, as
increases in teamwork would have to be compared with natural developmental gains.
That said, it should be noted that this study was not a controlled intervention design.
Future studies might be undertaken to investigate intervention effects by using a
pre–post and control group design and having the multiple-method assessment of
teamwork serve as the dependent variable(s).
Does Teamwork Predict Academic Achievement?
Self-reported teamwork correlated with different course grades to varying degrees.
Only the teacher-report and the Advocate/Guide self-report score predicted the grades
composite, although several measures predicted individual subject grades, with the
strongest relationship found for music. Although music had the lowest number of cases
(n = 41), this relationship makes conceptual sense. Of all the subjects measured,
academic performance in music depends most on teamwork: Playing pieces as a group
forms an essential part of the subject, with the negotiation of piece choice, solos, and
group practice times playing a role in final performance and grade. Such a focus on
team performance is not essential for many other subjects, although it is certainly
relevant to the performing arts, debating, or team sports (although note that the importance
of teamwork might differ depending on pedagogical approach to learning, for example
with the mathlete competitive teams for mathematics). In short, overall grade point
average may be too broad a variable for teamwork to predict; that is, teamwork may be
more useful in predicting those aspects of academic performance where it is stressed.
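Because the music correlations rest on a small number of cases, it may help to see how wide the uncertainty around such a coefficient is. The sketch below applies the standard Fisher z transformation to the reported Negotiation–music correlation (r = .50), assuming it was computed on all 41 music cases; that assumption, the function name, and the 95% level are ours, and the interval is illustrative rather than a result reported by the authors.

```python
import math


def fisher_ci(r: float, n: int, z_crit: float = 1.96) -> tuple:
    """Approximate confidence interval for a Pearson r via Fisher's z."""
    if n <= 3:
        raise ValueError("Fisher's z interval needs n > 3")
    z = math.atanh(r)                 # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# r = .50 and n = 41 come from the text; treating n = 41 as the sample
# size for this particular correlation is an assumption.
low, high = fisher_ci(0.50, 41)
print(f"95% CI for r = .50, n = 41: [{low:.2f}, {high:.2f}]")
```

An interval this wide is consistent with the caution the paragraph expresses about the small music sample.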
Comparison of Self-report, SJT, and Teacher-Report Scores
There are advantages and disadvantages to each assessment method used in this
study, and an argument could be made that these complement and offset each other.
The primary problem with self-reports is that these may be susceptible to impression
management: students may fake good or even fake bad. Teacher-reports are
relatively more objective and may reduce faking problems. However, it is unlikely that a
teacher could observe all aspects of students’ teamwork skills; thus, teacher-ratings
may not cover the entire spectrum of the teamwork construct (Youngstrom, Loeber,
& Stouthamer-Loeber, 2000). In addition, teacher-ratings of students may be difficult
to implement in practice, given the teacher workload of such an activity in a typical
20- to 30-student classroom. Indeed, the zero variability found for 35 teacher-report cases
in this study might indicate that teachers were fatigued by this assessment procedure.
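A simple screening step can flag teacher-report forms of this kind before analysis. The sketch below assumes that "zero variability" means a form on which every item received the same rating; the article does not define the term, and the function name, student IDs, and ratings are hypothetical.

```python
def has_zero_variability(item_ratings: list) -> bool:
    """True when every item rating on a completed form is identical."""
    if not item_ratings:
        raise ValueError("no ratings supplied")
    return len(set(item_ratings)) == 1


# Hypothetical teacher-report forms (NOT study data): student ID -> item ratings.
teacher_reports = {
    "student_01": [4, 3, 5, 4, 4],
    "student_02": [3, 3, 3, 3, 3],
    "student_03": [5, 4, 4, 5, 3],
}

flagged = [
    student_id
    for student_id, ratings in teacher_reports.items()
    if has_zero_variability(ratings)
]
print(flagged)
```

Flagged forms need not be invalid; a uniform rating can be a genuine judgment, so a check like this is better treated as a prompt for review than as an exclusion rule.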
A further advantage of teacher assessments is that ratings are not confounded by the
verbal ability of the student. Students who are English language learners, for example,
may not understand some items and thus provide answers indicating poor teamwork
due to lack of language comprehension rather than poor teamwork. This potential
confound might be particularly problematic for SJTs, which involve large amounts of
text. However, presenting SJT items via video or audio may ameliorate this problem.
In addition, SJTs may detect subtle judgment processes by asking participants to
provide intuitive judgments about ecologically valid scenarios. This ecological
validity may also make SJT items more engaging than traditional self-reports, although
such context-rich material makes objective standards for scoring difficult. In this
study, however, the LCA of students’ responses provided independent confirmation
of expert judgments. Classes that disagreed/agreed with the expert scoring-key were
identified, providing evidence for the validity of the expert judgments. This novel
approach to ascertaining the validity of expert opinion could be usefully extended
to SJTs assessing workplace competencies, tacit knowledge, or emotional
intelligence. Although the small sample size in this study made a multilevel analysis of all
32 responses (4 × 8 scenarios) impossible, multilevel LCA models could be
developed in the future to accommodate all responses in a ratings-based SJT.
Limitations of the Current Study
As with any new assessment developed, there is a clear need to conduct additional
studies. In the present instance, this includes (a) formally testing factor structure across
additional, disparate samples (e.g., non-Ford PAS students, ethnic groups), (b) getting
a better understanding of the extent to which group performance factors into grades in
specific subjects in schools, and (c) formally conducting a multimethod multitrait design
(by, for example, having more SJTs to possibly explore the dimensionality of the
construct). Clearly too, it would be especially important to conduct a longitudinal
study of these teamwork assessments to more fully understand developmental trends
and possible causal mechanisms.
Future Applications for Teamwork Research in High Schools
Subject to certain caveats, including the need for further validity studies, there are
several useful ways that this teamwork instrument might be applied in high schools.
First, this multiple-method assessment might be used for early identification and
primary intervention, as deficits in teamwork can potentially harm students’ higher
education, career opportunities, and quality of life. Certainly, it would be beneficial
to identify students with deficits in teamwork sooner rather than later and provide
appropriate remediation. Students high on teamwork might be selected as mentors
or role models for students low on teamwork, with study or project groups composed
accordingly. In addition, feedback and suggestions for improvement might be given
to students based on their own unique profile of teamwork skills.
Second, the instrument might be used to gauge the effects of training. Programs
emphasizing teamwork (e.g., Ford PAS) are already implemented in some schools.
The multiple-method assessment could help determine those aspects of teamwork that
might be most amenable to training and fine-tune the programs accordingly. Third, the
instrument might be used as a form of career guidance/advisement, in conjunction with
cognitive tests and interest inventories. For example, students with very high
negotiation skills might be directed toward courses or careers where these skills might prove
valuable.
Overall, this study suggests some promising new directions in teamwork research
and its application in high schools. A reliable multiple-method teamwork
assessment system was developed with promising validity evidence. Such an instrument
might profitably be used for manifold purposes in high schools, with multiple
methods a useful technique for overcoming the practical limitations evident in giving
any assessment in isolation.
Note
1. After exclusion, only one reverse-keyed item remained. The phenomenon where reverse-keyed
items do not load on the same factors as non-reverse-keyed items (and hence are removed from the item
pool) is not isolated to the current dataset but is a measurement issue that has been commented on
frequently in the literature (e.g., Barnette, 2000). It is as yet unresolved how to concurrently deal with this
issue and also control for acquiescence or other response sets.
References
Ahles, C. B., & Bosworth, C. C. (2004). The perception and reality of student and workplace teams.
Journalism & Mass Communication Educator, 59, 42-59.
Barnette, J. J. (2000). Effects of stem and Likert-response option reversals on survey internal consistency:
If you feel the need, there is a better alternative to using those negatively worded stems. Educational
and Psychological Measurement, 60, 361-370.
Barton, P. E. (2007). What about those who don’t go? Educational Leadership, 64, 26-27.
Hu, L.-T., & Bentler, P. M. (1995). Evaluating model fit. In R. H. Hoyle (Ed.), Structural equation
modeling: Concepts, issues, and applications (pp. 76-99). Thousand Oaks, CA: SAGE.
Hu, L.-T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis:
Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary
Journal, 6, 1-55.
Liu, O. L., & Rijmen, F. (2008). A modified procedure for parallel analysis for ordered categorical data.
Behavior Research Methods, 40, 556-562.
Loughry, M. L., Ohland, M. W., & Moore, D. D. (2007). Development of a theory-based assessment of
team member effectiveness. Educational and Psychological Measurement, 67, 505-524.
Marsh, H. W., Hau, K. T., & Wen, Z. (2004). In search of golden rules: Comment on hypothesis-testing
approaches to setting cutoff values for fit indexes and dangers in overgeneralizing Hu and Bentler’s
(1999) findings. Structural Equation Modeling: A Multidisciplinary Journal, 11, 320-341.
McDaniel, M. A., Morgeson, F. P., Finnegan, E. B., Campion, M. A., & Braverman, E. P. (2001). Use of
situational judgment tests to predict job performance: A clarification of the literature. Journal of
Applied Psychology, 86, 730-740.
McDaniel, M. A., & Nguyen, N. T. (2001). Situational judgment tests: A review of practice and constructs
assessed. International Journal of Selection and Assessment, 9, 103-113.
Morgeson, F. P., Reider, M. H., & Campion, M. A. (2005). Selecting individuals in team settings: The
importance of social skills, personality characteristics, and teamwork knowledge. Personnel Psychology,
58, 583-611.
Muthén, L. K., & Muthén, B. O. (1998-2007). Mplus user’s guide (4th ed.). Los Angeles, CA: Author.
O’Neil, H. F., Jr., Wang, S., Jr., & Lee, C. (2003). Assessment of teamwork skills via a teamwork
questionnaire. In H. F. O’Neil Jr. & R. S. Perez (Eds.), Technology applications in education: A learning
view (pp. 283-303). Mahwah, NJ: Erlbaum.
Stevens, M. J., & Campion, M. A. (1994). The knowledge, skill, and ability requirements for teamwork:
Implications for human resource management. Journal of Management, 20, 503-530.
Stevens, M. J., & Campion, M. A. (1999). Staffing work teams: Development and validation of a selection
test for teamwork settings. Journal of Management, 25, 207-228.
Youngstrom, E. A., Loeber, R., & Stouthamer-Loeber, M. (2000). Patterns and correlates of agreement
between parent, teacher, and male youth behavior ratings. Journal of Consulting and Clinical Psychology,
68, 1-13.
Zeidner, M., Matthews, G., & Roberts, R. D. (2009). What we know about emotional intelligence: How
it affects learning, work, relationships, and our mental health. Cambridge, MA: MIT Press.
Lijuan Wang obtained her PhD degree in quantitative psychology at the University of Virginia in 2008
and is working as an assistant professor in the Department of Psychology at the University of Notre
Dame.
Carolyn MacCann received her PhD in psychology from the University of Sydney in 2006, before
undertaking a 2-year postdoctoral fellowship at the Educational Testing Service, Princeton. Her research
focuses on social and emotional competencies and noncognitive constructs.
Xiaohua Zhuang is a PhD candidate in cognitive psychology at Rutgers University. Her research
interests include visual attention, inattentional blindness, and visual consciousness.
Ou Lydia Liu received her PhD from the University of California, Berkeley, in quantitative methods and
evaluation. She is currently an associate research scientist at Educational Testing Service, Princeton. Her
research areas include educational and psychological instrument validation, accountability measures in
higher education, and science assessment.
Richard D. Roberts, PhD, is a principal research scientist in the Center for New Constructs in the
Educational Testing Service’s Research & Development Division, Princeton, NJ. His main areas of
specialization are assessment and human individual differences, where he has published more than 150
peer-reviewed works and received significant grants, contracts, and awards.
Assessing Teamwork Skills A Multi-Method Approach

  • 1. http://cjs.sagepub.com Psychology Canadian Journal of School DOI: 10.1177/0829573509335470 2009; 24; 108 Canadian Journal of School Psychology Roberts Lijuan Wang, Carolyn MacCann, Xiaohua Zhuang, Ou Lydia Liu and Richard D. Multimethod Approach Assessing Teamwork and Collaboration in High School Students: A http://cjs.sagepub.com/cgi/content/abstract/24/2/108 The online version of this article can be found at: Published by: http://www.sagepublications.com On behalf of: Canadian Association of School Psychologists can be found at: Canadian Journal of School Psychology Additional services and information for http://cjs.sagepub.com/cgi/alerts Email Alerts: http://cjs.sagepub.com/subscriptions Subscriptions: http://www.sagepub.com/journalsReprints.nav Reprints: http://www.sagepub.com/journalsPermissions.nav Permissions: http://cjs.sagepub.com/cgi/content/refs/24/2/108 Citations at University of Sydney on January 13, 2010 http://cjs.sagepub.com Downloaded from
  • 2. 108 Canadian Journal of School Psychology Volume 24 Number 2 June 2009 108-124 © 2009 SAGE Publications 10.1177/0829573509335470 http://cjsp.sagepub.com hosted at http://online.sagepub.com Assessing Teamwork and Collaboration in High School Students A Multimethod Approach Lijuan Wang University of Notre Dame Carolyn MacCann University of New South Wales Xiaohua Zhuang Rutgers University Ou Lydia Liu Richard D. Roberts Educational Testing Services Abstract: Various policy papers assert that teamwork is an essential skill for the 21st- century workforce. However, outside of organizational psychology research with adult populations, there are few reliable assessments of this construct with suitable validity evidence for test scores. To redress this issue, self-report, situational judgment, and teacher-report assessments of teamwork were developed for high school students. Various multivariate techniques were used to determine the structure of the scales, including factor and latent class analysis. Measures showed reasonable reliability and satisfactory validity evidence: Self-report, situational judgment, and teacher-report measures intercorrelated, and these measures also related to academic achievement. The advantages and disadvantages of each methodology are discussed, as are possible uses of this assessment system (e.g., evaluation of school-based programs that infuse curricula with modules on teamwork). Résumé: Divers articles sur les politiques affirment que le travail d’équipe est essentiel à la force de travail du 21e siècle. Cependant, en dehors des recherches en psychologie des organisations chez les adultes, peu d’évaluations fiables ont été faites sur ce thème avec suffisamment de validité. Pour pallier cette situation, nous avons élaboré des mesures Authors’ Note: This research was supported, partly, by a Ford Partnership for Advanced Studies research grant. 
Numerous individuals at Educational Testing Service were involved in test construction, data collection, and analyses; thanks to them all. We also thank those involved in the Ford PAS program, who gave willingly of their time. The views expressed are those of the authors and do not represent the views of Ford PAS or any of the authors’ institutional affiliations. Correspondence concerning this article should be addressed to Richard D. Roberts, R&D, MS16-R, Educational Testing Service, Rosedale Road, Princeton, NJ, 08541; e-mail: RRoberts@ets.org.
  • 3. Wang et al. / Assessing Teamwork 109 d’auto-évaluation, de jugements de situations et d’évaluation par les enseignants du travail d’équipe chez les étudiants du secondaire. Diverses techniques multivariées ont été utilisées pour déterminer la structure des mesures, y compris des analyses factorielles et de classes latentes. Les mesures se révèlent raisonnablement fiables et leur validité est satisfaisante. L’auto-évaluation, les jugements de situations et l’évaluation par les enseignants sont en intercorrélation et toutes ces mesures sont également liées au rendement scolaire. Nous abordons le pour et le contre de chaque méthodologie, de même que les utilisations possibles de ce système d’évaluation (p. ex. l’évaluation de programmes scolaires qui comprennent des curriculum avec des modules fondés sur le travail d’équipe).
Keywords: teamwork; situational judgment test; teacher ratings; academic achievement; latent class analysis; structural equation modeling
Teamwork has been touted as one of the major skills comprising workforce readiness in the 21st century (e.g., Barton, 2007) and has become an essential process in education, with teachers frequently assigning projects that require student collaboration (Ahles & Bosworth, 2004). Despite the perceived importance of teamwork in secondary education, reliable assessments of teamwork at the high school level are scarce, with existing assessments primarily targeting business organizations or college students (Loughry, Ohland, & Moore, 2007; Morgeson, Reider, & Campion, 2005; O’Neil, Wang, & Lee, 2003). The current study aims to develop a teamwork assessment that (a) targets high school students, (b) uses multiple methods of measurement (self-report, other-report, and situational judgment test [SJT] procedures), (c) results in reliable measures, and (d) yields scores with demonstrable validity evidence.
These instruments could be used to identify students’ teamwork skills, to design intervention programs around the assessment, and to provide career guidance for students. In addition, information derived from the assessment procedures may inform the development of teamwork-training curricula in the high school.
Individual Differences in Teamwork: Conceptual Models
Although models of teamwork differ in the details, conceptual correspondences between the components of teamwork models suggest five general content areas (e.g., O’Neil et al., 2003; Stevens & Campion, 1994). These content areas are (a) task-related process skills, (b) cooperation with other team members, (c) influencing team members through support and encouragement, (d) resolution of conflicts among team members via negotiation strategies, and (e) guidance and mentorship of other team members. As process skills appear to have a clear cognitive load (and hence might be most appropriately measured with knowledge assessments), we limit our definition of teamwork to the four latter content areas in the current study. That is,
  • 4. teamwork is defined as (a) cooperation with others, (b) influence through support and encouragement (hereafter referred to as advocate), (c) resolving conflict/negotiating, and (d) guiding others. A multimethod teamwork assessment system is developed to cover these content domains.
Assessing Teamwork
Individual differences in teamwork are commonly assessed with self- or other-reports (e.g., Loughry et al., 2007) and less commonly with SJTs (e.g., Stevens & Campion, 1999). There are practical concerns with each method of assessment in isolation: SJTs are difficult to score, self-report ratings may produce response distortion, and other-report ratings may be susceptible to halo effects. Thus, using multiple methods represents an innovative approach to teamwork assessment, ensuring that potential measurement issues are limited to one part of the assessment system. In addition, the relationship between different methods of measurement (self-reports, SJTs, and teacher-reports) can be examined as an important methodological issue.
Establishing Validity Evidence for Teamwork Scores
Several different types of validity evidence for the teamwork assessment system are examined. First, the multiple measures should converge to assess the same construct (teamwork) as evidence of convergent validity. Second, limited group differences (i.e., gender or ethnicity differences) in teamwork scores are expected as group membership is conceptually unrelated to teamwork, constituting a form of discriminant validity evidence. Third, teamwork scores should predict students’ grades, as evidence of criterion validity. Fourth, latent class analyses of test takers’ responses to the SJT should converge with expert opinion, as evidence for the validity of the expert scoring rubric. Fifth, developmental and learning trends in teamwork are expected as a further form of validity evidence. Further background to these claims follows.
Convergent validity evidence. Scores on the three different measurement methods should converge, as all are based on the same definition of teamwork. Positive correlations between the different measurement methods are evidence of convergent validity.
Discriminant validity evidence. The distinctiveness of teamwork scores from ethnicity and gender constitutes evidence of discriminant validity. Although there is little literature examining group differences in teamwork, related literatures suggest that noncognitive assessments reduce adverse impact (e.g., McDaniel, Morgeson,
  • 5. Finnegan, Campion, & Braverman, 2001). Teamwork may relate to socialization processes in much the same ways as similar constructs, such as social and emotional competencies (Zeidner, Matthews, & Roberts, 2009). However, if the teamwork assessments are purely measures of socialization styles or types of experience (which differ across groups), then such measures are not truly assessing teamwork but rather social interaction norms. For this reason, we consider the absence of group differences on the teamwork assessments to be a form of discriminant validity.
Test-criterion evidence. Test-criterion relations of the teamwork assessments are evaluated against students’ grades. Given that students’ learning and achievement may relate to social demands of the classroom, cognitive demands of mastering academic material, and teachers stressing collaborative approaches to learning, teamwork skills gain in importance as components of academic success (Ahles & Bosworth, 2004). Teamwork scores should thus predict school grades. However, it might be the case that teamwork measures predict grades especially in classes where teamwork is pivotal (e.g., music, where ensembles or group performances are common) versus those where it may sometimes be downplayed (e.g., history, where individual assessments are more common than team projects).
Validity evidence for the SJTs. Although expert scoring is recommended for SJTs (e.g., McDaniel & Nguyen, 2001), experts may disagree, criteria for teamwork expertise are not obvious, and multiple correct answers to situational items may be possible. For these reasons, latent class analysis (LCA) is used as a procedure for ensuring the validity of expert scoring of the teamwork SJT. The LCA identifies qualitatively different groups of cases based on consistencies in response patterns.
An LCA of SJT items can determine whether there are distinct groups of test takers showing different patterns of response on the SJT. If discrete groups of test takers have higher or lower expert-derived scores, this provides evidence that expert scoring is valid.
Development trends in teamwork. Development and learning trends for teamwork are also examined (i.e., higher teamwork scores are expected for older students as a product of education and socialization).
Aims of the Study
This study serves three primary purposes: (a) develop a multiple-method assessment system to measure teamwork in high school students, (b) provide preliminary reliability and validity evidence for the teamwork assessment system by examining each measure in isolation and then by examining the relation between the three methods (i.e., convergent validity evidence), and (c) provide additional validity evidence
  • 6. for the assessments by examining relationships between teamwork scores and age, ethnicity, gender, and academic achievement.
Method
Participants
Participants were 159 high school students (51.6% female adolescents) participating in Ford Partnership for Advanced Studies (PAS) courses, whose teachers indicated a willingness to participate in the teamwork study. The Ford PAS is a set of curriculum modules linking classroom education to real-world employment applications, through developing links between schools and local colleges, universities, and businesses. The learning process is based on inquiry and collaborative, project-based learning experiences and aims to teach students four essential workplace skills: teamwork, critical thinking, problem solving, and communication (see http://www.fordpas.org). Participants’ mean age was 16.10 years (SD = 1.03). Participants’ self-identified ethnicity was 64.2% African American, 18.9% White, 3.1% Hispanic, and 13.8% American Indian, Asian, or Other. Although this sample is not representative of U.S. high school students, it is consistent with the student population generally participating in the Ford PAS program.
Measures
Teamwork
A self-report rating scale, SJT, and behaviorally anchored teacher-rating scale were developed to assess the four content domains of teamwork. All three measures were developed by specialists and reviewed by two expert panels: (a) content experts (curriculum developers and educators who have taught teamwork curricula) and (b) fairness and sensitivity experts (i.e., individuals trained to meet guidelines according to established standards designed to reduce bias and promote legal and public defensibility).
Self-report teamwork assessment. There were 57 items developed to assess Cooperation (15 items), Advocate (12 items), Negotiate (17 items), and Guiding Others (13 items).
Students responded to these items on a 6-point scale from never to always (see Table 1 for sample items).
SJT assessment. Curriculum materials and student exercises supplied by Ford PAS were used to develop scenarios and response options. Two scenarios were developed for each of the four components, such that the SJT represented all four content components (however, due to limited testing time, we did not attempt to develop enough items to examine multiple teamwork factors). For each scenario, students rated the
  • 7. effectiveness of four response options on a 5-point scale, from very ineffective to very effective. A panel of three assessment specialists in educational and psychological testing decided on the best response (with final decision based on the group reaching consensus), and the test taker’s rating of this response was used to score each item. A sample item is described below.
Table 1
Factor Loadings of the Revised Self-Report Teamwork Scale
Item Content                               Factor          Loading  Other Loading  M (SD)
I enjoy bringing team members together[a]  Cooperate       .75                     4.45 (1.29)
Sharing ideas                              Cooperate       .74                     4.58 (1.20)
Acknowledging peers’ accomplishments       Cooperate       .71                     4.71 (1.19)
Helping team                               Cooperate       .67                     4.77 (1.19)
Valuing different perspectives             Cooperate       .67                     4.40 (1.26)
Providing feedback                         Cooperate       .60                     4.16 (1.18)
Exchanging creative ideas                  Cooperate       .60                     4.84 (1.26)
Cooperating with students                  Cooperate       .53                     4.72 (1.12)
Enjoying team activities                   Cooperate       .46                     4.43 (1.33)
Inspired by others                         Cooperate       .46                     4.09 (1.25)
Contributing team’s goals                  Cooperate       .43                     4.50 (1.22)
Respecting peer opinions                   Cooperate       .41      .34            4.76 (1.11)
I like to be in charge of groups
  or projects[a]                           Advocate/Guide  .70                     3.75 (1.42)
Help others see things my way              Advocate/Guide  .65                     4.03 (1.12)
Convincing peers                           Advocate/Guide  .57                     3.80 (1.19)
Believe good leaders                       Advocate/Guide  .56                     4.68 (1.36)
Comfortable providing criticism            Advocate/Guide  .44                     3.80 (1.46)
Persuading peers attentively               Advocate/Guide  .43                     4.30 (1.29)
Influencing peers                          Advocate/Guide  .41                     3.61 (1.44)
Suggesting solutions                       Advocate/Guide  .30                     3.99 (1.23)
Constructively argue                       Advocate/Guide  .23                     4.15 (1.27)
I am a good listener[a]                    Negotiate       .72                     4.91 (1.11)
Open to varying opinions                   Negotiate       .65                     4.64 (1.23)
Take others’ interests into account        Negotiate       .55                     4.32 (1.28)
Adaptable in team                          Negotiate       .53                     4.16 (1.15)
Flexible in team                           Negotiate       .51                     4.68 (1.16)
Find best solution                         Negotiate       .49                     4.38 (1.37)
Dislike people challenging views[b]        Negotiate       .38                     3.92 (1.29)
Understanding team member differences      Negotiate       .31      .30            5.33 (1.04)
Consider team first                        Negotiate       .29                     3.98 (1.36)
Note: For better readability, factor loadings that are less than .20 are not listed. Items are grouped by the factor on which they load; for the two items with a second listed loading, that value appears under Other Loading.
[a] Items are complete examples. [b] Reverse-scored item.
  • 8. You are part of a study group that has been assigned a large presentation for class. As you are all dividing up the workload, it becomes clear that both you and another member of the group are interested in researching the same aspect of the topic. Your colleague already has a great deal of experience in this area, but you have been extremely excited about working on this part of the project for several months. Rate the following approach to dealing with this situation: (a) Flip a coin to determine who gets to work on that particular aspect of the assignment, (b) insist that, for the good of the group, you should work on that aspect of the assignment because your interest in the area means you will do a particularly good job, (c) compromise your preferences for the good of the group and allow your friend to work on that aspect of the assignment [best response by expert judgment], (d) suggest to the other group member that you both share the research for that aspect of the assignment and also share the research on another less-desirable topic.
Teacher-rating scale. Teachers evaluated each student’s level of teamwork against ten behaviorally anchored items. A sample item follows:
When working on a group goal or project, this student
(1) ignores or does not notice others’ ideas or suggestions
(2)
(3) listens to others’ contributions
(4)
(5) always listens to others and respects their contributions
Self-Reported Grades
Students reported their grades in reading/language arts, math, science, social science, art, and music from the previous semester. A total of 151 grades were reported for reading, 150 for math, 146 for science, 135 for social studies, 49 for art, and 41 for music.
Procedure
Participants were tested in class (20 students per class) with teachers reporting on their students’ performance during this time. All tests were administered in paper-and-pencil format and were self-paced.
The entire student protocol lasted 45 min. All protocols were approved by the Educational Testing Service human ethics review committee.
Data Analysis Steps
Exploratory factor analyses (EFAs). Separate EFAs using principal factor analysis with promax rotation were conducted for the student self-report scale, SJT (eight
  • 9. key items that experts selected as the best), and teacher-report scale. Problematic items (loadings <.30 or cross-loadings to multiple factors) were removed to reduce the item pool. Parallel analysis was conducted to determine the number of factors for each scale, using a SAS macro program that allows parallel analysis of ordinal data such as rating scale items (Liu & Rijmen, 2008).
Confirmatory factor analyses (CFAs). The CFAs were then used to compare structural models for the self-report scale (to determine whether highly correlating latent factors are sufficiently different from each other to constitute separate factors). The following rules of thumb were used to evaluate fit statistics: (a) acceptable fit: root mean square error of approximation (RMSEA) ≤ .08, comparative fit index (CFI) ≥ .90; (b) good fit: RMSEA ≤ .05, CFI ≥ .95 (e.g., Hu & Bentler, 1999; Marsh, Hau, & Wen, 2004). However, Hu and Bentler (1995) also suggest that the rule of thumb is not absolute because it does not work equally well with various types of indices, sample sizes, estimators, or distributions. Both EFAs and CFAs were conducted with Mplus (Muthén & Muthén, 1998-2007).
LCA. The LCA was applied to the eight expert-ratified SJT item responses, to see whether the obtained latent classes converge with expert scoring. Only the eight key items were used in the LCA as the small sample size and nested item-scenario data structure meant that a model based on all 32 responses would not be reliable. Akaike information criterion (AIC), Bayesian information criterion (BIC), and profile plots were used to select the number of classes.
Correlational analyses. Correlations among self-report, teacher-report, and SJT factor scores were calculated to determine whether the scores showed convergent validity evidence.
Correlations of teamwork factor scores with demographic variables and course grades were also calculated, providing further validity evidence.
Results
Scale Dimensionality
Self-report scale. A four-factor CFA based on the theoretical assignment of items to factors fit the data poorly (e.g., CFI = .56; RMSEA = .08), suggesting that a four-factor solution was not the best fit to the data. The initial EFA was then conducted on the correlation matrix, using a four-factor solution suggested by parallel analysis. However, results indicated some potentially problematic items due to negative factor loadings, items that did not load saliently (>.30) on any factor, or items that showed multiple loadings (>.40) on different factors. In addition, these four factors did not correspond to the four subscales originally postulated. A sequential deletion method
  • 10. was then used to eliminate problematic items. Items with negative loadings were first eliminated, and then items with low loadings and multiple loadings were eliminated. On the basis of item analyses, 27 items (13 of which were reverse-coded) were dropped. An EFA was conducted on these discarded items, and no clear factor pattern was identified. These 27 items were excluded from further analysis.1 The remaining items were then reanalyzed using EFA. Results from the scree plot and parallel analysis indicate a three-factor solution. Factor loadings from this analysis are shown in Table 1.
The reduced self-report scale consisted of 30 items and captured three factors: (a) Cooperation (12 items), (b) Advocate/Guide (9 items), and (c) Negotiation (9 items). Cooperation includes tendencies to bring ideas together, seek solutions, and provide feedback to team members. Advocate/Guide refers to tendencies to direct others, provide appropriate suggestions and criticism, and persuade others (i.e., the factor primarily represents advocating content, although elements of guiding others, which did not emerge as a separate factor, also appear here). Negotiation includes students’ tendency to listen, to adapt to change while there are conflicts, and the ability to solve conflicts. Cronbach’s alphas for the three retained factors were acceptable (i.e., .88, .80, and .78, respectively). Table 1 also provides the mean and standard deviation of each item.
For CFA, items were assigned to one of the three dimensions based on the factor loading matrix from the EFA shown in Table 1. Fit indices from a CFA based on these three factors (with no cross-loadings) are shown in Table 2. The RMSEA indicates good fit, although the CFI is slightly below conventional estimates (at .85 rather than .90).
Correlations among the latent variables were relatively high: .66 (Cooperation, Advocate/Guide), .79 (Cooperation, Negotiation), and .59 (Advocate/Guide, Negotiation). For this reason, we fitted three alternative two-factor models to test if the three factors were statistically distinct: (a) Cooperation and Advocate/Guide combine to form one factor (i.e., correlation between them equals 1.00 and correlations of each factor with the Negotiation factor were also constrained to be equal); (b) Cooperation and Negotiation combine to form one factor; and (c) Advocate/Guide and Negotiation combine to form one factor. Results of the likelihood ratio tests shown in Table 2 indicate that combining any two factors into one significantly lowers model fit.
SJT. Scree and parallel analysis of the eight SJT items indicated a one-factor solution. The Cronbach alpha was .71. Fit indices from a one-factor CFA model indicated good fit (CFI = .96, RMSEA = .049).
Unrestricted LCA models with one to four classes were fitted to the eight key SJT items with raw data used as input. Table 3 displays the goodness-of-fit indices for these LCA models. The AIC indicates a three-class solution, BIC a two-class solution. Profile plots (mean ratings of the 32 item responses for each class) were generated for both the two-class LCA model (Figure 1a) and the three-class LCA model
  • 11. (Figure 1b). The plots show that Class 1 and Class 3 in the three-class model were not visually distinguishable. However, the two classes from the two-class model were clearly distinguishable. Combining the results from AIC, BIC, and profile discrimination, a two-class model was selected to represent the dimensionality of the SJT scale (one dimension, two classes). From Figure 1a, we can also see that students in Class 1 rated all of the eight expert-endorsed response options more highly than those in Class 2. This result provides a form of validity evidence for the expert keys. There was also more variation across reactions/items within each scenario for Class 1 than Class 2 (i.e., Class 1 students can better differentiate reactions within each scenario than Class 2 students). Therefore, we label Class 1 as the high teamwork skill group and Class 2 as the low teamwork skill group.
Teacher-report scale. The scree and parallel analysis with EFA showed that the teacher evaluation scale was unidimensional, with the first factor explaining 83% of the variance of the 10 items. The Cronbach alpha of the scale was .98. The mean of the composite scores of these 10 items was 3.35 (SD = 1.14).
Table 2
Fit Indices from the Confirmatory Factor Models for the Self-Report Teamwork Scores
                                            Correlation Between Factors Fixed to 1.00
Fit Indices            Three-Factor Model   F1, F2    F1, F3    F2, F3
Chi-square             651.5                743.8     716.1     741.6
df                     402                  404       404       404
CFI                    .84                  .79       .81       .80
RMSEA                  .06                  .07       .07       .07
Likelihood ratio test                       92.3/2    64.6/2    90.1/2
Note: F1 = Cooperation; F2 = Advocate/Guide; F3 = Negotiation; CFI = comparative fit index; RMSEA = root mean square error of approximation.
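The likelihood ratio row of Table 2 can be recomputed from the chi-square and df rows. The following is a minimal sketch, not the authors' Mplus analysis: it assumes the test is a plain chi-square difference test between the three-factor model and each constrained model, and it uses the closed form exp(-x/2) for the upper-tail probability of a chi-square variable with 2 degrees of freedom. All function and variable names are hypothetical.

```python
import math

# Chi-square and df values copied from Table 2.
THREE_FACTOR = (651.5, 402)
CONSTRAINED = {
    "F1,F2": (743.8, 404),
    "F1,F3": (716.1, 404),
    "F2,F3": (741.6, 404),
}


def chi_square_difference(full, restricted):
    """Return (chi-square difference, df difference, p value).

    The closed-form p value exp(-x / 2) is exact only when the df
    difference is 2, which holds for every comparison in Table 2.
    """
    delta_chi = restricted[0] - full[0]
    delta_df = restricted[1] - full[1]
    if delta_df != 2:
        raise ValueError("closed-form p value implemented for df = 2 only")
    p_value = math.exp(-delta_chi / 2.0)
    return round(delta_chi, 1), delta_df, p_value


results = {
    name: chi_square_difference(THREE_FACTOR, fit)
    for name, fit in CONSTRAINED.items()
}

for name, (delta_chi, delta_df, p_value) in results.items():
    print(f"{name}: {delta_chi}/{delta_df}, p = {p_value:.2e}")
```

Under these assumptions, each difference is far beyond the conventional critical value for 2 degrees of freedom, which is what the text means by saying that merging any two factors significantly lowers fit.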
Table 3
Goodness-of-Fit Indices for One- through Four-Class LCA Models for the SJT Teamwork Scores
Latent Classes   Log Likelihood   Number of Parameters   AIC      BIC
1                -1541.1          31                     3144.2   3238.5
2                -1428.4          63                     2982.9   3174.6
3                -1382.9          95                     2955.9   3245.0
4                -1361.4          127                    2976.7   3363.2
Note: AIC = Akaike information criterion; BIC = Bayesian information criterion.
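The AIC and BIC columns of Table 3 follow from the log likelihoods and parameter counts. The sketch below is an illustration under stated assumptions rather than the authors' code: it uses the standard definitions AIC = -2LL + 2k and BIC = -2LL + k ln(n), and it assumes n = 159 (the reported number of participants; the number of cases actually entering the LCA is not stated, so the recomputed BIC values may differ slightly from the printed ones). Names are hypothetical.

```python
import math

# Log likelihoods and parameter counts copied from Table 3,
# keyed by the number of latent classes.
LCA_MODELS = {
    1: (-1541.1, 31),
    2: (-1428.4, 63),
    3: (-1382.9, 95),
    4: (-1361.4, 127),
}

# Assumption: all 159 participants contributed to the LCA.
N_CASES = 159


def aic(log_likelihood, n_parameters):
    """Akaike information criterion: -2LL + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_parameters


def bic(log_likelihood, n_parameters, n_cases):
    """Bayesian information criterion: -2LL + k * ln(n)."""
    return -2.0 * log_likelihood + n_parameters * math.log(n_cases)


aic_values = {k: aic(ll, p) for k, (ll, p) in LCA_MODELS.items()}
bic_values = {k: bic(ll, p, N_CASES) for k, (ll, p) in LCA_MODELS.items()}

# Lower values indicate the preferred model under each criterion.
best_by_aic = min(aic_values, key=aic_values.get)
best_by_bic = min(bic_values, key=bic_values.get)

for k in LCA_MODELS:
    print(f"{k} classes: AIC = {aic_values[k]:.1f}, BIC = {bic_values[k]:.1f}")
print(f"AIC prefers {best_by_aic} classes; BIC prefers {best_by_bic} classes")
```

BIC penalizes each additional parameter by ln(n) rather than 2, so with 32 more parameters per added class it favors the smaller model, which is why the two criteria disagree here.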
  • 12. Figure 1
Mean Profile Plots With the Latent Class Analyses (Two vs. Three Classes)
[Figure not reproduced. Panel a: two-class model (Class 1, Class 2). Panel b: three-class model (Class 1, Class 2, Class 3). x axis: response options a to d for each of the eight scenarios. y axis: mean rating, from 1 to 5.]
Note: The y axis represents rated means of effectiveness of each reaction across latent classes. The x axis represents reaction (a, b, c, and d) and scenario numbers (1-8). For Scenarios 1-8, the most effective reactions rated by experts are c, d, d, c, d, d, c, and d.
  • 13. Relationships between Three Evaluation Methods
Table 4 displays the correlations of the five teamwork scales (Cooperation, Advocate/Guide, Negotiation, SJT, and teacher report). The SJT assessment was significantly correlated with all three self-report scores at about the same magnitude (as might be expected as scenarios covered the same teamwork subcomponents). The SJT and the teacher report also shared a moderate and positive relationship. Teacher-report scores were significantly correlated with two of the student self-report scores: Cooperation and Advocate/Guide.
To further investigate the relationship between the three assessment methods, self-report and teacher-report scores were also compared across the two SJT latent classes, and the results are given in Table 5. The high-teamwork-skill class scored higher on the three self-report scales than the low-teamwork class, although no significant differences between the two classes were observed for the teacher-report scale.
Table 4
Correlations between all Teamwork Scores
                  Student Self-Report
Teamwork Score    Cooperation   Advocate/Guide   Negotiation   SJT     Teacher Report
Cooperation       .88
Advocate/guide    .66**         .80
Negotiation       .79**         .59**            .78
SJT               .52**         .47**            .60**         .71
Teacher report    .19*          .32**            .14           .33**   .98
Note: Cronbach’s alpha reliability shown on diagonal. SJT = situational judgment test.
*p < .05. **p < .01.
Table 5
Teamwork Scores Compared by Latent Classes
                  High Teamwork Skills (n = 64)   Low Teamwork Skills (n = 95)
Teamwork Score    M (SD)                          M (SD)                         t
Cooperation       4.74 (0.74)                     4.32 (0.81)                    3.38**
Advocate/guide    4.14 (0.81)                     3.88 (0.78)                    2.00*
Negotiation       4.59 (0.68)                     4.36 (0.78)                    2.01*
SJT               4.27 (0.35)                     3.50 (0.44)                    11.85**
Teacher report    3.43 (1.05)                     3.28 (1.25)                    0.81
Note: SJT = situational judgment test.
*p < .05. **p < .01.
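The t values in Table 5 can be approximated from the printed means, standard deviations, and class sizes. The sketch below is illustrative only. The source does not say which form of t test was used, so Welch's unequal-variance test is an assumption, and because the inputs are rounded summary statistics the recomputed values only approximately match the table. The function name and data layout are hypothetical.

```python
import math


def welch_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Welch's t statistic and approximate df from summary statistics."""
    var_term_1 = sd_1 ** 2 / n_1
    var_term_2 = sd_2 ** 2 / n_2
    standard_error = math.sqrt(var_term_1 + var_term_2)
    t_value = (mean_1 - mean_2) / standard_error
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (var_term_1 + var_term_2) ** 2 / (
        var_term_1 ** 2 / (n_1 - 1) + var_term_2 ** 2 / (n_2 - 1)
    )
    return t_value, df


# (mean, SD) for the high (n = 64) and low (n = 95) classes, from Table 5.
TABLE_5 = {
    "Cooperation": ((4.74, 0.74), (4.32, 0.81)),
    "Advocate/guide": ((4.14, 0.81), (3.88, 0.78)),
    "Negotiation": ((4.59, 0.68), (4.36, 0.78)),
    "SJT": ((4.27, 0.35), (3.50, 0.44)),
    "Teacher report": ((3.43, 1.05), (3.28, 1.25)),
}

t_values = {}
for scale, ((m_hi, sd_hi), (m_lo, sd_lo)) in TABLE_5.items():
    t_values[scale], _ = welch_t(m_hi, sd_hi, 64, m_lo, sd_lo, 95)
    print(f"{scale}: t = {t_values[scale]:.2f}")
```

The SJT difference is large by construction, since the latent classes were derived from the SJT responses themselves; the self-report and teacher-report rows are the informative comparisons.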
  • 14. Relationships between Teamwork Scores and Demographic Variables
No significant gender or ethnic differences were found for the self-report subscales, teacher-report scores, or SJT scores. Age was positively correlated with the three self-report scores (r = .31 to .35, p < .01) and SJT scores (r = .32, p < .01) but not significantly with the teacher-report score. Participants in the high-teamwork latent class were significantly older than participants in the low-teamwork latent class (M = 16.31, SD = 0.96 vs. M = 15.91, SD = 1.07; t = 2.42, p < .05). Age significantly predicted all teamwork measures when controlling for the number of Ford PAS modules a student had undertaken (partial r = .25 to .30 for self-reports, .28 for the SJT, and .18 for the teacher report, p < .05 in all cases). However, the number of Ford PAS modules undertaken was not a significant predictor of any teamwork measure after controlling for the age variable.
Relationships between Teamwork Scores and Course Grades
The SJT scores did not correlate significantly with students’ grades. The teacher-report correlated significantly with math, science, and social studies grades (r = .21, .30, and .27, respectively, p < .01). Cooperation correlated moderately with science and music grades (r = .18 and .38, p < .05), whereas Advocate/Guide correlated positively with science (r = .32, p < .01), social science (r = .19, p < .05), and music grades (r = .40, p < .01). Negotiation shared a positive correlation with music grades only (r = .50, p < .01). A grades composite was calculated by taking the means of the different course grades. Only Advocate/Guide scores of the self-report scale and total teacher-report score were significantly correlated with the grades composite (r = .25, p < .01).
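The partial correlations reported above (age with teamwork, controlling for the number of modules taken) can be understood through the standard first-order partial correlation formula. The sketch below is a generic illustration, not the authors' computation. Only the zero-order correlation of age with the SJT (r = .32) comes from the text; the other two input correlations are made-up values, because the source does not report them.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denominator == 0.0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical inputs for illustration only:
#   x = age, y = SJT score, z = number of Ford PAS modules taken.
# r_xy = .32 is reported in the text; r_xz and r_yz are invented.
example = partial_correlation(r_xy=0.32, r_xz=0.40, r_yz=0.20)
print(f"partial r = {example:.2f}")
```

When the control variable is only weakly related to the outcome, as in this invented case, the partial correlation stays close to the zero-order value, which is the pattern the reported results show for age.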
Discussion
Scores on the three teamwork assessments (a) related to each other, (b) were unrelated to ethnicity or gender, (c) increased with age, and (d) related to school grades. Generally, the aims of the study were met: A multiple-method assessment system was developed for high school students, and scores from this assessment suite showed evidence of convergent and criterion-related validity. Although all teamwork assessments were related, there were some distinguishing features of the different methods, with related implications for the use and purposes of such assessments. These issues are discussed below.
Relationships among Teamwork Measures
The strongest relationship between self- and teacher-reports was for Advocate/Guide, perhaps because this factor is more obvious to external observers (Cooperation and Negotiation appear less open to observation). The SJT was less strongly related
  • 15. to the teacher-report than the self-report scales. Plausibly, if educators wish to measure constructs that are not obviously and frequently observable in students’ behaviors, they may need to supplement teacher-ratings with reports from observers who know the student well (e.g., family members, peers). Generally, these results support the construct validity of the scores but indicate that different measurement methods may capture different aspects of the teamwork construct.
Developmental Trajectory of Teamwork over Late Adolescence
This study found significant positive correlations of age with self-reports and SJT teamwork scores. As these students were involved in teamwork courses run by Ford PAS, results may be due to learning effects (as the students in higher grades often take more courses than students in lower grades). However, correlations remained significant after controlling for number of Ford PAS modules taken, indicating that increases are due to maturation rather than course exposure. This outcome has important implications for using such assessments for program evaluation as increasing teamwork would have to be compared to natural developmental gains. That said, it should be noted that this study was not a controlled intervention design. Future studies might be undertaken to investigate intervention effects by using a pre–post and control group design and having the multiple-method assessment of teamwork serve as the dependent variable(s).
Does Teamwork Predict Academic Achievement?
Self-reported teamwork correlated with different course grades to varying degrees. Only the teacher-report and the Advocate/Guide self-report score predicted the grades composite, although several measures predicted individual subject grades, with the strongest relationship found for music. Although music had the lowest number of cases (n = 41), this relationship makes conceptual sense.
Of all the subjects measured, academic performance in music depends most on teamwork: Playing pieces as a group forms an essential part of the subject, with the negotiation of piece choice, solos, and group practice times playing a role in final performance and grade. Such a focus on team performance is not essential for many other subjects, although it is certainly relevant to the performing arts, debating, or team sports (although note that the importance of teamwork might differ depending on the pedagogical approach to learning, for example, with competitive mathlete teams for mathematics). In short, overall grade point average may be too broad a variable for teamwork to predict; that is, teamwork may be more useful in predicting those aspects of academic performance where it is stressed.

Comparison of Self-Report, SJT, and Teacher-Report Scores

There are advantages and disadvantages to each assessment method used in this study, and an argument could be made that these complement and offset each other. The primary problem with self-reports is that these may be susceptible to impression management: Students may fake good or even fake bad. Teacher-reports are relatively more objective and may reduce faking problems. However, it is unlikely that a teacher could observe all aspects of students’ teamwork skills; thus, teacher-ratings may not cover the entire spectrum of the teamwork construct (Youngstrom, Loeber, & Stouthamer-Loeber, 2000). In addition, teacher-ratings of students may be difficult to implement in practice, given the teacher workload of such an activity in a typical 20- to 30-student classroom. Indeed, the zero variability found for 35 teacher-report cases in this study might indicate that teachers were fatigued by this assessment procedure.

A further advantage of teacher assessments is that ratings are not confounded by the verbal ability of the student. Students who are English language learners, for example, may not understand some items and thus provide answers indicating poor teamwork due to lack of language comprehension rather than poor teamwork. This potential confound might be particularly problematic for SJTs, which involve large amounts of text. However, presenting SJT items via video or audio may ameliorate this problem.

In addition, SJTs may detect subtle judgment processes by asking participants to provide intuitive judgments about ecologically valid scenarios. This ecological validity may also make SJT items more engaging than traditional self-reports, although such context-rich material makes objective standards for scoring difficult. In this study, however, the LCA of students’ responses provided independent confirmation of expert judgments. Latent classes that agreed or disagreed with the expert scoring key were identified, providing evidence for the validity of the expert judgments. This novel approach to ascertaining the validity of expert opinion could be usefully extended to SJTs assessing workplace competencies, tacit knowledge, or emotional intelligence.
Although the small sample size in this study made a multilevel analysis of all 32 responses (4 × 8 scenarios) impossible, multilevel LCA models could be developed in future to accommodate all responses in a ratings-based SJT.

Limitations of the Current Study

As with any newly developed assessment, there is a clear need to conduct additional studies. In the present instance, this includes (a) formally testing factor structure across additional, disparate samples (e.g., non-Ford PAS students, ethnic groups), (b) getting a better understanding of the extent to which group performance factors into grades in specific subjects in schools, and (c) formally conducting a multitrait-multimethod design (by, for example, having more SJTs to possibly explore the dimensionality of the construct). Clearly too, it would be especially important to conduct a longitudinal study of these teamwork assessments to more fully understand developmental trends and possible causal mechanisms.

Future Applications for Teamwork Research in High Schools

Subject to certain caveats, including the need for further validity studies, there are several useful ways that this teamwork instrument might be applied in high schools.
First, this multiple-method assessment might be used for early identification and primary intervention, as deficits in teamwork can potentially harm students’ higher education, career opportunities, and quality of life. Certainly, it would be beneficial to identify students with deficits in teamwork sooner rather than later and provide appropriate remediation. Students high on teamwork might be selected as mentors or role models for students low on teamwork, with study or project groups composed accordingly. In addition, feedback and suggestions for improvement might be given to students based on their own unique profile of teamwork skills. Second, the instrument might be used to gauge the effects of training. Programs emphasizing teamwork (e.g., Ford PAS) are already implemented in some schools. The multiple-method assessment could help determine those aspects of teamwork that might be most amenable to training and fine-tune the programs accordingly. Third, the instrument might be used as a form of career guidance/advisement, in conjunction with cognitive tests and interest inventories. For example, students with very high negotiation skills might be directed toward courses or careers where these skills might prove valuable.

Overall, this study suggests some promising new directions in teamwork research and its application in high schools. A reliable multiple-method teamwork assessment system was developed with promising validity evidence. Such an instrument might profitably be used for manifold purposes in high schools, with multiple methods being a useful technique for overcoming the practical limitations evident in giving any one assessment in isolation.

Note

1. After exclusion, only one reverse-keyed item remained.
The phenomenon where reverse-keyed items do not load on the same factors as non-reverse-keyed items (and hence are removed from the item pool) is not isolated to the current dataset but is a measurement issue that has been commented on frequently in the literature (e.g., Barnette, 2000). It is as yet unresolved how to concurrently deal with this issue and also control for acquiescence or other response sets.

References

Ahles, C. B., & Bosworth, C. C. (2004). The perception and reality of student and workplace teams. Journalism & Mass Communication Educator, 59, 42-59.
Barnette, J. J. (2000). Effects of stem and Likert-response option reversals on survey internal consistency: If you feel the need, there is a better alternative to using those negatively worded stems. Educational and Psychological Measurement, 60, 361-370.
Barton, P. E. (2007). What about those who don’t go? Educational Leadership, 64, 26-27.
Hu, L.-T., & Bentler, P. M. (1995). Evaluating model fit. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 76-99). Thousand Oaks, CA: SAGE.
Hu, L.-T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6, 1-55.
Liu, O. L., & Rijmen, F. (2008). A modified procedure for parallel analysis for ordered categorical data. Behavior Research Methods, 40, 556-562.
Loughry, M. L., Ohland, M. W., & Moore, D. D. (2007). Development of a theory-based assessment of team member effectiveness. Educational and Psychological Measurement, 67, 505-524.
Marsh, H. W., Hau, K. T., & Wen, Z. (2004). In search of golden rules: Comment on hypothesis-testing approaches to setting cutoff values for fit indexes and dangers in overgeneralizing Hu and Bentler’s (1999) findings. Structural Equation Modeling: A Multidisciplinary Journal, 11, 320-341.
McDaniel, M. A., Morgeson, F. P., Finnegan, E. B., Campion, M. A., & Braverman, E. P. (2001). Use of situational judgment tests to predict job performance: A clarification of the literature. Journal of Applied Psychology, 86, 730-740.
McDaniel, M. A., & Nguyen, N. T. (2001). Situational judgment tests: A review of practice and constructs assessed. International Journal of Selection and Assessment, 9, 103-113.
Morgeson, F. P., Reider, M. H., & Campion, M. A. (2005). Selecting individuals in team settings: The importance of social skills, personality characteristics, and teamwork knowledge. Personnel Psychology, 58, 583-611.
Muthén, L. K., & Muthén, B. O. (1998-2007). Mplus user’s guide (4th ed.). Los Angeles, CA: Author.
O’Neil, H. F., Jr., Wang, S., Jr., & Lee, C. (2003). Assessment of teamwork skills via a teamwork questionnaire. In H. F. O’Neil Jr. & R. S. Perez (Eds.), Technology applications in education: A learning view (pp. 283-303). Mahwah, NJ: Erlbaum.
Stevens, M. J., & Campion, M. A. (1994). The knowledge, skill, and ability requirements for teamwork: Implications for human resource management. Journal of Management, 20, 503-530.
Stevens, M. J., & Campion, M. A. (1999). Staffing work teams: Development and validation of a selection test for teamwork settings. Journal of Management, 25, 207-228.
Youngstrom, E. A., Loeber, R., & Stouthamer-Loeber, M. (2000). Patterns and correlates of agreement between parent, teacher, and male youth behavior ratings. Journal of Consulting and Clinical Psychology, 68, 1-13.
Zeidner, M., Matthews, G., & Roberts, R. D. (2009). What we know about emotional intelligence: How it affects learning, work, relationships, and our mental health. Cambridge, MA: MIT Press.

Lijuan Wang obtained her PhD degree in quantitative psychology at the University of Virginia in 2008 and is working as an assistant professor in the Department of Psychology at the University of Notre Dame.

Carolyn MacCann received her PhD in psychology from the University of Sydney in 2006, before undertaking a 2-year postdoctoral fellowship at the Educational Testing Service, Princeton. Her research focuses on social and emotional competencies and noncognitive constructs.

Xiaohua Zhuang is a PhD candidate in cognitive psychology at Rutgers University. Her research interests include visual attention, inattentional blindness, and visual consciousness.

Ou Lydia Liu received her PhD from University of California, Berkeley, in quantitative methods and evaluation. She is currently an associate research scientist at Educational Testing Service, Princeton. Her research areas include educational and psychological instrument validation, accountability measures in higher education, and science assessment.

Richard D. Roberts, PhD, is a principal research scientist in the Center for New Constructs in the Educational Testing Service’s Research & Development Division, Princeton, NJ. His main areas of specialization are assessment and human individual differences, where he has published more than 150 peer-reviewed works and received significant grants, contracts, and awards.