Missouri Art Teachers Agree on Assessment Criteria
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
ART TEACHERS’ OPINIONS
OF ASSESSMENT CRITERIA
A dissertation
presented to
the Faculty of the Graduate School
University of Missouri-Columbia
In Partial Fulfillment
of the Requirements for the Degree
Doctor of Philosophy
by
CHERYL VENET
Dr. Larry Kantner, Dissertation Supervisor
MAY 2000
UMI Number 9974694
Copyright 2000 by
Venet, Cheryl Lynn
All rights reserved.
UMI Microform 9974694
Copyright 2000 by Bell & Howell Information and Learning Company.
All rights reserved. This microform edition is protected against
unauthorized copying under Title 17, United States Code.
Bell & Howell Information and Learning Company
300 North Zeeb Road
P.O. Box 1346
Ann Arbor, MI 48106-1346
The undersigned, appointed by the Dean of the Graduate School, have examined the
dissertation entitled
ART TEACHERS’ OPINIONS OF ART ASSESSMENT CRITERIA
presented by Cheryl Venet
a candidate for the degree of Doctor of Philosophy
and hereby certify that in their opinion it is worthy of acceptance
ACKNOWLEDGMENTS
To Missouri art educators for sharing their time and experiences with art
assessment which made this research study possible.
To Dr. Larry Kantner, my advisor and dissertation supervisor, for encouraging me
to complete this degree before and after a 15-year hiatus. Through his professional
reputation and friendships, I was able to meet and work with national experts in art
assessment. He coached me toward success with skill, kindness, and support.
To Dr. Adrienne Walker Hoard, for her friendship, encouragement, knowledge of
aesthetics, and for broadening my perspectives by looking through multicultural lenses.
To Dr. Lloyd Barrow, for sharing his knowledge of survey methodology, leading
me toward my goals through succinct and probing questions, and for responding
thoughtfully to all drafts of work-in-progress.
To Frank Stack, for being my mentor and artistic role model for the past twenty
years during which he used his time and expertise to help me improve my paintings.
To Dr. Wendy Sims, for her attention to detail, insightful questions, and for
guiding me toward receiving a dissertation grant which helped fund this study.
To my fellow doctoral students who, along with Dr. Kantner, provided a forum for
discussing issues and stimulating my thoughts about art education.
To my mother and deceased father, Dianne and Harry Venet, for raising me to ask
questions and find answers, and for their unwavering belief in my abilities.
To my siblings, Barbara Horler (who showed me that you can get a Ph.D. while
working more than full-time), Judi Phelps, and Allen Venet for their love and support.
To my children, Samantha Heisler Myers and Kimberly Heisler, for their constant
love, understanding, encouragement, and faith in me and to whom I dedicate this work.
ART TEACHERS’ OPINIONS OF ASSESSMENT CRITERIA
Cheryl Venet
Dr. Larry Kantner, Dissertation Supervisor
ABSTRACT
The arts are a basic part of contemporary education (U.S. Department of
Education, 1998; National Assessment of Educational Progress, 1996). Like teachers in
the core subjects of language arts, mathematics, science, and social sciences, arts
practitioners established expectations for student knowledge and production/performance
through national and state standards (Higgins, 1989; U.S. Department of Education, 1991;
National Standards for Arts Education, 1994; Missouri State Board of Education, 1996).
To determine whether standards increase student achievement - as intended - student
knowledge and performance must be assessed. As a consequence of the arts’ inclusion in
basic education, practitioners must develop and implement assessments of art achievement
and publicly report the results. States can assess their standards through multiple choice/essay tests,
performance tasks, and/or portfolios.
In Missouri, without a mandatory textbook or state curriculum, there exists great
diversity among schools regarding what students are taught in art classes. Therefore,
standards can be assessed by creating a generic rubric which can be adapted to a wide
variety of art products/assignments. Teachers, trained as judges, would use the rubric’s
criteria and levels of achievement to score student portfolios. Scores obtained through this
assessment could be used to: monitor student growth, provide teachers with feedback for
improving instruction, and inform stakeholders (parents, administrators, the public) about
student achievement (Armstrong, 1994; Beattie, 1997; Eisner, 1996).
The purpose of this study was to provide a model for school districts or states to
use when designing large-scale, authentic assessments. The research problem was to
determine which criteria should be included on a Missouri art assessment rubric. One
question investigated whether there should be different rubrics for elementary, middle, and
high school grade levels. Another proposed four sets of aesthetic criteria representing the
aesthetic theories of Formalism, Expressionism, Instrumentalism, and Imitationalism.
Significant differences in opinions among teachers of different grade levels suggested the
use of multiple rubrics. Significant differences among aesthetic theory criteria indicated
that each could be used interchangeably depending upon the project or artist’s intent.
To determine which component criteria and descriptors should be included in the
questionnaire, a search was made of the related literature, experts in the field provided
feedback, and teachers offered input through focus groups held at a Missouri Art
Education Association Conference.
Using survey methodology, 382 Missouri art teachers (19% of the population) were
asked to respond to a list of criteria. For each criterion statement, teachers indicated (on a
5-point Likert scale) the degree to which they felt it was important for assessment.
The methodology consisted of the development and mailing of a questionnaire to
a random sample of Missouri art teachers. As a follow-up, a second cover letter and
survey were mailed to non-respondents, then a letter was faxed to building principals, and
finally, phone calls were made to a sample of non-respondents. A total of 78% of teachers
in the sample responded.
Descriptive statistics, ANOVA, Tukey’s Post Hoc Comparisons, and Contrasts
were computed for each criterion. Written teacher comments were tallied and used to
provide a deeper understanding of survey responses.
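The comparison of mean ratings across groups of teachers can be illustrated with a minimal sketch of a one-way ANOVA F statistic. This is not the study's analysis or output: the function name `one_way_anova`, the group labels, and all ratings below are hypothetical and invented for illustration, and the sketch omits the post hoc comparisons and contrasts named above.

```python
from statistics import mean
from typing import Dict, List, Tuple


def one_way_anova(groups: Dict[str, List[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` maps a group label to that group's ratings.
    """
    if len(groups) < 2:
        raise ValueError("need at least two groups")
    all_values = [v for ratings in groups.values() for v in ratings]
    grand_mean = mean(all_values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(ratings) * (mean(ratings) - grand_mean) ** 2
        for ratings in groups.values()
    )
    # Variation of individual ratings around their own group mean.
    ss_within = sum(
        (v - mean(ratings)) ** 2
        for ratings in groups.values()
        for v in ratings
    )
    if ss_within == 0 or df_within <= 0:
        raise ValueError("F is undefined without within-group variation")
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented 5-point ratings for one criterion, grouped by grade level taught.
example_groups = {
    "elementary": [3, 4, 3, 4, 3],
    "middle": [4, 4, 5, 4, 3],
    "high": [5, 5, 4, 5, 4],
}
f_stat, df_b, df_w = one_way_anova(example_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

A large F relative to its degrees of freedom suggests that the group means differ; which groups differ would then be examined with post hoc comparisons.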
This study found that Missouri art teachers agreed upon a list of criteria for
inclusion in a state art assessment rubric. The conclusions follow, presented by survey
category.
Greater than 70% of art teachers (the cutoff for recommending inclusion on a
state rubric) indicated that it was important to include the following Responding to Art
criteria on a state rubric: 1) explains perceptions of artwork; 2) identifies connections
among arts and with other subjects; 3) relates art from historical periods, movements
and/or cultures to own work; 4) uses art vocabulary to describe, analyze, interpret, and
evaluate artworks; and 5) student self-evaluates.
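The 70% cutoff can be illustrated with a minimal sketch. It assumes, as a labeled assumption not confirmed by the text above, that a rating of 4 or 5 on the 5-point scale counts as "important". The function names and all ratings are hypothetical and are not data from the study.

```python
from typing import Dict, List

CUTOFF = 0.70  # the abstract's threshold: more than 70% of teachers


def share_important(ratings: List[int], important_from: int = 4) -> float:
    """Fraction of ratings at or above `important_from` on a 1-5 scale.

    Treating 4 and 5 as "important" is an assumption of this sketch.
    """
    if not ratings:
        raise ValueError("no ratings supplied")
    return sum(1 for r in ratings if r >= important_from) / len(ratings)


def recommended(responses: Dict[str, List[int]], cutoff: float = CUTOFF) -> List[str]:
    """Criteria whose share of 'important' ratings exceeds the cutoff."""
    return [
        name for name, ratings in responses.items()
        if share_important(ratings) > cutoff
    ]


# Invented ratings from ten hypothetical respondents per criterion.
example = {
    "uses art vocabulary": [5, 4, 4, 5, 3, 4, 5, 4, 2, 5],
    "hypothetical criterion": [3, 2, 4, 5, 1, 3, 4, 2, 3, 3],
}
print(recommended(example))  # prints ['uses art vocabulary']
```

In this invented example, 8 of 10 ratings for the first criterion are 4 or 5 (80%), so it passes the cutoff, while the second criterion (30%) does not.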
The Creating or Process criteria recommended for the rubric were: 1) correctly
uses assigned processes, media, and techniques; 2) demonstrates problem-solving process;
and 3) demonstrates originality, creativity, or inventiveness.
All Attitude or Habits-of-Mind criteria were included: 1) is persistently on task; 2)
respects materials, equipment, other students, and their art; 3) shows commitment; and 4)
is responsive to teacher’s feedback.
The Art Product criteria recommended for the state rubric were: 1) demonstrates
skill or craftspersonship; 2) demonstrates planned, effective composition; 3) work shows
improvement from past products; 4) demonstrates assigned concepts, processes, elements
and/or principles; and 5) intent of artist is communicated.
The four aesthetic theory scales were significantly different at the p < .0001 level.
Under Aesthetic criteria, none of the Instrumentalism criteria were thought to be
important by 70% of responding teachers.
All Formalism criteria were deemed to be important: 1) use of elements of art; 2)
use of principles of design; 3) distorts, exaggerates for purpose of design; and 4)
composition.
Three Expressionist criteria were included: 1) expresses ideas, attitudes, or
feelings; 2) evokes emotions or feelings in viewer; and 3) communicates a point of view.
All Imitationalist criteria were believed to be important for inclusion on a state
rubric: 1) real or idealized representation of life; 2) shows realistic form (3-D)
or illusion of form (2-D); 3) shows realistic texture (3-D) or the illusion of texture (2-D);
and 4) shows space (3-D), or the illusion of depth (2-D).
The results were used by the Missouri Fine Arts Assessment Task Force to
develop a draft of an interdisciplinary arts rubric for teachers to use when conducting local
assessment of the state education standards.
LIST OF TABLES
Table Page
1. Frequency of Grade Levels Currently Taught by Art Teachers in Sample..........117
2. Years of Teaching Experience for Art Teachers in the Sample............................. 118
3. Number of Art Students Taught in a Year by Grade Level.................................. 119
4. Products Considered Important for Teachers to Assess....................................... 122
5. Additional Products Teachers Assess Comments.................................................. 123
6. What is included in Student Portfolios Comments.................................................125
7. Cronbach Coefficient Alpha for “Responding” Criteria........................................127
8. Percentage of Art Teachers who Indicated it was Important to Assess
Student Response Criteria.......................................................................................129
9. Responding to Art Criteria Comments....................................................................130
10. Percentage of Art Teachers who Indicated it was Important to Assess
Process Criteria........................................................................................................132
11. Creating or Process Criteria Comments............................................................... 133
12. Percentage of Art Teachers who Indicated it was Important to Assess
Attitude or Habits of Mind Criteria........................................................................135
13. Attitude or Habits of Mind Comments....................................................................136
14. Percentage of Art Teachers who Indicated it was Important to Assess
Art Product Criteria.................................................................................................138
15. Art Product Criteria Comments...............................................................................139
16. Cronbach Coefficient Alpha for all “Aesthetic” Criteria.......................................141
17. General Linear Models Procedure ANOVA for Aesthetic Means......................142
18. Tukey's Studentized Range (HSD) Test for Aesthetics Subcategories:
Formalist, Expressionist, Instrumental, and Imitationalist Criteria........................144
19. Contrasts for Aesthetics Subcategories: Formalist, Expressionist,
Instrumental, and Imitationalist Criteria....................................................................145
20. Cronbach Coefficient Alpha for Formalist Aesthetic Criteria..................................146
21. Percentage of Art Teachers who Indicated it was Important to Assess
Formalist Aesthetic Criteria....................................................................................... 147
22. Percentage of Art Teachers who Indicated it was Important to Assess
Expressionist Aesthetic Criteria.................................................................................149
23. Cronbach Coefficient Alpha for Instrumental/Pragmatic Aesthetic Criteria...........150
24. Percentage of Art Teachers who Indicated it was Important to Assess
Instrumental or Pragmatic Aesthetic Criteria............................................................151
25. Cronbach Coefficient Alpha for Imitationalist or Mimetic Aesthetic Criteria.........152
26. Percentage of Art Teachers who Indicated it was Important to Assess
Imitationalist or Mimetic Aesthetic Criteria..............................................................154
27. Aesthetic Criteria Comments.......................................................................................156
28. Percentage of Art Teachers who Indicated it was Important to Teach
Specific Content...........................................................................................................159
29. “What do you Teach?” Comments............................................................................. 161
30. Assessment Criteria not Included in this Survey Comments...................................162
31. Pearson Correlation Coefficients for the Mean of Formalist Criteria, Uses
Elements/Principles, and Abstracts/Non-Objective................................................164
32. Pearson Correlation Coefficients for Expressionist Criteria and Teaching
Students to Express Feelings/Attitudes....................................................................165
33. Pearson Correlation Coefficients for Instrumentalism, “Create Functional Art”
and “Communicate Social, Political, or Personal Messages”..................................166
34. Pearson Correlation Coefficients for Imitationalism and
Draw/Paint/Sculpt/Print Realistically from Observation......................................... 167
35. General Linear Models Procedure ANOVA for Dependent Variable: I1,
Rough Drafts...............................................................................................................204
36. Tukey's Studentized Range (HSD) Test for variable: I1, Rough Drafts................ 205
37. Contrast for Dependent Variable: I1, Rough Drafts................................................206
38. General Linear Models Procedure ANOVA for Dependent Variable: I2, Final
Product......................................................................................................................... 207
39. Tukey's Studentized Range (HSD) Test for variable: I2, Final Product..............208
40. Contrast for Dependent Variable: I2, Final Product................................................ 209
41. General Linear Models Procedure for Dependent Variable: I4, Art Criticism ... 210
42. Tukey's Studentized Range (HSD) Test for variable: I4, Art Criticism.................211
43. Contrast for Dependent Variable: I4, Art Criticism................................................ 212
44. General Linear Models Procedure ANOVA for Dependent Variable: I5, Art
Historical Writing....................................................................................................... 213
45. Tukey's Studentized Range (HSD) Test for variable: I5, Art Historical
Writing........................................................................................................................214
46. Contrast for Dependent Variable: I5, Art Historical Writing.................................215
47. General Linear Models Procedure ANOVA for Dependent Variable: I7,
Portfolio......................................................................................................................216
48. Tukey's Studentized Range (HSD) Test for variable: I7, Portfolio........................217
49. Contrast for Dependent Variable: I7, Portfolio........................................................218
50. General Linear Models Procedure ANOVA for Dependent Variable: II4,
Uses Vocabulary....................................................................................................... 219
51. Tukey's Studentized Range (HSD) Test for variable: II4, Uses Vocabulary........220
52. Contrast for Dependent Variable: II4, Uses Vocabulary........................................ 221
53. General Linear Models Procedure for Dependent Variable: II5, Self-Evaluate...222
54. Tukey’s Studentized Range (HSD) Test for variable: II5, Self-Evaluate...............223
55. Contrasts for Dependent Variable: II5, Self-Evaluate.............................................224
56. General Linear Models Procedure for Dependent Variable: III4,
Sketchbook/Journal................................................................................................... 225
57. Tukey's Studentized Range (HSD) Test for variable: III4,
Sketchbook/Journal................................................................................................... 226
58. Contrasts for Dependent Variable: III4, Sketchbook/Journal................................ 227
59. General Linear Models Procedure ANOVA for Dependent Variable: IV3,
Shows Commitment................................................................................................... 228
60. Tukey's Studentized Range (HSD) Test for variable: IV3, Shows
Commitment............................................................................................................... 229
61. Contrasts for Dependent Variable: IV3, Shows Commitment..............................230
62. General Linear Models Procedure Dependent Variable: V1,
Craftspersonship.........................................................................................................231
63. Tukey's Studentized Range (HSD) Test for variable: V1, Craftspersonship.........232
64. Contrasts for Dependent Variable: V1, Craftspersonship........................................233
65. General Linear Models Procedure ANOVA for Dependent Variable: V2,
Plans Composition..................................................................................................... 234
66. Tukey's Studentized Range (HSD) Test for variable: V2, Plans
Composition............................................................................................................... 235
67. Contrasts for Dependent Variable: V2, Plans Composition................................... 236
68. General Linear Models Procedure ANOVA for Dependent Variable: VII2,
Realism from Observation......................................................................................... 237
69. Tukey's Studentized Range (HSD) Test for variable: VII2, Realism from
Observation................................................................................................................ 238
70. Contrasts for Dependent Variable: VII2, Realism from Observation....................239
71. General Linear Models Procedure ANOVA for Dependent Variable: VII7,
Historical Style................................................................. 240
72. Tukey's Studentized Range (HSD) Test for variable: VII7, Historical Style........ 241
73. Contrasts for Dependent Variable: VII7, Historical Style.......................................242
74. General Linear Models Procedure ANOVA for Dependent Variable: V,
Category of Art Product Criteria...............................................................................243
75. Tukey’s Studentized Range (HSD) Test for variable: V, Category of Art
Product Criteria.......................................................................................................... 244
76. Contrasts for Dependent Variable: V, Category of Art Product Criteria..............245
77. General Linear Models Procedure ANOVA for Dependent Variable: VI,
Category of Aesthetic Criteria...................................................................................246
78. Tukey's Studentized Range (HSD) Test for variable: VI, Category of
Aesthetic Criteria........................................................................................................247
79. Contrasts for Dependent Variable: VI, Category of Aesthetic Criteria................. 248
80. General Linear Models Procedure ANOVA for Dependent Variable: Aesthetic
Subcategory of Formalism.........................................................................................249
81. Tukey's Studentized Range (HSD) Test for variable: Aesthetic
Subcategory of Formalism.........................................................................................250
82. Contrasts for Dependent Variable: Aesthetic Subcategory of Formalism..............251
83. General Linear Models Procedure ANOVA for Dependent Variable: Aesthetic
Subcategory of Imitationalism...................................................................................252
84. Tukey's Studentized Range (HSD) Test for variable: Aesthetic
Subcategory of Imitationalism...................................................................................253
85. Contrasts for Dependent Variable: Aesthetic Subcategory of
Imitationalism............................................................................................................. 254
86. General Linear Models Procedure ANOVA for Dependent Variable: VIF3,
Abstracts..................................................................................................................... 255
87. Tukey's Studentized Range (HSD) Test for variable: VIF3, Abstracts.................256
88. Contrasts for Dependent Variable: VIF3, Abstracts.................................................257
89. General Linear Models Procedure ANOVA for Dependent Variable: VIF4,
Composition.................................................................................................................258
90. Tukey's Studentized Range (HSD) Test for variable: VIF4, Composition........... 259
91. Contrasts for Dependent Variable: VIF4, Composition..........................................260
92. General Linear Models Procedure ANOVA for Dependent Variable: VIM1,
Realism..........................................................................................................................261
93. Tukey's Studentized Range (HSD) Test for variable: VIM1, Realism..................262
94. Contrasts for Dependent Variable: VIM1, Realism..................................................263
95. General Linear Models Procedure ANOVA for Dependent Variable:
VIM2, Shows Realistic Form..................................................................................... 264
96. Tukey's Studentized Range (HSD) Test for variable: VIM2, Shows Realistic
Form................................................................................................... 265
97. Contrasts for Dependent Variable: VIM2, Shows Realistic Form.........................266
98. General Linear Models Procedure ANOVA for Dependent Variable: VIM3,
Shows Realistic Texture.............................................................................................267
99. Tukey's Studentized Range (HSD) Test for variable: VIM3, Shows Realistic
Texture..........................................................................................................................268
100. Contrasts for Dependent Variable: VIM3, Shows Realistic Texture..................... 269
101. General Linear Models Procedure ANOVA for Dependent Variable:
VIM4, Shows Realistic Space....................................................................................270
102. Tukey's Studentized Range (HSD) Test for variable: VIM4, Shows
Realistic Space............................................................................................................. 271
103. Contrasts for Dependent Variable: VIM4, Shows Realistic Space....................... 272
104. Cronbach Coefficient Alpha Correlation Analysis.................................................... 273
LIST OF FIGURES
Figure Page
1. Recommended Criteria for Grade Level, State Art Rubrics.............................191
TABLE OF CONTENTS
ACKNOWLEDGMENTS..........................................................................................................ii
ABSTRACT...............................................................................................................................iii
LIST OF TABLES....................................................................................................................vii
LIST OF FIGURES.................................................................................................................xiii
Chapter
1. INTRODUCTION................................................................................................... 1
Purpose of Study...............................................................................................5
Importance of the Study....................................................................................6
Statement of the Problem..................................................................................7
Study Design..................................................................................................... 7
Definition of Terms........................................................................................... 8
Assumptions of the Study................................................................................10
Delimitations of the Study...............................................................................11
Summary........................................................................................................... 12
2. REVIEW OF RELATED LITERATURE........................................................... 13
Introduction......................................................................................................13
Functions of Assessment.................................................................................13
History of Arts Testing.................................................................................... 15
Standardized Achievement Tests...................................................................21
Criterion-Referenced Multiple Choice Tests................................................25
Alternative Forms of Assessment................................................................... 27
Performance-Based Assessment.....................................................................29
Authentic Assessment..................................................................................... 30
Portfolio Assessment........................................................................................33
Performance Assessment Criteria.................................................................. 36
Aesthetics......................................................................................................... 47
Definition of Aesthetics...................................................................................47
Philosophy of Aesthetics.................................................................................48
Aesthetic Theories of Art................................................................................50
Imitational or Mimetic Theory...........................................................52
Expressionist Theory.......................................................................... 54
Formalist Theory.................................................................................56
Pragmatic or Instrumental Theory..................................................... 58
Open Theory........................................................................................60
Institutional Theory............................................................................62
Postmodern Theory............................................................................ 64
Aesthetic Education in Art Education............................................................66
Aesthetic Theories and Student Art Production...........................................77
Rationale for this Study Based upon Literature Review............................. 81
Summary............................................................................................................85
3. METHODS AND PROCEDURES.......................................................................87
Introduction...................................................................................................... 87
Research Questions..........................................................................................89
Relationship to the Literature..........................................................................92
How will this Study Answer Research Questions?...................................... 95
Subjects............................................................................................................. 97
The Instrument..................................................................................................99
Themes of Questionnaire Categories..........................................................101
Reliability and Validity................................................................................. 105
Administration of the Survey.......................................................................107
Coding of Surveys...........................................................................................108
Optimizing Return Rate..................................................................................108
Data Analysis.................................................................................................110
4. RESULTS.............................................................................................................113
Introduction...................................................................................................113
Demographic Variables................................................................................115
What do Art Teachers Assess?....................................................................120
Portfolio Assessment....................................................................................123
Responding Criteria........................................................................................ 126
Creating or Process Criteria.........................................................................131
Attitude or Habits-of-Mind Criteria............................................................134
Art Product Criteria......................................................................................136
Aesthetics Criteria.........................................................................................139
Formalist Criteria............................................................................. 145
Expressionist Criteria........................................................................ 148
Instrumental Criteria......................................................................... 149
Imitationalist Criteria........................................................................ 151
What do You Teach?.....................................................................................157
Relationship Between Aesthetics and Instruction.......................................163
Sample of Non-Respondents........................................................................ 167
Summary..........................................................................................................169
5 SUMMARY AND DISCUSSION OF RESULTS, CONCLUSIONS, AND
RECOMMENDATIONS........................................................................................... 173
Introduction.................................................................................................... 173
Summary..........................................................................................................176
Discussion of Results.....................................................................................187
Conclusions..................................................................................................... 190
Recommendations.......................................................................................... 195
Implications....................................................................................................198
APPENDIX............................................................................................................................. 203
Tables...........................................................................................................................204
REFERENCE LIST....................................................................................................274
Questionnaire, Art Assessment Survey..................................................................... 295
Initial and Follow-up Cover Letters.........................................................................299
Results Sent to Participants........................................................................................302
Draft Missouri Art Assessment Rubric.................................................................... 306
VITA...........................................................................................................308
CHAPTER ONE
Assessment should look directly at skills and principles essential to
thinking in the arts, such as craftsmanship, originality, willingness to
pursue a problem in depth, development of work over time, ability
to work independently and in a group, ability to perceive qualities in
a work, and ability to think critically about one’s work. The
assessment should reflect the rigorous standards routinely applied
to the professions in the arts as valid fields of intellectual endeavor.
(Rayala, 1995, p. 176)
Introduction
The arts are a basic part of contemporary education (U.S. Department of
Education, 1998; National Assessment of Educational Progress, 1996). Like teachers in
the core subjects of language arts, mathematics, science, and social sciences, arts
practitioners established expectations for student knowledge and production/performance
through national and state standards (Higgins, 1989; U.S. Department of Education, 1991;
National Standards for Arts Education, 1994; Missouri State Board of Education, 1996).
To determine whether standards increase student achievement, as intended, student
knowledge and performance must be assessed. As a consequence of the arts’ inclusion in
basic education, practitioners must develop and implement assessments of art achievement
and publicly report the results.
In the absence of a national or state curriculum in the visual arts, the broadly-
stated standards are translated into practice by art teachers and/or school districts. Given
the diverse interpretations of standards which are taught in art classrooms, how can the
standards be assessed? One answer is that if there are criteria that describe quality in art
processes and products, then it would be possible to use them to create a rubric that
transcends individual teachers’ assignments. In addition to a set of core criteria that
could be applied to all student artwork, are there some criteria that could selectively be
applied to works based upon the subject matter and intent of the artist? If so, then
aesthetic theories of art may provide the lenses or windows for framing different sets of
content-related criteria. Criteria and descriptors of quality production/performance,
assembled into a scoring rubric, could be used by students when creating art, and by
teachers and/or external moderators when assessing student artwork. Scores obtained
through this assessment could be used to: monitor student growth, provide teachers
feedback for improving instruction, and inform stakeholders (parents, administrators, the
public) about student achievement. The subject of this study is the search for such criteria
and for teachers’ subsequent validation of the criteria as important enough to be
considered for large-scale assessment.
In response to the request of the United States Congress for a study on the state of
the arts, the National Endowment for the Arts published Toward Civilization (1987)
which advocated full inclusion of the arts in American education. The report recommends
that state education agencies and local school districts make arts education part of the
basic school curriculum, K-12, and determine an essential body of content that all students
should know. Toward Civilization specified that each district should implement a
comprehensive testing program to measure student achievement in the arts, using both
qualitative and quantitative measures, and addressing creation, performance, history,
critical analysis, and the place of the arts in society. State education agencies were asked
to develop comparative evaluation procedures based upon state arts education goals for
each district and school arts program. This landmark document set the stage for high-
stakes assessment in the arts (Finlayson, 1988; Rudner & Boston, 1994).
When America 2000: An Education Strategy (U.S. Department of Education,
1991) was amended to include the arts, school districts and state agencies began to search
for ways to document student achievement in the arts (Sabol, 1994). This task continues.
From survey results, Peeno (1997) reported diversity among states in arts evaluation
methods, including essay, multiple-choice, short answer, embedded, performance, and
portfolio assessment. Six states were already assessing the arts; another eight were
planning to do so; 18 had not decided; and 18 had no plans to test in the arts. Vermont,
Utah, California, Wisconsin, and Minnesota, while not administering state-wide tests, have
produced guidelines for teachers to assess achievement of state standards in their
classrooms (Vermont Arts Council, 1995; Stubbs, 1985; Taylor 1991; Mitchell, 1993;
Rayala, 1995; Higgins, 1989).
Zimmerman (1999) comments that authentic assessments, in which students are
asked to use knowledge and skills to solve realistic out-of-school problems, are becoming
common. She states that “in the near future, most art teachers as well as art education
researchers probably will be involved in some aspect of large-scale arts assessment” (p.
45). Traditionally, large-scale assessments have been multiple-choice tests because they
are familiar, report scores which are easily ranked, and cost less than alternative types of
assessment. The trend across content areas is currently away from standardized tests and
toward performance-based, authentic instruments (Wiggins, 1989; Maeroff, 1991). Kohn,
in an interview with O’Neil and Tell (1999), explains the rationale for a trend to conduct
assessment embedded in classroom instruction:
Learning doesn’t take place at a district or a state level; it takes place in a
classroom... a teacher-designed - and perhaps externally validated -
assessment doesn’t meet only the teacher’s needs. If it’s done right, it also
meets the needs of parents and citizens to make sure that the teachers and
schools are doing a decent job. (p. 21)
External validation involves assessment by a panel of trained judges using common
criteria (Gaston, 1977; Weate, 1999). These criteria, when organized with descriptors
that indicate the differences among various levels of success or quality, are called rubrics
(Gall, Borg, & Gall, 1996). With such an instrument, raters are able to score a variety of
teacher-developed assessments. An advantage of this approach is its flexibility in
assessing artwork created in the diverse cultural contexts found in contemporary life
(Broughton, 1999). A disadvantage is that assessments which require judges are
labor-intensive, costing more than machine-scored tests (Wiggins, 1998).
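The structure of such an instrument can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the two criterion names are borrowed from the Rayala (1995) quotation that opens this chapter, while the four-level scale, the descriptor wording, and the function names (validate, panel_scores) are hypothetical and are not taken from any rubric discussed in this study. It shows criteria paired with level descriptors, and a panel's scores averaged for each criterion.

```python
from statistics import mean

# Hypothetical rubric: each criterion maps a level score (1-4) to a descriptor.
# The number of levels and the descriptor wording are illustrative assumptions.
RUBRIC = {
    "craftsmanship": {
        1: "little control of media",
        2: "inconsistent control of media",
        3: "competent control of media",
        4: "skilled, deliberate control of media",
    },
    "originality": {
        1: "copies an existing solution",
        2: "minor variation on a familiar solution",
        3: "personal solution to the problem",
        4: "inventive, unexpected solution",
    },
}


def validate(score_sheet):
    """Raise ValueError if one judge's score sheet does not match the rubric."""
    if set(score_sheet) != set(RUBRIC):
        raise ValueError("score sheet must cover every criterion exactly once")
    for criterion, level in score_sheet.items():
        if level not in RUBRIC[criterion]:
            raise ValueError(f"invalid level {level!r} for {criterion}")


def panel_scores(score_sheets):
    """Average each criterion across all judges' score sheets."""
    for sheet in score_sheets:
        validate(sheet)
    return {c: mean(sheet[c] for sheet in score_sheets) for c in RUBRIC}


# Three hypothetical judges score the same portfolio.
judges = [
    {"craftsmanship": 3, "originality": 4},
    {"craftsmanship": 3, "originality": 3},
    {"craftsmanship": 4, "originality": 4},
]
result = panel_scores(judges)
```

Because the judges score against shared descriptors rather than against a particular assignment, the same structure could in principle be applied to work produced for different teacher-developed tasks.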
The state of Missouri is in the process of developing arts assessment with a limited
budget. Therefore, one component will be a multiple-choice exam in which students
respond in writing to images presented in videotape format (Peeno, 1999). Because the
selected-response format limits students’ critical and/or creative responses, this test will
focus on students’ knowledge of art vocabulary. Art production, aesthetics, and in-depth
responses regarding historical/cultural contexts of art will not be assessed through this
state-wide test. Instead, the state will support teachers’ local scoring of these art
disciplines using a common rubric. Matrix sampling of scored work will be used to
communicate degrees of achievement statewide. The rubric should represent not only the
state standards which are being addressed, but also the teachers’ practice and
understanding of what is important in assessing students’ art products.
The Purpose of the Study
Assessments in education influence curriculum and instruction. Therefore, state-wide
assessment has far-reaching power to change education. One way to improve the
quality of instruction and student achievement is to design an assessment which allows
students to perform or produce tasks that simulate professional practice. In the arts, this
practice is best demonstrated through portfolio assessment. A generic rubric, aligned with
state standards, provides the framework from which to score diverse student artworks and
writings included in portfolios.
The purpose of this study is to provide a rubric development model for states or
school districts to use when designing large-scale portfolio assessments. To determine
which component criteria and descriptors should be included in the instrument, a search
was made of the related literature, experts in the field were asked to provide feedback, and
teachers were asked to provide input during a state art association meeting.
The Importance of the Study
The study will serve as a model which can be adapted by states or school districts
as they begin to discuss achievement and plan ways to document it. Component parts of
the survey might be changed to include items of regional or cultural significance.
Discussions of the model could generate local standards of quality. The study is intended
to stimulate a process in which art teachers, the experts in analyzing quality in student
artworks, determine which criteria should be valued highly enough to become
expectations for all students.
In the state of Missouri, the study results will help determine which criteria should
be included in a state rubric. The rubric will be given to art teachers to help them evaluate
their students’ achievement of state art standards. The specific knowledge standards for
the arts are:
In Fine Arts, students in Missouri public schools will acquire a solid
foundation which includes knowledge of
1) process and techniques for the production, exhibition or performance of
one or more of the visual or performed arts;
2) the principles and elements of different art forms;
3) the vocabulary to explain perceptions about and evaluations of works in
dance, music, theater, and visual arts;
4) interrelationships of visual and performing arts and the relationships of
the arts to other disciplines;
5) visual and performing arts in historical and cultural contexts. (Show-Me
Standards, 1996, p. 1)
These standards require high levels of thinking and creating which can be judged through
portfolio assessment. The rubric will function as an operational definition of the
standards, making the general more specific, and therefore of greater practical use. It is
assumed that opinions reflected by the random sample of art teachers should generalize to
others in the population.
The Statement of the Problem
The problem of this study is to identify criteria for a portfolio assessment rubric
that would assess critical and creative thinking, problem-solving, and production in the
visual arts. The study was designed to gather art teachers’ opinions of criteria that could
be used when assessing students’ art achievement of Missouri’s art standards.
Study Design
The study is quantitative. A survey was mailed to 382 randomly selected
Missouri art teachers. A questionnaire was developed using a five-point Likert scale. It
was used to obtain art teachers’ opinions about the relative importance of various criteria
when assessing student art products. Categories on the questionnaire relate to
demographic information and various aspects of assessment. They are: Demographics,
What do you Teach?, Responding to Art, What do you assess?, Creating or Process
Criteria, Attitude or Habits-of-Mind Criteria, Art Product Criteria, and Aesthetic
Approach Criteria.
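The random selection described above can be illustrated with a short Python sketch. It is not the procedure actually used to draw the sample: the sample size (382) and population size (2030) are the figures reported in this study, but the numbered roster standing in for the state's list of teacher names and the seed value are hypothetical.

```python
import random

POPULATION_SIZE = 2030  # Missouri art teachers, as reported in this study
SAMPLE_SIZE = 382       # questionnaires mailed, as reported in this study

# Hypothetical roster: IDs 1..2030 stand in for the state's list of names.
roster = list(range(1, POPULATION_SIZE + 1))

# A fixed, arbitrary seed makes the illustrative draw repeatable.
rng = random.Random(2000)

# Simple random sample without replacement: every teacher on the roster
# has the same chance of selection and nobody is selected twice.
sample = rng.sample(roster, SAMPLE_SIZE)
```

Sampling without replacement matters here because each selected teacher should receive only one questionnaire.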
Definition of Terms
In the discussion of related literature many terms will be used that have specific
meanings in the fields of assessment and aesthetics. These terms are defined below:
Evaluation and Assessment are synonymous (Eisner, 1996; New Webster’s dictionary of
the English Language, 1992; Charles, 1998). Both are processes that obtain
information through measuring, testing, or judging for the purpose of determining
value. Both use quantitative and qualitative sources of data.
Formative Evaluation and Summative Evaluation have become accepted descriptions for
mid-progress versus final evaluation (Scriven, 1981).
Authentic Assessment implies the evaluation of complex tasks in an out-of-school context,
during which students face challenging, “ill-structured” (no single known solution)
problems (Wiggins, 1989).
Test refers to a quantitative evaluation/assessment for purposes of reporting or
comparison.
Standardized Tests, typically multiple- or cued-choice, are accompanied by norms that
permit comparison of individuals (Charles, 1998).
Selected-Choice, Cued-Choice, or Multiple-Choice test items ask students to choose the
one correct answer from a list of four or five possible answers.
Criterion-Referenced tests compare a student’s performance “to a whole repertoire of
behaviors, which are, in turn, referenced to the content and skills of a discipline”
rather than to the performance of other students (Beattie, 1997, p. 4).
Standards are “quantifying thresholds of what is adequate for some purpose established by
authority, custom, or consensus” (Sadler, 1987, p. 192).
Content Standards specify exit learning criteria.
Achievement Standards “specify achievement levels pertaining to exit learning criteria”
(Beattie, 1997, p. 4).
External Assessment describes a situation where the observer is not a normal part of the
situation, and/or the assessment instrument (usually a test) was constructed by
persons outside of the school district.
Internal Assessments use locally developed instruments and are usually administered by
the teacher as part of instruction or subsequent to it (Armstrong, 1994).
Portfolio assessment is a “purposeful collection of student work that tells the story of the
student’s efforts, progress, or achievement in (a) given area(s)” (Beattie, 1997, p.
15).
Aesthetics is a group of concepts for understanding the nature of art (Lankford, 1992).
Within the field of aesthetics, theories explain phenomena in different ways. Major
aesthetic theories relevant to this research project are Imitationalism/Mimeticism,
Expressionism, Formalism, and Pragmatism/Instrumentalism.
Imitationalism or Mimeticism proposes that an artifact is art if it copies the real or
imagined world.
Expressionism considers works that either evoke or represent emotions to be art.
Formalism looks for meaning solely from the analysis of the object’s formal qualities such
as line, shape, or color.
Pragmatism or Instrumentalism views art in terms of its social function in a culture.
Assumptions of the Study
This study is based upon the following assumptions:
1) The respondents in the sample are representative of Missouri art teachers.
2) The respondents in the sample provided, to the best of their ability, accurate
information to the questions posed.
3) The art teachers represented in the sample assess their students’ work.
4) The art teachers represented in the sample understand the terminology used in the
instrument.
5) The Art Assessment Survey, developed for this study, measures opinions about the
importance of using specific assessment criteria to evaluate student art production.
6) The questionnaires were completed by Missouri art teachers.
Delimitations of the Study
The results of this study were interpreted in relationship to the following
delimitations:
1) The findings are subject to sampling errors.
2) The findings of this study generalize only to Missouri art teachers.
3) The list of art teachers’ names provided by the Missouri Department of Elementary and
Secondary Education was from the previous year; therefore, the sample
contained names of teachers who had moved or retired.
4) Missouri has no statewide art textbook or curriculum; therefore, teachers may have
different understandings of terms used in the survey.
5) The sample of 382 is 19% of the population of 2030 art teachers in the state. Most
dissertations reviewed used a sample of 15%-20% of similar-sized populations. Postage
and printing costs made it necessary to limit the sample size.
6) Some data are not reported in this study. Since the problem was to identify criteria for
a state rubric, the decision was made to report only the percentage of teachers who
favored inclusion of each item. Information on the percentage of teachers who answered
“no opinion”, “little importance”, or “no importance” for each item is available from the
researcher.
Summary
Chapter One included the importance of the study, the statement of the problem,
definition of terms, assumptions and delimitations of the study. Chapter Two presents a
review of literature related to the study. Chapter Three contains a description of the
procedures and methods used in the study. Chapter Four provides an analysis of the data
gathered in the study. Chapter Five reviews and discusses conclusions, recommendations
for further study, and implications of the results.
CHAPTER TWO
Review of Related Literature
Introduction
The scaffold of theory that supports this study is presented in this chapter. The
topics covered in the literature review are: 1) functions of assessment, 2) history of arts
testing, 3) standardized achievement tests, 4) criterion-referenced multiple-choice tests, 5)
alternative assessment, 6) performance-based assessment, 7) authentic assessment, 8)
portfolio assessment, 9) performance assessment criteria, 10) aesthetics, 11) aesthetics as
a philosophy of art, 12) aesthetic education, 13) aesthetic theories of art, 14) aesthetic
theories as criteria for assessment.
Functions of Assessment
Thirteen assessment roles and the function of each are presented by Boston and
Rudner (1994) in the Visual Arts Education Reform Handbook. Those directed toward
student learning are listed as numbers 1-6. Those directed toward the evaluation,
maintenance, and improvement of art programs are numbered 7-13.
1) Criticism (informing students about the quality of a performance)
2) Grading (informing students, parents, and others about achievement levels)
3) Qualification (to decide which students may enter or leave a course or
program)
4) Placement (to identify the type or ability level most suitable for
students)
5) Prediction (to help predict success or failure based upon past or current
achievement)
6) Diagnosis (to identify students... particular learning attributes)
7) Didactic Feedback (to provide... feedback concerning... teaching process)
8) Communication (to convey information about the goals of educational
programs)
9) Accountability (to provide information regarding the extent to which
goals for educational programs have been achieved)
10) Representation (to operationalize...the general or abstract goals of art
education)
11) Implementation (to provide information about the extent to which the arts
program is being implemented)
12) Curriculum Maintenance (to ensure that certain elements of the arts
program continue to be included)
13) Innovation (to encourage the introduction of new...elements into the
arts curriculum). (p. 7)
Armstrong (1994) discusses three basic reasons for assessment of student learning: 1) it is
educationally sound; 2) it is required by some states or school districts; and 3) it is an
opportunity to inform others about art education (p. 5). Eisner (1994) lists five functions
of assessment: 1) educational temperature-taking (measuring the educational health of the
nation); 2) “gatekeeping” (selecting the most accomplished to receive further schooling);
3) determining if course objectives have been attained; 4) providing feedback to teachers
on the quality of their work; and 5) providing feedback on the quality of programs (pp.
201-202).
Depending upon their functions and contexts, various assessments gather different
kinds of data. Generally, the types fall into quantitative and qualitative categories. The
typical quantitative test is a standardized achievement test, while qualitative assessments
occur informally during instruction, through observation, interviews, portfolio, and
production analyses.
History of Arts Testing
Beattie (1998) reviewed the history of arts testing, noting its origins in the pre-Qin
Dynasty of China. Socrates tested thinking through his method of orally examining
students (Beattie, 1998; Eaton, 1994). The first arts tests probably occurred in the Middle
Ages when artists and musicians had to pass exams to gain admittance to guilds (Zerull,
1990).
Evaluation and assessment were embedded in the scientific tradition dating to the
Enlightenment in Europe and the work of Descartes and Newton. After 1850, scientific
study of human behavior and the child study movement emerged in Germany, while in
England, Galton developed statistics for describing mental performance (Eisner, 1994).
The first “draw a man” test dates to Schuyten (1901-1907). Correlations were
found between drawing ability and intelligence by Ivanof in 1909 (Clark, Zimmerman, &
Zurmuehlen, 1987).
Scientific inquiry was based upon the search for variables that could be measured
and used to predict and control outcomes. The testing movement in America adopted a scientific
approach. Before 1910, the use of surveys, descriptive studies, and psychometric tests
predominated in testing theory.
From 1913 to 1929, the efficiency movement, based upon Taylor’s time and motion
studies (a model for improving the productivity of factory workers), led a drive for
standardized testing (Eisner, 1994; Clark, Zimmerman, & Zurmuehlen, 1987) and the era
of quantitative testing began. Thorndike developed the first standardized test in 1913,
and invented “connectionism”, learning through reinforcement of stimulus-response
(Castiglione, 1966). In 1926, Thorndike used inter-rater judging for the first time;
Whipple used tests to differentiate gifted from other students; and Terman published the
first Stanford-Binet Intelligence Test. The Manuel Test, developed in 1919, was used to
discover special ability in drawing using psychological traits. Between 1919 and 1942,
fifteen art tests were developed including the Meier-Seashore Art Judgment Test (Clark,
Zimmerman, & Zurmuehlen, 1987).
During the early twentieth century drawing assessment was not popular due to the
influence of Dewey. In 1916, he was influenced by Darwin’s theory of the nature of
human organisms. In Dewey’s child-centered approach, as the human sought equilibrium
through problem-solving, the mind grew. For him, the child could grow best when he had
the ability to frame and pursue his own purposes. This philosophy, underlying the
Progressive Education movement in the 1920s-50s, viewed art as a means for children’s
self-expression.
In 1926, Goodenough published the Draw a Man Test. Tests of Fundamental
Abilities of Visual Art, by Lewerenz, in 1927, included production, aesthetic perception and
art history, for grades three through 12 (Hoepfner, 1983).
From 1942 to 1966, while art education emphasized creative production,
exploration of media, and personal expression, art test development was depressed. An
anti-test bias was promoted in the literature by Cizek, Dewey, Cole, D’Amico, Read,
Lowenfeld, Schaefer-Simmern, Kellogg, and Clark, Zimmerman, & Zurmuehlen.
During the same period, educational psychology was developing technological,
systematic theories led by Harap and Tyler, and followed by Anderson, Bloom, Cronbach,
Goodlad, and Taba. They ushered in the Behavioral Objective Movement which gained
strength after Russia’s launch of Sputnik in 1957 (Eisner, 1985). The national drive to
reform education focused on the “basics” and changed prevailing educational philosophy
from child-centered growth to the presentation and assessment of clearly-articulated,
measurable objectives.
Eisner developed two tests in 1966, the Eisner Art Information Inventory and the
Eisner Art Aptitude Inventory to measure students’ knowledge and attitudes about the
visual arts. Analysis of his research with secondary school students demonstrated that
neither attitude towards the arts, nor knowledge of art increased over four years of high
school (Clark, Zimmerman, & Zurmuehlen, 1987).
The 1974 and 1978 versions of the National Assessment of Educational Progress
(NAEP) arts assessments, led by Wilson, collected data to describe students’ abilities to:
1) perceive and respond to aspects of art; 2) value art; 3) produce art; 4) know about art;
and 5) make and justify judgments about the aesthetic merit of art. Included were
multiple-choice questions which required complex thinking, open-ended essay questions
based upon art reproductions and sculpture, and production activities (Clark, Zimmerman,
& Zurmuehlen, 1987; Gaitskill, Hurwitz, & Day, 1982). In analyzing the results of the
first NAEP studies, Clark, Zimmerman, and Zurmuehlen (1987) noted that students’ taste
in art became more conventional and realistic during the late seventies. At the same time,
the importance they placed upon art decreased. The items used to assess knowledge about
art and art history included identification of artworks, their dates, and places of creation.
Results of the assessment indicated that American students had limited knowledge about
art. An explanation was that American art curricula generally emphasized production of
art works rather than art history or art criticism. However, in spite of this focus, student
performance on design and drawing skills was lower than expected.
ARTS PROPEL, a program developed by Gardner of Harvard’s Project Zero, the
Educational Testing Service, and the Pittsburgh schools presented a portfolio assessment
model centered upon studio production, perception, and reflection which has influenced
the field (Gardner & Grunbaum, 1986; Clark, Zimmerman, & Zurmuehlen, 1987; Gardner,
1989; Gardner, 1990; Yau, 1990; Wolf& Pistone, 1991; Winner & Simmons, 1992;
Gitomer, 1992; Arter, 1995). Never intended as large-scale assessment, there were no
provisions for aggregation of data from the PROPEL studies.
In the 1980s and 1990s, public criticism of education focused on graduates’ deficient
entry-level skills for work in the information age (SCANS Report, 1992) and the United
States’ poor standing in international tests. Standards were viewed as a panacea. The
standards model of assessment included training judges to identify multiple right answers
(Castiglione, 1996). Arts educators from the visual arts, music, theatre, and dance formed
a consortium and responded to the standards movement by writing the National Standards
for Arts Education (1994). The states were charged with responsibility for assessment
(national testing, defeated by Congress in 1997, is likely to be revived in the future).
For the first large-scale arts assessment based upon national standards, the National
Assessment of Educational Progress (NAEP) planned to conduct a field test of fourth-,
eighth-, and twelfth-grade students in 1996-97 (NAEP Arts Education Assessment and
Exercise Specifications, 1996). However, funding limitations necessitated cutting the
proposed administration of the NAEP art test to a national sample of only eighth-grade
students. The test was innovative in its scope and performance components. The sample
of the general population, rather than those in art classes, was drawn from public and
private schools. Both paper-and-pencil tasks (used to assess responding) and performance
tasks (used to assess creating) were prepared by the Educational Testing Service. They
wrote:
The visual arts assessment covers both content and processes. Content
includes (1) knowledge and understanding of the visual arts and (2)
perceptual, technical, expressive, and intellectual/reflective skills.
Processes include (1) creating, and (2) responding. (National Center for
Education Statistics, U. S. Department of Education, NCES-526, p. 2)
Results indicated that students who did well on the responding (paper-and-pencil) activities
also did well on the creating tasks. In both categories, students were challenged. In the
responding category, average scores ranged from a high of 55 percent of students who
could identify an example of contemporary Western art, to a low of 25 percent of students
who could select a work that contributed to Cubism from four choices. On essay
responses, only four percent of students could write a complete, in-depth analysis,
compared with 24 percent who could give a limited, partial-score answer. The average
creating score in the visual arts was 43 percent of the possible points. Between one
percent and three percent of students scored at the optimal level on tasks that asked them
to create expressive artworks which showed consistent awareness of qualities such as
contrast, texture, and color. Demographically, 52 percent of students attended schools
where visual arts were taught to the typical eighth-grader at least three or four times a
week, though no significant relationships were found between frequency of instruction and
student scores (U.S. Department of Education, 1998). Another large-scale assessment
project, measuring art creation and reflection, is being developed by the Arts Education
Consortium of the Council of Chief State School Officers.
Art education textbooks, expressing prevailing philosophies, disseminated anti-testing
attitudes. Lowenfeld and Brittain, in Creative and Mental Growth (1957, 1987),
viewed testing as an impediment to growth. Kellogg (1969) wrote that tests interfered
with children’s natural development. Chapman, in Approaches to Art in Education
(1978), discussed program evaluation, of which one component was evaluation of
learning, represented by a list of qualitative ways to assess student progress. Eisner, in
Educating Artistic Vision, noted that “in Tests in Print, the most comprehensive catalogue
of published tests available in the world, only 10 of the 2,100 tests listed are for the visual
arts” (1972, p. 206). He identified production and criticism as appropriate subjects for
testing. In Children and Their Art, Gaitskell, Hurwitz, and Day (1982) espoused
evaluation through questioning students on personal expression, pupils’ reactions to work
of others, and students’ behaviors during participation in art activities. They reviewed the
work of Bloom, behavioral objectives, the NAEP studies, and Eisner’s theories of art
connoisseurship and criticism, and suggested that standardized tests were neither reliable
nor applicable to the classroom. In these texts, evaluation was relegated to the end of the
book rather than integrated with curriculum development and instruction. Teaching
strategies for art activities ended with production. With systematic evaluation absent from
major resources, generations of teachers modeled their classroom assessment on their
personal recollection of college instruction in studio courses. Eisner (1996) expressed the
predominant attitude:
Testing typically is predicated on the assumption that the desired outcomes
of educational activities are known in advance; artistic creation seeks
surprise. Testing aspires for all a set of common correct responses; in the
arts, idiosyncratic responses are prized. Testing typically focuses on pieces
or segments of information; artistic work emphasizes wholes and
configurations. (pp. 1-2)
Clark (1987) wrote that throughout history, art tests were most frequently
developed for descriptive purposes in research studies with minimal transference to the
classroom. Available art tests were idiosyncratic and specific to individual research
projects.
There is no history of national, normally distributed art achievement tests.
Textbooks were inadequate in suggesting means for national accountability of
achievement in the arts. Therefore, it is necessary to look for test models outside of the
arts.
Standardized Achievement Tests
The most broadly based test instruments are standardized achievement tests, which
use multiple-choice test formats and cover general knowledge. Results are designed to
match a statistical normal curve. Comparative tests are important where competition for
limited resources exists (admissions to degree programs, jobs) and for large-scale research
(Castiglione, 1996). They contain multiple-choice items, are based on recall of factual
knowledge, isolated skills, and memorization of procedures, do not require judgment, and
are reliable and valid (Frederiksen & Collins, 1989). Standardized tests’ long history
makes them acceptable to a wide audience, and they are easy to administer (Archbald &
Newman, 1988). Hamblen (1988) noted a trend toward standardized testing in the arts.
Traditional standardized testing was viewed by some educators as a political necessity and
could be used to report how students achieved in terms of general aspects of education
(Newman, 1990). Educational accountability requires reliable assessment to support
innovations in curriculum design, instructional methods, program funding, and student
evaluation (Gruber, 1994). These standardized tests are most frequently found in
mathematics, language arts, science and social studies subject areas.
“The relative lack of systematic content and sequence in art instruction at the
elementary grades accounts for the paucity of useful devices to assess achievement in art”
(Hoepfner, 1984, p. 251). Hoepfner (1984) believed his difficulty in finding art tests was
due to: uneven requirements for art in schools which generated only a small commercial
market for test developers, art educators’ lack of agreement on uniform art curriculum
content, and the high cost of printing and scoring good tests. In an analysis of available
art tests, Hoepfner characterized them as unstructured, verbally structured, or object
structured. Since empirical evidence did not exist on the reliability and validity of these
tests, he logically predicted that unstructured tests would have the highest validity and lowest
reliability. He found no evidence for claims that art either changed attitudes or had an
effect upon creativity.
The Discipline-Based Art Education (DBAE) movement aspired to give all
learners a lay understanding of the arts by engaging them in the four disciplines of artistic
production, criticism, aesthetics, and art history. Day (1985) explained that the process
and products of all these learning activities were meaningful candidates for evaluation for
the improvement of student learning. He saw congruence between DBAE goals and
testing “because evaluation is an essential component for validation of student
achievement” (Day, 1985, p.232). Another advocate of DBAE, Gentile, suggested a
balanced approach to assessment in which criterion-referenced grading using a mastery
learning process for production would be combined with standardized paper and pencil
tests of art criticism, aesthetics, and art history (Gentile, 1989).
Standardized art tests have engendered widespread interest in the United States and
abroad (Allison, 1977; Lai & Shishido, 1987). The Indiana Department of Education
(1988) developed a test for eighth-grade students that attempted to
evaluate art historical, art critical, and aesthetic responses in a multiple-choice format.
Students wrote in the 28-page booklet, which was filled with reproductions (many in full color).
Though the test was promising, its cost became prohibitive and it was discontinued. At the
post-college level, the Educational Testing Service (1998) developed a high-stakes art
knowledge test that is required by many states for teacher certification. The exam is
composed of multiple-choice items, constructed-response items, and an essay. The
multiple-choice questions are typical of standardized tests while the other sections are
criterion-referenced and are scored by trained raters using a rubric for scoring.
Concerns were expressed about the exclusive use of standardized tests. Popham
(1999) explained that standardized tests are poor indicators of educational quality because
their primary purpose is to separate and sort people. From a test writer’s perspective, the
goal of each item is to produce the maximum variance, meaning that items are discarded
unless close to 50 percent of test takers get the wrong answer. Teachers emphasize the
most significant content in any subject area, which results in too many test takers
answering those questions correctly. Therefore, the essential content is dropped from the
test while trivial pieces of knowledge, better at discriminating, remain. Worthen and
Spandel (1991) suggested that standardized tests represent only a small part of assessing
student learning, while teacher-centered assessment plays the greater role. Gordon (1977)
researched effects of achievement testing on disadvantaged and minority populations and
found that measures of diversity (such as differences in student interests, learning styles,
learning rates, motivation, work habits, personalities, ethnicity, sex, and social class) were
usually ignored in standardized assessments. Zimmerman (1992, 1994) noted that
standardized tests tend to reward districts with high socio-economic and entry level
scores; they are biased against women and minorities; there is a lack of correlation
between test scores and improved learning; and minorities are underrepresented in test
development. She stated that “students from diverse ethnic, racial, and social groups
possess unique characteristics that should be taken into consideration when art curricula
and assessment measures are being developed” (Zimmerman, 1994, p. 31). Instead of
standardized tests, she advocated a socio-cultural approach in which teacher and
community establish art content. The criteria need to be sensitive to, and include,
non-Western values of collectivism, traditionalism, non-permanence, and culturally meaningful
symbolism. Hamblen (1988) expressed concerns of many:
Using testing as a legitimating rationale can be a dangerous game even if
closely monitored and there is an explicit awareness.... Within the tautology
of a self-fulfilling prophecy, what fits systematization becomes legitimate
content. Art concepts can be easily limited to that which is technical,
formalistic, and, hence testable. (p. 60)
Standardized tests could be used in the arts and would be appropriate instruments
for the assessment functions of accountability, temperature-taking, reporting to the
community, and gatekeeping. If standardized tests were developed for the visual arts, the
writers should select meaningful rather than trivial content, build higher-order thinking
into complex questions, and address equity for multicultural, diverse populations.
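Popham's observation about item variance, summarized earlier in this section, can be illustrated with a short calculation. The sketch below assumes that each item is scored simply as right or wrong, in which case the variance of the item is p(1 - p), where p is the proportion of test takers who answer correctly. The function name and the list of proportions are hypothetical and are used only for illustration; they are not drawn from Popham.

```python
def item_variance(p_correct: float) -> float:
    """Variance of a right/wrong (0/1) item when a proportion p_correct answers correctly."""
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must be between 0 and 1")
    return p_correct * (1.0 - p_correct)


# Hypothetical proportions of test takers who answer an item correctly.
proportions = [0.10, 0.25, 0.50, 0.75, 0.95]
variances = {p: item_variance(p) for p in proportions}

# The proportion with the largest variance is the one a norm-referenced
# test writer would prefer, because it spreads test takers out the most.
best = max(variances, key=variances.get)

for p, v in variances.items():
    print(f"p = {p:.2f}  variance = {v:.4f}")
print(f"largest variance at p = {best:.2f}")
```

Under this assumption, an item that nearly every student answers correctly contributes very little variance, which is consistent with the concern that well-taught, essential content tends to be dropped from norm-referenced tests.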
Criterion-Referenced Multiple-Choice Tests
Criterion-referenced tests are linked directly to the learning objective
established for the curriculum. No a priori attention is paid to the
distribution of resulting scores. Successful completion of criterion-
referenced tests is one indicator of mastery of content. (Hoepfner, 1984,
p. 252)
National consensus on art curriculum content will be needed in order to develop
national, criterion-referenced tests. Some arts educators believe a national curriculum
already exists because of 1) state agency frameworks, 2) textbooks, 3) National Teacher’s
Exam, 4) NAEP and the National Art Education Association research agenda, and 5)
Getty’s promotion of DBAE (Zimmerman, 1994). These are insufficient to provide
specific and agreed-upon art content, concepts, processes, or art historical emphases
necessary for a national, criterion-referenced test.
Although criterion-referenced tests are not appropriate for national testing in the United
States, Gentile (1989) proposed that they be used for classroom assessment because they 1)
ensure that students do complete work, 2) establish criteria and standards for adequate
work, and 3) provide incentive to master and excel. Grove (1996)
suggested that “Criterion-referenced tests can be appropriately used in small-scale testing
where common curriculum objectives exist” (p. 358). Gaitskell, Hurwitz, and Day (1992)
provided formats for teachers to use when developing multiple-choice, short answer, and
essay tests. Limitations of both standardized and criterion-referenced multiple-choice tests
are summarized by Parsons (1990):
Understanding, or higher order thinking, is not all of one kind, and can’t be
represented or assessed by a single overall quantitative score. It requires
facts, concepts of different levels of generality, ways of organizing facts
and concepts, procedures and strategies for answering questions and
approaching tasks, and knowledge structures that allow one to organize all
of these. (p. 31)
Wiggins (1989) criticized criterion-referenced tests as inadequate because the
problems were contrived and the cues artificial. Criterion-referenced tests can be
appropriately used for school, district, or state-level testing where common curriculum
objectives exist. They could serve the functions of accountability, gatekeeping,
improvement of instruction, communication of achievement to all stakeholders, and
modifications of instruction based upon measurement of student learning.
Alternative Forms of Assessment
We can not be said to understand something unless we can employ our
knowledge wisely, fluently, flexibly, and aptly in particular and diverse
contexts. (Wiggins, 1993, p.200)
The umbrella category of alternative assessment refers to a group of assessment
practices which do not employ standardized or criterion-referenced, multiple-choice
format tests. Performance-based assessments require students to create a product or to
perform a task. Scoring allows partial credit as a means of evaluating process as well as
the final product. Authentic assessments are performances set in a real-world context, and
therefore may be ill-structured, without a single known solution, and frequently may be
evaluated by an audience of experts. A portfolio is typically a collection of student works
demonstrating process, reflection, and final product(s). The portfolio is a methodology
which can be employed as a means of organizing and presenting documents for
performance-based or authentic assessments.
Wiggins (1989) explained that the movement to alternative forms of assessment
was driven by a reaction to:
the key assumptions of conventional test design - the decomposability of
knowledge into elements and the decontextualization of knowing whereby it
is assumed that if we know something, we know it in any context.... A true
test of intellectual ability requires the performance of exemplary
tasks... reform begins by recognizing that the test is central to
instruction.... The catch is that the test must offer students a genuine
intellectual challenge, and teachers must be involved in designing the test.
(p. 704)
Assessments were performance-based from the time of Socrates until the
development of the Army Alpha multiple-choice exam during World War I (Popham,
1993). In response to deficits in American education publicized by the SCANS Report
(1992), performance-based assessments were resurrected. The business community reported
that workers needed to
demonstrate complex skills such as problem-solving, working collaboratively, self-
direction, and effective communication rather than knowledge of the discrete facts measured
by standardized achievement tests. Business and educational reform demands (America
2000: An Education Strategy, 1991) coincided and led to standards development. Broad
process skills, or “outcomes,” were included in national and state-level standards in the
content areas (National Standards for Arts Education, 1994; Show-Me Standards for
Missouri Schools, 1996). Criteria are essential in alternative forms of assessment. The
determination of whether criteria are met usually depends upon a scorer’s judgment or
qualitative analysis (Grove, 1996).
Performance-based Assessment
Performance-based assessment requires students to be active participants.
Students are responsible for creating or constructing their responses (Rudner & Boston,
1994). Tasks that can be used to judge performance are: samples of work in process,
final product, journals, research papers, group presentations or performances, peer
critiques, interviews, self-evaluations, portfolios, essays, discussions, audio tapes, video
tapes, sketches, notes, media experiments, exhibitions, behavior profiles, peer teaching,
and retrospective verbal responses (Siegler, 1989; Wiggins, 1989, 1993; Maeroff, 1991;
MacGregor, 1992; Beattie, 1992, 1994; Madaus, 1993; Worthen, 1993; Zimmerman,
1992, 1994; Gruber, 1994; Rudner & Boston, 1994; Grove, 1996; Boughton, 1997).
The limitation of performance-based assessment as a large-scale assessment is the
cost. While standardized or criterion-referenced tests are machine-scored,
product/performance-based scoring requires intense training and time-consuming analysis.
Student products are initially scored by at least two independent raters. Often a third or
fourth is necessary to resolve differences of opinion. Though most classroom instruction
is performance-based, it differs from large-scale assessment in that there is no feedback on,
or moderation of, the teachers’ scoring of student works.
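The multiple-rater procedure described above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the sources cited here do not specify a rubric scale, an agreement tolerance, or a resolution rule. The function name `resolve_score`, the one-point tolerance, and the rule of averaging the two closest scores are all hypothetical choices.

```python
from statistics import median


def resolve_score(first, second, extra=(), tolerance=1):
    """Return (final_score, raters_used) for one student product.

    Assumed rule (hypothetical): if any two ratings agree within
    `tolerance` rubric points, their average is the final score.
    Additional raters are consulted one at a time only when needed.
    """
    scores = [first, second]
    if abs(first - second) <= tolerance:
        return (first + second) / 2, 2

    for rating in extra:  # third rater, then fourth, and so on
        scores.append(rating)
        ordered = sorted(scores)
        # Find the closest pair of adjacent ratings.
        gap, i = min((ordered[k + 1] - ordered[k], k) for k in range(len(ordered) - 1))
        if gap <= tolerance:
            return (ordered[i] + ordered[i + 1]) / 2, len(scores)

    # No pair agrees: fall back to the median of all ratings collected.
    return median(scores), len(scores)


print(resolve_score(3, 4))             # two raters agree within one point
print(resolve_score(1, 4, extra=[4]))  # a third rater resolves the difference
```

Each additional rater adds scoring time for the same product, which is one way to see why the cost of large-scale performance assessment rises quickly when raters disagree.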
Performance-based assessment is appropriate for large-scale, high-stakes testing
and is currently being used in state tests in Rhode Island and Kentucky (Maeroff, 1991;
Kentucky, 1996). It would be appropriate for large-scale temperature-taking,
gatekeeping, determining if course objectives had been attained, providing feedback on
both individual students and on the quality of art programs, and informing stakeholders
about student achievement.
Within the general category of performance-based assessment there are two
variants common to the classroom and literature: authentic assessment and
portfolio assessment.
Authentic Assessment
Authentic assessment has students demonstrate what they might do outside of
class in the course of normal life (Kentucky, 1996). These assessments are typically
embedded (taught by the teacher as part of the regular instructional program). A scenario,
or real-life context, is presented in which students are expected to solve problems that
adults deal with in contemporary society (Popham, 1993; Wiggins, 1989, 1993, 1998,
1999; MacGregor, 1992; Beattie, 1992, 1994; Madaus, 1993; Worthen, 1993; Milbrandt,
1998). Bloom, Hastings, and Madaus (1981) suggested that students should have open
access to a variety of reference materials when being tested for synthesis level thinking.
Ideally, synthesis problems should be as close as possible to the situation in
which a scholar (or artist, or engineer, etc.) attacks a problem he or she is
interested in. The time allowed, conditions of work, and other stipulations
should be as far from the typical, controlled examination situation as
possible. (pp. 52-53)
Teachers are the best evaluators of their students’ authentic tasks (Zimmerman, 1992;
Beattie, 1998; Huffman, 1998). Authentic tests involve the following factors:
1) engaging and worthy problems or questions of importance in which
students must use knowledge to fashion creative and effective
performances...similar to real world problem.
2) faithful representation of contexts in real life
3) non routine and multistage tasks - real problems
4) tasks require a quality product/performance
5) transparent or demystified criteria and standards
6) interactions between assessor and assessee
7) response-contingent challenges where process and product are
important with concurrent feedback and possibility of self-adjustment
during the test
8) trained assessor judgment in reference to clear and appropriate criteria
9) search for patterns of response in diverse settings (Wiggins 1993, p.
206-207).
In authentic assessments, rubrics or scoring guides are used to list criteria and
describe levels of achievement. Rubrics, the frameworks around which students build
their work, are best when collaboratively created by students and teacher and include self-
assessments (Grove, 1996; Huffman, 1998). In order to discriminate levels of
performance, some researchers contrast novice and sophisticated, rather than age-related,
responses (Efland, 1990; Parsons, 1990). Exemplars or benchmark samples of student
work provide models for students at the beginning of an assignment, and help teachers
calibrate scores during scoring (Frederiksen & Collins, 1989). Critical issues facing
alternative assessment are:
1) conceptual clarity
2) mechanisms for self-criticism
3) support from well-informed educators
4) technical quality and truthfulness
5) standardization of assessment judgments
6) ability to assess complex thinking
7) acceptability to stakeholders
8) appropriateness for high-stakes assessment
9) feasibility
10) continuity and integration across educational systems
11) use of technology
12) avoidance of monopolies (Worthen, 1993, pp. 447-453).
Based upon authentic assessment in Great Britain, Madaus and Kellaghan (1993)
proposed that large-scale, high-stakes authentic assessments may be prematurely
discontinued due to constraints of time, money, and training of scorers. In a presentation
to the Missouri Art Education Association, Peeno (1999), head of state art assessment,
stated that “authentic assessment costs the same amount as teachers’ salaries and supplies
- the district cost per student.” Popham’s (1993) solution was to use genuine matrix
sampling in which a low proportion of both students and assessment tasks are formally
assessed. Students are prepared for many techniques, only a few of which are assessed.
Teachers are influenced by what is eligible to be tested as well as what is actually tested.
The quality of assessment stays high and the costs decrease. For those not participating in
the sample, Popham advised that the government provide “difficulty-equated, but non-
secure, authentic assessment to districts to allow teachers (on a voluntary basis) to show
how well their students are doing” (p.473). These locally-scored assessments keep the
focus of assessment consistent among the districts selected for formal assessment and
those that are not part of the matrix sample.
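The logic of matrix sampling can be made concrete with a brief sketch. It assumes simple random selection of both students and tasks. The function `matrix_sample`, the 25 percent student fraction, and the task names are hypothetical and serve only to illustrate the idea; Popham's proposal does not prescribe these particulars.

```python
import random


def matrix_sample(students, tasks, student_fraction, tasks_per_student, seed=0):
    """Assign a few randomly chosen tasks to a randomly chosen subset of students.

    Returns a dict mapping each sampled student to the tasks that will be
    formally scored for that student. All other student-task pairs go unscored.
    """
    rng = random.Random(seed)  # fixed seed so the plan can be reproduced
    n_students = max(1, round(len(students) * student_fraction))
    sampled_students = rng.sample(students, n_students)
    return {s: rng.sample(tasks, tasks_per_student) for s in sampled_students}


# Hypothetical class roster and pool of authentic tasks.
students = [f"student_{i:02d}" for i in range(40)]
tasks = ["drawing", "sculpture", "critique essay", "art history report", "design brief"]

plan = matrix_sample(students, tasks, student_fraction=0.25, tasks_per_student=2)
scored = sum(len(assigned) for assigned in plan.values())
print(f"{len(plan)} of {len(students)} students sampled")
print(f"{scored} of {len(students) * len(tasks)} possible student-task pairs scored")
```

In this illustration only 20 of 200 possible student-task pairs are formally scored, yet every task remains eligible for selection, so instruction still has to address all of them.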
Authentic assessments would be appropriate for large-scale temperature-taking,
gatekeeping, determining if course objectives have been attained, providing feedback on
both individual students and on the quality of art programs, and informing stakeholders
about student achievement. The advantage of authentic assessment, over other types of
performance-based assessment, is that a connection is made between what the students are
producing and why anyone would ever produce it. Therefore, natural connections are
made to art careers and lifelong avocations. As such, what students learn in the process of
performing authentic assessment should be more meaningful, likely to be retained over
time, and tend to be transferred to other learning situations.
Portfolio Assessment
Portfolios have historically been used in visual arts; however, until recently little
was written about requirements, contents, and the interaction between student and teacher
regarding the portfolio. Portfolio assessment is a type of performance-based assessment
which appeared during the standards movement, beginning in 1985, in reaction to
standardized testing.
Many of the tests students encounter, by virtue of the tests’ design as a
series of unrelated questions, draw teaching and learning toward the
mastery of facts and away from large ideas and processes. Students’
repeated encounters with multiple-choice, timed tests teach them that the
bases for success in school are first draft answers rather than sustained
explorations, correctness rather than risk, and information rather than
conceptualization. (Wolf, 1991, p. 65)
Though traditionally used in the visual arts for admissions to art schools and to acquire
jobs, portfolios became a popular addition to traditional testing in language arts, science,
math, and social studies in the 1980s and 1990s (Arter, 1990, 1995). Hamblen (1988) noted the
irony that as other subjects’ testing was becoming more open, the arts were becoming
more standardized. Arter (1995) explained that portfolios were not an end in themselves,
but a means to an end. She reviewed literature on portfolio assessment and concluded
that little hard evidence existed to show that portfolio assessment necessarily led to critical
thinking, self-reflection, responsibility for learning, skills or knowledge (p. 1). When
portfolios exhibited clarity of purpose and criteria, the advantages of their use were: 1) a
broader, in-depth picture of the student; 2) authenticity; 3) supplements or alternatives to
grade cards and/or achievement tests; and 4) communication to parents. In addition, portfolios
could be used for certification of competence, to track growth over time, and to
demonstrate accountability. Arter (1990) raised several issues: To what extent must
process/content/performance criteria be standardized to be comparable? Are portfolios
feasible and cost-effective? Will teachers buy in? Will conclusions be valid? (pp. 5-6). The
most influential model of portfolio assessment in the arts has been ARTS PROPEL, in
which the theory of multiple intelligences, developed by Gardner, led to studio-centered
production, perception, and reflection, and offered expanded opportunities for students to
learn beyond traditional logical-linguistic means. Performance tasks were more likely to
elicit a student’s full repertoire of skills (Gardner, 1989). Portfolio assessment was
reclaimed by many in the arts (Gardner & Grunbaum, 1986; Clark, Zimmerman, &
Zurmuehlen, 1987; Gardner, 1986, 1989, 1990; Yau, 1990; Taylor, 1991, 1993; Wolf &
Pistone, 1991; Anderson, D., 1992; Winner & Simmons, 1992; Gitomer, Grosh, & Price,
1992; Hausman, 1993; Coates, Gaither & Shauck, 1993; Carroll, 1993; Swann & Bickley-
Green, 1993; Reynolds, 1993; Thomas, 1993; Warner, 1993; Anderson, T., 1994; Beattie,
1994; NAEA Advisory, 1993, 1994; Vermont Assessment Project, 1995). Common
characteristics of art portfolio assessment are: it is student-centered; assessment is both
formative and summative; learning is viewed as an active, constructive process; student
self-reflection is evident; criteria are specified for selection of works and for merit; and
process (documented by sketches, photographs, video-tapes, journals, self-reflective
writings, etc.) receives attention along with final products. Dialogues between student
and teacher or student and peers are credited with increased self-motivation, self-
direction, and critical analysis abilities (Wolf, 1991). Though Vermont (1995)
and California (Taylor 1991, 1993) experimented with large-scale portfolio assessment,
problems occurred when attempting to aggregate data (Arter, 1995). Most portfolio use
was classroom-based and internally moderated. Notable exceptions are the large-scale,
high-stakes, externally moderated portfolio assessments used by the Educational Testing
Service on their Advanced Placement art exam, the British national assessment, the New
South Wales, Australia exam, and the International Baccalaureate Program (Anderson,
1994; Blaikie, 1994; Beattie, 1997; Boughton, 1997; Gaston, 1997; Weate, 1999).
Portfolio assessment is appropriate for functions related to individual student
achievement, monitoring growth, providing feedback to improve art curricula, and
demonstrating student progress to parents. It has potential for use in large-scale
assessment if means are developed for aggregation of portfolio information, criteria are
standardized, and rater training issues are resolved.
Performance Assessment Criteria
A wide variety of criteria have been employed in the evaluation of student art.
Judgments are made about art products in diverse venues including an individual teacher’s
classroom and national assessments. These assessments serve different functions and
value different aspects of art. Blaikie (1992, 1994) found that the Advanced Placement
exams concentrate almost exclusively on finished art products, while the International
Baccalaureate evaluates workbook process records and welcomes the art teacher’s
comments in addition to analysis of final art products. In contrast, ARTS PROPEL places
greater emphasis on process and reflective thinking than on the final art product.
Furthermore, many rubrics include behaviors that reflect habits of mind such as
perseverance, fluency, flexibility, and skills in research, analysis, synthesis, and making
judgments. The type of products assessed also varies. In some cases, only studio art
production is assessed, while in others, historical, critical, and/or aesthetic products are
also evaluated.
Clark and Zimmerman (1984) reviewed the literature in art education looking for
observable criteria or indicators of student success in art. In Educating Artistically
Talented Students, they created a composite list of characteristics. Though their purpose
was to use the criteria to separate the talented from the typical art student,
the descriptors can be viewed as the exemplary column of a performance rubric for all
students. First, they considered criteria evident in artworks. Later, they looked at
behaviors of the student that could be indicative of success in art.
To assess the art product, Clark and Zimmerman (1984) identified five
components of product assessment. The first, “compositional arrangement,” encompassed:
skillful composition; complete and coherent designs; purposeful, asymmetrical
arrangement with stability in irregular placement; three or more objects integrated by a
balanced arrangement; complex composition; and elaboration and depiction of details.
The second subset of criteria was “elements and principles,” which included: well-organized
colors; deliberate brilliancy and contrast; subtle blending of colors; decisive use of line;
clarity of outline; subtle use of line; accurate depiction of light and shadow; intentional use
of indefinite shapes, hazy outlines, shapes blended into the background; and excellence in
use of color, form, grouping, and movement (p. 53). The third characteristic of products
was “subject matter,” which included: specializes in one subject matter; draws a wide
variety of things; sometimes copies to acquire technique; adept at depiction of movement;
and uses personal experiences and feelings as subject matter. The fourth component was
“art-making skills” (p. 56). Included attributes were: true-to-appearance representation;
accurate depiction of depth by perspective; use of good proportion; schematic and
expressive representation; effective use of media; and products show obvious talent and
artistic expression. The fifth category under the art product was “art-making techniques.”
Specifics listed were: areas treated to display boldness, blending, gradation, and textures;
visual narratives used for self-expression and as a basis for mature art expression; and uses
smaller paper (p. 56).
Clark and Zimmerman (1984) found that, in the literature, researchers looked
beyond the product to consider observational behavior as criteria for success in art. Under