Assessment In Art Education Chapter 8: Validity and Reliability Facilitated by Marissa Barclay
Validity and Reliability? What do these have to do with teaching art?! The art educator involved in assessment initiatives beyond the classroom needs to have a working knowledge of validity and reliability issues as well as item analysis, all of which are critical to assessment practice. In layman's terms: YOU NEED TO KNOW IT! **All information in this PowerPoint is courtesy of the author, Donna Kay Beattie**
VALIDITY Validity: In short, inferences drawn from a test or assessment score need to be validated. There are twelve validation criteria that are useful for judging the validity of performance-based art assignments. These criteria should be addressed during the given performance task.
#1: Relevance Relevance refers to the quality of fit between the purpose of the assessment and the selected performance format and tasks. The art educator should look for the best possible match between purpose and assessment performance formats.
#2: Content Fidelity and Integrity This refers to the authenticity of assessment task content. The assessment task must faithfully reflect the integrity of the discipline and clearly show the field’s most time-tested and valuable content and processes.
#3: Exhaustiveness This term refers to the scope and comprehensiveness of performance task content and underlying constructs. Constructs are the internal qualities and behaviors that undergird a performance.
#4: Cognitive Complexity Refers to the levels of intellectual complexity the performance task requires of students. An example of this might be a holistic performance-based task that covers each of the four visual arts disciplines, their interconnections, and their connections to the other academic disciplines.
#5: Equity Refers to the extent to which a task allows equal opportunities for all students to succeed. The art educator can create a checklist of possible sources of bias to be used to review the task as it is being developed.
#6: Meaningfulness Refers to how motivating, challenging, and satisfying a task is to both students and others who might have interest in the task, such as parents, other teachers, administrators, and experts from the art disciplines.
#7: Straightforwardness Refers to students’ need to see and understand what is expected of them. In assessment terminology this is known as transparency.
#8: Cohesiveness Refers to the homogeneity of exercises in a performance. If the holistic task score is to be interpreted as a valid measure of knowledge of the visual arts, then each exercise within the task should correlate well with the others.
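One rough way to check cohesiveness is to correlate the scores students earned on each exercise with their scores on every other exercise. The sketch below is only an illustration: the exercise names and scores are made up, and Pearson's r is an assumed choice of statistic, not one named in the chapter.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length score lists.

    Raises ZeroDivisionError if every score in a list is identical.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def pairwise_correlations(scores):
    """Return r for every pair of exercises in a {name: scores} dict."""
    return {
        (a, b): pearson(scores[a], scores[b])
        for a, b in combinations(scores, 2)
    }


# Hypothetical data: one list per exercise, one entry per student.
exercise_scores = {
    "art_history": [3, 4, 2, 5, 4],
    "criticism": [2, 4, 2, 5, 3],
    "production": [3, 5, 1, 4, 4],
}

for pair, r in pairwise_correlations(exercise_scores).items():
    print(pair, round(r, 2))
```

An exercise that correlates weakly with all the others may be measuring something different from the rest of the task and is worth a second look.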
#9: Consequences Refers to both intended and unintended consequences of a task or of interpretation of task scores. The art educator can address the consequential criterion by anticipating possible assessment side effects (negative and positive), and hypothesizing potential testing outcomes.
#10: Directness Refers to the extent to which the task reflects the actual behavior or characteristic being examined. A direct assessment of students’ abilities to criticize a work of art would be to have them write a critical review. Knowledge and skills cannot be assessed directly, but are inferred from performances and products.
#11: Cost and Efficiency The value of the performance task must override the cost, and performance assessments generally cost more than multiple-choice or other pencil-and-paper test formats. All aspects of a performance assessment should be carefully studied in an effort to devise ways to reduce costs and increase efficiency.
#12: Generalizability Refers to the degree to which the results of a performance assessment can be generalized across different domains. Context can be considered here because it is concerned with issues of transfer and generalizability. Classroom settings, classroom management effects, and even rater variability are contextual issues that need to be analyzed before making judgments of transfer.
RELIABILITY Reliability: This is a concept known in classical tests and measurement theory.
Can also be defined as the consistency of scores.
For the art educator it is worthwhile to know what factors might cause an unreliable assessment score.
Note: an assessment can be reliable without being valid, but it cannot be valid without being reliable.
Meaning: reliability affects the quality of the decisions made on the basis of the derived score.
Reliability…continued. There are twelve procedures to help improve the reliability and generalizability of art assessments: **Also listed on the handout**
-Assess the same material
-Broaden the scope of assessments
-Develop clear and concrete scoring criteria
-Make annotated examples showing each score
-Make scoring objective; use a scoring rubric
-When possible, use more than one scorer
Reliability…still continuing…
-If more than one judgment is needed, train scorers
-Check consistency; go back and check scores
-Score one question on all tests, then the next, etc.
-Provide practice and training assessments
-Craft tests to fit each student’s needs
-Design tasks that help differentiate the most able from the least able students
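When more than one scorer is used, a quick consistency check is to see how often two scorers gave the same piece the same rubric score. The sketch below is a minimal illustration with made-up scores; the function name, the one-point tolerance, and the choice of simple percent agreement are all assumptions, not procedures taken from the chapter.

```python
def agreement_rates(scores_a, scores_b, tolerance=1):
    """Compare two scorers' rubric scores for the same set of student works.

    Returns (exact, adjacent): the share of works given identical scores,
    and the share whose scores differ by no more than `tolerance` points.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("Both scorers must rate the same non-empty set of works")
    pairs = list(zip(scores_a, scores_b))
    exact = sum(a == b for a, b in pairs) / len(pairs)
    adjacent = sum(abs(a - b) <= tolerance for a, b in pairs) / len(pairs)
    return exact, adjacent


# Hypothetical rubric scores (1 to 4) from two scorers on the same six works.
scorer_one = [4, 3, 2, 4, 1, 3]
scorer_two = [4, 2, 2, 1, 1, 3]

exact, adjacent = agreement_rates(scorer_one, scorer_two)
print(f"Exact agreement: {exact:.0%}, within one point: {adjacent:.0%}")
```

Works where the two scores are far apart are good candidates for a second look or for discussion during scorer training.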
Item Analysis This is a psychometric issue related to validity and reliability. The term is defined as “the process of collecting, summarizing, and using information from students’ responses to make decisions about each assessment task.” Three steps of a simple item analysis on an important classroom test (written exam):
1) The teacher groups the assessments according to high and low scores.
2) Tallies are made of each group’s responses to each test item.
3) Percentages for item responses are figured.
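The three steps above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the data are made up, the function name is hypothetical, and splitting the class into a top half and a bottom half is an assumed way to form the high and low groups (the slide does not say where to draw the line).

```python
from collections import Counter


def item_analysis(students, n_items):
    """Simple item analysis.

    students: list of (total_score, [chosen option for each item]).
    Returns {"high": [...], "low": [...]}, where each list holds, per item,
    a dict mapping each chosen option to the percentage of that group.
    """
    if len(students) < 2:
        raise ValueError("Need at least two students to form two groups")

    # Step 1: group the tests by high and low total scores.
    # With an odd class size the middle student is left out of both groups.
    ranked = sorted(students, key=lambda s: s[0], reverse=True)
    half = len(ranked) // 2
    groups = {"high": ranked[:half], "low": ranked[-half:]}

    report = {}
    for name, group in groups.items():
        report[name] = []
        for item in range(n_items):
            # Step 2: tally the group's responses to this item.
            tally = Counter(answers[item] for _, answers in group)
            # Step 3: convert the tallies to percentages.
            report[name].append(
                {option: 100 * count / len(group) for option, count in tally.items()}
            )
    return report


# Hypothetical two-item written exam taken by four students.
class_results = [
    (10, ["A", "B"]),
    (9, ["A", "C"]),
    (4, ["B", "B"]),
    (3, ["A", "B"]),
]

for group_name, items in item_analysis(class_results, n_items=2).items():
    print(group_name, items)
```

If the low group picks the correct answer to an item as often as, or more often than, the high group, that item may be confusing or flawed and deserves review.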
Graphical descriptions of validity and reliability
What does “validity” mean? Validity means that inferences drawn from a test or assessment score need to be validated.
Name two or three criteria useful for judging validity of performance-based art assignments.
-Relevance
-Content fidelity and integrity
-Exhaustiveness
-Cognitive complexity
-Equity
-Meaningfulness
-Straightforwardness
-Cohesiveness
-Consequences
-Directness
-Cost and efficiency
-Generalizability
Define “reliability” This is a concept known in classical tests and measurement theory. Can also be defined as the consistency of scores.
True or False: An assessment can be valid without being reliable. FALSE! An assessment can be reliable without being valid, but it cannot be valid without being reliable. Meaning: reliability affects the quality of the decisions made on the basis of the derived score.