© 2013 Springer Publishing Company, LLC.
Chapter 2
Qualities of Effective
Assessment Procedures
Oermann & Gaberson
Evaluation and Testing in Nursing Education
4th edition
General Criteria for Effective
Assessment Procedures
♦ Produce results that can be used to make
appropriate inferences about learners’
knowledge and abilities
– Important educational decisions based on
such inferences
♦ Practical and easy to use
Guiding Questions
♦ To what extent will the interpretation of the
scores be appropriate, meaningful, and useful
for their intended application?
♦ What are the consequences of how the results
are used and interpreted?
Assessment Validity
♦ Concept has changed over time
♦ Current philosophy
– Meaningfulness of the interpretations that
teachers make of assessment results
– Adequacy and appropriateness of inferences
about scores and how results are used
– Emphasis on consequences (intended and
unintended) of test use
Assessment Validity (cont’d)
♦ Not a static property of the test itself
♦ Not an either/or judgment
– Degrees of validity depending on purpose of test
and how scores will be used
Assessment Validity (cont’d)
♦ Unitary concept
– Variety of sources of evidence to support the
validity of the interpretation and use of
assessment results
– Four major considerations for validation
• Content
• Construct
• Assessment-criterion relationships
• Consequences
Content Considerations
♦ Goal of content validation
– Determine the degree to which the assessment
tasks accurately represent the domain of content
or abilities about which the teacher wants to
interpret assessment results
– A test is only a sample of the universe of possible
assessment tasks
– “Face validity” is insufficient evidence of content
representativeness
Content Considerations (cont’d)
♦ Start by defining the universe of content
– Should be related to the purpose for which
the test will be used
♦ Write or select test items that satisfactorily
represent the desired content domain
– Test blueprint or table of specifications
documents
– Also important when selecting a published test
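A blueprint check of this kind can be sketched in a few lines of Python. The content areas, planned weights, and item tags below are hypothetical values made up for illustration; they are not taken from the text. The function compares the share of items actually written for each area with the share the blueprint planned.

```python
from collections import Counter

# Hypothetical blueprint: planned share of test items per content area.
blueprint = {"assessment": 0.40, "pharmacology": 0.30, "patient education": 0.30}

# Hypothetical draft test: the content area each item was written for.
item_areas = ["assessment"] * 9 + ["pharmacology"] * 5 + ["patient education"] * 6


def coverage_gaps(blueprint, item_areas):
    """Return (actual share - planned share) of items for each blueprint area."""
    if not item_areas:
        raise ValueError("the test has no items")
    counts = Counter(item_areas)
    total = len(item_areas)
    return {area: counts[area] / total - planned
            for area, planned in blueprint.items()}


gaps = coverage_gaps(blueprint, item_areas)
for area, gap in gaps.items():
    print(f"{area}: {gap:+.2f}")
```

A positive gap suggests an area is sampled more heavily than planned, a negative gap that it is under-sampled. Whether a given gap matters remains a judgment about the purpose of the test.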
Content Considerations (cont’d)
♦ Assessed by content-domain experts
– Determine if assessment tasks represent the
• content domain (as specified on test blueprint)
• learning outcomes
– Trustworthiness of this evidence is based on
estimation of rater reliability
• How closely do the judgments of multiple
experts agree?
Construct Considerations
♦ “Umbrella” concept for all types of
assessment validation
♦ Goes beyond content considerations
– Used to make inferences from assessment results
to more general abilities (e.g., clinical reasoning)
– What construct is the assessment intended to
measure?
Construct Considerations (cont’d)
♦ Construct
– Characteristic assumed to exist because it explains some
observed behavior
– Cannot be observed directly—inferred from performance
♦ Construct validation
– Determining the extent to which assessment results can
be interpreted in terms of the construct
♦ Two central elements
– Construct representation
– Construct relevance
Construct Considerations (cont’d)
♦ Construct representation
– Extent to which important elements of the
construct are represented in the assessment
♦ Construct relevance
– Extent to which the assessment focuses only
on relevant elements of the construct
– Omits factors that are unrelated or irrelevant
to the construct (e.g., writing ability, English
language literacy)
Construct Considerations (cont’d)
♦ Methods used in construct validation
– Define the domain to be measured
– Analyze the process of responding to tasks
required by the assessment
– Compare assessment results of known groups
– Compare assessment results before and after a
learning activity
– Correlate assessment results with other measures
Assessment-Criterion Relationship
Considerations
♦ Predictive validation
– Focuses on predicting future performance (the
criterion) based on current assessment results
♦ Concurrent validation
– Uses assessment results to estimate
performance on another assessment (the
criterion measure) at the same time
– Not widely used for teacher-made assessments
Assessment-Criterion Relationship
Considerations (cont’d)
♦ Relationship between assessment scores and
criterion-measure scores usually expressed as
a correlation coefficient
♦ Teacher who uses the test must judge what
magnitude of correlation is adequate for the
intended use of the assessment
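As a minimal sketch of how such a coefficient might be computed, the Python below implements the Pearson product-moment correlation, one common choice of correlation coefficient (the slides do not name a specific one). The two score lists are hypothetical data for five students, invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list has no variation")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores: a course exam and a later criterion measure.
assessment_scores = [70, 75, 80, 85, 90]
criterion_scores = [65, 72, 78, 88, 91]

r = pearson_r(assessment_scores, criterion_scores)
print(f"assessment-criterion correlation: {r:.2f}")
```

The code only produces the number. Deciding whether that magnitude is adequate for the intended use is still the teacher's judgment.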
Consideration of Consequences
♦ Assessment has intended and unintended
consequences
♦ Concept of validity includes consideration of
the consequences of assessment use and how
results are interpreted by students, teachers,
and other stakeholders
Influences on Validity
♦ Characteristics of the assessment
– Examples: clarity of directions, number of items,
test construction errors
♦ Assessment administration and scoring factors
– Examples: cheating, scoring errors, time limits
♦ Student characteristics
– Examples: test anxiety, motivation
Reliability
♦ Consistency of test scores
♦ Extent to which test scores are accurate,
error-free, and stable
♦ Reproducibility and generalizability of
test scores
♦ Necessary but insufficient condition
for validity
Reliability (cont’d)
♦ Sources of inconsistency
– Instability of the behavior being measured
– Sample of tasks varies from one assessment to
another
– Assessment conditions vary significantly
– Scoring procedures are inconsistent
♦ These and other factors introduce an
unknown amount of error into every
measurement
Reliability (cont’d)
♦ Obtained score
– The number of correct answers
♦ True score
– Hypothetical
– Cannot be measured directly
– Represents what the student actually knows
♦ Error score
– Difference between true score and obtained score
– Cannot be measured directly
– Affects measurement reliability
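In the notation commonly used in classical test theory (the symbols X, T, and E and the sign convention are assumptions here, not taken from the slides), the three scores are related as follows:

```latex
X = T + E \qquad\Longleftrightarrow\qquad E = X - T
```

Here X is the obtained score, T the true score, and E the error score. Only X is observed, so T and E can only be estimated.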
Reliability (cont’d)
♦ Methods of determining assessment reliability
estimate how much measurement error is
present
♦ When assessment results are reasonably
consistent, measurement error ↓ and
reliability ↑
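One standard way to express this inverse relationship, assuming the classical test theory model in which error is uncorrelated with the true score (so that obtained-score variance is the sum of true-score and error variance), is:

```latex
\sigma_X^2 = \sigma_T^2 + \sigma_E^2,
\qquad
\rho_{XX'} = \frac{\sigma_T^2}{\sigma_X^2} = 1 - \frac{\sigma_E^2}{\sigma_X^2}
```

The symbols are assumed notation: the sigma-squared terms are the variances of obtained, true, and error scores, and rho is the reliability coefficient. As the error variance shrinks relative to the obtained-score variance, the coefficient approaches 1.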
Reliability (cont’d)
♦ Reliability pertains to assessment results, not
to the assessment instrument
♦ A reliability estimate always refers to a
particular type of consistency
♦ A reliability estimate is always represented by
a statistical value (reliability coefficient or
standard error of measurement)
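The two statistics are linked by a widely used formula: the standard error of measurement equals the standard deviation of the scores times the square root of one minus the reliability coefficient. The sketch below applies it to hypothetical values (a standard deviation of 10 points and a reliability of 0.84); neither figure comes from the text.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    if sd < 0:
        raise ValueError("standard deviation cannot be negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability coefficient must be between 0 and 1")
    return sd * sqrt(1.0 - reliability)


# Hypothetical test: score SD of 10 points, reliability coefficient of 0.84.
sem = standard_error_of_measurement(sd=10.0, reliability=0.84)
print(f"SEM is about {sem:.1f} score points")
```

With these inputs the SEM is about 4 points, which gives a rough sense of how far an obtained score might fall from the true score.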
Methods of Estimating Reliability
♦ Measures of stability
– Indicates whether students would achieve similar
scores if they took the same assessment at
another time—test-retest procedure
– Appropriate when the trait being measured is
expected to be stable over time
– Limited usefulness for teacher-made assessments,
but an important consideration when selecting
standardized tests
Methods of Estimating
Reliability (cont’d)
♦ Measures of equivalence
– Use of two or more forms of the same
assessment, based on the same blueprint
– Both forms administered to the same group of
students in close succession; resulting scores are
correlated
– High reliability coefficient indicates that the forms
sample the domain equally well
– Widely used in standardized testing, but not
practical for teacher-made assessments
Methods of Estimating
Reliability (cont’d)
♦ Measures of internal consistency: split-half
methods
– Used with a set of scores from only one
administration of a single assessment: Divide the
assessment into two equal subtests, score
subtests separately, correlate the two sets of
subscores
– Underestimates the true reliability of the scores
produced by the whole assessment; correct with
Spearman-Brown prophecy formula
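For the split-half case, the Spearman-Brown correction takes the correlation between the two half-tests and estimates the reliability of the full-length assessment. The symbol names and the value 0.60 below are illustrative assumptions:

```latex
r_{\text{full}} = \frac{2\, r_{\text{half}}}{1 + r_{\text{half}}}
\qquad\text{e.g.}\qquad
r_{\text{half}} = 0.60 \;\Rightarrow\; r_{\text{full}} = \frac{2(0.60)}{1 + 0.60} = 0.75
```

The corrected value is higher than the half-test correlation because each half contains only half as many tasks as the whole assessment.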
Methods of Estimating
Reliability (cont’d)
♦ Measures of internal consistency: coefficient
alpha
– Extent to which the assessment tasks measure
similar characteristics
– Kuder-Richardson formulas are a specific
type of coefficient alpha
• Require dichotomously scored assessment tasks
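As one illustration, Kuder-Richardson formula 20 (KR-20) can be sketched as follows. The response matrix is hypothetical (five students, four items, 1 = correct, 0 = incorrect), and the sketch assumes the population form of the variance, which matches the p × q item variances in the formula.

```python
def kr20(responses):
    """KR-20 for a list of student rows, each a list of 0/1 item scores."""
    n_students = len(responses)
    n_items = len(responses[0])
    if n_students < 2 or n_items < 2:
        raise ValueError("need at least 2 students and 2 items")

    # Variance of total scores (population form, dividing by n).
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students
    if var_total == 0:
        raise ValueError("KR-20 is undefined when all total scores are equal")

    # Sum over items of p * q, where p is the proportion answering correctly.
    sum_pq = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in responses) / n_students
        sum_pq += p * (1.0 - p)

    return (n_items / (n_items - 1)) * (1.0 - sum_pq / var_total)


# Hypothetical item scores: one row per student, one column per item.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(f"KR-20 = {kr20(scores):.2f}")
```

Because each item is scored 0 or 1, the function applies only to dichotomously scored tasks, which is the restriction noted above.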
Methods of Estimating
Reliability (cont’d)
♦ Measures of consistency of ratings
– Determine if same scores would have been obtained if a
different person had scored the assessment or judged the
performance
– Two equally qualified persons score each student’s paper
or rate each student’s performance; two scores are
compared
– Produces a percentage of agreement or index of scorer
consistency (correlation)
– Interrater consistency facilitated by the use of scoring
rubrics and training of raters
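The percentage of agreement is the simpler of the two indices to compute. The sketch below counts exact matches between two raters. The rubric scores are hypothetical, and treating only identical scores as agreement is an assumption (some programs also accept adjacent scores).

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of students to whom two raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty, equal-length lists of scores")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return 100.0 * matches / len(rater_a)


# Hypothetical rubric scores (1-5) from two raters for the same five papers.
rater_a = [4, 3, 5, 2, 4]
rater_b = [4, 3, 4, 2, 4]

agreement = percent_agreement(rater_a, rater_b)
print(f"raters agree on {agreement:.0f}% of papers")
```

Percentage agreement does not adjust for agreement that could occur by chance, so it is best read alongside the correlation-based index mentioned above.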
Influences on Reliability of Scores
♦ Assessment-related factors
– Length of the test
• In general, more assessment tasks (e.g., test items) → greater
score reliability
– Homogeneity of assessment tasks
• Score reliability enhanced by homogeneity of content covered by
the assessment
– Item difficulty and discrimination ability
• Moderately difficult items, good discrimination between high and
low achievers, and absence of technical errors → greater score
reliability
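The effect of test length is often illustrated with the general Spearman-Brown formula, which projects reliability when a test is lengthened by a factor k. This is a sketch that assumes the added items are comparable in quality and content to the existing ones; the symbols and the numbers are illustrative, not from the slides.

```latex
r_{k} = \frac{k\, r}{1 + (k - 1)\, r}
\qquad\text{e.g.}\qquad
r = 0.60,\; k = 3 \;\Rightarrow\; r_{k} = \frac{3(0.60)}{1 + 2(0.60)} = \frac{1.80}{2.20} \approx 0.82
```

Here r is the reliability of the current test and r_k the projected reliability of a test k times as long. The gain diminishes as k grows, so added length alone cannot compensate for poorly written items.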
Influences on
Reliability of Scores (cont’d)
♦ Student-related factors
– Heterogeneity of the student group
• In general, increased range of ability in the group of students →
greater score reliability
– Testwiseness
• Student with test-taking skills and experience may obtain a higher
score than true ability would predict
– Motivation
• Influences individual students differently
• Scores of poorly motivated students may not accurately represent
their actual achievement levels
Influences on
Reliability of Scores (cont’d)
♦ Assessment administration conditions
– Time limits
• Inadequate time can lower the reliability of scores
• Some students who know the content well may be
unable to respond to all of the items
– Cheating
• Contributes random errors to assessment scores
• Raises offenders’ observed scores above their
true scores
Practicality (Usability)
♦ A quality of the assessment instrument itself
and its administration procedures
♦ Qualities of practical assessments
– Easy to administer and score
– Do not take too much time away from other
instructional activities
– Have reasonable resource requirements
Practicality (Usability; cont’d)
♦ Practicality criteria
– Easy to construct and use
– Reasonable time requirements for administration
and scoring the assessment and interpreting
results
– Reasonable costs associated with assessment
construction, administration, and scoring
– Assessment results can be interpreted easily and
accurately by those who will use them
32

More Related Content

What's hot

Validity and Reliability of a Test
Validity and Reliability of a TestValidity and Reliability of a Test
Validity and Reliability of a TestBella Jao
 
Measurement validity and reliability
Measurement validity and reliabilityMeasurement validity and reliability
Measurement validity and reliabilityJonathan Javid
 
Reliability and validity- research-for BSC/PBBSC AND MSC NURSING
Reliability and validity- research-for BSC/PBBSC AND MSC NURSINGReliability and validity- research-for BSC/PBBSC AND MSC NURSING
Reliability and validity- research-for BSC/PBBSC AND MSC NURSINGSUCHITRARATI1976
 
Reliability & validity
Reliability & validityReliability & validity
Reliability & validityalameenpa
 
Good scale measurement
Good scale measurementGood scale measurement
Good scale measurementsai precious
 
reliablity and validity in social sciences research
reliablity and validity  in social sciences researchreliablity and validity  in social sciences research
reliablity and validity in social sciences researchSourabh Sharma
 
Validity, reliability & practicality
Validity, reliability & practicalityValidity, reliability & practicality
Validity, reliability & practicalitySamcruz5
 
Characteristics of a Good Test
Characteristics of a Good TestCharacteristics of a Good Test
Characteristics of a Good TestAjab Ali Lashari
 
Content &statistical validity
Content &statistical validityContent &statistical validity
Content &statistical validityAMU
 
Reliability and validity
Reliability and validityReliability and validity
Reliability and validityKaimrc_Rss_Jd
 

What's hot (19)

Validity and Reliability of a Test
Validity and Reliability of a TestValidity and Reliability of a Test
Validity and Reliability of a Test
 
Testing
TestingTesting
Testing
 
Validation
ValidationValidation
Validation
 
Measurement validity and reliability
Measurement validity and reliabilityMeasurement validity and reliability
Measurement validity and reliability
 
Reliability and validity- research-for BSC/PBBSC AND MSC NURSING
Reliability and validity- research-for BSC/PBBSC AND MSC NURSINGReliability and validity- research-for BSC/PBBSC AND MSC NURSING
Reliability and validity- research-for BSC/PBBSC AND MSC NURSING
 
Reliability & validity
Reliability & validityReliability & validity
Reliability & validity
 
Validity
ValidityValidity
Validity
 
Characteristics of a Good Test
Characteristics of a Good TestCharacteristics of a Good Test
Characteristics of a Good Test
 
Good scale measurement
Good scale measurementGood scale measurement
Good scale measurement
 
reliablity and validity in social sciences research
reliablity and validity  in social sciences researchreliablity and validity  in social sciences research
reliablity and validity in social sciences research
 
Validity, reliability & practicality
Validity, reliability & practicalityValidity, reliability & practicality
Validity, reliability & practicality
 
Characteristics of a Good Test
Characteristics of a Good TestCharacteristics of a Good Test
Characteristics of a Good Test
 
Item Analysis and Validation
Item Analysis and ValidationItem Analysis and Validation
Item Analysis and Validation
 
Item analysis in MCQs
Item analysis in MCQsItem analysis in MCQs
Item analysis in MCQs
 
Content &statistical validity
Content &statistical validityContent &statistical validity
Content &statistical validity
 
Reliability and validity
Reliability and validityReliability and validity
Reliability and validity
 
Validity Evidence
Validity EvidenceValidity Evidence
Validity Evidence
 
Item analysis.pptx du
Item analysis.pptx duItem analysis.pptx du
Item analysis.pptx du
 
Validity in Assessment
Validity in AssessmentValidity in Assessment
Validity in Assessment
 

Viewers also liked

Tipos de consultas
Tipos de consultasTipos de consultas
Tipos de consultasjessicajerez
 
7 myths of email marketing
7 myths of email marketing7 myths of email marketing
7 myths of email marketingupside1
 
Biografia
BiografiaBiografia
Biografiadamaraz
 
Narrativa transmedia figuras geométricas
Narrativa transmedia figuras geométricasNarrativa transmedia figuras geométricas
Narrativa transmedia figuras geométricasVanemalave
 
Moving towards 100% recovery of concrete demolition waste across Europe - Kar...
Moving towards 100% recovery of concrete demolition waste across Europe - Kar...Moving towards 100% recovery of concrete demolition waste across Europe - Kar...
Moving towards 100% recovery of concrete demolition waste across Europe - Kar...Karl Downey
 
Frecuencia cardiaca
Frecuencia cardiacaFrecuencia cardiaca
Frecuencia cardiacaJuan Gge
 
Autoexcitación de las fibras del nódulo sinusal Fisiologia
Autoexcitación de las fibras del nódulo sinusal FisiologiaAutoexcitación de las fibras del nódulo sinusal Fisiologia
Autoexcitación de las fibras del nódulo sinusal FisiologiaMaaria Esther L'Garza
 
Math3010 week 3
Math3010 week 3Math3010 week 3
Math3010 week 3stanbridge
 
ENFERMERIA- SIGNOS VITALES (VALORES NORMALES)
ENFERMERIA- SIGNOS VITALES (VALORES NORMALES)ENFERMERIA- SIGNOS VITALES (VALORES NORMALES)
ENFERMERIA- SIGNOS VITALES (VALORES NORMALES)josueadairdelacruzmorales
 

Viewers also liked (12)

Tipos de consultas
Tipos de consultasTipos de consultas
Tipos de consultas
 
7 myths of email marketing
7 myths of email marketing7 myths of email marketing
7 myths of email marketing
 
Resume
ResumeResume
Resume
 
Biografia
BiografiaBiografia
Biografia
 
Participants @ State Bank Of Mysore
Participants @ State Bank Of MysoreParticipants @ State Bank Of Mysore
Participants @ State Bank Of Mysore
 
Narrativa transmedia figuras geométricas
Narrativa transmedia figuras geométricasNarrativa transmedia figuras geométricas
Narrativa transmedia figuras geométricas
 
Moving towards 100% recovery of concrete demolition waste across Europe - Kar...
Moving towards 100% recovery of concrete demolition waste across Europe - Kar...Moving towards 100% recovery of concrete demolition waste across Europe - Kar...
Moving towards 100% recovery of concrete demolition waste across Europe - Kar...
 
Frecuencia cardiaca
Frecuencia cardiacaFrecuencia cardiaca
Frecuencia cardiaca
 
Autoexcitación de las fibras del nódulo sinusal Fisiologia
Autoexcitación de las fibras del nódulo sinusal FisiologiaAutoexcitación de las fibras del nódulo sinusal Fisiologia
Autoexcitación de las fibras del nódulo sinusal Fisiologia
 
Chapter 9 Alcohol
Chapter 9   AlcoholChapter 9   Alcohol
Chapter 9 Alcohol
 
Math3010 week 3
Math3010 week 3Math3010 week 3
Math3010 week 3
 
ENFERMERIA- SIGNOS VITALES (VALORES NORMALES)
ENFERMERIA- SIGNOS VITALES (VALORES NORMALES)ENFERMERIA- SIGNOS VITALES (VALORES NORMALES)
ENFERMERIA- SIGNOS VITALES (VALORES NORMALES)
 

Similar to Chapter 2 ppt eval & testing 4e formatted 01.10 kg edits

Chapter 14 ppt eval & testing 4e formatted 01.10 mo checked
Chapter 14 ppt eval & testing 4e formatted 01.10 mo checkedChapter 14 ppt eval & testing 4e formatted 01.10 mo checked
Chapter 14 ppt eval & testing 4e formatted 01.10 mo checkedstanbridge
 
Chapter 1 ppt eval & testing 4e formatted 01.10 mo edits query comment re...
Chapter 1 ppt eval & testing 4e formatted 01.10 mo edits query comment re...Chapter 1 ppt eval & testing 4e formatted 01.10 mo edits query comment re...
Chapter 1 ppt eval & testing 4e formatted 01.10 mo edits query comment re...stanbridge
 
Chapter 1 ppt eval & testing 4e formatted 01.10 mo edits query comment re...
Chapter 1 ppt eval & testing 4e formatted 01.10 mo edits query comment re...Chapter 1 ppt eval & testing 4e formatted 01.10 mo edits query comment re...
Chapter 1 ppt eval & testing 4e formatted 01.10 mo edits query comment re...stanbridge
 
Chapter 18 ppt eval & testing 4e formatted 01.10 kg edits
Chapter 18 ppt eval & testing 4e formatted 01.10 kg editsChapter 18 ppt eval & testing 4e formatted 01.10 kg edits
Chapter 18 ppt eval & testing 4e formatted 01.10 kg editsstanbridge
 
Chapter 16 ppt eval & testing 4e formatted 01.10 kg edits
Chapter 16 ppt eval & testing 4e formatted 01.10 kg editsChapter 16 ppt eval & testing 4e formatted 01.10 kg edits
Chapter 16 ppt eval & testing 4e formatted 01.10 kg editsstanbridge
 
Arte387Ch8
Arte387Ch8Arte387Ch8
Arte387Ch8SCWARTED
 
Good test , Reliability and Validity of a good test
Good test , Reliability and Validity of a good testGood test , Reliability and Validity of a good test
Good test , Reliability and Validity of a good testTiru Goel
 
STANDARDIZED TEST AND NON STANDARDED TEST.pdf
STANDARDIZED TEST AND NON STANDARDED TEST.pdfSTANDARDIZED TEST AND NON STANDARDED TEST.pdf
STANDARDIZED TEST AND NON STANDARDED TEST.pdfOM VERMA
 
Standardized test and non standarded test
Standardized test and non standarded testStandardized test and non standarded test
Standardized test and non standarded testOM VERMA
 
Gradding and reporting
Gradding and reporting Gradding and reporting
Gradding and reporting HennaAnsari
 
Test standardization
Test standardizationTest standardization
Test standardizationKaye Batica
 
2.3. Types of Validity in assessment and education
2.3. Types of Validity in assessment and education2.3. Types of Validity in assessment and education
2.3. Types of Validity in assessment and educationBharat98560
 
Standardized and non-standardized test -Tests of Achievement
Standardized and non-standardized test -Tests of AchievementStandardized and non-standardized test -Tests of Achievement
Standardized and non-standardized test -Tests of AchievementVinodhini kirthivasan
 
Measurement & Evaluation pptx
Measurement & Evaluation pptxMeasurement & Evaluation pptx
Measurement & Evaluation pptxAliimtiaz35
 
Strategies in teaching
Strategies in teachingStrategies in teaching
Strategies in teachingShian Morallos
 
Reliability and validity
Reliability and validityReliability and validity
Reliability and validityMuhammad Ali
 
NED 203 Criterion Referenced Test & Rubrics
NED 203 Criterion Referenced Test & RubricsNED 203 Criterion Referenced Test & Rubrics
NED 203 Criterion Referenced Test & RubricsCarmina Gurrea
 

Similar to Chapter 2 ppt eval & testing 4e formatted 01.10 kg edits (20)

Chapter 14 ppt eval & testing 4e formatted 01.10 mo checked
Chapter 14 ppt eval & testing 4e formatted 01.10 mo checkedChapter 14 ppt eval & testing 4e formatted 01.10 mo checked
Chapter 14 ppt eval & testing 4e formatted 01.10 mo checked
 
Chapter 1 ppt eval & testing 4e formatted 01.10 mo edits query comment re...
Chapter 1 ppt eval & testing 4e formatted 01.10 mo edits query comment re...Chapter 1 ppt eval & testing 4e formatted 01.10 mo edits query comment re...
Chapter 1 ppt eval & testing 4e formatted 01.10 mo edits query comment re...
 
Chapter 1 ppt eval & testing 4e formatted 01.10 mo edits query comment re...
Chapter 1 ppt eval & testing 4e formatted 01.10 mo edits query comment re...Chapter 1 ppt eval & testing 4e formatted 01.10 mo edits query comment re...
Chapter 1 ppt eval & testing 4e formatted 01.10 mo edits query comment re...
 
Chapter 18 ppt eval & testing 4e formatted 01.10 kg edits
Chapter 18 ppt eval & testing 4e formatted 01.10 kg editsChapter 18 ppt eval & testing 4e formatted 01.10 kg edits
Chapter 18 ppt eval & testing 4e formatted 01.10 kg edits
 
Chapter 16 ppt eval & testing 4e formatted 01.10 kg edits
Chapter 16 ppt eval & testing 4e formatted 01.10 kg editsChapter 16 ppt eval & testing 4e formatted 01.10 kg edits
Chapter 16 ppt eval & testing 4e formatted 01.10 kg edits
 
Arte387Ch8
Arte387Ch8Arte387Ch8
Arte387Ch8
 
Good test , Reliability and Validity of a good test
Good test , Reliability and Validity of a good testGood test , Reliability and Validity of a good test
Good test , Reliability and Validity of a good test
 
STANDARDIZED TEST AND NON STANDARDED TEST.pdf
STANDARDIZED TEST AND NON STANDARDED TEST.pdfSTANDARDIZED TEST AND NON STANDARDED TEST.pdf
STANDARDIZED TEST AND NON STANDARDED TEST.pdf
 
Standardized test and non standarded test
Standardized test and non standarded testStandardized test and non standarded test
Standardized test and non standarded test
 
Gradding and reporting
Gradding and reporting Gradding and reporting
Gradding and reporting
 
Test standardization
Test standardizationTest standardization
Test standardization
 
2.3. Types of Validity in assessment and education
2.3. Types of Validity in assessment and education2.3. Types of Validity in assessment and education
2.3. Types of Validity in assessment and education
 
Standardized and non-standardized test -Tests of Achievement
Standardized and non-standardized test -Tests of AchievementStandardized and non-standardized test -Tests of Achievement
Standardized and non-standardized test -Tests of Achievement
 
Measurement & Evaluation pptx
Measurement & Evaluation pptxMeasurement & Evaluation pptx
Measurement & Evaluation pptx
 
HR 202 Chapter 11
HR 202 Chapter 11HR 202 Chapter 11
HR 202 Chapter 11
 
Assessment- Fundamentals of Instruction
Assessment- Fundamentals of InstructionAssessment- Fundamentals of Instruction
Assessment- Fundamentals of Instruction
 
Strategies in teaching
Strategies in teachingStrategies in teaching
Strategies in teaching
 
Reliability and validity
Reliability and validityReliability and validity
Reliability and validity
 
NED 203 Criterion Referenced Test & Rubrics
NED 203 Criterion Referenced Test & RubricsNED 203 Criterion Referenced Test & Rubrics
NED 203 Criterion Referenced Test & Rubrics
 
Campare 3737
Campare 3737Campare 3737
Campare 3737
 

More from stanbridge

Micro Lab 3 Lecture
Micro Lab 3 LectureMicro Lab 3 Lecture
Micro Lab 3 Lecturestanbridge
 
Creating a poster v2
Creating a poster v2Creating a poster v2
Creating a poster v2stanbridge
 
Creating a poster
Creating a posterCreating a poster
Creating a posterstanbridge
 
OT 5018 Thesis Dissemination
OT 5018 Thesis DisseminationOT 5018 Thesis Dissemination
OT 5018 Thesis Disseminationstanbridge
 
Ot5101 005 week 5
Ot5101 005 week 5Ot5101 005 week 5
Ot5101 005 week 5stanbridge
 
Ot5101 005 week4
Ot5101 005 week4Ot5101 005 week4
Ot5101 005 week4stanbridge
 
Compliance, motivation, and health behaviors
Compliance, motivation, and health behaviors Compliance, motivation, and health behaviors
Compliance, motivation, and health behaviors stanbridge
 
Ch 5 developmental stages of the learner
Ch 5   developmental stages of the learnerCh 5   developmental stages of the learner
Ch 5 developmental stages of the learnerstanbridge
 
OT 5101 week2 theory policy
OT 5101 week2 theory policyOT 5101 week2 theory policy
OT 5101 week2 theory policystanbridge
 
OT 5101 week3 planning needs assessment
OT 5101 week3 planning needs assessmentOT 5101 week3 planning needs assessment
OT 5101 week3 planning needs assessmentstanbridge
 
NUR 304 Chapter005
NUR 304 Chapter005NUR 304 Chapter005
NUR 304 Chapter005stanbridge
 
NUR 3043 Chapter007
NUR 3043 Chapter007NUR 3043 Chapter007
NUR 3043 Chapter007stanbridge
 
NUR 3043 Chapter006
NUR 3043 Chapter006NUR 3043 Chapter006
NUR 3043 Chapter006stanbridge
 
NUR 3043 Chapter004
NUR 3043 Chapter004NUR 3043 Chapter004
NUR 3043 Chapter004stanbridge
 
3043 Chapter009
3043 Chapter0093043 Chapter009
3043 Chapter009stanbridge
 
3043 Chapter008
 3043 Chapter008 3043 Chapter008
3043 Chapter008stanbridge
 
Melnyk ppt chapter_21
Melnyk ppt chapter_21Melnyk ppt chapter_21
Melnyk ppt chapter_21stanbridge
 
Melnyk ppt chapter_22
Melnyk ppt chapter_22Melnyk ppt chapter_22
Melnyk ppt chapter_22stanbridge
 

More from stanbridge (20)

Micro Lab 3 Lecture
Micro Lab 3 LectureMicro Lab 3 Lecture
Micro Lab 3 Lecture
 
Creating a poster v2
Creating a poster v2Creating a poster v2
Creating a poster v2
 
Creating a poster
Creating a posterCreating a poster
Creating a poster
 
Sample poster
Sample posterSample poster
Sample poster
 
OT 5018 Thesis Dissemination
OT 5018 Thesis DisseminationOT 5018 Thesis Dissemination
OT 5018 Thesis Dissemination
 
Ot5101 005 week 5
Ot5101 005 week 5Ot5101 005 week 5
Ot5101 005 week 5
 
Ot5101 005 week4
Ot5101 005 week4Ot5101 005 week4
Ot5101 005 week4
 
Compliance, motivation, and health behaviors
Compliance, motivation, and health behaviors Compliance, motivation, and health behaviors
Compliance, motivation, and health behaviors
 
Ch 5 developmental stages of the learner
Ch 5   developmental stages of the learnerCh 5   developmental stages of the learner
Ch 5 developmental stages of the learner
 
OT 5101 week2 theory policy
OT 5101 week2 theory policyOT 5101 week2 theory policy
OT 5101 week2 theory policy
 
OT 5101 week3 planning needs assessment
OT 5101 week3 planning needs assessmentOT 5101 week3 planning needs assessment
OT 5101 week3 planning needs assessment
 
Ot5101 week1
Ot5101 week1Ot5101 week1
Ot5101 week1
 
NUR 304 Chapter005
NUR 304 Chapter005NUR 304 Chapter005
NUR 304 Chapter005
 
NUR 3043 Chapter007
NUR 3043 Chapter007NUR 3043 Chapter007
NUR 3043 Chapter007
 
NUR 3043 Chapter006
NUR 3043 Chapter006NUR 3043 Chapter006
NUR 3043 Chapter006
 
NUR 3043 Chapter004
NUR 3043 Chapter004NUR 3043 Chapter004
NUR 3043 Chapter004
 
3043 Chapter009
3043 Chapter0093043 Chapter009
3043 Chapter009
 
3043 Chapter008
 3043 Chapter008 3043 Chapter008
3043 Chapter008
 
Melnyk ppt chapter_21
Melnyk ppt chapter_21Melnyk ppt chapter_21
Melnyk ppt chapter_21
 
Melnyk ppt chapter_22
Melnyk ppt chapter_22Melnyk ppt chapter_22
Melnyk ppt chapter_22
 

Chapter 2 ppt eval & testing 4e formatted 01.10 kg edits

  • 1. © 2013 Springer Publishing Company, LLC. Chapter 2 Qualities of Effective Assessment Procedures &Oermann Gaberson Evaluation and Testing in Nursing Education 4th edition
  • 2. © 2013 Springer Publishing Company, LLC. General Criteria for Effective Assessment Procedures ♦ Produce results that can be used to make appropriate inferences about learners’ knowledge and abilities – Important educational decisions based on such inferences ♦ Practical and easy to use 2
  • 3. © 2013 Springer Publishing Company, LLC. Guiding Questions ♦ To what extent will the interpretation of the scores be appropriate, meaningful, and useful for their intended application? ♦ What are the consequences of how the results are used and interpreted? 3
  • 4. © 2013 Springer Publishing Company, LLC. Assessment Validity ♦ Concept has changed over time ♦ Current philosophy – Meaningfulness of the interpretations that teachers make of assessment results – Adequacy and appropriateness of inferences about scores and how results are used – Emphasis on consequences (intended and unintended) of test use 4
  • 5. © 2013 Springer Publishing Company, LLC. Assessment Validity (cont’d) ♦ Not a static property of the test itself ♦ Not an either/or judgment – Degrees of validity depending on purpose of test and how scores will be used 5
  • 6. © 2013 Springer Publishing Company, LLC. Assessment Validity (cont’d) ♦ Unitary concept – Variety of sources of evidence to support the validity of the interpretation and use of assessment results – Four major considerations for validation • Content • Construct • Assessment-criterion relationships • Consequences 6
  • 7. © 2013 Springer Publishing Company, LLC. Content Considerations ♦ Goal of content validation – Determine the degree to which the assessment tasks accurately represent the domain of content or abilities about which the teacher wants to interpret assessment results – A test is only a sample of the universe of possible assessment tasks – “Face validity” is insufficient evidence of content representativeness 7
  • 8. © 2013 Springer Publishing Company, LLC. Content Considerations ♦ Start by defining the universe of content – Should be related to the purpose for which the test will be used ♦ Write or select test items that satisfactorily represent the desired content domain – Test blueprint or table of specifications documents – Also important when selecting a published test 8
  • 9. © 2013 Springer Publishing Company, LLC. Content Considerations (cont’d) ♦ Assessed by content-domain experts – Determine if assessment tasks represent the • content domain (as specified on test blueprint) • learning outcomes – Trustworthiness of this evidence is based on estimation of rater reliability • How closely do the judgments of multiple experts agree? 9
  • 10. © 2013 Springer Publishing Company, LLC. Construct Considerations ♦ “Umbrella” concept for all types of assessment validation ♦ Goes beyond content considerations – Used to make inferences from assessment results to more general abilities (e.g., clinical reasoning) – What construct is the assessment intended to measure? 10
  • 11. Construct Considerations (cont’d) ♦ Construct – Characteristic assumed to exist because it explains some observed behavior – Cannot be observed directly; inferred from performance ♦ Construct validation – Determining the extent to which assessment results can be interpreted in terms of the construct ♦ Two central elements – Construct representation – Construct relevance
  • 12. Construct Considerations (cont’d) ♦ Construct representation – Extent to which important elements of the construct are represented in the assessment ♦ Construct relevance – Extent to which the assessment focuses only on relevant elements of the construct – Omits factors that are unrelated or irrelevant to the construct (e.g., writing ability, English language literacy)
  • 13. Construct Considerations (cont’d) ♦ Methods used in construct validation – Define the domain to be measured – Analyze the process of responding to tasks required by the assessment – Compare assessment results of known groups – Compare assessment results before and after a learning activity – Correlate assessment results with other measures
  • 14. Assessment-Criterion Relationship Considerations ♦ Predictive validation – Focuses on predicting future performance (the criterion) based on current assessment results ♦ Concurrent validation – Uses assessment results to estimate performance on another assessment (the criterion measure) at the same time – Not widely used for teacher-made assessments
  • 15. Assessment-Criterion Relationship Considerations (cont’d) ♦ Relationship between assessment scores and criterion-measure scores usually expressed as a correlation coefficient ♦ Teacher who uses the test must judge what magnitude of correlation is adequate for the intended use of the assessment
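The correlation coefficient mentioned on this slide can be illustrated with a minimal sketch. It assumes the Pearson product-moment coefficient, which the slide does not name, and the score lists and function name are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one list has no variance")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical data: scores on a course exam and on a later criterion measure
exam_scores = [72, 85, 90, 65, 78, 88]
criterion_scores = [70, 82, 94, 60, 75, 91]

validity_coefficient = pearson_r(exam_scores, criterion_scores)
print(f"r = {validity_coefficient:.2f}")
```

Whether a given value is adequate still depends on the intended use of the assessment, as the slide notes; the code only produces the number to be judged.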
  • 16. Consideration of Consequences ♦ Assessment has intended and unintended consequences ♦ Concept of validity includes consideration of the consequences of assessment use and how results are interpreted by students, teachers, and other stakeholders
  • 17. Influences on Validity ♦ Characteristics of the assessment – Examples: clarity of directions, number of items, test construction errors ♦ Assessment administration and scoring factors – Examples: cheating, scoring errors, time limits ♦ Student characteristics – Examples: test anxiety, motivation
  • 18. Reliability ♦ Consistency of test scores ♦ Extent to which test scores are accurate, error-free, and stable ♦ Reproducibility and generalizability of test scores ♦ Necessary but insufficient condition for validity
  • 19. Reliability (cont’d) ♦ Sources of inconsistency – Instability of the behavior being measured – Sample of tasks varies from one assessment to another – Assessment conditions vary significantly – Scoring procedures are inconsistent ♦ These and other factors introduce an unknown amount of error into every measurement
  • 20. Reliability (cont’d) ♦ Obtained score – The number of correct answers ♦ True score – Hypothetical – Cannot be measured directly – Represents what the student actually knows ♦ Error score – Difference between true score and obtained score – Cannot be measured directly – Affects measurement reliability
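The relationship among the three scores on this slide can be written in classical test theory notation. The symbols X (obtained score), T (true score), and E (error score) are assumptions, since the slide names the scores but assigns no symbols. The second line holds only under the usual classical assumption that error scores are uncorrelated with true scores.

```latex
X = T + E, \qquad E = X - T

\sigma_X^2 = \sigma_T^2 + \sigma_E^2, \qquad
\text{reliability} = \frac{\sigma_T^2}{\sigma_X^2} = 1 - \frac{\sigma_E^2}{\sigma_X^2}
```

Read this way, a smaller error variance relative to the obtained-score variance corresponds to higher reliability.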
  • 21. Reliability (cont’d) ♦ Methods of determining assessment reliability estimate how much measurement error is present ♦ When assessment results are reasonably consistent, measurement error ↓ and reliability ↑
  • 22. Reliability (cont’d) ♦ Reliability pertains to assessment results, not to the assessment instrument ♦ A reliability estimate always refers to a particular type of consistency ♦ A reliability estimate is always represented by a statistical value (reliability coefficient or standard error of measurement)
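The two statistical values named on this slide are related. A minimal sketch, assuming the common formula SEM = SD × sqrt(1 − reliability), which the slide does not state, with hypothetical scores and a hypothetical reliability coefficient:

```python
from math import sqrt
from statistics import pstdev


def standard_error_of_measurement(scores, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient).

    Uses the population standard deviation of the obtained scores;
    using the sample standard deviation instead is an equally common choice.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability coefficient must be between 0 and 1")
    return pstdev(scores) * sqrt(1.0 - reliability)


# Hypothetical obtained scores and a hypothetical reliability estimate
obtained_scores = [60, 70, 80, 90, 100]
reliability_estimate = 0.84

sem = standard_error_of_measurement(obtained_scores, reliability_estimate)
print(f"SEM = {sem:.2f} score points")
```

The SEM is expressed in score points, so it can be read as a rough band of uncertainty around an individual student's obtained score.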
  • 23. Methods of Estimating Reliability ♦ Measures of stability – Indicates whether students would achieve similar scores if they took the same assessment at another time (test-retest procedure) – Appropriate when the trait being measured is expected to be stable over time – Limited usefulness for teacher-made assessments, but an important consideration when selecting standardized tests
  • 24. Methods of Estimating Reliability (cont’d) ♦ Measures of equivalence – Use of two or more forms of the same assessment, based on the same blueprint – Both forms administered to the same group of students in close succession; resulting scores are correlated – High reliability coefficient indicates that the forms sample the domain equally well – Widely used in standardized testing, but not practical for teacher-made assessments
  • 25. Methods of Estimating Reliability (cont’d) ♦ Measures of internal consistency: split-half methods – Used with a set of scores from only one administration of a single assessment: Divide the assessment into two equal subtests, score subtests separately, correlate the two sets of subscores – Underestimates the true reliability of the scores produced by the whole assessment; correct with Spearman-Brown prophecy formula
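The split-half procedure on this slide can be sketched as follows. The odd/even split is only one possible way to divide the assessment, and the item data and function names are hypothetical. The correction step uses the two-half form of the Spearman-Brown formula, r_full = 2r / (1 + r).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a half-test has no variance")
    return cross / sqrt(ss_x * ss_y)


def split_half_reliability(item_scores):
    """Return (half-test correlation, Spearman-Brown corrected estimate).

    item_scores holds one row per student and one column per item.
    """
    # Score the two subtests separately: items 1, 3, 5, ... and items 2, 4, 6, ...
    first_half = [sum(row[0::2]) for row in item_scores]
    second_half = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(first_half, second_half)
    # Step the half-test correlation up to the full test length
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full


# Hypothetical responses: 6 students x 6 items, 1 = correct, 0 = incorrect
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

r_half, r_full = split_half_reliability(responses)
print(f"half-test r = {r_half:.2f}, corrected = {r_full:.2f}")
```

For any positive half-test correlation below 1, the corrected value is larger, which matches the slide's point that the uncorrected correlation underestimates the reliability of the whole assessment.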
  • 26. Methods of Estimating Reliability (cont’d) ♦ Measures of internal consistency: coefficient alpha – Extent to which the assessment tasks measure similar characteristics – Kuder-Richardson formulas are a specific type of coefficient alpha • Require dichotomously scored assessment tasks
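A minimal sketch of coefficient alpha and Kuder-Richardson formula 20 (KR-20), assuming population variances and a hypothetical matrix of right/wrong item scores. The function names are illustrative. With dichotomous items the two functions give the same value, which illustrates the slide's statement that the Kuder-Richardson formulas are a special case of coefficient alpha.

```python
from statistics import pvariance


def coefficient_alpha(item_scores):
    """Coefficient alpha from a students-by-items score matrix."""
    k = len(item_scores[0])  # number of items
    item_variances = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def kr20(item_scores):
    """KR-20 for dichotomously scored (0/1) items."""
    n = len(item_scores)     # number of students
    k = len(item_scores[0])  # number of items
    # p = proportion answering each item correctly; q = 1 - p
    p_values = [sum(row[i] for row in item_scores) / n for i in range(k)]
    sum_pq = sum(p * (1 - p) for p in p_values)
    total_variance = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum_pq / total_variance)


# Hypothetical responses: 6 students x 6 items, 1 = correct, 0 = incorrect
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

alpha = coefficient_alpha(responses)
kr20_value = kr20(responses)
print(f"alpha = {alpha:.3f}, KR-20 = {kr20_value:.3f}")
```

Both functions are undefined when every student earns the same total score, because the total-score variance is then zero.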
  • 27. Methods of Estimating Reliability (cont’d) ♦ Measures of consistency of ratings – Determine if same scores would have been obtained if a different person had scored the assessment or judged the performance – Two equally qualified persons score each student’s paper or rate each student’s performance; two scores are compared – Produces a percentage of agreement or index of scorer consistency (correlation) – Interrater consistency facilitated by the use of scoring rubrics and training of raters
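The percentage of agreement mentioned on this slide can be sketched as below, assuming exact agreement (both raters assign the identical score) as the criterion. The rubric scores and names are hypothetical.

```python
def percent_agreement(scores_a, scores_b):
    """Percentage of students to whom two raters gave the identical score."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty, equal-length lists of ratings")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return 100.0 * matches / len(scores_a)


# Hypothetical rubric scores (1 to 5) from two raters for the same 8 papers
rater_a = [4, 3, 5, 2, 4, 3, 5, 1]
rater_b = [4, 3, 4, 2, 4, 2, 5, 1]

agreement = percent_agreement(rater_a, rater_b)
print(f"exact agreement = {agreement:.1f}%")
```

Exact agreement is a strict criterion; counting scores within one rubric point as agreeing is a looser variant that some raters prefer, and it would change the result.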
  • 28. Influences on Reliability of Scores ♦ Assessment-related factors – Length of the test • In general, more assessment tasks (e.g., test items) → greater score reliability – Homogeneity of assessment tasks • Score reliability enhanced by homogeneity of content covered by the assessment – Item difficulty and discrimination ability • Moderately difficult items, good discrimination between high and low achievers, and absence of technical errors → greater score reliability
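The effect of test length can be illustrated with the general Spearman-Brown formula, r_new = k·r / (1 + (k − 1)·r), where k is the factor by which the test is lengthened. This is a sketch under the assumption that the added items are comparable in content and quality to the existing ones; the slide does not present this formula, and the starting reliability below is hypothetical.

```python
def predicted_reliability(current_reliability, length_factor):
    """Spearman-Brown prediction for a test lengthened by length_factor.

    length_factor = 2 means twice as many comparable items;
    a value below 1 models shortening the test.
    """
    if length_factor <= 0:
        raise ValueError("length factor must be positive")
    r = current_reliability
    return length_factor * r / (1 + (length_factor - 1) * r)


# Hypothetical starting point: a 25-item test with score reliability 0.60
for factor in (0.5, 1, 2, 3):
    estimate = predicted_reliability(0.60, factor)
    print(f"{int(25 * factor)} items -> predicted reliability {estimate:.2f}")
```

The predicted gains shrink as the test grows, so lengthening a test has to be weighed against the time it takes away from other activities.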
  • 29. Influences on Reliability of Scores (cont’d) ♦ Student-related factors – Heterogeneity of the student group • In general, increased range of ability in the group of students → greater score reliability – Testwiseness • Student with test-taking skills and experience may obtain a higher score than true ability would predict – Motivation • Influences individual students differently • Scores of poorly motivated students may not accurately represent their actual achievement levels
  • 30. Influences on Reliability of Scores (cont’d) ♦ Assessment administration conditions – Time limits • Inadequate time can lower the reliability of scores • Some students who know the content well may be unable to respond to all of the items – Cheating • Contributes random errors to assessment scores • Raises offenders’ observed scores above their true scores
  • 31. Practicality (Usability) ♦ A quality of the assessment instrument itself and its administration procedures ♦ Qualities of practical assessments – Easy to administer and score – Do not take too much time away from other instructional activities – Have reasonable resource requirements
  • 32. Practicality (Usability; cont’d) ♦ Practicality criteria – Easy to construct and use – Reasonable time requirements for administering and scoring the assessment and interpreting results – Reasonable costs associated with assessment construction, administration, and scoring – Assessment results can be interpreted easily and accurately by those who will use them