VALIDITY AND VALIDATION:
THEORIES AND PROCEDURES
25/12/2015
VALIDATION TASK
To establish whether the interpretation and use
of the VSTEP test scores were valid for measuring the
English language competence of test-takers
from level 3 to level 5 on the Vietnamese English
language competence scale
VALIDITY & VALIDATION
Validity is an integrated evaluative judgment of the degree to
which empirical evidence and theoretical rationales support the
adequacy and appropriateness of inferences and actions based
on test scores or other modes of assessment.
(Messick, 1989)
Validation is to marshal evidence and arguments in support of,
or counter to, proposed interpretations and uses of test scores.
(Messick, 1989)
VALIDITY THEORIES
• 1985: The 1985 Testing Standards
  - Unified concept of validity
  - Construct-related evidence
  - Content-related evidence
  - Criterion-related evidence
• 1989: Messick’s validity chapter
  - Unified concept of validity
  - Evidential basis (construct, relevance, utility)
  - Consequential basis (values, social consequences)
MESSICK (1989)’S ASPECTS OF VALIDITY
Content
Structural
Consequential
External
Generalizability
Substantive
MESSICK (1989)’S ASPECTS OF VALIDITY
• The content aspect
  - Content relevance
  - Representativeness
  - Technical quality
• The substantive aspect
  Theoretical rationales for observed consistencies in responses
  - Process of performance
  - Empirical evidence of process
MESSICK (1989)’S ASPECTS OF VALIDITY
• The structural aspect
  The fidelity of the scoring structure to the construct structure
• The generalizability aspect
  The extent to which score properties and interpretations generalize to and across groups, settings and tasks
  - Reliability
  - Content representativeness
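Reliability evidence for the generalizability aspect is often summarised with an internal-consistency coefficient. The sketch below assumes Cronbach's alpha is the chosen coefficient (the slides do not name one) and uses a small made-up response matrix; the function name and data are illustrative only.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix.

    scores: list of rows, one row per test-taker, one column per item.
    For 0/1 items scored with population variances this equals KR-20.
    """
    n_items = len(scores[0])
    # Variance of each item (column) across test-takers.
    item_vars = [pvariance([row[i] for row in scores]) for i in range(n_items)]
    # Variance of the total scores across test-takers.
    total_var = pvariance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 5 test-takers, 4 dichotomous items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))  # 0.8 for this made-up matrix
```

A real analysis would use the full response data and would normally report the coefficient per sub-test rather than for a toy matrix.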
MESSICK (1989)’S ASPECTS OF VALIDITY
• The external aspect
  - Convergent and discriminant evidence
  - Criterion relevance
  - Applied utility
• The consequential aspect
  Value implications as a basis for action/consequences
  - Bias
  - Fairness
MESSICK (1989)’S VALIDITY FRAMEWORK
• Value
  - The most influential framework of validity
• Criticisms
  - Abstract
  - Difficult for a single researcher to carry out
  - No specific guidance for specific validation contexts
VALIDITY THEORIES
• Kane’s (1992) article and (2006) validation chapter
  Argument-based approach to validation
  - Interpretive argument
    The network of inferences and assumptions
  - Validity argument
    Logical evidence
    Empirical evidence
• Two stages: the development stage and the appraisal stage
KANE (1992)’S VALIDITY FRAMEWORK
• Values
  - The most practical, objective framework of validity
  - Unique interpretive argument, consistent validity argument steps (Bachman, 2004)
• Criticisms
  - No attention to the structural aspect (Messick, 1995)
  - Inadequate attention/method to policy context and consequences of tests (McNamara & Roever, 2006)
LANGUAGE TEST VALIDATION
• Bachman’s (1990) framework, after Messick (1989)
• Bachman’s (2004) framework, after Kane (1992)
CHOICE OF VALIDITY FRAMEWORK
• Messick’s (1989) framework
• Six aspects
  - Content
  - Structural
  - Consequential
  - External
  - Generalizability
  - Substantive
VALIDATION QUESTIONS
1. To what extent was the test content relevant to and representative of the domain of English language ability?
2. To what extent was each sub-test successful in measuring students’ English language ability?
3. How well did the test-takers’ test scores on the VSTEP correlate with their test scores on the IELTS?
4. What were the consequences of the UEE English test scores' interpretation and use?
01 CONTENT
• Content relevance
• Technical quality
• Content representativeness
RELEVANCE
• Topical content
• Typical behavior
• Underlying process
• Test specifications
TECHNICAL QUALITY
Empirical Evidence
• difficulty level
• discriminating power
Expert Judgment
• readability level
• freedom from ambiguity/irrelevancy
• appropriateness of keyed answers & distractors
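The two empirical indicators above can be computed with classical item analysis. The following is a minimal sketch, assuming dichotomous (0/1) scoring, difficulty taken as the proportion correct, and discriminating power taken as the upper-lower group index; the function names, the group fraction and the data are hypothetical.

```python
def item_difficulty(responses, item):
    """Proportion of test-takers who answered the item correctly."""
    return sum(row[item] for row in responses) / len(responses)


def discrimination_index(responses, item, fraction=0.27):
    """Upper-lower index: p(correct | top group) - p(correct | bottom group).

    Groups are formed from the top and bottom `fraction` of test-takers
    ranked by total score; 0.27 is a common convention, not a requirement.
    """
    ranked = sorted(responses, key=sum, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:k], ranked[-k:]
    p_upper = sum(row[item] for row in upper) / k
    p_lower = sum(row[item] for row in lower) / k
    return p_upper - p_lower


# Hypothetical data: 6 test-takers, 4 items scored 0/1.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
for item in range(4):
    p = item_difficulty(responses, item)
    d = discrimination_index(responses, item, fraction=1 / 3)
    print(f"item {item}: difficulty={p:.2f}, discrimination={d:.2f}")
```

Items with a discrimination index near zero or below would be candidates for the expert review listed above.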
REPRESENTATIVENESS
“The breadth of the content specifications for a test should
reflect the breadth of the construct invoked in score
interpretation” (Messick, 1989, p. 35).
All essential components of the construct domain are
covered (Messick, 1994, p. 12).
CONTENT ANALYSIS BY EXPERTS
• What knowledge and skills are needed to do each
item correctly?
• How relevant are the items to their assigned
objectives and domain?
Domain
• English secondary school curricula
• English program at the college
RASCH ANALYSIS
Item fit statistics
Smith (2004) suggested using item fit statistics to evaluate the
extent to which items tap into the same construct and place
test-takers in the same order (p. 106):
- Is the use of each item consistent with the way people have
responded to the other items?
- Does the item rank-order individuals in a manner similar to
the other items?
Smith (2004) argued that items measuring the same construct
should rank test-takers consistently. Items that misfit the Rasch
model, i.e. items that measure a different construct, should be
subject to revision or elimination (p. 107).
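To make the idea of item fit concrete, the sketch below computes the two mean-square fit statistics commonly reported in Rasch analysis (infit and outfit) for a single item. It assumes the dichotomous Rasch model and assumes person abilities and the item difficulty have already been estimated by Rasch software; all names and numbers are illustrative and not taken from the study.

```python
import math


def rasch_probability(theta, b):
    """Dichotomous Rasch model: P(correct) for ability theta, difficulty b (logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_fit(responses, abilities, difficulty):
    """Return (infit, outfit) mean squares for one item.

    responses: 0/1 answers to the item, one per test-taker.
    abilities: ability estimates (logits) for the same test-takers.
    Values near 1 indicate responses consistent with the model.
    """
    sq_residuals, variances = [], []
    for x, theta in zip(responses, abilities):
        p = rasch_probability(theta, difficulty)
        variances.append(p * (1 - p))        # model variance of the response
        sq_residuals.append((x - p) ** 2)    # squared residual
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / w for r, w in zip(sq_residuals, variances)) / len(responses)
    # Infit: information-weighted mean square.
    infit = sum(sq_residuals) / sum(variances)
    return infit, outfit


# Hypothetical abilities; the item has difficulty 0 logits.
abilities = [-2.0, -1.0, 1.0, 2.0]
consistent = item_fit([0, 0, 1, 1], abilities, 0.0)  # abler test-takers succeed
reversed_ = item_fit([1, 1, 0, 0], abilities, 0.0)   # weaker test-takers succeed
print(consistent, reversed_)
```

In this toy case the item that ranks test-takers in the expected order has mean squares below 1, while the item answered correctly only by the weaker test-takers has mean squares well above 1, which is the kind of misfit that would prompt revision or elimination.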
02 SUBSTANTIVE & STRUCTURAL
To what extent were the VSTEP sub-tests successful in
measuring students’ English language competence?
ITEM RESPONSE THEORY (RASCH MODEL)
• item fit
• item discrimination
• item cluster
DESCRIPTIVE STATISTICS
• choice response analysis
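A choice response (distractor) analysis can be sketched as a simple tabulation of which options higher- and lower-scoring test-takers selected. The example assumes a four-option multiple-choice item and an upper/lower split at the median of total scores; the function name, option labels and data are hypothetical.

```python
from collections import Counter


def option_counts(choices, totals, options="ABCD"):
    """Count how often each option was chosen by the upper and lower half.

    choices: the option each test-taker selected for one item.
    totals:  each test-taker's total test score, in the same order.
    Returns {option: (upper_count, lower_count)}.
    """
    ranked = sorted(zip(totals, choices), key=lambda pair: pair[0], reverse=True)
    half = len(ranked) // 2
    upper = Counter(choice for _, choice in ranked[:half])
    lower = Counter(choice for _, choice in ranked[-half:])
    return {opt: (upper[opt], lower[opt]) for opt in options}


# Hypothetical item with key "A", answered by 8 test-takers.
choices = ["A", "A", "A", "B", "C", "A", "B", "C"]
totals = [40, 38, 35, 33, 20, 18, 15, 12]
table = option_counts(choices, totals)
for option, (upper_n, lower_n) in table.items():
    print(option, upper_n, lower_n)
```

A keyed answer chosen mostly by the upper group and distractors chosen mostly by the lower group suggest the item is working; an option nobody chooses (here "D") is a distractor worth reviewing.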
03 CRITERION-RELATED
How well did the test-takers’ VSTEP overall and
sub-test scores correlate with their
overall and sub-test IELTS scores?
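One way to answer this question is a correlation coefficient between paired scores. The sketch below assumes Pearson's r is appropriate (the slides do not specify the coefficient) and uses invented scores for five test-takers; a rank-based coefficient may suit band scores better.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical overall scores for the same five test-takers.
vstep_scores = [4.0, 5.5, 6.0, 7.5, 8.5]
ielts_scores = [4.5, 5.0, 6.0, 6.5, 7.5]
r = pearson_r(vstep_scores, ielts_scores)
print(round(r, 2))
```

The same function can be applied sub-test by sub-test (for example VSTEP listening against IELTS listening) to build the full correlation table.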
04 CONSEQUENCES
• The value implications of score interpretation
• The actual and potential consequences of score uses
(Messick, 1989)
FOCUS: on validity of test score interpretation and use, i.e. construct under-representation or construct-irrelevant variance
Sources of evidence
• Content relevance and representativeness
• Item bias
• Technical quality of the test
• Expert judgment
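Item bias is usually examined through differential item functioning (DIF). The slides do not name a method; the sketch below assumes the Mantel-Haenszel common odds ratio, with test-takers matched on a total-score stratum. Group labels, strata and counts are hypothetical.

```python
from collections import defaultdict


def mantel_haenszel_odds_ratio(records):
    """Mantel-Haenszel common odds ratio for one item.

    records: iterable of (group, stratum, correct), where group is
    "ref" or "focal" and stratum is a matched total-score level.
    A value near 1 suggests no DIF; values far from 1 suggest the item
    favours one group among test-takers of comparable ability.
    Raises ZeroDivisionError if no stratum contributes to the denominator.
    """
    tables = defaultdict(lambda: {"ref": [0, 0], "focal": [0, 0]})
    for group, stratum, correct in records:
        tables[stratum][group][0 if correct else 1] += 1
    numerator = denominator = 0.0
    for cells in tables.values():
        a, b = cells["ref"]    # reference group: correct, incorrect
        c, d = cells["focal"]  # focal group: correct, incorrect
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


def make_records(group, stratum, n_correct, n_incorrect):
    """Helper that expands counts into individual records."""
    return ([(group, stratum, True)] * n_correct
            + [(group, stratum, False)] * n_incorrect)


# Hypothetical item where both groups perform alike within each stratum.
fair_item = (make_records("ref", "low", 2, 2) + make_records("focal", "low", 1, 1)
             + make_records("ref", "high", 3, 1) + make_records("focal", "high", 3, 1))
# Hypothetical item where the focal group does worse in the high stratum.
biased_item = (make_records("ref", "low", 2, 2) + make_records("focal", "low", 1, 1)
               + make_records("ref", "high", 3, 1) + make_records("focal", "high", 1, 3))
fair_ratio = mantel_haenszel_odds_ratio(fair_item)
biased_ratio = mantel_haenszel_odds_ratio(biased_item)
print(fair_ratio, biased_ratio)
```

A statistical flag of this kind is only a starting point; flagged items would still go to the expert judgment listed above before being labelled biased.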
References
• American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1985). Standards for Educational and Psychological Testing. Washington, DC: Authors.
• American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association.
• Andrich, D., & Mercer, A. (1997). International perspectives on selection methods of entry into higher education. Canberra: National Board of Employment, Education and Training [and] Higher Education Council.
• Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.
• Bachman, L. F. (2004). Statistical analyses for language assessment. Cambridge: Cambridge University Press.
• Berk, R. A. (1980). Item Analysis. In R. A. Berk (Ed.), Criterion-referenced measurement: the state of the art. Baltimore and London: The Johns Hopkins University Press.
• Cureton, E. E. (1951). Validity. In E. F. Lindquist (Ed.), Educational measurement (pp. 621-694). Washington, D.C.: American Council on Education.
• Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park, California: Sage Publications.
• Kane, M. T. (1992). An argument-based approach to validity. Psychological Bulletin, 112, 527.
• Kane, M. T. (2006). Validation. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 17-64). Westport, CT: American Council on Education/Praeger.
• Loevinger, J. (1957). Objective tests as instruments of psychological theory. Psychological Reports, 3(4), 635-694.
• McNamara, T., & Roever, C. (2006). Language testing: the social dimension. Malden, MA: Blackwell Publishing.
• Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13-103). New York: American Council on Education/Macmillan.
• MOET. (2006). Secondary Education Curriculum: English. Hanoi: Education Publisher.
• Moss, P. A. (2007). Reconstructing Validity. Educational Researcher, 36(8), 470-476.
• Popham, W. J. (1997). Consequential Validity: Right Concern--Wrong Concept. Educational Measurement: Issues and Practice, 16(2), 9-13.
• Purpura, J. E. (1999). Learner strategy use and performance on language tests: a structural equation modeling approach. Cambridge: Cambridge University Press.
• Smith, E. V. (2004). Evidence for Reliability of Measures and Validity of Measure Interpretation: A Rasch Measurement Perspective. In E. V. Smith & R. M. Smith (Eds.), Introduction to Rasch Measurement: Theory, Models and Applications. Maple Grove: JAM Press.
• Wu, M. L., Adams, R. J., & Haldane, S. (2008). ConQuest: Generalised Item Response Modelling Software [computer program]. Camberwell: Australian Council for Educational Research.
THANK YOU
FOR YOUR ATTENTION
Q & A
2825/12/2015

More Related Content

What's hot

Presentation validity
Presentation validityPresentation validity
Presentation validityAshMusavi
 
Validity, reliability & practicality
Validity, reliability & practicalityValidity, reliability & practicality
Validity, reliability & practicalitySamcruz5
 
validity its types and importance
validity its types and importancevalidity its types and importance
validity its types and importanceIerine Joy Caserial
 
Presentation on validity and reliability in research methods
Presentation on validity and reliability in research methodsPresentation on validity and reliability in research methods
Presentation on validity and reliability in research methodsMehwish Iqbal
 
Validity in psychological testing
Validity in psychological testingValidity in psychological testing
Validity in psychological testingMilen Ramos
 
Reliability and validity w3
Reliability and validity w3Reliability and validity w3
Reliability and validity w3Muhammad Ali
 
Content &statistical validity
Content &statistical validityContent &statistical validity
Content &statistical validityAMU
 
Reliability and validity ppt
Reliability and validity pptReliability and validity ppt
Reliability and validity pptsurendra poudel
 
Validity & reliability seminar
Validity & reliability seminarValidity & reliability seminar
Validity & reliability seminarmrikara185
 
15th batch NPTI Validity & Reliablity Business Research Methods
15th batch NPTI Validity & Reliablity Business Research Methods 15th batch NPTI Validity & Reliablity Business Research Methods
15th batch NPTI Validity & Reliablity Business Research Methods Ravi Pohani
 
Validity, reliability and feasibility
Validity, reliability and feasibilityValidity, reliability and feasibility
Validity, reliability and feasibilitysilpa $H!lu
 
Reliability and validity
Reliability and validityReliability and validity
Reliability and validityKaimrc_Rss_Jd
 
Tools in Qualitative Research: Validity and Reliability
Tools in Qualitative Research: Validity and ReliabilityTools in Qualitative Research: Validity and Reliability
Tools in Qualitative Research: Validity and ReliabilityDr. Sarita Anand
 
Validity, reliabiltiy and alignment to determine the effectiveness of assessment
Validity, reliabiltiy and alignment to determine the effectiveness of assessmentValidity, reliabiltiy and alignment to determine the effectiveness of assessment
Validity, reliabiltiy and alignment to determine the effectiveness of assessmentMirea Mizushima
 

What's hot (20)

Presentation validity
Presentation validityPresentation validity
Presentation validity
 
Validation
ValidationValidation
Validation
 
Validity, reliability & practicality
Validity, reliability & practicalityValidity, reliability & practicality
Validity, reliability & practicality
 
validity its types and importance
validity its types and importancevalidity its types and importance
validity its types and importance
 
Presentation on validity and reliability in research methods
Presentation on validity and reliability in research methodsPresentation on validity and reliability in research methods
Presentation on validity and reliability in research methods
 
Rep
RepRep
Rep
 
Validity in psychological testing
Validity in psychological testingValidity in psychological testing
Validity in psychological testing
 
Validity in Assessment
Validity in AssessmentValidity in Assessment
Validity in Assessment
 
Validity & Reliability
Validity & ReliabilityValidity & Reliability
Validity & Reliability
 
Reliability and validity w3
Reliability and validity w3Reliability and validity w3
Reliability and validity w3
 
Reliablity and Validity
Reliablity and ValidityReliablity and Validity
Reliablity and Validity
 
Content &statistical validity
Content &statistical validityContent &statistical validity
Content &statistical validity
 
Validity
ValidityValidity
Validity
 
Reliability and validity ppt
Reliability and validity pptReliability and validity ppt
Reliability and validity ppt
 
Validity & reliability seminar
Validity & reliability seminarValidity & reliability seminar
Validity & reliability seminar
 
15th batch NPTI Validity & Reliablity Business Research Methods
15th batch NPTI Validity & Reliablity Business Research Methods 15th batch NPTI Validity & Reliablity Business Research Methods
15th batch NPTI Validity & Reliablity Business Research Methods
 
Validity, reliability and feasibility
Validity, reliability and feasibilityValidity, reliability and feasibility
Validity, reliability and feasibility
 
Reliability and validity
Reliability and validityReliability and validity
Reliability and validity
 
Tools in Qualitative Research: Validity and Reliability
Tools in Qualitative Research: Validity and ReliabilityTools in Qualitative Research: Validity and Reliability
Tools in Qualitative Research: Validity and Reliability
 
Validity, reliabiltiy and alignment to determine the effectiveness of assessment
Validity, reliabiltiy and alignment to determine the effectiveness of assessmentValidity, reliabiltiy and alignment to determine the effectiveness of assessment
Validity, reliabiltiy and alignment to determine the effectiveness of assessment
 

Viewers also liked

Ail apresentation(kumazawa)
Ail apresentation(kumazawa)Ail apresentation(kumazawa)
Ail apresentation(kumazawa)TakaKumazawa
 
Peering through the Looking Glass: Towards a Programmatic View of the Qualify...
Peering through the Looking Glass: Towards a Programmatic View of the Qualify...Peering through the Looking Glass: Towards a Programmatic View of the Qualify...
Peering through the Looking Glass: Towards a Programmatic View of the Qualify...MedCouncilCan
 
Table of specifications 2013 copy
Table of specifications 2013   copyTable of specifications 2013   copy
Table of specifications 2013 copyMarciano Melchor
 
Why Process Measures Are Often More Important Than Outcome Measures in Health...
Why Process Measures Are Often More Important Than Outcome Measures in Health...Why Process Measures Are Often More Important Than Outcome Measures in Health...
Why Process Measures Are Often More Important Than Outcome Measures in Health...Health Catalyst
 

Viewers also liked (6)

Ail apresentation(kumazawa)
Ail apresentation(kumazawa)Ail apresentation(kumazawa)
Ail apresentation(kumazawa)
 
Peering through the Looking Glass: Towards a Programmatic View of the Qualify...
Peering through the Looking Glass: Towards a Programmatic View of the Qualify...Peering through the Looking Glass: Towards a Programmatic View of the Qualify...
Peering through the Looking Glass: Towards a Programmatic View of the Qualify...
 
Language testing the social dimension
Language testing  the social dimensionLanguage testing  the social dimension
Language testing the social dimension
 
Table of specifications 2013 copy
Table of specifications 2013   copyTable of specifications 2013   copy
Table of specifications 2013 copy
 
Why Process Measures Are Often More Important Than Outcome Measures in Health...
Why Process Measures Are Often More Important Than Outcome Measures in Health...Why Process Measures Are Often More Important Than Outcome Measures in Health...
Why Process Measures Are Often More Important Than Outcome Measures in Health...
 
Table of specifications
Table of specificationsTable of specifications
Table of specifications
 

Similar to Xác trị slide 1 - validation basics

reliability and validity psychology 1234
reliability and validity psychology 1234reliability and validity psychology 1234
reliability and validity psychology 1234MajaAiraBumatay
 
Validity and reliability of questionnaires
Validity and reliability of questionnairesValidity and reliability of questionnaires
Validity and reliability of questionnairesVenkitachalam R
 
Copie de PRESENTATION_ RELIABILITY _ VALIDITY.pptx
Copie de PRESENTATION_ RELIABILITY _ VALIDITY.pptxCopie de PRESENTATION_ RELIABILITY _ VALIDITY.pptx
Copie de PRESENTATION_ RELIABILITY _ VALIDITY.pptxMonsefJraid
 
Principles of Language Assessment
Principles of Language AssessmentPrinciples of Language Assessment
Principles of Language Assessmentisacaiza82
 
Designing classsroom
Designing classsroomDesigning classsroom
Designing classsroomdesfi ceriany
 
HND_MSCP_W5_Reliability_and_Validity_of_Research.pdf
HND_MSCP_W5_Reliability_and_Validity_of_Research.pdfHND_MSCP_W5_Reliability_and_Validity_of_Research.pdf
HND_MSCP_W5_Reliability_and_Validity_of_Research.pdfMohammedAskar22
 
Principles of language assessment ( evaluation of language teaching)
Principles of language assessment ( evaluation of language teaching)Principles of language assessment ( evaluation of language teaching)
Principles of language assessment ( evaluation of language teaching)Alfi Suru
 
Principles of language assessment ( evaluation of language teaching)
Principles of language assessment ( evaluation of language teaching)Principles of language assessment ( evaluation of language teaching)
Principles of language assessment ( evaluation of language teaching)Alfi Suru
 
JC-16-23June2021-rel-val.pptx
JC-16-23June2021-rel-val.pptxJC-16-23June2021-rel-val.pptx
JC-16-23June2021-rel-val.pptxsaurami
 
Language Testing : Principles of language assessment
Language Testing : Principles of language assessment Language Testing : Principles of language assessment
Language Testing : Principles of language assessment Yulia Eolia
 
NQC Presentation On Validation And Moderation
NQC Presentation On Validation And ModerationNQC Presentation On Validation And Moderation
NQC Presentation On Validation And ModerationKathleen Zarubin
 
Presentation Validity & Reliability
Presentation Validity & ReliabilityPresentation Validity & Reliability
Presentation Validity & Reliabilitysongoten77
 
Item development.pdf for national examination development
Item development.pdf for national examination developmentItem development.pdf for national examination development
Item development.pdf for national examination developmentGalataaAGoobanaa
 

Similar to Xác trị slide 1 - validation basics (20)

reliability and validity psychology 1234
reliability and validity psychology 1234reliability and validity psychology 1234
reliability and validity psychology 1234
 
Validity and reliability of questionnaires
Validity and reliability of questionnairesValidity and reliability of questionnaires
Validity and reliability of questionnaires
 
Copie de PRESENTATION_ RELIABILITY _ VALIDITY.pptx
Copie de PRESENTATION_ RELIABILITY _ VALIDITY.pptxCopie de PRESENTATION_ RELIABILITY _ VALIDITY.pptx
Copie de PRESENTATION_ RELIABILITY _ VALIDITY.pptx
 
Principles of Language Assessment
Principles of Language AssessmentPrinciples of Language Assessment
Principles of Language Assessment
 
Designing classsroom
Designing classsroomDesigning classsroom
Designing classsroom
 
HND_MSCP_W5_Reliability_and_Validity_of_Research.pdf
HND_MSCP_W5_Reliability_and_Validity_of_Research.pdfHND_MSCP_W5_Reliability_and_Validity_of_Research.pdf
HND_MSCP_W5_Reliability_and_Validity_of_Research.pdf
 
Principles of language assessment ( evaluation of language teaching)
Principles of language assessment ( evaluation of language teaching)Principles of language assessment ( evaluation of language teaching)
Principles of language assessment ( evaluation of language teaching)
 
Principles of language assessment ( evaluation of language teaching)
Principles of language assessment ( evaluation of language teaching)Principles of language assessment ( evaluation of language teaching)
Principles of language assessment ( evaluation of language teaching)
 
Test construction
Test constructionTest construction
Test construction
 
Qualitative Research Methods
Qualitative Research MethodsQualitative Research Methods
Qualitative Research Methods
 
JC-16-23June2021-rel-val.pptx
JC-16-23June2021-rel-val.pptxJC-16-23June2021-rel-val.pptx
JC-16-23June2021-rel-val.pptx
 
The Components of Test Specifications
The Components of Test SpecificationsThe Components of Test Specifications
The Components of Test Specifications
 
Language Testing : Principles of language assessment
Language Testing : Principles of language assessment Language Testing : Principles of language assessment
Language Testing : Principles of language assessment
 
Validity & reliability
Validity & reliabilityValidity & reliability
Validity & reliability
 
Validity
ValidityValidity
Validity
 
CRITERIA OF A GOOD TEST.pptx
CRITERIA OF A GOOD TEST.pptxCRITERIA OF A GOOD TEST.pptx
CRITERIA OF A GOOD TEST.pptx
 
NQC Presentation On Validation And Moderation
NQC Presentation On Validation And ModerationNQC Presentation On Validation And Moderation
NQC Presentation On Validation And Moderation
 
Intro assessmentcmm
Intro assessmentcmmIntro assessmentcmm
Intro assessmentcmm
 
Presentation Validity & Reliability
Presentation Validity & ReliabilityPresentation Validity & Reliability
Presentation Validity & Reliability
 
Item development.pdf for national examination development
Item development.pdf for national examination developmentItem development.pdf for national examination development
Item development.pdf for national examination development
 

Validation slide 1 - validation basics

  • 1. VALIDITY AND VALIDATION: THEORIES AND PROCEDURES (25/12/2015)
  • 2. VALIDATION TASK
      - To establish whether the interpretation and uses of the VSTEP test scores were valid for measuring the English language competence of test-takers from level 3 to level 5 on the Vietnamese English language competence scale.
  • 3. VALIDITY & VALIDATION
      - Validity is an integrated evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment. (Messick, 1989)
      - Validation is to marshal evidence and arguments in support of, or counter to, proposed interpretations and uses of test scores. (Messick, 1989)
  • 4. VALIDITY THEORIES
      - 1985: The 1985 Testing Standards
          - Unified concept of validity
          - Construct-related evidence
          - Content-related evidence
          - Concurrent-related evidence
      - 1989: Messick’s Validity Chapter
          - Unified concept of validity
          - Evidential basis (Construct, Relevance, Utility)
          - Consequential basis (Values, Social Consequences)
  • 5. MESSICK (1989)’S ASPECTS OF VALIDITY
      - Content, Structural, Consequential, External, Generalizability, Substantive
  • 6. MESSICK (1989)’S ASPECTS OF VALIDITY
      - The content aspect
          - Content relevance
          - Representativeness
          - Technical quality
      - The substantive aspect: theoretical rationales for observed consistencies in responses
          - Process of performance
          - Empirical evidence of process
  • 7. MESSICK (1989)’S ASPECTS OF VALIDITY
      - The structural aspect: the fidelity of the scoring structure to the construct structure
      - The generalizability aspect: the extent to which score properties and interpretations generalize to and across groups, settings and tasks
          - Reliability
          - Content representativeness
  • 8. MESSICK (1989)’S ASPECTS OF VALIDITY
      - The external aspect
          - Convergent and discriminant evidence
          - Criterion relevance
          - Applied utility
      - The consequential aspect: value implications as a basis for action/consequences
          - Bias
          - Fairness
  • 9. MESSICK (1989)’S VALIDITY FRAMEWORK
      - Value
          - The most influential framework of validity
      - Criticisms
          - Abstract
          - Difficult for a single researcher to carry out
          - No specific guidance for specific validation contexts
  • 10. VALIDITY THEORIES
      - Kane’s (1992) and (2006) Validity Chapter: Argument-based Approach to Validation
          - Interpretive Argument: the network of inferences and assumptions
          - Validity Argument
              - Logical evidence
              - Empirical evidence
      - Stages: The Development Stage; The Appraisal Stage
  • 11. KANE (1992)’S VALIDITY FRAMEWORK
      - Values
          - The most practical, objective framework of validity
          - Unique interpretive argument, consistent validity argument steps (Bachman, 2004)
      - Criticisms
          - No attention to the structural aspect (Messick, 1995)
          - Inadequate attention/method to policy context and consequences of tests (McNamara, 2006)
  • 12. LANGUAGE TEST VALIDATION
      - Bachman (1990)’s framework, after Messick (1989)’s
      - Bachman (2004)’s framework, after Kane (1992)’s
  • 13. CHOICE OF VALIDITY FRAMEWORK
      - Messick (1989)’s
      - Six aspects: Content, Structural, Consequential, External, Generalizability, Substantive
  • 14. VALIDATION QUESTIONS
      1. To what extent was the test content relevant to and representative of the domain of English language ability?
      2. To what extent was each sub-test successful in measuring students’ English language ability?
      3. How well did the test-takers’ test scores on the VSTEP correlate with their test scores on the IELTS?
      4. What were the consequences of the UEE English test scores' interpretation and use?
  • 15. 01 CONTENT
      - Content relevance
      - Technical quality
      - Content representativeness
  • 16. 01 CONTENT: RELEVANCE
      - Topical content
      - Typical behavior
      - Underlying process
      - Test specifications
  • 17. 01 CONTENT: TECHNICAL QUALITY
      - Empirical Evidence
          - difficulty level
          - discriminating power
      - Expert Judgment
          - readability level
          - freedom from ambiguity/irrelevancy
          - appropriateness of keyed answers & distractors
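
The two empirical indicators on slide 17 can be illustrated with a classical item analysis. The sketch below is an assumption about how they might be computed, not the procedure used in the VSTEP study: difficulty is taken as the proportion of correct answers, and discriminating power as the correlation between an item and the total score on the remaining items. The `responses` matrix is hypothetical data invented for illustration.

```python
from statistics import mean, pstdev


def item_difficulty(responses, item):
    """Proportion of test-takers who answered the item correctly."""
    return mean(row[item] for row in responses)


def item_discrimination(responses, item):
    """Correlation between the item score and the score on all other items."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    mean_item, mean_rest = mean(item_scores), mean(rest_scores)
    sd_item, sd_rest = pstdev(item_scores), pstdev(rest_scores)
    if sd_item == 0 or sd_rest == 0:
        return float("nan")  # undefined when everyone scores the same
    covariance = mean(
        (i - mean_item) * (r - mean_rest)
        for i, r in zip(item_scores, rest_scores)
    )
    return covariance / (sd_item * sd_rest)


# Hypothetical scored responses: one row per test-taker, one column per item
# (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
]

for item in range(len(responses[0])):
    print(item, item_difficulty(responses, item), item_discrimination(responses, item))
```

In this invented data set the last item discriminates negatively: weaker test-takers answer it correctly more often than stronger ones, which is the kind of pattern that would normally send an item back for expert review.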
  • 18. 01 CONTENT: REPRESENTATIVENESS
      - “The breadth of the content specifications for a test should reflect the breadth of the construct invoked in score interpretation” (Messick, 1989, p. 35).
      - All essential components of the construct domain are covered (Messick, 1994, p. 12).
  • 19. 01 CONTENT: CONTENT ANALYSIS BY EXPERTS
      - What knowledge and skills are needed to do each item correctly?
      - How relevant are the items to their assigned objectives and domain?
      - Domain
          - English secondary school curricula
          - English program at the college
  • 21. 01 CONTENT: Item fit statistics
      - Smith (2004) suggested using item fit statistics to evaluate the extent to which items tap into the same construct and place test-takers in the same order:
          - the extent to which the use of each item is consistent with the way people have responded to the other items
          - does the item rank order the individuals in a manner similar to other items? (p. 106)
      - Smith (2004) argued that test-takers should be ranked consistently by items measuring the same construct. If they are not, the items that misfit the Rasch model, i.e. the items that measure a different construct, should be subject to revision or elimination (p. 107).
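
As an illustration of what an item fit statistic measures, the sketch below computes the commonly used infit and outfit mean-square statistics for one dichotomous item under the Rasch model. It assumes that person abilities (`thetas`) and the item difficulty (`b`) have already been estimated by Rasch software; the function names and the example values are hypothetical, and the slides do not say which fit indices or cut-offs were used.

```python
import math


def rasch_probability(theta, b):
    """Rasch model probability that a person of ability theta answers
    an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_fit(item_responses, thetas, b):
    """Return (infit, outfit) mean-square statistics for one item.

    item_responses: 0/1 scores on the item, one per test-taker
    thetas: ability estimates for the same test-takers (assumed given)
    b: difficulty estimate for the item (assumed given)
    """
    squared_residuals = []
    variances = []
    for x, theta in zip(item_responses, thetas):
        p = rasch_probability(theta, b)
        squared_residuals.append((x - p) ** 2)
        variances.append(p * (1.0 - p))
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(variances)
    # Infit: information-weighted mean square.
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit


# Hypothetical ability estimates (in logits) and item difficulty.
thetas = [-2.0, -1.0, 1.0, 2.0]
b = 0.0

consistent = item_fit([0, 0, 1, 1], thetas, b)  # able test-takers succeed
reversed_ = item_fit([1, 1, 0, 0], thetas, b)   # able test-takers fail
print(consistent, reversed_)
```

Values near 1 indicate responses about as predictable as the model expects. The reversed pattern produces mean squares well above 1, which corresponds to the case Smith (2004) describes: the item does not rank test-takers in the same order as the other items.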
  • 22. 02 SUBSTANTIVE & STRUCTURAL
      - To what extent were the VSTEP sub-tests successful in measuring students’ English language competence?
      - ITEM RESPONSE THEORY (RASCH MODEL)
          - item fit
          - item discrimination
          - item cluster
      - DESCRIPTIVE STATISTICS
          - choice response analysis
  • 23. 03 CRITERION-RELATED
      - How well did the test-takers’ VSTEP overall and sub-test scores correlate with the test-takers’ overall and sub-test IELTS scores?
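
The criterion-related question on slide 23 comes down to correlating paired scores. A minimal sketch, assuming a Pearson product-moment correlation is appropriate (the slides do not name the coefficient used) and using invented scores for five test-takers:

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a score list is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical overall scores for the same five test-takers on each test.
vstep_scores = [4.0, 5.5, 6.0, 7.5, 8.5]
ielts_scores = [4.5, 5.0, 6.0, 6.5, 7.5]

print(round(pearson_r(vstep_scores, ielts_scores), 3))
```

The same function can be applied to each pair of sub-test scores (for example VSTEP listening against IELTS listening) to cover the sub-test part of the question.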
  • 24. 04 CONSEQUENCES
      - The value implications of score interpretation
      - The actual and potential consequences of score uses (Messick, 1989)
      - FOCUS: on validity of test score interpretation and use (construct under-representation or construct-irrelevant variance)
  • 25. 04 CONSEQUENCES
      - Sources of evidence
          - Content relevance and representativeness
          - Item bias
          - Technical quality of the test
          - Expert judgment
  • 26. References
      - American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1985). Standards for Educational and Psychological Testing. Washington, DC: Authors.
      - American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association.
      - Andrich, D., & Mercer, A. (1997). International perspectives on selection methods of entry into higher education. Canberra: National Board of Employment, Education and Training [and] Higher Education Council.
      - Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.
      - Bachman, L. F. (2004). Statistical analyses for language assessment. Cambridge: Cambridge University Press.
      - Berk, R. A. (1980). Item Analysis. In R. A. Berk (Ed.), Criterion-referenced measurement: the state of the art. Baltimore and London: The Johns Hopkins University Press.
      - Cureton, E. E. (1951). Validity. In E. F. Lindquist (Ed.), Educational measurement (pp. 621-694). Washington, D.C.: American Council on Education.
      - Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park, California: Sage Publications.
      - Kane, M. T. (1992). An argument-based approach to validity. Psychological Bulletin, 112, 527.
      - Kane, M. T. (2006). Validation. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 17-64). Westport, CT: American Council on Education/Praeger.
      - Loevinger, J. (1957). Objective tests as instruments of psychological theory. Psychological Reports, 3(4), 635-694.
      - McNamara, T., & Roever, C. (2006). Language testing: the social dimension. Malden, MA: Blackwell Publishing.
      - Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13-103). New York: American Council on Education/Macmillan.
      - MOET. (2006). Secondary Education Curriculum: English. Hanoi: Education Publisher.
      - Moss, P. A. (2007). Reconstructing Validity. Educational Researcher, 36(8), 470-476.
      - Popham, W. J. (1997). Consequential Validity: Right Concern--Wrong Concept. Educational Measurement: Issues and Practice, 16(2), 9-13.
      - Purpura, J. E. (1999). Learner strategy use and performance on language tests: a structural equation modeling approach. Cambridge: Cambridge University Press.
      - Smith, E. V. (2004). Evidence for Reliability of Measures and Validity of Measure Interpretation: A Rasch Measurement Perspective. In E. V. Smith & R. M. Smith (Eds.), Introduction to Rasch Measurement: Theory, Models and Applications. Maple Grove: JAM Press.
      - Wu, M. L., Adams, R. J., & Haldane, S. (2008). ConQuest: Generalised Item Response Modelling Software [computer program]. Camberwell: Australian Council for Educational Research.
  • 27. THANK YOU FOR YOUR ATTENTION