VALIDITY
and
RELIABILITY
ELEYNFIE A. SANICO, MAEd-EEd
EEd 505- Evaluation of Learning
VALIDITY:
It is a term derived from the Latin word validus, meaning strong.
In contrast to what some teachers believe, it is not a property of a test. It
pertains to the accuracy of the inferences teachers make about
students based on the information gathered from an assessment
(McMillan, 2007; Fives & DiDonato-Barnes, 2013).
This implies that the conclusions teachers
come up with in their evaluation of student
performance are valid if there is strong and
sound evidence of the extent of students'
learning.
An assessment is valid if it measures a student's actual
knowledge and performance with respect to the
intended outcomes, and not something else.
It is representative of the area of learning or content of
the curricular aim being assessed (McMillan, 2007;
Popham, 2011).
For instance, an assessment purportedly measuring the
arithmetic skills of grade 4 pupils is invalid if used for
grade 1 pupils because of issues with content (test
content evidence) and level of performance (response
process evidence).
A test that measures recall of
mathematical formulas is invalid if
it is supposed to assess problem-
solving.
There are three sources of information
that can be used to establish validity:
Content-Related Evidence
Criterion-Related Evidence
Construct-Related Evidence
A. Content-Related Evidence
Content-related evidence for
validity pertains to the extent to
which the test covers the entire
domain of content.
For example, if a grade 4 pupil was able
to correctly answer 80% of the items in
a Science test about matter, the teacher
may infer that the pupil knows 80% of
the content area.
Face validity
A test that appears to adequately
measure the learning outcomes
and content is said to possess
face validity.
As the name suggests, it looks at the
superficial face value of the instrument.
It is based on the subjective opinion of the
one reviewing it.
Hence, it is considered non-systematic or
non-scientific.
A test that was prepared to assess the ability of
pupils to construct simple sentences with
correct subject-verb agreement has face
validity if the test looks like an adequate
measure of that skill.
Instructional Validity
It is the extent to which an assessment is systematically
sensitive to the nature of the instruction offered.
Popham (2006, p. 1) defined it as the "degree to which
students' performances on a test accurately reflect the
quality of instruction to promote students' mastery of
what is being assessed."
Yoon & Resnick (1998) asserted that an instructionally valid
test is one that registers differences in the amount and kind of
instruction to which students have been exposed.
They also described the degree of overlap between the
content tested and the content taught as opportunity to learn,
which has an impact on test scores.
For example, suppose that in the first grading period, a class will cover three economic issues:
Unemployment
Globalization
Sustainable development
Only two were discussed in class, but the assessment covered all
three issues. Although all three were identified in the curriculum
guide and may even be found in a textbook, the question
remains as to whether the topics were all taught.
Inclusion of items that were not taken up in class reduces
validity because students had no opportunity to learn the
knowledge or skill being assessed.
Table of Specifications (ToS)
 It is prepared before developing the test. It is a test blueprint that identifies the
content area and describes the learning outcomes at each level of the cognitive
domain (Notar et al., 2004).
 It is a tool used in conjunction with lesson and unit planning to help teachers make
genuine connections between planning, instruction, and assessment (Fives &
DiDonato-Barnes, 2013).
 It assures teachers that they are testing students' learning across a wide range of
content and readings, as well as cognitive processes requiring higher-order thinking.
Carey (as cited in Notar et al., 2004) specified six
elements of ToS development:
1. balance among the goals selected for the examination
2. balance among the levels of learning
3. the test format
4. the total items
5. the number of items for each goal and level of learning
6. the enabling skills to be selected from each goal framework
Meanwhile, the number of items for each topic in the ToS
depends on the instructional time. This means that topics that
consumed more instructional time, with more teaching-learning
activities, should be given more emphasis.
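The proportional allocation described above can be sketched in code. This is a minimal illustration, not from the source; the topic names and hours below are hypothetical.

```python
# Sketch: allocating test items across ToS topics in proportion to
# instructional time, as the text suggests. Topics and hours are hypothetical.

def allocate_items(topic_hours, total_items):
    """Distribute total_items across topics proportionally to hours taught."""
    total_hours = sum(topic_hours.values())
    # Initial proportional share for each topic, rounded down
    shares = {t: int(total_items * h / total_hours) for t, h in topic_hours.items()}
    # Hand leftover items to the topics with the largest fractional remainders
    leftovers = sorted(
        topic_hours,
        key=lambda t: total_items * topic_hours[t] / total_hours - shares[t],
        reverse=True,
    )
    for t in leftovers[: total_items - sum(shares.values())]:
        shares[t] += 1
    return shares

hours = {"Matter": 6, "Force and Motion": 4, "Energy": 2}  # hypothetical unit
print(allocate_items(hours, 30))
# → {'Matter': 15, 'Force and Motion': 10, 'Energy': 5}
```

The rounding step simply guarantees that the shares add up to the planned total.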
Test items that demand higher order thinking skills obviously require
more time to answer, whereas simple recall items entail the least
amount.
Nitko & Brookhart (2011) give the average response time for each
assessment task, as shown in the table below.
Table 4.1: Time Requirements for Certain Assessment Tasks

Type of Test Question                                  Time Required to Answer
Modified response (true-false)                         20-30 seconds
Modified true or false                                 30-45 seconds
Sentence completion (one-word fill-in)                 40-60 seconds
Multiple choice with four responses (lower level)      40-60 seconds
Multiple choice (higher level)                         70-90 seconds
Matching type (5 stems, 6 choices)                     2-4 minutes
Short answer                                           2-4 minutes
Multiple choice (with calculation)                     2-5 minutes
Word problems (simple arithmetic)                      5-10 minutes
Short essays                                           15-20 minutes
Data analysis/graphing                                 15-25 minutes
Drawing models/labeling                                20-30 minutes
Extended essays                                        35-50 minutes
Validity suffers if the test is too short to sufficiently measure
behavior and cover the content.
Adding more items to the test may increase its validity.
However, an excessively long test may be taxing to the
students.
Given this trade-off, teachers must construct tests that
students can finish within a reasonable time.
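Using the average response times in Table 4.1, a teacher can roughly estimate whether a planned test fits the class period. A minimal sketch, with a hypothetical item mix and midpoint times taken from the table:

```python
# Sketch: estimating total testing time from the average response times
# in Table 4.1. The item mix below is a hypothetical example.

AVG_SECONDS = {                    # midpoints of the ranges in Table 4.1
    "true_false": 25,              # 20-30 seconds
    "sentence_completion": 50,     # 40-60 seconds
    "multiple_choice_lower": 50,   # 40-60 seconds
    "multiple_choice_higher": 80,  # 70-90 seconds
    "short_essay": 17.5 * 60,      # 15-20 minutes
}

# Hypothetical planned test: item type -> number of items
planned_test = {"true_false": 10, "multiple_choice_lower": 15, "short_essay": 1}

total_min = sum(AVG_SECONDS[k] * n for k, n in planned_test.items()) / 60
print(f"Estimated time: {total_min:.1f} minutes")
# → Estimated time: 34.2 minutes
```

A quick check like this helps catch a test that cannot reasonably be finished within the allotted period.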
B. Criterion-Related Evidence
It refers to the degree to which test scores
agree with an external criterion. As such, it is
related to external validity. It examines the
relationship between an assessment and
another measure of the same trait (McMillan,
2007).
Three types of criteria (Nitko & Brookhart,
2011):
1. Achievement test scores
2. Ratings, grades and other
numerical judgments made by the
teacher
3. Career data
Types of Criterion-Related Evidence
1. Concurrent Validity - provides an estimate of a
student's current performance in relation to a
previously validated or established measure.
2. Predictive Validity - pertains to the power or
usefulness of test scores to predict future
performance.
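Criterion-related evidence is commonly quantified as a validity coefficient: the correlation between scores on the assessment and scores on the criterion measure. A minimal sketch using the Pearson correlation; the score lists below are hypothetical:

```python
# Sketch: computing a validity coefficient as the Pearson correlation
# between a teacher-made test and a criterion measure. For concurrent
# validity the criterion is an established test taken at about the same
# time; for predictive validity it is later performance. Scores are
# hypothetical examples.
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

new_test  = [78, 85, 62, 90, 71]  # scores on the teacher-made test
criterion = [75, 88, 60, 94, 70]  # scores on an established measure
print(f"Validity coefficient: {pearson_r(new_test, criterion):.2f}")
# → Validity coefficient: 0.99
```

A coefficient near 1 indicates that the two measures rank students very similarly, which is the kind of agreement criterion-related evidence looks for.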
C. Construct-Related Evidence
A construct is an individual characteristic that explains some
aspect of behavior (Miller, Linn & Gronlund, 2009).
Construct-related evidence of validity is an assessment of the
quality of the instrument used.
It measures the extent to which the assessment is a
meaningful measure of an unobservable trait or characteristic
(McMillan, 2007).
Types of Construct-Related Evidence
(McMillan, 2007)
1. Theoretical
2. Logical
3. Statistical
Unified Concept of Validity
In 1989, Messick proposed a unified concept of validity based on an
expanded theory of construct validity, which addresses score meaning
and social values in test interpretation and test use.
His concept of unified validity "integrates considerations of content,
criteria, and consequences into a construct framework for the empirical
testing of rational hypotheses about score meaning and theoretically
relevant relationships" (Messick, 1995, p. 741).
Validity of Assessment Methods
Moskal (2003) laid down five recommendations that are intrinsically associated with the
validity of an assessment:
1. The selected performance should reflect a valued activity.
2. The completion of the performance assessment should provide a valuable
learning experience.
3. The statement of goals and objectives should be clearly aligned with
the measurable outcomes of the performance activity.
4. The task should not examine extraneous or unintended variables.
5. Performance assessments should be fair and free from bias.
Threats to Validity
Miller, Linn & Gronlund (2009) identified ten factors that affect the validity of assessment results:
1. Unclear test directions
2. Complicated vocabulary and sentence structure
3. Ambiguous statements
4. Inadequate time limits
5. Inappropriate level of difficulty of test items
6. Poorly constructed test items
7. Test items inappropriate for the outcomes being measured
8. Tests that are too short
9. Improper arrangement of items
10. Identifiable pattern of answers
Thank you
and
God bless..