A practical test is not expensive, stays within appropriate time constraints, is relatively easy to administer, and has a scoring/evaluation procedure that is specific and time-efficient.
1. Are administrative details clearly established before the test?
2. Can students complete the test reasonably within the set time frame?
3. Is the cost of the test within budget limits?
Consistency of assessment results (Linn & Gronlund). A test is reliable if: “You give the same test to the same student or matched students on two different occasions, the test should yield similar results.” (Brown, 2004)
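One common way to put a number on "similar results on two different occasions" is to correlate the two sets of scores. The sketch below is illustrative only: the source does not prescribe a statistic, and the function name `pearson_r` and both score lists are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-variation of the two sittings around their own means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary on both occasions")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for the same five students on two occasions.
first_sitting = [70, 82, 65, 90, 75]
second_sitting = [72, 80, 68, 88, 77]

r = pearson_r(first_sitting, second_sitting)
print(f"test-retest correlation: {r:.2f}")
```

A value close to 1 suggests the test ranks and spaces the students similarly on both occasions; how high is "high enough" depends on the stakes of the test and is not specified in the source.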
Student-related reliability
Rater reliability
Test administration reliability
Test reliability
The most common learner-related issue in reliability is caused by temporary illness, fatigue, a “bad day”, anxiety, and other physical or psychological factors.
Inter-rater reliability: a problem arises when two or more scorers yield inconsistent scores for the same test. Factors: lack of attention to scoring criteria, inexperience, inattention, etc.
Intra-rater reliability: a problem arises when a single scorer is inconsistent. Factors: unclear scoring criteria, fatigue, bias toward particular “good” and “bad” students, or simple carelessness.
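Rater consistency can be checked numerically. The source does not name a statistic; Cohen's kappa is one widely used option and is swapped in here as an assumption. It compares the observed agreement between two sets of ratings with the agreement expected by chance. The function name `cohen_kappa` and the band labels below are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length lists of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty, equal-length rating lists")
    n = len(ratings_a)
    # Proportion of scripts on which the two ratings match exactly.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    # Agreement expected if each set of ratings were assigned independently.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both lists use one and the same category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical band scores given by two raters to the same six essays.
rater_1 = ["A", "B", "B", "C", "A", "B"]
rater_2 = ["A", "B", "C", "C", "A", "A"]

kappa = cohen_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.2f}")  # prints: kappa = 0.52
```

The same function can be applied to intra-rater consistency by passing one rater's first and second scoring of the same scripts.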
Unreliability can also be caused by test administration factors, e.g. noise from outside, photocopying variations, room conditions, and even the condition of desks and chairs.
Factors in the test itself that cause unreliability: if a test is too long, test-takers may become fatigued by the time they reach the later items and hastily respond incorrectly. Ambiguous items are another source.
“Measuring what should be measured”
o Content-related evidence
o Criterion-related evidence
o Construct-related evidence
o Consequential validity
o Face validity
Content-related evidence is present if a test samples the subject matter about which conclusions are to be drawn, and if it requires the test-taker to perform the behavior that is being measured.
Criterion-related evidence is used to demonstrate the accuracy of a measure or procedure by comparing it with another measure or procedure which has been demonstrated to be valid.
Example: imagine a hands-on driving test has been shown to be an accurate test of driving skills. By comparing the scores on the written driving test with the scores from the hands-on driving test, the written test can be validated by using a criterion-related strategy in which the hands-on driving test is compared to the written test.
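The comparison in the driving-test example can be sketched as a rank correlation between the two sets of scores. This is an assumption for illustration, not a method given in the source: Spearman's rho is used because it only asks whether the written test orders candidates the same way as the already validated hands-on test. The function names and all scores are hypothetical.

```python
def average_ranks(scores):
    """Rank scores from 1 (lowest); tied scores share their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the two rank lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    ranks_x, ranks_y = average_ranks(xs), average_ranks(ys)
    mean_rank = (len(xs) + 1) / 2  # mean of ranks 1..n, also with ties
    cov = sum((a - mean_rank) * (b - mean_rank) for a, b in zip(ranks_x, ranks_y))
    var_x = sum((a - mean_rank) ** 2 for a in ranks_x)
    var_y = sum((b - mean_rank) ** 2 for b in ranks_y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary in both lists")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scores for six candidates on both driving tests.
written = [55, 70, 62, 88, 91, 47]
hands_on = [60, 72, 58, 85, 80, 50]

rho = spearman_rho(written, hands_on)
print(f"written vs hands-on rank correlation: {rho:.2f}")
```

A high coefficient would count as criterion-related evidence for the written test; a low one would suggest the written test is measuring something other than what the hands-on test measures.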
1. Concurrent validity / empirical validity: a test result is supported by other concurrent performance beyond the assessment itself. E.g. the validity of a high score on the final exam of a foreign language course will be substantiated by actual proficiency in the language.
2. Predictive validity: used to assess (and predict) a test-taker’s likelihood of future success, e.g. SNMPTN.
Construct-related evidence: how well performance on the assessment can be interpreted as a meaningful measure of some characteristic or quality.
Consequential validity: how well the use of assessment results accomplishes intended purposes and avoids unintended effects.
Face validity refers to the degree to which a test looks right, and appears to measure the knowledge or ability it claims to measure, based on the subjective judgment of the examinees who take it, the administrative personnel who decide on its use, and other psychometrically unsophisticated observers (Mousavi in Brown, 2004).
The language is as natural as possible. Items are contextualized rather than isolated. Topics are meaningful (relevant, interesting) for the learner. Some thematic organization to items is provided, such as through a story line or episode. Tasks represent, or closely approximate, real-world tasks.
Contextualized (“Going to”)
1. What _______ this weekend?
   a. you are going to do
   b. are you going to do
   c. your gonna do
2. I’m not sure. _______ anything special?
   a. Are you going to do
   b. You are going to do
   c. Is going to do

Decontextualized
1. There are three countries I would like to visit. One is Italy. _______
   a. The other is New Zealand and other is Nepal
   b. The others are New Zealand and Nepal
   c. Others are New Zealand and Nepal
2. When I was twelve years old, I used _______ every day.
   a. swimming
   b. to swimming
   c. to swim
Washback is the effect of testing on teaching and learning (Hughes in Brown, 2004). It generally refers to the effects tests have on instruction in terms of how students prepare for the test (Brown, 2004).