Validity is “the degree to which a certain inference from a test is appropriate or meaningful” (Drummond, 2000).
It is the extent to which a test does the job desired of it; the evidence may be either empirical or logical (Lyman, 1991).
It is the extent to which a test measures what it is supposed to measure (Murphy & Davidshofer, 1998).
Type: Content
Purpose: To compare whether the test items match the set of goals and objectives.
Procedure: Compare the test blueprint with the school, course, or program objectives. Use a panel of experts in the content area (e.g., teachers, professors).
Types of tests: Survey achievement tests, criterion-referenced tests, examinations.
Type: Criterion (predictive)
Purpose: To determine whether there is a relationship between a test and a criterion measure to be obtained in the future.
Procedure: Correlate test scores with a criterion measure obtained after a period of time.
Types of tests: Scholastic aptitude tests, general aptitude batteries, prognostic tests, readiness tests, personality tests.
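The predictive procedure above (correlating test scores with a criterion measured later) can be sketched in a few lines of Python. This is a minimal illustration, assuming the Pearson product-moment correlation is used as the validity coefficient; the function name `pearson_r` and all scores below are hypothetical and do not come from the slides.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]
    # Sum of cross-products and sums of squares of the deviations.
    cross = sum(a * b for a, b in zip(dev_x, dev_y))
    ss_x = sum(a * a for a in dev_x)
    ss_y = sum(b * b for b in dev_y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical data: aptitude scores at admission and grade-point
# averages for the same six students one year later (the criterion).
aptitude_scores = [52, 61, 47, 70, 58, 65]
later_gpa = [2.4, 3.0, 2.1, 3.6, 2.8, 3.1]

validity_coefficient = pearson_r(aptitude_scores, later_gpa)
print(f"predictive validity coefficient: {validity_coefficient:.2f}")
```

A coefficient near 1 would suggest that the test ranks examinees much as the later criterion does; a coefficient near 0 would suggest little predictive value for that criterion.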
Type: Construct
Purpose: To determine whether a construct exists and to understand the traits or concepts that make up the set of scores or items.
Procedure: Conduct multivariate statistical analysis, discriminant analysis, or multivariate analysis of variance.
Types of tests: Intelligence tests, aptitude tests, personality tests.
Reliability refers to the degree to which test scores are consistent, dependable, or repeatable; it is a function of the degree to which test scores are free from errors (Drummond, 2000).
It refers to the consistency of test scores obtained by the same persons when reexamined with the same test on different occasions, or with different sets of equivalent items, or under other variable examining conditions (Anastasi and Urbina, 1997).
One concept of reliability underlies the error of measurement of a single score, which lets us predict the range of fluctuation likely to occur in a single individual’s score as a result of irrelevant chance factors.
The other concept of reliability refers to the internal consistency of a test, which is based on the number of items in the test and the average intercorrelation among all of those items.
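Both concepts can be illustrated numerically. The sketch below assumes two common textbook formulas that the slides do not state explicitly: the standard error of measurement, SEM = SD * sqrt(1 - reliability), and the standardized alpha coefficient, k * r / (1 + (k - 1) * r), where k is the number of items and r is the average inter-item correlation. The function names and all numbers are hypothetical.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), for a reliability in [0, 1]."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1 - reliability)


def score_band(observed, sd, reliability, z=1.0):
    """Range of likely fluctuation around one observed score (z SEMs each way)."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - z * sem, observed + z * sem


def standardized_alpha(n_items, mean_inter_item_r):
    """Internal consistency from item count and average inter-item correlation."""
    if n_items < 2:
        raise ValueError("need at least two items")
    r = mean_inter_item_r
    return n_items * r / (1 + (n_items - 1) * r)


# Hypothetical test: score SD of 15 and a reliability coefficient of 0.91.
sem = standard_error_of_measurement(15, 0.91)
low, high = score_band(100, 15, 0.91)
print(f"SEM = {sem:.1f}; band for a score of 100: {low:.1f} to {high:.1f}")

# Hypothetical average inter-item correlation of 0.30: adding items
# raises the internal-consistency estimate.
print(f"alpha, 10 items: {standardized_alpha(10, 0.30):.2f}")
print(f"alpha, 20 items: {standardized_alpha(20, 0.30):.2f}")
```

With these assumed values the SEM comes to about 4.5 points, so an observed score of 100 would be expected to fluctuate roughly between 95.5 and 104.5 within one SEM. The alpha comparison shows why the number of items matters alongside the average intercorrelation.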
Method: Test-retest
Procedure: Same procedure given twice with a time interval between testing.
Coefficient: Stability
Problems: Memory effect; practice effect; change over time.

Method: Alternate forms
Procedure: Equivalent tests given with time between testing.
Coefficient: Equivalence and stability
Problems: Hard to develop two equivalent tests; may reflect change in behavior over time.
Method: Internal consistency
Procedure: One test given at one time only (the test is divided into parts in split-half).
Coefficient: Equivalence and internal consistency
Problems: Uses shortened forms (split-half); only good if traits are unitary or homogeneous; gives a high estimate on a speeded test; hard to compute by hand.
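The split-half procedure can be sketched as follows. This is one possible illustration, not a method prescribed by the slides: it assumes an odd/even split of the items and applies the Spearman-Brown formula, 2r / (1 + r), a standard correction for the shortened half-length forms. The helper names and the response matrix are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a half has no variance")
    return cross / sqrt(ss_x * ss_y)


def split_half_reliability(item_scores):
    """Return (half-test correlation, Spearman-Brown corrected estimate).

    item_scores holds one row per examinee and one column per item.
    """
    # Score the two halves: items in positions 1, 3, 5, ... versus 2, 4, 6, ...
    first_half = [sum(row[0::2]) for row in item_scores]
    second_half = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(first_half, second_half)
    # Step the half-length correlation up to an estimate for the full test.
    corrected = 2 * r_half / (1 + r_half)
    return r_half, corrected


# Hypothetical right/wrong (1/0) responses: 5 examinees, 6 items.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]

r_half, corrected = split_half_reliability(responses)
print(f"half-test r = {r_half:.2f}, corrected estimate = {corrected:.2f}")
```

The corrected estimate is higher than the raw half-test correlation because each half contains only half the items. As the problems listed above indicate, the estimate is only meaningful when the items measure a unitary trait and the test is not speeded.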