Characteristics of
a Good Test
Ajab Ali Lashari
Lecture 4
Subject: Test Construction
SMIU Karachi
Email: ajab.lasahri84@gmail.com
Definition of a Test
• A test is an instrument or systematic
procedure for observing and
describing one or more characteristics
of a student, using either a numerical
scale or a classification scheme.
Definitions of Measurement
and Evaluation
• Measurement is a procedure for
assigning numbers to a specified
attribute or characteristic of a person.
• Evaluation is the process of making a
value judgment about the worth of
someone or something. (Nitko, 2001).
Faulty Tests
• An estimated 13% of student failures in
class are caused by faulty test questions
(World Watch, The Philadelphia Trumpet,
2005).
• It is estimated that 90% of test items
are of poor quality (Wilen, 1992).
Learning Objective
• A learning objective (target)
specifies what you would like
students to achieve at the
completion of an instructional
segment.
Stages in Test
Construction
I. Planning the Test
A. Determining the Objectives
B. Preparing the Table of Specifications
C. Selecting the Appropriate Item Format
D. Writing the Test Items
E. Editing the Test Items
Stages in Test
Construction
A. Administering the test
B. Item analysis
C. Preparing the Final Form
of the Test
Stages in Test
Construction
Interpreting the Test Scores
Establishing Test Validity
Establishing Test Reliability
Characteristics
of a Good Test
Validity
Reliability
Practicality
Administrability
Comprehensiveness
Objectivity
Simplicity
Scorability
Validity
• A test is valid if it measures what we
want it to measure and nothing else.
Validity is a more test-dependent
concept, whereas reliability is a purely
statistical parameter.
Types of
validity
Content Validity
Criterion-Related Validity
Construct Validity
Face Validity
Content validity
• Does the test measure the objectives
of the course?
• The extent to which a test measures a
representative sample of the content
to be tested at the intended level of
learning
Criterion-related Validity
• Criterion-related Validity investigates
the correspondence between the
scores obtained from the newly-
developed test and the scores
obtained from some independent
outside criteria.
Criterion-related Validity
Depending on the time of
administration
• Concurrent validity: correlation of the
test scores (new test) with a
recognized measure taken at
the same time.
• Predictive validity: Comparison
(correlation) of students' scores with a
criterion taken at a later time.
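The correlation at the heart of criterion-related validity can be sketched in a few lines of Python. The scores below are invented purely for illustration; in practice you would use real scores from the new test and the outside criterion.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores: new test vs. an established measure taken at the same time
new_test  = [55, 62, 70, 48, 81, 66]
criterion = [58, 60, 75, 50, 85, 64]
print(round(pearson(new_test, criterion), 2))  # prints 0.97
```

A coefficient near 1 suggests the new test ranks examinees much as the established criterion does (concurrent validity); predictive validity is computed the same way, with the criterion scores gathered at a later time.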
Construct validity
• Refers to measuring certain traits or theoretical
constructs.
• It is based on the degree to which the items in a
test reflect the essential aspects of the theory on
which the test is based.
Face Validity
• Does it appear to measure what it is
supposed to measure?
Factors Affecting
Validity
• a. Directions (clear and simple)
• b. Difficulty level of the test (neither
too easy nor too difficult)
• c. Structure of the items (poorly
constructed and/or ambiguous
items will contribute to invalidity)
• d. Arrangement of items and
correct responses (starting with
the easiest items and ending with
the difficult ones + arranging item
responses randomly not based on
an identifiable pattern)
Reliability
• A test is reliable if we get the same
results repeatedly.
• With an “unreliable” test, on the other
hand, one’s score might fluctuate from
one administration to the next.
Several Ways of
Measuring
Reliability
• Test-Retest Reliability
• Parallel-Forms
• Split-Half Methods
• Internal Consistency
• Inter-Rater Reliability
Test-Retest
• Administering a given test to a
particular group twice and calculating
the correlation between the two sets
of scores.
• Since there has to be a reasonable
amount of time between the two
administrations, this kind of reliability
is referred to as reliability, or
consistency, over time.
Disadvantages
of Test-Retest
It requires two administrations.
Preparing similar conditions under which
the two administrations take place adds to
the complications of this method.
There should be a short interval between
the two administrations, though neither
too short nor too long; to keep the balance,
a period of two weeks between them is
recommended.
Parallel-Forms
• Two similar, or parallel, forms of the
same test are administered to a
group of examinees just once.
• The two forms of the test should be
equivalent.
• Subtests should also be equivalent.
• The problem here is that constructing
two parallel forms of a test is a
difficult job to do.
Split-Half Test
• In this method, a single test with
homogeneous items is administered
to a group of examinees, and the test is
split, or divided, into two equal halves.
• The correlation between the two
halves is an estimate of the test score
reliability. In this method, easy and
difficult items should be equally
distributed between the two halves.
Split-Half
Advantages and
Disadvantages
• Advantages:
• There is no need to administer the
same test twice. Nor is it
necessary to develop two parallel
forms of the same test.
• Disadvantages:
• Developing a test with
homogeneous items is difficult.
Which method should
we use?
• It depends on the function of the test.
• The test-retest method is appropriate when the
consistency of scores over a particular time interval
(the stability of test scores over time) is important.
• The parallel-forms method is desirable when the
consistency of scores over different forms is of
importance.
• When the go-togetherness of the items of a test
is of significance (internal consistency), the Split-
Half and KR-21 methods will be the most appropriate.
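KR-21 can be computed from total scores alone, using the formula r = (k/(k-1))(1 - M(k-M)/(k·s²)), where k is the number of items, M the mean total score, and s² the variance of the totals. The sketch below uses invented totals and the population variance; some texts use the sample variance instead, which gives a slightly different value.

```python
from statistics import mean, pvariance

def kr21(total_scores, k):
    """Kuder-Richardson formula 21: a reliability estimate that needs
    only the number-correct totals and the number of items.

    total_scores: number-correct total per examinee
    k: number of dichotomously scored items on the test
    """
    m = mean(total_scores)
    var = pvariance(total_scores)  # population variance of the totals
    return (k / (k - 1)) * (1 - (m * (k - m)) / (k * var))

# Hypothetical totals for eight examinees on a 20-item test
totals = [15, 12, 18, 9, 14, 16, 11, 17]
print(round(kr21(totals, k=20), 2))  # prints 0.53
```

KR-21 assumes all items are of roughly equal difficulty, so it tends to give a lower-bound estimate when that assumption fails.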
Factors
Influencing
Reliability
To obtain a reliability estimate, one or two
sets of scores must come from the same
group of testees.
Thus, two factors contribute to test reliability:
the testee and
the test itself.
Practicality
• Practicality refers to the ease of
administration and scoring of a test.
Administrability
• Administrability means the test should
be administered uniformly to all students
so that the scores obtained will
not vary due to factors other than
differences in the students’
knowledge and skills.
• There should be clear provision of
instructions for the students and for
the one who will check the test
(clear directions and
procedures).
Scorability
• Scorability means the test should be easy
to score: the directions for
scoring are clear, and an answer
sheet and answer key are provided.
Comprehensiveness
• A test is said to have comprehensiveness if
it encompasses all aspects of a particular
subject of study.
Simplicity
• A test is said to be simple if it is easy to
understand along with the instructions and
other details.
Objectivity
• Objectivity represents the
agreement of two or more
raters or a test administrator
concerning the score of a
student.
• Not influenced by emotion
or personal prejudice.
• Lack of objectivity reduces
test validity in the same way
that lack of reliability influences
validity.
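The agreement between raters can be quantified very simply as the proportion of answers on which they give the same mark. The marks below are invented; more formal indices (such as Cohen's kappa, which corrects for chance agreement) exist, but a plain agreement rate illustrates the idea.

```python
def percent_agreement(rater_a, rater_b):
    """Proportion of answers to which two raters assign the same score."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

# Hypothetical marks (out of 5) given by two raters to ten essay answers
rater_a = [4, 3, 5, 2, 4, 5, 3, 4, 2, 5]
rater_b = [4, 3, 4, 2, 4, 5, 3, 3, 2, 5]
print(percent_agreement(rater_a, rater_b))  # prints 0.8
```

The closer the agreement rate is to 1, the more objective the scoring.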
Other Factors
• Test length
• Speed
• Item difficulty
References
Miller, M. D., Linn, R. L., & Gronlund, N. E. (2008).
Measurement and Assessment in Teaching
(10th ed.). Pearson.
Nitko, A. J. (2001). Educational Assessment of
Students (3rd ed.). Merrill Prentice Hall.
http://www.ehow.com/how_4913690_steps-preparing-test.html
