Arash Yazdani
Introduction:

• A test may consist of a single item or a combination of items. Regardless of the number of items in a test, every single item should possess certain characteristics.
• Having good items, however, does not necessarily lead to a good test, because a test as a whole is more than a mere combination of individual items.
• Therefore, in addition to having good items, a test should have certain characteristics:
• 1. Reliability 2. Validity 3. Practicality
Reliability

There are different theories to explain the concept of reliability in a scientific way.
First and simplest: a test is reliable if we get the same results repeatedly.
Second: a test is reliable when it gives consistent results.
Third: reliability is the ratio of true score variance to observed score variance.
One way to explain the concept of reliability in non-technical terms is this: imagine the feeling that someone did not do as well on a test as he could have. Now imagine that, if he could take the test again, he would do better. This may be quite true. Nevertheless, one should also admit that some factors, such as good luck in guessing the correct responses, would raise his score higher than it should really be. Seldom does anyone complain about this.

Now, if one could take a test over and over again, he would probably agree that his average score over all the tests is an acceptable estimate of what he really knows or how he really feels about the test.
On a “reliable” test, one’s score on its various administrations would not differ greatly. That is, one’s score would be quite consistent.
On an “unreliable” test, on the other hand, one’s score might fluctuate from one administration to the other. That is, one’s score on various administrations will be inconsistent.
The notion of consistency of one’s score with respect to one’s average score over repeated administrations is the central concern of the concept of reliability.
The change in one’s score is inevitable. Some of the changes might represent a steady increase in one’s score. The increase would most likely be due to some sort of learning. This kind of change, which would be predictable, is called systematic variation.

Systematic variation contributes to the reliability of a test, and unsystematic variation, which is called error variation, contributes to its unreliability.
True Score
• Let’s assume that someone takes a test. Since all measurement devices are subject to error, the score one gets on a test cannot be a true manifestation of one’s ability in that particular trait. In other words, the score contains one’s true ability along with some error. If this error part could be eliminated, the resulting score would represent an errorless measure of that ability. By definition, this errorless score is called a “true score”.
Observed score


The true score is almost always different from the score one gets, which is called the “observed score”. Since the observed score includes the measurement error, i.e., the error score, it can be greater than, equal to, or smaller than the true score. If there is absolutely no error of measurement, the observed score will equal the true score. However, when there is a measurement error, which is often the case, it can lead to an overestimation or an underestimation of the true score. Therefore, if the observed score is represented by X, the true score by T, and the error score by E, the relationship between the observed and true score can be illustrated as follows:
X = T or X > T or X < T
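The three cases above follow from the usual classical test theory decomposition X = T + E, where E may be zero, positive, or negative. The sketch below is only an illustration: the true and error scores are hypothetical numbers (in practice neither can be observed directly), and the errors are chosen to average zero and to be uncorrelated with the true scores. Under those assumptions the observed score variance is the sum of the true and error variances, and reliability can be computed as the ratio of true score variance to observed score variance.

```python
from statistics import pvariance

# Hypothetical true scores (T) and error scores (E) for five examinees.
# The errors average zero and are uncorrelated with T (an assumption).
true_scores = [10, 12, 14, 16, 18]
error_scores = [1, -1, 0, -1, 1]

# Observed score for each examinee: X = T + E.
observed_scores = [t + e for t, e in zip(true_scores, error_scores)]

var_true = pvariance(true_scores)
var_error = pvariance(error_scores)
var_observed = pvariance(observed_scores)

# Reliability as the ratio of true score variance to observed score variance.
reliability = var_true / var_observed

print("observed scores:", observed_scores)
print("var(T), var(E), var(X):", var_true, var_error, var_observed)
print("reliability:", round(reliability, 3))
```

With a positive error the observed score overestimates the true score (X > T), with a negative error it underestimates it (X < T), and with zero error the two coincide.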
Standard Error of Measurement

• It is necessary to find an index of error in measurement which could be applied to all occasions of a particular measure. This index of error is called the standard error of measurement, abbreviated as SEM.
• By definition, SEM is the standard deviation of all error scores obtained from a given measure in different situations.
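Because error scores cannot be observed directly, SEM is usually estimated rather than computed from its definition. A commonly used estimate, which is not given in these slides, is SEM = SD × sqrt(1 − r), where SD is the standard deviation of the observed scores and r is the reliability coefficient. The function name and the input values below are hypothetical.

```python
import math


def standard_error_of_measurement(sd_observed: float, reliability: float) -> float:
    """Estimate SEM as SD_observed * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd_observed * math.sqrt(1.0 - reliability)


# Hypothetical values: observed-score SD of 10 points and reliability of 0.91.
sem = standard_error_of_measurement(10.0, 0.91)
print(round(sem, 2))  # 3.0
```

A perfectly reliable test (r = 1) gives an SEM of zero, and the SEM grows toward the full standard deviation as reliability falls toward zero.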
Methods of Estimating Reliability


Test-Retest Method
Parallel-form Method
Split-Half Method
KR-21 Method
Test-Retest

• In this method, reliability is obtained by administering a given test to a particular group twice and calculating the correlation between the two sets of scores obtained from the two administrations.
• Since there has to be a reasonable amount of time between the two administrations, this kind of reliability is referred to as reliability, or consistency, over time.
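The correlation described above can be sketched as follows. The slides do not say which correlation coefficient is meant; this example assumes the Pearson product-moment correlation, and the scores are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    # Note: this divides by zero if either list has no variability at all.
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical scores of the same six examinees on two administrations.
first_administration = [55, 62, 70, 48, 81, 66]
second_administration = [58, 60, 73, 50, 79, 69]

test_retest_reliability = pearson_r(first_administration, second_administration)
print(round(test_retest_reliability, 2))
```

The closer the coefficient is to 1, the more consistently the examinees keep their relative standing from one administration to the next.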
Disadvantages of Test-Retest

• It requires two administrations.
• Preparing similar conditions under which the two administrations take place adds to the complications of this method.
• The interval between the two administrations should be neither too short nor too long. To keep the balance, a period of two weeks between them is recommended.
Parallel-Forms

• In the parallel-forms method, two similar, or parallel, forms of the same test are administered to a group of examinees just once.
• The problem here is constructing two parallel forms of a test, which is a difficult job.
• The two forms of the test should be the same. That is, all the elements upon which test items are constructed should be the same in both forms. For example, if one form measures a particular element of grammar, the other form should also contain the same number of items on the same element of grammar.
• Subtests should also be the same, i.e., if one form of the test has three subsections of grammar, vocabulary, and reading comprehension, the other form should also have the same subsections in the same proportions.
Split-Half

• The split-half method assumes that the items comprising a test are homogeneous. That is, all the items in a test attempt to measure elements of a particular trait, e.g., tenses, prepositions, other grammatical points, vocabulary, and reading and listening comprehension, which are all subparts of the trait called language ability.
• In this method, when a single test with homogeneous items is administered to a group of examinees, the test is split, or divided, into two equal halves. The correlation between the two halves is an estimate of the test score reliability.
• In using this method, two main points should be taken into consideration: first, the procedure for dividing the test into two equal halves, and second, the computation of total test reliability from the reliability of one half of the test.
• In this method, easy and difficult items should be equally distributed between the two halves.
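Both points can be sketched in a few lines. The example assumes an odd-even split, which is one common way of spreading easy and difficult items across the halves when items are ordered by difficulty, and it assumes the Spearman-Brown formula, r_full = 2 × r_half / (1 + r_half), for stepping the half-test correlation up to the full test. Neither is named in the slides, and the response data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


def spearman_brown(half_test_r):
    """Step a half-test correlation up to an estimate for the full-length test."""
    return 2 * half_test_r / (1 + half_test_r)


# Hypothetical responses (1 = correct, 0 = wrong): one row per examinee,
# eight items ordered roughly from easy to difficult.
responses = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0],
]

# Odd-even split: items 1, 3, 5, 7 form one half and items 2, 4, 6, 8 the other.
odd_half = [sum(row[0::2]) for row in responses]
even_half = [sum(row[1::2]) for row in responses]

half_r = pearson_r(odd_half, even_half)
full_r = spearman_brown(half_r)
print(round(half_r, 3), round(full_r, 3))
```

The correction is needed because the correlation between the halves describes a test only half as long as the one actually administered.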
Split-Half Advantages and Disadvantages

• Advantages: it is more practical than the other methods. In using the split-half method, there is no need to administer the same test twice, nor is it necessary to develop two parallel forms of the same test.
• Disadvantages: the main shortcoming of this method is the difficulty of developing a test with homogeneous items, because assuming equality between the two halves is not a safe assumption. Furthermore, different subsections in a test, e.g., grammar, vocabulary, reading or listening comprehension, will jeopardize test homogeneity and thus reduce test score reliability.
KR-21
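No formula survives under this heading in the text. As a hedged illustration, a commonly cited form of Kuder-Richardson formula 21 is r = (k / (k − 1)) × (1 − M(k − M) / (k × s²)), where k is the number of items, M is the mean of the total scores, and s² is their variance. The formula assumes dichotomously scored items of roughly equal difficulty. The sketch below uses the population variance (conventions differ on this point), and the function name and scores are hypothetical.

```python
from statistics import mean, pvariance


def kr21(num_items, total_scores):
    """Kuder-Richardson formula 21 from the item count and examinees' total scores."""
    k = num_items
    m = mean(total_scores)
    variance = pvariance(total_scores)  # population variance (an assumption)
    if k < 2 or variance == 0:
        raise ValueError("need at least 2 items and some variability in scores")
    return (k / (k - 1)) * (1 - (m * (k - m)) / (k * variance))


# Hypothetical total scores of five examinees on a 20-item test.
scores = [18, 15, 12, 9, 6]
print(round(kr21(20, scores), 3))
```

Unlike the split-half method, this needs only the number of items and the mean and variance of the total scores from a single administration.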




Which method should we use?

• It depends on the function of the test.
• The test-retest method is appropriate when the consistency of scores over a particular time interval (stability of test scores over time) is important.
• The parallel-forms method is desirable when the consistency of scores over different forms is of importance.
• When the go-togetherness of the items of a test (the internal consistency) is of significance, Split-Half and KR-21 will be the most appropriate methods.
Factors Influencing Reliability

• To obtain a reliability estimate, one or two sets of scores should be obtained from the same group of testees. Thus, two factors contribute to test reliability: the testee and the test itself.
The Effect of Testees

• Since human beings are dynamic creatures, the attributes related to human beings are also dynamic. The implication is that the performance of human beings will, by its very nature, fluctuate from time to time or from place to place. Factors such as students misunderstanding or misreading test directions, noise level, distractions, and sickness can cause test scores to vary.
• Heterogeneity of the group members. The greater the heterogeneity of the group members in the preferences, skills, or behaviors being tested, the greater the chance for high reliability correlation coefficients.
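The heterogeneity point can be illustrated with a small, made-up data set. In the sketch below, eight hypothetical examinees take the same test twice and every score shifts by 3 points between administrations. The correlation is computed once for the whole group and once for the four middle examinees, who form a more homogeneous subgroup. The Pearson correlation is an assumption, as the slides do not name a coefficient.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical scores on two administrations; each score moves by 3 points.
first = [10, 20, 30, 40, 50, 60, 70, 80]
second = [13, 17, 27, 43, 53, 57, 67, 83]

# Whole (heterogeneous) group versus the four middle examinees only.
r_full_group = pearson_r(first, second)
r_homogeneous = pearson_r(first[2:6], second[2:6])

print(round(r_full_group, 3), round(r_homogeneous, 3))
```

The size of the fluctuation is the same in both cases, but it is small relative to the spread of the whole group and larger relative to the spread of the narrow subgroup, so the subgroup's coefficient comes out lower.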
The Effect of Test Factors
• Test length. Generally, the longer a test is, the more reliable it is, though only up to a point.
• Speed. When a test is a speed test, reliability can be problematic. It is inappropriate to estimate reliability using internal consistency, test-retest, or alternate-form methods. This is because not every student is able to complete all of the items in a speed test. In contrast, a power test is a test in which every student is able to complete all the items.
• Item difficulty. When there is little variability among test scores, the reliability will be low. Thus, reliability will be low if a test is so easy that every student gets most or all of the items correct or so difficult that every student gets most or all of the items wrong.
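The test-length point can be illustrated with the general Spearman-Brown prophecy formula, r_new = n × r / (1 + (n − 1) × r), where n is the factor by which the test is lengthened. The formula is not named in the slides, and it assumes that the added items are comparable to the existing ones. The starting reliability of 0.60 is a hypothetical value.

```python
def predicted_reliability(current_r, length_factor):
    """Spearman-Brown prophecy: reliability after changing test length by a factor."""
    return (length_factor * current_r) / (1 + (length_factor - 1) * current_r)


# Hypothetical test with reliability 0.60, lengthened 2, 4, and 8 times.
for factor in (1, 2, 4, 8):
    print(factor, round(predicted_reliability(0.60, factor), 3))
```

Each doubling of the length adds less than the previous one, which is one way to read "up to a point": the gain in reliability shrinks while the cost in testing time keeps growing.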
The Effect of Administration Factors

• Poor or unclear directions given during administration or inaccurate scoring can affect reliability.
• For example, say you were told that your scores on a measure of sociability would determine your promotion. The results are more likely to reflect what you think they want than what your behavior actually is.
The Influence of Scoring Factors

• In an objectively-scored test, the likes and dislikes of the scorers will not influence the results.
• In a subjectively-scored test, the likes and dislikes of the scorers will influence the results and, as a result, reliability.
• Intra-rater errors (errors which are due to fluctuations of the same rater scoring a single test twice)
• Inter-rater errors (errors which are due to fluctuations of different scorers, at least two, scoring a single test)
Validity

• The second major characteristic of a good test is validity.
• What does validity mean?
• A test is valid if it measures what we want it to measure and nothing else.
• Validity is the extent to which a test measures what it is supposed to measure or can be used for the purposes for which the test is intended.
• Validity is a more test-dependent concept, but reliability is a purely statistical parameter.
• So, validity refers to the extent to which a test measures what it is supposed to measure.
• There are four types of validity.
Types Of Validity

Content Validity
Criterion-Related Validity
Construct Validity
Content Validity

• Relevance of the test items to the purpose of the test.
• Does the test measure the objectives of the course?
• It refers to the correspondence (agreement) between the test content and the content of the materials (subject matter and instructional objectives) that were taught and are to be tested.
• The extent to which a test measures a representative sample of the content to be tested at the intended level of learning.
• Content validity is also called the appropriateness of the test, that is, the appropriateness of the sample and the learning level.
• Content validity is the most important type of validity, and it can be achieved through a careful examination of the test content.
• It provides the most useful subjective information about the appropriateness of the test.
Criterion-related Validity

• Criterion-related validity investigates the correspondence between the scores obtained from the newly-developed test and the scores obtained from some independent outside criterion.
• The newly-developed test has to be administered along with the criterion measure to the same group.
• It is the extent to which the test scores correlate with a relevant outside criterion.
• Criterion-related validity refers to the extent to which different tests intended to measure the same ability are in agreement.
• Depending on the time of administration, two types exist:
• Concurrent Validity
• Predictive Validity
Concurrent Validity
Correlation of the scores on the new test with a recognized measure taken at the same time.

Predictive validity
Comparison (correlation) of students' scores with a criterion taken at a later time (date).
Construct validity

• Refers to measuring certain traits or theoretical constructs.
• Refers to the extent to which the psychological reality of a trait or construct can be established.
• It is based on the degree to which the items in a test reflect the essential aspects of the theory on which the test is based.
• Construct validity also refers to the accuracy with which the test measures certain psychological/theoretical traits, e.g., reading comprehension or oral language ability.
• This is done through factor analysis.
Factors Affecting Validity

• a. Directions (clear and simple)
• b. Difficulty level of the test (neither too easy nor too difficult)
• c. Structure of the items (poorly constructed and/or ambiguous items will contribute to invalidity)
• d. Arrangement of items and correct responses (starting with the easiest items and ending with the most difficult ones, and arranging item responses randomly rather than in an identifiable pattern)
Validity and Reliability

• Reliability is a purely statistical parameter; that is, it can be determined fairly independently of the test. Validity, by contrast, is a test-dependent concept.
• We have degrees of validity:
• very valid, moderately valid, not very valid
• A test must be reliable to be valid, but reliability does not guarantee validity.
Reliability, Validity and Acceptability

• How reliable and valid should a test be?
• The more important the decision to be made, the more confidence is needed in the scores, and thus the more reliable and valid the test is required to be.
• Nevertheless, it is a generally accepted tradition to regard validity and reliability coefficients below 0.50 as low, 0.50 to 0.75 as moderate, and 0.75 to 0.90 as high.
Practicality

Generally speaking, practicality refers to the ease of administration and scoring of a test.
Ease of administration

• Clarity, simplicity, and ease of reading of the instructions
• Fewer subtests
• The time required for the test
Ease of scoring

A test can be scored subjectively or objectively. Since scoring is difficult and time-consuming, the trend is toward objectivity, simplicity, and machine scoring.
Ease of Interpretation and Application

• The meaningfulness of the scores obtained from the test
• If the test results are misinterpreted or misapplied, they will be of little value and may actually be harmful to some individual or group.
The End

 
NCERT Solutions Power Sharing Class 10 Notes pdf
NCERT Solutions Power Sharing Class 10 Notes pdfNCERT Solutions Power Sharing Class 10 Notes pdf
NCERT Solutions Power Sharing Class 10 Notes pdf
 
Gyanartha SciBizTech Quiz slideshare.pptx
Gyanartha SciBizTech Quiz slideshare.pptxGyanartha SciBizTech Quiz slideshare.pptx
Gyanartha SciBizTech Quiz slideshare.pptx
 
Basic phrases for greeting and assisting costumers
Basic phrases for greeting and assisting costumersBasic phrases for greeting and assisting costumers
Basic phrases for greeting and assisting costumers
 

Characteristics of a good test

  • 2. Introduction: A test may consist of a single item or a combination of items. Regardless of the number of items in a test, every single item should possess certain characteristics. Having good items, however, does not necessarily lead to a good test, because a test as a whole is more than a mere combination of individual items. Therefore, in addition to having good items, a test should have certain characteristics: 1. Reliability 2. Validity 3. Practicality
  • 3. Reliability: There are different theories to explain the concept of reliability in a scientific way. First and simplest: a test is reliable if we get the same results repeatedly. Second: a test is reliable when it gives consistent results. Third: reliability is the ratio of true score variance to observed score variance.
  • 4. To explain the concept of reliability in non-technical terms, imagine the feeling that someone did not do as well on a test as he could have. Now imagine that, if he could take the test again, he would do better. This may be quite true. Nevertheless, one should also admit that some factors, such as lucky guessing of the correct responses, would raise his score higher than it should really be. Seldom does anyone complain about this. Now, if one could take a test over and over again, he would probably agree that his average score over all the tests is an acceptable estimate of what he really knows or how he really feels about the test. On a “reliable” test, one’s scores on its various administrations would not differ greatly. That is, one’s scores would be quite consistent. On an “unreliable” test, on the other hand, one’s score might fluctuate from one administration to the other. That is, one’s scores on various administrations will be inconsistent. The notion of consistency of one’s score with respect to one's average score over repeated administrations is the central concern of the concept of reliability.
  • 5. The change in one’s score is inevitable. Some of the changes might represent a steady increase in one’s score. The increase would most likely be due to some sort of learning. This kind of change, which would be predictable, is called systematic variation. The systematic variation contributes to the reliability, and the unsystematic variation, which is called error variation, contributes to the unreliability of a test.
  • 6. True Score: Let’s assume that someone takes a test. Since all measurement devices are subject to error, the score one gets on a test cannot be a true manifestation of one’s ability in that particular trait. In other words, the score contains one’s true ability along with some error. If this error part could be eliminated, the resulting score would represent an errorless measure of that ability. By definition, this errorless score is called a “true score”.
  • 7. Observed score: The true score is almost always different from the score one gets, which is called the “observed score”. Since the observed score includes the measurement error, i.e., the error score, it can be greater than, equal to, or smaller than the true score. If there is absolutely no error of measurement, the observed score will equal the true score. However, when there is a measurement error, which is often the case, it can lead to an overestimation or an underestimation of the true score. Therefore, if the observed score is represented by X, the true score by T and the error score by E, the relationship between the observed and true score can be illustrated as follows: X = T + E.
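
The relationship between observed, true and error scores can be tried out numerically. The following is a minimal sketch, assuming the observed score X is the sum of the true score T and the error score E, and using the variance-ratio definition of reliability from slide 3. The score lists are hypothetical illustration data (true scores cannot be known in practice), and the function names are made up for this example.

```python
from statistics import pvariance


def observed_scores(true_scores, error_scores):
    """Combine each true score T with its error score E to get X = T + E."""
    return [t + e for t, e in zip(true_scores, error_scores)]


def reliability_ratio(true_scores, observed):
    """Ratio of true score variance to observed score variance."""
    return pvariance(true_scores) / pvariance(observed)


# Hypothetical data: the errors average zero and are uncorrelated
# with the true scores, so error only adds variance to X.
true_scores = [10.0, 12.0, 14.0, 16.0, 18.0]
error_scores = [1.0, -1.0, 0.0, -1.0, 1.0]

observed = observed_scores(true_scores, error_scores)
print("Observed scores:", observed)
print("Reliability:", round(reliability_ratio(true_scores, observed), 3))
```

With no error at all, the observed scores equal the true scores and the ratio is exactly 1; the more error variance there is, the lower the ratio.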
  • 12. Standard Error of Measurement: It is necessary to find an index of error in measurement which could be applied to all occasions of a particular measure. This index of error is called the standard error of measurement, abbreviated as SEM. By definition, SEM is the standard deviation of all error scores obtained from a given measure in different situations.
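
The definition above can be sketched in two ways. The first function follows the slide's definition directly (standard deviation of error scores). The second uses the standard textbook estimate SEM = SD × √(1 − r), which is not stated on this slide and is included here as an assumption about what is normally used in practice, since error scores are not directly observable. Function names and numbers are hypothetical.

```python
from math import sqrt
from statistics import pstdev


def sem_from_errors(error_scores):
    """SEM by definition: the standard deviation of the error scores."""
    return pstdev(error_scores)


def sem_from_reliability(sd_observed, reliability):
    """Common estimate (assumed, not from the slide): SD * sqrt(1 - r)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd_observed * sqrt(1.0 - reliability)


# Hypothetical error scores and test statistics.
print(round(sem_from_errors([1.0, -1.0, 0.0, -1.0, 1.0]), 3))
print(round(sem_from_reliability(sd_observed=10.0, reliability=0.91), 3))
```

A perfectly reliable test (r = 1) gives an SEM of zero, which matches the idea that the observed score then equals the true score.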
  • 16. Methods of Estimating Reliability: Test-Retest Method, Parallel-Forms Method, Split-Half Method, KR-21 Method
  • 17. Test-Retest: In this method, reliability is obtained by administering a given test to a particular group twice and calculating the correlation between the two sets of scores obtained from the two administrations. Since there has to be a reasonable amount of time between the two administrations, this kind of reliability is referred to as reliability or consistency over time.
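
As a rough illustration of the test-retest calculation, the sketch below correlates two sets of scores from the same group. It assumes the correlation meant here is the Pearson product-moment coefficient, which is the usual choice but is not named on the slide. The scores are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary within each administration")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores of five examinees on two administrations.
first_administration = [12, 15, 9, 18, 14]
second_administration = [13, 14, 10, 17, 15]

r = pearson_r(first_administration, second_administration)
print("Test-retest reliability estimate:", round(r, 2))
```

A coefficient close to 1 indicates that examinees kept roughly the same relative standing across the two administrations.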
  • 19. Disadvantages of Test-Retest: It requires two administrations. Preparing similar conditions under which the administrations take place adds to the complications of this method. The interval between the two administrations should be neither too short nor too long; to keep the balance, a period of two weeks between them is recommended.
  • 20. Parallel-Forms: In the parallel-forms method, two similar, or parallel, forms of the same test are administered to a group of examinees just once. The problem here is constructing two parallel forms of a test, which is a difficult job to do. The two forms of the test should be the same. That is, all the elements upon which test items are constructed should be the same in both forms. For example, if one form measures a particular element of grammar, the other form should also contain the same number of items on the same element of grammar. Subtests should also be the same, i.e., if one form of the test has three subsections of grammar, vocabulary, and reading comprehension, the other form should also have the same subsections with the same proportions.
  • 21. Split-Half: The split-half method requires that the items comprising a test be homogeneous. That is, all the items in a test attempt to measure elements of a particular trait, e.g., tenses, prepositions, other grammatical points, vocabulary, reading and listening comprehension, which are all subparts of the trait called language ability. In this method, when a single test with homogeneous items is administered to a group of examinees, the test is split, or divided, into two equal halves. The correlation between the two halves is an estimate of the test score reliability.
  • 22. Split-Half: In using this method, two main points should be taken into consideration: first, the procedure for dividing the test into two equal halves, and second, the computation of total test reliability from the reliability of one half of the test. In this method, easy and difficult items should be equally distributed between the two halves.
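
The two points above can be sketched as follows. This example assumes an odd-even split (one common way to spread easy and difficult items across both halves when items are ordered by difficulty) and the Spearman-Brown formula, r_total = 2r / (1 + r), for stepping up from the half-test correlation to the whole test. Neither choice is named on the slides, so both are assumptions, and the response matrix is hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def half_test_correlation(responses):
    """Correlate scores on odd-numbered items with scores on even-numbered items."""
    odd_half = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return pearson_r(odd_half, even_half)


def spearman_brown(r_half):
    """Estimate full-test reliability from the correlation between two halves."""
    return 2 * r_half / (1 + r_half)


# Hypothetical data: one row per examinee, 1 = correct, 0 = incorrect.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

r_half = half_test_correlation(responses)
print("Half-test correlation:", round(r_half, 3))
print("Full-test estimate:", round(spearman_brown(r_half), 3))
```

The correction is needed because each half is only half as long as the real test, and a shorter test tends to be less reliable than a longer one.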
  • 24. Split-Half Advantages and Disadvantages: Advantages: it is more practical than the other methods. In using the split-half method, there is no need to administer the same test twice, nor is it necessary to develop two parallel forms of the same test. Disadvantages: the main shortcoming of this method is the difficulty of developing a test with homogeneous items, because assuming equality between the two halves is not a safe assumption. Furthermore, different subsections in a test, e.g., grammar, vocabulary, reading or listening comprehension, will jeopardize test homogeneity and thus reduce test score reliability.
  • 27. Which method should we use? It depends on the function of the test. The test-retest method is appropriate when the consistency of scores over a particular time interval (stability of test scores over time) is important. The parallel-forms method is desirable when the consistency of scores over different forms is of importance. When the go-togetherness of the items of a test (the internal consistency) is of significance, split-half and KR-21 will be the most appropriate methods.
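
The KR-21 method is named above but its formula does not appear in this transcript. The sketch below uses the standard Kuder-Richardson formula 21, KR-21 = (k / (k − 1)) × (1 − M(k − M) / (k × s²)), where k is the number of items, M the mean total score and s² the variance of total scores. Using the population variance is an assumption (some texts use the sample variance), and the scores are hypothetical.

```python
from statistics import mean, pvariance


def kr21(num_items, total_scores):
    """Kuder-Richardson formula 21 for dichotomously scored items."""
    k = num_items
    m = mean(total_scores)
    variance = pvariance(total_scores)  # assumption: population variance
    if k < 2 or variance == 0:
        raise ValueError("need at least 2 items and some score variance")
    return (k / (k - 1)) * (1 - (m * (k - m)) / (k * variance))


# Hypothetical total scores of five examinees on a 6-item test.
print(kr21(num_items=6, total_scores=[6, 4, 3, 2, 0]))
```

KR-21 needs only one administration and only the total scores, which is why it is grouped with split-half as an internal consistency method.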
  • 28. Factors Influencing Reliability: To have a reliability estimate, one or two sets of scores should be obtained from the same group of testees. Thus, two factors contribute to test reliability: the testee and the test itself.
  • 29. The Effect of Testees: Since human beings are dynamic creatures, the attributes related to human beings are also dynamic. The implication is that the performance of human beings will, by their very nature, fluctuate from time to time, or from place to place. Factors such as students misunderstanding or misreading test directions, noise level, distractions, and sickness can cause test scores to vary. Heterogeneity of the group members: the greater the heterogeneity of the group members in the preferences, skills or behaviors being tested, the greater the chance of high reliability correlation coefficients.
  • 30. The Effect of Test Factors: Test length. Generally, the longer a test is, the more reliable it is, but only up to a point. Speed. When a test is a speed test, reliability can be problematic. It is inappropriate to estimate reliability using internal consistency, test-retest, or alternate-form methods, because not every student is able to complete all of the items in a speed test. In contrast, a power test is a test in which every student is able to complete all the items. Item difficulty. When there is little variability among test scores, the reliability will be low. Thus, reliability will be low if a test is so easy that every student gets most or all of the items correct, or so difficult that every student gets most or all of the items wrong.
  • 31. The Effect of Administration Factors: Poor or unclear directions given during administration, or inaccurate scoring, can affect reliability. For example, suppose you were told that your scores on a measure of sociability would determine your promotion. The result is more likely to reflect what you think the testers want than what your behavior actually is.
  • 32. The Influence of Scoring Factors: In an objectively scored test, the scorers will not influence the results. In a subjectively scored test, the likes and dislikes of the scorers will influence the results and, as a result, reliability. Intra-rater errors: errors due to fluctuations of the same rater scoring a single test twice. Inter-rater errors: errors due to fluctuations among different scorers (at least two) scoring a single test.
  • 33. Validity: The second major characteristic of a good test is validity. What does validity mean? A test is valid if it measures what we want it to measure and nothing else. Validity is the extent to which a test measures what it is supposed to measure or can be used for the purposes for which it is intended. Validity is a more test-dependent concept, whereas reliability is a purely statistical parameter. There are four types of validity.
  • 34. Types of Validity: Content Validity, Criterion-Related Validity, Construct Validity
  • 35. Content Validity: Relevance of the test items to the purpose of the test. Does the test measure the objectives of the course? It refers to the correspondence (agreement) between the test content and the content of the materials taught (subject matter and instructional objectives). It is the extent to which a test measures a representative sample of the content to be tested at the intended level of learning.
  • 36. Content Validity: Content validity is also called the appropriateness of the test, that is, the appropriateness of the sample and the learning level. Content validity is the most important type of validity, and it can be achieved through a careful examination of the test content. It provides the most useful subjective information about the appropriateness of the test.
  • 37. Criterion-related Validity: Criterion-related validity investigates the correspondence between the scores obtained from the newly developed test and the scores obtained from some independent outside criterion. The newly developed test has to be administered along with the criterion measure to the same group. It is the extent to which the test scores correlate with a relevant outside criterion, and so refers to the extent to which different tests intended to measure the same ability are in agreement. Depending on the time of administration, two types exist: Concurrent Validity and Predictive Validity.
  • 38. Concurrent Validity: Correlation between the scores on the new test and a recognized measure taken at the same time. Predictive Validity: Comparison (correlation) of students' scores with a criterion taken at a later time (date).
  • 39. Construct Validity: Refers to measuring certain traits or theoretical constructs. It is the extent to which the psychological reality of a trait or construct can be established. It is based on the degree to which the items in a test reflect the essential aspects of the theory on which the test is based. Construct validity also refers to the accuracy with which the test measures certain psychological or theoretical traits, e.g., reading comprehension or oral language ability. This is done through factor analysis.
  • 40. Factors Affecting Validity: a. Directions (clear and simple). b. Difficulty level of the test (neither too easy nor too difficult). c. Structure of the items (poorly constructed and/or ambiguous items will contribute to invalidity). d. Arrangement of items and correct responses (starting with the easiest items and ending with the most difficult ones, and arranging item responses randomly rather than in an identifiable pattern).
  • 41. Validity and Reliability: Reliability is a purely statistical parameter; that is, it can be determined fairly independently of the test. Validity, however, is a test-dependent concept. Validity comes in degrees: very valid, moderately valid, not very valid. A test must be reliable to be valid, but reliability does not guarantee validity.
  • 42. Reliability, Validity and Acceptability: How reliable and valid should a test be? The more important the decision to be made, the more confidence is needed in the scores, and thus the more reliable and valid the test is required to be. Nevertheless, it is a generally accepted tradition that validity and reliability coefficients below 0.50 are low, those from 0.50 to 0.75 are moderate, and those from 0.75 to 0.90 are high.
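
The bands above can be written as a small lookup. This is only a sketch: the slide does not say which band a boundary value such as 0.75 belongs to, nor how to label coefficients above 0.90, so the handling of both cases below is an assumption, and the function name is hypothetical.

```python
def describe_coefficient(r):
    """Label a reliability or validity coefficient using the bands on the slide."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("coefficient is expected to lie between 0 and 1")
    if r < 0.50:
        return "low"
    if r < 0.75:        # assumption: 0.50 itself counts as moderate
        return "moderate"
    if r <= 0.90:       # assumption: 0.75 itself counts as high
        return "high"
    return "above the bands listed"  # the slide gives no label beyond 0.90


for value in (0.42, 0.68, 0.85):
    print(value, "->", describe_coefficient(value))
```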
  • 43. Practicality: Generally speaking, practicality refers to the ease of administration and scoring of a test.
  • 44. Ease of administration: It depends on the clarity, simplicity and ease of reading of the instructions, a smaller number of subtests, and the time required for the test.
  • 45. Ease of scoring: A test can be scored subjectively or objectively. Since scoring is difficult and time-consuming, the trend is toward objectivity, simplicity and machine scoring.
  • 46. Ease of Interpretation and Application: This concerns the meaningfulness of the scores obtained from the test. If the test results are misinterpreted or misapplied, they will be of little value and may actually be harmful to some individual or group.