Principles of Language Assessment:
Test Usefulness
Course: Testing
Bachman & Palmer, Ch. 2
The most important quality of a test is its usefulness.
But,
- What makes a test useful?
- How do we know a test will be useful before we use it?
- Or that it has been useful after we have used it?
Simply using a test does not
make it useful!
A model of test usefulness has been
proposed that includes six test
qualities.
Model of test usefulness:
- Reliability
- Construct validity
- Authenticity
- Interactiveness
- Impact
- Practicality
Usefulness =
Reliability + Construct validity +
Authenticity + Interactiveness + Impact
+ Practicality
This model, along with the three
principles below, provides a basis for
answering this question:
“How useful is this particular
test for its intended purpose(s)?”
It is the overall usefulness of the test
that is to be maximized, rather than
the individual qualities that affect
usefulness.
The individual test qualities cannot be
evaluated independently, but must be
evaluated in terms of their combined
effect on the overall usefulness of the
test.
Test usefulness & the appropriate
balance among the different qualities
cannot be prescribed in general, but
must be determined for each specific
testing situation.
Therefore,
in order to be useful, any given
language (lg.) test must be developed
with a specific purpose, a particular group
of test takers and a specific lg. use domain in mind.
This domain is the “target lg. use” (TLU) domain;
tasks in the TLU domain are called
“TLU tasks”.
1. RELIABILITY
- Reliability is often defined as consistency of
measurement.
[Figure: Reliability, shown as the relationship between scores on test tasks with characteristics A and scores on test tasks with characteristics A’.]
- It is not possible to eliminate inconsistencies
entirely. What we can do is to try to minimize
the potential sources of inconsistencies.
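One common way to put a number on "consistency of measurement" is to correlate the scores the same test takers obtain on two comparable sets of tasks (A and A’). The sketch below is only an illustration under that assumption; the function name `pearson_r` and the sample scores are hypothetical and are not taken from the slides.

```python
from math import sqrt


def pearson_r(scores_a, scores_a_prime):
    """Pearson correlation between two lists of scores from the same test takers.

    Used here as one possible estimate of consistency between tasks A and A'.
    """
    if len(scores_a) != len(scores_a_prime) or len(scores_a) < 2:
        raise ValueError("need two score lists of equal length (at least 2 scores)")
    n = len(scores_a)
    mean_a = sum(scores_a) / n
    mean_b = sum(scores_a_prime) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((a - mean_a) * (b - mean_b) for a, b in zip(scores_a, scores_a_prime))
    sxx = sum((a - mean_a) ** 2 for a in scores_a)
    syy = sum((b - mean_b) ** 2 for b in scores_a_prime)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical scores for five test takers on tasks A and A'
    tasks_a = [12, 15, 9, 18, 14]
    tasks_a_prime = [11, 16, 10, 17, 13]
    print(round(pearson_r(tasks_a, tasks_a_prime), 2))
```

A value close to 1 would suggest that the two sets of tasks rank test takers in much the same way; lower values point to sources of inconsistency worth investigating.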
2. CONSTRUCT VALIDITY
- Construct validity pertains to the
meaningfulness & appropriateness of the
interpretations that we make on the basis of
test scores.
- The term construct validity is used to refer to
the extent to which we can interpret a given
test score as an indicator of the ability(ies), or
construct(s), we want to measure with respect
to a specific domain of generalization.
[Figure: Score interpretation. Labels: inferences about lg. ability (construct definition); domain of generalization; test score; lg. ability; characteristics of the test task; interactiveness; construct validity; authenticity.]
3. AUTHENTICITY
[Figure: Authenticity, shown as the link between the characteristics of the TLU task and the characteristics of the test task.]
- We define authenticity as the degree of
correspondence of the characteristics of a
given lg. test task to the features of a TLU task.
Authenticity is important, because:
1- It provides a link between test performance & the
TLU tasks & domain to which we want to generalize.
2- The way test takers perceive the relative
authenticity of test tasks can facilitate their test
performance.
4. INTERACTIVENESS
-We define interactiveness as the extent & the
type of involvement of the test taker’s
individual characteristics in accomplishing a
test task.
- Unlike authenticity, interactiveness resides in
the interaction between the individual (test
taker or lg. user) & the task (test or TLU).
[Figure: Interactiveness. Labels: language ability (lg. knowledge, metacognitive strategies); topical knowledge; affective schemata; characteristics of lg. test task.]
Example 1
Typists may perform certain typing
tasks in English very well, yet they might
simply be copying the letters &
words, without processing the document
as a piece of discourse. Therefore:
Authenticity : High
Interactiveness : Low
Example 2
The same typists are given a test task that
requires carrying on “small talk” about food, clothing, etc.
Authenticity : Low (Lack of relevance of
the test task to the TLU task.)
Interactiveness : High (Test takers have a
reasonable amount of control in selecting
topics & influencing the structure of the
interaction.)
Example 3
International students entering an
American university were given an
English vocabulary test in which they matched the words in
one column to the meanings in another.
Authenticity : Low (few domains involve
this kind of task)
Interactiveness : Low (Highly restricted
involvement of lg. knowledge)
Example 4
Test takers conduct a face-to-face role play
between a salesperson and a customer.
Authenticity : High (Correspondence
between the characteristics of the TLU
domain and those of the test task.)
Interactiveness : High (High level of
involvement of all the areas of lg. knowledge &
the test taker’s topical knowledge.)
POINTS TO REMEMBER
1- Both authenticity & interactiveness are relative.
2- Three types of characteristics must be considered:
those of test takers, TLU task & test task.
3- Certain test tasks may be relatively useful, even
though they are low in authenticity or interactiveness.
4- In designing or analyzing tests, our estimates of
authenticity & interactiveness are only guesses.
5- The minimum acceptable levels that we specify for
authenticity & interactiveness will depend on the
specific testing situation.
5. IMPACT
- Another quality of tests is their impact on
society & educational systems. The impact of
test use operates at two levels:
- a micro level: the individuals who are
affected by the particular test use.
- a macro level: the educational
system or society.
WASHBACK
“the effect of testing on teaching &
learning” (Hughes, 1989)
“how assessment instruments affect
educational practices & beliefs”
(Cohen, 1994)
Washback
- Impact on individuals:
  A) test takers
  B) teachers
- Impact on society & educational system
A) IMPACT ON TEST TAKERS
Test takers can be affected by three aspects of the testing
procedure:
- the experience of taking &, in some cases, of
preparing for the test (the test taker’s
perception of the TLU domain, areas
of lg. knowledge & use of
strategies),
- the feedback they receive about their performance on
the test,
- the decisions that may be made about them on the
basis of their test scores.
B) IMPACT ON TEACHERS
If teachers find that they have to use a specified test, they may
find “teaching to the test” almost unavoidable.
This term implies doing something in teaching that may not
be compatible with teachers’ own values & goals, or with
the values & goals of the instructional program.
One way to minimize the potential for negative impact on
instruction is to change the way we test.
6. PRACTICALITY
While the other five qualities pertain to the
uses that are made of test scores, practicality
pertains primarily to the ways in which the
test will be implemented, &, to a large
degree, whether it will be developed & used at
all. Thus, a practical test is one whose design,
development & use do not require more resources
that are available.
Determining the practicality of a given test involves
the consideration of:
- the resources that will be required to develop an
operational test that has the balance of qualities we want;
&
- the allocation & management of the resources that
are available.

Practicality = Available resources / Required resources

If practicality ≥ 1, the test development & use is
practical. If practicality < 1, it is not practical.
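The practicality ratio (available resources divided by required resources, with a value of 1 or more counting as practical) can be sketched in a few lines. This is a minimal illustration only: the function names and the idea of expressing all resources in one comparable unit (for example, person-hours or a budget figure) are assumptions made for the example, not something the slides specify.

```python
def practicality(available: float, required: float) -> float:
    """Ratio of available resources to required resources.

    Both arguments are assumed to be in the same (hypothetical) unit.
    """
    if required <= 0:
        raise ValueError("required resources must be greater than zero")
    return available / required


def is_practical(available: float, required: float) -> bool:
    """A test counts as practical when the ratio is at least 1."""
    return practicality(available, required) >= 1.0


if __name__ == "__main__":
    # Hypothetical figures: 150 units available, 100 units required
    print(practicality(150, 100), is_practical(150, 100))
    # Hypothetical figures: 80 units available, 100 units required
    print(practicality(80, 100), is_practical(80, 100))
```

In a real project the human, material and time resources would probably be checked separately, since a surplus of one kind cannot always make up for a shortage of another.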
Types of Resources
1- Human resources (e.g. test writers, scorers or raters, test
administrators & technical support)
2- Material resources
   a) Space (e.g. rooms for test development)
   b) Equipment (e.g. typewriters, computers)
   c) Materials (e.g. paper, pictures)
3- Time
   a) Time for specific tasks (designing, writing, analyzing)
   b) Development time