Principles of Testing
1. Practicality 2. Reliability 3. Validity 4. Backwash 5. Authenticity

Practicality
An effective test is: i) cost-effective, ii) time-effective, iii) easy to administer, and iv) has a scoring and evaluation process that is specific and time-efficient.
Are the following tests practical?
1. A speaking test for 250 students in which only 5 assessors and interlocutors are available.
2. A listening test that takes half an hour to administer and 15 minutes to assess.
3. An English language test consisting of 100 multiple-choice questions that can only be assessed on computers.
Reliability
A reliable test is consistent and dependable: its results are consistent over repeated administrations. If a student achieves similar scores when the test is administered at different times and in different places, the test is reliable.
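One rough way to picture this consistency is to correlate the scores that the same group of students obtains on two administrations of the same test. The sketch below is a minimal illustration with invented scores; a test-retest coefficient close to 1.0 suggests consistent, dependable results, while a low value signals unreliability.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Invented scores of the same ten students on two administrations
# of the same test, a few weeks apart (out of 100).
first_sitting  = [72, 85, 60, 90, 55, 78, 66, 81, 70, 59]
second_sitting = [75, 83, 62, 88, 58, 80, 64, 79, 73, 61]

# A test-retest coefficient near 1.0 indicates consistent results over
# repeated administrations; a low value signals unreliability.
print(f"test-retest reliability ≈ {pearson(first_sitting, second_sitting):.2f}")
```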
Types of Reliability
1. Student-Related Reliability
Student-related factors are likely to interfere with students' true scores: for example, temporary illness, fatigue, a bad day, anxiety, and other physical or psychological factors, as well as test-takers' test-wiseness or strategies for efficient test taking.
2. Rater Reliability
Rater unreliability occurs due to human subjectivity and bias when scoring tests.
i) Inter-rater reliability
Inter-rater unreliability occurs when two or more scorers yield inconsistent scores for the same test, due to lack of attention to the scoring criteria, inexperience, inattention, or even preconceived biases.
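A common way to examine inter-rater reliability is to have two raters independently score the same performances and correlate their marks. The sketch below is a minimal illustration with invented band scores on a 0-5 scale; a coefficient well below 1.0 suggests the raters are not applying the scoring criteria in the same way.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sqrt(sum((a - mx) ** 2 for a in x)) *
                  sqrt(sum((b - my) ** 2 for b in y)))

# Invented band scores (0-5) awarded by two raters to the same ten essays.
rater_a = [4, 3, 5, 2, 4, 3, 5, 1, 4, 2]
rater_b = [4, 2, 5, 3, 3, 3, 4, 1, 4, 2]

# A coefficient well below 1.0 suggests the raters interpret the scoring
# criteria differently and need to renegotiate them before marking continues.
print(f"inter-rater correlation ≈ {pearson(rater_a, rater_b):.2f}")
```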
3. Test Administration Reliability
Unreliability may also result from the conditions in which the test is administered. For
example, noise, light or temperature in the classroom.
4. Test Reliability
Sometimes the nature of the test itself causes measurement error: for example, a long test in which test-takers become fatigued towards the end and give incorrect answers, or ambiguous test items with more than one correct answer.
How to Make Tests More Reliable (pp. 21-24, notes)
1) Take enough samples of behaviour: the more items you have on a test, the more reliable the test will be (see the sketch after this list).
2) Do not allow candidates too much freedom.
3) Write unambiguous items.
4) Provide clear and explicit instructions.
5) Ensure that tests are well laid out and perfectly legible.
6) Candidates should be familiar with the format and testing techniques.
7) Provide uniform and non-distracting conditions of administration.
8) Use items that permit scoring which is as objective as possible.
9) Make comparisons between candidates as direct as possible.
10) Provide a detailed scoring key.
11) Train scorers.
12) Agree acceptable responses and appropriate scores at the outset of scoring.
13) Identify candidates by number, not name.
14) Employ multiple, independent scoring.
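Point 1 rests on a standard result of classical test theory, the Spearman-Brown prophecy formula: if a test with reliability r is lengthened by a factor n using comparable items, its predicted reliability rises to nr / (1 + (n - 1)r). A minimal sketch, with an assumed starting reliability of 0.60:

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability after lengthening a test by `length_factor`
    with items comparable to the existing ones (classical test theory)."""
    r, n = reliability, length_factor
    return (n * r) / (1 + (n - 1) * r)

# A test with an assumed reliability of 0.60: predicted reliability if the
# number of comparable items is doubled or tripled.
for n in (1, 2, 3):
    print(f"x{n} length -> predicted reliability {spearman_brown(0.60, n):.2f}")
```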
Validity
A test is said to be valid if it measures accurately what it is intended to measure.
Some aspects of the concept of validity
i) Content validity
A test is said to have content validity if its content constitutes a representative sample of the
language skills, structures, etc. with which it is meant to be concerned.
How can the content validity of a test be ensured?
By comparing the test content to the test specification, to ensure that the test contains a proper sampling of the relevant skills or knowledge to be tested.
Lack of content validity is likely to have a negative backwash effect on teaching as the areas
which are not tested are usually ignored in teaching.
ii) Criterion-related validity
Criterion-related validity concerns how far results on the test agree with those provided by some independent and highly dependable assessment of the candidate's ability. This independent assessment is thus the criterion measure against which the test is validated.
A) Concurrent validity
The results of a test are supported by other concurrent performance beyond the assessment itself. For example, the validity of a high score on the final exam of a foreign language course will be substantiated by actual proficiency in the language.
B) Predictive validity
The criterion is a measure of the test-taker's likelihood of future success. For example, placement tests, admissions assessment batteries, and language aptitude tests.
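Both forms of criterion-related validity are usually examined by correlating test scores with the criterion measure: an independent assessment collected at the same time for concurrent validity, or a later measure of success for predictive validity. The sketch below uses invented data for ten students to illustrate both coefficients.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sqrt(sum((a - mx) ** 2 for a in x)) *
                  sqrt(sum((b - my) ** 2 for b in y)))

# Invented data for ten students.
final_exam_scores   = [72, 85, 60, 90, 55, 78, 66, 81, 70, 59]            # test being validated
proficiency_ratings = [3.5, 4.5, 3.0, 4.8, 2.5, 4.0, 3.2, 4.2, 3.6, 2.8]  # concurrent criterion
later_course_grades = [2.8, 3.9, 2.5, 4.0, 2.2, 3.4, 2.9, 3.7, 3.1, 2.4]  # predictive criterion

print(f"concurrent validity coefficient ≈ {pearson(final_exam_scores, proficiency_ratings):.2f}")
print(f"predictive validity coefficient ≈ {pearson(final_exam_scores, later_course_grades):.2f}")
```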
iii) Construct-Related Evidence
Construct: any theory, hypothesis, or model that attempts to explain observed phenomena in our universe of perceptions. For example, `proficiency` and `communicative competence` are linguistic constructs. Constructs are difficult to measure directly; inferential data are required. Language learning and teaching involve many such theoretical constructs.
Construct validity asks: `Does this test actually measure the theoretical construct as it has been defined?`
For example: the major components of oral proficiency are pronunciation, fluency, grammatical accuracy, vocabulary use, and sociolinguistic appropriateness. An oral interview has construct validity only if it measures all of these components of speech.
(See also the vocabulary test example on p. 25.)
iv) Consequential Validity
Consequential validity encompasses all the consequences of a test: for example, its accuracy in measuring the intended criteria, its impact on the preparation of test-takers, its effects on the learner, and the intended and unintended social consequences of the test's interpretation and use. These include the effects of assessment on students' motivation, subsequent performance in a course, independent learning, study habits and attitude toward school work.
v) Face Validity
Face validity refers to the degree to which a test looks right and appears to measure the knowledge or ability it claims to measure, based on the subjective judgement of the examinees who take it, the administrative personnel who decide on its use, and other unsophisticated observers (pp. 26-27). `Does the test, on the face of it, appear from the learner's perspective to test what it is designed to test?`
What are the features of a test with high face validity?
Refer to p. 27
Face validity cannot be empirically tested by teachers or experts; it is in the eye of the beholder. A test cannot be highly valid if it is unreliable due to measurement error; however, a test can be reliable and yet not valid for the purposes it claims.
vi) Authenticity
Bachman and Palmer (1996, p. 23) define authenticity as `the degree of correspondence of the characteristics of a given language test task to the features of a target language task`.
Test items should simulate real-world tasks, i.e. tasks that are likely to occur in the real world.
The features of an authentic test
1. The language in the test is as authentic as possible
2. Items are contextualized rather than isolated
3. Topics are meaningful (relevant, interesting) for the learner
4. Some thematic organization to items is provided, such as through a story line or episode.
5. Tasks represent or closely approximate, real world tasks.
How can test authenticity be maintained?
1. Reading texts are selected from real-world sources.
2. Listening comprehension sections feature natural language, with hesitations, white noise and interruptions.
3. Episodic test items are included, sequenced to form meaningful units, paragraphs or stories.
vii) Backwash
Backwash is a facet of consequential validity: the effect of testing on teaching and learning, which can be beneficial or harmful. When a test is regarded as important, it tends to dominate learning and teaching activities. If the test's content and techniques are at variance with the content and objectives of the course, the backwash on teaching and learning is likely to be negative; if they are aligned, it can be positive.
Explain why and how.
Tests are likely to dominate instruction. A test should support good teaching and exert corrective feedback on bad teaching. Beneficial backwash is likely to enhance a number of basic principles of language acquisition: intrinsic motivation, autonomy, self-confidence, language ego, interlanguage, and strategic investment, among others.
For achieving beneficial backwash see p.15
1. Test the abilities whose development you want to encourage. (Test oral ability to encourage the development of oral ability.)
This is closely related to content validity. Teachers usually avoid testing abilities that are difficult to test or expensive in terms of time and money.
2. Sample widely and unpredictably.
A test can only measure a sample of the language items and abilities included in the specification. However, the sample taken should represent, as far as possible, the full scope of what is specified. If the sample is drawn from only a restricted area of the specification, it will have a harmful backwash effect on teaching.
3. Use direct testing.
Direct testing implies the testing of performance skills, with texts and tasks as authentic as possible. If we directly test the skills we are interested in fostering, then practice for the test represents practice in those skills.
4. Make testing criterion-referenced.
i) If test specifications make clear just what candidates have to be able to do, and with what degree of success, then students will have a clear picture of what they have to achieve.
ii) Students know that if they perform the tasks at the criterion level, they will be successful on the test, regardless of how other students perform. (A minimal sketch of such a criterion-referenced decision follows below.)
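To make the contrast with norm-referenced scoring concrete, here is a minimal sketch of a criterion-referenced decision with an invented criterion level and invented scores: each candidate passes or fails against the fixed criterion, regardless of how the rest of the group performs.

```python
# Invented criterion: candidates must obtain at least 70% of the points
# attached to the specified tasks to be judged successful.
CRITERION_LEVEL = 0.70

scores = {"candidate 1": 0.82, "candidate 2": 0.68,
          "candidate 3": 0.75, "candidate 4": 0.54}

for candidate, score in scores.items():
    # The decision depends only on the criterion level, never on how
    # the other candidates in the group have performed.
    verdict = ("meets the criterion" if score >= CRITERION_LEVEL
               else "does not yet meet the criterion")
    print(f"{candidate}: {score:.0%} -> {verdict}")
```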
5. Base achievement tests on objectives.
If achievement tests are based on objectives, rather than on detailed teaching and textbook content, they will provide a truer picture of what has actually been achieved.
6. Ensure the test is known and understood by students and teachers.
The rationale for the test, its specifications, and sample items should be made available to everyone concerned with preparation for the test. Students should know what the test demands of them.
7. Where necessary, provide assistance to teachers.
The introduction of a new test may make demands on teachers. For example, a new national test may attempt to assess communicative skills rather than vocabulary or grammatical structures; teachers who are unfamiliar with communicative language teaching will need training.
8. Count the cost.
A test should be easy and cheap to construct, administer and score.
For applying these principles to the evaluation of classroom tests, see pages 31-38 (book) and Chapter 1.
Applying Principles to the Evaluation of Classroom Tests
1. Classroom tests can be evaluated by considering the five basic principles (practicality, reliability, validity, authenticity and backwash).
2. Validity is the most significant issue in testing: if a test lacks validity, all the other considerations are of little use.
3. Practicality comes next in significance, followed by authenticity.
Are the test procedures practical?
Practicality checklist
1. Are administrative details clearly identified before the test?
2. Can students complete the test reasonably within the set time frame?
3. Can the test be administered smoothly, without procedural glitches?
4. Are all materials and equipment ready?
5. Is the cost of the test within budgeted limits?
6. Is the scoring/evaluation system feasible in the teacher’s time frame?
7. Are methods for reporting results determined in advance?
Is the test reliable? (Related to teacher and test)
Reliability in terms of physical context
1. A clearly photocopied test sheet for every student
2. Sound amplification clearly audible to everyone
3. Video is visible to everyone
4. Equal lighting, temperature, extraneous noise and optimal classroom conditions for all
students.
5. Objective scoring procedures.
Intra-rater reliability is an important issue for classroom teachers.
Teachers need to find ways to maintain their concentration and endurance during the
scoring process.
For open-ended questions, teachers should:
1. Use a consistent set of criteria for what counts as a correct response.
2. Give those criteria their utmost attention throughout the evaluation.
3. Read through the tests at least twice to check consistency (a minimal sketch of such a consistency check follows this list).
4. If the criteria are modified in the course of the scoring process, go back and apply the same standard to all papers.
5. Read the tests in several sittings to avoid fatigue.
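One practical way to act on point 3 is to re-score a small sample of papers after the first pass and compare the two sets of marks. The sketch below is a hypothetical illustration with invented band scores; any paper whose second mark drifts by more than one band is sent back for re-marking against the criteria.

```python
# Invented band scores (0-5) given by the same teacher to the same
# eight papers on a first and a second reading.
first_pass  = {"paper1": 4, "paper2": 3, "paper3": 5, "paper4": 2,
               "paper5": 4, "paper6": 3, "paper7": 1, "paper8": 4}
second_pass = {"paper1": 4, "paper2": 3, "paper3": 4, "paper4": 2,
               "paper5": 2, "paper6": 3, "paper7": 1, "paper8": 4}

TOLERANCE = 1  # bands of drift the teacher is willing to accept

for paper, score1 in first_pass.items():
    drift = abs(score1 - second_pass[paper])
    if drift > TOLERANCE:
        # Large drift suggests the scoring criteria were applied inconsistently:
        # revisit the criteria and re-mark with the same standard for all papers.
        print(f"{paper}: first {score1}, second {second_pass[paper]} -> re-mark against the criteria")
```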
Does the procedure demonstrate content validity?
Content validity is the main source of validity in classroom tests.
`Content validity is basically related to the extent to which the test requires students to perform tasks that were included in the previous classroom lessons and that represent the unit objectives.`
See example on p.32
How can the content validity of a test be evaluated?
1. Are classroom objectives clearly identified and appropriately framed? Check examples on
pages 32-33
An appropriate test would elicit an adequate number of samples of student performance,
have a clearly framed set of standards for evaluating the performance and provide some sort
of feedback to the students.
2. Are lesson objectives represented in the form of test specifications?
The content validity of a test is reflected in how the objectives of the unit are represented in the content of items, clusters of items, and item types.
3. Do you clearly perceive the performance of the test-takers as reflective of the classroom objectives?
Is the procedure face valid and `biased for best`?
Students usually judge a test to be face valid when:
1. Directions are clear.
2. The structure of the test is organized logically.
3. Its difficulty level is appropriately pitched.
4. The test has no surprises
5. Timing is appropriate
Face validity = `biased for best`.
A teacher should:
1. Offer students adequate review and preparation for the test.
2. Suggest beneficial test-taking strategies.
3. Structure the test in such a way that the best students are modestly challenged and the weaker students are not overwhelmed.
(For test-taking strategies, i.e. the teacher's strategic suggestions to optimize students' test performance, see pp. 34-35.)
Are the test tasks as authentic as possible?
Checklist
1. Is the language in the test as authentic (real-world language) as possible?
2. Are items as contextualized as possible rather than isolated?
3. Are topics and situations interesting, enjoyable and/or humorous?
4. Is some thematic organization provided, such as through a story line or episode?
5. Do tasks represent, or closely approximate, real-world tasks?
Does the test offer beneficial backwash to the learner?
Checklist
1. Is the test content relevant to the curriculum and objectives?
2. How much time do students spend preparing for the test?
3. Does the test have positive consequences for test-takers, in terms of learning outcomes, and for teachers, in terms of designing their future teaching?
Do students use the feedback to improve their learning, and do teachers use the test outcomes to redesign their teaching so as to help students improve their learning outcomes?
Inaccurate Tests
A test is inaccurate if it fails to measure accurately whatever it is intended to measure, or if students' real knowledge and skills are not reflected in the test scores they obtain.
There are two main sources of inaccuracy:
1) The first source is related to test content and technique (validity).
For example, certain abilities cannot be properly assessed by means of multiple-choice tests, yet accuracy is often sacrificed for reasons of economy and convenience. (Even so, writing good multiple-choice items remains very difficult and demands a great deal of time and effort.) The remedy is to use appropriate test techniques.
2) The second source is lack of reliability: a test that does not measure consistently cannot measure accurately.
Unreliability is related to features of the test itself and to the way it is scored: for example, unclear instructions, ambiguous questions, or items that invite guessing on the part of the test-takers. Considering the principles of test construction listed above is likely to minimize the factors that lead to inaccurate results.
Another source of unreliability is the awarding of significantly different scores to equivalent test performances by two different assessors.
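Point 14 of the reliability list (`employ multiple, independent scoring`) addresses exactly this problem. The sketch below is a hypothetical illustration: each performance is marked independently by two assessors, the two marks are averaged, and any pair differing by more than an agreed tolerance is routed to a third marker.

```python
# Invented marks (out of 20) awarded independently by two assessors.
double_marked = {
    "candidate01": (15, 16),
    "candidate02": (12, 17),   # large discrepancy
    "candidate03": (18, 18),
    "candidate04": (9, 11),
}

TOLERANCE = 3  # maximum acceptable difference between the two assessors

for candidate, (mark_a, mark_b) in double_marked.items():
    if abs(mark_a - mark_b) > TOLERANCE:
        # Equivalent performances should not receive significantly different
        # scores: refer the script to a third, independent marker.
        print(f"{candidate}: {mark_a} vs {mark_b} -> third marking required")
    else:
        final = (mark_a + mark_b) / 2
        print(f"{candidate}: final score {final:.1f}")
```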