Dr. R. Green, Aug 2006 1
Principles of language testing
EUROPOS SĄJUNGA
Overview
• What are the principles of language testing?
• How can we define them?
• What factors can influence them?
• How can we measure them?
• How do they interrelate?
Reliability
Related to accuracy, dependability and consistency, e.g. 20°C here today, 20°C in North Italy – are they the same?
According to Henning [1987], reliability is
• a measure of accuracy, consistency, dependability, or fairness of scores resulting from the administration of a particular examination
e.g. a score of 75% on a test today and 83% tomorrow points to a problem with reliability.
Validity: internal & external
Construct validity [internal]
• the extent to which evidence can be found to support the underlying theoretical construct on which the test is based
Content validity [internal]
• the extent to which the content of a test can be said to be sufficiently representative and comprehensive of the purpose for which it has been designed
Validity [2]
Response validity [internal]
• the extent to which test takers respond in the way expected by the test developers
Concurrent validity [external]
• the extent to which test takers' scores on one test relate to those on another externally recognised test or measure
Validity [3]
Predictive validity [external]
• the extent to which scores on test Y predict test takers' ability to do X, e.g. IELTS + success in academic studies at university
Face validity [internal/external]
• the extent to which the test is perceived to reflect the stated purpose, e.g. writing in a listening test – is this appropriate? It depends on the target language situation, i.e. academic environment
Validity [4]
• 'Validity is not a characteristic of a test, but a feature of the inferences made on the basis of test scores and the uses to which a test is put.'
Alderson [2002: 5]
Practicality
The ease with which:
• test items can be replicated in terms of the resources needed, e.g. time, materials, people
• the test can be administered
• the test can be graded
• the results can be interpreted
Factors which can influence reliability, validity and practicality…
Test [1]
• quality of items
• number of items
• difficulty level of items
• level of item discrimination
• type of test methods
• number of test methods
Test [2]
• time allowed
• clarity of instructions
• use of the test
• selection of content
• sampling of content
• invalid constructs
Test taker
• familiarity with test method
• attitude towards the test, i.e. interest, motivation, emotional/mental state
• degree of guessing employed
• level of ability
Test administration
• consistency of administration procedure
• degree of interaction between invigilators and test takers
• time of day the test is administered
• clarity of instructions
• test environment – light / heat / noise / space / layout of room
• quality of equipment used, e.g. for listening tests
Scoring
• accuracy of the key, e.g. does it include all possible alternatives?
• inter-rater reliability, e.g. in writing, speaking
• intra-rater reliability, e.g. in writing, speaking
• machine vs. human
How can we measure reliability?
Test-retest
• same test administered to the same test takers following an interval of no more than 2 weeks
Inter-rater reliability
• two or more independent estimates on a test, e.g. written scripts marked by two raters independently and results compared
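One common way to compare two sets of scores, whether from two administrations of the same test or from two raters, is a Pearson correlation. The sketch below is illustrative only: the slides do not name a statistic, and the function name `pearson_r` and the rater scores are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2 scores)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: the same six written scripts marked by two raters (0-20 scale).
rater_a = [12, 15, 9, 18, 14, 11]
rater_b = [13, 14, 10, 17, 15, 10]

inter_rater = pearson_r(rater_a, rater_b)
print(f"inter-rater correlation: {inter_rater:.2f}")
```

For test-retest, the two lists would instead hold each test taker's score on the first and second administration. A high correlation shows that the rank ordering is consistent; it does not show that the two raters award the same marks.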
Measuring reliability [2]
Internal consistency reliability estimates, e.g.
• Split half reliability
• Cronbach's alpha / Kuder Richardson 20 [KR20]
Split half reliability
• test to be administered to a group of test takers is divided into halves, scores on each half correlated with the other half
• the resulting coefficient is then adjusted by Spearman-Brown Prophecy Formula to allow for the fact that the total score is based on an instrument that is twice as long as its halves
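The two steps above can be sketched as follows. The odd/even split is an assumption (the slide only says the test is "divided into halves"), and the function names and response data are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_reliability(responses):
    """responses: one list of item scores per test taker (1 = correct, 0 = wrong)."""
    # Assumed split: odd-numbered items vs. even-numbered items.
    odd_half = [sum(row[0::2]) for row in responses]
    even_half = [sum(row[1::2]) for row in responses]
    r_half = pearson_r(odd_half, even_half)
    # Spearman-Brown adjustment for a test twice as long as each half.
    return 2 * r_half / (1 + r_half)

# Hypothetical data: six test takers, six items.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

reliability = split_half_reliability(responses)
print(f"split-half reliability (Spearman-Brown adjusted): {reliability:.2f}")
```

A different way of halving the test (for example first half vs. second half) will generally give a different coefficient.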
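Cronbach's Alpha [KR 20]
• this approach looks at how test takers perform on each individual item and then compares that performance against their performance on the test as a whole
• KR 20 is the form of alpha used when items are scored right / wrong
• values normally fall between 0 and +1; a negative value is possible and signals a problem with the items or the scoring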
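As a rough illustration, alpha can be computed from the item variances and the variance of the total scores using the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of totals). The function names and the response data below are hypothetical; with right/wrong items, as here, the result corresponds to KR 20.

```python
def variance(values):
    """Sample variance (n - 1 denominator)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

def cronbach_alpha(responses):
    """responses: one list of item scores per test taker."""
    k = len(responses[0])  # number of items
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical data: six test takers, six right/wrong items.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.2f}")
```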
Reliability is influenced by…
• test length: the longer the test, the more reliable it is likely to be [though there is a point of diminishing returns]
• item discrimination: items which discriminate add to reliability; items that are too easy or too difficult discriminate poorly, so reliability is likely to be lower
• range of ability: if there is a wide range of abilities amongst the test takers, the test is likely to have higher reliability
• item homogeneity: the more homogeneous the items are, the higher the reliability is likely to be
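The effect of test length can be illustrated with the general form of the Spearman-Brown Prophecy Formula, which projects reliability when a test is lengthened by a given factor. This is a sketch under the formula's usual assumption that the added items are of the same quality as the existing ones; the starting reliability of 0.60 and the function name are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when the test is made `length_factor` times as long."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)

# Hypothetical starting point: a test with reliability 0.60.
projected = {factor: round(spearman_brown(0.60, factor), 2) for factor in (1, 2, 3, 4)}
for factor, value in projected.items():
    print(f"{factor}x length: {value}")
```

Under these assumptions the projected reliability moves from 0.60 to 0.75, 0.82 and 0.86 as the test is doubled, tripled and quadrupled, so each extra block of items buys less than the one before.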
How can we measure validity?
According to Henning [1987]
• non-empirically, involving inspection, intuition and common sense
• empirically, involving the collection and analysis of qualitative and quantitative data
Construct validity
• evidence is usually obtained through such statistical analyses as factor analysis [looks for items which group together], discrimination; also through retrospection procedures
Content validity
• this type of validity cannot be measured statistically; need to involve experts in an analysis of the test; detailed specifications should be drawn up to ensure the content is both representative and comprehensive
Response validity
• can be ascertained by means of interviewing test takers [Henning]; asking them to take part in introspection / retrospection procedures [Alderson]
Concurrent validity
• determined by correlating the results on the test with another externally recognised measure. Care needs to be taken that the two measures are measuring similar skills and using similar test methods
Predictive validity
• can be determined by investigating the relationship between a test taker's score, e.g. on IELTS/TOEFL, and his/her success in the academic program chosen
• problem - other factors may influence success, e.g. life abroad, ability in chosen field, peers, tutors, personal issues, etc.; also time factor element
Reliability vs. validity?
• 'an observation can be reliable without being valid, but cannot be valid without first being reliable. In other words, reliability is a necessary, but not sufficient, condition for validity.'
[Hubley & Zumbo 1996]
• 'Of all the concepts in testing and measurement, it may be argued, validity is the most basic and far-reaching, for without validity, a test, measure or observation and any inferences made from it are meaningless'
[Hubley & Zumbo 1996, 207]
Reliability vs. validity [2]
• even an ideal test which is perfectly reliable and possessing perfect criterion-related validity will be invalid for some purposes
[Henning 1987]
Practicality
Designing and developing good test items requires
• working with other colleagues
• materials, i.e. paper, computer, printer etc.
• time
Some items look very attractive but this attraction has to be weighed against these factors.
References
• Alderson, J. C. 2002 Conceptions of validity and validation. Paper presented at a conference in Bucharest, June 2002.
• Angoff 1988 Validity: An evolving concept. In H. Wainer & H. Braun [Eds.] Test validity [pp. 19-32]. Hillsdale, NJ: Erlbaum.
• Bachman, L. F. 1990 Fundamental considerations in language testing. Oxford: O.U.P.
• Cumming, A. & Berwick, R. [Eds.] 1996 Validation in Language Testing. Multilingual Matters.
• Hatch, E. & Lazaraton, A. 1991 The Research Manual - Design & Statistics for Applied Linguistics. Newbury House.
References [2]
• Henning, G. 1987 A guide to language testing: Development, evaluation and research. Cambridge, Mass: Newbury House.
• Hubley, A. M. & Zumbo, B. D. 1996 A dialectic on validity: where we have been and where we are going. The Journal of General Psychology 123[3], 207-215.
• Messick, S. 1988 The once and future issues of validity: Assessing the meaning and consequences of measurement. In H. Wainer & H. Braun [Eds.] Test validity [pp. 33-45]. Hillsdale, NJ: Erlbaum.
• Messick, S. 1989 Validity. In R. L. Linn [Ed.] Educational measurement [3rd ed., pp. 13-103]. New York: Macmillan.
Item-total Statistics
Item    Corrected Item-Total    Alpha if Item
        Correlation             Deleted
R01     .5259                   .7964
R02     .6804                   .7594
R03     .6683                   .7623
R04     .5516                   .7940
R05     .7173                   .7489
R16     .3946                   .8288
N of Cases = 194.0    N of Items = 6    Alpha = .8121
Item-total Statistics
Item    Corrected Item-Total    Alpha if Item
        Correlation             Deleted
R16     .5773                   .7909
R17     .5995                   .7863
R18     .7351                   .7553
R19     .7920                   .7419
R20     .6490                   .7753
R01     .1939                   .8663
N of Cases = 194.0    N of Items = 6    Alpha = .8185
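In both tables the item borrowed from the other set (R16 in the first, R01 in the second) has the lowest corrected item-total correlation, and alpha would rise above the reported overall value if it were deleted. The sketch below shows one way such statistics could be computed; it is not the procedure used to produce the tables, and the function names and response data are hypothetical. "Corrected" is taken here to mean that the item is correlated with the total of the remaining items.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def variance(values):
    """Sample variance (n - 1 denominator)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

def cronbach_alpha(responses):
    k = len(responses[0])
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

def item_total_statistics(responses):
    """Return (corrected item-total correlation, alpha if item deleted) per item."""
    stats = []
    for i in range(len(responses[0])):
        item = [row[i] for row in responses]
        rest = [row[:i] + row[i + 1:] for row in responses]  # all other items
        rest_totals = [sum(row) for row in rest]
        stats.append((pearson_r(item, rest_totals), cronbach_alpha(rest)))
    return stats

# Hypothetical data: the sixth item does not behave like the other five.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 1],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
]

overall_alpha = cronbach_alpha(responses)
stats = item_total_statistics(responses)
for number, (r, alpha_deleted) in enumerate(stats, start=1):
    print(f"item {number}: r = {r:.2f}, alpha if deleted = {alpha_deleted:.2f}")
```

In this made-up data set the sixth item plays the role of the odd one out: its corrected correlation is low and alpha improves once it is removed.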
Component Matrix [a]
        Component
Item    1       2
R01     .502    .559
R02     .690    .423
R03     .683    .461
R04     .571    .404
R05     .750    .343
R16     .670    -.223
R17     .631    -.508
R18     .770    -.368
R19     .789    -.383
R20     .646    -.494
Extraction Method: Principal Component Analysis.
a. 2 components extracted.
Thank you for your attention!

Principles of language_testing_rita_green

  • 1. Principles of language testing. Dr. R. Green, Aug 2006. EUROPOS SĄJUNGA
  • 2. Overview
      • What are the principles of language testing?
      • How can we define them?
      • What factors can influence them?
      • How can we measure them?
      • How do they interrelate?
  • 3. Reliability
      • Related to accuracy, dependability and consistency, e.g. 20°C here today, 20°C in North Italy: are they the same?
      • According to Henning [1987], reliability is a measure of accuracy, consistency, dependability, or fairness of scores resulting from the administration of a particular examination, e.g. 75% on a test today and 83% tomorrow points to a problem with reliability.
  • 4. Validity: internal & external
      • Construct validity [internal]: the extent to which evidence can be found to support the underlying theoretical construct on which the test is based
      • Content validity [internal]: the extent to which the content of a test can be said to be sufficiently representative and comprehensive of the purpose for which it has been designed
  • 5. Validity [2]
      • Response validity [internal]: the extent to which test takers respond in the way expected by the test developers
      • Concurrent validity [external]: the extent to which test takers' scores on one test relate to those on another externally recognised test or measure
  • 6. Validity [3]
      • Predictive validity [external]: the extent to which scores on test Y predict test takers' ability to do X, e.g. IELTS and success in academic studies at university
      • Face validity [internal/external]: the extent to which the test is perceived to reflect the stated purpose, e.g. writing in a listening test: is this appropriate? It depends on the target language situation, i.e. academic environment
  • 7. Validity [4]
      • 'Validity is not a characteristic of a test, but a feature of the inferences made on the basis of test scores and the uses to which a test is put.' Alderson [2002: 5]
  • 8. Practicality
    The ease with which the test:
      • items can be replicated in terms of resources needed, e.g. time, materials, people
      • can be administered
      • can be graded
      • results can be interpreted
  • 9. Factors which can influence reliability, validity and practicality
  • 10. Test [1]
      • quality of items
      • number of items
      • difficulty level of items
      • level of item discrimination
      • type of test methods
      • number of test methods
  • 11. Test [2]
      • time allowed
      • clarity of instructions
      • use of the test
      • selection of content
      • sampling of content
      • invalid constructs
  • 12. Test taker
      • familiarity with test method
      • attitude towards the test, i.e. interest, motivation, emotional/mental state
      • degree of guessing employed
      • level of ability
  • 13. Test administration
      • consistency of administration procedure
      • degree of interaction between invigilators and test takers
      • time of day the test is administered
      • clarity of instructions
      • test environment: light / heat / noise / space / layout of room
      • quality of equipment used, e.g. for listening tests
  • 14. Scoring
      • accuracy of the key, e.g. does it include all possible alternatives?
      • inter-rater reliability, e.g. in writing, speaking
      • intra-rater reliability, e.g. in writing, speaking
      • machine vs. human
  • 15. How can we measure reliability?
      • Test-retest: same test administered to the same test takers following an interval of no more than 2 weeks
      • Inter-rater reliability: two or more independent estimates on a test, e.g. written scripts marked by two raters independently and results compared
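Both approaches on this slide come down to comparing two sets of scores for the same test takers. The sketch below assumes a Pearson correlation as the comparison (the slide does not name a statistic), and the marks are invented for illustration only.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical marks given independently by two raters to the same six scripts.
rater_a = [12, 15, 9, 18, 14, 11]
rater_b = [13, 14, 10, 17, 15, 10]

inter_rater_r = pearson(rater_a, rater_b)
print(round(inter_rater_r, 3))
```

The same function could be used for test-retest by passing the scores from the first and second administrations instead of the two raters' marks.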
  • 16. Measuring reliability [2]
    Internal consistency reliability estimates, e.g.
      • Split half reliability
      • Cronbach's alpha / Kuder-Richardson 20 [KR20]
  • 17. Split half reliability
      • test to be administered to a group of test takers is divided into halves, scores on each half correlated with the other half
      • the resulting coefficient is then adjusted by the Spearman-Brown Prophecy Formula to allow for the fact that the total score is based on an instrument that is twice as long as its halves
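A minimal sketch of the split-half procedure described above. It assumes an odd/even item split (the slide only says "divided into halves") and uses invented 0/1 responses; the function names are illustrative, not from the original deck.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_reliability(rows):
    """rows: one list of item scores per test taker."""
    # Assumed split: items 1, 3, 5... versus items 2, 4, 6...
    first_half = [sum(row[0::2]) for row in rows]
    second_half = [sum(row[1::2]) for row in rows]
    r_half = pearson(first_half, second_half)
    # Spearman-Brown adjustment for a test twice as long as each half.
    return 2 * r_half / (1 + r_half)

# Hypothetical responses: 6 test takers x 6 items (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 0],
]

reliability = split_half_reliability(responses)
print(round(reliability, 3))
```

With so few test takers and items the estimate is unstable; a different way of halving the test would give a different coefficient.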
  • 18. Cronbach's Alpha [KR 20]
      • this approach looks at how test takers perform on each individual item and then compares that performance against their performance on the test as a whole
      • values normally fall between 0 and +1; unlike discrimination, which runs from -1 to +1, a negative alpha is unusual and signals a problem with the items
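As a rough illustration, alpha can be computed from the item variances and the variance of the total scores. The sketch below uses the standard variance-based formula, which the slide does not spell out, and invented 0/1 responses; for items scored 0/1 this should be the same quantity as KR-20.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per test taker."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # one tuple of scores per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical responses: 6 test takers x 6 items (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 0],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```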
  • 19. Reliability is influenced by ...
      • the longer the test, the more reliable it is likely to be [though there is a point of no extra return]
      • items which discriminate will add to reliability; therefore, if the items are too easy / too difficult, reliability is likely to be lower
      • if there is a wide range of abilities amongst the test takers, the test is likely to have higher reliability
      • the more homogeneous the items are, the higher the reliability is likely to be
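The "point of no extra return" for test length can be illustrated with the general form of the Spearman-Brown formula. This is a sketch that assumes the added items behave like the existing ones; the starting reliability of 0.60 is an invented figure.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when the test is made length_factor times as long."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)

base_reliability = 0.60  # hypothetical reliability of the original test

for factor in (1, 2, 3, 4, 6):
    predicted = spearman_brown(base_reliability, factor)
    print(factor, round(predicted, 3))
```

Each further lengthening adds less than the one before it, which is why lengthening a test eventually stops being worth the extra testing time.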
  • 20. How can we measure validity?
    According to Henning [1987]:
      • non-empirically, involving inspection, intuition and common sense
      • empirically, involving the collection and analysis of qualitative and quantitative data
  • 21. Construct validity
      • evidence is usually obtained through such statistical analyses as factor analysis [looks for items which group together] and discrimination; also through retrospection procedures
    Content validity
      • this type of validity cannot be measured statistically; experts need to be involved in an analysis of the test; detailed specifications should be drawn up to ensure the content is both representative and comprehensive
  • 22. Response validity
      • can be ascertained by means of interviewing test takers [Henning] or asking them to take part in introspection / retrospection procedures [Alderson]
    Concurrent validity
      • determined by correlating the results on the test with another externally recognised measure. Care needs to be taken that the two measures are measuring similar skills and using similar test methods
  • 23. Predictive validity
      • can be determined by investigating the relationship between a test taker's score, e.g. on IELTS/TOEFL, and his/her success in the academic program chosen
      • problem: other factors may influence success, e.g. life abroad, ability in chosen field, peers, tutors, personal issues, etc.; also the time factor
  • 24. Reliability vs. validity?
      • 'an observation can be reliable without being valid, but cannot be valid without first being reliable. In other words, reliability is a necessary, but not sufficient, condition for validity.' [Hubley & Zumbo 1996]
      • 'Of all the concepts in testing and measurement, it may be argued, validity is the most basic and far-reaching, for without validity, a test, measure or observation and any inferences made from it are meaningless' [Hubley & Zumbo 1996, 207]
  • 25. Reliability vs. validity [2]
      • even an ideal test which is perfectly reliable and possesses perfect criterion-related validity will be invalid for some purposes [Henning 1987]
  • 26. Practicality
    Designing and developing good test items requires:
      • working with other colleagues
      • materials, i.e. paper, computer, printer etc.
      • time
    Some items look very attractive but this attraction has to be weighed against these factors.
  • 27. References
      • Alderson, J. C. 2002. Conceptions of validity and validation. Paper presented at a conference in Bucharest, June 2002.
      • Angoff, 1988. Validity: An evolving concept. In H. Wainer & H. Braun [Eds.], Test validity [pp. 19-32]. Hillsdale, NJ: Erlbaum.
      • Bachman, L. F. 1990. Fundamental considerations in language testing. Oxford: O.U.P.
      • Cumming, A. & Berwick, R. [Eds.] 1996. Validation in Language Testing. Multilingual Matters.
      • Hatch, E. & Lazaraton, A. 1991. The Research Manual: Design & Statistics for Applied Linguistics. Newbury House.
  • 28. References [2]
      • Henning, G. 1987. A guide to language testing: Development, evaluation and research. Cambridge, Mass: Newbury House.
      • Hubley, A. M. & Zumbo, B. D. 1996. A dialectic on validity: where we have been and where we are going. The Journal of General Psychology, 123[3], 207-215.
      • Messick, S. 1988. The once and future issues of validity: Assessing the meaning and consequences of measurement. In H. Wainer & H. Braun [Eds.], Test validity [pp. 33-45]. Hillsdale, NJ: Erlbaum.
      • Messick, S. 1989. Validity. In R. L. Linn [Ed.], Educational measurement [3rd ed., pp. 13-103]. New York: Macmillan.
  • 29. Item-total Statistics

          Item   Corrected Item-Total Correlation   Alpha if Item Deleted
          R01    .5259                              .7964
          R02    .6804                              .7594
          R03    .6683                              .7623
          R04    .5516                              .7940
          R05    .7173                              .7489
          R16    .3946                              .8288

    N of Cases = 194.0; N of Items = 6; Alpha = .8121
  • 30. Item-total Statistics

          Item   Corrected Item-Total Correlation   Alpha if Item Deleted
          R16    .5773                              .7909
          R17    .5995                              .7863
          R18    .7351                              .7553
          R19    .7920                              .7419
          R20    .6490                              .7753
          R01    .1939                              .8663

    N of Cases = 194.0; N of Items = 6; Alpha = .8185
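The two columns in these tables can be reproduced in outline as follows. This is a sketch on invented 0/1 responses, not the data behind the slides, and it assumes that "corrected" means each item is correlated with the total of the remaining items, which is the usual reading of that column heading.

```python
from math import sqrt
from statistics import variance

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def cronbach_alpha(rows):
    """Variance-based alpha; rows holds one list of item scores per test taker."""
    k = len(rows[0])
    item_variance_sum = sum(variance(item) for item in zip(*rows))
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

def item_total_statistics(rows):
    """Return (corrected item-total correlation, alpha if item deleted) per item."""
    results = []
    for i in range(len(rows[0])):
        item = [row[i] for row in rows]
        # Drop item i so it does not inflate its own correlation with the total.
        remaining = [row[:i] + row[i + 1:] for row in rows]
        rest_total = [sum(row) for row in remaining]
        results.append((pearson(item, rest_total), cronbach_alpha(remaining)))
    return results

# Hypothetical responses: 6 test takers x 6 items (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 0],
]

stats = item_total_statistics(responses)
for number, (correlation, alpha_without) in enumerate(stats, start=1):
    print(f"item {number}: r = {correlation:.4f}, alpha if deleted = {alpha_without:.4f}")
```

An item whose "alpha if deleted" is higher than the overall alpha, as with R01 on slide 30, is lowering the consistency of the set it has been grouped with.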
  • 31. Component Matrix [a]

          Item   Component 1   Component 2
          R01    .502           .559
          R02    .690           .423
          R03    .683           .461
          R04    .571           .404
          R05    .750           .343
          R16    .670          -.223
          R17    .631          -.508
          R18    .770          -.368
          R19    .789          -.383
          R20    .646          -.494

    Extraction Method: Principal Component Analysis.
    a. 2 components extracted.
  • 32. Thank you for your attention!