CHAPTER 32 OF ROUTLEDGE HANDBOOK
FAIRNESS
OVERVIEW
Introduction
1. Roles Of The Stakeholders
2. Roles Of The Test-takers
3. Power Relations In Testing
Test Fairness
1. Test Sensitivity Review
2. Test Bias
Fairness and validity
1. Fairness independent of validity
2. Fairness subsumes validity
3. Fairness and validity are overlapping
4. Fairness as an important aspect of validity
Xi's framework
Kunnan's Test Context Framework
INTRODUCTION
• There are many definitions of fairness.
• Your-dictionary.com (2011) states that to be fair is to be “just and honest,” “impartial,” and “unprejudiced,”
specifically, “free from discrimination based on race, religion, sex, etc.”
• Kunnan (2000) argues that fairness embraces three concerns: validity, accessibility of tests to test takers, and
justice.
INTRODUCTION
Spaan (2000) defines fairness as an ideal “in which opportunities are equal”:
“In the natural world, test writers and developers cannot be ‘fair’ in the ideal sense, but … they can try to
[be] equitable.” This equitability is “the joint responsibility of the test developer, the test user, and the examinee, in a
sort of social contract”.
Fairness can be seen as a system or process – as distinct from a quality.
INTRODUCTION
What most language testing views of fairness have in common is a desire to avoid the effects of any
construct-irrelevant factors on the entire testing process, from the test-design stage through
post-administration decision-making.
In this context, one dimension of fairness concerns the roles of the major stakeholders in achieving language
testing fairness. These stakeholders, to use Spaan’s (2000) tripartite “social contract” scheme, are:
test developers;
L2/FL learners (the test takers);
and test users (e.g., teachers and administrators).
INTRODUCTION
Role Of Test Developers
❑They must try to ensure the validity, reliability, and practicality of their test methods;
❑They must also provide test users with easily understandable guidelines for the use of their tests;
❑They must also solicit feedback from test users, to effect further test improvements.
INTRODUCTION
Roles Of The Test Takers
❑They must become familiar with the testing format and overall test content before taking the test;
❑They must try to make sure that the level of the test matches their own skill/knowledge level.
Roles Of The Test Administrators
❑They must give tests to students for whom the tests were designed; to do otherwise would be an instance of
test abuse.
INTRODUCTION
Power Relations In Testing
Viewing test developer, test taker, and test user as parties to a social contract highlights a second issue, namely
the phenomenon of power relations in testing.
Shohamy (2001) points out that language tests have often been used by powerful agencies such as
governments, educational bureaucracies, or school staff for reasons other than the assessment of language
skills. For example, tests have been used to establish discipline, to impose sanctions on schools or teachers, or
to raise the prestige of the subject matter being tested.
INTRODUCTION
Power Relations In Testing
Shohamy offers a set of principles organized under the heading of critical language testing, as a way to engage
language testers “in a wider sphere of social dialogue and debate about … the roles tests … have been assigned
to play in society.”
❑Critical language testing “encourages test takers to develop a critical view of tests as well as to act on it by
questioning tests and critiquing the value which is inherent in them.”
A critical language testing perspective asks that all parties to the contract remain vigilant.
Test Fairness
Test Sensitivity Review
One approach to examining whether or not test questions are fair is through a test sensitivity review.
Such reviews are performed by trained judges employed by test-development organizations, who examine test
tasks to determine whether they contain language or content that may be considered stereotyping, patronizing,
inflammatory, or otherwise offensive to test takers belonging to subgroups defined by culture, ethnicity, or
gender.
Test Fairness
Test Bias
Test bias is a technical term indicating a testing situation in which a particular test use results in different
interpretations of test scores received by cultural, ethnic, gender, or linguistic subgroups.
It is treated here as synonymous with differential item functioning (DIF).
Bias or DIF is considered to be present when a test item is differentially difficult for an ethnic, cultural, or
gender-related subgroup which is otherwise equally matched with another subgroup in terms of knowledge or
skill.
Among the statistical methods used to uncover DIF are:
the Standardization Procedure;
and the Mantel–Haenszel method.
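As a rough illustration of the two methods just named, the sketch below computes a Mantel–Haenszel common odds ratio and a standardized proportion-correct difference for a single item, after matching test takers on a total-score level. The response counts are invented for illustration, the function names are hypothetical, and weighting the standardized difference by focal-group counts is one common choice rather than something the chapter specifies.

```python
import math
from collections import defaultdict


def stratify(records):
    """Tally correct/incorrect counts per group at each matching-score level."""
    # Each stratum maps a group label to [n_correct, n_incorrect].
    strata = defaultdict(lambda: {"reference": [0, 0], "focal": [0, 0]})
    for group, matching_score, correct in records:
        strata[matching_score][group][0 if correct else 1] += 1
    return strata


def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across score levels; 1.0 means equal odds of success."""
    numerator = denominator = 0.0
    for cell in strata.values():
        ref_right, ref_wrong = cell["reference"]
        focal_right, focal_wrong = cell["focal"]
        n = ref_right + ref_wrong + focal_right + focal_wrong
        numerator += ref_right * focal_wrong / n
        denominator += ref_wrong * focal_right / n
    return numerator / denominator  # ZeroDivisionError if the ratio is undefined


def standardized_p_difference(strata):
    """Focal-weighted mean of (focal - reference) proportion correct."""
    weighted_sum = 0.0
    focal_total = 0
    for cell in strata.values():
        ref_right, ref_wrong = cell["reference"]
        focal_right, focal_wrong = cell["focal"]
        n_ref = ref_right + ref_wrong
        n_focal = focal_right + focal_wrong
        if n_ref == 0 or n_focal == 0:
            continue  # a level containing only one group cannot be compared
        weighted_sum += n_focal * (focal_right / n_focal - ref_right / n_ref)
        focal_total += n_focal
    return weighted_sum / focal_total


def make_records(group, matching_score, n_right, n_wrong):
    """Build hypothetical (group, matching_score, correct) response records."""
    right = [(group, matching_score, True)] * n_right
    wrong = [(group, matching_score, False)] * n_wrong
    return right + wrong


# Invented data: two score levels, 20 test takers per group at each level.
records = (
    make_records("reference", 1, 10, 10) + make_records("focal", 1, 5, 15)
    + make_records("reference", 2, 16, 4) + make_records("focal", 2, 12, 8)
)
strata = stratify(records)
alpha_mh = mantel_haenszel_odds_ratio(strata)
mh_d_dif = -2.35 * math.log(alpha_mh)  # delta-scale transform of the odds ratio
std_p_dif = standardized_p_difference(strata)
print(f"alpha_MH={alpha_mh:.2f}  MH D-DIF={mh_d_dif:.2f}  STD P-DIF={std_p_dif:.3f}")
```

With these invented counts the odds ratio is about 2.84 and the standardized difference is -0.225, so both indices point the same way: the item is harder for the focal group than for equally matched members of the reference group. Whether such a difference counts as bias still requires a judgment about construct relevance.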
Fairness And Validity
Kane (2010) points out that the relationship between fairness and validity depends on how the two concepts
are defined: narrowly defining validity and broadly defining fairness will result in validity being considered a
component of fairness. On the other hand, a broad definition of validity and a narrow conceptualization of
fairness will result in fairness being understood as a part of validity.
Fairness And Validity
1. Fairness Independent Of Validity
One example of this view is the 1999 Standards for Educational and Psychological Testing, which define fairness as
having, at minimum, three components: lack of item bias, the presence of equitable treatment of all test takers in the
testing process, and equity of opportunity of examinees to learn the material on a given test. While fairness
here is not linked directly with validity, the Standards do mention that fairness “promotes the validity
and reliability of inferences made from test performance”.
Fairness And Validity
2. Fairness Subsumes Validity
Kunnan (2000) articulates a framework in which fairness includes issues of validity, accessibility to test takers,
and justice. Under validity, Kunnan includes issues such as construct validity, DIF, insensitive test-item language,
and content bias. An example of the latter might be a dialect of English employed in the test prompts that differs
in some respects from another English dialect that may constitute the L2 of the test taker. Under accessibility,
Kunnan includes issues such as affordability, geographic proximity of test taker to the testing site,
accommodations for test takers with disabilities, and opportunity to learn. “Opportunity to learn” is closely
connected with the notion of construct under-representation (Messick, 1989), which indicates the ability of a test
to measure some aspects of a construct or skill, but not others.
Fairness And Validity
2. Fairness Subsumes Validity (continued)
A test may be measuring aspects of a construct, such as knowledge of a particular rule of language pragmatics,
that certain test takers will not have had the opportunity to learn. Those test takers may thus score poorly on the
test, even though they may be proficient in other relevant areas of the construct. Finally, Kunnan’s facet of justice
embraces the notion of whether or not a test contributes to social equity. Kunnan (2004) later modified this
model to include absence of bias and test-administration conditions.
Fairness And Validity
3. Fairness And Validity Are Overlapping
Kane’s definition of test fairness draws on political and legal concepts. One, procedural due process, states that the
same rules should be applied to everyone in more or less the same way.
Kane also bases his definition on substantive due process, which states that the procedures applied should be
reasonable both in general and in the context in which they are applied. In applying this twin definition of fairness to
assessment, he gives two principles: the first is procedural fairness, in which test takers are treated “in essentially
the same way…take the same test or equivalent tests, under the same conditions or equivalent conditions, and …
their performances [are] evaluated using the same (or essentially the same) rules and procedures.” The second is
substantive fairness, in which the score interpretation and any test-based decision rule are reasonable and
appropriate for all test takers.
Fairness And Validity
4. Fairness As An Important Aspect Of Validity
This view is put forward by Willingham and Cole.
In this context, a fair test is one for which both the (preferably small) extent of statistical error of
measurement and the inferences (hopefully reasonable ones) from the test results regarding test-taker ability
are comparable from individual to individual and from subgroup to subgroup.
They state that comparable validity must be met at all stages of the testing process – when designing the test,
developing the test, administering the test, and using the test results. Comparable validity must be achieved by
selecting test material that does not give an advantage to some test takers for reasons that are irrelevant to the
construct being measured.
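One way to make "comparable error of measurement" concrete is to compare the standard error of measurement (SEM) across subgroups. The sketch below uses the classical test theory formula, SEM = SD × sqrt(1 − reliability). The scores, reliability estimates, and group names are hypothetical, and this check is offered as an illustration rather than as Willingham and Cole's own procedure.

```python
import math
import statistics


def standard_error_of_measurement(scores, reliability):
    """Classical test theory SEM: SD of observed scores times sqrt(1 - reliability)."""
    return statistics.pstdev(scores) * math.sqrt(1.0 - reliability)


# Hypothetical subgroup data: (observed scores, reliability estimate for that subgroup).
subgroups = {
    "group_a": ([52, 60, 68, 76, 84], 0.91),
    "group_b": ([50, 58, 66, 74, 82], 0.75),
}

sem_by_group = {
    name: standard_error_of_measurement(scores, reliability)
    for name, (scores, reliability) in subgroups.items()
}

for name, sem in sem_by_group.items():
    print(f"{name}: SEM = {sem:.2f}")
```

In this invented case the two groups have the same score spread, but the lower reliability for group_b gives it a noticeably larger SEM, which is the kind of subgroup difference that would call comparability into question.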
Fairness And Validity
4. Fairness As An Important Aspect Of Validity (continued)
Willingham and Cole see fairness as having three qualities linked to validity:
(1) comparable opportunities for test takers to show their knowledge and skills;
(2) comparable test tasks and scores;
(3) comparable treatment of test takers in test interpretation and use.
Xi both expands the definition of test fairness and offers a new framework for investigating fairness issues.
Xi's Framework
Xi states that fairness is “comparable validity for identifiable and relevant groups across all stages of
assessment, from assessment conceptualization to use of assessment results,” where “construct-irrelevant
factors, construct under-representation, inconsistent test administration practices, inappropriate
decision-making procedures or use of test results have no systematic or appreciable effects on test scores, test score
interpretations, score-based decisions and consequences for all relevant groups of examinees”.
Xi's Framework
Xi’s framework consists of a fairness argument embedded within a validity argument. A validity argument is a chain
of inferences that leads a test user to appropriate interpretations of test results. Xi’s validity argument framework
consists of six successive sub-arguments, each claiming that there is evidence that:
(1) the domain of L2 use which is of interest provides a meaningful basis for our observations of test-taker
performance on the test;
(2) the observed test scores reflect that domain of L2 use and not construct-irrelevant factors;
(3) the observed scores on the test are generalizable over similar language tasks on other, similar tests;
(4) the abovementioned generalization of observed scores can be linked to a theoretical interpretation (i.e., the
construct, the theoretical skill) of such scores;
(5) the theoretical construct can explain the L2 use in actual situations envisioned by the users of the test;
(6) the language-test results are “relevant, useful, and sufficient” for determining the level of L2 ability.
Each of these sub-arguments is supported by certain assumptions.
Xi's Framework
Embedded in the above chain of sub-arguments and underlying assumptions, Xi proposes, can be a fairness
argument, which consists in part of a series of rebuttals, one or more posed to each of the validity
sub-arguments. One can conceive of such rebuttals as research questions into the degree of fairness of a given
language test; that is, each rebuttal serves as a practical check on the claims of each sub-argument. For example,
to the first of the sub-arguments above, one can ask whether or not the domain of L2 use actually provides a
meaningful basis for observations of test-taker performance.
Kunnan’s Test Context Framework
This approach is intended to consider the wider political, educational, cultural, social, economic, legal, and historical
aspects of a test. In this it differs somewhat from the more psychometrically focused approaches considered above,
such as DIF, or even Xi’s fairness argument framework. It has a certain overlap with Kane’s (2010) applications of
political and legal concepts to language testing, and it also resonates with the analyses of power relations in language
testing offered by Shohamy (2001), both mentioned above.
Kunnan’s approach thus brings wider social factors into consideration when evaluating the fairness of a language test.

The Routledge Handbook of Language Testing, Ch. 32: Fairness

  • 1. CHAPTER 32 OF ROUTLEDGE HANDBOOK FAIRNESS
  • 2. OVERVIEW Introduction 1. Roles Of The Stakeholders 2. Roles Of The Test-takers 3. Power Relations In Testing Test Fairness 1. Test Sensitivity Review 2. Test Bias Fairness and validity 1. Fairness independent of validity 2. Fairness subsumes validity 3. Fairness and validity are overlapping 4. Fairness as an important aspect of validity Xi's framework Kunnan's Test Context Framework
  • 3. INTRODUCTION • There are many definitions of fairness. • Your-dictionary.com (2011) states that to be fair is to be “just and honest,” “impartial,” and “unprejudiced,” specifically, “free from discrimination based on race, religion, sex, etc.” • Kunnan (2000) argues that fairness embraces three concerns: validity, accessibility of tests to test takers, and justice.
  • 4. INTRODUCTION Spaan (2000) defines fairness as an ideal “in which opportunities are equal.” “In the natural world, test writers and developers cannot be ‘fair’ in the ideal sense, but … they can try to be equitable.” This equitability is “the joint responsibility of the test developer, the test user, and the examinee, in a sort of social contract”. Fairness can be seen as a system or process – as distinct from a quality.
  • 5. INTRODUCTION What most language testing views of fairness have in common is a desire to avoid the effects of any construct-irrelevant factors on the entire testing process, from the test-design stage through post-administration decision-making. In this context, one dimension of fairness concerns the roles of the major stakeholders in achieving language testing fairness. These stakeholders, to use Spaan’s (2000) tripartite “social contract” scheme, are: test developers; L2/FL learners; and test users (i.e., teachers, administrators, etc.).
  • 6. INTRODUCTION Role Of Test Developers ❑They must try to ensure the validity, reliability, and practicality of their test methods; ❑They must also provide test users with easily understandable guidelines for the use of their tests; ❑They must also solicit feedback from them, to effect further test improvements.
  • 7. INTRODUCTION Roles Of The Test Takers ❑They must become familiar with the testing format and overall test content before taking the test; ❑They must try to make sure that the level of the test matches their own skill/knowledge level. Roles Of The Test Administrators ❑They must give tests to students for whom the tests were designed; to do otherwise would be an instance of test abuse.
  • 8. INTRODUCTION Power Relations In Testing Viewing test developer, test taker, and test user as parties to a social contract highlights a second issue, namely the phenomenon of power relations in testing. Shohamy (2001) points out that language tests have often been used by powerful agencies such as governments, educational bureaucracies, or school staff for reasons other than the assessment of language skills. For example, tests have been used to establish discipline, to impose sanctions on schools or teachers, or to raise the prestige of the subject matter being tested.
  • 9. INTRODUCTION Power relations in testing She offers a set of principles organized under the heading of critical language testing, as a way to engage language testers “in a wider sphere of social dialogue and debate about … the roles tests … have been assigned to play in society” ❑Critical language testing “encourages test takers to develop a critical view of tests as well as to act on it by questioning tests and critiquing the value which is inherent in them.” A critical language testing perspective asks that all parties to the contract remain vigilant.
  • 10. Test Fairness Test Sensitivity Review One approach to examining whether or not test questions are fair is through a test sensitivity review. Such reviews are performed by trained judges employed by test-development organizations, who examine test tasks to determine whether they contain language or content that may be considered stereotyping, patronizing, inflammatory, or otherwise offensive to test takers belonging to subgroups defined by culture, ethnicity, or gender.
  • 11. Test Fairness Test Bias “Test bias” is a technical term indicating a testing situation in which a particular test use results in different interpretations of test scores received by cultural, ethnic, gender, or linguistic subgroups. It is treated here as synonymous with differential item functioning (DIF). Bias or DIF is considered to be present when a test item is differentially difficult for an ethnic, cultural, or gender-related subgroup which is otherwise equally matched with another subgroup in terms of knowledge or skill. Among the statistical methods used to uncover DIF are: the Standardization Procedure; and the Mantel–Haenszel method.
  • 12. Fairness And Validity Kane (2010) points out that the relationship between fairness and validity depends on how the two concepts are defined: narrowly defining validity and broadly defining fairness will result in validity being considered a component of fairness. On the other hand, a broad definition of validity and a narrow conceptualization of fairness will result in fairness being understood as a part of validity.
  • 13. Fairness And Validity 1. Fairness Independent Of Validity One example of this is the Standards for Educational and Psychological Testing, which define fairness as having, at minimum, three components: lack of item bias, the presence of equitable treatment of all test-takers in the testing process, and equity of opportunity of examinees to learn the material on a given test. While fairness here is not linked directly with validity, these 1999 standards do mention that fairness “promotes the validity and reliability of inferences made from test performance”.
  • 14. Fairness And Validity 2. Fairness Subsumes Validity Kunnan (2000) articulates a framework in which fairness includes issues of validity, accessibility to test takers, and justice. Under validity, Kunnan includes issues such as construct validity, DIF, insensitive test-item language, and content bias. An example of the latter might be a dialect of English employed in the test prompts that differs in some respects from another English dialect that may constitute the L2 of the test taker. Under accessibility, Kunnan indicates issues such as affordability, geographic proximity of test taker to the testing site, accommodations for test takers with disabilities, and opportunity to learn. “Opportunity to learn” is closely connected with the notion of construct under-representation (Messick, 1989), which indicates the ability of a test to measure some aspects of a construct or skill, but not others.
  • 15. Fairness And Validity 2. Fairness Subsumes Validity (continued) A test may be measuring aspects of a construct, such as knowledge of a particular rule of language pragmatics, that certain test takers will not have had the opportunity to learn and thus score poorly on the test, despite the fact that they may be proficient in other relevant areas of the construct. Finally, Kunnan’s facet of justice embraces the notion of whether or not a test contributes to social equity. Kunnan (2004) later modified this model to include absence of bias, and test-administration conditions.
  • 16. Fairness And Validity 3. Fairness And Validity Are Overlapping Kane’s definition of test fairness draws on political and legal concepts. One, procedural due process, states that the same rules should be applied to everyone in more or less the same way. Kane also bases his definition on substantive due process, which states that the procedures applied should be reasonable both in general and in the context in which they are applied. In applying this twin definition of fairness to assessment, he gives two principles: the first is procedural fairness, in which test takers are treated “in essentially the same way…take the same test or equivalent tests, under the same conditions or equivalent conditions, and … their performances [are] evaluated using the same (or essentially the same) rules and procedures.” The second is substantive fairness, in which the score interpretation and any test-based decision rule are reasonable and appropriate for all test takers.
  • 17. Fairness And Validity 4. Fairness As An Important Aspect Of Validity By Willingham and Cole In this context, a fair test is one for which both the (preferably, small) extent of statistical error of measurement, and the inferences (hopefully, reasonable ones) from the test results regarding test-taker ability, are comparable from individual to individual and from subgroup to subgroup. They state that comparable validity must be met at all stages of the testing process – when designing the test, developing the test, administering the test, and using the test results. Comparable validity must be achieved by selecting test material that does not give an advantage to some test takers for reasons that are irrelevant to the construct being measured.
  • 18. Fairness And Validity 4. Fairness As An Important Aspect Of Validity (continued) Willingham and Cole see fairness as having three qualities linked to validity: (1) comparable opportunities for test takers to show their knowledge and skills; (2) comparable test tasks and scores; (3) comparable treatment of test takers in test interpretation and use. Xi both expands the definition of test fairness and offers a new framework for investigating fairness issues.
  • 19. Xi's Framework Xi states that fairness is “comparable validity for identifiable and relevant groups across all stages of assessment, from assessment conceptualization to use of assessment results,” where “construct-irrelevant factors, construct under-representation, inconsistent test administration practices, inappropriate decision-making procedures or use of test results have no systematic or appreciable effects on test scores, test score interpretations, score-based decisions and consequences for all relevant groups of examinees”.
  • 20. Xi's Framework Consists of a fairness argument embedded within a validity argument. A validity argument is a chain of inferences that leads a test user to appropriate interpretations of test results. Xi’s validity argument framework consists of six successive sub-arguments, that: (1) there is evidence that the domain of L2 use which is of interest provides a meaningful basis for our observations of test-taker performance on the test; (2) there is evidence that the observed test scores reflect that domain of L2 use and not construct-irrelevant factors; (3) there is evidence that the observed scores on the test are generalizable over similar language tasks on other, similar tests; (4) there is evidence that the abovementioned generalization of observed scores can be linked to a theoretical interpretation (i.e., the construct, the theoretical skill) of such scores; (5) there is evidence that the theoretical construct can explain the L2 use in actual situations envisioned by the users of the test; and (6) there is evidence that the language-test results are “relevant, useful, and sufficient” for determining the level of L2 ability. Each of these sub-arguments is supported by certain assumptions.
  • 21. Xi's Framework Embedded in the above chain of sub-arguments and underlying assumptions, Xi proposes, can be a fairness argument, which consists in part of a series of rebuttals, one or more posed to each of the validity sub-arguments. One can conceive of such rebuttals as research questions into the degree of fairness of a given language test, i.e., each rebuttal serves as a practical check on the claims of each sub-argument. For example, to the first of the sub-arguments above, one can ask whether or not the domain of L2 use actually provides a meaningful basis for observations of test-taker performance.
  • 22. Kunnan’s Test Context Framework This approach is intended to consider the wider political, educational, cultural, social, economic, legal, and historical aspects of a test. In this it differs somewhat from other, more psychometrically focused approaches considered above, such as DIF, or even Xi’s fairness argument framework. It has a certain overlap with Kane’s (2010) applications of political and legal concepts to language testing, and it also resonates with the analyses of power relations in language testing offered by Shohamy (2001), both mentioned above. Kunnan’s approach thus brings wider social factors into consideration when evaluating the fairness of a language test.
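
Slide 11 names the Standardization Procedure and the Mantel–Haenszel method as ways to detect DIF but does not show how they work. The following is a minimal Python sketch of the usual textbook form of both statistics, not code from the chapter. It assumes test takers are matched on a total score, that each record is a `(group, matching_score, correct)` triple, and that the group labels are `"ref"` and `"focal"`. The function name `dif_stats`, the helper `make`, and the sample counts are all hypothetical.

```python
import math
from collections import defaultdict


def dif_stats(records):
    """Compute DIF statistics for a single item.

    records: iterable of (group, matching_score, correct), where group is
    "ref" or "focal" (assumed labels) and correct is a bool.
    """
    # One 2x2 table per matching score:
    # [ref correct, ref wrong, focal correct, focal wrong]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for group, score, correct in records:
        idx = (0 if group == "ref" else 2) + (0 if correct else 1)
        tables[score][idx] += 1

    mh_num = mh_den = 0.0
    std_num = std_den = 0.0
    for a, b, c, d in tables.values():
        if a + b == 0 or c + d == 0:
            continue  # skip score levels where one group is absent
        n = a + b + c + d
        # Mantel-Haenszel common odds ratio terms
        mh_num += a * d / n
        mh_den += b * c / n
        # Standardization: focal-minus-reference difference in proportion
        # correct, weighted by the number of focal-group members
        std_num += (c + d) * (c / (c + d) - a / (a + b))
        std_den += c + d

    # Note: mh_den is zero if no stratum has both a wrong reference answer
    # and a correct focal answer; this sketch does not handle that case.
    alpha = mh_num / mh_den
    return {
        "alpha_mh": alpha,                      # > 1 favors the reference group
        "mh_d_dif": -2.35 * math.log(alpha),    # ETS delta-scale transformation
        "std_p_dif": std_num / std_den,         # < 0 means harder for focal group
    }


def make(group, score, n_correct, n_wrong):
    """Build hypothetical records for one group at one score level."""
    return [(group, score, True)] * n_correct + [(group, score, False)] * n_wrong


# Made-up illustration data: two matched score levels, ten test takers per group.
records = (
    make("ref", 1, 6, 4) + make("focal", 1, 4, 6)
    + make("ref", 2, 8, 2) + make("focal", 2, 6, 4)
)
result = dif_stats(records)
print(result)
```

In this invented data set the focal group answers correctly less often than the reference group at both score levels, so the odds ratio comes out above 1 and both difference measures come out negative. That is the pattern slide 11 describes as an item being differentially difficult for otherwise equally matched subgroups. A real analysis would also need a significance test and far larger samples.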