Human Interface Laboratory
Towards Cross-Lingual Generalization of
Translation Gender Bias
2021. 3. 9 @FAccT Conference
Won Ik Cho*, Jiwon Kim*, Jaeyoung Yang, Nam Soo Kim
Contents
• Translation gender bias
- What is the problem and why does it matter?
- In which language pairs is it significant? Struggles so far
• Our approach
- Language pairs and template
- Dataset construction
- Measurement of fluency and biasedness
• Discussion
- Results and analysis
- Takeaways
Bias
• Bias in machine learning?
- Bias and variance
• Overfitting and underfitting
- Bias from the viewpoint of fairness in machine learning?
• A problem of individuality and context rather than of statistics and system (Binns, 2017)
- Is the bias in machine learning related to the bias in fair machine learning and to real social bias?
• e.g., image semantic role labeling
– Zhao et al., Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints, in Proc. EMNLP, 2017.
• This also happens in translation!
Bias
• How does (social) bias show up in AI and NLP?
- Sun et al., Mitigating Gender Bias in Natural Language Processing: Literature Review, in Proc. ACL, 2019.
Overview: Gender bias in translation?
• Formulation #1
- Gender-neutral pronouns
• Target problem?
- Translation of gender-neutral pronouns into gender-specific ones
• Gender-neutral pronoun
- Pronouns that do not mark biological gender
- Frequently appear in languages like Korean, Japanese, Turkish, ...
- Prates et al., Assessing Gender Bias in Machine Translation: A Case Study with Google Translate, Neural Computing and Applications, 2018.
Overview: Gender bias in translation?
• Formulation #2
- Gendered languages
• Target problem?
- Translation of expressions without gender marking into gendered items
• Gendered languages
- Grammatical gender in articles, nouns, adjectives
- Differs from biological gender
- Vanmassenhove et al., Getting Gender Right in Neural Machine Translation, in Proc. EMNLP, 2018.
Overview: Gender bias in translation?
• Why does it matter?
- The result can be offensive to end users
• When does it matter?
- Whether or not users are familiar with the source/target language
• Who may feel offended?
- Especially when the mistranslation involves social stereotypes
• Research questions
- How can the evaluation incorporate various aspects of translation gender bias?
- How do grammatical properties and resource conditions influence the bias issue?
Template-based attacks
• 걔(s/he)는 [##]이야!
- Cho et al., On Measuring Gender Bias in Translation of Gender-neutral Pronouns, in Proc. GeBNLP, ACL Workshop, 2019.
• Why Korean?
- Displays various sentence styles
- Translation services are popular among users
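The template on this slide can be filled in programmatically. What follows is a minimal sketch, not the authors' construction script: the occupation list and the function names `has_final_consonant` and `fill_template` are assumptions made for illustration. It also assumes the informal copula is written 이야 after a syllable that ends in a consonant and 야 otherwise, so that the generated sentences stay grammatical.

```python
# Hypothetical sketch: fill the template "걔는 [##]이야!" with occupation words.
# The occupation list is illustrative only, not the evaluation set from the talk.

HANGUL_BASE = 0xAC00        # code point of the first precomposed syllable
HANGUL_COUNT = 11172        # number of precomposed Hangul syllables
FINALS_PER_BLOCK = 28       # 27 final consonants plus "no final"


def has_final_consonant(syllable: str) -> bool:
    """Return True if a precomposed Hangul syllable ends in a consonant."""
    offset = ord(syllable) - HANGUL_BASE
    if not 0 <= offset < HANGUL_COUNT:
        raise ValueError(f"not a precomposed Hangul syllable: {syllable!r}")
    return offset % FINALS_PER_BLOCK != 0


def fill_template(word: str) -> str:
    """Build the sentence, choosing 이야 or 야 from the word's last syllable."""
    copula = "이야" if has_final_consonant(word[-1]) else "야"
    return f"걔는 {word}{copula}!"


# teacher, doctor, soldier (assumed example words)
occupations = ["선생님", "의사", "군인"]
sentences = [fill_template(word) for word in occupations]

for sentence in sentences:
    print(sentence)
```

Each generated sentence keeps the gender-neutral pronoun 걔, so any gendered pronoun in the translated output is a choice made by the translation system.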
Semantic cues
• WinoMT
- Stanovsky et al., Evaluating Gender Bias in Machine Translation, in Proc. ACL, 2019.
- Performance can differ among languages of a similar family
- High accuracy in the stereotypical case does not guarantee accuracy in the counterpart case
Our approach
• Combined approach – Cross-lingual evaluation?
- Two different types of translation gender bias
• Different types of gender bias can be observed in a single translation system
Our approach
• Language pairs
- Different types of "language pairs" should be investigated
• 1) Languages with gender-neutral pronouns (GNPs)
• 2) Languages without (colloquially used) GNPs that are not grammatically gendered
• 3) Languages with grammatical gender
Our approach
• Language pairs
- Linguistic criteria: GNPs and genderedness
• Korean/Tagalog -> English -> German/Portuguese
- Resource criteria (amount of available resources)
• Korean > Tagalog
• German > Portuguese
Our approach
• Template
- How can the pairs be simultaneously evaluated?
Korean: 그 여자에 대해 하나 말하자면, 걔는 초등학교 선생님이야.
Tagalog: Isang bagay tungkol sa babae, siya ay isang guro sa elementarya.
English: One thing about the woman, she is an elementary school teacher.
German: Eine Sache über die Frau, sie ist eine Grundschullehrerin.
Portuguese: Um facto sobre a mulher, ela é professora do ensino primário.
Our approach
• Evaluation
- Template-based evaluation set construction
- Inference with public MT modules
- Human evaluation (gender-related) and automatic metrics (fluency)
Our approach
• Measurement
- Biasedness
• Accuracy on biological gender
• Accuracy on grammatical gender
• Disparate impact
– Accuracy on the female case divided by accuracy on the male case
- Fluency
• BLEU
– EN, DE, PT
• BERTScore
– Multilingual BERT
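The disparate impact measure on this slide is a ratio of two accuracies, which a short sketch makes concrete. The function names and the toy judgments below are assumptions for illustration. In the talk the per-sentence correctness labels come from human evaluation, and the numbers here are not results from the study.

```python
from typing import Sequence


def accuracy(judgments: Sequence[bool]) -> float:
    """Fraction of translations judged correct with respect to gender."""
    if not judgments:
        raise ValueError("accuracy is undefined for an empty set of judgments")
    return sum(judgments) / len(judgments)


def disparate_impact(female: Sequence[bool], male: Sequence[bool]) -> float:
    """Accuracy on the female case divided by accuracy on the male case."""
    male_accuracy = accuracy(male)
    if male_accuracy == 0:
        raise ZeroDivisionError("accuracy on the male case is zero")
    return accuracy(female) / male_accuracy


# Toy judgments (made up): True means the gender was translated correctly.
female_judgments = [True, True, False, True]   # accuracy 0.75
male_judgments = [True, True, True, True]      # accuracy 1.00

di = disparate_impact(female_judgments, male_judgments)
print(f"disparate impact = {di:.2f}")
```

A value of 1 means both cases are translated equally accurately. Values below 1 mean the female case is translated less accurately than the male case.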
Results and analysis
• Results
- More bias-related errors in EN -> DE/PT than in KO/TL -> EN
• She is a game programmer -> Sie ist ein professioneller Spieler (literally "She is a professional player", with the masculine noun form)
• aviador, soldado, monge (airman, soldier, monk)
• Exceptional cases for Bing KO-EN
Results and analysis
• Analysis
- Unbiasedness/Disparate impact
• Higher among type 1 languages
– DE, PT < KO, TL (overall)
• Within the same type, resource seems to matter
– DE < PT, KO < TL
- Fluency measurement
• Lexical and semantic approaches yield different results
– BLEU (lexical): DE > PT > KO, TL
– BERTScore (semantic): DE < PT, KO < TL
- Observations
• The amount of available language resources, here assumed for the public MT modules, does not guarantee unbiased translation, although fluency measures may be higher in some sense
• The evaluation results on gender-related inference differ across fluency measures
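One way to see why a lexical measure can diverge from a gender-related judgment is to look at clipped n-gram precision, the overlap statistic that BLEU is built on. The sketch below computes only that component, not full BLEU (no brevity penalty, no geometric mean over n-gram orders). The whitespace tokenization and the two German sentences, adapted from the template example, are assumptions for illustration.

```python
from collections import Counter
from typing import List, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate: str, reference: str, n: int = 1) -> float:
    """Share of candidate n-grams found in the reference, with clipped counts."""
    cand_counts = Counter(ngrams(candidate.split(), n))
    ref_counts = Counter(ngrams(reference.split(), n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return matched / total


# Illustrative sentences (lowercased pronoun and verb, no punctuation).
reference = "sie ist eine Grundschullehrerin"   # feminine noun form
masculine = "sie ist ein Grundschullehrer"      # masculine noun form

p1 = clipped_precision(masculine, reference, n=1)
p2 = clipped_precision(masculine, reference, n=2)
print(f"unigram precision = {p1:.2f}, bigram precision = {p2:.2f}")
```

In this toy case the masculine variant shares only two of its four tokens with the feminine reference, so a single gender inflection lowers the lexical score in the same way as any other word mismatch. An embedding-based measure such as BERTScore may treat the two noun forms as close.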
Takeaways
• Translation gender bias is problematic since wrong results can be offensive to end users
• Translation gender bias matters regardless of the user's proficiency in the language, and is especially offensive if the mistranslation involves social stereotypes
• Our approach, including the template and measurement, can combine translation gender bias evaluation across various language pairs
• Our evaluation results suggest that inductive bias in the form of social stereotypes is a major factor causing the errors, and that augmenting training corpora may not be a solution
References (order of appearance)
• Binns, Reuben. "Fairness in Machine Learning: Lessons from Political Philosophy." arXiv preprint arXiv:1712.03586 (2017).
• Zhao, Jieyu, et al. "Men Also Like Shopping: Reducing Gender Bias Amplification Using Corpus-level Constraints." arXiv preprint arXiv:1707.09457 (2017).
• Sun, Tony, Andrew Gaut, Shirlyn Tang, Yuxin Huang, Mai ElSherief, Jieyu Zhao, Diba Mirza, Elizabeth Belding, Kai-Wei Chang, and William Yang Wang. "Mitigating Gender Bias in Natural Language Processing: Literature Review." In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 1630-1640. 2019.
• Prates, Marcelo O. R., Pedro H. Avelar, and Luís C. Lamb. "Assessing Gender Bias in Machine Translation: A Case Study with Google Translate." Neural Computing and Applications (2018): 1-19.
• Vanmassenhove, Eva, Christian Hardmeier, and Andy Way. "Getting Gender Right in Neural Machine Translation." In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 3003-3008. 2018.
• Cho, Won Ik, et al. "On Measuring Gender Bias in Translation of Gender-neutral Pronouns." GeBNLP 2019 (2019): 173.
• Stanovsky, Gabriel, Noah A. Smith, and Luke Zettlemoyer. "Evaluating Gender Bias in Machine Translation." arXiv preprint arXiv:1906.00591 (2019).
Thank you!