This document summarizes an academic presentation on measuring and addressing gender bias in machine translation. It discusses two main types of translation gender bias related to gender-neutral pronouns and grammatical gender. The authors propose a combined cross-lingual evaluation approach using template translations between language pairs that vary in their use of gender-neutral pronouns and grammatical constructs. They measure translation quality and gender bias across the language pairs using automatic metrics and human evaluations. The results suggest translation gender bias is more common when translating to languages with grammatical gender constructs and that increasing training data may not fully address social stereotypes that can amplify bias.
2103 ACM FAccT
1. Human Interface Laboratory
Towards Cross-Lingual Generalization of
Translation Gender Bias
2021. 3. 9 @FAccT Conference
Won Ik Cho*, Jiwon Kim*, Jaeyoung Yang, Nam Soo Kim
2. Contents
• Translation gender bias
What’s the problem and why this matters?
Significant in which language pairs? - Struggles so far
• Our approach
Language pairs and template
Dataset construction
Measurement of fluency and biasedness
• Discussion
Results and analysis
Takeaways
3. Bias
• Bias in machine learning?
Bias and variance
• Overfitting and underfitting
Bias from the viewpoint of fairness in machine learning?
• Problem of individuality and context rather than of
statistics and system (Binns, 2017)
Is the bias in machine learning related to the bias in fair machine
learning and real social bias?
• e.g., image semantic role labeling
– Zhao et al., Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints, in Proc. EMNLP, 2017.
• This also happens in translation!
4. Bias
• How is (social) bias shown in AI and NLP?
Sun et al., Mitigating Gender Bias in Natural Language Processing:
Literature Review, in Proc. ACL, 2019.
5. Overview: Gender bias in translation?
• Formulation #1
Gender-neutral pronouns
• Target problem?
Translation of gender-neutral pronouns to gender-specific ones
• Gender-neutral pronoun
Pronouns that do not display biological gender
Frequently appear in languages like Korean, Japanese, Turkish, ...
Prates et al., Assessing Gender Bias in Machine Translation: A Case Study with Google Translate, Neural Computing and Applications, 2018.
6. Overview: Gender bias in translation?
• Formulation #2
Gendered languages
• Target problem?
Translation of expressions without gender representation to gendered items
• Gendered languages
Grammatical genders in articles, nouns, adjectives
Differs from the biological gender
Vanmassenhove et al., Getting Gender Right in Neural Machine Translation, in Proc. EMNLP, 2018.
7. Overview: Gender bias in translation?
• Why do they matter?
The result can be offensive to end users
• When do they matter?
Whether or not the users are familiar with the target/source language
• Who will potentially feel offended?
Especially those affected when the mistranslation involves social stereotypes
• Research questions
How can the evaluation incorporate various aspects of translation gender bias?
How will grammatical properties and resource conditions influence the bias issue?
8. Template-based attacks
• 걔(s/he)는 [##]이야!
Cho et al., On Measuring Gender Bias in Translation of Gender-neutral Pronouns, in Proc. GeBNLP, ACL Workshop, 2019.
• Why Korean?
Displays various sentence styles
Translation services popular among the users
9. Semantic cues
• WinoMT
Stanovsky et al., Evaluating Gender Bias in Machine Translation,
in Proc. ACL, 2019.
- Performance can differ within a similar language family
- High accuracy in the stereotypical case does not guarantee accuracy in the counterpart
10. Our approach
• Combined approach – Cross-lingual evaluation?
Two different types of translation gender bias
• Different types of gender bias can be observed in a single translation system
11. Our approach
• Language pairs
Different types of 'language pairs' should be investigated
• 1) Languages with gender-neutral pronouns
• 2) Languages without (colloquially used) GNPs but not grammatically gendered
• 3) Languages with grammatical gender
12. Our approach
• Language pairs
Linguistic criteria: On GNPs and genderedness
• Korean/Tagalog -> English -> German/Portuguese
Resource criteria
• Korean > Tagalog
• German > Portuguese
13. Our approach
• Template
How can the pairs be simultaneously evaluated?
Korean: 그 여자에 대해 하나 말하자면, 걔는 초등학교 선생님이야.
Tagalog: Isang bagay tungkol sa babae, siya ay isang guro sa elementarya.
English: One thing about the woman, she is an elementary school teacher.
German: Eine Sache über die Frau, sie ist eine Grundschullehrerin.
Portuguese: Um facto sobre a mulher, ela é professora do ensino primário.
15. Our approach
• Evaluation
Template-based evaluation set construction
Inference with public MT modules
Human evaluation (gender-related) and automatic metrics (fluency)
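As a rough illustration of template-based evaluation set construction, the sketch below fills an English template modeled on the example sentence from the template slide. The slot names, the `PERSONS` and `OCCUPATIONS` lists, and the `build_eval_set` helper are assumptions made for illustration; the slides do not give the actual word lists or tooling.

```python
from itertools import product

# Hypothetical English template, modeled on the example sentence in the slides.
TEMPLATE = "One thing about the {person}, {pronoun} is {occupation}."

# Illustrative slot fillers only; the real lists are not given in the slides.
PERSONS = {"woman": "she", "man": "he"}
OCCUPATIONS = ["an elementary school teacher", "a game programmer"]


def build_eval_set():
    """Return one record per (person, occupation) combination."""
    records = []
    for (person, pronoun), occupation in product(PERSONS.items(), OCCUPATIONS):
        records.append({
            "gender": "female" if person == "woman" else "male",
            "occupation": occupation,
            "source": TEMPLATE.format(
                person=person, pronoun=pronoun, occupation=occupation
            ),
        })
    return records


if __name__ == "__main__":
    for record in build_eval_set():
        print(record["gender"], "|", record["source"])
```

Each record keeps the intended gender next to the source sentence, so that a later human judgment on the MT output can be compared against it.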
16. Our approach
• Measurement
Biasedness
• Accuracy on biological gender
• Accuracy on grammatical gender
• Disparate impact
– Accuracy on female case divided by accuracy on male case
Fluency
• BLEU
– EN, DE, PT
• BERTScore
– Multilingual BERT
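The disparate impact measure above can be sketched as a ratio of two accuracies. The function names and the boolean judgment lists below are hypothetical, and the numbers are made up for illustration; they are not results from the study.

```python
def accuracy(judgments):
    """Fraction of translations judged correct (True) by evaluators."""
    if not judgments:
        raise ValueError("no judgments given")
    return sum(judgments) / len(judgments)


def disparate_impact(female_judgments, male_judgments):
    """Accuracy on female cases divided by accuracy on male cases."""
    male_acc = accuracy(male_judgments)
    if male_acc == 0:
        raise ZeroDivisionError("accuracy on male cases is zero")
    return accuracy(female_judgments) / male_acc


if __name__ == "__main__":
    # Made-up judgments: True means the gender was translated correctly.
    female = [True, False, True, False]  # accuracy 0.50
    male = [True, True, True, False]     # accuracy 0.75
    print(round(disparate_impact(female, male), 3))
```

A ratio close to 1 means the system is about equally accurate on female and male cases; a ratio well below 1 means female cases are translated correctly less often.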
17. Results and analysis
• Results
More bias-related errors in EN > DE/PT than in KO/TL > EN
• She is a game programmer > Sie ist ein professioneller Spieler
• aviador, soldado, monge (airman, soldier, monk)
• Exceptional cases for Bing KO-EN
18. Results and analysis
• Analysis
Unbiasedness/Disparate impact
• Higher among type 1 languages
– DE, PT < KO, TL (overall)
• Within the same type, resource seems to matter
– DE < PT, KO < TL
Fluency measurement
• Lexical and semantic approaches yield different results
– BLEU (lexical): DE > PT > KO, TL
– BERTScore (semantic): DE < PT, KO < TL
Observations
• The amount of available language resources, though here only assumed for the public MT modules, does not guarantee unbiased translation, even where a fluency measure is higher
• The relation between gender-related inference results and fluency differs depending on the fluency measure used
19. Takeaways
• Translation gender bias is problematic since wrong results can be offensive to end users
• Translation gender bias matters regardless of the user's proficiency in the language, and is especially offensive if the mistranslation engages social stereotypes
• Our approach, including the template and the measurement, can combine translation gender bias evaluation across various language pairs
• Our evaluation results suggest that inductive bias in the form of social stereotypes is a major factor causing the errors, and that augmenting training corpora may not be a solution
20. Reference (order of appearance)
• Binns, Reuben. "Fairness in Machine Learning: Lessons from Political Philosophy." arXiv preprint arXiv:1712.03586 (2017).
• Zhao, Jieyu, et al. "Men Also Like Shopping: Reducing Gender Bias Amplification Using Corpus-level Constraints." arXiv preprint arXiv:1707.09457 (2017).
• Sun, Tony, Andrew Gaut, Shirlyn Tang, Yuxin Huang, Mai ElSherief, Jieyu Zhao, Diba Mirza, Elizabeth Belding, Kai-Wei Chang, and William Yang Wang. "Mitigating Gender Bias in Natural Language Processing: Literature Review." In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 1630-1640. 2019.
• Prates, Marcelo O. R., Pedro H. Avelar, and Luís C. Lamb. "Assessing Gender Bias in Machine Translation: A Case Study with Google Translate." Neural Computing and Applications (2018): 1-19.
• Vanmassenhove, Eva, Christian Hardmeier, and Andy Way. "Getting Gender Right in Neural Machine Translation." In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 3003-3008. 2018.
• Cho, Won Ik, et al. "On Measuring Gender Bias in Translation of Gender-neutral Pronouns." GeBNLP 2019 (2019): 173.
• Stanovsky, Gabriel, Noah A. Smith, and Luke Zettlemoyer. "Evaluating Gender Bias in Machine Translation." arXiv preprint arXiv:1906.00591 (2019).