Dodging AI biases in future-proof Machine Translation solutions

Konstantin Savenkov
Intento CEO
GlobalSaké
© Intento, Inc. / October 2020

AGENDA
2
Some context on Intento
—
Future of Work, AI and company culture
—
Using MT for multilingual communication
—
Case Study 1: Gender Bias
—
Case Study 2: Tone of Voice
—
Case Study 3: Data Locality
—
Key Takeaways

SOME CONTEXT ON INTENTO
3

ENTERPRISES MASSIVELY FAIL TO ADOPT AI
4
20%*
Wrong vendor selected
Failed integrations
Failed pilots
Failed to deliver ROI
* Share of US companies with successful AI deployment (Deloitte State of Cognitive Survey 2017)
© Intento, Inc. / September 2020

BRIDGING THE GAP BETWEEN
MT CAPABILITIES AND ADOPTION
5
MT Procurement
MT Need MT Systems
Localization

7
MT Procurement
—
MT Curation
MT Need MT Systems
Localization

8
MT Procurement
—
MT Curation
—
Multi-Engine MT
MT Need MT Systems
Localization

9
MT Procurement
—
MT Curation
—
Multi-Engine MT
—
Multi-Purpose MT
MT Need MT Systems
Localization
Customer Service
Office Productivity
Global Community

10
MT Procurement
—
MT Curation
—
Multi-Engine MT
—
Multi-Purpose MT
—
Continuous Improvement
MT Need MT Systems
Localization
Customer Service
Office Productivity
Global Community

FUTURE OF WORK, AI,
AND COMPANY CULTURE
11

FUTURE OF WORK AND AI
FROM TOOLS TO COLLEAGUES
12
AI is more than just a tool
—
experience (pre-training)
—
test assignments (evaluations on your data)
—
onboarding (domain adaptation)
—
continuous learning

RETHINKING TEAMS
13
Team as a cooperation of cognitive models, both human and artificial

AI AND COMPANY CULTURE
14
Can we afford to have only a part of the company aligned with its culture and values?

MT FOR MULTILINGUAL
COMMUNICATION
15

MT ADOPTION BOTTLENECKS
16
technical (integration)
—
linguistic (domain adaptation)
—
economic (supply chain)
—
cultural (biases)
—
security & legal (privacy)

MT FOR COMMUNICATION
17
[Diagram: Intento MT Hub]

GETTING COMMUNICATION RIGHT
19
pre-moderation is not feasible
—
right communication = right
culture
—
adding MT to the mix

THINGS TO LOOK AFTER
20
Gender
—
Tone of Voice
—
Privacy

WORKING AROUND
THE GENDER BIAS
21

GENDER BIAS
IN MACHINE TRANSLATION
22
Gender Bias in MT as evaluated in WinoMT Challenge [1]
—
carefully measures the bias
—
in practice, other cases create more issues (see next slide)
[1] “Evaluating Gender Bias in Machine Translation”, Gabriel Stanovsky, Noah A. Smith, Luke Zettlemoyer, 2019, https://arxiv.org/abs/1906.00591

GENDER BIAS
IN COMMUNICATION
23
Source text (English) | Machine Translation (French) | Comment
Are you ready?        | Es-tu prêt?                  | MASCULINE
Are you ready?        | Es-tu prête?                 | FEMININE
Are you surprised?    | Tu es surpris?               | MASCULINE
Are you surprised?    | Tu es surprise?              | FEMININE
Lack of context
—
Defaults to either feminine or masculine
—
Baseline MT engines are not consistent

GENDER BIAS
MOSTLY MASCULINE BY DEFAULT
24
English to French
—
31 segments
—
stock models
—
mostly masculine
[Chart: default gender distribution, series A to G]
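
A minimal sketch of how a default gender distribution like the one on this slide could be tallied, assuming the outputs of each engine have already been labeled by a reviewer. The function name, the label strings, and the toy data are illustrative assumptions and are not the data behind the chart.

```python
from collections import Counter

def gender_distribution(labels_by_engine):
    """Return, per engine, the share of each gender label among its outputs."""
    shares = {}
    for engine, labels in labels_by_engine.items():
        counts = Counter(labels)
        total = sum(counts.values())
        if total == 0:
            # No labeled outputs for this engine, so there is nothing to report.
            shares[engine] = {}
            continue
        shares[engine] = {label: count / total for label, count in counts.items()}
    return shares

# Hypothetical annotations for two engines over four segments (toy data only).
toy_labels = {
    "A": ["masculine", "masculine", "feminine", "masculine"],
    "B": ["feminine", "masculine", "masculine", "masculine"],
}
distribution = gender_distribution(toy_labels)
```

Running the same tally before and after an adjustment step gives a simple way to compare engines on the same segment set.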

GENDER BIAS CONTROL
HOW TO FIX IT?
25
Option 1: Copy & paste from Google Translate Web App (supports gender control)
Not for French, not secure, cumbersome, no customization.
—
Option 2: Use long phrases, adding some context
You can instruct support operators, but not employees and or.
—
Option 3: MT-agnostic NLP
Works to a certain extent, provides a wider choice of MT engines
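
To make the MT-agnostic idea concrete, here is a toy sketch of a post-processing step that rewrites gendered French forms in the output of any engine. It is not Intento's method: the two-word lexicon is taken from the sample translations in this deck, and the function name and lookup approach are assumptions for illustration. A real solution would need morphological analysis and sentence context (for example, the noun "surprise" would be changed incorrectly by this sketch).

```python
import re

# Toy lexicon built from the deck's French examples (masculine -> feminine).
MASC_TO_FEM = {"prêt": "prête", "surpris": "surprise"}
FEM_TO_MASC = {fem: masc for masc, fem in MASC_TO_FEM.items()}

def adjust_gender(text, target):
    """Rewrite known gendered forms in `text` toward `target`.

    `target` is either "feminine" or "masculine". Words outside the
    toy lexicon are left untouched.
    """
    if target == "feminine":
        mapping = MASC_TO_FEM
    elif target == "masculine":
        mapping = FEM_TO_MASC
    else:
        raise ValueError("target must be 'feminine' or 'masculine'")
    # Match whole words only, so "prêt" inside "prête" is not rewritten twice.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, mapping)) + r")\b")
    return pattern.sub(lambda match: mapping[match.group(1)], text)

feminine = adjust_gender("Es-tu prêt?", "feminine")
masculine = adjust_gender("Tu es surprise?", "masculine")
```

Because the step runs on the translated text, it does not depend on which MT engine produced it.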

GENDER CONTROL
ADJUST TO FEMININE
26
English to French
—
31 segments
—
stock models
—
let’s make it more
FEMININE
Gender adjustment => feminine

GENDER CONTROL
ADJUST TO MASCULINE
27
English to French
—
31 segments
—
stock models
—
MASCULINE
Gender adjustment => masculine

CONTROLLING
TONE OF VOICE
28

TONE OF VOICE CONTROL
SAMPLES FROM SUPPORT CHATS
29
Source text (English)                     | Machine Translation (German)                               | Comment
Can you share your screen?                | Können Sie Ihren Bildschirm freigeben?                     | FORMAL
Could you help me?                        | Kannst du mir helfen?                                      | INFORMAL
Make sure you report any of these issues. | Stellen Sie sicher, dass Sie eines dieser Probleme melden. | FORMAL
Can you give an example?                  | Kannst du ein Beispiel geben?                              | INFORMAL
Formal vs. Informal
—
Crucial for Live Chats
—
Baseline MT engines are not consistent
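
The inconsistency in the samples above can be checked automatically with a rough heuristic. The sketch below labels a German sentence as formal or informal from its second-person pronouns and then tests whether a batch of translations agrees. The word lists and function names are assumptions for illustration; the heuristic ignores imperatives and treats any capitalized "Sie" or "Ihr" after the first word as formal address, which is not always correct.

```python
import re

# Informal second-person forms, matched case-insensitively.
INFORMAL_FORMS = {"du", "dich", "dir", "dein", "deine", "deinen",
                  "deinem", "deiner", "deines"}
# Formal forms, matched only when capitalized.
FORMAL_FORMS = {"Sie", "Ihnen", "Ihr", "Ihre", "Ihren", "Ihrem",
                "Ihrer", "Ihres"}

def classify_formality(german_text):
    """Return 'formal', 'informal', 'mixed', or 'unknown' for one sentence."""
    tokens = re.findall(r"\w+", german_text)
    informal = any(token.lower() in INFORMAL_FORMS for token in tokens)
    # Skip the first token: a capitalized "Sie" or "Ihr" there may simply be
    # sentence-initial "sie" or "ihr" rather than formal address.
    formal = any(token in FORMAL_FORMS for token in tokens[1:])
    if formal and informal:
        return "mixed"
    if formal:
        return "formal"
    if informal:
        return "informal"
    return "unknown"

def is_consistent(translations):
    """True when all classifiable translations share one formality label."""
    labels = {classify_formality(text) for text in translations}
    labels.discard("unknown")
    return len(labels) <= 1

samples = [
    "Können Sie Ihren Bildschirm freigeben?",
    "Kannst du mir helfen?",
    "Stellen Sie sicher, dass Sie eines dieser Probleme melden.",
    "Kannst du ein Beispiel geben?",
]
sample_labels = [classify_formality(text) for text in samples]
```

On the four chat samples from this slide the check reports an inconsistent batch, which matches the FORMAL and INFORMAL comments in the table.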

DEFAULT MT OUTPUT
30
English to German
—
210 segments
—
stock models
[Chart: default MT output by tone of voice, series A to G]

HOW TO MAKE IT INFORMAL?
31
Option 1: Use DeepL with formality=less (99.5% accuracy)
What if you need a custom model and terminology, or another MT has better linguistic quality for you?
—
Option 2: Generate synthetic training data, hoping translations become more informal
Expensive and time-consuming, also introduces bias into the model
—
Option 3: MT-agnostic NLP
Works to a certain extent, provides a wider choice of MT engines
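
For Option 1, a request to the DeepL translation API with the `formality` parameter could be prepared roughly as follows. This is a sketch that only builds the HTTP request and does not send it; the helper name and the placeholder key are assumptions, and the endpoint shown is the paid one (free accounts use `api-free.deepl.com` instead).

```python
from urllib.parse import urlencode
from urllib.request import Request

DEEPL_URL = "https://api.deepl.com/v2/translate"

def build_deepl_request(text, auth_key, formality="less"):
    """Build (but do not send) an English-to-German DeepL request.

    `formality="less"` asks for informal output and `"more"` for formal.
    """
    body = urlencode({
        "text": text,
        "source_lang": "EN",
        "target_lang": "DE",
        "formality": formality,
    }).encode("utf-8")
    headers = {"Authorization": f"DeepL-Auth-Key {auth_key}"}
    # Passing `data` makes urllib issue a POST request.
    return Request(DEEPL_URL, data=body, headers=headers)

# "YOUR_KEY" is a placeholder, not a working credential.
request = build_deepl_request("Can you give an example?", "YOUR_KEY")
# To actually call the API: urllib.request.urlopen(request)
```

The limitation noted on the slide still applies: the parameter is specific to one vendor, so it does not help when another engine or a custom model is preferred.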

MT-AGNOSTIC ADJUSTMENT
32
English to German
—
210 segments
—
stock models
—
INFORMAL
[Chart: MT output after informal adjustment, series A to G]

MT-AGNOSTIC ADJUSTMENT
33
English to German
—
210 segments
—
stock models
—
FORMAL
[Chart: MT output after formal adjustment, series A to G]

PRIVACY PROTECTION
34

DATA PROTECTION LAWS
35
[World map: data protection laws by country, categories A to F]
Source: DLA Piper, https://www.dlapiperdataprotection.com/index.html?t=world-map, as fetched on 2020-10-20

CLOUD MT DEPLOYMENTS
36
Alibaba
Amazon
Baidu
DeepL
Globalese*
Google
GTCom*
IBM
Microsoft
Mirai
ModernMT*
Naver
NiuTrans*
PROMT
Rozetta
SDL*
Sogou
Systran*
Tencent
Tilde*
Yandex
Youdao
* On-premise and private cloud deployment available

DATA AND PRIVACY PROTECTION
37
Communication may contain PII as well as healthcare, HR, and financial data.
—
Option 1:
- select proper MT vendor for every region
- when in doubt, use private-cloud deployments
—
Option 2:
- proper DPA and data protection clauses + insurance
—
Option 3: pseudonymization to remove PII
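
Option 3 can be sketched as a mask, translate, restore pipeline. The example below replaces email addresses and phone numbers with placeholder tokens before the text leaves your environment and puts the originals back afterwards. The regular expressions, the token format, and the function names are assumptions for illustration; names and other free-form PII would need a proper entity recognizer, and the sketch assumes the MT engine passes the tokens through unchanged.

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def pseudonymize(text):
    """Mask emails and phone numbers; return the masked text and a mapping."""
    mapping = {}

    def mask(match):
        # Each masked value gets a unique token such as __PII0__.
        token = f"__PII{len(mapping)}__"
        mapping[token] = match.group(0)
        return token

    masked = EMAIL.sub(mask, text)
    masked = PHONE.sub(mask, masked)
    return masked, mapping

def restore(text, mapping):
    """Put the original values back in place of the placeholder tokens."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text

# Fictitious contact details used only for the demonstration.
message = "Call Jane at +1 202 555 0143 or write to jane.doe@example.com."
masked_message, pii = pseudonymize(message)
# masked_message would be sent to the MT engine; the reply is then restored.
restored_message = restore(masked_message, pii)
```

Keeping the mapping on your side means the MT vendor never sees the masked values, whichever region the engine is hosted in.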

KEY TAKEAWAYS
38
Machine Translation is becoming ubiquitous. It acts more like a coworker than a tool.
—
When it’s biased, it may damage our work environment and culture.
—
As of today, its output is mostly masculine by default and quite inconsistent in tone of voice.
—
It’s possible to dodge those biases using NLP paired with MT.
—
Make sure you know where your MT sits so that you stay compliant.

THANKS!
39
Konstantin Savenkov, CEO
ks@inten.to
2150 Shattuck Ave, Berkeley CA 94705
INTENTO
https://inten.to
