INTENTO
Konstantin Savenkov
Intento CEO
Dodging AI Biases in Future-Proof Machine Translation Solutions
© Intento, Inc. / October 2020
GlobalSaké
AGENDA
Some context on Intento
—
Future of Work, AI and company culture
—
Using MT for multilingual communication
—
Case Study 1: Gender Bias
—
Case Study 2: Tone of Voice
—
Case Study 3: Data Locality
—
Key Takeaways
SOME CONTEXT ON INTENTO
ENTERPRISES MASSIVELY FAIL TO ADOPT AI
20%*
Wrong vendor selected
Failed integrations
Failed pilots
Failed to deliver ROI
* Share of US companies with successful AI deployment (Deloitte State of Cognitive Survey 2017)
© Intento, Inc. / September 2020
BRIDGING THE GAP BETWEEN
MT CAPABILITIES AND ADOPTION
MT Procurement
—
MT Curation
—
Multi-Engine MT
—
Multi-Purpose MT
—
Continuous Improvement
MT Need: Localization, Customer Service, Office Productivity, Global Community
MT Systems
FUTURE OF WORK, AI,
AND COMPANY CULTURE
FUTURE OF WORK AND AI
FROM TOOLS TO COLLEAGUES
AI is more than just a tool
—
experience (pre-training)
—
test assignments (evaluations on your data)
—
onboarding (domain adaptation)
—
continuous learning
RETHINKING TEAMS
Team as a cooperation of
cognitive models,
both human and artificial
AI AND COMPANY CULTURE
Can we afford to have only a
part of the company aligned
with its culture and values?
MT FOR MULTILINGUAL
COMMUNICATION
MT ADOPTION BOTTLENECKS
technical (integration)
—
linguistic (domain adaptation)
—
economic (supply chain)
—
cultural (biases)
—
security & legal (privacy)
MT FOR COMMUNICATION
INTENTO MT HUB
GETTING COMMUNICATION RIGHT
pre-moderation is not feasible
—
right communication = right
culture
—
adding MT to the mix
THINGS TO LOOK AFTER
Gender
—
Tone of Voice
—
Privacy
WORKING AROUND
THE GENDER BIAS
GENDER BIAS
IN MACHINE TRANSLATION
Gender Bias in MT as evaluated in WinoMT Challenge [1]
—
carefully measures
the bias
—
in practice,
other cases
create more issues
(see next slide)
[1] “Evaluating Gender Bias in Machine Translation”, Gabriel Stanovsky, Noah A. Smith, Luke Zettlemoyer, 2019, https://arxiv.org/abs/1906.00591
GENDER BIAS
IN COMMUNICATION
Source text (English) | Machine Translation (French) | Comment
Are you ready?        | Es-tu prêt?                  | MASCULINE
Are you ready?        | Es-tu prête?                 | FEMININE
Are you surprised?    | Tu es surpris?               | MASCULINE
Are you surprised?    | Tu es surprise?              | FEMININE
Lack of context
—
Defaults to either
feminine or
masculine
—
Baseline MT
engines are not
consistent
GENDER BIAS
MOSTLY MASCULINE BY DEFAULT
English to French
—
31 segments
—
stock models
—
mostly masculine
[Chart, MT engines A to G] Default gender distribution
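A tally of default gender across MT outputs could be sketched roughly as follows. This is only an illustration: the `GENDER_MARKERS` lookup is a hypothetical, hand-made table built from the sample translations in this deck, and it is not the method used for the 31-segment evaluation. A real measurement would need morphological analysis or human labels.

```python
from collections import Counter

# Hypothetical lookup of gendered French forms (assumption: taken from the
# sample translations in this deck, far from exhaustive).
GENDER_MARKERS = {
    "prêt": "masculine",
    "prête": "feminine",
    "surpris": "masculine",
    "surprise": "feminine",
}


def classify_gender(translation: str) -> str:
    """Return 'masculine', 'feminine', or 'unknown' for one MT output."""
    cleaned = translation.lower()
    for mark in "?!.,":
        cleaned = cleaned.replace(mark, " ")
    for word in cleaned.split():
        if word in GENDER_MARKERS:
            return GENDER_MARKERS[word]
    return "unknown"


def gender_distribution(translations):
    """Share of each gender label over a list of MT outputs."""
    counts = Counter(classify_gender(t) for t in translations)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


samples = ["Es-tu prêt?", "Es-tu prête?", "Tu es surpris?", "Tu es surprise?"]
print(gender_distribution(samples))
```

Running the same tally per engine on the same source segments would give one distribution per engine, which is the kind of comparison the chart summarizes.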
GENDER BIAS CONTROL
HOW TO FIX IT?
Option 1: Copy & paste from the Google Translate web app (supports gender control). Not for French, not secure, cumbersome, no customization.
Option 2: Use long phrases, adding some context. You can instruct support operators, but not employees.
Option 3: MT-agnostic NLP. Works to a certain extent, provides a wider choice of MT engines.
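The deck does not describe how the MT-agnostic NLP step works. As a rough sketch of the idea only, a post-processing function can rewrite gendered words in the output of any MT engine. The word pairs below are a hypothetical table taken from this deck's sample translations; a production system would need proper morphological analysis and agreement handling.

```python
import re

# Hypothetical masculine -> feminine pairs (assumption: from the deck's samples).
MASC_TO_FEM = {"prêt": "prête", "surpris": "surprise"}
FEM_TO_MASC = {fem: masc for masc, fem in MASC_TO_FEM.items()}


def adjust_gender(translation: str, target: str) -> str:
    """Rewrite known gendered words in an MT output to the target gender."""
    if target not in ("feminine", "masculine"):
        raise ValueError("target must be 'feminine' or 'masculine'")
    table = MASC_TO_FEM if target == "feminine" else FEM_TO_MASC

    def swap(match):
        word = match.group(0)
        replacement = table.get(word.lower())
        if replacement is None:
            return word  # not a known gendered form, leave untouched
        # Preserve an initial capital letter if the original had one.
        return replacement.capitalize() if word[0].isupper() else replacement

    return re.sub(r"\w+", swap, translation)


print(adjust_gender("Es-tu prêt?", "feminine"))
print(adjust_gender("Tu es surprise?", "masculine"))
```

Because the function only sees the translated text, it can sit behind any engine, which is what makes the approach MT-agnostic.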
GENDER CONTROL
ADJUST TO FEMININE
English to French
—
31 segments
—
stock models
—
let’s make it more
FEMININE
[Chart, MT engines A to G] Gender adjustment => feminine
GENDER CONTROL
ADJUST TO MASCULINE
English to French
—
31 segments
—
stock models
—
let’s make it more
MASCULINE
[Chart, MT engines A to G] Gender adjustment => masculine
CONTROLLING
TONE OF VOICE
TONE OF VOICE CONTROL
SAMPLES FROM SUPPORT CHATS
Source text (English) | Machine Translation (German) | Comment
Can you share your screen? | Können Sie Ihren Bildschirm freigeben? | FORMAL
Could you help me? | Kannst du mir helfen? | INFORMAL
Make sure you report any of these issues. | Stellen Sie sicher, dass Sie eines dieser Probleme melden. | FORMAL
Can you give an example? | Kannst du ein Beispiel geben? | INFORMAL
Formal vs.
Informal
—
Crucial for Live
Chats
—
Baseline MT
engines are not
consistent
TONE OF VOICE CONTROL
DEFAULT MT OUTPUT
English to German
—
210 segments
—
stock models
[Chart: MT engines A to G]
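Measuring the default tone of voice of an engine could look roughly like the sketch below. It is a heuristic only, assuming that formal German shows up as capitalized "Sie"/"Ihr" forms and informal German as "du"/"dein" forms; the pronoun lists are assumptions and this is not the evaluation method behind the 210-segment comparison.

```python
import re
from collections import Counter

# Heuristic pronoun lists (assumption, not exhaustive).
INFORMAL = {"du", "dich", "dir", "dein", "deine", "deinen", "deinem", "deiner", "deines"}
FORMAL = {"Sie", "Ihnen", "Ihr", "Ihre", "Ihren", "Ihrem", "Ihrer", "Ihres"}


def classify_tone(translation: str) -> str:
    """Return 'formal', 'informal', or 'unknown' for one German MT output."""
    tokens = re.findall(r"\w+", translation)
    # Skip the first token for the formal check: a sentence-initial "Sie" or
    # "Ihr" is capitalized anyway and may mean "she", "they", or "her".
    has_formal = any(tok in FORMAL for tok in tokens[1:])
    has_informal = any(tok.lower() in INFORMAL for tok in tokens)
    if has_formal and not has_informal:
        return "formal"
    if has_informal and not has_formal:
        return "informal"
    return "unknown"


samples = [
    "Können Sie Ihren Bildschirm freigeben?",
    "Kannst du mir helfen?",
    "Stellen Sie sicher, dass Sie eines dieser Probleme melden.",
    "Kannst du ein Beispiel geben?",
]
print(Counter(classify_tone(s) for s in samples))
```

The heuristic treats multi-sentence segments as one sentence, so a "Sie" that opens a later sentence would be counted as formal; segments like that need a smarter check.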
TONE OF VOICE CONTROL
HOW TO MAKE IT INFORMAL?
Option 1: Use DeepL with formality=less (99.5% accuracy). What if you need a custom model and terminology, or another MT has better linguistic quality for you?
Option 2: Generate synthetic training data, hoping translations become more informal. Expensive and time-consuming, also introduces bias into the model.
Option 3: MT-agnostic NLP. Works to a certain extent, provides a wider choice of MT engines.
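For Option 1, the request to DeepL could be assembled along these lines. This is a sketch, not an official client: the parameter names (`text`, `source_lang`, `target_lang`, `formality`) and the allowed formality values follow DeepL's public API documentation as understood here and should be verified against the current version. The helper `build_translate_params` is a hypothetical name, and authentication and the HTTP call are deliberately left out.

```python
# Formality values assumed from DeepL's public API documentation.
ALLOWED_FORMALITY = {"default", "more", "less", "prefer_more", "prefer_less"}


def build_translate_params(text, target_lang, formality="default", source_lang=None):
    """Build the parameter dict for a DeepL translate request (no network call)."""
    if formality not in ALLOWED_FORMALITY:
        raise ValueError(f"unsupported formality: {formality!r}")
    params = {
        "text": [text],
        "target_lang": target_lang,
        "formality": formality,
    }
    if source_lang is not None:
        params["source_lang"] = source_lang
    return params


params = build_translate_params(
    "Could you help me?", "DE", formality="less", source_lang="EN"
)
print(params)
```

Sending these parameters still requires an API key and an HTTP client, and the setting only helps for engines and language pairs that support it, which is the limitation the slide points out.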
TONE OF VOICE CONTROL
MT-AGNOSTIC ADJUSTMENT
English to German
—
210 segments
—
stock models
—
let’s make it more
INFORMAL
[Chart: MT engines A to G]
TONE OF VOICE CONTROL
MT-AGNOSTIC ADJUSTMENT
English to German
—
210 segments
—
stock models
—
let’s make it more
FORMAL
[Chart: MT engines A to G]
PRIVACY PROTECTION
DATA PROTECTION LAWS
[Map: data protection laws worldwide, regions A to F]
According to DLA Piper https://www.dlapiperdataprotection.com/index.html?t=world-map as fetched on 2020-10-20
CLOUD MT DEPLOYMENTS
[Chart: cloud deployment regions A to F for each MT vendor]
Alibaba
Amazon
Baidu
DeepL
Globalese*
Google
GTCom*
IBM
Microsoft
Mirai
ModernMT*
Naver
NiuTrans*
PROMT
Rozetta
SDL*
Sogou
Systran*
Tencent
Tilde*
Yandex
Youdao
* On-premise and private cloud deployment available
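The vendor list can be kept as a small lookup so that engines with an on-premise or private cloud option are easy to shortlist. The sketch below only encodes the asterisks from the list above (as of October 2020); the dictionary and function names are hypothetical.

```python
# True marks vendors shown with an asterisk in the list above
# (on-premise and private cloud deployment available).
MT_VENDORS = {
    "Alibaba": False, "Amazon": False, "Baidu": False, "DeepL": False,
    "Globalese": True, "Google": False, "GTCom": True, "IBM": False,
    "Microsoft": False, "Mirai": False, "ModernMT": True, "Naver": False,
    "NiuTrans": True, "PROMT": False, "Rozetta": False, "SDL": True,
    "Sogou": False, "Systran": True, "Tencent": False, "Tilde": True,
    "Yandex": False, "Youdao": False,
}


def private_deployment_vendors(vendors):
    """Return vendors offering on-premise or private cloud, in listed order."""
    return [name for name, private in vendors.items() if private]


print(private_deployment_vendors(MT_VENDORS))
```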
DATA AND PRIVACY PROTECTION
Communication may contain PII and healthcare, HR, or financial data.
—
Option 1:
- select proper MT vendor for every region
- when in doubt, use private-cloud deployments
—
Option 2:
- proper DPA and data protection clauses + insurance
—
Option 3: pseudonymization to remove PII
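Pseudonymization (Option 3) can be illustrated with a minimal sketch that masks e-mail addresses and phone numbers before text is sent to an MT engine and restores them afterwards. The regular expressions and placeholder format are assumptions chosen for illustration; they do not cover names, addresses, or IDs, for which a named-entity recognizer would be needed, and an MT engine may alter the placeholders.

```python
import re

# Simplified patterns (assumption: good enough for a demo, not for production).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def pseudonymize(text):
    """Replace e-mails and phone numbers with placeholders.

    Returns the masked text and a mapping placeholder -> original value.
    """
    mapping = {}

    def mask(kind):
        def _sub(match):
            placeholder = f"[{kind}_{len(mapping)}]"
            mapping[placeholder] = match.group(0)
            return placeholder
        return _sub

    masked = EMAIL_RE.sub(mask("EMAIL"), text)
    masked = PHONE_RE.sub(mask("PHONE"), masked)
    return masked, mapping


def restore(text, mapping):
    """Put the original values back into translated text."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


source = "Contact jane.doe@example.com or +1 510 555 0100 today."
masked, mapping = pseudonymize(source)
print(masked)
print(restore(masked, mapping))
```

Only the masked text leaves the company perimeter; the mapping stays local and is applied to the translation that comes back.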
KEY TAKEAWAYS
Machine translation is becoming ubiquitous. It is turning into
something more like a coworker than a tool.
—
When it’s biased, it may damage our work environment and
culture.
—
As of today, MT output mostly defaults to masculine gender and is
quite inconsistent in tone of voice.
—
It’s possible to dodge those biases using NLP paired with MT.
—
Make sure you know where your MT sits so that you stay
compliant.
THANKS!
Konstantin Savenkov, CEO
ks@inten.to
2150 Shattuck Ave
Berkeley CA 94705
INTENTO
https://inten.to

"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 

Dodging AI biases in future-proof Machine Translation solutions

  • 1. INTENTO Konstantin Savenkov Intento CEO Dodging AI Biases in Future-Proof Machine Translation Solutions © Intento, Inc. / October 2020 GlobalSaké
  • 2. AGENDA: Some context on Intento; Future of Work, AI and company culture; Using MT for multilingual communication; Case Study 1: Gender Bias; Case Study 2: Tone of Voice; Case Study 3: Data Locality; Key Takeaways
  • 3. SOME CONTEXT ON INTENTO
  • 4. ENTERPRISES MASSIVELY FAIL TO ADOPT AI. 20%*. Wrong vendor selected; failed integrations; failed pilots; failed to deliver ROI. * Share of US companies with successful AI deployment (Deloitte State of Cognitive Survey 2017). © Intento, Inc. / September 2020
  • 5. BRIDGING THE GAP BETWEEN MT CAPABILITIES AND ADOPTION: MT Procurement. MT Need, MT Systems. Localization
  • 6. BRIDGING THE GAP BETWEEN MT CAPABILITIES AND ADOPTION: MT Procurement. MT Need, MT Systems. Localization
  • 7. BRIDGING THE GAP BETWEEN MT CAPABILITIES AND ADOPTION: MT Procurement; MT Curation. MT Need, MT Systems. Localization
  • 8. BRIDGING THE GAP BETWEEN MT CAPABILITIES AND ADOPTION: MT Procurement; MT Curation; Multi-Engine MT. MT Need, MT Systems. Localization
  • 9. BRIDGING THE GAP BETWEEN MT CAPABILITIES AND ADOPTION: MT Procurement; MT Curation; Multi-Engine MT; Multi-Purpose MT. MT Need, MT Systems. Localization, Customer Service, Office Productivity, Global Community
  • 10. BRIDGING THE GAP BETWEEN MT CAPABILITIES AND ADOPTION: MT Procurement; MT Curation; Multi-Engine MT; Multi-Purpose MT; Continuous Improvement. MT Need, MT Systems. Localization, Customer Service, Office Productivity, Global Community
  • 11. FUTURE OF WORK, AI, AND COMPANY CULTURE
  • 12. FUTURE OF WORK AND AI: FROM TOOLS TO COLLEAGUES. AI is more than just a tool: experience (pre-training); test assignments (evaluations on your data); onboarding (domain adaptation); continuous learning
  • 13. RETHINKING TEAMS: Team as a cooperation of cognitive models, both human and artificial
  • 14. AI AND COMPANY CULTURE: Can we afford to have only a part of the company aligned with its culture and values?
  • 15. MT FOR MULTILINGUAL COMMUNICATION
  • 16. MT ADOPTION BOTTLENECKS: technical (integration); linguistic (domain adaptation); economic (supply chain); cultural (biases); security & legal (privacy)
  • 17. MT FOR COMMUNICATION: INTENTO MT HUB
  • 18. GETTING COMMUNICATION RIGHT: pre-moderation is not feasible; right communication = right culture
  • 19. GETTING COMMUNICATION RIGHT: pre-moderation is not feasible; right communication = right culture; adding MT to the mix
  • 20. THINGS TO LOOK AFTER: Gender; Tone of Voice; Privacy
  • 21. WORKING AROUND THE GENDER BIAS
  • 22. GENDER BIAS IN MACHINE TRANSLATION: Gender Bias in MT as evaluated in WinoMT Challenge [1]; carefully measures the bias; in practice, other cases create more issues (see next slide). [1] “Evaluating Gender Bias in Machine Translation”, Gabriel Stanovsky, Noah A. Smith, Luke Zettlemoyer, 2019, https://arxiv.org/abs/1906.00591
  • 23. GENDER BIAS IN COMMUNICATION
      Source text (English) | Machine Translation (French) | Comment
      Are you ready? | Es-tu prêt? | MASCULINE
      Are you ready? | Es-tu prête? | FEMININE
      Are you surprised? | Tu es surpris? | MASCULINE
      Are you surprised? | Tu es surprise? | FEMININE
      Lack of context; defaults to either feminine or masculine; baseline MT engines are not consistent
  • 24. GENDER BIAS: MOSTLY MASCULINE BY DEFAULT. English to French; 31 segments; stock models; mostly masculine. [Chart: default gender distribution, MT engines A to G]
  • 25. GENDER BIAS CONTROL: HOW TO FIX IT?
      Option 1: Copy & paste from Google Translate Web App (supports gender control). Not for French, not secure, cumbersome, no customization.
      Option 2: Use long phrases, adding some context. You can instruct support operators, but not employees and or.
      Option 3: MT-agnostic NLP. Works to a certain extent, provides a wider choice of MT engines.
  • 26. GENDER CONTROL: ADJUST TO FEMININE. English to French; 31 segments; stock models; let’s make it more FEMININE. [Chart: gender adjustment => feminine, MT engines A to G]
  • 27. GENDER CONTROL: ADJUST TO MASCULINE. English to French; 31 segments; stock models; let’s make it more MASCULINE. [Chart: gender adjustment => masculine, MT engines A to G]
  • 28. CONTROLLING TONE OF VOICE
  • 29. TONE OF VOICE CONTROL: SAMPLES FROM SUPPORT CHATS
      Source text (English) | Machine Translation (German) | Comment
      Can you share your screen? | Können Sie Ihren Bildschirm freigeben? | FORMAL
      Could you help me? | Kannst du mir helfen? | INFORMAL
      Make sure you report any of these issues. | Stellen Sie sicher, dass Sie eines dieser Probleme melden. | FORMAL
      Can you give an example? | Kannst du ein Beispiel geben? | INFORMAL
      Formal vs. informal; crucial for live chats; baseline MT engines are not consistent
  • 30. TONE OF VOICE CONTROL: DEFAULT MT OUTPUT. English to German; 210 segments; stock models. [Chart: MT engines A to G]
  • 31. TONE OF VOICE CONTROL: HOW TO MAKE IT INFORMAL?
      Option 1: Use DeepL with formality=less (99.5% accuracy). What if you need a custom model and terminology, or another MT has better linguistic quality for you?
      Option 2: Generate synthetic training data, hoping translations become more informal. Expensive and time-consuming, also introduces bias into the model.
      Option 3: MT-agnostic NLP. Works to a certain extent, provides a wider choice of MT engines.
  • 32. TONE OF VOICE CONTROL: MT-AGNOSTIC ADJUSTMENT. English to German; 210 segments; stock models; let’s make it more INFORMAL. [Chart: MT engines A to G]
  • 33. TONE OF VOICE CONTROL: MT-AGNOSTIC ADJUSTMENT. English to German; 210 segments; stock models; let’s make it more FORMAL. [Chart: MT engines A to G]
  • 34. PRIVACY PROTECTION
  • 35. DATA PROTECTION LAWS. [Figure: legend A to F] According to DLA Piper, https://www.dlapiperdataprotection.com/index.html?t=world-map as fetched on 2020-10-20
  • 36. CLOUD MT DEPLOYMENTS. [Figure: legend A to F] Alibaba, Amazon, Baidu, DeepL, Globalese*, Google, GTCom*, IBM, Microsoft, Mirai, ModernMT*, Naver, Niutrans*, PROMT, Rozetta, SDL*, Sogou, Systran*, Tencent, Tilde*, Yandex, Youdao. * On-premise and private cloud deployment available
  • 37. DATA AND PRIVACY PROTECTION: Communication may contain PII, healthcare, HR, financial data.
      Option 1: select proper MT vendor for every region; when in doubt, use private-cloud deployments
      Option 2: proper DPA and data protection clauses + insurance
      Option 3: pseudonymization to remove PII
  • 38. KEY TAKEAWAYS: Machine Translation becomes more and more ubiquitous. It becomes more like our coworker than a tool. When it’s biased, it may damage our work environment and culture. As of today, it’s mostly masculine by gender and quite inconsistent by tone of voice. It’s possible to dodge those biases using NLP paired with MT. Make sure you know where your MT sits so that you stay compliant.
  • 39. THANKS! Konstantin Savenkov, CEO, ks@inten.to. INTENTO, 2150 Shattuck Ave, Berkeley CA 94705, https://inten.to
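
The "MT-agnostic NLP" option on slides 31 to 33 and the pseudonymization option on slide 37 both amount to processing text around an MT engine rather than inside it. The sketch below is a rough illustration of that idea, not Intento's implementation. Every name in it (`formality_of`, `tally`, `pseudonymize`, `restore`, the `[[PII0]]` placeholder format) is hypothetical. The formality check is a simple pronoun heuristic for German output, and the pseudonymizer handles only e-mail addresses. It also assumes the MT engine passes placeholders through unchanged, which would need to be verified per engine.

```python
import re
from collections import Counter

# Heuristic assumption: formal German uses capitalized Sie/Ihnen/Ihr... in
# mid-sentence position; informal German uses du/dich/dir/dein...
# The lookbehind skips sentence-initial "Sie", which may mean "she" or "they".
FORMAL = re.compile(r"(?<=[a-zäöüß,] )(Sie|Ihnen|Ihre?[mnrs]?)\b")
INFORMAL = re.compile(r"\b(du|dich|dir|dein\w*)\b", re.IGNORECASE)

# Deliberately narrow e-mail pattern; real PII detection needs far more.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def formality_of(german: str) -> str:
    """Label one German segment as 'formal', 'informal' or 'unknown'."""
    formal = bool(FORMAL.search(german))
    informal = bool(INFORMAL.search(german))
    if formal and not informal:
        return "formal"
    if informal and not formal:
        return "informal"
    return "unknown"


def tally(segments: list[str]) -> Counter:
    """Count formality labels over a list of MT outputs for one engine."""
    return Counter(formality_of(s) for s in segments)


def pseudonymize(text: str) -> tuple[str, dict[str, str]]:
    """Replace e-mail addresses with placeholders before text leaves for MT."""
    mapping: dict[str, str] = {}

    def swap(match: re.Match) -> str:
        token = f"[[PII{len(mapping)}]]"
        mapping[token] = match.group(0)
        return token

    return EMAIL.sub(swap, text), mapping


def restore(text: str, mapping: dict[str, str]) -> str:
    """Put the original values back into the translated text."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


if __name__ == "__main__":
    # German samples taken from slide 29.
    samples = [
        "Können Sie Ihren Bildschirm freigeben?",
        "Kannst du mir helfen?",
    ]
    print(tally(samples))

    masked, mapping = pseudonymize("Write to ks@inten.to today.")
    print(masked)
    # A real MT call would go here; this string stands in for its output.
    print(restore("Schreib an [[PII0]] heute.", mapping))
```

Running `tally` over each engine's output for the same segments would give per-engine counts comparable in spirit to the charts on slides 30, 32 and 33, within the limits of the heuristic.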