Predicting machine translation quality
I am @bittlingmayer.
My company is @SignalNLabs
interests: translation quality, translation crowdsourcing, transliteration, browser translation integrations, topic classification, automatic source-side correction
previously @Google, @Adobe, @Cerner
Ciao!
Today’s topics
◉ Why translation quality?
◉ What is the problem?
◉ Our data model
◉ Our learning infra
Quality estimation?
sentence-level quality
good machine translation vs bad
1
Quality evaluation?
corpus-level quality given reference translations
machine translation vs human translation
2
Why quality?
Why is predicting quality useful?
Machine translation should not be a gamble.
$4.50
1M chars by machine
Optimisation Function
$10000
1M chars at 5¢/word by human
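The two price tags can be reproduced with a few lines of arithmetic. This is a minimal sketch: the figure of 5 characters per word is an assumption inferred from the slide's own numbers (1M chars at 5¢/word only comes to $10000 if a word averages 5 chars), and the function names are made up for illustration.

```python
# Cost comparison from the slide. All names here are illustrative.
MACHINE_USD_PER_MILLION_CHARS = 4.50  # from the slide
HUMAN_CENTS_PER_WORD = 5              # from the slide
CHARS_PER_WORD = 5                    # assumption implied by the slide's $10000


def machine_cost_usd(chars: int) -> float:
    """Machine translation cost, priced per million characters."""
    return chars / 1_000_000 * MACHINE_USD_PER_MILLION_CHARS


def human_cost_usd(chars: int) -> float:
    """Human translation cost, priced per word."""
    words = chars // CHARS_PER_WORD
    return words * HUMAN_CENTS_PER_WORD / 100


if __name__ == "__main__":
    chars = 1_000_000
    m = machine_cost_usd(chars)
    h = human_cost_usd(chars)
    print(f"machine: ${m:.2f}  human: ${h:.2f}  ratio: {h / m:.0f}x")
```

Under these assumptions the human price is roughly three orders of magnitude above the machine price, which is the gap a quality predictor is meant to help navigate.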
Perfect Prediction == Perfect Translation
translator
predictor
reward [scores, rankings]
state
action [translations]
Reinforcement Learning
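One way to read the diagram: the predictor's score is the reward for the translator's action. The simplest use of that reward needs no learning at all, just picking the highest-scoring candidate. The sketch below shows that best-of-N selection. `toy_predictor` is a hypothetical stand-in that only penalises empty or copied output; it is not the predictor described in this talk, and the candidate strings are invented examples.

```python
from typing import Callable, Sequence, Tuple


def toy_predictor(source: str, target: str) -> float:
    """Hypothetical stand-in: 0.0 for empty or untranslated output, else 1.0."""
    if not target.strip() or target.strip() == source.strip():
        return 0.0
    return 1.0


def pick_best(
    source: str,
    candidates: Sequence[str],
    predictor: Callable[[str, str], float],
) -> Tuple[str, float]:
    """Return the candidate translation with the highest predicted quality."""
    scored = [(predictor(source, c), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


if __name__ == "__main__":
    src = "The car is driving."
    outputs = ["The car is driving.", "Автомобиль едет."]  # invented candidates
    print(pick_best(src, outputs, toy_predictor))
```

With a perfect predictor, this selection step alone would always return the best available translation, which is the sense in which perfect prediction equals perfect translation.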
What’s the problem?
Is it really harder than self-driving cars?
Language is hard.
Context.
Data are dirty.
Bridging.
What is solvable?
[Chart: payoff vs effort]
bad input
50% of errors
context/customisation
like a human
like Search, FB, Maps...
source-side ambiguity
ideally interactive
bad output
What is quality?
Can we quantify the quality of a translation?
What is sentence-level quality?
[Chart: accuracy vs fluency]
Low Quality
Good Enough
Misleading
Human Quality
Recall vs Precision vs Accuracy
actual bad
predicted bad
Trivial 90% Accuracy Example
actual bad
predicted bad: 100%
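The trivial example can be checked by hand. Assuming, as the slide title implies, that 90% of the translations are actually bad, a predictor that labels everything bad is 90% accurate while telling us nothing about the good ones. A minimal sketch with made-up labels:

```python
from typing import List, Optional, Tuple


def scores(
    actual: List[str], predicted: List[str], positive: str
) -> Tuple[float, Optional[float], Optional[float]]:
    """Return (accuracy, precision, recall), treating `positive` as the positive class.

    Precision or recall is None when its denominator is zero.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    accuracy = sum(a == p for a, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else None
    recall = tp / (tp + fn) if tp + fn else None
    return accuracy, precision, recall


# Made-up data: 90 bad translations, 10 good ones, everything predicted bad.
actual = ["bad"] * 90 + ["good"] * 10
predicted = ["bad"] * 100

print(scores(actual, predicted, "bad"))   # accuracy 0.9, precision 0.9, recall 1.0
print(scores(actual, predicted, "good"))  # accuracy 0.9, precision undefined, recall 0.0
```

Recall on the good class is zero, which is why accuracy alone is a poor benchmark on skewed data.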
How does quality vary?
to English to top languages to other
from English
from top languages
from other
How does quality vary?
Wikipedia
news
dialogues, film subtitles, Coursera, Medium
“everyday” reviews, customer service
your children’s WhatsApp messages
my WhatsApp messages
Other concepts of quality?
How do we solve it?
With data and features
What is our data model?
(language pair) | source | target | score
en-zh | Hello | 您好 | 1.0
en-zh | The car is driving. | The car is driving. | 0.0
en-ru | The car is driving. | Автомобиль вождения. | 0.3
... | ... | ... | ...
What is our data model?
(language pair) | source | target | src_length_bytes | ... | trg_spam_prob | score
en-zh | Hello | 您好 | 5 | ... | 0.5 | 1.0
en-zh | The car is driving. | The car is driving. | 19 | ... | 0.2 | 0.0
en-ru | The car is driving. | Автомобиль вождения. | 19 | ... | 0.1 | 0.3
... | ... | ... | ... | ... | ... | ...
10-1000 features
signals engineered by us
1000-10M rows
sentences* hand-scored by linguists
language-agnostic
Language is just another feature.
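A row of this table could be represented as below. This is a sketch, not necessarily the schema used in the talk: `Row`, `lang_pair`, `trg_length_bytes` and `is_untranslated` are names invented here, and byte lengths assume UTF-8. Signals such as `trg_spam_prob` would come from separate models and are left out.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Row:
    lang_pair: str  # e.g. "en-zh"; treated as just another feature
    source: str
    target: str
    score: float    # human score in [0.0, 1.0]


def features(row: Row) -> Dict[str, object]:
    """Compute a few language-agnostic features for one row."""
    return {
        "lang_pair": row.lang_pair,
        "src_length_bytes": len(row.source.encode("utf-8")),
        "trg_length_bytes": len(row.target.encode("utf-8")),
        # Hypothetical signal: output identical to the input
        # usually means nothing was translated.
        "is_untranslated": row.source.strip() == row.target.strip(),
    }


rows = [
    Row("en-zh", "Hello", "您好", 1.0),
    Row("en-zh", "The car is driving.", "The car is driving.", 0.0),
    Row("en-ru", "The car is driving.", "Автомобиль вождения.", 0.3),
]

for r in rows:
    print(features(r), "->", r.score)
```

Because the language pair is an ordinary column, the same feature code and the same model can serve every pair.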
Human scores
Evaluate many translations by hand
Human Evaluation Score Types
Labels
good/bad
0.0-1.0
multilabels
word-level labels
Ranking
rank multiple systems
Post-Edit
to comprehensible
to human quality
requires smaller dataset and budget
$0.001 / row @ 5x redundancy
QuEst baseline features
quest.dcs.shef.ac.uk/quest_files/features_blackbox_baseline_17
number of tokens in the source sentence
number of tokens in the target sentence
average source token length
LM probability of source sentence
LM probability of target sentence
number of occurrences of the target word within the target hypothesis (averaged for all words in the hypothesis - type/token ratio)
average number of translations per source word in the sentence (as given by IBM 1 table thresholded such that prob(t|s) > 0.2)
average number of translations per source word in the sentence (as given by IBM 1 table thresholded such that prob(t|s) > 0.01) weighted by the inverse frequency of each word in the source corpus
percentage of unigrams in quartile 1 of frequency (lower frequency words) in a corpus of the source language (SMT training corpus)
percentage of unigrams in quartile 4 of frequency (higher frequency words) in a corpus of the source language
percentage of bigrams in quartile 1 of frequency of source words in a corpus of the source language
percentage of bigrams in quartile 4 of frequency of source words in a corpus of the source language
percentage of trigrams in quartile 1 of frequency of source words in a corpus of the source language
percentage of trigrams in quartile 4 of frequency of source words in a corpus of the source language
percentage of unigrams in the source sentence seen in a corpus (SMT training corpus)
number of punctuation marks in the source sentence
number of punctuation marks in the target sentence
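A handful of these features need no external resources and can be sketched directly. The code below is an illustration, not the QuEst implementation: it tokenises on whitespace and counts punctuation by Unicode category, both simplifying assumptions, and it skips every feature that needs a language model, an IBM Model 1 table or corpus frequency counts. The feature names are invented for this sketch.

```python
import unicodedata
from typing import Dict, List


def tokens(sentence: str) -> List[str]:
    """Whitespace tokenisation (an assumption; real toolkits use proper tokenisers)."""
    return sentence.split()


def count_punct(sentence: str) -> int:
    """Count characters whose Unicode general category is punctuation (P*)."""
    return sum(unicodedata.category(ch).startswith("P") for ch in sentence)


def surface_features(source: str, target: str) -> Dict[str, float]:
    src, trg = tokens(source), tokens(target)
    return {
        "src_tokens": len(src),
        "trg_tokens": len(trg),
        "avg_src_token_length": sum(map(len, src)) / len(src) if src else 0.0,
        # Average number of occurrences of each target word in the hypothesis,
        # i.e. tokens per distinct type.
        "trg_tokens_per_type": len(trg) / len(set(trg)) if trg else 0.0,
        "src_punct": count_punct(source),
        "trg_punct": count_punct(target),
    }


print(surface_features("The car is driving.", "Автомобиль вождения."))
```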
number of tokens
length
LM probability
number of occurrences of the target word within the target hypothesis
average number of translations per source word in the sentence
…
percentage of unigrams in quartile 1 of frequency (lower frequency words)
…
percentage of unigrams in quartile n of frequency (higher frequency words)
…
percentage of trigrams in quartile 1 of frequency of source words
…
percentage of trigrams in quartile n of frequency of source words
number of punctuation marks
bad input signals
vot tak narod ho4et napisat'
Возможно, вы имели в виду: вот так народ хочет написать ("Did you mean: вот так народ хочет написать")
human | vot tak narod ho4et napisat' | vot tak narod ho4et napisat'
search | вот так народ хочет написать | That's how people want to write
translation | Вот так народ хочет написать. | So people want to write.
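One cheap bad-input signal for the example above is a script check: text that is supposed to be Russian but contains no Cyrillic letters is probably transliterated. The sketch below is a hypothetical feature, not necessarily one used in this talk, and it relies on Unicode character names as a rough proxy for script.

```python
import unicodedata


def script_fraction(text: str, script: str) -> float:
    """Fraction of letters in `text` whose Unicode name starts with `script`."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    hits = sum(unicodedata.name(ch, "").startswith(script) for ch in letters)
    return hits / len(letters)


print(script_fraction("vot tak narod ho4et napisat'", "CYRILLIC"))  # 0.0
print(script_fraction("вот так народ хочет написать", "CYRILLIC"))  # 1.0
```

A low fraction for the expected script could trigger source-side correction before the text ever reaches the translation system.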
bad output signals
ambiguity signals
translation signals
(source) | Google | Microsoft | Wiktionary | ...
Merry Christmas | Krismasi! | Krismasi Njema! | heri ya Krismasi / Krismasi njema | ...
eat apples | kula mapera | kula apples | ∅ | ...
lexical signals
sygnały leksykalne
char signals
sygnały znaków
syntactic signals
parse tree to sequence conversion
sequence to sequence learning
cross-lingual signals
outside signals
context/customisation signals
Other signals?
50-99+% accuracy
Depends on the benchmark! ;-)
1000-10M rows
10-1000 features
Data augmentation?
Can we use parallel corpora?
target | source | score
Onartutako gertaerak | 받아 들여지는 사실 | 1.0
Aholkuak eta iradokizunak | 조언 및 제안 | 1.0
Etorkizuneko egitasmoei buruz galdetzea | 향후 계획에 대해 물어 | 1.0
onespena eskatzea | 승인 요청 | 1.0
Laguntza eskatzea | 도움을 요청 | 1.0
Jende galdetzea itxaron | 사람을 요구하는 대기 | 1.0
Norbait iritzia eskatzea | 누군가의 의견을 물어 | 1.0
Etorkizunari Garrantzia | 미래에 대한 태도 | 1.0
emanez informazio saihestea | 제공 정보 방지 | 1.0
Bad pertsona | 나쁜 사람들 | 1.0
... | ... | ...
Aditu batek ingelesez izatea | 영어 전문가 인 | 1.0
Being Lucky | 존재 럭키 | 1.0
zaharra izatea | 오래 되 | 1.0
pobrea izatea | 가난 | 1.0
ari irekietan | 안심되는 | 1.0
aberatsa izatea | 부자가되는 | 1.0
Ziur izatea / zenbait | 확인 인 / 특정 | 1.0
ari kezkaturik | 걱정되는 | 1.0
Aspergarria! | 지루한! | 1.0
Your Mind aldatzeak | 당신의 마음을 변경 | 1.0
Pertsonak txaloak Up | 사람을 응원합니다 | 1.0
What is our learning infra?
H2O.ai deeplearning
Do we need deep learning?
Why doesn’t deep learning work for translation?
Want to learn more?
The real experts
◉ Dr. Lucia Specia
◉ quest.dcs.shef.ac.uk
◉ statmt.org/wmt15/quality-estimation-task.html
ACL 2016 will be held in Berlin in August.
Reading
Any questions?
You can find me at
◉ @bittlingmayer
◉ adam@signaln.com
Thanks!

#5 Predicting Machine Translation Quality

  • 2. I am @bittlingmayer. My company is @SignalNLabs interests: translation quality, translation crowdsourcing, transliteration, browser translation integrations, topic classification, automatic source-side correction previously @Google, @Adobe, @Cerner Ciao!
  • 3. Today’s topics ◉ Why translation quality? ◉ What is the problem? ◉ Our data model ◉ Our learning infra
  • 4. Quality estimation? sentence-level quality good machine translation vs bad 1
  • 5. Quality evaluation? corpus-level quality given reference translations machine translation vs human translation 2
  • 6. Why quality? Why is predicting quality useful?
  • 8. $4.50 for 1M chars by machine. Optimisation Function. $10000 for 1M chars at 5¢/word by human.
  • 9. Perfect Prediction == Perfect Translation translator predictor reward [scores, rankings] state action [translations] Reinforcement Learning
  • 10. What’s the problem? Is it really harder than self-driving cars?
  • 16. Payoff What is solvable? Effort bad input 50% of errors context/customisation like a human like Search, FB, Maps...source-side ambiguity ideally interactive bad output
  • 17. What is quality? Can we quantify the quality of a translation?
  • 18. Accuracy What is sentence-level quality? Fluency Low Quality Good Enough Misleading Human Quality
  • 19. Recall vs Precision vs Accuracy actual bad predicted bad
  • 20. Trivial 90% Accuracy Example actual bad predicted bad: 100%
  • 21. How does quality vary? to English to top languages to other from English from top languages from other
  • 22. How does quality vary? Wikipedia news dialogues, film subtitles, Coursera, Medium “everyday” reviews, customer service your children’s WhatsApp messages my WhatsApp messages
  • 23. Other concepts of quality?
  • 24. How do we solve it? With data and features
  • 25. What is our data model?
        | source | target | score
      en-zh | Hello | 您好 | 1.0
      en-zh | The car is driving. | The car is driving. | 0.0
      en-ru | The car is driving. | Автомобиль вождения. | 0.3
      ... | ... | ... | ...
  • 26. What is our data model?
        | source | target | src_length_bytes | ... | trg_spam_prob | score
      en-zh | Hello | 您好 | 5 | ... | 0.5 | 1.0
      en-zh | The car is driving. | The car is driving. | 19 | ... | 0.2 | 0.0
      en-ru | The car is driving. | Автомобиль вождения. | 19 | ... | 0.1 | 0.3
      ... | ... | ... | ... | ... | ... | ...
  • 27. 10-1000 features: signals engineered by us. 1000-10M rows: sentences* hand-scored by linguists. Language-agnostic: language is just another feature.
  • 28. Human scores Evaluate many translations by hand
  • 29. Human Evaluation Score Types Labels good/bad multilabels word-level labels Ranking rank multiple systems Post-Edit to comprehensible to human quality
  • 30. Human Evaluation Score Types Labels good/bad 0.0-1.0 multilabels word-level labels Ranking rank multiple systems Post-Edit to comprehensible to human quality requires smaller dataset and budget
  • 31. $0.001 / row @ 5x redundancy
  • 33.
      ◉ number of tokens in the source sentence
      ◉ number of tokens in the target sentence
      ◉ average source token length
      ◉ LM probability of source sentence
      ◉ LM probability of target sentence
      ◉ number of occurrences of the target word within the target hypothesis (averaged for all words in the hypothesis - type/token ratio)
      ◉ average number of translations per source word in the sentence (as given by IBM 1 table thresholded such that prob(t|s) > 0.2)
      ◉ average number of translations per source word in the sentence (as given by IBM 1 table thresholded such that prob(t|s) > 0.01) weighted by the inverse frequency of each word in the source corpus
      ◉ percentage of unigrams in quartile 1 of frequency (lower frequency words) in a corpus of the source language (SMT training corpus)
      ◉ percentage of unigrams in quartile 4 of frequency (higher frequency words) in a corpus of the source language
      ◉ percentage of bigrams in quartile 1 of frequency of source words in a corpus of the source language
      ◉ percentage of bigrams in quartile 4 of frequency of source words in a corpus of the source language
      ◉ percentage of trigrams in quartile 1 of frequency of source words in a corpus of the source language
      ◉ percentage of trigrams in quartile 4 of frequency of source words in a corpus of the source language
      ◉ percentage of unigrams in the source sentence seen in a corpus (SMT training corpus)
      ◉ number of punctuation marks in the source sentence
      ◉ number of punctuation marks in the target sentence
  • 34.
      ◉ number of tokens
      ◉ length
      ◉ LM probability
      ◉ number of occurrences of the target word within the target hypothesis
      ◉ average number of translations per source word in the sentence
      ◉ …
      ◉ percentage of unigrams in quartile 1 of frequency (lower frequency words)
      ◉ …
      ◉ percentage of unigrams in quartile n of frequency (higher frequency words)
      ◉ …
      ◉ percentage of trigrams in quartile 1 of frequency of source words
      ◉ …
      ◉ percentage of trigrams in quartile n of frequency of source words
      ◉ number of punctuation marks
  • 36. vot tak narod ho4et napisat' Возможно, вы имели в виду ("Did you mean"): вот так народ хочет написать
  • 37.
      human | vot tak narod ho4et napisat' | vot tak narod ho4et napisat'
      search | вот так народ хочет написать | That's how people want to write
      translation | Вот так народ хочет написать. | So people want to write.
  • 42.
        | Google | Microsoft | Wiktionary | ...
      Merry Christmas | Krismasi! | Krismasi Njema! | heri ya Krismasi Krismasi njema | ...
      eat apples | kula mapera | kula apples | ∅ | ...
  • 46. parse tree to sequence conversion
  • 52. 50-99+% accuracy Depends on the benchmark! ;-) 1000-10M rows 10-1000 features
  • 54. Can we use parallel corpora?
      target: Onartutako gertaerak Aholkuak eta iradokizunak Etorkizuneko egitasmoei buruz galdetzea onespena eskatzea Laguntza eskatzea Jende galdetzea itxaron Norbait iritzia eskatzea Etorkizunari Garrantzia emanez informazio saihestea Bad pertsona … … ... Aditu batek ingelesez izatea Being Lucky zaharra izatea pobrea izatea ari irekietan aberatsa izatea Ziur izatea / zenbait ari kezkaturik Aspergarria! Your Mind aldatzeak Pertsonak txaloak Up
      source: 받아 들여지는 사실 조언 및 제안 향후 계획에 대해 물어 승인 요청 도움을 요청 사람을 요구하는 대기 누군가의 의견을 물어 미래에 대한 태도 제공 정보 방지 나쁜 사람들 … … ... 영어 전문가 인 존재 럭키 오래 되 가난 안심되는 부자가되는 확인 인 / 특정 걱정되는 지루한! 당신의 마음을 변경 사람을 응원합니다
      score: 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 … … ... 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
  • 55. What is our learning infra? H2O.ai deeplearning
  • 56. Do we need deep learning?
  • 58. Want to learn more? The real experts ◉ Dr. Lucia Specia ◉ quest.dcs.shef.ac.uk ◉ statmt.org/wmt15/quality-estimation-task.html ACL 2016 will be held in Berlin in August. Reading
  • 59. Any questions? You can find me at ◉ @bittlingmayer ◉ adam@signaln.com Thanks!
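
Slides 19 and 20 contrast recall, precision and accuracy with a trivial predictor that labels every translation as bad. The sketch below works through that case in plain Python. It assumes that 90 out of 100 sentences are actually bad, which is what the slide's 90% figure implies but does not state, and the function name `metrics` is purely illustrative.

```python
def metrics(actual_bad, predicted_bad):
    """Compute metrics where the positive class is 'bad translation'."""
    pairs = list(zip(actual_bad, predicted_bad))
    tp = sum(1 for a, p in pairs if a and p)          # bad, flagged bad
    fp = sum(1 for a, p in pairs if not a and p)      # good, flagged bad
    fn = sum(1 for a, p in pairs if a and not p)      # bad, missed
    tn = sum(1 for a, p in pairs if not a and not p)  # good, kept
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        # Share of the good translations that the predictor lets through.
        "good_recall": tn / (tn + fp) if tn + fp else 0.0,
    }


# Assumption: 90 of 100 sentences are actually bad.
actual = [True] * 90 + [False] * 10
# Trivial predictor: flag 100% of sentences as bad.
always_bad = [True] * 100

result = metrics(actual, always_bad)
print(result)
```

Under these assumptions the trivial predictor reaches 0.9 accuracy and 1.0 recall on bad translations, yet it never lets a single good translation through (`good_recall` is 0.0), which is why accuracy alone says little about a quality predictor.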
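
The data model on slide 26 and the feature list on slide 33 can be illustrated with a few of the simpler signals. This is a minimal sketch, not the author's pipeline: whitespace tokenisation and Unicode punctuation categories are assumptions, the function names are hypothetical, and features that need external resources (LM probabilities, IBM 1 tables, corpus frequency quartiles, `trg_spam_prob`) are left out.

```python
import unicodedata


def tokenize(sentence):
    # Assumption: whitespace tokenisation. It does not segment Chinese,
    # so "您好" counts as a single token here.
    return sentence.split()


def count_punctuation(sentence):
    # Unicode general categories starting with "P" are punctuation.
    return sum(1 for ch in sentence if unicodedata.category(ch).startswith("P"))


def extract_features(source, target):
    src_tokens = tokenize(source)
    trg_tokens = tokenize(target)
    avg_len = sum(len(t) for t in src_tokens) / len(src_tokens) if src_tokens else 0.0
    return {
        "src_length_bytes": len(source.encode("utf-8")),
        "src_num_tokens": len(src_tokens),
        "trg_num_tokens": len(trg_tokens),
        "src_avg_token_length": avg_len,
        "src_num_punct": count_punctuation(source),
        "trg_num_punct": count_punctuation(target),
    }


# Rows taken from the data-model slides.
rows = [
    ("en-zh", "Hello", "您好"),
    ("en-zh", "The car is driving.", "The car is driving."),
    ("en-ru", "The car is driving.", "Автомобиль вождения."),
]

for pair, source, target in rows:
    print(pair, extract_features(source, target))
```

The `src_length_bytes` values come out as 5 and 19, matching the column shown on slide 26. Because the language pair is just another column, the same function runs unchanged across pairs.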
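
Slides 36 and 37 show informal Latin-script Russian being corrected to Cyrillic before translation. A toy version of that source-side correction is sketched below. The mapping table is an assumption that covers only the characters in the slide's example. A real system would need digraphs, casing and ambiguity handling, which are not attempted here.

```python
# Hypothetical character map, limited to the letters in the slide's example.
TRANSLIT = {
    "v": "в", "o": "о", "t": "т", "a": "а", "k": "к",
    "n": "н", "r": "р", "d": "д", "h": "х", "4": "ч",
    "e": "е", "p": "п", "i": "и", "s": "с", "'": "ь",
}


def to_cyrillic(text):
    # Replace each known character; leave anything else (e.g. spaces) as is.
    return "".join(TRANSLIT.get(ch, ch) for ch in text)


corrected = to_cyrillic("vot tak narod ho4et napisat'")
print(corrected)
```

Feeding the corrected string to a translation system, rather than the raw Latin-script input, is the kind of source-side fix the slides point at.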