Machine Translation
Quality Estimation with
QuEst++
Carolina Scarton – c.scarton@sheffield.ac.uk
University of Sheffield
Quality Estimation of Machine Translation

Predict quality of unseen data

Only a few labelled data points for training

Only uses information from source and target → no references!

Reduce post-editing and revision time → minimize costs!

Spotting errors

Estimate post-editing effort and time
QuEst++

Framework for QE

Word-level, sentence-level and document-level

Feature extraction module → features from source and target, plus information from the MT system

Machine learning module → uses the features to build a QE model

QE model → can predict the quality of unseen data
Training scenario – annotated data
[Diagram: source documents (training), target documents (training)]
Example: post-editing effort (1 to 5) – training data (with annotation)

Source: Barack Obama becomes the fourth American president to receive the Nobel Peace Prize
MT: Barack Obama se convierte en el cuarto presidente estadounidense para recibir el Premio Nobel de la Paz
Score: 4.5

Source: The presidential couple then has a meeting scheduled with King Harald V and Queen Sonja of Norway.
MT: La pareja presidencial entonces tiene una reunión programada con el Rey Harald V y Reina Sonja de Noruega.
Score: 4.0

Source: Although, at the cost of the state falling deeper into debt – next year the treasury won't just be 163 billion short, but even more.
MT: Aunque, a costa del estado cayendo más – el año que viene el Tesoro no sólo se 163 millones de corto, pero aún más.
Score: 2.0

Source: Transformer worth tens of millions of crowns burns in Louny region
MT: La pena de transformadores decenas de millones de coronas Louny quemaduras en la región
Score: 1.5
Training scenario – annotated data
[Diagram: source documents (training), target documents (training), MT system, feature extractor]
External resources/tools:
- SRILM
- GIZA++ tables
- TreeTagger
- StanfordParser
Information from the MT system that translated the documents (if available)
Training scenario – annotated data
[Diagram: source documents (training), target documents (training), MT system, feature extractor, features for QE]
Black-box:
- number of tokens
- number of punctuation marks
- LM perplexity
- n-gram counts
- POS counts
- syntactic tree
- lexical cohesion
Glass-box:
- n-best list information
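The simplest black-box features listed above (number of tokens, number of punctuation marks) can be sketched in a few lines. This is only an illustration, not QuEst++'s own extractor: the function name `black_box_features`, whitespace tokenisation, and ASCII-only punctuation counting are all assumptions made here.

```python
import string
from typing import List


def black_box_features(source: str, target: str) -> List[float]:
    """Return a small black-box feature vector for one sentence pair.

    Hypothetical helper for illustration; QuEst++ has its own extractor.
    """
    # Whitespace tokenisation is an assumption; a real system would use
    # a proper tokeniser for each language.
    src_tokens = source.split()
    tgt_tokens = target.split()
    # Count ASCII punctuation characters only.
    src_punct = sum(ch in string.punctuation for ch in source)
    tgt_punct = sum(ch in string.punctuation for ch in target)
    return [
        float(len(src_tokens)),
        float(len(tgt_tokens)),
        float(src_punct),
        float(tgt_punct),
    ]


features = black_box_features(
    "Some stories are about honor and bravery.",
    "Algunas historias son de honor y valentía.",
)
print(features)  # [7.0, 7.0, 1.0, 1.0]
```

Features such as LM perplexity or POS counts need the external resources named earlier (a language model, a tagger), so they are left out of this sketch.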
Training scenario – annotated data
[Diagram: features for QE, quality labels, QE model training, QE model]
Quality labels:
- Post-editing effort
- Post-editing time
- HTER
- BLEU
- ...
ML algorithms:
- SVC
- SVR
- CRF
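Training pairs each feature vector with its quality label and fits a model to them. The sketch below shows that shape under loud assumptions: it swaps in a 1-nearest-neighbour regressor as a stand-in for SVR so that it runs with the standard library only, the class name `NearestNeighbourQE` is hypothetical, and the feature values are invented for illustration. Only the four scores come from the training example in the slides.

```python
import math
from typing import List, Sequence


class NearestNeighbourQE:
    """Stand-in for SVR: predicts the label of the closest training row."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[float]) -> "NearestNeighbourQE":
        self.X = [list(row) for row in X]
        self.y = list(y)
        return self

    def predict(self, X: Sequence[Sequence[float]]) -> List[float]:
        predictions = []
        for row in X:
            # Euclidean distance from this row to every training row.
            distances = [math.dist(row, train_row) for train_row in self.X]
            predictions.append(self.y[distances.index(min(distances))])
        return predictions


# Invented two-dimensional feature vectors (illustration only).
X_train = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.1, 0.9]]
# Post-editing effort scores from the training example (1 to 5).
y_train = [4.5, 4.0, 2.0, 1.5]

model = NearestNeighbourQE().fit(X_train, y_train)
predictions = model.predict([[0.88, 0.12], [0.15, 0.85]])
print(predictions)  # [4.5, 1.5]
```

A regressor suits continuous labels such as post-editing effort or HTER; a classifier or a sequence labeller, like the SVC and CRF options listed above, would take its place for discrete or word-level labels.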
Predicting labels for unseen data
[Diagram: source documents (unseen data), target documents (unseen data)]
Example: post-editing effort (1 to 5) – unseen data

Source: Mass Slaughter on a Personal Level
MT: El sacrificio masivo a nivel personal
Score: ?

Source: People begin to ask why their leaders are making them fight.
MT: La gente empiece a preguntar por qué sus líderes están haciendo ellos lucha.
Score: ?

Source: As the community affairs officers moved into the park in their light-blue windbreakers, many protesters simply gathered their belongings and left.
MT: Asuntos como la Comunidad oficiales movido en el parque en su luz-azul windbreakers, muchos manifestantes simplemente se reunieron sus pertenencias y de la izquierda.
Score: ?

Source: Some stories are about honor and bravery.
MT: Algunas historias son de honor y valentía.
Score: ?
Predicting labels for unseen data
[Diagram: source documents (unseen data), target documents (unseen data), MT system, feature extractor, features for QE]
Features are the same as the ones extracted at training time
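One way to guarantee that unseen data gets the same features as the training data is to define the feature list once and reuse it for both. The sketch below assumes this design; `FEATURES` and `extract` are hypothetical names, the two features are plain whitespace token counts, and the sentence pairs are taken from the examples in the slides.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]  # (source sentence, MT output)
FeatureFn = Callable[[str, str], float]

# Single shared feature definition (hypothetical, illustration only).
FEATURES: List[FeatureFn] = [
    lambda src, tgt: float(len(src.split())),  # source token count
    lambda src, tgt: float(len(tgt.split())),  # target token count
]


def extract(pairs: Sequence[Pair], feature_fns: Sequence[FeatureFn]) -> List[List[float]]:
    """Apply every feature function to every sentence pair, in order."""
    return [[fn(src, tgt) for fn in feature_fns] for src, tgt in pairs]


train_pairs = [(
    "Transformer worth tens of millions of crowns burns in Louny region",
    "La pena de transformadores decenas de millones de coronas Louny quemaduras en la región",
)]
unseen_pairs = [(
    "Some stories are about honor and bravery.",
    "Algunas historias son de honor y valentía.",
)]

X_train = extract(train_pairs, FEATURES)
X_unseen = extract(unseen_pairs, FEATURES)

# Same feature list, so both matrices have the same columns in the same order.
assert len(X_train[0]) == len(X_unseen[0])
```

Because both calls go through the same `FEATURES` list, adding or reordering a feature changes the training and prediction vectors together.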
Predicting labels for unseen data
[Diagram: source documents (unseen data), target documents (unseen data), MT system, feature extractor, features for QE, QE model]
The QE model can predict labels for the new data → ML magic!
Predicting labels for unseen data
Example: post-editing effort (1 to 5) – unseen data → predictions!

Source: Mass Slaughter on a Personal Level
MT: El sacrificio masivo a nivel personal
Predicted score: 4.5

Source: People begin to ask why their leaders are making them fight.
MT: La gente empiece a preguntar por qué sus líderes están haciendo ellos lucha.
Predicted score: 3.0

Source: As the community affairs officers moved into the park in their light-blue windbreakers, many protesters simply gathered their belongings and left.
MT: Asuntos como la Comunidad oficiales movido en el parque en su luz-azul windbreakers, muchos manifestantes simplemente se reunieron sus pertenencias y de la izquierda.
Predicted score: 1.8

Source: Some stories are about honor and bravery.
MT: Algunas historias son de honor y valentía.
Predicted score: 4.5
Thank you!
QuEst++ download: http://www.quest.dcs.shef.ac.uk/

Carolina Scarton - ESR 7 - USFD

  • 1. Machine Translation Quality Estimation with QuEst++
      Carolina Scarton – c.scarton@sheffield.ac.uk
      University of Sheffield
  • 5. Quality Estimation of Machine Translation
      - Predict quality of unseen data
      - Only few labelled data points for training
      - Only uses information from source and target → no references!
      - Reduce post-editing and revision time → minimize costs!
      - Spotting errors
      - Estimate post-editing effort and time
  • 8. QuEst++
      - Framework for QE
      - Word-level, sentence-level and document-level
      - Feature extraction module → features from source and target plus information from MT system
      - Machine learning module → use the features for building a QE model
      - QE model → can predict the quality of unseen data
  • 9. Training scenario – annotated data
      Source documents (training), target documents (training)
      Example: post-editing effort (1 to 5) – training data (with annotation)
      - Source: Barack Obama becomes the fourth American president to receive the Nobel Peace Prize
        MT: Barack Obama se convierte en el cuarto presidente estadounidense para recibir el Premio Nobel de la Paz
        Score: 4.5
      - Source: The presidential couple then has a meeting scheduled with King Harald V and Queen Sonja of Norway.
        MT: La pareja presidencial entonces tiene una reunión programada con el Rey Harald V y Reina Sonja de Noruega.
        Score: 4.0
      - Source: Although, at the cost of the state falling deeper into debt – next year the treasury won't just be 163 billion short, but even more.
        MT: Aunque, a costa del estado cayendo más – el año que viene el Tesoro no sólo se 163 millones de corto, pero aún más.
        Score: 2.0
      - Source: Transformer worth tens of millions of crowns burns in Louny region
        MT: La pena de transformadores decenas de millones de coronas Louny quemaduras en la región
        Score: 1.5
  • 10. Training scenario – annotated data
      - Inputs to the feature extractor: source documents (training), target documents (training), MT system
      - External resources/tools: SRILM, GIZA++ tables, TreeTagger, StanfordParser
      - Information from the MT system that translated the documents (if available)
  • 11. Training scenario – annotated data
      - Source documents (training), target documents (training), MT system → Feature extractor → Features for QE
      - Black-box features: number of tokens, number of punctuation marks, LM perplexity, n-gram counts, POS counts, syntactic tree, lexical cohesion
      - Glass-box features: n-best list information
  • 12. Training scenario – annotated data
      - Source documents (training), target documents (training), MT system → Feature extractor → Features for QE
      - Quality labels: post-editing effort, post-editing time, HTER, BLEU, ...
      - Features for QE + quality labels → QE model training
      - ML algorithms: SVC, SVR, CRF
  • 13. Training scenario – annotated data
      - Source documents (training), target documents (training), MT system → Feature extractor → Features for QE
      - Quality labels: post-editing effort, post-editing time, HTER, BLEU, ...
      - Features for QE + quality labels → QE model training → QE model
  • 14. Predicting labels for unseen data
      Source documents (unseen data), target documents (unseen data)
      Example: post-editing effort (1 to 5) – unseen data
      - Source: Mass Slaughter on a Personal Level
        MT: El sacrificio masivo a nivel personal
        Score: ?
      - Source: People begin to ask why their leaders are making them fight.
        MT: La gente empiece a preguntar por qué sus líderes están haciendo ellos lucha.
        Score: ?
      - Source: As the community affairs officers moved into the park in their light-blue windbreakers, many protesters simply gathered their belongings and left.
        MT: Asuntos como la Comunidad oficiales movido en el parque en su luz-azul windbreakers, muchos manifestantes simplemente se reunieron sus pertenencias y de la izquierda.
        Score: ?
      - Source: Some stories are about honor and bravery.
        MT: Algunas historias son de honor y valentía.
        Score: ?
  • 16. Predicting labels for unseen data
      - Source documents (unseen data), target documents (unseen data), MT system → Feature extractor → Features for QE
      - Features are the same as the ones extracted at training time
  • 17. Predicting labels for unseen data
      - Source documents (unseen data), target documents (unseen data), MT system → Feature extractor → Features for QE → QE model
      - Can predict labels for the new data → ML magic!
  • 18. Predicting labels for unseen data
      Example: post-editing effort (1 to 5) – unseen data → predictions!
      - Source: Mass Slaughter on a Personal Level
        MT: El sacrificio masivo a nivel personal
        Predicted score: 4.5
      - Source: People begin to ask why their leaders are making them fight.
        MT: La gente empiece a preguntar por qué sus líderes están haciendo ellos lucha.
        Predicted score: 3.0
      - Source: As the community affairs officers moved into the park in their light-blue windbreakers, many protesters simply gathered their belongings and left.
        MT: Asuntos como la Comunidad oficiales movido en el parque en su luz-azul windbreakers, muchos manifestantes simplemente se reunieron sus pertenencias y de la izquierda.
        Predicted score: 1.8
      - Source: Some stories are about honor and bravery.
        MT: Algunas historias son de honor y valentía.
        Predicted score: 4.5
  • 19. Thank you! QuEst++ download: http://www.quest.dcs.shef.ac.uk/
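
Two steps of the pipeline in the slides are simple enough to sketch with the Python standard library alone: a few of the black-box features (number of tokens, number of punctuation marks) and an HTER-style quality label. This is a rough illustration and not QuEst++ code. The function names, the whitespace tokenizer, the extra length-ratio feature, and the toy post-edited sentence are all assumptions made for the example. The label is also a simplification: it uses word-level edit distance without the block shifts that full TER allows.

```python
import string


def tokenize(sentence):
    """Whitespace tokenization; a real pipeline would use a proper tokenizer."""
    return sentence.split()


def count_punctuation(text):
    """Count ASCII punctuation characters (marks such as '¿' are not counted)."""
    return sum(ch in string.punctuation for ch in text)


def black_box_features(source, target):
    """A few surface features computed from the sentence pair only."""
    src_tokens = tokenize(source)
    tgt_tokens = tokenize(target)
    return {
        "src_tokens": len(src_tokens),
        "tgt_tokens": len(tgt_tokens),
        "src_punct": count_punctuation(source),
        "tgt_punct": count_punctuation(target),
        # Hypothetical extra feature: target/source length ratio.
        "tgt_src_ratio": len(tgt_tokens) / max(len(src_tokens), 1),
    }


def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(ref) + 1))
    for i, hyp_word in enumerate(hyp, start=1):
        curr = [i]
        for j, ref_word in enumerate(ref, start=1):
            cost = 0 if hyp_word == ref_word else 1
            curr.append(min(prev[j] + 1,          # delete hyp_word
                            curr[j - 1] + 1,      # insert ref_word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def simplified_hter(mt_output, post_edited):
    """Edits needed to turn the MT output into its post-edited version,
    divided by the post-edited length. Shifts are ignored (assumption)."""
    hyp, ref = tokenize(mt_output), tokenize(post_edited)
    return word_edit_distance(hyp, ref) / max(len(ref), 1)


if __name__ == "__main__":
    # Sentence pair taken from the unseen-data example in the slides.
    print(black_box_features(
        "Some stories are about honor and bravery.",
        "Algunas historias son de honor y valentía.",
    ))
    # Toy strings, not from the slides: one missing word out of six.
    print(simplified_hter("the cat sat on mat", "the cat sat on the mat"))
```

At prediction time the same `black_box_features` function would be applied to the unseen sentence pairs, so that the QE model receives the same features it was trained on.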