The document discusses machine translation quality estimation. It begins by outlining issues with current automatic evaluation metrics for machine translation, such as their dependence on reference translations and inability to account for the severity of errors. It then introduces the concept of quality estimation, which aims to predict translation quality before post-editing by using machine learning on examples of source texts paired with automatic translations and human-assigned quality scores. Examples are given showing quality estimation can help prioritize sentences for post-editing and select the highest quality translation from multiple systems. The state-of-the-art in quality estimation is described as using a variety of linguistic features and learning algorithms, though available datasets with human quality judgments are limited.
Quality Assessment and Economic Sustainability of Translation (Luigi Muzii)
- Quality in translation is a complex issue that depends on perspectives, constraints, and expectations. It is not an intrinsic or universally defined concept.
- From an economic perspective, quality must be balanced with profitability and sustainability. Translations need to meet requirements while considering costs and resources.
- Metrics can help measure quality from different angles, but quality assessment models also have limitations. An optimal approach considers multiple factors rather than focusing only on errors.
The document discusses translation quality measurement and proposes a system for calculating a Translation Quality Index (TQI). It suggests using checklists and sampling techniques to measure errors in a translation sample. Errors would be categorized and assigned weights. The number of errors would be calculated as a percentage of the total words to determine a TQI score, with lower percentages indicating higher quality. An example TQI calculation using a 3,000 word sample with 30 error points is provided. The TQI is proposed as a standardized metric for measuring and comparing translation quality.
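The TQI arithmetic described above can be written as a small function: weighted error points found in a sample, expressed as a percentage of the sampled words, so that a lower score means higher quality. The category names and weights below are illustrative assumptions; the document does not fix a particular weighting.

```python
# Weighted error points per sampled words; lower means higher quality.
ERROR_WEIGHTS = {"minor": 1, "major": 5, "critical": 10}  # assumed weights

def tqi(errors: dict, sample_words: int) -> float:
    """Weighted error points as a percentage of sampled words."""
    points = sum(ERROR_WEIGHTS[cat] * n for cat, n in errors.items())
    return 100.0 * points / sample_words

# The worked example from the document: 30 error points in a 3,000-word
# sample gives a TQI of 1%.
score = tqi({"minor": 10, "major": 4}, 3_000)  # 10*1 + 4*5 = 30 points
```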
This document discusses translation assessment, including defining translation assessment, types of translation assessment, criteria for translation assessment, and ways to assess translations. Some key points:
- Translation assessment examines translations to provide information to improve teaching and student learning. It focuses on the learning process rather than just the end product.
- Types of assessment include product assessment, process assessment, and qualitative/quantitative assessment. Assessment can be formative, summative, or diagnostic.
- Criteria include translation problems, errors, competence in various skill areas, and causes of errors like lack of knowledge or methodology.
- Ways of assessing include scales, exercises, tests, questionnaires, and tools like translation diaries for formative assessment.
Delivered at Machine Translation Summit during a special workshop on post-editing.
November 3rd 2015
Miami, Florida.
In this talk, we describe the latest advances in commercial and academic machine translation development that are improving acceptance of the technology and keeping its users happy.
Overview of Multidimensional Quality Metrics (QTLaunchPad), Arle Lommel
This document outlines a 5-step process for assessing translation quality using multidimensional quality metrics: 1) Specify the project parameters, 2) Select appropriate metrics, 3) Choose an evaluation method, 4) Conduct the evaluation, and 5) Score the results. It provides examples of developing a metrics specification file, using inline markup for evaluation, and calculating scores while accounting for source text quality and metric weights. The goal is to establish a consistent, transparent process for requesters and providers to measure and improve translation quality.
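Step 5 above ("score the results") can be sketched as a small scoring function: count errors per category, apply category weights, normalize by word count, and discount issues already present in the source text so they are not charged to the translation. The category names and weights here are illustrative assumptions, not official MQM defaults.

```python
def mqm_style_score(target_errors: dict, source_errors: dict,
                    weights: dict, word_count: int) -> float:
    """Return a quality score out of 100; higher is better."""
    penalty = 0.0
    for category, count in target_errors.items():
        # Errors traceable to the source are subtracted before charging.
        charged = max(count - source_errors.get(category, 0), 0)
        penalty += weights.get(category, 1.0) * charged
    return max(0.0, 100.0 * (1.0 - penalty / word_count))

weights = {"accuracy": 2.0, "fluency": 1.0, "terminology": 1.5}
score = mqm_style_score({"accuracy": 3, "fluency": 5},  # found in target
                        {"fluency": 2},                 # already in source
                        weights, word_count=1000)
# 3*2.0 + (5-2)*1.0 = 9 penalty points over 1,000 words -> 99.1
```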
Delivered at Machine Translation Summit during a special workshop on MT for patent and scientific literature.
October 30th 2015
Miami, Florida.
In this talk, we describe how we adapted machine translation for patents to help a translation company improve their productivity.
Translation and localization of market research surveys for global projects can be a nightmare! Delays, client confidence crises, programming nightmares, late fielding, and queer data are among the risks. I use this presentation to help understand the problems and find efficient solutions for "Going Global".
Dr. John Tinsley discusses the latest advances in machine translation technology for patent information. He provides an overview of machine translation approaches like statistical and rule-based translation. Tinsley explains how machine translation systems analyze large datasets of translated text to statistically determine the most likely translations. While machine translation is improving, challenges remain like ambiguity, creative language use, and linguistic differences between languages. Tinsley advocates evaluating machine translation systems based on task performance rather than just translation quality.
The raw number of defects found in a product version is not an adequate measure of the cost of the defects. This presentation explains how to qualify and monetize the cost of these defects throughout the SDLC.
This document contains a 20 question quiz about parameter passing techniques, subprogram implementation, and features of various programming languages like C++, Java, Ada, and C. The questions cover topics like call by value vs call by reference, scope and lifetime of variables, activation records, nested and generic subprograms, and default parameters.
MT providers claim that (customized) MT "helps you translate more words and grow your business". It "boosts productivity". And it can even "increase revenues" or "optimize customer service and support". What is the reality? Is MT improving or did we reach a plateau? Speakers at last year's QE Summit agreed: one of the main problems in the translation industry today is the lack of benchmarking. The output of MT engines cannot be compared to industry averages or standards because these are not yet available. Automated scores are meaningless outside the "laboratory". At the same time, buyers of translation services are increasingly interested in translated content of different quality levels. They also want to know how the different engines are performing on different content types and in different language pairs. How do we know? Can we predict the output quality? Shouldn't MT providers become more transparent to help buyers of these technologies make informed decisions?
Session leader: Dag Schmidtke (Microsoft)
Panelists: John Tinsley (Iconic), Olga Beregovaya (Welocalize), Olga Pospelova (eBay)
New Breakthroughs in Machine Translation Technology (kantanmt)
Tony O'Dowd takes us through some of the most innovative technologies offered on the KantanMT.com platform which are helping a growing community of KantanMT users to develop and self-manage custom Machine Translation engines in the cloud.
Maxim Khalilov then illustrates bmmt's journey with Machine Translation on KantanMT. He discusses what they have achieved so far in terms of MT engine development and showcases the value that his team is bringing to their growing international client base through the use of Machine Translation.
Welocalize Throughputs and Post-Editing Productivity Webinar (Laura Casanellas, Welocalize)
Welocalize language tools expert Laura Casanellas details key topics related to human translation and machine translation post-editing, production, throughputs and measuring success. This is the presentation used in a recent online webinar you can find at http://www.welocalize.com/wemt/wemt-webinars/
Topics for this recorded webinar include:
- Defining throughputs for human translation and machine translation post-editing
- How to accurately compare individual throughputs for translating and post-editing
- What are the most common deviations in throughputs
- How to spot progress and performance improvement
- Who really benefits from post-editing
The document provides information on the software development process and programming concepts. It outlines the 7 stages of software development as analysis, design, implementation, testing, documentation, evaluation, and maintenance. It also describes programming concepts such as high level languages versus machine code, variables, arrays, loops, algorithms for validation, finding min/max, counting occurrences, and linear search. Pseudocode and structure diagrams are given as examples of design notations. Normal, extreme, and exceptional test data are discussed for thorough testing.
Software Design Principles and Best Practices - Satyajit Dey (Cefalo)
The document discusses software design principles and best practices, including definitions of technical debt, stakeholders' goals, causes of technical debt, types of technical debt, code smells, common code smells like comments, uncommunicative names, long methods, and design principles like SOLID principles. It provides examples of single responsibility principle, open-closed principle, Liskov substitution principle, interface segregation principle, and dependency inversion principle. It emphasizes that good code is readable and maintainable by other programmers.
Quality assurance in the early stages of the product (Maksym Vovk)
This document discusses quality assurance practices that can be applied in the early stages of product development. It addresses problems that arise from high bug costs and unclear roles of developers and testers. Potential solutions proposed include applying testing practices at the concept, requirements, and design stages. Specific techniques discussed are testing concepts using persona data, A/B testing, requirements analysis, test-driven development, behavior-driven development, and pairing testers and developers. Benefits include reducing bugs and manual tests while increasing knowledge across roles. Challenges include ensuring tests are appropriately scoped and requiring changes to team mindsets.
Why Isn't Clean Coding Working For My Team? (Rob Curry)
Teams fail to achieve the full benefit of the "clean code" approach when they focus on the code and neglect the Agile process. The full title of Uncle Bob's "Clean Code" book is "Clean Code: A Handbook of Agile Software Craftsmanship". This talk presents an in-depth look at the necessary relationship between Clean Code software craftsmanship and the Agile methodology, identifies common scenarios and situations where teams may fall short of recognizing and respecting that relationship, and provides practical recommendations for achieving a fully integrated process of Agile Software Craftsmanship.
Robert Martin's book "Clean Code: A Handbook of Agile Software Craftsmanship" had a huge positive impact on software development teams that adopted his approach to "Agile Software Craftsmanship". But teams sometimes fail to achieve the full benefit of the "clean code" approach because they focus on the code and neglect the Agile process.
It's easy to do: the book provides such clear, practical advice on how to write code that is easier to maintain, more reliable, and less error prone that developers adopt those techniques to great effect and fail to pursue and adopt the harder, agile process recommendations from the book. This is further complicated by the fact that there is now a Software Craftsmanship Manifesto that is separate from the Agile Manifesto.
So, how does using selected clean code techniques break the Agile process defined in the book? What is the relationship between the two that Uncle Bob wanted us to understand and adopt in toto? Where do we go wrong? Are there some work-environment or business-driven scenarios that are more likely to break the relationship?
This presentation addresses those questions and more by taking an in-depth look at the necessary relationship between Clean Code software craftsmanship and the Agile methodology, identifying common scenarios and situations where teams may fall short of recognizing and respecting that relationship, and providing practical recommendations for achieving a fully integrated process of Agile Software Craftsmanship.
An Agile Testing certificate from ISTQB will be available later this year. This presentation is about agile testing in general, some research findings about certificates (an extract of Finnish figures from the ISTQB global survey), and a few notes about the new certificate and related courses. Presented at the Testaus2014 seminar.
The document outlines the steps involved in program design and problem solving techniques, including defining the problem, outlining the solution, developing an algorithm using pseudocode, testing the algorithm, coding the algorithm, running and documenting the program. It also discusses algorithmic problem solving, the structure theorem, meaningful naming conventions, communication between modules through variables and parameters, module cohesion and coupling, and sequential file updates.
This document discusses algorithms and programming. It begins by defining an algorithm as a finite set of steps to solve a problem. It provides examples of algorithms to find the average of test scores and divide two numbers. The document discusses characteristics of algorithms like inputs, outputs, definiteness, finiteness, and effectiveness. It also covers tools for designing algorithms like flowcharts and pseudocode. The document then discusses programming, explaining how to analyze a problem, design a solution, code it, test it, and evaluate it. It provides tips for writing clear, well-structured programs.
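Two of the textbook algorithms the summary mentions — averaging a list of test scores and a linear search — can be sketched in Python:

```python
def average(scores: list) -> float:
    """Mean of the scores, or 0.0 for an empty list."""
    return sum(scores) / len(scores) if scores else 0.0

def linear_search(items: list, target) -> int:
    """Index of the first occurrence of target, or -1 if absent."""
    for i, item in enumerate(items):
        if item == target:
            return i
    return -1

assert average([70, 80, 90]) == 80.0
assert linear_search([3, 1, 4, 1, 5], 4) == 2
```

Both satisfy the characteristics listed above: defined inputs and outputs, definite steps, and guaranteed termination after a finite number of comparisons.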
Learn the different approaches to machine translation and how to improve the ... (SDL)
SDL provides machine translation solutions to customers. They have a team of over 50 professionals across various locations that work on driving MT adoption, building custom engines, and conducting linguistic projects. SDL's approach involves evaluating data, training machine translation engines, testing outputs, and refining engines through an iterative process with a focus on maximizing quality. They provide customized solutions through domain-specific engines and language verticals to meet the needs of different customers and content types.
How Does Your MT System Measure Up? tekom/tcworld 2014 (kantanmt)
KantanMT Founder and Chief Architect Tony O'Dowd presented at the annual tekom Trade Fair for Technical Communication on the 12th November as part of the GALA track. The tekom trade fair is organized by tcworld and is the biggest technical communication event worldwide.
The presentation, entitled "How Does Your Machine Translation System Measure Up?", outlines how to measure the performance of your MT engines and the efficiency of your translation processes. It is aimed at professionals in the localization industry.
Key Discussion Points:
⢠Measuring performance of Statistical MT
⢠Recent advances in MT and data visualization techniques
⢠Tracking MT efficiency in the translation process
Please contact Louise Irwin (louisei@kantanmt.com) for more information
Lucia Specia (USFD), Evaluation of Machine Translation (RIILP)
This document discusses various methods for evaluating translation quality, including manual metrics, task-based metrics, and reference-based automatic metrics. It notes that evaluating translation quality is difficult because the definition of quality depends on factors like the end user and intended purpose. Methods discussed include n-point scales for adequacy and fluency, ranking translations, and counting errors. Issues with subjective judgments, reliability, and defining what makes a translation "best" are also covered.
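A toy reference-based metric in the spirit of those surveyed above is clipped unigram precision of a hypothesis against a reference, one ingredient of BLEU. Real metrics add higher-order n-grams, a brevity penalty, and multiple references; this sketch only illustrates the core idea.

```python
from collections import Counter

def unigram_precision(hypothesis: str, reference: str) -> float:
    """Fraction of hypothesis tokens covered by the reference,
    clipping each word's credit at its count in the reference."""
    hyp = hypothesis.lower().split()
    if not hyp:
        return 0.0
    ref_counts = Counter(reference.lower().split())
    matched = sum(min(n, ref_counts[w]) for w, n in Counter(hyp).items())
    return matched / len(hyp)

p = unigram_precision("the cat sat on a mat", "the cat sat on the mat")
# 5 of the 6 hypothesis tokens are covered by the reference
```

Such reference-based scores are exactly what the quality estimation work above seeks to avoid depending on, since references are unavailable at translation time.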
Title: Bridging the Gap in Multilingual Communication
Learn how Unbabel is solving translation by combining the speed and scale of machine translation with the quality and expertise unique to humans. Discover the best global Machine Translation Quality Estimation system, their secret to scaling machine translation, and see how you can experiment and iterate with OpenKiwi, their open-source framework!
Maximising Machine Translation Return on Investment (KantanMT/Medialocate), kantanmt
The document discusses maximizing return on investment for machine translation projects. It introduces KantanMT, a cloud-based statistical machine translation system, and associated tools like KantanAnalytics and Kantan BuildAnalytics that help project managers and SMT developers optimize machine translation quality and costs. Case studies are presented showing how KantanMT and post-editing delivered significant cost savings, increased translator productivity, and enabled fast localization turnarounds for clients in software documentation, automotive parts data, and e-commerce product descriptions.
This document discusses software quality assurance and control for small development teams. It distinguishes quality assurance, which ensures the development process is efficient, from quality control, which ensures the product meets requirements. Quality control has two components: validation, which checks that the right product is built, and verification, which checks that the product is built correctly. Validation involves getting complete requirements from customers and defining the problem being solved; verification uses testing strategies like unit, integration, and system testing following a V-model approach. The document closes with testing tips, including change-related and integration testing and keeping development and testing roles independent.
Language Quality Management: Models, Measures, Methodologies (Sajan)
With growing content, shorter release dates and many target languages, it's important for global companies to have a process in place to track and measure translation effectiveness. Learn how big companies like Microsoft, Snap-on Diagnostics and Symantec manage translation quality.
Presentation for the AMTA 2018 Conference on eBay L10n's multidimensional evaluation methods for Phrase-Based and Neural MT engines (quality: adequacy and fluency; productivity: edit distance and time spent).
This document discusses metrics and monitoring best practices based on lessons learned at Chrome. It covers the properties of good metrics, use cases for metrics in different contexts like labs, Web Performance APIs and Real User Monitoring (RUM). It provides the example metric of Largest Contentful Paint and discusses challenges in defining it accurately. The document also covers limitations and best practices for lab testing, A/B testing and understanding RUM data, emphasizing controlled experimentation and detecting changes and mix shifts. The overall message is that metrics require careful thought and validation but can provide valuable insights when done well.
The raw number of defects found in a product version is not an adequate measure of the cost of the defects. This presentation explains how to qualify and monetize the cost of these defects throughout the SDLC
This document contains a 20 question quiz about parameter passing techniques, subprogram implementation, and features of various programming languages like C++, Java, Ada, and C. The questions cover topics like call by value vs call by reference, scope and lifetime of variables, activation records, nested and generic subprograms, and default parameters.
MT providers claim that (customized) MT "helps you translate more words and grow your business". It "boosts productivity". And it can even "increase revenues" or "optimize customer service and support". What is the reality? Is MT improving or did we reach a plateau? Speakers at last year's QE Summit agreed: one of the main problems in the translation industry today is the lack of benchmarking. The output of MT engines cannot be compared to industry averages or standards because these are not yet available. Automated scores are meaningless outside the âlaboratoryâ. At the same time, buyers of translation services are increasingly interested in translated content of different quality levels. They also want to know how the different engines are performing on different content types and in different language pairs. How do we know? Can we predict the output quality? Shouldnât MT providers become more transparent to help buyers of these technologies make informed decisions?
Session leader: Dag Schmidtke (Microsoft)
Panelists: John Tinsley (Iconic), Olga Beregovaya (Welocalize), Olga Pospelova (eBay)
New Breakthroughs in Machine Transation Technologykantanmt
Â
Tony OâDowd takes us through some of the most innovative technologies offered on the KantanMT.com platform which are helping a growing community of KantanMT users to develop and self-manage custom Machine Translation engines in the cloud.
Maxim Khalilov then illustrates bmmtâs journey with Machine Translation on KantanMT. He discusses what they have achieved so far in terms of MT engine development and showcases the value that his team is bringing to their growing international client base through the use of Machine Translation.
Welocalize Throughputs and Post-Editing Productivity Webinar Laura CasanellasWelocalize
Â
Welocalize language tools expert Laura Casanellas details key topics related to human translation and machine translation post-editing, production, throughputs and measuring success. This is the presentation used in a recent online webinar you can find at http://www.welocalize.com/wemt/wemt-webinars/
Topics for this recorded webinar include:
- Defining throughputs for human translation and machine translation post-editing
- How to accurately compare individual throughputs for translating and post-editing
- What are the most common deviations in throughputs
- How to spot progress and performance improvement
- Who really benefits from post-editing
The document provides information on the software development process and programming concepts. It outlines the 7 stages of software development as analysis, design, implementation, testing, documentation, evaluation, and maintenance. It also describes programming concepts such as high level languages versus machine code, variables, arrays, loops, algorithms for validation, finding min/max, counting occurrences, and linear search. Pseudocode and structure diagrams are given as examples of design notations. Normal, extreme, and exceptional test data are discussed for thorough testing.
Software Design Principles and Best Practices - Satyajit DeyCefalo
Â
The document discusses software design principles and best practices, including definitions of technical debt, stakeholders' goals, causes of technical debt, types of technical debt, code smells, common code smells like comments, uncommunicative names, long methods, and design principles like SOLID principles. It provides examples of single responsibility principle, open-closed principle, Liskov substitution principle, interface segregation principle, and dependency inversion principle. It emphasizes that good code is readable and maintainable by other programmers.
Quality assurance in the early stages of the productMaksym Vovk
Â
This document discusses quality assurance practices that can be applied in the early stages of product development. It addresses problems that arise from high bug costs and unclear roles of developers and testers. Potential solutions proposed include applying testing practices at the concept, requirements, and design stages. Specific techniques discussed are testing concepts using persona data, A/B testing, requirements analysis, test-driven development, behavior-driven development, and pairing testers and developers. Benefits include reducing bugs and manual tests while increasing knowledge across roles. Challenges include ensuring tests are appropriately scoped and requiring changes to team mindsets.
Why Isn't Clean Coding Working For My TeamRob Curry
Â
Teams fail to achieve the full benefit of the "clean code" approach when they focus on the code and neglect the Agile process. The full title of Uncle Bob's "Clean Code" book is "Clean Code: A Handbook of Agile Software Craftsmanship". This talk presents an depth look at necessary relationship between Clean Code software craftsmanship and the Agile methodology, identifies common scenarios and situations where teams may fall short of recognizing and respecting that relationship, and provides practical recommendations for achieving a fully integrated process of Agile Software Craftsmanship.
Robert Martin's book "Clean Code: A Handbook of Agile Software Craftsmanship" had a huge positive impact on software development teams that adopted his approach to "Agile Software Craftsmanship". But teams sometimes fail to achieve the full benefit of the "clean code" approach because they focus on the code and neglect the Agile process.
It's easy to do: the book provides such clear, practical advice on how to write code that is easier to maintain, more reliable, and less error prone that developers adopt those techniques to great effect and fail to pursue and adopt the harder, agile process recommendations from the book. This is further complicated by the fact that there is now a Software Craftsmanship Manifesto that is separate from the Agile Manifesto.
So, how does using selected clean code techniques break the Agile process defined in the the book? What is the relationship between the two that Uncle Bob wanted us to understand and adopt in toto? Where do we go wrong? Are there some work environment or business driven scenarios that are more likely to break the relationship?
This presentation addresses those questions and more by an taking an in depth look at necessary relationship between Clean Code software craftsmanship and the Agile methodology, identifies common scenarios and situations where teams may fall short of recognizing and respecting that relationship, and provides practical recommendations for achieving a fully integrated process of Agile Software Craftsmanship.
Agile Testing cerfiticate from ISTQB available later this year. This presentation is about agile testing in general, some research findings about certificates (extract of Finnish figures from ISTQB global survey), and a few notes about the new certificate and related courses. Presentation at Testaus2014 seminar.
The document outlines the steps involved in program design and problem solving techniques, including defining the problem, outlining the solution, developing an algorithm using pseudocode, testing the algorithm, coding the algorithm, running and documenting the program. It also discusses algorithmic problem solving, the structure theorem, meaningful naming conventions, communication between modules through variables and parameters, module cohesion and coupling, and sequential file updates.
This document discusses algorithms and programming. It begins by defining an algorithm as a finite set of steps to solve a problem. It provides examples of algorithms to find the average of test scores and divide two numbers. The document discusses characteristics of algorithms like inputs, outputs, definiteness, finiteness, and effectiveness. It also covers tools for designing algorithms like flowcharts and pseudocode. The document then discusses programming, explaining how to analyze a problem, design a solution, code it, test it, and evaluate it. It provides tips for writing clear, well-structured programs.
Learn the different approaches to machine translation and how to improve the ...SDL
Â
SDL provides machine translation solutions to customers. They have a team of over 50 professionals across various locations that work on driving MT adoption, building custom engines, and conducting linguistic projects. SDL's approach involves evaluating data, training machine translation engines, testing outputs, and refining engines through an iterative process with a focus on maximizing quality. They provide customized solutions through domain-specific engines and language verticals to meet the needs of different customers and content types.
How Does Your MT System Measure Up? tekom/tcworld 2014 (kantanmt)
KantanMT Founder and Chief Architect Tony O'Dowd presented at the annual tekom Trade Fair for Technical Communication on the 12th November as part of the GALA track. The tekom trade fair is organized by tcworld and is the biggest technical communication event worldwide.
The presentation, entitled "How Does Your Machine Translation System Measure Up?", outlines how to measure the performance of your MT engines and the efficiency of your translation processes. It is aimed at professionals in the localization industry.
Key Discussion Points:
⢠Measuring performance of Statistical MT
⢠Recent advances in MT and data visualization techniques
⢠Tracking MT efficiency in the translation process
Please contact Louise Irwin (louisei@kantanmt.com) for more information
10. Lucia Specia (USFD) Evaluation of Machine Translation (RIILP)
This document discusses various methods for evaluating translation quality, including manual metrics, task-based metrics, and reference-based automatic metrics. It notes that evaluating translation quality is difficult because the definition of quality depends on factors like the end user and intended purpose. Methods discussed include n-point scales for adequacy and fluency, ranking translations, and counting errors. Issues with subjective judgments, reliability, and defining what makes a translation "best" are also covered.
Title: Bridging the Gap in Multilingual Communication
Learn how Unbabel is solving translation by combining the speed and scale of machine translation with the quality and expertise unique to humans. Discover the best global Machine Translation Quality Estimation system - their secret to scaling machine translation - and see how you can experiment and iterate with OpenKiwi, their open source framework!
Maximising Machine Translation Return on Investment (KantanMT/Medialocate)
The document discusses maximizing return on investment for machine translation projects. It introduces KantanMT, a cloud-based statistical machine translation system, and associated tools like KantanAnalytics and Kantan BuildAnalytics that help project managers and SMT developers optimize machine translation quality and costs. Case studies are presented showing how KantanMT and post-editing delivered significant cost savings, increased translator productivity, and enabled fast localization turnarounds for clients in software documentation, automotive parts data, and e-commerce product descriptions.
This document discusses software quality assurance and control for small development teams. It defines the differences between quality assurance, which ensures the development process is efficient, and quality control, which ensures the product meets requirements. Quality control has two components: validation, which checks the right product is built, and verification, which checks the product is built correctly. Validation involves getting complete requirements from customers and defining the problem being solved. Verification uses testing strategies like unit, integration, and system testing following a V-model approach. The document provides tips for testing, including change-related, integration, and ensuring independence between development and testing roles.
Language Quality Management: Models, Measures, Methodologies Sajan
With growing content, shorter release dates and many target languages, it's important for global companies to have a process in place to track and measure translation effectiveness. Learn how big companies like Microsoft, Snap-on Diagnostics and Symantec manage translation quality.
Presentation for the AMTA 2018 Conference on eBay L10n's multi-dimensional evaluation methods for Phrase-Based and Neural MT engines (quality: adequacy, fluency; productivity: edit distance, time spent).
This document discusses metrics and monitoring best practices based on lessons learned at Chrome. It covers the properties of good metrics, use cases for metrics in different contexts like labs, Web Performance APIs and Real User Monitoring (RUM). It provides the example metric of Largest Contentful Paint and discusses challenges in defining it accurately. The document also covers limitations and best practices for lab testing, A/B testing and understanding RUM data, emphasizing controlled experimentation and detecting changes and mix shifts. The overall message is that metrics require careful thought and validation but can provide valuable insights when done well.
Tony O'Dowd (KantanMT). KantanMT enables its community to generate meaningful business intelligence that helps them identify the scope of their customised machine translation projects. More importantly, it helps them schedule and scale those projects to achieve maximum translation productivity and a positive ROI.
The document discusses Randstad's experimentation with using evolutionary algorithms and artificial intelligence through the company Sentient Ascend to optimize their conversion rates. They tested Sentient on product detail pages in the Netherlands, Norway, and Sweden, testing various elements. While some variants performed better, the improvements were not always statistically significant due to small sample sizes. Validating the best variant in a traditional A/B test confirmed an uplift. Overall Sentient showed promise but had some issues, and the author questions if it delivers fully on its promises and is worth the costs compared to traditional testing.
The document discusses translation quality measurement and proposes a system for calculating a Translation Quality Index (TQI). It suggests using checklists and sampling techniques to evaluate errors in a translation sample. Errors would be categorized and weighted based on importance. The number of errors would be calculated relative to the sample size to determine a percentage score, or TQI, indicating the overall quality of the translation. A higher TQI represents fewer errors and therefore higher translation quality.
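To make the arithmetic concrete, here is a hypothetical sketch of a TQI-style calculation. The error categories, the weights, and the 100-minus-error-percentage normalisation are assumptions chosen to match "higher TQI = higher quality", not the proposal's definitive formula:

```python
# Hypothetical TQI-style calculation. The categories, weights, and the
# normalisation below are illustrative assumptions, not the proposal's
# exact definitions.
WEIGHTS = {"minor": 1, "major": 5, "critical": 10}

def tqi(errors, sample_words):
    """errors: list of category labels found in the sampled text."""
    points = sum(WEIGHTS[e] for e in errors)
    error_pct = 100.0 * points / sample_words  # weighted errors per 100 words
    return 100.0 - error_pct                   # higher TQI = higher quality

# e.g. a 3,000-word sample carrying 30 weighted error points scores 99.0:
score = tqi(["critical", "critical", "major", "major"], 3000)
```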
This presentation is a part of the MosesCore project that encourages the development and usage of open source machine translation tools, notably the Moses statistical MT toolkit. MosesCore is supported by the European Commission Grant Number 288487 under the 7th Framework Programme.
For the latest updates go to http://www.statmt.org/mosescore/
or follow us on Twitter - #MosesCore
KantanMT Founder and Chief Architect, Tony O'Dowd and Technical Project Manager, Louise Faherty show you how to improve the translation productivity of your team, manage post-editing effort and translation project schedules better with powerful Machine Translation engines.
You will learn:
⢠How to deal with Translation challenges
⢠About the necessity of Machine Translation to be competitive
⢠How KantanMT.com can be integrated with existing Translation Management Systems
The document provides an agenda and overview for an MT and localization quality assurance discussion. It discusses Welocalize's approach to MT, including their dedicated team of experts and experience with various MT engines. It covers analytics like automatic scoring, human evaluations, and productivity tests. It also discusses considerations for the language supply chain in terms of post-editor training and guidelines for full versus light post-editing. In summary, the document outlines Welocalize's expertise and process for designing and implementing MT programs for clients.
How to Achieve Agile Localization for High-Volume Content with Machine Transl... (kantanmt)
This slide deck on achieving agile localization for high-volume content with the help of Machine Translation was presented by Tony O'Dowd, Founder and Chief Architect at KantanMT, during the annual tcworld conference 2015, which was held in Stuttgart, Germany. It outlines the best practices for developing and implementing a dynamic and agile localization strategy that integrates Custom Machine Translation (CMT) into the localization workflow, with the final aim of developing a scalable localization strategy that makes it possible to create and publish high-volume multilingual content.
The document contains questions and answers related to testing tools from Mercury Interactive (now HP Software). It discusses using object properties to distinguish between windows with the same label, how to capture numeric input in QTP, migrating tests between applications, and other topics. Mercury confirms that Quality Center supports Microsoft SQL Server and discusses upgrade paths between different versions.
MT best practices for price, speed AND quality, as well as Lexcelera's machine translation case studies and services including training, integration, post-editing and hosted MT
Similar to Lucia Specia - Estimativa de qualidade em TA (20)
Advanced memoQ 6.2 features, by Bernardo Santos (Kilgray), following the presentation given at the workshops of the I Conferência Internacional de Tradução e Tecnologia, 13-14 May, Faculdade de Letras do Porto.
This document discusses how machine translation can better serve human translators. It proposes two main areas of technological improvement: (1) contextual knowledge management to facilitate search and decision-making; and (2) editing-learning tools to automate repetitive tasks such as editing and checking. Translators should participate in the development of these new tools to ensure they meet their needs.
This document describes Opentrad, an open-source machine translation platform. It provides information about the translation engines, supported languages, technological solution, products, integration and clients.
This document discusses the use of machine translation (MT) by the European Commission, focusing on the Portuguese Language Department (DLP). It notes that the DLP coordinated the development of dictionaries for French-to-Portuguese and English-to-Portuguese MT, that MT has been available to the entire European Commission since 2013, and that it could be offered to other EU institutions and national governments in 2014.
1. The document discusses rule-based versus statistical machine translation systems and how to transform a rule-based system into a hybrid one.
2. It presents the OpenLogos platform and its features for developing a hybrid machine translation system.
3. It discusses future work on exploiting OpenLogos to create new linguistic resources and applications and to publicise the freely available resources.
The document discusses the importance of machine translation, its historical development, and the current rule-based and statistical approaches. It also analyses the linguistic challenges involved and the need for more interdisciplinary research between linguistics and engineering.
More from I Conferência Internacional de Tradução e Tecnologia (10)
Introduction of Cybersecurity with OSS at Code Europe 2024 (Hiroshi SHIBATA)
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
"Choosing proper type of scaling", Olena Syrota (Fwdays)
Imagine an IoT processing system that is already quite mature and production-ready and for which client coverage is growing and scaling and performance aspects are life and death questions. The system has Redis, MongoDB, and stream processing based on ksqldb. In this talk, firstly, we will analyze scaling approaches and then select the proper ones for our system.
Discover top-tier mobile app development services, offering innovative solutions for iOS and Android. Enhance your business with custom, user-friendly mobile applications.
Conversational agents, or chatbots, are increasingly used to access all sorts of services using natural language. While open-domain chatbots - like ChatGPT - can converse on any topic, task-oriented chatbots - the focus of this paper - are designed for specific tasks, like booking a flight, obtaining customer support, or setting an appointment. Like any other software, task-oriented chatbots need to be properly tested, usually by defining and executing test scenarios (i.e., sequences of user-chatbot interactions). However, there is currently a lack of methods to quantify the completeness and strength of such test scenarios, which can lead to low-quality tests, and hence to buggy chatbots.
To fill this gap, we propose adapting mutation testing (MuT) for task-oriented chatbots. To this end, we introduce a set of mutation operators that emulate faults in chatbot designs, an architecture that enables MuT on chatbots built using heterogeneous technologies, and a practical realisation as an Eclipse plugin. Moreover, we evaluate the applicability, effectiveness and efficiency of our approach on open-source chatbots, with promising results.
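The idea can be pictured with a deliberately tiny sketch: apply a fault-emulating mutation operator to a chatbot design, re-run the test scenarios, and check whether they "kill" the mutant. The toy chatbot, the operator, and the scenario below are all invented for illustration; the paper's approach targets real chatbot technologies through an Eclipse plugin.

```python
# Toy mutation-testing sketch for a task-oriented "chatbot" (an invented
# exact-match model, not the paper's architecture).

def make_bot(intents):
    """intents: user phrase -> reply."""
    def bot(utterance):
        return intents.get(utterance, "Sorry, I did not understand.")
    return bot

def delete_intent(intents, phrase):
    """Mutation operator: emulate a designer forgetting an intent."""
    mutant = dict(intents)
    mutant.pop(phrase, None)
    return mutant

design = {"book a flight": "Where to?", "cancel booking": "Which booking?"}
scenario = [("book a flight", "Where to?")]  # (user turn, expected reply)

def run(bot):
    # A scenario passes if every expected reply is produced.
    return all(bot(turn) == expected for turn, expected in scenario)

original_ok = run(make_bot(design))
# The scenario "kills" the mutant if it fails on the mutated design:
mutant_killed = not run(make_bot(delete_intent(design, "book a flight")))
```

A mutant that no scenario kills points at a gap in the test suite, which is exactly the completeness signal mutation testing provides.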
Main news related to the CCS TSI 2023 (2023/1695) - Jakub Marek
An English translation of the presentation for the speech I gave about the main changes brought by CCS TSI 2023 at the biggest Czech conference on Communications and signalling systems on Railways, held in Clarion Hotel Olomouc from 7th to 9th November 2023 (konferenceszt.cz). Attended by around 500 participants and 200 online followers.
The original Czech version of the presentation can be found here: https://www.slideshare.net/slideshow/hlavni-novinky-souvisejici-s-ccs-tsi-2023-2023-1695/269688092 .
The videorecording (in Czech) from the presentation is available here: https://youtu.be/WzjJWm4IyPk?si=SImb06tuXGb30BEH .
The Microsoft 365 Migration Tutorial For Beginner.pptx (operationspcvita)
This presentation will help you understand the power of Microsoft 365. It covers every productivity app included in Office 365, outlines common Office 365 migration scenarios, and explains how we can help you.
You can also read: https://www.systoolsgroup.com/updates/office-365-tenant-to-tenant-migration-step-by-step-complete-guide/
"How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-eff... (Edge AI and Vision Alliance)
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/how-axelera-ai-uses-digital-compute-in-memory-to-deliver-fast-and-energy-efficient-computer-vision-a-presentation-from-axelera-ai/
Bram Verhoef, Head of Machine Learning at Axelera AI, presents the "How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-efficient Computer Vision" tutorial at the May 2024 Embedded Vision Summit.
As artificial intelligence inference transitions from cloud environments to edge locations, computer vision applications achieve heightened responsiveness, reliability and privacy. This migration, however, introduces the challenge of operating within the stringent confines of resource constraints typical at the edge, including small form factors, low energy budgets and diminished memory and computational capacities. Axelera AI addresses these challenges through an innovative approach of performing digital computations within memory itself. This technique facilitates the realization of high-performance, energy-efficient and cost-effective computer vision capabilities at the thin and thick edge, extending the frontier of what is achievable with current technologies.
In this presentation, Verhoef unveils his companyâs pioneering chip technology and demonstrates its capacity to deliver exceptional frames-per-second performance across a range of standard computer vision networks typical of applications in security, surveillance and the industrial sector. This shows that advanced computer vision can be accessible and efficient, even at the very edge of our technological ecosystem.
How information systems are built or acquired puts information, which is what they should be about, in a secondary place. Our language adapted accordingly, and we no longer talk about information systems but applications. Applications evolved in a way to break data into diverse fragments, tightly coupled with applications and expensive to integrate. The result is technical debt, which is re-paid by taking even bigger "loans", resulting in an ever-increasing technical debt. Software engineering and procurement practices work in sync with market forces to maintain this trend. This talk demonstrates how natural this situation is. The question is: can something be done to reverse the trend?
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe (Precisely)
Inconsistent user experience and siloed data, high costs, and changing customer expectations: Citizens Bank was experiencing these challenges while attempting to deliver a superior digital banking experience for its clients. Its core banking applications run on the mainframe, and Citizens was using legacy utilities to get the critical mainframe data to feed customer-facing channels, like call centers, web, and mobile. Ultimately, this led to higher operating costs (MIPS), delayed response times, and longer time to market.
Ever-changing customer expectations demand more modern digital experiences, and the bank needed to find a solution that could provide real-time data to its customer channels with low latency and operating costs. Join this session to learn how Citizens is leveraging Precisely to replicate mainframe data to its customer channels and deliver on their "modern digital bank" experiences.
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency (ScyllaDB)
Freshworks creates AI-boosted business software that helps employees work more efficiently and effectively. Managing data across multiple RDBMS and NoSQL databases was already a challenge at their current scale. To prepare for 10X growth, they knew it was time to rethink their database strategy. Learn how they architected a solution that would simplify scaling while keeping costs under control.
Dandelion Hashtable: beyond billion requests per second on a commodity server (Antonios Katsarakis)
This slide deck presents DLHT, a concurrent in-memory hashtable. Despite optimization efforts that go as far as sacrificing core functionality, state-of-the-art designs still incur multiple memory accesses per request and block request processing in three cases. First, most hashtables block while waiting for data to be retrieved from memory. Second, open-addressing designs, which represent the current state of the art, either cannot free index slots on deletes or must block all requests to do so. Third, index resizes block every request until all objects are copied to the new index. Defying folklore wisdom, DLHT forgoes open addressing and adopts a fully-featured and memory-aware closed-addressing design based on bounded cache-line chaining. This design (1) offers lock-free index operations and deletes that free slots instantly, (2) completes most requests with a single memory access, (3) utilizes software prefetching to hide memory latencies, and (4) employs a novel non-blocking and parallel resizing. On a commodity server and a memory-resident workload, DLHT surpasses 1.6B requests per second and provides 3.5x (12x) the throughput of the state-of-the-art closed-addressing (open-addressing) resizable hashtable on Gets (Deletes).
Essentials of Automations: Exploring Attributes & Automation Parameters (Safe Software)
Building automations in FME Flow can save time, money, and help businesses scale by eliminating data silos and providing data to stakeholders in real time. One essential component to orchestrating complex automations is the use of attributes & automation parameters (both formerly known as "keys"). In fact, it's unlikely you'll ever build an Automation without using these components, but what exactly are they?
Attributes & automation parameters enable the automation author to pass data values from one automation component to the next. During this webinar, our FME Flow Specialists will cover leveraging the three types of these output attributes & parameters in FME Flow: Event, Custom, and Automation. As a bonus, they'll also be making use of the Split-Merge Block functionality.
You'll leave this webinar with a better understanding of how to maximize the potential of automations by making use of attributes & automation parameters, with the ultimate goal of setting your enterprise integration workflows up on autopilot.
Have you ever been confused by the myriad of choices offered by AWS for hosting a website or an API?
Lambda, Elastic Beanstalk, Lightsail, Amplify, S3 (and more!) can each host websites + APIs. But which one should we choose?
Which one is cheapest? Which one is fastest? Which one will scale to meet our needs?
Join me in this session as we dive into each AWS hosting service to determine which one is best for your scenario and explain why!
Choosing The Best AWS Service For Your Website + API.pptx
Lucia Specia - Estimativa de qualidade em TA
1. Quality of Machine Translation Quality Estimation Open issues Conclusions
Estimativa da qualidade da tradução automática
Lucia Specia
University of SheďŹeld
l.specia@sheffield.ac.uk
Faculdade de Letras da Universidade do Porto
13 May 2013
Estimativa da qualidade da tradução automática 1 / 31
2. Outline
1 Quality of Machine Translation
2 Quality Estimation
3 Open issues
4 Conclusions
8. Introduction
Machine Translation:
Around since the early 1950s
Increasingly more popular since 1990: statistical approaches
Software tools and data available to build translation systems - Moses and others
Increasing demand for cheaper and faster translations
How do we measure quality and progress over time?
So far... mostly automatic evaluation metrics
11. MT evaluation metrics
N-gram matching between system output and one or more reference translations: BLEU and many others
Issue 1: Too many possible good quality translations; need thousands of references to capture valid variations
Solution: HyTER (Language Weaver) annotation tool to generate all possible correct translations! [DM12]
Translations built bottom-up from word/phrase translation equivalents using FSA
2-2.5 hours' worth of expert annotation per sentence
One annotator: 5.2 × 10^6 paths
A bunch of annotators: 8.5 × 10^11 paths
15. MT evaluation metrics
Issue 2: Difficult to quantify severity of mismatching n-grams
ref: Do not buy this product, it's their craziest invention!
sys: Do buy this product, it's their craziest invention!
Some attempts to weight mismatches differently - sparse, lexicalised approach
However, the same error is more or less important depending on the user or purpose:
Severe if the end-user does not speak the source language
Trivial to post-edit by translators
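The slide's ref/sys pair makes the point computable. Below is a minimal sketch (not the deck's own code) of clipped n-gram precision, the quantity underlying BLEU, showing that dropping "not", which reverses the meaning, barely dents the score:

```python
# Clipped n-gram precision applied to the slide's reference/system pair.
# Illustrative only: BLEU combines several n-gram orders plus a brevity
# penalty, but the severity problem shows up already at this level.
from collections import Counter

def ngram_precision(hyp, ref, n):
    hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    # Clip each hypothesis n-gram count by its count in the reference.
    matched = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
    return matched / sum(hyp_ngrams.values())

ref = "Do not buy this product , it 's their craziest invention !".split()
hyp = "Do buy this product , it 's their craziest invention !".split()

uni = ngram_precision(hyp, ref, 1)  # 1.0: every output word is in the reference
bi = ngram_precision(hyp, ref, 2)   # 0.9: only the bigram "Do buy" mismatches
```

A metric that treats this near-perfect overlap as near-perfect quality has no notion that the one mismatch is catastrophic for an end-user.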
17. MT evaluation metrics
Conversely:
ref: The battery lasts 6 hours and it can be fully recharged in 30 minutes.
sys: Six-hours battery, 30 minutes to full charge last.
OK for gisting - meaning preserved
Very costly for post-editing if style is to be preserved
18. Task-based evaluation
Measure translation quality within a task. E.g. Autodesk - productivity test through post-editing [Aut11]
2-day translation and post-editing, 37 participants
In-house Moses (Autodesk data: software)
Time spent on each segment
23. Task-based evaluation
E.g.: Intel - user satisfaction with un-edited MT
Translation is good if the customer can solve their problem
MT for Customer Support websites [Int10]
Overall customer satisfaction: 75% for English→Chinese
95% reduction in cost
Project cycle from 10 days to 1 day
From 300 to 60,000 words translated/hour
Customers in China using MT texts were more satisfied with support than natives using original texts (68%)!
MT for chat and community forums [Int12]
~60% "understandable and actionable" (→English/Spanish)
Max ~10% "not understandable" (→Chinese)
26. Overview
Metrics either depend on references or on post-editing/use of translations (task-based)
Our proposal: quality assessment without references, prior to post-editing/use of translations
30. Overview
Why don't translators use (more) MT?
Translations are not good enough!
What about TMs? Aren't fuzzy matches useful?
31. Quality of Machine Translation Quality Estimation Open issues Conclusions
Framework
Quality estimation (QE): provide an estimate of
quality for new translated text *before* it is post-edited
Quality = post-editing eďŹort
Estimativa da qualidade da tradu¸cËao autom´atica 13 / 31
32. Quality of Machine Translation Quality Estimation Open issues Conclusions
Framework
Quality estimation (QE): provide an estimate of
quality for new translated text *before* it is post-edited
Quality = post-editing eďŹort
No access to reference translations: machine learning
techniques to predict post-editing eďŹort scores
Estimativa da qualidade da tradu¸cËao autom´atica 13 / 31
33. Quality of Machine Translation Quality Estimation Open issues Conclusions
Framework
Quality estimation (QE): provide an estimate of
quality for new translated text *before* it is post-edited
Quality = post-editing eďŹort
No access to reference translations: machine learning
techniques to predict post-editing eďŹort scores
Considers interaction with TM systems: only used for
low fuzzy match cases, or to select between TM and MT
Estimativa da qualidade da tradu¸cËao autom´atica 13 / 31
34. Framework
Quality estimation (QE): provide an estimate of quality for new translated text *before* it is post-edited
Quality = post-editing effort
No access to reference translations: machine learning techniques to predict post-editing effort scores
Considers interaction with TM systems: only used for low fuzzy match cases, or to select between TM and MT
QTLaunchPad project: Multidimensional Quality Metrics for MT and HT, for manual and (semi-)automatic evaluation (QE): http://www.qt21.eu/launchpad/
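The framework above can be sketched in code: labelled examples (source, MT output, human effort score) train a predictor that then scores new translations without any reference. This is a minimal illustration, not the QuEst system — the two shallow features and the k-nearest-neighbours learner are stand-ins for the richer indicators and learners discussed later, and the training data is toy data.

```python
# Minimal sketch of a QE system: predict a post-editing-effort score (1-5) for
# a new (source, MT output) pair from labelled examples, with no reference
# translation. Features and scores are toy values for illustration only.

def features(source: str, translation: str) -> list[float]:
    """Two shallow indicators: source length and target/source length ratio."""
    s_len = len(source.split())
    t_len = len(translation.split())
    return [float(s_len), t_len / max(s_len, 1)]

def knn_predict(train, x, k=3):
    """Predict a score as the mean score of the k nearest training examples."""
    dist = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5
    nearest = sorted(train, key=lambda ex: dist(ex[0], x))[:k]
    return sum(score for _, score in nearest) / len(nearest)

# Training examples: (feature vector, human post-editing-effort score 1-5)
train = [
    (features("the cat sat", "o gato sentou"), 5.0),
    (features("this is a much longer and harder sentence to translate well",
              "esta e uma frase"), 2.0),
    (features("good morning", "bom dia"), 5.0),
    (features("the quick brown fox jumps over the lazy dog",
              "a raposa pula sobre cachorro o"), 3.0),
]

score = knn_predict(train, features("hello world", "ola mundo"), k=3)
```

A real system would replace the k-NN with, e.g., a regression model trained on hundreds of annotated segments.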
36. Framework
[Diagram: the source text goes into the MT system, which produces a translation; the QE system takes the source text and translation, extracts quality indicators, and, trained on examples of source/translation pairs with quality scores, outputs a quality score.]
39. Examples of positive results
Time to post-edit a subset of sentences predicted as "good" (low effort) vs time to post-edit a random subset of sentences:

Language | no QE          | QE
fr-en    | 0.75 words/sec | 1.09 words/sec
en-es    | 0.32 words/sec | 0.57 words/sec

Accuracy in selecting the best translation among 4 MT systems:

Best MT system | Highest QE score
54%            | 77%
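The system-selection result above amounts to an argmax over per-candidate QE scores. A trivial sketch, with made-up scores standing in for a trained model's output:

```python
# Sketch of QE for system selection: given candidate translations of one source
# sentence from several MT systems, keep the one with the highest estimated
# quality. The system names and scores below are illustrative stand-ins.

def select_best(candidates: dict[str, float]) -> str:
    """Return the system whose translation received the highest QE score."""
    return max(candidates, key=candidates.get)

qe_scores = {"system_A": 0.62, "system_B": 0.81, "system_C": 0.47, "system_D": 0.75}
best = select_best(qe_scores)
```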
42. State-of-the-art
Quality indicators:
[Diagram: complexity indicators computed on the source text, confidence indicators on the MT system, fluency indicators on the translation, and adequacy indicators on the source-translation pair.]
Learning algorithms: wide range
Datasets: few with absolute human scores (1-4/5 scores, PE time, edit distance)
43. Outline
1 Quality of Machine Translation
2 Quality Estimation
3 Open issues
4 Conclusions
45. State-of-the-art indicators
Shallow indicators:
(S/T/S-T) Sentence length
(S/T) Language model
(S/T) Type-token ratio
(S) Average number of possible translations per word
(S) % of n-grams belonging to different frequency quartiles of a source language corpus
(T) Untranslated/OOV words
(T) Mismatching brackets, quotation marks
(S-T) Preservation of punctuation
(S-T) Word alignment score, etc.
These do well for estimating post-editing effort...
...but are not enough for other aspects of quality, e.g. adequacy
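A few of the shallow indicators listed above can be computed with plain string operations. A sketch, assuming a toy target-side vocabulary in place of a real lexicon; the LM and alignment indicators are omitted, since they require trained models:

```python
# Sketch of some shallow QE indicators for a source/translation pair.
# The OOV check against a small vocabulary and the punctuation check
# (comma count and final period) are simplified stand-ins.

import string

def shallow_indicators(source: str, translation: str, target_vocab: set[str]) -> dict:
    s_tok, t_tok = source.split(), translation.split()
    clean = lambda w: w.strip(string.punctuation).lower()
    t_words = [clean(w) for w in t_tok if clean(w)]
    return {
        "src_length": len(s_tok),
        "length_ratio": len(t_tok) / max(len(s_tok), 1),
        "type_token_ratio": len(set(t_words)) / max(len(t_words), 1),
        "oov_count": sum(1 for w in t_words if w not in target_vocab),
        "punct_preserved": source.count(",") == translation.count(",")
                           and source.endswith(".") == translation.endswith("."),
    }

vocab = {"o", "gato", "sentou", "no", "tapete"}
ind = shallow_indicators("the cat sat on the mat.", "o gato sentou no tapete.", vocab)
```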
46. State-of-the-art indicators
Linguistic indicators - count-based:
(S/T/S-T) Content/non-content words
(S/T/S-T) Nouns/verbs/... NP/VP/...
(S/T/S-T) Deictics (references)
(S/T/S-T) Discourse markers (references)
(S/T/S-T) Named entities
(S/T/S-T) Zero-subjects
(S/T/S-T) Pronominal subjects
(S/T/S-T) Negation indicators
(T) Subject-verb / adjective-noun agreement
(T) Language Model of POS
(T) Grammar checking (dangling words)
(T) Coherence
48. State-of-the-art indicators
Linguistic indicators - alignment-based:
(S-T) Correct translation of pronouns
(S-T) Matching of dependency relations
(S-T) Matching of named entities
(S-T) Alignment of parse trees
(S-T) Alignment of predicates & arguments, etc.
Some indicators are language-dependent; others need language-dependent resources but apply to most languages, e.g. an LM of POS tags
50. State-of-the-art indicators
Fine-grained, lexicalised indicators:

    f(target-word = "process") = 1 if source-word = "hdhh alamlyt", 0 otherwise
    f(target-word = "process") = 1 if source-pos = "DT DTNN", 0 otherwise

Closer to error detection
Need large amounts of training data [BHAO11], or rule-based approaches
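The lexicalised indicators above are binary features that fire on specific target-word/source-context pairs. A sketch using the slide's illustrative word and POS patterns (the source words and tags are examples, not a real model's feature set):

```python
# Sketch of fine-grained lexicalised indicators: binary features that fire when
# a given target word co-occurs with a given source word or source POS pattern.

def lexicalised_feature(target_word, condition, tgt_words, src_context):
    """1 if target_word appears in the output and the source-side condition holds."""
    return int(target_word in tgt_words and condition in src_context)

src_words = ["hdhh", "alamlyt"]
src_pos = ["DT", "DTNN"]
tgt_words = ["the", "process"]

f1 = lexicalised_feature("process", "hdhh alamlyt", tgt_words, " ".join(src_words))
f2 = lexicalised_feature("process", "DT DTNN", tgt_words, " ".join(src_pos))
```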
54. Do these indicators work?
To some extent... Issues:
Representation of shallow/deep indicators: counts, ratios, (absolute) differences?

    F = S - T,  F = |S - T|,  F = T/S,  F = (S - T)/S, ...

Resources to extract deep indicators: availability and reliability
Data to extract fine-grained indicators: need previously translated and post-edited data, esp. for negative examples
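The representation question can be made concrete: the same (S, T) pair of indicator values — here sentence lengths — yields different feature values under each of the formulas above:

```python
# The same source/target indicator pair (S, T) represented four ways, matching
# the formulas above: difference, absolute difference, ratio, relative difference.

def representations(s: float, t: float) -> dict[str, float]:
    return {
        "diff": s - t,          # F = S - T
        "abs_diff": abs(s - t), # F = |S - T|
        "ratio": t / s,         # F = T/S
        "rel_diff": (s - t) / s # F = (S - T)/S
    }

r = representations(s=10.0, t=8.0)
```

Which representation works best is an empirical question per indicator and learner.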
56. Manual scoring: agreement between translators
Absolute value judgements: difficult to achieve consistency across annotators even in a highly controlled setup
en-es news WMT12 dataset: 3 professional translators, 1-5 scores
15% of the initial dataset discarded: annotators disagreed by more than one category
Remaining annotations had to be scaled (0.33, 0.17, 0.50)
58. Manual scoring: agreement between translators
en-pt subtitles of TV series: 3 non-professional annotators, 1-4 scores
351 cases (41%): full agreement
445 cases (52%): partial agreement
54 cases (7%): null agreement
Agreement by score:

Score | Full agreement
4     | 59%
3     | 35%
2     | 23%
1     | 50%
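The full/partial/null breakdown above can be computed directly from per-segment score triples. A sketch with toy annotations (a majority count of 3 identical scores = full agreement, 2 = partial, all different = null):

```python
# Sketch of the three-annotator agreement categories: for each segment, three
# 1-4 scores; "full" if all three match, "partial" if exactly two match,
# "null" otherwise. The annotations below are toy data.

from collections import Counter

def agreement(scores: tuple[int, int, int]) -> str:
    top = Counter(scores).most_common(1)[0][1]  # size of the largest score group
    return {3: "full", 2: "partial", 1: "null"}[top]

annotations = [(4, 4, 4), (3, 3, 2), (1, 2, 4), (2, 2, 2)]
counts = Counter(agreement(a) for a in annotations)
```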
62. More objective ways of annotating translations
HTER: edit distance between the MT output and its minimally post-edited version

    HTER = #edits / #words in post-edited version

Edits: substitute, delete, insert, shift
Analysis by Maarit Koponen (WMT-12) on post-edited translations with HTER and 1-5 scores:
A number of cases where translations with low HTER (few edits) were assigned low quality scores (high post-editing effort), and vice-versa
Certain edits seem to require more cognitive effort than others - not captured by HTER
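HTER can be sketched as a word-level edit distance normalised by the post-edited length. Note this simplified version omits the shift edit, so it is only an upper bound on true HTER, which is computed with dedicated tools:

```python
# Simplified HTER: word-level Levenshtein distance between the MT output and
# its post-edited version, divided by the post-edited length. Real HTER also
# allows block shifts, so this count can slightly overestimate the true value.

def hter(mt: str, post_edited: str) -> float:
    a, b = mt.split(), post_edited.split()
    # Levenshtein distance over words, row-by-row dynamic programming.
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, 1):
        curr = [i]
        for j, wb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (wa != wb)))   # substitution / match
        prev = curr
    return prev[-1] / len(b)

# One substitution (sit -> sat) and one insertion ("the"): 2 edits, 6 words.
score = hter("the cat sit on mat", "the cat sat on the mat")
```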
64. More objective ways of annotating translations
TIME: varies considerably across translators (expected)
[Chart: post-editing time in seconds per word for segments 1-20, annotators A1-A8]
Can we normalise this variation?
A dedicated QE system for each translator?
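One plausible normalisation (an assumption for illustration, not the talk's proposal) is to z-score each translator's times against their own mean and spread, so that "slow for this translator" becomes comparable across translators:

```python
# Per-annotator z-score normalisation of post-editing times: two annotators
# with very different speeds but the same relative pattern get identical
# normalised profiles. Times are illustrative seconds-per-word values.

from statistics import mean, stdev

def normalise(times: list[float]) -> list[float]:
    m, s = mean(times), stdev(times)
    return [(t - m) / s for t in times]

a1 = normalise([2.0, 4.0, 6.0])     # a fast annotator
a2 = normalise([10.0, 20.0, 30.0])  # a slow annotator, same relative pattern
```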
65. More objective ways of annotating translations
Time, HTER, Keystrokes: data from 8 post-editors
PET: http://pers-www.wlv.ac.uk/~in1676/pet/
69. How to use estimated PE effort scores?
Should (supposedly) bad quality translations be filtered out or shown to translators (different scores/colour codes as in TMs)?
Wasting time reading scores and translations vs wasting "gisting" information
How to define a threshold on the estimated translation quality to decide what should be filtered out?
Translator dependent
Task dependent (SDL)
Do translators prefer detailed estimates (sub-sentence level) or an overall estimate for the complete sentence?
Too much information vs hard-to-interpret scores
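The threshold question above can be sketched as a simple filter; the segment scores and the 0.5 cut-off are illustrative, since, as noted, the right threshold is translator- and task-dependent:

```python
# Sketch of threshold-based filtering: translations whose estimated quality
# falls below a chosen threshold are hidden from the post-editor.

def filter_for_postediting(segments, threshold=0.5):
    """Keep only segments whose QE score meets the threshold."""
    return [seg for seg, score in segments if score >= threshold]

segments = [("seg1", 0.9), ("seg2", 0.3), ("seg3", 0.6)]
shown = filter_for_postediting(segments, threshold=0.5)
```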
70. Outline
1 Quality of Machine Translation
2 Quality Estimation
3 Open issues
4 Conclusions
76. Conclusions
It is possible to estimate at least certain aspects of MT quality, esp. wrt PE effort: QuEst
http://quest.dcs.shef.ac.uk/
PE effort estimates can be used in real applications:
Ranking translations: filter out bad quality translations
Selecting translations from multiple MT systems
Commercial products by SDL (document-level for gisting) and Multilizer
A number of open issues to be investigated...
Collaboration with "human translators" essential
My vision: sub-sentence level QE (error detection), highlighting errors but also giving an overall estimate for the sentence
77. Estimativa da qualidade da tradução automática (Quality estimation of machine translation)
Lucia Specia
University of Sheffield
l.specia@sheffield.ac.uk
Faculdade de Letras da Universidade do Porto
13 May 2013
78. References
Autodesk. Translation and Post-Editing Productivity. http://translate.autodesk.com/productivity.html, 2011.

[BHAO11] Nguyen Bach, Fei Huang, and Yaser Al-Onaizan. Goodness: a method for measuring machine translation confidence. Pages 211-219, Portland, Oregon, 2011.

Markus Dreyer and Daniel Marcu. HyTER: Meaning-equivalent semantics for translation evaluation. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 162-171, Montréal, Canada, 2012.

Intel. Being Streetwise with Machine Translation in an Enterprise Neighborhood. http://mtmarathon2010.info/JEC2010_Burgett_slides.pptx, 2010.

[Int12] Intel. Enabling Multilingual Collaboration through Machine Translation. http://media12.connectedsocialmedia.com/intel/06/8647/Enabling_Multilingual_Collaboration_Machine_Translation.pdf, 2012.