Lucia Specia - Quality estimation in MT (Estimativa de qualidade em TA)


  1. Quality estimation for machine translation (Estimativa da qualidade da tradução automática)
     Lucia Specia, University of Sheffield, l.specia@sheffield.ac.uk
     Faculdade de Letras da Universidade do Porto, 13 May 2013
     Sections: Quality of Machine Translation · Quality Estimation · Open issues · Conclusions
  2. Outline
     1 Quality of Machine Translation
     2 Quality Estimation
     3 Open issues
     4 Conclusions
  3. Outline (section 1 of 4): Quality of Machine Translation
  8. Introduction
     Machine Translation:
     • Around since the early 1950s
     • Increasingly popular since 1990: statistical approaches
     • Software tools and data available to build translation systems: Moses and others
     • Increasing demand for cheaper and faster translations
     How do we measure quality and progress over time? So far, mostly with automatic evaluation metrics.
  11. MT evaluation metrics
      • N-gram matching between system output and one or more reference translations: BLEU and many others
      • Issue 1: too many possible good-quality translations; thousands of references are needed to capture valid variations
      • Solution: HyTER (Language Weaver), an annotation tool to generate all possible correct translations! [DM12]
        • Translations built bottom-up from word/phrase translation equivalents using FSA
        • 2-2.5 hours' worth of expert annotation per sentence
        • One annotator: $5.2 \times 10^{6}$ paths
        • A bunch of annotators: $8.5 \times 10^{11}$ paths
  15. MT evaluation metrics
      • Issue 2: difficult to quantify the severity of mismatching n-grams
        ref: Do not buy this product, it’s their craziest invention!
        sys: Do buy this product, it’s their craziest invention!
      • Some attempts to weight mismatches differently: sparse, lexicalised approach
      • However, the same error is more or less important depending on the user or purpose:
        • Severe if the end-user does not speak the source language
        • Trivial for translators to post-edit
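To make Issue 2 concrete, the sketch below computes clipped n-gram precision (the building block of BLEU-style metrics, not full BLEU) for the ref/sys pair on this slide. The whitespace tokenisation and the function names are assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_precision(sys_tokens, ref_tokens, n):
    """Clipped n-gram precision of a system output against one reference."""
    sys_counts = Counter(ngrams(sys_tokens, n))
    ref_counts = Counter(ngrams(ref_tokens, n))
    total = sum(sys_counts.values())
    if total == 0:
        return 0.0
    # Each system n-gram is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref_counts[gram]) for gram, count in sys_counts.items())
    return matched / total


# Simplified, pre-tokenised versions of the slide's example sentences.
ref = "do not buy this product , it's their craziest invention !".split()
sys_out = "do buy this product , it's their craziest invention !".split()

unigram_p = ngram_precision(sys_out, ref, 1)
bigram_p = ngram_precision(sys_out, ref, 2)
```

Every system unigram appears in the reference and only one of the nine system bigrams is unmatched, so the overlap scores stay high even though the dropped "not" reverses the meaning.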
  17. MT evaluation metrics
      Conversely:
        ref: The battery lasts 6 hours and it can be fully recharged in 30 minutes.
        sys: Six-hours battery, 30 minutes to full charge last.
      • OK for gisting: meaning preserved
      • Very costly for post-editing if style is to be preserved
  18. Task-based evaluation
      Measure translation quality within a task. E.g. Autodesk: productivity test through post-editing [Aut11]
      • 2-day translation and post-editing, 37 participants
      • In-house Moses (Autodesk data: software)
      • Time spent on each segment
  23. Task-based evaluation
      E.g. Intel: user satisfaction with un-edited MT. Translation is good if the customer can solve the problem.
      • MT for Customer Support websites [Int10]
        • Overall customer satisfaction: 75% for English→Chinese
        • 95% reduction in cost
        • Project cycle from 10 days to 1 day
        • From 300 to 60,000 words translated/hour
        • Customers in China using MT texts were more satisfied with support than natives using original texts (68%)!
      • MT for chat and community forums [Int12]
        • ∼60% “understandable and actionable” (→English/Spanish)
        • Max ∼10% “not understandable” (→Chinese)
  24. Outline (section 2 of 4): Quality Estimation
  26. Overview
      Metrics depend either on references or on post-editing/use of translations (task-based).
      Our proposal: quality assessment without reference, prior to post-editing/use of translations.
  30. Overview
      Why don’t translators use (more) MT?
      • Translations are not good enough!
      • What about TMs? Aren’t fuzzy matches useful?
  34. Framework
      • Quality estimation (QE): provide an estimate of quality for new translated text *before* it is post-edited
      • Quality = post-editing effort
      • No access to reference translations: machine learning techniques to predict post-editing effort scores
      • Considers interaction with TM systems: only used for low fuzzy match cases, or to select between TM and MT
      QTLaunchPad project: Multidimensional Quality Metrics for MT and HT, for manual and (semi-)automatic evaluation (QE): http://www.qt21.eu/launchpad/
  36. Framework
      [Diagram. Training: examples (source & translations, quality scores) → quality indicators → QE system. Use: source text → MT system → translation → QE system → quality score.]
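The framework in the diagram can be sketched as a tiny supervised learner: labelled examples are turned into indicator values, a model is fitted, and the model then scores an unseen translation without a reference. Everything below is an illustrative assumption (a single indicator, ordinary least squares, invented training numbers); the slides do not prescribe a particular learning algorithm.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one indicator: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Estimate the quality score of a new translation from its indicator value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training examples: fraction of untranslated (OOV) words in each
# translation, paired with an observed post-editing effort score.
oov_fraction = [0.0, 0.1, 0.2, 0.3]
effort_score = [0.1, 0.2, 0.3, 0.4]

qe_model = fit_line(oov_fraction, effort_score)
estimated_effort = predict(qe_model, 0.5)  # score for a new, unseen translation
```

A real QE system would use many indicators and a stronger learner; the point here is only the train-then-estimate flow.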
  39. Examples of positive results
      • Time to post-edit the subset of sentences predicted as “good” (low effort) vs time to post-edit a random subset of sentences:

        Language   no QE             QE
        fr-en      0.75 words/sec    1.09 words/sec
        en-es      0.32 words/sec    0.57 words/sec

      • Accuracy in selecting the best translation among 4 MT systems:

        Best MT system   Highest QE score
        54%              77%
  42. State-of-the-art
      • Quality indicators: [Diagram: complexity indicators (from the source text), confidence indicators (from the MT system), fluency indicators (from the translation), adequacy indicators (from source text and translation)]
      • Learning algorithms: wide range
      • Datasets: few with absolute human scores (1-4/5 scores, PE time, edit distance)
  43. Outline (section 3 of 4): Open issues
  45. State-of-the-art indicators
      Shallow indicators (S = source, T = target, S-T = source-target):
      • (S/T/S-T) Sentence length
      • (S/T) Language model
      • (S/T) Token-type ratio
      • (S) Average number of possible translations per word
      • (S) % of n-grams belonging to different frequency quartiles of a source language corpus
      • (T) Untranslated/OOV words
      • (T) Mismatching brackets, quotation marks
      • (S-T) Preservation of punctuation
      • (S-T) Word alignment score, etc.
      These do well for estimating post-editing effort, but are not enough for other aspects of quality, e.g. adequacy.
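A few of the shallow indicators listed above need nothing beyond the two sentences themselves. The sketch below is one possible reading of them, assuming whitespace tokenisation; the feature names and the example sentence pair are invented for illustration and are not taken from any particular QE toolkit.

```python
import string


def token_type_ratio(tokens):
    """Number of tokens divided by number of distinct tokens (types)."""
    return len(tokens) / len(set(tokens)) if tokens else 0.0


def punctuation_count(text):
    """Count ASCII punctuation characters in a sentence."""
    return sum(ch in string.punctuation for ch in text)


def has_mismatched_brackets(text):
    """True if round/square bracket counts differ or double quotes are unpaired."""
    return (
        text.count("(") != text.count(")")
        or text.count("[") != text.count("]")
        or text.count('"') % 2 != 0
    )


def shallow_features(source, target):
    """Extract a handful of shallow S, T and S-T indicators for one sentence pair."""
    src_tokens, tgt_tokens = source.split(), target.split()
    return {
        "src_length": len(src_tokens),                          # (S)
        "tgt_length": len(tgt_tokens),                          # (T)
        "src_token_type_ratio": token_type_ratio(src_tokens),   # (S)
        "tgt_token_type_ratio": token_type_ratio(tgt_tokens),   # (T)
        "tgt_mismatched_brackets": has_mismatched_brackets(target),  # (T)
        "punctuation_diff": abs(punctuation_count(source) - punctuation_count(target)),  # (S-T)
    }


features = shallow_features(
    "The battery ( new ) lasts 6 hours .",
    "A bateria ( nova dura 6 horas .",
)
```

Indicators such as language model scores or word alignment scores need external models and corpora, which is why they are left out of this sketch.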
  46. State-of-the-art indicators
      Linguistic indicators, count-based:
      • (S/T/S-T) Content/non-content words
      • (S/T/S-T) Nouns/verbs/... NP/VP/...
      • (S/T/S-T) Deictics (references)
      • (S/T/S-T) Discourse markers (references)
      • (S/T/S-T) Named entities
      • (S/T/S-T) Zero-subjects
      • (S/T/S-T) Pronominal subjects
      • (S/T/S-T) Negation indicators
      • (T) Subject-verb / adjective-noun agreement
      • (T) Language Model of POS
      • (T) Grammar checking (dangling words)
      • (T) Coherence
  48. State-of-the-art indicators
      Linguistic indicators, alignment-based:
      • (S-T) Correct translation of pronouns
      • (S-T) Matching of dependency relations
      • (S-T) Matching of named entities
      • (S-T) Alignment of parse trees
      • (S-T) Alignment of predicates & arguments, etc.
      Some indicators are language-dependent; others need resources that are language-dependent but apply to most languages, e.g. LM of POS tags.
  50. State-of-the-art indicators
      Fine-grained, lexicalised indicators:
      • target-word = “process”: 1 if source-word = “hdhh alamlyt”, 0 otherwise
      • target-word = “process”: 1 if source-pos = “DT DTNN”, 0 otherwise
      • Closer to error detection
      • Need large amounts of training data [BHAO11], or RB approaches
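The two lexicalised indicators on this slide can be read as binary feature functions over a target word and the source material aligned to it. The sketch below follows that reading; the dictionary layout (`target_word`, `source_word`, `source_pos`) and the helper name are hypothetical, and the word alignment is assumed to be given.

```python
def make_lexical_indicator(target_word, source_key, source_value):
    """Build a binary indicator that fires for one (target word, source property) pair."""
    def indicator(token):
        fires = (
            token.get("target_word") == target_word
            and token.get(source_key) == source_value
        )
        return 1 if fires else 0
    return indicator


# The two indicators from the slide.
word_indicator = make_lexical_indicator("process", "source_word", "hdhh alamlyt")
pos_indicator = make_lexical_indicator("process", "source_pos", "DT DTNN")

# A hypothetical aligned target token with its source-side information.
token = {"target_word": "process", "source_word": "hdhh alamlyt", "source_pos": "DT DTNN"}
other = {"target_word": "process", "source_word": "something else", "source_pos": "NN"}

fired = (word_indicator(token), pos_indicator(token))
not_fired = (word_indicator(other), pos_indicator(other))
```

Because there is one such feature per word pair, the feature space is very sparse, which is consistent with the slide's remark about needing large amounts of training data.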
  54. Do these indicators work?
      To some extent... Issues:
      • Representation of shallow/deep indicators: counts, ratios, (absolute) differences?
        $F = S - T$, $F = |S - T|$, $F = \frac{T}{S}$, $F = \frac{S - T}{S}$, ...
      • Resources to extract deep indicators: availability and reliability
      • Data to extract fine-grained indicators: need previously translated and post-edited data, esp. for negative examples
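The representation question can be illustrated by computing all four variants for one source-side count S and one target-side count T. This is a minimal sketch; the function name, the dictionary keys and the choice to return 0.0 when S is zero are assumptions.

```python
def combine_counts(s, t):
    """Return the four candidate representations of a source count s and target count t."""
    return {
        "difference": s - t,                                 # F = S - T
        "abs_difference": abs(s - t),                        # F = |S - T|
        "ratio": t / s if s else 0.0,                        # F = T / S
        "relative_difference": (s - t) / s if s else 0.0,    # F = (S - T) / S
    }


# Example: 10 nouns in the source sentence, 8 in the translation.
noun_features = combine_counts(10, 8)
```

The raw difference depends on sentence length, while the two ratio forms do not, which is one reason the choice of representation can matter to the learner.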
  56. Manual scoring: agreement between translators
      Absolute value judgements: difficult to achieve consistency across annotators even in a highly controlled setup
      • en-es news WMT12 dataset: 3 professional translators, 1-5 scores
        • 15% of the initial dataset discarded: annotators disagreed by more than one category
        • Remaining annotations had to be scaled (0.33, 0.17, 0.50)
  58. Manual scoring: agreement between translators
      en-pt subtitles of TV series: 3 non-professional annotators, 1-4 scores
      • 351 cases (41%): full agreement
      • 445 cases (52%): partial agreement
      • 54 cases (7%): null agreement
      Agreement by score:

        Score   Full
        4       59%
        3       35%
        2       23%
        1       50%
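The full/partial/null breakdown above can be computed as follows, assuming "full" means all three annotators gave the same score, "partial" means exactly two agreed, and "null" means all three differed. That reading, and the toy judgements, are assumptions; the slide does not define the categories.

```python
from collections import Counter


def agreement_level(scores):
    """Classify one item's scores from three annotators as full, partial or null agreement."""
    distinct = len(set(scores))
    if distinct == 1:
        return "full"
    if distinct == len(scores):
        return "null"
    return "partial"


# Hypothetical 1-4 scores from three annotators for four segments.
judgements = [(4, 4, 4), (3, 3, 2), (1, 2, 3), (4, 4, 4)]
agreement_counts = Counter(agreement_level(scores) for scores in judgements)
```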
  62. More objective ways of annotating translations
      HTER: edit distance between MT output and its minimally post-edited version
        $\mathrm{HTER} = \frac{\#\text{edits}}{\#\text{words in post-edited version}}$
        Edits: substitute, delete, insert, shift
      Analysis by Maarit Koponen (WMT-12) on post-edited translations with HTER and 1-5 scores:
      • A number of cases where translations with low HTER (few edits) were assigned low quality scores (high post-editing effort), and vice-versa
      • Certain edits seem to require more cognitive effort than others: not captured by HTER
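A rough approximation of the HTER formula can be computed with a word-level edit distance. Note the simplification: the sketch below counts only substitutions, deletions and insertions and omits the shift operation, so it can overestimate the edit count when words are reordered. The function names and the example pair are assumptions.

```python
def word_edit_distance(hyp_tokens, ref_tokens):
    """Word-level Levenshtein distance (substitute, delete, insert; no shifts)."""
    prev = list(range(len(ref_tokens) + 1))
    for i, hyp_word in enumerate(hyp_tokens, start=1):
        curr = [i]
        for j, ref_word in enumerate(ref_tokens, start=1):
            cost = 0 if hyp_word == ref_word else 1
            curr.append(min(
                prev[j] + 1,         # delete a word from the MT output
                curr[j - 1] + 1,     # insert a word into the MT output
                prev[j - 1] + cost,  # keep or substitute a word
            ))
        prev = curr
    return prev[-1]


def approx_hter(mt_output, post_edited):
    """Edits divided by the number of words in the post-edited version."""
    pe_tokens = post_edited.split()
    if not pe_tokens:
        raise ValueError("post-edited version must not be empty")
    return word_edit_distance(mt_output.split(), pe_tokens) / len(pe_tokens)


# One missing word, five words in the post-edited version.
score = approx_hter("do buy this product", "do not buy this product")
```

The example also shows the limitation raised on this slide: a single inserted word gives a low score, yet spotting that "not" is missing may take far more effort than the edit count suggests.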
  63. More objective ways of annotating translations
      TIME: varies considerably across translators (expected)
      [Figure: post-editing time in seconds (axis 0-600) for segments 1-20, one series per annotator A1-A8]
      • Can we normalise this variation?
      • A dedicated QE system for each translator?
  64. More objective ways of annotating translations
      TIME: varies considerably across translators (expected)
      [Figure: post-editing time in seconds per word (axis 0-25) for segments 1-20, one series per annotator A1-A8]
      • Can we normalise this variation?
      • A dedicated QE system for each translator?
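One possible answer to the normalisation question, offered only as a sketch, is to standardise each annotator's times against that annotator's own mean and standard deviation (z-scores). The slides do not say this was done; the data below is invented.

```python
from statistics import mean, pstdev


def normalise_per_annotator(times):
    """Map each annotator's seconds-per-word values to z-scores within that annotator."""
    normalised = {}
    for annotator, values in times.items():
        mu = mean(values)
        sigma = pstdev(values)
        # An annotator with no variation gets all zeros instead of a division by zero.
        normalised[annotator] = [(v - mu) / sigma if sigma else 0.0 for v in values]
    return normalised


# Hypothetical seconds-per-word for the same three segments: A2 is ten times slower
# than A1 overall, but both find the third segment hardest.
times = {"A1": [1.0, 2.0, 3.0], "A2": [10.0, 20.0, 30.0]}
z_scores = normalise_per_annotator(times)
```

After normalisation the two annotators have identical profiles, so a single model could in principle be trained on their pooled data; whether that works better than a dedicated QE system per translator is the open question on the slide.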
  65. More objective ways of annotating translations
      Time, HTER, Keystrokes: data from 8 post-editors
  66. More objective ways of annotating translations
      PET: http://pers-www.wlv.ac.uk/~in1676/pet/
  69. How to use estimated PE effort scores?
      • Should (supposedly) bad quality translations be filtered out or shown to translators (different scores/colour codes as in TMs)?
        • Wasting time to read scores and translations vs wasting “gisting” information
      • How to define a threshold on the estimated translation quality to decide what should be filtered out?
        • Translator dependent
        • Task dependent (SDL)
      • Do translators prefer detailed estimates (sub-sentence level) or an overall estimate for the complete sentence?
        • Too much information vs hard-to-interpret scores
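The two uses discussed here, filtering by a threshold and picking the best of several MT outputs, reduce to very little code once estimated scores exist. In this sketch higher scores mean better estimated quality, and the system names, scores and threshold are invented; choosing the threshold is exactly the translator- and task-dependent question raised above.

```python
def filter_by_threshold(candidates, threshold):
    """Keep only (label, score) pairs whose estimated quality reaches the threshold."""
    return [candidate for candidate in candidates if candidate[1] >= threshold]


def select_best(candidates):
    """Return the (label, score) pair with the highest estimated quality."""
    return max(candidates, key=lambda candidate: candidate[1])


# Hypothetical QE scores for one source sentence translated by three MT systems.
candidates = [("MT-A", 0.62), ("MT-B", 0.81), ("MT-C", 0.35)]

shown_to_translator = filter_by_threshold(candidates, threshold=0.5)
best = select_best(candidates)
```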
  70. Outline (section 4 of 4): Conclusions
  76. Conclusions
      • It is possible to estimate at least certain aspects of MT quality, esp. wrt PE effort: QuEst, http://quest.dcs.shef.ac.uk/
      • PE effort estimates can be used in real applications
        • Ranking translations: filter out bad quality translations
        • Selecting translations from multiple MT systems
      • Commercial products by SDL (document-level for gisting) and Multilizer
      • A number of open issues to be investigated...
      • Collaboration with “human translators” essential
      My vision: sub-sentence level QE (error detection), highlighting errors but also giving an overall estimate for the sentence
  78. References
      [Aut11] Autodesk. Translation and Post-Editing Productivity. http://translate.autodesk.com/productivity.html, 2011.
      [BHAO11] Nguyen Bach, Fei Huang, and Yaser Al-Onaizan. Goodness: a method for measuring machine translation confidence. Pages 211–219, Portland, Oregon, 2011.
      [DM12] Markus Dreyer and Daniel Marcu. HyTER: Meaning-equivalent semantics for translation evaluation. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 162–171, Montréal, Canada, 2012.
      [Int10] Intel. Being Streetwise with Machine Translation in an Enterprise Neighborhood. http://mtmarathon2010.info/JEC2010_Burgett_slides.pptx, 2010.
      [Int12] Intel. Enabling Multilingual Collaboration through Machine Translation. http://media12.connectedsocialmedia.com/intel/06/8647/Enabling_Multilingual_Collaboration_Machine_Translation.pdf, 2012.
