TAUS Best Practices
MT Post-Editing Guidelines
November 2010
Objectives and scope
These guidelines are aimed at helping customers and service providers set clear expectations and can be used as a basis on which to instruct post-editors.
Each company's post-editing guidelines are likely to vary depending on a range of parameters. It is not practical to present a set of guidelines that will cover all scenarios. We expect that organisations will use these baseline guidelines and will tailor them as required for their own purposes. Generally, these guidelines assume bilingual post-editing (not monolingual) that is ideally carried out by a paid translator but that might in some scenarios be carried out by bilingual domain experts or volunteers. The guidelines are not system- or language-specific.
Recommendations
To reduce the level of post-editing required (regardless of language pair, direction, system type or domain), we recommend the following:
• Tune your system appropriately, i.e. ensure high-level dictionary and linguistic coding for RBMT systems, or training with clean, high-quality, domain-specific data for data-driven or hybrid systems.
• Ensure the source text is written well (i.e. it is unambiguous and has correct spelling and punctuation) and, if possible, tuned for translation by MT (i.e. by using specific authoring rules that suit the MT system in question).
• Integrate terminology management across source text authoring, MT and TM systems.
• Train post-editors in advance.
• Examine the raw MT output quality before negotiating throughput and price, and set reasonable expectations.
• Agree a definition for the final quality of the information to be post-edited, based on user type and levels of acceptance.
• Pay post-editors to give structured feedback on common MT errors (and, if necessary, guide them in how to do this) so the system can be improved over time.
Post-editing Guidelines
Assuming the recommendations above are implemented, we suggest some basic guidelines for post-editing. The effort involved in post-editing will be determined by two main criteria:
1. The quality of the MT raw output.
2. The expected end quality of the content.
To reach quality similar to “high-quality human translation and revision” (a.k.a. “publishable quality”), full post-editing is usually recommended. For quality of a lower standard, often referred to as “good enough” or “fit for purpose”, light post-editing is usually recommended. However, light post-editing of really poor MT output may not bring the output up to publishable quality standards. On the other hand, if the raw MT output is of good quality, then perhaps all that is needed is a light, not a full, post-edit to achieve publishable quality. So, instead of differentiating between guidelines for light and full post-editing, we will differentiate here between two levels of expected quality. Other levels could be defined, but we will stick to two here to keep things simple. The set of guidelines proposed below is conceptualised as a group from which individual guidelines can be selected, depending on the needs of the customer and the raw MT quality.
Guidelines for achieving “good enough” quality
“Good enough” is defined as comprehensible (i.e. you can understand the main content of the message) and accurate (i.e. it communicates the same meaning as the source text), but not stylistically compelling. The text may sound like it was generated by a computer, syntax might be somewhat unusual, and grammar may not be perfect, but the message is accurate.
• Aim for semantically correct translation.
• Ensure that no information has been accidentally added or omitted.
• Edit any offensive, inappropriate or culturally unacceptable content.
• Use as much of the raw MT output as possible.
• Basic rules regarding spelling apply.
• No need to implement corrections that are of a stylistic nature only.
• No need to restructure sentences solely to improve the natural flow of the text.
Guidelines for achieving quality similar or equal to human translation
This level of quality is generally defined as being comprehensible (i.e. an end user perfectly understands the content of the message), accurate (i.e. it communicates the same meaning as the source text) and stylistically fine, though the style may not be as good as that achieved by a native-speaker human translator. Syntax is normal, and grammar and punctuation are correct.
• Aim for grammatically, syntactically and semantically correct translation.
• Ensure that key terminology is correctly translated and that untranslated terms belong to the client's list of “Do Not Translate” terms.
• Ensure that no information has been accidentally added or omitted.
• Edit any offensive, inappropriate or culturally unacceptable content.
• Use as much of the raw MT output as possible.
• Basic rules regarding spelling, punctuation and hyphenation apply.
• Ensure that formatting is correct.
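For teams that keep post-editing instructions in tooling, the two guideline sets can be held as plain data so that the difference between the quality levels is explicit. The following Python snippet is a minimal sketch and not part of the TAUS guidelines: the names `SHARED`, `GOOD_ENOUGH`, `HUMAN_QUALITY` and `extra_checks` are hypothetical, and the short labels are paraphrases of the bullets above.

```python
# Hypothetical labels paraphrasing the checks common to both quality levels.
SHARED = [
    "semantically correct",
    "nothing accidentally added or omitted",
    "no offensive, inappropriate or culturally unacceptable content",
    "reuse as much raw MT output as possible",
    "correct spelling",
]

# "Good enough" quality: only the shared checks apply.
GOOD_ENOUGH = list(SHARED)

# Quality similar or equal to human translation: shared checks plus four more.
HUMAN_QUALITY = SHARED + [
    "grammatically and syntactically correct",
    "key terminology correct; untranslated terms on the Do Not Translate list",
    "correct punctuation and hyphenation",
    "correct formatting",
]


def extra_checks(target, baseline):
    """Return the checks in `target` that are absent from `baseline`, in order."""
    return [check for check in target if check not in baseline]


if __name__ == "__main__":
    # List what a post-editor must additionally verify at the higher level.
    for check in extra_checks(HUMAN_QUALITY, GOOD_ENOUGH):
        print("additional check:", check)
```

The two “no need to” items in the “good enough” list are permissions rather than checks, so this sketch leaves them out of the data.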
Our thanks to:
Thanks to everyone who has helped to put these guidelines together. We were very fortunate to have the help of TAUS Members, governmental institutions and translator organizations. Details about the project team and process for arriving at these guidelines can be found here.
Special thanks to Sharon O'Brien, Dublin City University and CNGL, and Fred Hollowood, Symantec and TAUS Advisory Board, for their dedication and support in putting these guidelines together.