SDL BeGlobal The SDL Platform for Automated Translation


Post-edited machine translation, as a skill and as an addition to the professional translators’ toolkit, is now becoming widely accepted. Here you can see why...

Published in: Technology, Business

  1. SDL BeGlobal The SDL Platform for Automated Translation
     SDL Proprietary and Confidential
  2. Platform Characteristics
  3. The SDL BeGlobal platform
     Multiple access points to MT engines: WorldServer, Trados Studio, Passolo, LivePerson, Oracle RightNow, Custom Integrations
     Diagram labels: Realtime API, TouchPoints, Translation Engines, Role-based access web UI, Customized Language Pairs, Generic Language Pairs
  4. Infrastructure Platform
     Scalable: Serving thousands of users in real-time
     Private: All requests remain private and confidential to users
     Secure: Only accessible to those privileged
  5. Infrastructure Platform
     Organized: Only users with specific rights can do specific things
     Available & Deployable: Can instantly be switched on (cloud offering)
     Made for integration: Catering to IT professionals
  6. Available Language Pairs
     Western European: Danish <> English; Dutch <> English; Finnish <> English; French <> English; French <> Arabic; French <> Spanish, German; German <> English; German <> French; German <> Spanish; Greek <> English; Italian <> English, Spanish; Norwegian <> English; Portuguese <> English; Spanish <> English, Arabic, French, German, Italian; Swedish <> English
     Eastern European: Albanian <> English; Bulgarian <> English; Czech <> English; Estonian <> English; Hungarian <> English; Latvian <> English; Lithuanian <> English; Polish <> English; Romanian <> English; Russian <> English; Serbian <> English; Slovak <> English; Slovenian <> English; Turkish <> English; Ukrainian <> English
     Middle East & African: Arabic <> English; Arabic <> Spanish; Arabic <> French; Dari <> English; Hausa <> English; Hebrew <> English; Pashto <> English; Persian <> English; Somali <> English
     Asian: Bengali <> English; Hindi <> English; Indonesian <> English; Japanese <> English; Korean <> English; Simplified Chinese <> English; Thai <> English; Traditional Chinese <> English; Urdu <> English
     New language pairs are added on a continuing basis. Custom developments welcome.
  7. Integrations: Business User Focus
  8. Integrations: Web Translator
     A simple web application for translating text and documents, customizable with the customer’s visual identity
  9. Integrations: BeGlobal Desktop
     A desktop tool that allows translating text and documents
  10. Integrations: Microsoft Word
     The Word plugin allows the user to translate the document that they have on the screen
  11. Integrations: Microsoft Word
     The source file is displayed next to the target translation once completed. This means the user can see the 2 documents
  12. Integrations: Microsoft Outlook
     The Outlook plugin will translate an email, either appending the translation to the source, or overwriting it
  13. Integrations: LivePerson chat
     The agent chats in English and all incoming Spanish messages are automatically translated
     The user chats in Spanish and receives the agent’s messages also in Spanish
  14. Integrations: SDL Trados Studio
     Define BeGlobal connection
     Define pre-translation settings
  15. Integrations: SDL Trados Studio
     Translation results
     Trados Studio Editor
  16. BeGlobal Trainer
     Continuous improvement
  17. Language Pair Customization Process – Then and Now
     • Professional Services approach
       Diagram labels: Parallel Data, TMs, Glossaries; The Customer; Translation Engines; SDL Applied Science Engineers; Custom Language Pair; Training Data; Deploy; Customized Language Pairs; Generic Language Pairs; SDL SMT “Training” Environment
     • BeGlobal Trainer self-service approach
       Diagram labels: Parallel Data, TMs, Glossaries; The Customer; BeGlobal Trainer
  18. What is BeGlobal Trainer?
     • A Software-as-a-Service component of SDL BeGlobal that allows the customization (“training”) of MT engines based on a parallel corpus in TMX format.
     • Performs domain adaptation for existing generic engines.
     • The packaged (commercial) approach of SDL Science Labs to MT Language Pair customization, summing up 15+ years of field experience.
     • A simple 4-step process:
  19. The “Training” Workflow
     WORK OFFLINE: Collect Parallel Data, Prepare Parallel Data
       TMX files used as training data (single file or multiple)
       At least 500K/1M source words for domain improvement
     BEGLOBAL TRAINER: Create New Training, Upload Data, Train MT Engine, Evaluate Engine
     BEGLOBAL SaaS: Deploy Engine, Translate
       BeGlobal Online, API, SDL Trados Studio, …
  20. Trainer Concept
     Steps: Upload, Clean, Train, Evaluate, Deploy
     Automatic cleaning is done on the training corpus.
     Multiple internal trainings are performed to determine the best/fastest engine.
     “BLEU” score evaluation: BiLingual Evaluation Understudy, an algorithm for evaluating the quality of text which has been machine-translated. “The closer a machine translation is to the professional human translation in the TMX file, the better it is.”
     Human evaluation: optional human evaluation on the regression set (trained vs generic) or custom evaluation
     Other diagram labels: Training corpus (TMX); Complete training; Trained Language Pair; Regression set (TXT); Test set (TMX); Deployed Custom Language Pair; Generic
  21. Real-life applications
     Diagram labels: Translation productivity; Translate; Post-edit; WorldServer, Trados Studio; Machine Translate; Generate TM; BG Integrations; Deploy; BeGlobal Platform; Train; BeGlobal Trainer; Integrations & Self-service translations
  22. Fully documented, fully supported
     • Service provided with full documentation, user guide, best practices and dedicated consultancy
  23. Demonstrations
  24. Post-Editing Best Practice
  25. In Order to Post-Edit Effectively...
     • Retain as much raw MT output as possible
     • Some parts of the MT output can almost always be used
     • Don’t over-edit or under-edit
     • For each segment, re-read your translation
       – no details missing
       – no superfluous words left in
     Content and therefore MT output can vary. More or less editing may be required depending on the nature or difficulty of the material.
  26. Typical Features to Watch out for in SMT Output
     • SMT output is generally fluent in style & consistent with existing data
       – (i.e. data used to build the engine)
     • BUT look out for the following issues (sample of just some of the issues):
       – Extra words in target, not present in the source
       – Words missing in the target (adjectives, nouns, verbs, prefixes)
       – Mistranslations (antonyms: remove/install, can/cannot, do/don’t)
       – Grammar: gender, agreement or verb inflection might be wrong
  27. Extra Words in the Target
     • EN-FR
       Source: Right and left fenders are the same.
       MT Output: Les ailes droites et gauches sont bien les mêmes.
       PE Version: Les ailes droites et gauches sont les mêmes.
       Comments: Word added
     • EN-ES
       Source: The email attachment is supported.
       MT Output: El como archivos adjuntos al correo electrónico es compatible.
       PE Version: El archivo adjunto al correo electrónico es compatible.
       Comments: Word added
  28. Words Missing in the Target
     • IT-EN
       Source: Come è possibile regolare la luminosità dello schermo del display?
       MT Output: How can adjust the brightness the display screen?
       PE Version: How can I adjust the brightness of the display screen?
       Comments: Subject missing in the target (not required in the source)
     • EN-FR
       Source: Using hardware removed earlier, attach the new center link.
       MT Output: A l'aide du matériel de fixation déposé plus tôt, fixez la nouvelle articulation.
       PE Version: A l'aide du matériel de fixation déposé plus tôt, fixez la nouvelle articulation centrale.
       Comments: Missing word in the target
  29. Mistranslations
     • EN-FR
       Source: You can alter, publicly display, publicly perform or broadcast the hardware.
       MT Output: Vous ne pouvez en aucun cas modifier, afficher publiquement, exécuter publiquement ou diffuser les matériels.
       PE Version: Vous pouvez modifier, afficher publiquement, exécuter publiquement ou diffuser les matériels.
       Comments: Negative instead of positive sentence
     • ES-EN
       Source: No conectar la batería
       MT Output: Connect the battery
       PE Version: Do not connect the battery
       Comments: Positive instead of negative sentence
     • IT-EN
       Source: Chiudere la vite di spurgo C e passare alla fase successiva del processo di spurgo.
       MT Output: Open the bleeding screw C then continue next step of bleeding process.
       PE Version: Close the bleeding screw C then continue next step of bleeding process.
       Comments: Antonym
  30. Grammar
     • EN-DE
       Source: To remove the 3D diffuser:
       MT Output: Zum Entfernen des 3D Refraktionstechnik:
       PE Version: Zum Entfernen der 3D Refraktionstechnik:
       Comments: Wrong case
     • IT-EN
       Source: Di seguito viene descritto soltanto ciò che è aggiunto.
       MT Output: The following is described only what is added.
       PE Version: The following describes only what is added.
       Comments: Wrong verb tense
     • EN-FR
       Source: The pressure is reduced to pilot pressure.
       MT Output: La pression est réduit à la pression pilote.
       PE Version: La pression est réduite à la pression pilote
       Comments: Gender agreement (masculine instead of feminine)
  31. Summary
  32. SDL BeGlobal …
     Is an Infrastructure: Scalable, Secure, Easy to deploy and Integrate
     For translators: With continued tightened integration with SDL Trados Studio
     And for nonlinguists: Enable “non-professional translator”, “the business user” to self-service their multilingual communication needs: On demand & within their environment
     Continuously improves: Allows non-technical, non-computational linguist user to train and evaluate custom engines
  33. Thank You!
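
Slide 19 asks for TMX training data with at least 500K/1M source words. The Python sketch below shows one way to check a TMX file against that figure before uploading it. It is not part of BeGlobal Trainer: the function name `count_source_words`, the whitespace tokenization, the tiny sample document, and the reading of the threshold as 500,000 words are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Assumption: the lower bound of the "500K/1M" figure quoted on the slide.
MIN_SOURCE_WORDS = 500_000


def count_source_words(tmx_text, source_lang):
    """Count whitespace-separated words in all segments of one language."""
    root = ET.fromstring(tmx_text)
    total = 0
    for tuv in root.iter("tuv"):
        # TMX 1.4 uses xml:lang; older files may use a plain lang attribute.
        lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
        if not lang.startswith(source_lang.lower()):
            continue
        seg = tuv.find("seg")
        if seg is not None:
            # itertext() also collects text nested inside inline markup.
            total += len("".join(seg.itertext()).split())
    return total


# Minimal illustrative TMX body (a real header carries more attributes).
SAMPLE_TMX = """<tmx version="1.4"><header srclang="en"/><body>
<tu><tuv xml:lang="en"><seg>Right and left fenders are the same.</seg></tuv>
<tuv xml:lang="fr"><seg>Les ailes droites et gauches sont les mêmes.</seg></tuv></tu>
<tu><tuv xml:lang="en"><seg>The email attachment is supported.</seg></tuv>
<tuv xml:lang="es"><seg>El archivo adjunto al correo electrónico es compatible.</seg></tuv></tu>
</body></tmx>"""

words = count_source_words(SAMPLE_TMX, "en")
print(words, "source words; enough for training:", words >= MIN_SOURCE_WORDS)
```

Whitespace splitting only approximates a word count and would undercount languages written without spaces, so treat the result as a rough pre-upload check.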
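
Slide 20 describes the “BLEU” score as a measure of how close a machine translation is to the professional human translation. The sketch below is a simplified, textbook-style BLEU in Python: modified n-gram precision combined with a brevity penalty, one reference per segment, whitespace tokens, no smoothing. It is not the scorer used by BeGlobal Trainer, and the function names are assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified corpus-level BLEU with a single reference per segment."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_ng, r_ng = ngrams(h, n), ngrams(r, n)
            # Clip each n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, r_ng[g]) for g, c in h_ng.items())
            totals[n - 1] += sum(h_ng.values())
    # Without smoothing, a single zero precision makes the score zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty punishes output that is shorter than the reference.
    penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return penalty * math.exp(log_precision)


# MT output and post-edited version taken from slide 28.
mt_output = ["How can adjust the brightness the display screen?"]
post_edit = ["How can I adjust the brightness of the display screen?"]
print(corpus_bleu(mt_output, post_edit, max_n=2))
```

The example uses `max_n=2` because a single short segment often has no matching 4-grams, which drives unsmoothed BLEU to zero. The score is more meaningful over a whole test set, as on the slide.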
