15. Alessandro Cattelan (Translated): Natural Language Processing for Translation

Published in: Technology, Business

  1. Natural Language Processing for Translation
     Alessandro Cattelan, Translated srl
  2. Language industry size
     Language service industry: $33.5 billion market in 2012.
     http://www.commonsenseadvisory.com/Portals/0/downloads/120531_QT_Top_100_LSPs.pdf
     Extremely fragmented, both in terms of language service providers and customers.
  3. Language industry customers
     Large customers spend millions of dollars a year on translation. However, it is the smaller customers with limited budgets that make up most of the market.
  4. Specific characteristics
     Larger customers
      • Large budgets
      • Use technology (MT, TM, termbases, etc.)
      • Efficient processes (translation is part of the development cycle)
     Smaller customers
      • Tight budgets
      • No technology and no processes
  5. Smaller customers
     Even though they are on a tight budget and use no technology for translation, we can still give them something better than this…
  6. Common requirements
     Both smaller and larger customers are interested in:
      • Getting high-quality translations
      • Receiving the translation as soon as possible
      • Saving as much as possible
  7. Challenge → Opportunity
     Challenge: no technology and no processes to improve efficiency in translation.
     Opportunity: develop technology and processes to win customers.
  8. Content reuse: Translation Memory
     Large public translation memories make it possible to leverage previously translated content and to reduce the weighted word count.
      • Collecting data
      • Aligning bilingual content
      • Making data available in CAT tools
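The idea of a weighted word count can be sketched as follows. This is only an illustration: the discount bands in `BANDS` are hypothetical values chosen for the example, not rates taken from the slides, and the function names are invented here.

```python
# Hypothetical discount bands: (minimum TM match %, fraction of a full word charged).
# These numbers are assumptions for illustration only.
BANDS = [(100, 0.10), (95, 0.30), (85, 0.50), (75, 0.70), (0, 1.00)]


def weight_for(match_pct):
    """Return the charge weight for a segment with the given best TM match."""
    for threshold, weight in BANDS:
        if match_pct >= threshold:
            return weight
    return 1.0


def weighted_word_count(segments):
    """segments: iterable of (word_count, best_match_pct) pairs."""
    return sum(words * weight_for(pct) for words, pct in segments)


if __name__ == "__main__":
    # Three segments: an exact match, an 80% fuzzy match, and a segment with no match.
    job = [(10, 100), (20, 80), (30, 0)]
    print(weighted_word_count(job))
```

The more content a translation memory covers, the more segments fall into the discounted bands, which is how content reuse lowers the weighted count.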
  9. Translation Memory
     Never translate the same sentence twice… nor part of it!
      • Improving matching algorithm for translation memories
     EN: To open a file, select File from the menu and click on Open
     IT: Per aprire un file, selezionare File dal menu e fare clic su Apri
     EN: Select File from the menu
     IT: […]
  10. Translation Memory
     Never translate the same sentence twice… nor part of it!
      • Improving matching algorithm for translation memories
      • Using MT to complete fuzzy matches
     EN: Select File from the menu
     IT: Selezionare File dal menu
     EN: Select File from the menu and click on New document
     IT: Selezionare File dal menu […]
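A rough sketch of the fuzzy-match idea from the two slides above, using Python's standard `difflib.SequenceMatcher` on word tokens. This is not the matching algorithm used by the presenter; the function names are hypothetical, and a real system would hand the uncovered words to an MT engine, which is not modelled here.

```python
from difflib import SequenceMatcher


def tokens(text):
    """Lowercased whitespace tokens; a deliberately simple tokenizer."""
    return text.lower().split()


def best_match(source, memory):
    """Return (score, tm_source, tm_target) for the closest TM entry.

    memory is a list of (source_sentence, target_sentence) pairs.
    """
    best = (0.0, None, None)
    for tm_src, tm_tgt in memory:
        score = SequenceMatcher(None, tokens(source), tokens(tm_src)).ratio()
        if score > best[0]:
            best = (score, tm_src, tm_tgt)
    return best


def uncovered_words(source, tm_source):
    """Words of `source` that the TM entry does not cover (candidates for MT)."""
    src_words = source.split()
    matcher = SequenceMatcher(None, tokens(tm_source), tokens(source))
    missing = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            missing.extend(src_words[j1:j2])
    return missing


if __name__ == "__main__":
    memory = [("Select File from the menu", "Selezionare File dal menu")]
    new_segment = "Select File from the menu and click on New document"
    score, tm_src, tm_tgt = best_match(new_segment, memory)
    print(round(score, 2), tm_tgt)
    print(uncovered_words(new_segment, tm_src))
```

With the slide's example, the memory entry covers the first five words of the new segment, and the remaining words are the part left for MT to complete.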
  11. Machine Translation
     Most of the time, customers have neither custom MT engines nor the data to create one.
      • Use existing domain-specific engines, even though they are not adapted to the customer
      • Adapt generic engines to specific domains (needs to be fast!)
      • Adapt the engine in real time with the user translations
  12. Using generic engines
     “If I have seen further it is by standing on the shoulders of giants.” [I. Newton]
     Post-processing of MT output from generic engines:
      • Correcting terminology issues
      • Adapting output to previous translations
      • Managing mark-up…
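One of the post-processing steps listed above, correcting terminology, might look like the following minimal sketch. It assumes a hypothetical mapping from discouraged target terms to approved ones; the function name and the sample data are invented for illustration, and real post-processing would also need to handle inflection and context.

```python
import re


def enforce_terminology(mt_output, corrections):
    """Replace discouraged target-language terms with the approved ones.

    corrections maps a discouraged term to its approved replacement
    (a hypothetical customer glossary).
    """
    for wrong, approved in corrections.items():
        pattern = r"\b" + re.escape(wrong) + r"\b"
        # A function replacement avoids backslash escapes in `approved`
        # being interpreted by re.sub.
        mt_output = re.sub(
            pattern, lambda _m, a=approved: a, mt_output, flags=re.IGNORECASE
        )
    return mt_output


if __name__ == "__main__":
    raw = "Selezionare Archivio dal menu"
    print(enforce_terminology(raw, {"Archivio": "File"}))
```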
  13. MT quality evaluation
     Establishing the right weight for words translated by MT systems.
  14. MT quality evaluation
     What is a fair rate for editing machine translation output?
      • Confidence scores for MT
      • Matching metrics for TM segments
      • MT quality perceived by the user
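One possible way to approach the fair-rate question is to measure, after the fact, how much of the MT output the translator actually changed. The sketch below uses a word-level edit distance for this; it is an assumption made for illustration, not a method stated in the slides, and the function names are hypothetical.

```python
def word_edit_distance(a, b):
    """Levenshtein distance computed over whitespace-separated words."""
    a_words, b_words = a.split(), b.split()
    prev = list(range(len(b_words) + 1))
    for i, wa in enumerate(a_words, 1):
        cur = [i]
        for j, wb in enumerate(b_words, 1):
            cur.append(min(
                prev[j] + 1,               # delete a word from a
                cur[j - 1] + 1,            # insert a word from b
                prev[j - 1] + (wa != wb),  # substitute (free if words are equal)
            ))
        prev = cur
    return prev[-1]


def post_edit_effort(mt_output, final_translation):
    """Edited words as a fraction of the final translation length, capped at 1.0."""
    n = len(final_translation.split())
    if n == 0:
        return 0.0 if not mt_output.split() else 1.0
    return min(1.0, word_edit_distance(mt_output, final_translation) / n)


if __name__ == "__main__":
    mt = "Selezionare Archivio dal menu"
    final = "Selezionare File dal menu"
    print(post_edit_effort(mt, final))
```

A low effort value suggests the MT words deserve a heavy discount, while a value near 1.0 means the segment was effectively translated from scratch.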
  15. Terminology Management
     Terminology management can have a great impact on quality and productivity.
      • Automatic extraction of terminology
      • Finding target language equivalents for source terms
      • Adding context to the terms
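Automatic terminology extraction can be approximated, very naively, by counting recurring word sequences that do not start or end with a stopword. The sketch below is a frequency-based illustration only: the stopword list, thresholds, and function name are assumptions, and production extractors typically add linguistic filtering and statistical scoring.

```python
import re
from collections import Counter

# A tiny, hypothetical stopword list for the example.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "on", "from", "in", "is", "for"}


def candidate_terms(text, max_len=3, min_freq=2):
    """Return (term, frequency) pairs for recurring n-grams of up to max_len words."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            # Skip n-grams that begin or end with a stopword.
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            counts[" ".join(gram)] += 1
    return [(term, c) for term, c in counts.most_common() if c >= min_freq]


if __name__ == "__main__":
    sample = (
        "Open the translation memory. The translation memory stores segments. "
        "Export the translation memory."
    )
    for term, freq in candidate_terms(sample):
        print(freq, term)
```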
  16. Any questions?
