Neural Machine Translation: a report from the front line

This presentation was given at various events in June 2017 on the current status of Neural Machine Translation development at Iconic. Rule-based, statistical, hybrid, neural: at the end of the day it's all machine translation. At Iconic, we've been "doing neural" for over 12 months in various guises but, frequently, we find that our clients don't care what we use once we get the job done. In these slides, we go through a number of case studies involving MT and show how fit-for-purpose translations were delivered by combining different approaches to MT.

Neural MT
Separating the hype from reality
a report from the front line
TAUS Webinar / Industry Leaders Forum / LocWorld 34

The Neural Narrative
Great, thanks. Let me use it now…

Use cases for machine translation
[Figure: Level of Difficulty plotted along three axes]
Use case: Gisting, PEMT, Perfect
Industry: IT, Marketing, Patent
Language: German, French, Japanese, Korean

Impact of Neural MT
The Bar
Use cases covered by generic MT
Use cases needing custom MT
Neural MT has raised the bar

Neural MT @ Iconic
“Do you ‘do’ Neural MT?”
Of course.
We’re a team of MT experts. This is a big part of the value that we bring to the table. We’re not just taking open-source tools off the shelf. We’re innovating, researching, developing new processes. Same for Neural MT, e.g. lexically constrained decoding.
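
To make "lexically constrained decoding" a little more concrete, here is a toy sketch of the idea: the decoder is only allowed to return an output that contains the required terms. Everything in it is an assumption for illustration: the hand-written `TOY_MODEL` probability table, the function name `constrained_beam_search`, and the simplified scheme of keeping a separate beam per number of constraints already met. It is not Iconic's implementation, and it handles single-word constraints only.

```python
import math
from collections import defaultdict

# Hypothetical toy "model": next-word probabilities given only the previous word.
TOY_MODEL = {
    "<s>": {"the": 0.9, "a": 0.1},
    "the": {"device": 0.6, "apparatus": 0.4},
    "a": {"device": 0.5, "apparatus": 0.5},
    "device": {"rotates": 0.7, "turns": 0.3},
    "apparatus": {"rotates": 0.6, "turns": 0.4},
    "rotates": {"</s>": 1.0},
    "turns": {"</s>": 1.0},
}


def constrained_beam_search(constraints, beam_per_bank=2, max_len=6):
    """Return the most probable output containing every constraint word, or None."""
    constraints = frozenset(constraints)
    # A hypothesis is (log-probability, words so far, constraints met so far).
    beams = [(0.0, ("<s>",), frozenset())]
    finished = []
    for _ in range(max_len):
        # Group new hypotheses into "banks" by how many constraints they satisfy,
        # so partially constrained hypotheses are not pruned by likelier ones.
        banks = defaultdict(list)
        for score, words, met in beams:
            for word, prob in TOY_MODEL[words[-1]].items():
                new_met = met | ({word} & constraints)
                hyp = (score + math.log(prob), words + (word,), new_met)
                if word == "</s>":
                    finished.append(hyp)
                else:
                    banks[len(new_met)].append(hyp)
        beams = []
        for bank in banks.values():
            bank.sort(key=lambda h: h[0], reverse=True)
            beams.extend(bank[:beam_per_bank])
        if not beams:
            break
    # Only finished hypotheses that satisfy all constraints are acceptable.
    valid = [h for h in finished if h[2] == constraints]
    if not valid:
        return None
    best = max(valid, key=lambda h: h[0])
    return list(best[1][1:-1])  # strip <s> and </s>


unconstrained = constrained_beam_search([])
with_terms = constrained_beam_search(["apparatus", "turns"])
```

With no constraints the toy model prefers "the device rotates"; asking for the terms "apparatus" and "turns" forces the less likely "the apparatus turns". That is the terminology use case in miniature: the client's preferred term wins even when the model would have chosen something else.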

Neural MT @ Iconic
“Is this the way you do MT now?”
It’s one of the ways.
MT is not a one-size-fits-all technology. What constitutes the best approach depends on the language pair, domain, use case, and various other factors. In some cases, the best approach will be Neural MT, but not yet all the time.

Neural MT @ Iconic
“When do you use it then?”
When it gives the best output!
When you’re customising MT, there are so many things you can do: different processors, parameters, ways of combining data and tuning. We try multiple approaches and allow our systems to use the best one.
Let’s look at some case studies, but first…
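
A minimal sketch of "try multiple approaches and use the best one", under loud assumptions: the two stand-in engines, the two-sentence dev set, and the crude `word_overlap` metric are all hypothetical placeholders, not Iconic's systems, data, or evaluation method. In practice the metric would be something like BLEU, TER, or a human review.

```python
from typing import Callable, Dict, List, Tuple


def word_overlap(hypothesis: str, reference: str) -> float:
    """Crude stand-in metric: share of reference words found in the hypothesis."""
    ref_words = reference.split()
    hyp_words = set(hypothesis.split())
    return sum(word in hyp_words for word in ref_words) / len(ref_words)


def pick_best_engine(
    engines: Dict[str, Callable[[str], str]],
    dev_set: List[Tuple[str, str]],
    metric: Callable[[str, str], float] = word_overlap,
) -> str:
    """Return the name of the engine with the highest mean metric on dev_set."""
    def mean_score(translate: Callable[[str], str]) -> float:
        return sum(metric(translate(src), ref) for src, ref in dev_set) / len(dev_set)

    return max(engines, key=lambda name: mean_score(engines[name]))


# Hypothetical engines, faked with lookup tables so the sketch is runnable.
engines = {
    "smt": lambda s: {"hallo welt": "hello world", "guten tag": "good tag"}[s],
    "nmt": lambda s: {"hallo welt": "hello world", "guten tag": "good day"}[s],
}
dev_set = [("hallo welt", "hello world"), ("guten tag", "good day")]

best = pick_best_engine(engines, dev_set)
```

The point of the design is that the choice is made per content type on held-out data, so the same selection code can return a different engine for titles than for abstracts.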

Patents Case Study
3 languages | Same data sets | Client evaluation
Titles: average length 7 words
Abstracts: average length 30 words
Ranking
1. Unusable
2. Poor
3. Adequate
4. Good
5. Excellent
Criteria
90% Adequate or above
0% Unusable
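
The acceptance criteria above can be written as a short check. This is a sketch only: the function name `passes_criteria` and the idea of feeding it a flat list of per-segment ratings are assumptions, while the 1 to 5 scale and the two thresholds come from the slide.

```python
from collections import Counter

# Ranking scale from the slide.
RANKS = {1: "Unusable", 2: "Poor", 3: "Adequate", 4: "Good", 5: "Excellent"}


def passes_criteria(ratings, min_adequate_pct=90):
    """True if at least 90% of segments are Adequate or above and none is Unusable."""
    if not ratings:
        raise ValueError("no ratings supplied")
    counts = Counter(ratings)
    adequate_or_above = sum(n for rank, n in counts.items() if rank >= 3)
    # Integer comparison avoids floating-point edge cases at exactly 90%.
    enough_adequate = adequate_or_above * 100 >= min_adequate_pct * len(ratings)
    no_unusable = counts[1] == 0
    return enough_adequate and no_unusable
```

Note that the two conditions are independent: a single Unusable segment fails the engine even if every other segment is rated Excellent.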

Patents Case Study – Chinese to English
[Bar chart: Titles and Abstracts, Iconic MT vs Iconic Neural MT; data labels 31, 39, 38, 19]
Linguist Review
Both passed the title criteria
Only Iconic MT passed on abstracts
Outcome
Iconic MT deployed in production

Patents Case Study – Japanese to English
[Bar chart: Titles and Abstracts, Iconic MT vs Iconic Neural MT; data labels 52, 33; additional labels 4, 5, 4, 4]
Linguist Review
Both passed on titles
Only NMT passed on abstracts
Outcome
Iconic MT deployed for titles
Neural MT deployed for abstracts

Patents Case Study – Korean to English
[Bar chart: Titles and Abstracts, Iconic MT vs Iconic Neural MT; data labels 40, 25; additional labels 2, 6, 4, 3]
Linguist Review
Iconic MT below criteria
Neural MT significantly better
Outcome
Under review!

Customisation Case Study
Neural MT raises the bar for general-purpose MT, but the bar still needs to be tested.
2 languages | 1.5M training segments | IT content

              English to French              English to Hindi
              BLEU           1-TER           BLEU             1-TER
Iconic MT     43.0 (+10.4)   55.2 (+7.7)     46.75 (+12.96)   56.4 (+5.5)
GNMT          32.6           47.5            33.79            50.9
Iconic NMT    39.2           50.5            -                -
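
The bracketed figures in the Iconic MT row are the differences from the GNMT row. A small sketch that recomputes them from the numbers on the slide (the dictionary layout and the `gains` helper are illustrative choices, not part of the original deck):

```python
# Scores copied from the slide; keys are (language pair, metric).
scores = {
    "Iconic MT": {
        ("en-fr", "BLEU"): 43.0, ("en-fr", "1-TER"): 55.2,
        ("en-hi", "BLEU"): 46.75, ("en-hi", "1-TER"): 56.4,
    },
    "GNMT": {
        ("en-fr", "BLEU"): 32.6, ("en-fr", "1-TER"): 47.5,
        ("en-hi", "BLEU"): 33.79, ("en-hi", "1-TER"): 50.9,
    },
    # No English to Hindi result was reported for Iconic NMT.
    "Iconic NMT": {
        ("en-fr", "BLEU"): 39.2, ("en-fr", "1-TER"): 50.5,
    },
}


def gains(system, baseline="GNMT"):
    """Point difference of `system` over `baseline` for every shared column."""
    return {
        key: round(value - scores[baseline][key], 2)
        for key, value in scores[system].items()
        if key in scores[baseline]
    }


iconic_mt_gains = gains("Iconic MT")
iconic_nmt_gains = gains("Iconic NMT")
```

By the same arithmetic, the custom Iconic NMT engine is 6.6 BLEU and 3.0 1-TER points above GNMT for English to French, which is still below the Iconic MT row.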

The Iconic Ensemble Architecture™
Neural MT is another powerful tool in our arsenal that helps us deliver best-in-class machine translation output

MT @ Iconic – what we don’t do
The ability to build your own MT engines with Moses, Phrasal, OpenNMT, Nematus, Fairseq.
Provide off-the-shelf general industry engines. There are some very adequate solutions for that!

MT @ Iconic – what we do do!
Customised expert-built MT, using the most appropriate tool for the job, MT or otherwise.
Develop products and solutions that incorporate machine translation – not just access to an API.

Neural MT – Early Adopter Program
Engage our expert team on a Neural MT project to see if it works for your content
To date
Custom developments with some of our closest partners
Now
Inviting early adopters to expand the range of cases

Thank You!
john@iconictranslation.com
@johntins / @iconictrans

“I don’t give a damn what you do!”


Editor's Notes

Direct people to previous talks I’ve given about the impact and history of neural MT. SUMMARY: it’s promising, but it’s not a one-size-fits-all, taking-over approach. Still looking case by case in the short term. Long term: wait and see, it’s exciting.
Been speaking recently about Neural MT. The (commercial) narrative around neural MT is moving even faster than the pace of development! We’ve gone very quickly from explaining what it is, to companies showing initial results, to having full-blown production systems producing the best results ever, in some cases from companies that never even offered MT before. Wow, that’s impressive. Our team at Iconic knows a little bit about MT, but you can call me a cynic if we take the approach of still trying to manage expectations while we all learn a little more about Neural MT. Much of it is still marketing. It still needs to be contextualised. This was always the case with MT, and that doesn’t change with Neural MT.
Short-term impact | Long-term prospects
Ultimately just another type of MT, so we still have a lot of the same issues, and some new ones. Whether that custom MT is neural, SMT, or hybrid STILL depends. It’s still to be judged on a case-by-case basis.
Ok, so let’s talk about our approach to neural MT at Iconic, but we’ll get some key questions out of the way up front. Questions we're asked frequently as a provider of machine translation.
This is what we've been doing for many years now. Fancy way of saying we can apply terminology. With Moses, the way people do MT, a lot of this was out of the box. Now we have to implement it ourselves, so that will separate the wheat from the chaff in terms of software.
Beam size, phrase length, distortion limit – now training epochs, vocab size, number of hidden layers. Before we talk about HOW we do it (which is less important, as I’ll point out), let’s look at what we’ve done so far with some case studies. REAL RESULTS FROM THE FIELD.
A lot of ongoing patent work with some of our biggest clients, which is an ideal starting point for us to test our production mettle in this area. TEST NEURAL MT ON REAL USE CASES AND CRITERIA. “Baseline” Iconic MT here is
WHAT: Chinese – pre-ordering system, mature (3 years), auto post-editing. OUTCOME: Use Iconic where possible, because it’s quicker to retrain and gives more control.
WHAT: Japanese – syntax-based pre-ordering system, transliteration and script normalisation. Interesting, because we were doing A LOT of development before Neural MT, but it was hard to make big improvements with constraints on data. OUTCOME: Use Iconic where possible, because it’s quicker to retrain and gives more control in the short term.
Korean project. Interesting, because it wasn’t something we had in production before Neural MT, mainly because it was so hard. WHAT: Korean – hierarchical system. It’s a live one, so the results are actually still with the client. LET’S SEE NEXT WEEK, TAUS AND LOCWORLD! Internal QA on the output suggests the automatic scores are generous to Iconic MT. The Neural output is significantly better.
Still ongoing. Iconic Neural MT engines are still being built, but this is the baseline we establish! CONCLUSIONS HERE: Customised Iconic MT is better than general neural MT for 2 very different languages. Again, even customised NMT is not as good. So, we will use what we can where it’s best. NEURAL MT CUSTOMISATION NOT YET VERY POWERFUL ACROSS THE BOARD. WE’RE STARTING TO GET AN INTUITION WHERE IT WILL HELP AND WHERE IT WILL NOT.
Where does NEURAL FIT IN? HOWEVER, it’s all well and good looking behind the curtain at how we do this, but at the end of the day, a particular quote springs to mind…
This was a quote from a prospective client on a call last month. I was in the midst of explaining what we use in which cases and where neural MT fits in, and they interrupted me and said “I don’t give a damn what you use”. I paused for a second, thrown off my flow, and thought: he’s right, you know. I’m telling people to leave it to us, and that’s what he wants. Why does it matter how the translations are produced? Does it really matter if it’s neural or not, once it’s guaranteed to be the best we could achieve, including using neural? I don’t think it does to anyone! But because you’ve decided to watch, I’ll give you some insight into what we do and don’t do.
- With SMT/NMT, whatever, this will get you so far before more expertise is required. This is the case now more than ever with NMT and how experimental it is.
- Not trying to provide off-the-shelf generic engines. Never done that. That’s the realm of Google. You’ll find it’s quite good for the general use case! Good basis for customisation.
We’re doing what we’ve always done! EDISCOVERY, REGULATORY COMPLIANCE: WHERE MT IS A PART OF A BROADER SOLUTION.
To close the loop on this story with Iconic and NEURAL MT: we’re learning fast, and the more REAL opportunities there are, the better for everyone. Let us know if you have any questions or if it’s something you’d like to explore. THANKS!
