This presentation was given at various events in June 2017 on the current status of Neural Machine Translation development at Iconic.
Rule-based, statistical, hybrid, neural: at the end of the day it's all machine translation. At Iconic, we've been "doing neural" for over 12 months in various guises but, frequently, we find that our clients don't care what we use as long as we get the job done. In these slides, we go through a number of case studies involving MT and show how fit-for-purpose translations were delivered by combining different approaches to MT.
This was a talk given at the annual GALA conference in Amsterdam on March 27th 2017. The topic is Neural Machine Translation. Where are we now?
Neural Machine Translation is at the peak of a hype cycle. There is no doubt it is an emerging technology with massive potential, but it is not yet a sweeping solution to all ills. Several factors prevent NMT from being commercially ready. Expectations, therefore, need to be managed. That is the goal of this presentation.
Delivered at the 29th LocWorld conference.
October 16th 2015
Santa Clara, CA, USA.
In this talk, we describe how we carried out a successful large-scale evaluation and deployment of machine translation at RWS.
Delivered at the European Patent Office's Patent Information Conference.
November 11th 2015
Miami, Florida.
In this talk, we discuss recent advances in MT for patents and introduce our IPTranslator.com application for on-demand translation.
Delivered at Machine Translation Summit during a special workshop on MT for patent and scientific literature.
October 30th 2015
Miami, Florida.
In this talk, we describe how we adapted machine translation for patents to help a translation company improve their productivity.
Delivered at Machine Translation Summit during a special workshop on post-editing.
November 3rd 2015
Miami, Florida.
In this talk, we describe the latest advances in commercial and academic machine translation development that are improving acceptance of the technology and keeping its users happy.
Delivered at the biennial conference of the Association for Machine Translation in the Americas (AMTA 2014)
October 24th 2014
Vancouver, Canada.
In this talk, we describe how state-of-the-art research led to the establishment of Iconic Translation Machines.
Similar to Neural Machine Translation: a report from the front line (20)
This is the presentation given at the e-discovery and information governance conference in Dublin, Ireland on November 17th 2017. The presentation goes into the options that firms have when it comes to handling foreign language documents during the document review process and the growing role that machine translation technology is playing in this.
As the oldest abstracting service, Chemisches Zentralblatt generated detailed abstracts of scientific research from 1840-1969. CAS, in partnership with Iconic Translation Machines (ITM), has made this information accessible in ChemZent™, the first and only English searchable translation of Chemisches Zentralblatt.
After its introduction in 1840, Chemisches Zentralblatt quickly grew to be an invaluable resource for chemists. While the content is freely accessible via various online platforms, locating specific information in the volumes of Chemisches Zentralblatt can be challenging. To find a topic or author, the user needs to know the year and volume of interest. In addition, the content was previously only available in German.
Leveraging ITM technology and extensive CAS expertise processing scientific literature, the two companies teamed to develop ChemZent. This historical content further enhances the most comprehensive and authoritative source of references, substances and reactions in chemistry and related sciences accessible in SciFinder®. Three million English-translated abstracts are now searchable in SciFinder, making this rich source of content accessible to today’s researchers.
This was a presentation given at the European Patent Office's annual Patent Information Conference in Madrid, Spain on November 10th, 2016.
In it, we give an overview of how machine translation works, latest advances in neural MT, and how this can be applied to patents and intellectual property content, not only for translations but also information extraction and other NLP applications.
This was a presentation given at the conference of the Association for Machine Translation in the Americas (AMTA) in Austin, Texas on October 31st, 2016. This is a predominantly academic event, and this presentation was a condensed version of the "MT Success Blog Series" on our website, where we aimed to give the community an idea of the practical considerations around commercial machine translation.
http://iconictranslation.com/2016/07/8-steps-to-mt-success-series-introduction/
This was a pitch for Iconic's neural machine translation technology given at the TAUS Annual Conference in Portland, Oregon on October 24th, 2016.
There has been a lot of talk, and a lot of hype, about neural machine translation in the press, but not a lot of practical application. Let's change the conversation.
These slides are a combination of three different presentations given at LocWorld 31, the TAUS Industry Leaders Forum, and the TAUS QE Summit, all held in Dublin, Ireland from June 6-10.
Delivered at the TAUS Quality Evaluation Summit.
May 28th 2015
Dublin, Ireland.
In this talk, we describe how to carry out machine translation evaluation in order to extract meaningful business intelligence.
Delivered at the European Patent Office's annual Patent Information Conference (EPOPIC 2014)
November 5th 2014
Warsaw, Poland.
In this talk, we give an introduction to how machine translation works and what makes certain content types and languages more difficult than others.
Delivered at the 26th LocWorld Conference in North America.
October 31st 2014
Vancouver, Canada.
In this talk, we describe the various strands of knowledge (machine translation, language, and industry) required to develop effective MT software.
Delivered at the TAUS Machine Translation Showcase.
June 6th 2014
Dublin, Ireland.
In this talk, we explain how machine translation systems can be developed for highly technical content types.
4. Impact of Neural MT
Use cases covered by generic MT
Use cases needing custom MT
The Bar
5. Impact of Neural MT
Use cases covered by generic MT
Use cases needing custom MT
The Bar
Neural MT has raised the bar
6.
7. Neural MT @ Iconic
“Do you ‘do’ Neural MT?”
Of course.
We’re a team of MT experts. This is a big part of the value that we bring to the table. We’re not just taking open-source tools off the shelf. We’re innovating, researching, developing new processes. Same for Neural MT, e.g. lexically constrained decoding.
8. Neural MT @ Iconic
“Is this the way you do MT now?”
It’s one of the ways.
MT is not a one-size-fits-all technology. What constitutes the best approach depends on the language pair, domain, use case, and various other factors. In some cases, the best approach will be Neural MT, but not yet all the time.
9. Neural MT @ Iconic
“When do you use it then?”
When it gives the best output!
When you’re customising MT, there are so many things you can do – different processors, parameters, ways of combining data and tuning. We try multiple approaches and allow our systems to use the best one.
Let’s look at some case studies, but first…
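The idea of trying several approaches and keeping whichever gives the best output can be sketched in a few lines. This is only an illustration, not Iconic's actual selection logic: the function name `select_engines`, the engine labels and the scores are all hypothetical, and it assumes a single quality score per engine where higher is better.

```python
# Hypothetical sketch: pick, per content type, the engine that scored best
# on a held-out evaluation set. All names and numbers are invented.

def select_engines(dev_scores):
    """Map each content type to the engine with the highest score."""
    return {
        content: max(by_engine, key=by_engine.get)
        for content, by_engine in dev_scores.items()
    }

# Invented scores (higher is better) for two candidate engines.
dev_scores = {
    "titles": {"smt": 0.62, "nmt": 0.58},
    "abstracts": {"smt": 0.41, "nmt": 0.55},
}

if __name__ == "__main__":
    for content, engine in select_engines(dev_scores).items():
        print(f"{content}: use {engine}")
```

In practice the score would come from whatever evaluation the use case calls for, automatic or human, which is why the decision is made per content type and not once for everything.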
11. Patents Case Study
Average length: 7 words
Average length: 30 words
3 languages; same data sets; client evaluation
Ranking
1. Unusable
2. Poor
3. Adequate
4. Good
5. Excellent
Criteria
90% Adequate or above
0% Unusable
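The acceptance criteria above can be expressed as a simple check over a batch of linguist ratings. This is a minimal sketch under one reading of the slide: at least 90% of segments rated 3 (Adequate) or higher, and no segment rated 1 (Unusable). The function name `passes_criteria` and the sample ratings are hypothetical.

```python
from collections import Counter

# The 1-5 ranking scale from the slide.
SCALE = {1: "Unusable", 2: "Poor", 3: "Adequate", 4: "Good", 5: "Excellent"}

def passes_criteria(ratings, min_adequate_share=0.90):
    """Return True if the ratings meet both criteria:
    at least `min_adequate_share` rated Adequate or above, and none Unusable."""
    if not ratings:
        raise ValueError("no ratings supplied")
    counts = Counter(ratings)
    unknown = set(counts) - set(SCALE)
    if unknown:
        raise ValueError(f"ratings outside the 1-5 scale: {sorted(unknown)}")
    adequate = sum(n for score, n in counts.items() if score >= 3)
    return counts[1] == 0 and adequate / len(ratings) >= min_adequate_share

# Invented sample: nine of ten segments are Adequate or above, none Unusable.
sample = [3, 4, 4, 5, 3, 4, 2, 4, 5, 3]

if __name__ == "__main__":
    print(passes_criteria(sample))
```

A single Unusable segment fails the batch regardless of how good the rest is, which is what makes the second criterion the stricter one on longer texts such as abstracts.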
12. Patents Case Study – Chinese to English
[Bar chart: Titles and Abstracts, Iconic MT vs Iconic Neural MT. Data labels as extracted: 31, 3938, 19]
Linguist Review
Both passed the title criteria; only Iconic MT passed on abstracts
Outcome
Iconic MT deployed in production
13. Patents Case Study – Japanese to English
[Bar chart: Titles and Abstracts, Iconic MT vs Iconic Neural MT. Data labels as extracted: 52, 33, 4, 5, 4, 4]
Linguist Review
Both passed on titles; only NMT passed on abstracts
Outcome
Iconic MT deployed for titles
Neural MT deployed for abstracts
14. Patents Case Study – Korean to English
[Bar chart: Titles and Abstracts, Iconic MT vs Iconic Neural MT. Data labels as extracted: 40, 25, 2, 6, 4, 3]
Linguist Review
Iconic MT below criteria
Neural MT significantly better
Outcome
Under review!
15. Neural MT raises the bar for general purpose MT but the bar still needs to be tested.
Customisation Case Study
2 languages; 1.5M training segments; IT content
              English to French              English to Hindi
              BLEU           1-TER           BLEU             1-TER
Iconic MT     43.0 (+10.4)   55.2 (+7.7)     46.75 (+12.96)   56.4 (+5.5)
GNMT          32.6           47.5            33.79            50.9
Iconic NMT    39.2           50.5            -                -
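The bracketed gains in the table are consistent with simple differences against the GNMT row. The sketch below recomputes them from the figures on the slide; the dictionary layout and the helper name `gain_over` are my own, and the assumption that GNMT is the baseline for the bracketed figures is inferred from the arithmetic, not stated on the slide.

```python
# Scores copied from the slide; structure and names are illustrative only.
scores = {
    "English to French": {
        "Iconic MT": {"BLEU": 43.0, "1-TER": 55.2},
        "GNMT": {"BLEU": 32.6, "1-TER": 47.5},
        "Iconic NMT": {"BLEU": 39.2, "1-TER": 50.5},
    },
    "English to Hindi": {
        "Iconic MT": {"BLEU": 46.75, "1-TER": 56.4},
        "GNMT": {"BLEU": 33.79, "1-TER": 50.9},
    },
}

def gain_over(baseline, system, pair, metric):
    """Absolute difference in points between a system and a baseline."""
    diff = scores[pair][system][metric] - scores[pair][baseline][metric]
    return round(diff, 2)

if __name__ == "__main__":
    for pair in scores:
        for metric in ("BLEU", "1-TER"):
            gain = gain_over("GNMT", "Iconic MT", pair, metric)
            print(f"{pair}, {metric}: Iconic MT is {gain:+} over GNMT")
```

The same helper shows that, for English to French, the untuned gap between Iconic NMT and GNMT is smaller than the gap between Iconic MT and GNMT on both metrics.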
16. The Iconic Ensemble Architecture™
Neural MT is another powerful tool in our arsenal that helps us deliver best-in-class machine translation output
18. MT @ Iconic – what we don’t do
The ability to build your own MT engines with Moses, Phrasal, OpenNMT, Nematus, Fairseq.
Provide off-the-shelf general industry engines. There are some very adequate solutions for that!
19. MT @ Iconic – what we do do!
Customised expert-built MT, using the most appropriate tool for the job, MT or otherwise.
Develop products and solutions that incorporate machine translation – not just access to an API.
20. Neural MT – Early Adopter Program
Engage our expert team on a Neural MT project to see if it works for your content
To date
Custom developments with some of our closest partners
Now
Inviting early adopters to expand the range of cases
Direct people to previous talks I’ve given about the impact and history of neural MT:
SUMMARY
it’s promising, but it’s not a one size fits all taking over approach
still looking case by case in short term
long term, wait and see, it’s exciting
Been speaking recently about Neural MT
The (commercial) narrative around neural MT is moving even faster than the pace of development!
We’ve gone very quickly from explaining what it is, to companies showing initial results, to having full-blown production systems producing the best results ever – and in some cases from companies who never even offered MT before. Wow, that’s impressive.
Our team at Iconic knows a little bit about MT, so call me a cynic, but our approach is still to manage expectations while we all learn a little more about Neural MT.
Much of it’s still marketing
Still needs to be contextualised
This was always the case with MT, and that doesn’t change with Neural MT
Short-term impact | Long-term prospects
Ultimately just another type of MT, so we still have a lot of the same issues, and some new issues
Whether that custom MT is neural, or SMT, or hybrid, STILL depends. It’s still to be judged on a case by case basis
Ok, so let’s talk about our approach to neural MT at Iconic, but we’ll get some key questions out of the way up front.
Questions we're asked frequently as a provider of machine translation
This is what we've been doing for many years now
Fancy way of saying we can apply terminology
With Moses, the way people did MT, a lot of this was out of the box. Now we have to implement it ourselves, so that will separate the wheat from the chaff in terms of software.
Beam size, phrase length, distortion limit – now training epochs, vocab size, number of hidden layers
Before we talk about HOW we do it (which is less important, as I’ll point out), let’s look at what we’ve done so far with some case studies
REAL RESULTS FROM THE FIELD
A lot of ongoing patent work with some of our biggest clients, which is an ideal starting point for us to test our production mettle in this area
TEST NEURAL MT ON REAL USE CASES AND CRITERIA
“Baseline” Iconic MT here is
WHAT: Chinese – pre-ordering system, mature 3 years, auto post-editing
OUTCOME: Use Iconic where possible, because it’s quicker to retrain and more control
WHAT: Japanese – syntax based pre-ordering system, transliteration and script normalisation
Interesting, because we were doing A LOT of development before Neural MT but it was hard to make big improvements with constraints on data
OUTCOME: Use Iconic where possible, because it’s quicker to retrain and more control in the short term
Korean project: interesting, because it wasn’t something we had in production before Neural MT, mainly because it was so hard.
WHAT: Korean – hierarchical system
It’s a live one so the results are actually still with the client – LET’S SEE NEXT WEEK, TAUS AND LOCWORLD!
Internal QA on the output suggests the automatic scores are generous to Iconic MT. The Neural output is significantly better
Still ongoing. Iconic Neural MT engines are being built, but this is the baseline we have established!
CONCLUSIONS HERE:
Customised Iconic MT is better than general neural MT for 2 very different languages
Again, even customised NMT not as good
So, we will use what we can where it’s best.
NEURAL MT CUSTOMISATION NOT YET VERY POWERFUL ACROSS THE BOARD.
WE’RE STARTING TO GET AN INTUITION WHERE IT WILL HELP AND NOT.
Where does NEURAL FIT IN?
HOWEVER, it’s all well and good looking behind the curtain at how we do this but, at the end of the day, a particular quote springs to mind…
This was a quote from a prospective client on a call last month. I was in the midst of explaining what we use in which cases, where neural MT fits in, and they interrupted me and said “I don’t give a damn what you use”. I thought for a second, as I was disrupted from my flow, and thought: he’s right, you know. I’m telling people to leave it to us, and that’s what he wants.
Why does it matter how the translations are produced? Does it really matter if it’s neural or not – once it’s guaranteed as the best of what we could achieve, including using neural. I don’t think it does to anyone!
But because you’ve decided to watch, I’ll give you some insight into what we do and don’t do
- With SMT/NMT, whatever, this will only get you so far before more expertise is required. This is the case now more than ever with NMT, given how experimental it is.
- Not trying to provide off the shelf generic engines. Never done that. That’s the realm of Google. You’ll find it’s quite good for the general use case! Good basis for customisation.
We’re doing what we’ve always done!
EDISCOVERY
REGULATORY COMPLIANCE
WHERE MT IS A PART OF A BROADER SOLUTION
To close the loop on this story with Iconic and NEURAL MT
We’re learning, fast, and the more REAL opportunities there are the better for everyone.
Let us know if you have any questions or if it’s something you’d like to explore.
THANKS!