Beyond the Hype of Neural Machine Translation

In this presentation by tauyou and Prompsit, we explain the basics of Neural Machine Translation and then apply it to two use cases: a generic engine and a domain-specific engine. The results show that, although Neural Machine Translation is promising, there is still quite a lot of work to do before it becomes a real alternative in production use cases.
(LocWorld Dublin, June 2016)



1. Beyond the Hype of Neural Machine Translation
   Tauyou & Prompsit
   (Diego) dbc@tauyou.com | (Gema) gramirez@prompsit.com
2. Why neural nets?
   “artificial neural networks [...] are able to be trained from examples without the need for a thorough understanding of the task in hand, and able to show surprising generalization performance and predicting power”
   Mikel L. Forcada (Neural Networks: Automata and Formal Models of Computation)
3. Why neural nets in MT now?
   MT maturity
   ➔ MT is widely used (and there are plans to use it everywhere)
   ➔ MT for some languages is still not good enough (it is for others)
   ➔ RBMT, SMT and hybrid MT approaches are widely exploited
   Resource availability
   ➔ Computational power is available and cheap (GPUs)
   ➔ Deep learning algorithms and frameworks are available
   ➔ Data to learn from is also available (corpora)
4. So, why not?
   Promising results from the WMT16 competition: all the best systems are NMT.

              SMT            NMT
              BLEU    TER    BLEU    TER
   en-fi*     14.8    0.76   17.8    0.72
   en-ro      27.4    0.61   28.7    0.60
   en-ru      24.0    0.68   26.0    0.65
   en-de      31.4    0.58   34.8    0.54
   en-cz      24.1    0.67   26.3    0.63

   * The en-fi systems are Prompsit's + DCU's.
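(For reference: BLEU measures n-gram overlap with reference translations, so higher is better; TER counts the edits needed to turn the output into the reference, so lower is better. Below is a minimal corpus-level BLEU sketch using NLTK on hypothetical toy data; it is illustrative only and not the WMT16 scoring pipeline.)

```python
# Minimal corpus-level BLEU sketch with NLTK; illustrative only,
# not the official WMT16 evaluation tooling.
from nltk.translate.bleu_score import corpus_bleu

# Hypothetical toy data: one list of references per hypothesis.
references = [[["the", "cat", "sits", "on", "the", "mat"]]]
hypotheses = [["the", "cat", "sits", "on", "a", "mat"]]

# corpus_bleu averages modified n-gram precisions (up to 4-grams by
# default) and applies a brevity penalty for short hypotheses.
score = corpus_bleu(references, hypotheses)
print(f"BLEU: {score:.3f}")  # a value between 0 and 1
```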
5. Neural nets are...
   ➔ ...computational models inspired by Biology
   ➔ ...playing increasingly key roles in Graphics and Pattern Recognition
   ➔ ...experiencing a new edge thanks to hardware and deep learning
   ➔ ...made of encoding/decoding ‘neurons’
   ➔ ...applied to translation (= neural MT = NMT):
   ◆ encode SL words as vectors that represent the relevant information
   ◆ decode vectors into words, preserving syntactic and semantic information in the TL
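To make the encode/decode bullets concrete, here is a toy sketch in plain NumPy: a recurrent encoder folds source-word vectors into one fixed-size state, and a decoder unrolls that state into target-word choices. All sizes, weights, and names are hypothetical and untrained; real NMT systems of this period (such as the Theano-based Groundhog mentioned later) use learned embeddings, gated units, and attention.

```python
# Toy encoder-decoder sketch in plain NumPy (hypothetical, untrained
# weights). Real systems learn these matrices from bilingual corpora.
import numpy as np

rng = np.random.default_rng(0)
d_emb, d_hid, vocab_tgt = 8, 16, 5                      # tiny toy sizes

W_enc = rng.normal(size=(d_hid, d_hid + d_emb)) * 0.1   # encoder recurrence
W_dec = rng.normal(size=(d_hid, d_hid)) * 0.1           # decoder recurrence
W_out = rng.normal(size=(vocab_tgt, d_hid)) * 0.1       # hidden -> TL vocab

def encode(src_vectors):
    """Fold source-word vectors into one fixed-size summary state."""
    h = np.zeros(d_hid)
    for x in src_vectors:
        h = np.tanh(W_enc @ np.concatenate([h, x]))
    return h

def decode(h, steps=4):
    """Unroll the summary state into a sequence of target-word ids."""
    out = []
    for _ in range(steps):
        h = np.tanh(W_dec @ h)
        out.append(int(np.argmax(W_out @ h)))  # greedy choice per step
    return out

src = [rng.normal(size=d_emb) for _ in range(3)]  # stand-ins for SL embeddings
print(decode(encode(src)))                        # meaningless until trained
```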
6. NMT requires...
   ➔ Hardware: roughly 10x CPUs, or a GPU (times get shorter with GPUs)
   ➔ Software: a deep learning framework (Theano, Torch, etc.) + NMT libraries
   ➔ Data: bilingual corpora (monolingual only for the LM)
   ➔ Learning & (early) stopping: translation models are created iteratively
   ➔ Picking up a model: evaluation and selection of the best model(s)
   ➔ Translating: the selected model(s) are used to translate
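The "learning & (early) stopping" and "picking up a model" steps typically look like the loop below: train in passes over the corpus, score a held-out set after each pass, keep the best checkpoint, and stop once quality stalls. This is a hedged sketch; train_epoch, validation_bleu, and snapshot are hypothetical placeholders, not functions from any specific toolkit.

```python
# Hypothetical early-stopping loop; train_epoch(), validation_bleu() and
# snapshot() are placeholders for whatever the chosen NMT toolkit provides.
def train_with_early_stopping(model, train_data, dev_data,
                              max_epochs=20, patience=3):
    best_score, best_model, stale = float("-inf"), None, 0
    for epoch in range(max_epochs):
        train_epoch(model, train_data)         # one pass over the corpus
        score = validation_bleu(model, dev_data)
        if score > best_score:                 # new best checkpoint: keep it
            best_score, best_model, stale = score, snapshot(model), 0
        else:
            stale += 1                         # no improvement this epoch
        if stale >= patience:                  # quality has stalled: stop
            break
    return best_model, best_score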
7. Down to the NMT business
8. Applying NMT to generic and in-domain use cases
   Generic English -- Swedish, SMT vs. NMT
   ➔ Same generic corpus (8M segments), same training and test sets
   ➔ SMT: Moses-based, no tuning, on CPU
   ➔ NMT: Theano-based Groundhog NMT toolkit on GPU
   Domain-specific English -- Norwegian, SMT vs. NMT
   ➔ Same in-domain corpus (800K segments), same training and test sets
   ➔ SMT: Moses-based + tuning, on CPU
   ➔ NMT: Theano-based Groundhog NMT toolkit on GPU
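A practical note behind "same training and test sets": the split is made once, deterministically, and the identical files are fed to both systems, so any metric differences reflect the systems rather than the data. A hypothetical sketch (the 866-segment test size matches the generic experiment below):

```python
# Hypothetical deterministic corpus split, so that SMT and NMT train
# and evaluate on exactly the same segments.
import random

def split_corpus(segments, test_size=866, seed=42):
    rng = random.Random(seed)          # fixed seed -> reproducible split
    shuffled = list(segments)
    rng.shuffle(shuffled)
    return shuffled[test_size:], shuffled[:test_size]   # (train, test)
```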
9. Comparison for generic English - Swedish

                               SMT                       NMT
   Training time               48 hours (CPU)            2 weeks (GPU)
   Translation time            00:12:35 (866 segments)   01:38:47 (866 segments)
   CPU usage in translation    56%                       100%
   Disk space                  37.7 GB                   9.1 GB
   BLEU score                  0.440                     0.404
   Identical matches           19.33% (161/866)          12% (104/866)
   Edit distance similarity    0.78                      0.746
10. Comparison for in-domain English - Norwegian

                               SMT                         NMT
    Training time              1.8 hours (3 CPUs)          7 days (1 GPU)
    Translation time           00:01:22 (1,000 segments)   02:08:00 (1,000 segments)
    CPU usage in translation   56%                         100%
    Disk space                 2.3 GB                      6.5 GB
    BLEU score                 0.53                        0.62
    Identical matches          27.76% (276/1,000)          30% (300/1,000)
    Edit distance similarity   0.77                        0.83
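The "identical matches" and "edit distance similarity" rows in these two tables compare each MT output segment against its reference translation. The deck does not state the exact similarity formula, so the sketch below stands in with difflib's Ratcliff/Obershelp ratio; real setups often use a Levenshtein-based score instead.

```python
# Illustrative segment-level metrics; the exact edit-distance formula in
# the study is unstated, so difflib's ratio() is used here as a stand-in.
from difflib import SequenceMatcher

def compare(hypotheses, references):
    identical = sum(h == r for h, r in zip(hypotheses, references))
    # Mean character-level similarity in [0, 1]; 1.0 means identical strings.
    similarity = sum(SequenceMatcher(None, h, r).ratio()
                     for h, r in zip(hypotheses, references)) / len(references)
    return identical / len(references), similarity

hyp = ["the cat sits on the mat", "he reads a book"]
ref = ["the cat sits on the mat", "he is reading a book"]
ident, sim = compare(hyp, ref)
print(f"Identical matches: {ident:.0%}, edit distance similarity: {sim:.2f}")
```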
11. Conclusions SMT vs. NMT: technical insight

                              SMT        NMT
    Disk space                ✘          ✓ smaller
    CPU during translation    ✓          ✘
    RAM during translation    ✘          ✓ lower
    Training speed            ✓ faster   ✘ can be optimized with hardware
    Translation speed         ✓ faster   ✘ can be optimized with hardware
12. Conclusions SMT vs. NMT: qualitative insight

    In-domain                  SMT   NMT
    BLEU                       ✘     ✓
    Identical matches          ✘     ✓
    Edit distance similarity   ✘     ✓
    Translators' feedback      ✓     ✘

    Generic                    SMT   NMT
    BLEU                       ≈     ≈
    Identical matches          ✓     ✘
    Edit distance similarity   ≈     ≈
    Translators' feedback      ✓     ✘
13. Final conclusions
    ➔ NMT is a big new player in MT:
    ◆ Research is now focusing heavily on NMT: it already outperforms SMT in many cases
    ◆ Use-case results: with little effort, it is on par with SMT
    ◆ Hardware requirements are more demanding for NMT: a higher budget is needed
    ◆ Translators' feedback: SMT is still better
14. Final conclusions
    ➔ SMT, and other approaches, remain more robust and alive:
    ◆ Better quality and consistency in MT output
    ◆ Better ROI, especially for real-time translation applications where speed is critical
    ➔ Deep learning for other NLP applications?
    ◆ Of course! It is thriving in quality estimation, terminology, sentiment analysis, etc.
15. Thanks! Go raibh maith agaibh! (Irish: Thank you all!)
    Tauyou & Prompsit
    (Diego) dbc@tauyou.com | (Gema) gramirez@prompsit.com
