Introduction to GluonNLP

© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark
GluonNLP
A Natural Language Processing toolkit
gluon-nlp.mxnet.io

Three common myths …
Motivations for GluonNLP

Common myth 1
• I will write clean and reusable code when I’m prototyping this time.
• Variant: I will write clean and reusable code next time.
• Function & script? Hard-coded parameter?
=> Well-crafted reusable APIs

Common myth 2
• My code will still run next year.
• Sometimes, it’s not our fault.
=> Integrated testing of examples

Common myth 3
• I will finish setting up the baseline model this afternoon.
• Though it may not be our fault again.
=> Re-implementation of SOTA results

Goals
1. Problem: prototype code is not reusable without copying.
Solution: carefully designed API for versatile needs.
2. Problem: code may break due to API changes.
Solution: integrated testing for examples.
3. Problem: setting up a baseline for NLP tasks is hard.
Solution: implementations of state-of-the-art models.

GluonNLP goals
• Designed for engineers and researchers
• Enable fast prototyping for NLP applications and research

GluonNLP Community
• Internal users
• Amazon Comprehend
• Amazon Lex
• Amazon Transcribe
• Amazon Translate
• Amazon Personalize
• Alexa NLU
• Alexa Brain
• External users

Designed for practitioners: researchers and engineers
• High-level packages
• gluonnlp.data, gluonnlp.model, gluonnlp.embedding
• Low-level packages
• gluonnlp.data.batchify, gluonnlp.model.StandardRNN
• Datasets:
• gluonnlp.data.SQuAD, gluonnlp.data.WikiText103
http://gluon-nlp.mxnet.io/api/modules/data.html#public-datasets

GluonNLP Models
• Language Modeling
• Machine Translation
• Word Embedding (100+)
• Text Classification
• Text Generation
• Sentence Embedding
• Dependency Parsing
• Entailment
• Question Answering
• Named Entity Recognition
• Keyphrase Extraction
• Semantic Role Labeling
• Summarization
Legend: Released, WIP, Planned

APIs: Data Loading: Bucketing
How to generate the mini-batches?

No Bucketing + Directly Pad the Samples
Average Padding = 11.7
Be Frugal! Use Bucketing.
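The cost of padding can be estimated with a few lines of Python. The sketch below is illustrative only: the sequence lengths are randomly generated (so it will not reproduce the 11.7 figure from the slide), and the helper names `average_padding` and `chunk` are made up for this example. It compares padding each batch in arrival order against padding batches drawn from length-sorted data.

```python
import random


def average_padding(batches):
    """Mean number of pad tokens per sample when every batch is
    padded to the length of its longest sequence."""
    total_pad = sum(max(batch) - length for batch in batches for length in batch)
    total_samples = sum(len(batch) for batch in batches)
    return total_pad / total_samples


def chunk(lengths, batch_size):
    """Cut a list of sequence lengths into consecutive batches."""
    return [lengths[i:i + batch_size] for i in range(0, len(lengths), batch_size)]


random.seed(0)
# Made-up sequence lengths; a real corpus would supply these.
lengths = [random.randint(5, 60) for _ in range(1000)]

no_bucketing = average_padding(chunk(lengths, 32))
sorted_batches = average_padding(chunk(sorted(lengths), 32))
print(f"no bucketing: {no_bucketing:.1f} pad tokens per sample")
print(f"sorted:       {sorted_batches:.1f} pad tokens per sample")
```

Grouping sequences of similar length keeps the longest sequence in each batch close to the others, which is why the sorted variant wastes far fewer pad tokens.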

Sorted Bucketing

Fixed Bucketing
Shorter sequences can have larger batch sizes.

Fixed Bucketing + Length-aware Batch Size
Batch Size = 18, Batch Size = 11, Batch Size = 8
Better throughput! ✌️
[Figure: batch-size ratio against the length of the buckets]
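The idea of fixed buckets with a length-aware batch size can be sketched in plain Python. This is not GluonNLP's sampler; the function name `fixed_bucket_batches` and the `tokens_per_batch` budget are assumptions made for illustration. Each sample goes into the first bucket whose upper bound fits it, and buckets holding shorter sequences are cut into larger batches so that every batch carries roughly the same number of tokens.

```python
from collections import defaultdict


def fixed_bucket_batches(lengths, bucket_bounds, tokens_per_batch):
    """Return batches of sample indices.

    lengths:          sequence length of each sample
    bucket_bounds:    ascending upper bounds of the buckets (hypothetical values)
    tokens_per_batch: rough token budget per batch (hypothetical parameter)
    """
    buckets = defaultdict(list)
    for idx, length in enumerate(lengths):
        for bound in bucket_bounds:
            if length <= bound:
                buckets[bound].append(idx)
                break
        else:
            raise ValueError(f"sequence of length {length} exceeds the largest bucket")

    batches = []
    for bound in bucket_bounds:
        # Shorter buckets get larger batches under the same token budget.
        batch_size = max(1, tokens_per_batch // bound)
        members = buckets[bound]
        for start in range(0, len(members), batch_size):
            batches.append(members[start:start + batch_size])
    return batches


lengths = [4, 9, 10, 18, 20, 7, 3, 15]
batches = fixed_bucket_batches(lengths, bucket_bounds=[10, 20], tokens_per_batch=40)
print(batches)
```

With a budget of 40 tokens, the bucket bounded at 10 uses batches of up to 4 samples while the bucket bounded at 20 uses batches of up to 2. In practice the batches would also be shuffled between epochs, which this sketch leaves out.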

Improvement over published results
Table 1: fastText n-gram embedding scores, trained on Text8 dataset, evaluated on WordSim353
Table 2: Machine translation model BLEU score, same standard and settings
Table 3: AWD [1] language model, WikiText2 test perplexity
Implementation    Test perplexity (250 epochs)
GluonNLP          66.9
PyTorch           67.8
Diff              -0.9
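For readers less familiar with the metric: perplexity is conventionally the exponential of the average negative log-likelihood per token, so lower is better. The sketch below assumes natural-log token probabilities and uses made-up inputs; the `perplexity` helper is hypothetical and is not taken from GluonNLP.

```python
import math


def perplexity(token_log_probs):
    """exp of the mean negative log-likelihood (natural log) per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# A model that assigns probability 1/4 to every token has perplexity 4.
uniform = [math.log(0.25)] * 8
print(perplexity(uniform))

# The "Diff" row is the GluonNLP score minus the PyTorch score.
diff = round(66.9 - 67.8, 1)
print(diff)
```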

Machine Translation
Google Neural Machine Translation (GNMT)
• Encoder: Bidirectional LSTM + Residual
• Decoder: LSTM + Residual + MLP Attention
• GluonNLP:
• BLEU 26.22 on IWSLT2015, 10 epochs, Beam Size=10
• Tensorflow/nmt:
• BLEU 26.10 on IWSLT2015, Beam Size=10
Wu, Yonghui, et al. "Google's neural machine translation system: Bridging the gap between human and machine translation." arXiv preprint arXiv:1609.08144 (2016).

Machine Translation
Transformer
• Encoder
• 6 layers of self-attention + feed-forward
• Decoder
• 6 layers of masked self-attention and output of encoder + feed-forward
• GluonNLP:
• BLEU 26.81 on WMT2014en_de, 40 epochs
• Tensorflow/t2t:
• BLEU 26.55 on WMT2014en_de
Vaswani, Ashish, et al. "Attention is all you need." Advances in Neural Information Processing Systems. 2017.
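The "masked" self-attention in the decoder means that each position may only attend to itself and earlier positions. A minimal pure-Python sketch of that masking step follows. It is illustrative only: the helper names are made up, the scores are dummy values, and a real implementation would work on batched tensors with learned projections.

```python
import math


def causal_mask(n):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def masked_softmax(scores, mask):
    """Row-wise softmax that assigns zero weight to masked-out positions."""
    weights = []
    for row, allowed in zip(scores, mask):
        kept = [s for s, ok in zip(row, allowed) if ok]
        peak = max(kept)  # subtract the row maximum for numerical stability
        exps = [math.exp(s - peak) if ok else 0.0 for s, ok in zip(row, allowed)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Dummy attention scores: every pair of positions scores 0.0.
scores = [[0.0] * 3 for _ in range(3)]
weights = masked_softmax(scores, causal_mask(3))
for row in weights:
    print(row)
```

With equal scores, the first position puts all of its weight on itself, the second splits its weight over the first two positions, and the third spreads it over all three.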

Transfer Learning: ELMo
Embeddings from Language Models
• Feature-based approach
• Pre-training bidirectional language model
• Character embedding + stacked bidirectional LSTMs
• GluonNLP Tutorial
Deep contextualized word representations, Peters et al., 2018

Transfer Learning: BERT
Bidirectional Encoder Representations from Transformers
• Fine-tuning approach
• Pre-training masked language model + next sentence prediction
• Stacked transformer encoder + BPE
• GluonNLP Tutorial
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Devlin et al., 2018
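The masked language model objective hides some input tokens and trains the model to recover them. The sketch below is a simplified illustration rather than BERT's full recipe: it replaces each selected token with a `[MASK]` placeholder, whereas the paper also substitutes a random token or keeps the original for a fraction of the selected positions. The function name and its parameters are assumptions for this example.

```python
import random


def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=None):
    """Return (masked_tokens, targets), where targets maps each masked
    position to the original token the model should predict."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(mask_token)
            targets[i] = tok
        else:
            masked.append(tok)
    return masked, targets


tokens = "the quick brown fox jumps over the lazy dog".split()
masked, targets = mask_tokens(tokens, mask_prob=0.15, seed=0)
print(masked)
print(targets)
```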

Go build!
• http://gluon-nlp.mxnet.io/
Get help:
• https://discuss.mxnet.io/
