Apache MXNet Seattle meetup - August
GluonNLP: A Deep Learning Toolkit for NLP Practitioners
Presenter: Chenguang Wang
MXNet Science Team
GluonNLP
• A deep learning framework designed for fast data processing/loading and model building
GluonNLP APIs
gluonnlp.data

Build efficient data pipelines for NLP tasks

gluonnlp.model

Train or load state-of-the-art models for common NLP tasks

gluonnlp.embedding

Train or load state-of-the-art embeddings for common NLP tasks
GluonNLP Community
Main contributors:
Sheng Zha, Chenguang Wang, Aston Zhang, Mu Li, Shuai Zheng, Leonard Lausen,
Xingjian Shi

Code & docs:
https://github.com/dmlc/gluon-nlp

http://gluon-nlp.mxnet.io/

Forums:
https://discuss.gluon.ai/

https://discuss.mxnet.io/

GluonNLP Cool Examples
Data Bucketing
How to generate the mini-batches?
No Bucketing
Average Padding = 11.7
Data loading: slow and memory inefficient
Sorted Bucketing
Average Padding = 3.7
GluonNLP data bucketing: fast and memory efficient
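Why sorting by length reduces padding can be shown with a small sketch. It uses only the Python standard library rather than GluonNLP's own samplers, and the sentence lengths, batch size, and helper names (`average_padding`, `batches_in_order`, `batches_sorted`) are made up for illustration, so the numbers will not match the figures on the slides.

```python
import random

def average_padding(batches):
    """Average pad tokens per sequence when each batch is padded to its longest sequence."""
    total_pad = sum(max(batch) - length for batch in batches for length in batch)
    total_seqs = sum(len(batch) for batch in batches)
    return total_pad / total_seqs

def batches_in_order(lengths, batch_size):
    """No bucketing: group sequences in the order they arrive."""
    return [lengths[i:i + batch_size] for i in range(0, len(lengths), batch_size)]

def batches_sorted(lengths, batch_size):
    """Sorted bucketing: sort by length first, so each batch holds similar lengths."""
    return batches_in_order(sorted(lengths), batch_size)

random.seed(0)
lengths = [random.randint(5, 50) for _ in range(1000)]  # hypothetical sentence lengths
no_bucket = average_padding(batches_in_order(lengths, 32))
sorted_bucket = average_padding(batches_sorted(lengths, 32))
print(f"no bucketing: {no_bucket:.1f}, sorted bucketing: {sorted_bucket:.1f}")
```

Less padding means fewer wasted computations and less memory per batch, which is the motivation for the bucketing samplers in `gluonnlp.data`.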
Google Neural Machine Translation
Encoder: Bidirectional LSTM + LSTM + Residual
Decoder: LSTM + Residual + MLP Attention
• Our implementation:
  • BLEU 26.22 on IWSLT2015, 10 epochs, Beam Size=10
• TensorFlow/nmt:
  • BLEU 26.10 on IWSLT2015, Beam Size=10
Wu, Yonghui, et al. "Google's neural machine translation system: Bridging the gap between human and machine translation." arXiv preprint arXiv:1609.08144 (2016).
Transformer
• Encoder
  • 6 layers of self-attention + ffn
• Decoder
  • 6 layers of masked self-attention, attention over the output of the encoder, and ffn
• Our implementation:
  • BLEU 26.81 on WMT2014en_de, 40 epochs
• TensorFlow/t2t:
  • BLEU 26.55 on WMT2014en_de
Vaswani, Ashish, et al. "Attention is all you need." Advances in Neural Information Processing Systems. 2017.
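The "masked" self-attention used in the decoder can be sketched in plain Python. This is a simplified, single-head illustration with no learned projections, not the GluonNLP implementation; the function names and the toy input are assumptions made for this example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def masked_self_attention(queries, keys, values):
    """Scaled dot-product attention where position i only sees positions <= i."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for i, query in enumerate(queries):
        # Causal mask: only keys 0..i are visible to query i
        scores = [sum(q * k for q, k in zip(query, keys[j])) / math.sqrt(dim)
                  for j in range(i + 1)]
        weights = softmax(scores)
        output = [sum(w * values[j][c] for j, w in enumerate(weights))
                  for c in range(len(values[0]))]
        outputs.append(output)
        # Masked (future) positions get zero weight
        all_weights.append(weights + [0.0] * (len(keys) - i - 1))
    return outputs, all_weights

x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy sequence of three 2-d vectors
outputs, weights = masked_self_attention(x, x, x)
print(weights[0])  # the first position can only attend to itself
```

The mask keeps the decoder from looking at target words it has not generated yet, while the encoder's self-attention sees the whole source sentence.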
GluonNLP Step-by-step
-A language model example
Language Model
• A language model predicts the next word based on the previous ones
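As a minimal illustration of next-word prediction, here is a count-based bigram model in plain Python. It is a stand-in for the idea only: the talk uses a recurrent neural network, and the corpus and function names below are invented for this sketch.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each word follows each previous word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, prev):
    """Return the most frequent follower of `prev`, or None if `prev` was never seen."""
    if prev not in counts:
        return None
    return counts[prev].most_common(1)[0][0]

corpus = "the cat sat on the mat . the cat ate .".split()
bigram_counts = train_bigram(corpus)
print(predict_next(bigram_counts, "the"))
```

A neural language model replaces the count table with an embedding, a recurrent encoder, and a decoder over the vocabulary, which lets it condition on more than one previous word.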
Steps to Write Language Model
• 1. Collect a dataset <-most of the work
• 2. Build the model <-a few lines of code
• 3. Train <-a few lines of code
• 4. Evaluate <-one line
• 5. Inference <-one line
http://gluon-nlp.mxnet.io/examples/language_model/language_model.html
Step #1: Collect a dataset
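import gluonnlp as nlp

# Example hyperparameters (not given on the slide; pick values for your setup)
batch_size, bptt = 20, 35

dataset_name = 'wikitext-2'
train_dataset, val_dataset, test_dataset = [
    nlp.data.WikiText2(segment=segment, bos=None, eos='<eos>', skip_empty=False)
    for segment in ['train', 'val', 'test']]

# Build the vocabulary from the training tokens only
vocab = nlp.Vocab(nlp.data.Counter(train_dataset[0]),
                  padding_token=None, bos_token=None)

# Cut each corpus into fixed-length chunks for truncated back-propagation through time
train_data, val_data, test_data = [
    x.bptt_batchify(vocab, bptt, batch_size, last_batch='discard')
    for x in [train_dataset, val_dataset, test_dataset]]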
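What a BPTT-style batchify step does can be sketched in plain Python. This is a simplified illustration, not GluonNLP's `bptt_batchify`: the function name is hypothetical, the output is kept batch-major for readability, and dropping an incomplete final chunk is only meant to resemble `last_batch='discard'`.

```python
def bptt_batchify_sketch(token_ids, bptt, batch_size):
    """Split one long token stream into (data, target) chunks for truncated BPTT."""
    # Keep only as many tokens as fit evenly into batch_size parallel streams
    stream_len = len(token_ids) // batch_size
    streams = [token_ids[i * stream_len:(i + 1) * stream_len]
               for i in range(batch_size)]
    chunks = []
    # Targets are the inputs shifted by one token, so the last token is never an input
    for start in range(0, stream_len - 1, bptt):
        end = min(start + bptt, stream_len - 1)
        if end - start < bptt:
            break  # drop an incomplete final chunk
        data = [stream[start:end] for stream in streams]
        target = [stream[start + 1:end + 1] for stream in streams]
        chunks.append((data, target))
    return chunks

chunks = bptt_batchify_sketch(list(range(20)), bptt=3, batch_size=2)
for data, target in chunks:
    print(data, target)
```

Each target sequence is the data sequence shifted by one position, which is exactly the "predict the next word" objective.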
Step #2: Build the model
# Inside the model class, three components are created:
with self.name_scope():
    self.embedding = self._get_embedding()
    self.encoder = self._get_encoder()
    self.decoder = self._get_decoder()

(Diagram: self.embedding, self.encoder, self.decoder)

# Building the model itself takes one call:
model = nlp.model.train.StandardRNN(args.model, len(vocab), args.emsize,
                                    args.nhid, args.nlayers, args.dropout, args.tied)
Step #3: Train
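import mxnet as mx
from mxnet import gluon

# context, lr, and epochs are chosen by the user, e.g. context = mx.cpu()
model.initialize(mx.init.Xavier(), ctx=context)
trainer = gluon.Trainer(model.collect_params(), 'sgd',
                        {'learning_rate': lr, 'momentum': 0, 'wd': 0})
loss = gluon.loss.SoftmaxCrossEntropyLoss()
# train() is the training loop defined in the linked language model example
train(model, train_data, val_data, test_data, epochs, lr)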
Step #4: Evaluate
test_L = evaluate(model, test_data, batch_size)
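Language models are commonly reported by perplexity, the exponential of the average per-token cross-entropy. The sketch below assumes that `evaluate` returns such an average loss, which the slide does not state; the helper names and the toy probabilities are illustrative only.

```python
import math

def average_cross_entropy(target_probs):
    """Mean negative log-probability the model assigned to the correct next tokens."""
    return -sum(math.log(p) for p in target_probs) / len(target_probs)

def perplexity(avg_loss):
    """Perplexity is the exponential of the average cross-entropy (in nats)."""
    return math.exp(avg_loss)

# Toy case: the model gives each correct token probability 0.25
probs = [0.25, 0.25, 0.25, 0.25]
loss = average_cross_entropy(probs)
print(f"loss {loss:.3f}, perplexity {perplexity(loss):.2f}")
```

A perplexity of about 4 here means the model is as uncertain as if it were choosing uniformly among four words at each step.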
Step #5: Inference
model, _ = nlp.model.get_model('standard_lstm_lm_200', vocab=vocab)
test_L = evaluate(model, test_data, batch_size)
GluonNLP Embedding
http://gluon-nlp.mxnet.io/api/embedding.html
Embedding is Powerful!
• Language Embedding
  • Word Embedding, Sentence Embedding, Paragraph Embedding, etc.
  • Word2vec, FastText, GloVe, etc.
  • Language model, machine translation, QA, Dialog System, etc.
• Graph Embedding
  • Network embedding, Subgraph embedding
  • LINE, Deepwalk, CNN embedding
  • Graph mining, etc.
• Image Embedding
  • CNN embedding
  • Faster R-CNN, etc.
  • Image classification, Image detection, SSD, etc.
• Recommendation, Information Retrieval, Advertising, etc.
Word Embedding
Map words or phrases from the vocabulary to vectors of real numbers.
Word2vec
• Skip-gram
  • Given a center word, predict surrounding words
Mikolov, Tomas, Kai Chen, Greg Corrado, and Jeffrey Dean. "Efficient estimation of word representations in vector space." ICLR Workshop. 2013.
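The skip-gram objective trains on (center word, context word) pairs drawn from a sliding window. A minimal sketch of generating those pairs follows; the function name, window size, and sentence are chosen for illustration and are not taken from the talk.

```python
def skipgram_pairs(tokens, window):
    """List every (center, context) pair within `window` positions of each center word."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

pairs = skipgram_pairs(["the", "quick", "brown", "fox"], window=1)
for center, context in pairs:
    print(center, "->", context)
```

The model is then trained so that the embedding of each center word is predictive of its context words.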
FastText
• (Unknown) word:
  • the sum of its character n-gram vectors
Bojanowski, Piotr, et al. "Enriching word vectors with subword information." arXiv preprint arXiv:1607.04606 (2016).
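How a vector for an unseen word can be composed from character n-grams is sketched below. It follows the idea in the cited paper (boundary markers `<` and `>`, n-grams of length 3 to 6) but leaves out details such as hashing and the whole-word token; the function names and the tiny n-gram table are hypothetical.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of the word wrapped in boundary markers '<' and '>'."""
    wrapped = "<" + word + ">"
    return [wrapped[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(wrapped) - n + 1)]

def word_vector(word, ngram_vectors, dim):
    """Sum the vectors of all known n-grams; unknown n-grams contribute zeros."""
    vec = [0.0] * dim
    for gram in char_ngrams(word):
        for k, value in enumerate(ngram_vectors.get(gram, [0.0] * dim)):
            vec[k] += value
    return vec

print(char_ngrams("where", 3, 3))
toy_ngram_vectors = {"<wh": [1.0, 0.0], "re>": [0.0, 2.0]}  # made-up values
print(word_vector("where", toy_ngram_vectors, dim=2))
```

Because the n-gram vectors are shared across words, a word that never appeared in training still gets a meaningful vector from its subword pieces.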
Embedding Evaluation
• Similarity
  • See the example: http://gluon-nlp.mxnet.io/index.html
• Analogy
  • See the example: http://gluon-nlp.mxnet.io/examples/word_embedding/word_embedding.html
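Both evaluations reduce to vector arithmetic plus cosine similarity. The following sketch uses hand-made three-dimensional vectors instead of pretrained embeddings; the vectors and function names are assumptions for illustration, and the linked examples show the GluonNLP way of doing this.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(a, b, c, vectors):
    """Answer 'a is to b as c is to ?' with the word nearest to vec(b) - vec(a) + vec(c)."""
    query = [vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(query, vectors[w]))

toy = {  # made-up vectors, not real embeddings
    "man": [1.0, 0.0, 0.0],
    "woman": [1.0, 1.0, 0.0],
    "king": [1.0, 0.0, 1.0],
    "queen": [1.0, 1.0, 1.0],
    "apple": [0.0, 0.0, 0.1],
}
print(cosine(toy["king"], toy["queen"]))
print(analogy("man", "woman", "king", toy))
```

Similarity benchmarks compare such cosine scores with human judgments, while analogy benchmarks check whether the nearest word to the query vector is the expected answer.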
GluonNLP Status
• Dataset
  • Many public datasets
  • Streaming for very large datasets
• Text data processing
  • Vocabulary
  • Tokenization
  • Bucketing
• Modeling
  • Attention
  • Beam Search
  • Weight Drop
• Embedding
  • Pretrained Embedding
  • Embedding Training
• State-of-the-art models
  • Embedding, LM, MT, SA
• Examples friendly to users that are new to the task
• Reproducible training scripts
More is coming soon!
Summary
• In GluonNLP, we provide
  • High-level APIs
    • gluonnlp.data, gluonnlp.model, gluonnlp.embedding
  • Low-level APIs
    • gluonnlp.data.batchify, gluonnlp.model.StandardRNN
• Designed for practitioners: researchers and engineers
Thanks and Q&A
gluon-nlp.mxnet.io