1
Agenda - Deep Learning techniques for Natural Language Processing (NLP)
• Introduction
• Use case landscape
• Shallow vs Deep
• Deep NLP - SOTAs
• BERT
• ERNIE
• REFORMER
• Implementation
• Q&A
Pic: Young Sheldon with ELIZA chatbot
2
AI Trends for 2020
Process Automation
AI in health care
Voice/Chat interface
Federated learning
Ethical AI
3
Conversational AI - Trends
Digital Assistants for Enterprises/Solution Bots
Easy mail searches, hassle-free meeting management, task assignment, and zero-touch access to knowledge repositories and other
applications: these are some of the areas where a typical white-collar employee spends more than 25% of their effort. An intelligent
virtual assistant can take over these non-value-adding activities.
Augmented Reality in Conversational AI
AR in chatbots is a distinctive technology that can take engagement and usage to new heights.
No UI is the New UI
With the emergence of Conversational AI bots, you no longer have to look through multiple pages and tabs of a web/mobile app for any
information or task execution. You can simply query the bot, which does most of the work.
SMS 2.0: RCS messaging
SMS is one of the key channels on which Conversational AI bots will be published. Rich Communication Services (RCS) is gradually
replacing conventional SMS channels.
Machine to Machine Conversations (M2M)
Conversational AI bots are used to trigger man-machine interaction and to decipher information collected from IoT devices in order to
draw insights and make recommendations.
4
Fight against Covid19 - using NLP
• Online consultation
• Intelligent Robo-call platform (1,500 calls/sec)
• Online search query handling
• Virtual healthcare software (bright.md)
• BlueDot (Canadian startup) for initial scanning
https://www.technologyreview.com/2020/03/11/905366/how-baidu-is-bringing-ai-to-the-fight-against-coronavirus/
5
A typical AI Application today means…
6
Basic Objectives of NLP computing models
• Understand semantics
• Lexical (word)
• Composition (Sentence)
• Discourse (long-term context)
• Understand syntax
• Understand context
• Understand intent
Leonard: Hey, Penny. How's work?
Penny: Great! I hope I'm a waitress at the Cheesecake
Factory for my whole life!
Sheldon: Was that sarcasm?
Penny: No.
Sheldon: Was that sarcasm?
Penny: Yes.
"The Financial Permeability," Season 2, The Big Bang Theory
7
NLP - Computing Domain
Shallow Learning
• POS tagging
• NER (Named Entity Recognition)
• Bag of Words
• TF-IDF
• LDA
• CRF (Conditional Random Field)
• SRL (Semantic Role Labelling)
• OCR
Deep learning
• Word Embedding
• Sequence learning models
• LSTM
• RNN
• Encoder Decoder models
• Attention
• Transformer
• Knowledge Graph
8
Deep learning for NLP
Deep learning journey
9
“The Unreasonable Effectiveness of Recurrent Neural Networks”
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
Figure: RNN use cases (image classification, image captioning, sentiment analysis, language translation, subtitle generation)
10
Word Embedding
• Vector space model
• Preserving semantics
• N-gram models
• Pre-trained models
• Building blocks for Language Model (LM)
Figure: word embedding space, with example words Modi, Varanasi, Prime Minister and constituency
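The "vector space model" and "preserving semantics" bullets can be made concrete with a small sketch: related words get vectors that point in similar directions, which cosine similarity measures. The 3-dimensional vectors below are hand-picked assumptions for illustration only, not values from any pre-trained model, and `most_similar` is a hypothetical helper.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings, hand-picked for illustration only.
embeddings = {
    "Modi": [0.9, 0.8, 0.1],
    "Prime Minister": [0.8, 0.9, 0.2],
    "Varanasi": [0.7, 0.2, 0.9],
    "constituency": [0.6, 0.3, 0.8],
}

def most_similar(word):
    """Return the other word whose vector has the highest cosine similarity."""
    others = (w for w in embeddings if w != word)
    return max(others, key=lambda w: cosine_similarity(embeddings[word], embeddings[w]))

print(most_similar("Modi"))
print(most_similar("Varanasi"))
```

With real pre-trained embeddings the vectors have hundreds of dimensions, but the nearest-neighbour lookup works the same way.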
11
ATTENTION block
Attention types
• Additive
• Multiplicative
• Self
• Key-Value
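As a rough illustration of the multiplicative and self-attention types listed above, the sketch below computes scaled dot-product attention in plain Python. The function names and the toy 2-dimensional token vectors are assumptions made for this example; a real model would also apply learned query, key and value projections first.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot_product_attention(queries, keys, values):
    """Scaled multiplicative attention: softmax(q.k / sqrt(d)) weights over the values."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs

# Self-attention: queries, keys and values all come from the same toy sequence.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = dot_product_attention(tokens, tokens, tokens)
```

Additive attention differs only in how the score is computed: it passes the query and key through a small feed-forward layer instead of taking their dot product.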
12
Sequence to Sequence (Seq2Seq) Model
Courtesy: Analytics Vidhya
• Encoder and Decoder models
• The Encoder and Decoder can each be built from any combination of RNN, LSTM or CNN layers, depending on
performance and other requirements
• Uses an Attention mechanism for context preservation
Courtesy: Towards Data Science
13
Transformer Based Language Models (LM)
• A Seq2Seq model with ‘Attention’ on steroids
Transformer Based LMs
• BERT (Google)
• GPT (OpenAI), GPT-2
• ERNIE (Baidu)
• XLNet (Google Brain & CMU)
Non-Transformer Based LMs
• ELMo (AllenAI)
• ULMFiT (Fast.ai)
• CoVe
• GloVe (Manning, Socher and others)
• Word2vec (Tomas Mikolov and others)
Figure notes: multi-head attention blocks; the Decoder has an additional masked block
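The "masked block" in the decoder can be sketched as follows: before the softmax, scores for future positions are set to minus infinity so that position i can only attend to positions up to i. This is a minimal illustration with a hypothetical function name and uniform toy scores, not code from any particular Transformer implementation.

```python
import math

def causal_attention_weights(scores):
    """Row-wise softmax where position i may only attend to positions <= i."""
    n = len(scores)
    weights = []
    for i in range(n):
        # Mask future positions (j > i) by giving them a score of -infinity.
        masked = [scores[i][j] if j <= i else float("-inf") for j in range(n)]
        peak = max(masked)
        exps = [math.exp(s - peak) for s in masked]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Uniform toy scores make the effect of the mask easy to see.
uniform_scores = [[0.0] * 3 for _ in range(3)]
weights = causal_attention_weights(uniform_scores)
for row in weights:
    print(row)
```

Each row still sums to 1, but the weight is spread only over the current and earlier positions.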
14
BERT
• BERT stands for Bidirectional Encoder Representations from Transformers
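# BERT modules from the google-research/bert code base (TensorFlow 1.x)
import tensorflow as tf
import bert
from bert import tokenization
from bert import modeling
from bert import optimization
from bert import run_classifier

# Paths into the unpacked cased BERT-Base checkpoint (12 layers, hidden size 768, 12 heads)
BERT_VOCAB = 'cased_L-12_H-768_A-12/vocab.txt'
BERT_INIT_CHKPNT = 'cased_L-12_H-768_A-12/bert_model.ckpt'
BERT_CONFIG = 'cased_L-12_H-768_A-12/bert_config.json'

# Cased model, so the tokenizer must not lowercase the input
tokenizer = tokenization.FullTokenizer(vocab_file=BERT_VOCAB, do_lower_case=False)
bert_config = modeling.BertConfig.from_json_file(BERT_CONFIG)

# _Model is a user-defined wrapper class that is not shown on the slide
model = _Model(bert_config, tokenizer)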
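To give a feel for what the tokenizer created above does to a single word, here is a simplified sketch of greedy longest-match-first WordPiece splitting. The function `wordpiece_tokenize` and the five-entry vocabulary are assumptions for illustration; the real tokenizer reads its vocabulary from vocab.txt and also handles whitespace and punctuation before this step.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Greedy longest-match-first split of one word into WordPiece sub-tokens."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        current = None
        # Try the longest remaining substring first, then shrink from the right.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                current = candidate
                break
            end -= 1
        if current is None:
            return [unk_token]  # no piece matched: the whole word is unknown
        pieces.append(current)
        start = end
    return pieces

# Hypothetical toy vocabulary; the real one is read from vocab.txt.
toy_vocab = {"un", "##afford", "##able", "play", "##ing"}
print(wordpiece_tokenize("unaffordable", toy_vocab))
print(wordpiece_tokenize("playing", toy_vocab))
```

Splitting rare words into known pieces is what lets a fixed-size vocabulary cover open-ended text.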
15
BERT and Its Variants
• There are several variants of BERT for domain-specific use cases. A few are listed below
• VideoBERT: Learning Cross-Modal Temporal Representations from Unlabeled Videos
• TinyBERT, ALBERT, RoBERTa...
BERT variant | Use case and data set | Reference
SciBERT | Trained on papers from the corpus of semanticscholar.org | https://arxiv.org/abs/1903.10676
BioBERT | Trained on BC5CDR and BioNLP13CG data sets | https://github.com/MeRajat/SolvingAlmostAnythingWithBert
ClinicalBERT | Trained on MIMIC-III data | https://arxiv.org/abs/1904.05342
FinBERT | TRC2-financial, Financial PhraseBank, FiQA Sentiment | https://arxiv.org/abs/1908.10063
16
ERNIE 2.0
• ERNIE (Enhanced Representation through kNowledge IntEgration), a new knowledge integration language representation model
• ERNIE 2.0 is built as a continual pretraining framework to continuously gain enhancement on knowledge integration
through multi-task learning, enabling it to more fully learn various lexical, syntactic and semantic information through
massive data
• ERNIE 2.0 can incrementally train on several new tasks in sequence and accumulate the knowledge it obtains during the
learning process to apply to future tasks
• Knowledge Graph incorporation
• Structured knowledge encoding
• Performs best for question-answering use cases
17
Reformer: The Efficient Transformer
• A Transformer model designed to handle context windows of up to 1 million words, all on a single accelerator and using only
16GB of memory
• Reformer uses locality-sensitive-hashing (LSH) to reduce the complexity of attending over long sequences and reversible
residual layers to more efficiently use the memory available
• LSH handles long sequences in the attention layer by computing a hash function that groups similar vectors
together, instead of searching through all possible pairs of vectors
• The second novel approach implemented in Reformer is to recompute the input of each layer on demand during
backpropagation, rather than storing it in memory. This is accomplished by using reversible layers, where activations from the last
layer of the network are used to recover activations from any intermediate layer, by what amounts to running the network in
reverse
Reversible layers: (A) In a standard residual network, the activations from each layer
are used to update the inputs into the next layer. (B) In a reversible network, two
sets of activations are maintained, only one of which is updated after each layer. (C)
This approach enables running the network in reverse in order to recover all
intermediate values.
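The reversible-layer idea in the caption above can be sketched with two activation streams: the forward pass computes y1 = x1 + F(x2) and y2 = x2 + G(y1), and the inputs are then recovered by subtraction. The functions `f` and `g` below are toy stand-ins (assumptions for illustration) for the attention and feed-forward sub-layers.

```python
def f(x):
    """Stand-in for the attention sub-layer (hypothetical toy function)."""
    return [2.0 * v + 1.0 for v in x]

def g(x):
    """Stand-in for the feed-forward sub-layer (hypothetical toy function)."""
    return [v * v for v in x]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def reversible_forward(x1, x2):
    """y1 = x1 + F(x2); y2 = x2 + G(y1)."""
    y1 = add(x1, f(x2))
    y2 = add(x2, g(y1))
    return y1, y2

def reversible_inverse(y1, y2):
    """Recover the layer inputs from its outputs without having stored them."""
    x2 = sub(y2, g(y1))
    x1 = sub(y1, f(x2))
    return x1, x2

x1, x2 = [1.0, 2.0], [3.0, 4.0]
y1, y2 = reversible_forward(x1, x2)
recovered = reversible_inverse(y1, y2)
print(recovered)
```

Because the inverse only needs the layer outputs, intermediate activations do not have to be kept in memory during training, at the cost of recomputing `f` and `g` in the backward pass.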
Locality-sensitive-hashing: Reformer takes in an input sequence of keys, where each
key is a vector representing individual words (or pixels, in the case of images) in the
first layer and larger contexts in subsequent layers. LSH is applied to the sequence,
after which the keys are sorted by their hash and chunked. Attention is applied only
within a single chunk and its immediate neighbors.
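The hashing, sorting and chunking steps described above can be sketched roughly as follows. This uses an angular LSH based on random projections (the bucket is the index of the largest entry of [xR, -xR]); the helper names, toy keys, seed and chunk size are assumptions for illustration. Similar vectors are likely, but not guaranteed, to land in the same bucket.

```python
import random

def lsh_bucket(vector, projections):
    """Angular LSH: index of the largest entry in [xR, -xR]."""
    rotated = [sum(v * r for v, r in zip(vector, row)) for row in projections]
    scores = rotated + [-s for s in rotated]
    return max(range(len(scores)), key=scores.__getitem__)

def sort_and_chunk(vectors, projections, chunk_size):
    """Sort positions by bucket id (ties keep sequence order), then split into chunks."""
    buckets = [lsh_bucket(v, projections) for v in vectors]
    order = sorted(range(len(vectors)), key=lambda i: (buckets[i], i))
    return [order[i:i + chunk_size] for i in range(0, len(order), chunk_size)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
dim, n_buckets = 4, 4
# n_buckets / 2 random directions, each of length dim.
projections = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_buckets // 2)]

# Toy keys (assumed values): two similar vectors, their opposite, and an unrelated one.
keys = [[1.0, 0.0, 0.0, 0.0], [0.9, 0.1, 0.0, 0.0], [-1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
chunks = sort_and_chunk(keys, projections, chunk_size=2)
print(chunks)
```

Attention would then be computed only among the positions inside each chunk and its immediate neighbours, rather than over every pair of positions in the sequence.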
18
Multi-Lingual NLP (Multilingualism)
• The NLP community has shown interest in multilingual NLP for both research and business reasons
• Multiple individuals and organizations have contributed developments in this space
• BERT has its own multilingual variant, mBERT (https://github.com/google-research/bert/blob/master/multilingual.md)
• A few more multilingual NLP network architectures are given below
• LASER (Language-Agnostic SEntence Representations) (https://github.com/facebookresearch/LASER)
• Multilingual Universal Sentence Encoder for Semantic Retrieval
• XLM (https://github.com/facebookresearch/XLM)
19
Context Aware Diagnostics and Troubleshooting
20
Confidentiality Notice
This document and all information contained herein is the sole property of Tata Elxsi Limited and shall not be reproduced or
disclosed to a third party without the express written consent of Tata Elxsi Limited.
www.tataelxsi.com
Thank You
Tata Elxsi
facebook.com/ElxsiTata twitter.com/tetataelxsi linkedin.com/company/tata-elxsi

Deep Learning in NLP (BERT, ERNIE and REFORMER)

  • 1. Agenda - Deep Learning techniques for Natural Language Processing (NLP)
    • Introduction
    • Use case landscape
    • Shallow vs Deep
    • Deep NLP - SOTAs
    • BERT
    • ERNIE
    • REFORMER
    • Implementation
    • Q&A
    Pic: Young Sheldon with ELIZA chatbot
  • 2. AI Trends for 2020
    • Process Automation
    • AI in health care
    • Voice/Chat interface
    • Federated learning
    • Ethical AI
  • 3. Conversational AI - Trends
    • Digital Assistants for Enterprises/Solution Bots: Facilitating easy mail searches, managing meetings without hassle, assigning tasks, and accessing knowledge repositories and different applications with zero touch: these are some of the areas where a typical white-collar employee spends more than 25% of their effort. An intelligent virtual assistant can perform these non-value-adding activities smartly.
    • Augmented Reality in Conversational AI: AR in chatbots is a unique technology that can take engagement and usage to new heights.
    • No UI is the New UI: With the emergence of Conversational AI bots, you no longer have to look through multiple pages and tabs of a web/mobile app for information or task execution. You can simply query the bot, which does most of the work.
    • SMS 2.0: RCS messaging: SMS is one of the key channels on which Conversational AI bots will be published. Rich Communication Services (RCS) is gradually replacing conventional SMS channels.
    • Machine to Machine Conversations (M2M): Conversational AI bots are used to trigger man-machine interaction and to decipher information collected from IoT devices in order to draw insights and make recommendations.
  • 4. Fight against Covid19 - using NLP
    • Online consultation
    • Intelligent Robo-call platform (1,500 calls/sec)
    • Online search query handling
    • Virtual healthcare software (bright.md)
    • BlueDot (Canadian startup) for initial scanning
    https://www.technologyreview.com/2020/03/11/905366/how-baidu-is-bringing-ai-to-the-fight-against-coronavirus/
  • 5. A typical AI Application today means…
  • 6. Basic Objectives of NLP computing models
    • Understand semantics
      • Lexical (word)
      • Composition (sentence)
      • Discourse (long-term context)
    • Understand syntax
    • Understand context
    • Understand intent
    Leonard: Hey, Penny. How's work?
    Penny: Great! I hope I'm a waitress at the Cheesecake Factory for my whole life!
    Sheldon: Was that sarcasm?
    Penny: No.
    Sheldon: Was that sarcasm?
    Penny: Yes.
    "The Financial Permeability," Season 2, The Big Bang Theory
  • 7. NLP - Computing Domain
    Shallow Learning
    • POS tagging
    • NER (Named Entity Recognition)
    • Bag of Words
    • TF-IDF
    • LDA
    • CRF (Conditional Random Field)
    • SRL (Semantic Role Labelling)
    • OCR
    Deep learning
    • Word Embedding
    • Sequence learning models
      • LSTM
      • RNN
    • Encoder Decoder models
    • Attention
    • Transformer
    • Knowledge Graph
  • 8. Deep learning for NLP: Deep learning journey
  • 9. “The Unreasonable success of RNNs”
    http://karpathy.github.io/2015/05/21/rnn-effectiveness/
    • Image classification
    • Image captioning
    • Sentiment Analysis
    • Language translation
    • Subtitle generation
  • 10. Word Embedding
    • Vector space model
    • Preserving semantics
    • N-gram models
    • Pre-trained models
    • Building blocks for Language Model (LM)
    Figure: word embedding space (Modi, Varanasi, Prime Minister, constituency)
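A minimal sketch of what "preserving semantics" in a vector space can look like: related words end up with vectors that point in similar directions, which cosine similarity measures. The 3-dimensional vectors below are hypothetical values hand-picked for illustration; real pre-trained embeddings have hundreds of dimensions and are learned from data. The helper names `cosine_similarity` and `most_similar` are also assumptions of this sketch.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings (NOT taken from any trained model).
embeddings = {
    "Modi": [0.9, 0.8, 0.1],
    "Prime Minister": [0.8, 0.9, 0.2],
    "Varanasi": [0.2, 0.3, 0.9],
    "constituency": [0.3, 0.2, 0.8],
}


def most_similar(word, table):
    """Return the other word whose vector is closest by cosine similarity."""
    others = [w for w in table if w != word]
    return max(others, key=lambda w: cosine_similarity(table[word], table[w]))


print(most_similar("Modi", embeddings))
print(most_similar("Varanasi", embeddings))
```

With these toy values, "Modi" sits closest to "Prime Minister" and "Varanasi" closest to "constituency"; a trained model would be expected to capture such relations only to the extent its corpus supports them.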
  • 11. ATTENTION block
    Attention types
    • Additive
    • Multiplicative
    • Self
    • Key-Value
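As an illustration of the multiplicative and self-attention types listed above, here is a minimal pure-Python sketch of scaled dot-product attention, the multiplicative form used in Transformers. It is a sketch under simplifying assumptions: the learned query/key/value projections of a real model are omitted, and the function names and the toy `tokens` input are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(queries, keys, values):
    """Scaled dot-product (multiplicative) attention.

    Each query is scored against every key, the scores are normalised
    with softmax, and the output is the weighted sum of the values.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Self-attention: queries, keys and values all come from the same sequence.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical token vectors
outputs, weights = dot_product_attention(tokens, tokens, tokens)
```

Additive attention differs only in how the score is computed (a small feed-forward network over the query and key instead of a dot product); the softmax and weighted sum stay the same.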
  • 12. Sequence to Sequence (Seq2Seq) Model
    • Encoder and Decoder models
    • The Encoder and Decoder can use any combination of RNN, LSTM, or CNN to realize the model, depending on performance and other requirements
    • Uses the Attention mechanism for context preservation
    Figures courtesy: Analytics Vidhya; Towards Data Science
  • 13. Transformer Based Language Models (LM)
    • A Seq2Seq model with 'Attention' on steroids
    Transformer Based LMs
    • BERT (Google)
    • GPT (OpenAI), GPT-2
    • ERNIE (Baidu), XLNet (Google Brain & CMU)
    Non-Transformer Based LMs
    • ELMo (AllenAI), ULMFiT (fast.ai), CoVe
    • GloVe (Manning, Socher and others)
    • Word2vec (Tomas Mikolov and others)
    Figure notes: Multi-head attention blocks; the Decoder has an additional Masked block
  • 14. BERT
    • BERT stands for Bidirectional Encoder Representations from Transformers

        import tensorflow as tf
        import bert
        from bert import tokenization
        from bert import modeling
        from bert import optimization
        from bert import run_classifier

        # Files from the unpacked pre-trained checkpoint
        # (cased, 12 layers, hidden size 768, 12 attention heads).
        BERT_VOCAB = 'cased_L-12_H-768_A-12/vocab.txt'
        BERT_INIT_CHKPNT = 'cased_L-12_H-768_A-12/bert_model.ckpt'
        BERT_CONFIG = 'cased_L-12_H-768_A-12/bert_config.json'

        # do_lower_case=False because this is a cased model.
        tokenizer = tokenization.FullTokenizer(vocab_file=BERT_VOCAB,
                                               do_lower_case=False)
        bert_config = modeling.BertConfig.from_json_file(BERT_CONFIG)

        # _Model is not defined on this slide.
        model = _Model(bert_config, tokenizer)
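To give a feel for what the tokenizer in the snippet above does with `vocab.txt`, here is a minimal sketch of the greedy longest-match-first WordPiece step. It is an illustration only: the real `FullTokenizer` also performs basic whitespace and punctuation splitting first, and `wordpiece_tokenize` and `toy_vocab` are hypothetical names with a hand-made vocabulary.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Greedy longest-match-first split of one word into vocabulary pieces."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry "##"
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits: the whole word maps to the unknown token.
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary; a real one is loaded from vocab.txt.
toy_vocab = {"un", "##aff", "##able", "play", "##ing", "##ed"}

print(wordpiece_tokenize("unaffable", toy_vocab))
print(wordpiece_tokenize("playing", toy_vocab))
```

Splitting rare words into known sub-word pieces is what lets a fixed-size vocabulary cover open-ended text without mapping every unseen word to an unknown token.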
  • 15. BERT and Its Variants
    • There are several variants of BERT for domain-specific use cases. A few are listed below
    • VideoBERT: Learning Cross-Modal Temporal Representations from Unlabeled Videos
    • TinyBERT, ALBERT, RoBERTa...
    BERT Variant Name | Use case and Data set | Reference
    • SciBERT | Trained on papers from the corpus of semanticscholar.org | https://arxiv.org/abs/1903.10676
    • BioBERT | Trained on BC5CDR and BioNLP13CG data set | https://github.com/MeRajat/SolvingAlmostAnythingWithBert
    • ClinicalBERT | Trained on MIMIC-III data | https://arxiv.org/abs/1904.05342
    • FinBERT | TRC2-financial, Financial PhraseBank, FiQA Sentiment | https://arxiv.org/abs/1908.10063
  • 16. ERNIE 2.0
    • ERNIE (Enhanced Representation through kNowledge IntEgration) is a new knowledge integration language representation model
    • ERNIE 2.0 is built as a continual pretraining framework to continuously gain enhancement on knowledge integration through multi-task learning, enabling it to more fully learn various lexical, syntactic and semantic information through massive data
    • ERNIE 2.0 can incrementally train on several new tasks in sequence and accumulate the knowledge it obtains during the learning process to apply to future tasks
    • Knowledge Graph incorporation
    • Structured knowledge encoding
    • Performs best for question-answering use cases
  • 17. Reformer: The Efficient Transformer
    • A Transformer model designed to handle context windows of up to 1 million words, all on a single accelerator and using only 16GB of memory
    • Reformer uses locality-sensitive hashing (LSH) to reduce the complexity of attending over long sequences, and reversible residual layers to use the available memory more efficiently
    • LSH handles long sequences in the attention layer by computing a hash function that groups similar vectors together, instead of searching through all possible pairs of vectors
    • The second novel approach implemented in Reformer is to recompute the input of each layer on demand during backpropagation, rather than storing it in memory. This is accomplished by using reversible layers, where activations from the last layer of the network are used to recover activations from any intermediate layer, by what amounts to running the network in reverse
    Reversible layers: (A) In a standard residual network, the activations from each layer are used to update the inputs into the next layer. (B) In a reversible network, two sets of activations are maintained, only one of which is updated after each layer. (C) This approach enables running the network in reverse in order to recover all intermediate values.
    Locality-sensitive hashing: Reformer takes in an input sequence of keys, where each key is a vector representing individual words (or pixels, in the case of images) in the first layer and larger contexts in subsequent layers. LSH is applied to the sequence, after which the keys are sorted by their hash and chunked. Attention is applied only within a single chunk and its immediate neighbors.
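The two Reformer ideas above can be sketched in a few lines of pure Python. This is a toy illustration under stated assumptions, not the Reformer implementation: the hash follows the random-projection-plus-argmax scheme described for angular LSH, and `f` and `g` are hypothetical stand-ins for the attention and feed-forward sub-layers. All function names are assumptions of this sketch.

```python
import math
import random


def make_rotation(dim, n_buckets, seed=0):
    """Random Gaussian projection with n_buckets // 2 directions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)]
            for _ in range(n_buckets // 2)]


def lsh_bucket(vector, rotation):
    """Bucket id = argmax over the projections and their negations."""
    projected = [sum(x * r for x, r in zip(vector, row)) for row in rotation]
    scores = projected + [-p for p in projected]
    return max(range(len(scores)), key=lambda i: scores[i])


def sort_and_chunk(vectors, rotation, chunk_size):
    """Sort positions by bucket id, then cut into fixed-size chunks.

    Attention would then run only inside a chunk and its neighbours.
    """
    order = sorted(range(len(vectors)),
                   key=lambda i: (lsh_bucket(vectors[i], rotation), i))
    return [order[i:i + chunk_size] for i in range(0, len(order), chunk_size)]


def reversible_forward(x1, x2, f, g):
    """y1 = x1 + f(x2); y2 = x2 + g(y1)."""
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2


def reversible_inverse(y1, y2, f, g):
    """Recover the layer inputs from its outputs, without stored activations."""
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2


# Hypothetical stand-ins for the attention and feed-forward sub-layers.
def f(v):
    return [math.tanh(t) for t in v]


def g(v):
    return [0.5 * t for t in v]


y1, y2 = reversible_forward([1.0, -2.0], [0.5, 3.0], f, g)
x1, x2 = reversible_inverse(y1, y2, f, g)  # matches the inputs up to rounding
```

Because the inverse only needs the layer outputs, the per-layer activations do not have to be kept in memory during training; the cost is recomputing `f` and `g` once more on the backward pass.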
  • 18. Multi-Lingual NLP (Multilingualism)
    • The NLP community has shown interest in multilingual NLP for both research and business reasons
    • Multiple individuals and organizations have contributed developments in this space
    • BERT has its own multilingual variant, mBERT (https://github.com/google-research/bert/blob/master/multilingual.md)
    • A few more multilingual NLP network architectures are given below
      • LASER (Language-Agnostic SEntence Representations) (https://github.com/facebookresearch/LASER)
      • Multilingual Universal Sentence Encoder for Semantic Retrieval
      • XLM (https://github.com/facebookresearch/XLM)
  • 19. Context Aware Diagnostics and Troubleshooting
  • 20. Thank You
    Confidentiality Notice: This document and all information contained herein is the sole property of Tata Elxsi Limited and shall not be reproduced or disclosed to a third party without the express written consent of Tata Elxsi Limited.
    Tata Elxsi: www.tataelxsi.com, facebook.com/ElxsiTata, twitter.com/tetataelxsi, linkedin.com/company/tata-elxsi