6th December, 2020
Center for Computational Engineering and Networking (CEN)
Aindriya Barua
IACC - 2020
129: Analysis of Contextual and Non-Contextual Word
Embedding Models For Hindi NER With Web Application
Paper Presentation
AINDRIYA BARUA
(barua.aindriya@gmail.com)
Dr. KP Soman (HoD, CEN),
Thara S
Premjith B.
RESOURCES
PDF of the full paper is publicly available here
All the code used in the research is available on my GitHub
The full Hindi NER dataset is available here
If you find any part of the resources provided here useful for your research, please cite
this paper:
Barua, A., Thara, S., Premjith, B. and Soman, K.P., 2020, December. Analysis of Contextual and
Non-contextual Word Embedding Models for Hindi NER with Web Application for Data Collection. In
International Advanced Computing Conference (pp. 183-202). Springer, Singapore.
ABSTRACT
1. We categorize word embeddings as contextual and non-contextual, and compare them both
within and across categories for NER in Hindi in Devanagari script.
2. In the non-contextual category, we experiment with Word2Vec and FastText.
3. In the contextual category, we experiment with BERT and its variants RoBERTa, ELECTRA,
CamemBERT, DistilBERT, and XLM-RoBERTa.
4. The best model is used to build an interactive web app for Hindi NER.
Dataset
● The dataset is taken from the first shared task on Information Extraction for
Conversational Systems in Indian Languages (IECSIL).
● It consists of Hindi words and their corresponding NER labels.
● It contains 15,48,570 (about 1.55 million) words and labels.
Word Embedding Classification
● Non-contextual: the model produces a single vector per word, regardless of the word's
position in a sentence or the different meanings it may take.
● Contextual: the model can produce different embeddings for the same word depending on
its position and surrounding words in a sentence, so the embeddings are context-dependent.
We analyze Transformer-based word embedding models, which are contextual.
Word2Vec and FastText (non-contextual)
● Word2Vec trains each word against the words that neighbor it in the input corpus.
● It does so in one of two ways: either using the context to predict a target word
(a method known as continuous bag of words, or CBOW), or using a word to predict its
surrounding context, which is called skip-gram.
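The difference between CBOW and skip-gram can be illustrated by the training examples each one builds from a sentence. The following is a minimal sketch, not the code used in the paper: the function name `training_pairs`, the window size, and the toy sentence are assumptions made for illustration, and no model is actually trained here.

```python
def training_pairs(tokens, window=2):
    """Build toy CBOW and skip-gram training examples for one sentence.

    CBOW examples are (context_words, target_word): context predicts target.
    Skip-gram examples are (target_word, context_word): target predicts context.
    """
    cbow, skipgram = [], []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        # All neighbors inside the window, excluding the target itself.
        context = [tokens[j] for j in range(lo, hi) if j != i]
        if not context:
            continue
        cbow.append((context, target))
        skipgram.extend((target, c) for c in context)
    return cbow, skipgram


# Hypothetical toy sentence (already tokenized into words).
sentence = ["राम", "दिल्ली", "में", "रहता", "है"]
cbow, skipgram = training_pairs(sentence, window=1)

for context, target in cbow:
    print("CBOW      :", context, "->", target)
for target, context_word in skipgram:
    print("skip-gram :", target, "->", context_word)
```

CBOW yields one example per position (many context words, one target), while skip-gram yields one example per (target, neighbor) pair, which is why skip-gram produces more training examples from the same text.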
Word2Vec vs FastText
● FastText treats each word as the aggregation of its subwords, that is, the character
n-grams of the word. The vector for a word is the sum of the vectors of its character n-grams.
● FastText does significantly better on morphologically rich languages.
● FastText can produce vectors for out-of-vocabulary (OOV) words by summing the vectors of
their component character n-grams, provided at least one of those n-grams was present in
the training data.
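The subword idea above can be sketched in a few lines. This is an illustrative toy, not the fastText library: the helper names `char_ngrams` and `oov_vector`, the n-gram range, and the tiny vector table are all assumptions. Note also that Python slices strings by code point, so for Devanagari text the n-grams would be over code points rather than visual characters.

```python
def char_ngrams(word, n_min=3, n_max=4):
    """Character n-grams of a word wrapped in '<' and '>' boundary markers."""
    wrapped = f"<{word}>"
    return [
        wrapped[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(wrapped) - n + 1)
    ]


def oov_vector(word, ngram_vectors, dim):
    """Sum the vectors of the word's n-grams that were seen during training.

    Returns None when no n-gram of the word is known, in which case no
    vector can be composed for it.
    """
    known = [ngram_vectors[g] for g in char_ngrams(word) if g in ngram_vectors]
    if not known:
        return None
    return [sum(vec[k] for vec in known) for k in range(dim)]


# Hypothetical 2-dimensional n-gram vectors "learned" from training data.
toy_ngram_vectors = {
    "<ab": [1.0, 0.0],
    "abc": [0.0, 2.0],
    "bc>": [0.5, 0.5],
}

print(char_ngrams("abc", 3, 3))
print(oov_vector("abd", toy_ngram_vectors, dim=2))  # shares only "<ab"
print(oov_vector("xyz", toy_ngram_vectors, dim=2))  # shares nothing
```

The word "abd" never appears in the toy table, but it still gets a vector because it shares the n-gram `<ab` with the training data; "xyz" shares nothing and gets none.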
Sequence to Sequence Models
● Sequence-to-sequence (seq2seq) models convert sequences of Type A into sequences of
Type B, e.g., translating English sentences into German sentences.
● Classic seq2seq models are built from RNNs.
RNN-based Sequence-to-Sequence Model
● At each time step, the RNN takes a word vector (xi) from the input sequence and the hidden
state (Hi) from the previous time step.
● The hidden state from the last unit is known as the context vector.
● The context vector is passed to the decoder, which uses it to generate the target sequence
(English phrase).
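The encoder side of this process can be sketched as a loop over time steps. The snippet below is a simplified illustration that assumes a vanilla RNN cell with a tanh activation; the function names, weight matrices, and input vectors are made-up toy values, not parameters from the paper.

```python
import math


def rnn_step(x, h_prev, W_x, W_h, b):
    """One vanilla RNN step: h = tanh(W_x @ x + W_h @ h_prev + b)."""
    return [
        math.tanh(
            sum(W_x[r][c] * x[c] for c in range(len(x)))
            + sum(W_h[r][c] * h_prev[c] for c in range(len(h_prev)))
            + b[r]
        )
        for r in range(len(b))
    ]


def encode(inputs, W_x, W_h, b):
    """Run the RNN over the whole input sequence, one word vector at a time.

    The hidden state after the last step is returned as the context vector.
    """
    h = [0.0] * len(b)  # initial hidden state
    for x in inputs:
        h = rnn_step(x, h, W_x, W_h, b)
    return h


# Hypothetical 2-dimensional word vectors and weights.
toy_inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_x = [[0.5, -0.2], [0.1, 0.3]]
W_h = [[0.1, 0.0], [0.0, 0.1]]
b = [0.0, 0.0]

context_vector = encode(toy_inputs, W_x, W_h, b)
print(context_vector)
```

Because each step needs the hidden state of the previous one, the loop in `encode` cannot be run in parallel across time steps, and the whole sentence has to be squeezed into one fixed-size context vector.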
Drawbacks of Seq2seq
● Dealing with long-range dependencies is still challenging.
● The sequential nature of the model architecture prevents parallelization.
● These challenges are addressed by Google Brain's Transformer architecture.
Transformers to understand BERT
● The Transformer is designed to capture the relationships between words and their order in a
sentence, including the long-range dependencies that RNN-based seq2seq models struggle with.
BERT
Bidirectional: it reads text in both directions (left and right) at the same time, which
gives a better understanding of context.
Encoder: it uses the Transformer's encoder.
Representations (from)
Transformers: BERT uses a multi-layer bidirectional Transformer encoder. Its
self-attention layers attend in both directions.
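What "self-attention in both directions" means can be shown with a small unmasked attention computation. This is a deliberately simplified sketch: it assumes a single head and uses the input vectors directly as queries, keys, and values (real Transformer layers apply learned projections first), and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(X):
    """Unmasked scaled dot-product self-attention with Q = K = V = X.

    Returns (outputs, weights); weights[i][j] is how much position i
    attends to position j.
    """
    d = len(X[0])
    weights = []
    for q in X:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in X]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * v[j] for w, v in zip(row, X)) for j in range(d)]
        for row in weights
    ]
    return outputs, weights


# Hypothetical 2-dimensional token vectors for a 3-token sentence.
toy_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(toy_tokens)

for i, row in enumerate(weights):
    print(f"token {i} attends to:", [round(w, 3) for w in row])
```

Every row of `weights` has non-zero entries for positions both before and after the token, so each output vector mixes left and right context. A left-to-right model would instead mask out the positions to the right.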
RoBERTa: a modification of BERT that changes key hyperparameters, training with much larger
learning rates and mini-batches.
XLM-RoBERTa: a large multilingual model, pre-trained on a large amount of data, that does not
require special tensors to indicate which language is used.
CamemBERT: based on Facebook's RoBERTa, but trained on French-language data.
DistilBERT: a comparatively small, fast, inexpensive, and lightweight Transformer model
trained by distilling BERT.
ELECTRA: a pre-training method that trains two Transformers, a generator and a discriminator. The
generator replaces tokens in a sequence and is trained as a masked language model. The discriminator
identifies which tokens in the sequence were replaced by the generator.
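The discriminator's training target in ELECTRA can be illustrated with a toy example. In the sketch below, a random choice from a small vocabulary stands in for the generator (the real generator is a trained masked language model), and the function name `corrupt`, the replacement rate, and the tokens are assumptions made for illustration only.

```python
import random


def corrupt(tokens, vocab, mask_rate=0.3, seed=0):
    """Replace some tokens and build the discriminator's per-token labels.

    A random pick from `vocab` stands in for a sample from the generator.
    Label 1 means "replaced", label 0 means "original". If the stand-in
    generator happens to pick the original token, the label stays 0.
    """
    rng = random.Random(seed)
    corrupted = list(tokens)
    for i in range(len(tokens)):
        if rng.random() < mask_rate:
            corrupted[i] = rng.choice(vocab)
    labels = [int(c != o) for c, o in zip(corrupted, tokens)]
    return corrupted, labels


# Hypothetical toy sentence and vocabulary.
toy_sentence = ["राम", "दिल्ली", "में", "रहता", "है"]
toy_vocab = toy_sentence + ["मुंबई", "सीता"]

corrupted, labels = corrupt(toy_sentence, toy_vocab)
print(corrupted)
print(labels)
```

The discriminator is then trained as a binary classifier over every token position, rather than predicting only the masked positions as BERT does.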
DATA FLOW DIAGRAM FOR NON CONTEXTUAL MODELS
DATA FLOW DIAGRAM FOR CONTEXTUAL MODELS
INTRA CATEGORY COMPARISON: NON CONTEXTUAL
INTRA CATEGORY COMPARISON: CONTEXTUAL
INTRA CATEGORY COMPARISON: NON CONTEXTUAL
Hindi is morphologically rich, and words are often formed by joining sub-words, a process
called `sandhi', which is intuitively similar to the way FastText breaks words into
sub-words.
For OOV words, FastText sums the vectors of the word's component character n-grams. If at
least one of the n-grams is present in the training data, it can estimate the representation
of the new word from that.
INTRA CATEGORY COMPARISON: CONTEXTUAL
XLM-RoBERTa is a multilingual model trained on 100 languages over a significantly larger dataset. Its training
corpus, drawn from CommonCrawl, is as large as 2.5 terabytes, many times larger than its predecessors'
training data, the Wiki-100 corpus.
BERT performs better than RoBERTa on Hindi NER by approximately 7%.
CamemBERT is trained on French monolingual data, so it is interesting to note its performance on Hindi
data. It shows a 17% degradation from BERT's F1 score.
DistilBERT: the execution time is approximately four times less than that of BERT, but this comes with a trade-off
in prediction metrics. It shows a massive 38% degradation relative to BERT in our training.
Although it is claimed that the ELECTRA model improves on BERT, it shows a degradation of 45% on our
Hindi NER task.
INTER CATEGORY COMPARISON: CONTEXTUAL VS NON-CONTEXTUAL
WEB APP FOR INTERACTIVE HINDI NER AND DATA COLLECTION
After all the experiments were completed, the best model was used to build a
first-of-its-kind interactive web application for NER in the Hindi language in
Devanagari script, deployed at http://3.7.28.233.
Conclusion
With these techniques, more applications can be built in the mother tongues of
Indians, helping to bridge the language gap between Indians and the world and to extract
valuable information from our ancient history.
Future work:
1. Hyperparameter tuning: adjusting the learning rates, batch sizes, etc.
2. The dataset has class imbalance, as established during the experiments, so a
cost-sensitive learning approach could also yield better outcomes.
3. Reinforcement learning could be incorporated into the web application to improve
the models using the user feedback that our website is designed to collect.
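The cost-sensitive idea in future-work item 2 can be sketched with per-class weights in the loss. This is one common heuristic (inverse-frequency "balanced" weights), offered as an assumption about how such an approach might look rather than as the authors' plan; the label names and counts below are made up and are not statistics from the IECSIL dataset.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_nll(predicted_probs, gold_labels, weights):
    """Class-weighted negative log-likelihood, averaged over tokens.

    predicted_probs is a list of {label: probability} dicts, one per token.
    Mistakes on rare classes cost more because their weight is larger.
    """
    total = sum(
        -weights[gold] * math.log(probs[gold])
        for probs, gold in zip(predicted_probs, gold_labels)
    )
    return total / len(gold_labels)


# Hypothetical, heavily imbalanced toy label set.
toy_labels = ["other"] * 8 + ["person"] * 2
toy_weights = balanced_class_weights(toy_labels)
print(toy_weights)

toy_probs = [{"other": 0.5, "person": 0.5}]
print(weighted_nll(toy_probs, ["person"], toy_weights))
print(weighted_nll(toy_probs, ["other"], toy_weights))
```

With these weights, the same prediction error is penalized four times more heavily on the rare "person" class than on the frequent "other" class, which pushes training away from simply predicting the majority label.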
Questions? Feel free to reach out to me by e-mail or on GitHub :)
github.com/AindriyaBa
barua.aindriya@gmail.
If you find any part of the resources provided here useful for your research, please cite
this paper:
https://www.researchgate.net/publication/349190662_Analysis_of_Contextual_and_Non-

Contextual vs non-contextual word embedding models for Hindi Named Entity Recognition | Natural Language Processing- Aindriya Barua

  • 1. 6th December, 2020 Center for Computational Engineering and Networking (CEN) Aindriya Barua Slide- 6th December, 2020 IACC - 2020 129: Analysis of Contextual and Non-Contextual Word Embedding Models For Hindi NER With Web Application Paper Presentation 1 AINDRIYA BARUA (barua.aindriya@gmail.com) Dr. KP Soman (HoD, CEN), Thara S Premjith B.
  • 2. 6th December, 2020 Center for Computational Engineering and Networking (CEN) Aindriya Barua Slide- 6th December, 2020 2 PDF of full paper is publicly available here All the code used in the research are available on my Github Full Hindi NER dataset is available here If you find any part of the resources provided here useful for your research, please cite this paper: Barua, A., Thara, S., Premjith, B. and Soman, K.P., 2020, December. Analysis of Contextual and Non- contextual Word Embedding Models for Hindi NER with Web Application for Data Collection. In International Advanced Computing Conference (pp. 183-202). Springer, Singapore. RESOURCES
  • 3. 6th December, 2020 Center for Computational Engineering and Networking (CEN) Aindriya Barua Slide- 6th December, 2020 3 ABSTRACT 1. Categorize word embeddings as Contextual and Non-contextual, and further compare them inter and intra-category for NER in Hindi in Devanagari script. 2. Under non-contextual type embeddings, we experiment with Word2Vec and FastText 3. Under the contextual embedding category, we experiment with BERT and its variants RoBERTa, ELECTRA, CamemBERT, Distil-BERT, XLM-RoBERTa. 4. Best model is used to make an interactive web app for hindi NER
  • 4. Dataset ● The dataset is taken from the first shared task on Information Extractor for Conversational Systems in Indian Languages (IECSIL). ● Consists of Hindi words and corresponding NER labels. ● 15,48,570 (about 1.55 million) words and labels.
  • 6. Word Embedding Classification ● Non-contextual: these models produce only one vector per word, regardless of the word's position in a sentence and of the different meanings it may have. ● Contextual: these models can produce different embeddings for the same word depending on its position and surroundings in a sentence, hence they are context-dependent. We analyze Transformer-based word embedding models, which are contextual.
  • 7. Word2Vec and FastText (non-contextual) ● Word2Vec trains words against the words that neighbor them in the input corpus. ● It does so in one of two ways: either using the context to predict a target word (a method known as continuous bag of words, or CBOW), or using a word to predict its context, which is called skip-gram.
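The difference between the two Word2Vec training modes can be sketched by listing the training examples each one would draw from a sentence. This is only an illustration of how the examples are formed, not the authors' training code; the function name `training_pairs`, the window size, and the sample sentence are assumptions made for the example.

```python
def training_pairs(tokens, window=2):
    """Build CBOW and skip-gram training examples from one tokenized sentence.

    CBOW example:      (list of context words, target word)
    skip-gram example: (target word, one context word)
    """
    cbow, skipgram = [], []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        # Neighbors inside the window, excluding the target itself.
        context = [tokens[j] for j in range(lo, hi) if j != i]
        if not context:
            continue
        cbow.append((context, target))
        skipgram.extend((target, c) for c in context)
    return cbow, skipgram


# Hypothetical sample sentence ("Ram lives in Delhi").
sentence = ["राम", "दिल्ली", "में", "रहता", "है"]
cbow, skipgram = training_pairs(sentence, window=1)
```

With `window=1`, CBOW yields one example per word (neighbors in, word out), while skip-gram yields one example per (word, neighbor) pair.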
  • 8. Word2Vec vs FastText ● fastText treats each word as the aggregation of its subwords: the character n-grams of the word. The vector for a word is the sum of the vectors of its character n-grams. ● fastText does significantly better on morphologically rich languages. ● fastText can be used to obtain vectors for out-of-vocabulary (OOV) words by summing the vectors of their component character n-grams, provided at least one of the n-grams was present in the training data.
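The subword idea can be sketched in a few lines: split a word into character n-grams, then sum the vectors of the n-grams that are known. This is a simplified illustration under stated assumptions, not fastText's implementation: the names `char_ngrams` and `oov_vector`, the n-gram range, and the toy vectors are hypothetical, and the real library also hashes n-grams into buckets and may average rather than sum.

```python
def char_ngrams(word, n_min=3, n_max=4):
    """Character n-grams of a word, with '<' and '>' marking its boundaries."""
    marked = f"<{word}>"
    return [
        marked[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(marked) - n + 1)
    ]


def oov_vector(word, ngram_vectors, dim):
    """Sum the vectors of the word's known n-grams; None if none are known."""
    known = [ngram_vectors[g] for g in char_ngrams(word) if g in ngram_vectors]
    if not known:
        return None
    return [sum(vec[k] for vec in known) for k in range(dim)]


# Toy 2-dimensional n-gram vectors (made-up values for illustration).
toy_vectors = {"<कम": [1.0, 0.0], "मल>": [0.5, 2.0]}
vec = oov_vector("कमल", toy_vectors, dim=2)
```

A word that shares no n-gram with the training data gets no vector at all, which is the condition stated on the slide.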
  • 9. Sequence to Sequence Models ● Sequence-to-sequence (seq2seq) models convert sequences of Type A to sequences of Type B, e.g., translation of English sentences to German sentences. ● RNN-based encoder-decoder models follow this pattern.
  • 10. RNN based Sequence-to-Sequence Model ● At each time step, the RNN takes a word vector (xi) from the input sequence and the hidden state from the previous time step, and produces a new hidden state (Hi). ● The hidden state from the last unit is known as the context vector. ● The context vector is passed to the decoder, which uses it to generate the target sequence (English phrase).
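The encoder side of this description can be sketched as a loop that feeds each word vector and the previous hidden state into an RNN cell and keeps the last hidden state as the context vector. The slides do not specify the cell type, so this sketch assumes a plain tanh cell; the function names and weight matrices are hypothetical.

```python
import math


def rnn_step(x, h_prev, W_x, W_h, b):
    """One step of a plain tanh RNN cell: h = tanh(W_x x + W_h h_prev + b)."""
    return [
        math.tanh(
            sum(W_x[r][c] * x[c] for c in range(len(x)))
            + sum(W_h[r][c] * h_prev[c] for c in range(len(h_prev)))
            + b[r]
        )
        for r in range(len(b))
    ]


def encode(inputs, W_x, W_h, b):
    """Run the cell over all word vectors; the last hidden state is the context vector."""
    h = [0.0] * len(b)  # initial hidden state
    for x in inputs:
        h = rnn_step(x, h, W_x, W_h, b)
    return h


# Toy weights: identity input weights, no recurrence, no bias.
W_x = [[1.0, 0.0], [0.0, 1.0]]
W_h = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]
context = encode([[0.5, -0.5]], W_x, W_h, b)
```

Because each step needs the hidden state of the step before it, the loop cannot be run in parallel over the sequence.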
  • 11. Drawbacks of Seq2seq ● Dealing with long-range dependencies is still challenging. ● The sequential nature of the model architecture prevents parallelization. These challenges are addressed by the Transformer architecture from Google Brain.
  • 12. Transformers to understand BERT ● Capturing the relationships between words and their order in a sentence is where the Transformer concept plays a major role.
  • 14. BERT Bidirectional: it reads text in both directions, left and right, at the same time, which gives a better understanding of context. Encoder: it uses the Transformer's encoder. Representations (from) Transformers: BERT uses a multi-layer bidirectional Transformer encoder. Its self-attention layer performs self-attention in both directions.
  • 15. RoBERTa: a modification of BERT that tunes the key hyper-parameters and trains with much larger learning rates and mini-batches. XLM-RoBERTa: a huge multilingual model, pre-trained on a large amount of data, that does not require special tensors to determine the language. CamemBERT: based on Facebook's RoBERTa, but trained on French-language data. DistilBERT: a comparatively very small, fast, inexpensive, and lightweight Transformer model trained by distillation of BERT. ELECTRA: a pre-training methodology that trains two Transformers, a generator and a discriminator. The generator replaces tokens in a sequence and is trained as a masked language model. The discriminator identifies which tokens in the sequence were replaced by the generator.
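The discriminator's training target in ELECTRA can be illustrated by comparing the original sequence with the generator's corrupted version, token by token. This is a minimal sketch of how the labels are formed, not of the models themselves; the function name and the sample tokens are hypothetical.

```python
def replaced_token_labels(original, corrupted):
    """Label each position 1 if the generator changed the token, else 0.

    If the generator happens to produce the original token, the position
    counts as unchanged (label 0).
    """
    if len(original) != len(corrupted):
        raise ValueError("sequences must have the same length")
    return [int(o != c) for o, c in zip(original, corrupted)]


# Hypothetical example: the generator replaced the second token.
original = ["राम", "दिल्ली", "में", "रहता", "है"]
corrupted = ["राम", "मुंबई", "में", "रहता", "है"]
labels = replaced_token_labels(original, corrupted)
```

The discriminator is then trained to predict these per-token labels from the corrupted sequence alone.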
  • 16. DATA FLOW DIAGRAM FOR NON CONTEXTUAL MODELS
  • 17. DATA FLOW DIAGRAM FOR CONTEXTUAL MODELS
  • 19. INTRA CATEGORY COMPARISON: NON CONTEXTUAL
  • 20. INTRA CATEGORY COMPARISON: CONTEXTUAL
  • 21. INTRA CATEGORY COMPARISON: NON CONTEXTUAL Hindi is morphologically rich, and words are usually formed by combining sub-words, a process called `sandhi', which is intuitively similar to the way FastText breaks words into sub-words. For OOV words, FastText sums the vectors of the word's component character n-grams; if at least one of the n-grams is present in the training data, it can estimate the representation of the new word from it.
  • 22. INTRA CATEGORY COMPARISON: CONTEXTUAL XLM-RoBERTa is a multilingual model trained on 100 languages over a significantly larger dataset. Its training corpus, CommonCrawl, is as large as 2.5 terabytes, many times larger than its predecessors' training data, the Wiki-100 corpus. BERT performs better than RoBERTa on Hindi NER by approximately 7%. CamemBERT is trained on French monolingual data, and hence it is interesting to note its performance on Hindi data. It shows a 17% degradation from BERT's F1 score. DistilBERT: the execution time is approximately four times less than that of BERT, but this comes with a trade-off in prediction metrics. It shows a massive 38% degradation relative to BERT in our training. Although it is claimed that the ELECTRA model improves on BERT, it shows a degradation of 45% on our Hindi NER task.
  • 23. INTER CATEGORY COMPARISON: CONTEXTUAL VS NON-CONTEXTUAL
  • 25. WEB APP FOR INTERACTIVE HINDI NER AND DATA COLLECTION After the successful completion of all the experiments, the best model was used to build the first-of-its-kind interactive web application for NER in the Hindi language in Devanagari script, which is deployed at http://3.7.28.233.
  • 26. Conclusion Using these techniques, more applications can be built in the mother tongues of Indians, bridging the language gap between Indians and the world and helping to extract valuable information from our ancient history. Future work: 1. Hyper-parameter tuning: adjusting the learning rates, batch sizes, etc. 2. The dataset has class imbalance, as established during the experiments; hence a cost-sensitive learning approach could also yield better outcomes. 3. Reinforcement learning can also be incorporated into the web application, to improve the models using the user feedback that our website is designed to collect.
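One common form of cost-sensitive learning is to weight each class inversely to its frequency, so that errors on rare entity labels cost more than errors on the dominant label. The sketch below shows that heuristic only as an illustration of the future-work idea; the slides do not say which scheme would be used, and the function name and label counts are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Per-class weight = total / (number of classes * class count).

    Rare classes get weights above 1 and frequent classes below 1.
    """
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Hypothetical, heavily imbalanced label list.
labels = ["other"] * 8 + ["name"] * 2
weights = inverse_frequency_weights(labels)
```

Such weights would typically be passed to the loss function so that each token's loss is scaled by the weight of its true class.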
  • 27. Questions? Feel free to reach out to me by e-mail or on Github :) github.com/AindriyaBa barua.aindriya@gmail. If you find any part of the resources provided here useful for your research, please cite this paper: https://www.researchgate.net/publication/349190662_Analysis_of_Contextual_and_Non-