PROJECT
PRESENTATION
Paper: Bidirectional LSTM with Attention Mechanism and Convolutional Layer for Text
Classification
Reference: Liu, Gang, and Jiabao Guo. "Bidirectional LSTM with attention
mechanism and convolutional layer for text classification." Neurocomputing 337
(2019): 325-338.
CONTENT
• Paper
• Dataset
• Vocabulary Building
• Word2Vector
• Model Generation
• Model Summary
• Model Training
• Future Work
PAPER
• Objective: sentiment classification of polarized text datasets, such as reviews and questions.
• CNNs can extract features for sentence modelling while reducing the dimensionality of
the data.
• RNNs are specialized for sequence modelling. A Bi-LSTM combines a forward hidden layer
and a backward hidden layer, so it can access both the preceding and the succeeding
context to obtain the contextual information of the text.
• An attention mechanism is applied in two layers, one for the preceding and one for the
succeeding contextual features, to highlight the important information by assigning
different weights.
• A softmax layer generates the labels.
• The authors report that their model outperforms state-of-the-art classification methods
in terms of classification accuracy.
DATASET: IMDB – MOVIE REVIEW
• This is a dataset for binary sentiment classification containing 50,000 highly polarized
reviews, with 25k for training and 25k for testing, divided into positive reviews
(labelled ‘2’) and negative reviews (labelled ‘1’). Examples are shown below:
VOCABULARY BUILDING
• The reviews contain many kinds of tokens, such as punctuation, contractions, and
simple words such as ‘am’, ‘been’, ‘is’, etc., all joined together to make sentences.
• These must be processed so that only meaningful words are kept as tokens, from which the
vocabulary is generated.
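The preprocessing described above can be sketched roughly as follows. This is a minimal illustration rather than the presenters' actual pipeline: the stop-word list, the contraction map, the `<pad>`/`<unk>` entries, and the `min_count` parameter are all assumptions introduced here.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the slides only name 'am', 'been', 'is' as examples.
STOP_WORDS = {"am", "been", "is", "a", "an", "the"}

# Hypothetical contraction map, kept tiny for illustration.
CONTRACTIONS = {"it's": "it is", "don't": "do not", "i'm": "i am"}


def tokenize(text):
    """Lowercase, expand contractions, strip punctuation, drop stop words."""
    text = text.lower()
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    words = re.findall(r"[a-z]+", text)  # keeps alphabetic runs only
    return [w for w in words if w not in STOP_WORDS]


def build_vocab(reviews, min_count=1):
    """Map each token seen at least `min_count` times to an integer index."""
    counts = Counter(tok for review in reviews for tok in tokenize(review))
    vocab = {"<pad>": 0, "<unk>": 1}  # assumed special tokens
    for token, count in counts.most_common():
        if count >= min_count:
            vocab[token] = len(vocab)
    return vocab
```

For example, `tokenize("It's a great movie, I'm thrilled!")` yields `['it', 'great', 'movie', 'i', 'thrilled']`.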
WORD2VECTOR
• Word embeddings are vector representations of words or tokens.
• The Word2Vec model converts one-hot encoded representations into dense vectors
that account for the context of a word with respect to other similar or related
words.
• Two architectures: Continuous Bag-of-Words (CBOW) or Skip-gram; here, skip-gram was used.
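To make the skip-gram idea concrete: the model is trained to predict the words surrounding each center word, so the training data is a set of (center, context) pairs. The sketch below only generates those pairs; it is a simplification, and the function name and the default `window` value are assumptions for illustration. Real word2vec implementations typically also subsample frequent words and vary the window size at random.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs
```

With `window=1`, the tokens `['the', 'movie', 'was', 'great']` produce six pairs, starting with `('the', 'movie')` and `('movie', 'the')`.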
WORD2VEC (CONTD.)
• A skip-gram word2vec model was created and initialized with an embedding size of 30, a
sliding window size of 5, and a minimum frequency count of 5.
• The model was trained for 30 epochs for best results. The total parameters of the model
were found as follows (in picture)
• Examples from the model testing for word similarity are shown below:
WORD2VEC (CONTD.)
• t-SNE (t-distributed stochastic neighbor embedding) is a good way to visualize word vectors.
• However, it does not always produce an accurate representation, since it projects from a
high-dimensional space to a much lower-dimensional one.
MODEL GENERATION
• Convolutional Layer: 1-D convolutional layer with 300 input channels and 100 output
channels, used to extract features and reduce dimensionality.
• BiLSTM: bidirectional LSTM layer with a hidden size of 150, to extract contextual information
from past and future data.
• Since the sentence length, and thus the number of embeddings, varies from review to review,
each batch was padded with zeros and then packed using pack_padded_sequence for efficient
computation before being fed to the BiLSTM.
• The forward and backward hidden states are extracted separately as the forward context
and backward context, and fed into two attention layers.
• Attention Layer: a forward attention layer and a backward attention layer, each of
hidden size 150; the attention mechanism used is general attention.
• Softmax: a softmax layer at the end generates the label with the maximum probability.
• Metrics: accuracy.
• Training: Adam optimizer for 10 epochs, with CrossEntropy loss and an 80%-20% split.
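The architecture listed above might be wired together in PyTorch roughly as follows. This is a sketch under stated assumptions, not the presenters' code: the class names, the kernel size of 3, the ReLU after the convolution, and the choice of each direction's final hidden state as the attention query are all assumptions. "General attention" is taken here to mean the Luong-style score q^T W h_t.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence


class GeneralAttention(nn.Module):
    """Luong-style 'general' score: score(q, h_t) = q^T W h_t."""

    def __init__(self, hidden_size):
        super().__init__()
        self.W = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, query, states, mask):
        # query: (B, H), states: (B, T, H), mask: (B, T), True on real time steps
        scores = torch.bmm(self.W(states), query.unsqueeze(2)).squeeze(2)  # (B, T)
        scores = scores.masked_fill(~mask, float("-inf"))  # ignore padding
        weights = torch.softmax(scores, dim=1)
        context = torch.bmm(weights.unsqueeze(1), states).squeeze(1)  # (B, H)
        return context, weights


class ConvBiLSTMAttention(nn.Module):
    def __init__(self, embed_dim=300, conv_channels=100, hidden=150,
                 num_classes=2, kernel_size=3):  # kernel_size is an assumption
        super().__init__()
        self.hidden = hidden
        self.conv = nn.Conv1d(embed_dim, conv_channels, kernel_size,
                              padding=kernel_size // 2)  # keeps sequence length
        self.lstm = nn.LSTM(conv_channels, hidden, batch_first=True,
                            bidirectional=True)
        self.att_fwd = GeneralAttention(hidden)
        self.att_bwd = GeneralAttention(hidden)
        self.out = nn.Linear(2 * hidden, num_classes)

    def forward(self, embedded, lengths):
        # embedded: (B, T, embed_dim), zero-padded; lengths: (B,) CPU int tensor
        x = torch.relu(self.conv(embedded.transpose(1, 2))).transpose(1, 2)
        packed = pack_padded_sequence(x, lengths, batch_first=True,
                                      enforce_sorted=False)
        packed_out, (h_n, _) = self.lstm(packed)
        out, _ = pad_packed_sequence(packed_out, batch_first=True,
                                     total_length=embedded.size(1))
        # Split the concatenated output into forward and backward contexts.
        fwd, bwd = out[..., :self.hidden], out[..., self.hidden:]
        steps = torch.arange(out.size(1), device=out.device)
        mask = steps[None, :] < lengths.to(out.device)[:, None]
        # Assumption: each direction's final hidden state serves as the query.
        ctx_f, _ = self.att_fwd(h_n[0], fwd, mask)
        ctx_b, _ = self.att_bwd(h_n[1], bwd, mask)
        return self.out(torch.cat([ctx_f, ctx_b], dim=1))  # unnormalized logits
```

The model returns logits rather than probabilities because `nn.CrossEntropyLoss` applies log-softmax internally; `torch.softmax(logits, dim=1)` recovers the class probabilities at inference time. `nn.CrossEntropyLoss` also expects class indices starting at 0, so labels stored as 1 and 2 would need to be shifted to 0 and 1 before training.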
MODEL SUMMARY
MODEL TRAINING
FUTURE WORK
• Troubleshoot the main model training part and complete training.
• Modify attention mechanism with multi-head attention.
• Train and test model on a different dataset.
