Sentiment Analysis and Opinion Mining
Mohamed Khamis Mohamed
Shrief Salem
Abdelrhman Hisham
Under the supervision of
Prof. Tarek Ghareeb
Agenda
● Introduction
● Problem definition
● Objective
● Dataset
● Exploratory Data Analysis (EDA)
● Text pre-processing (Cleaning)
● Methodology and Techniques
Introduction
Sentiment analysis is a technique for analysing a piece
of text to determine the sentiment contained within it.
It accomplishes this by combining machine learning
and natural language processing (NLP).
Problem definition
Sentiment analysis is the task of computationally recognising and
categorising opinions expressed in a piece of text, especially to
determine whether the writer's attitude toward a given topic, product,
etc. is positive, negative, or neutral.
Objective
The main goal is to predict the sentiment of a large number of movie
reviews from the Internet Movie Database (IMDb).
Dataset
The IMDb dataset contains 50K movie reviews for natural language
processing or text analytics. It is a dataset for binary sentiment
classification containing substantially more data than previous
benchmark datasets. It provides a set of 25,000 highly polar
movie reviews for training and 25,000 for testing. The task is to
predict whether each review is positive or negative, using either
classical classification or deep learning algorithms. Here we train
BERT, among other models, to classify reviews as positive or negative.
Text pre-processing (Cleaning)
Removing HTML tags
Text:
A wonderful little production. <br /><br />The filming technique is very unassuming- very old-time-BBC fashion
and gives a comforting, and sometimes discomforting, sense of realism to the entire piece
Code:
from bs4 import BeautifulSoup

def strip_html(text):
    # Parse the review as HTML and return only its visible text
    soup = BeautifulSoup(text, "html.parser")
    return soup.get_text()
Cleaned Text:
A wonderful little production. The filming technique is very unassuming- very old-time-BBC
fashion and gives a comforting, and sometimes discomforting, sense of realism to the entire
piece
Removing special characters and punctuation
Text:
A wonderful little production. The filming technique is very unassuming- very old-time-BBC fashion and gives a
comforting, and sometimes discomforting, sense of realism to the entire piece
Code:
import re

def remove_special_characters(text):
    # Replace anything that is not a letter, digit, or whitespace with a space,
    # so hyphenated words such as "old-time-BBC" stay separated
    text = re.sub(r'[^a-zA-Z0-9\s]', ' ', text)
    # Collapse the runs of whitespace left behind
    return re.sub(r'\s+', ' ', text).strip()
Cleaned Text:
A wonderful little production The filming technique is very unassuming very old time BBC
fashion and gives a comforting and sometimes discomforting sense of realism to the entire
piece
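Removing stopwords
Text:
A wonderful little production The filming technique is very unassuming very old time BBC fashion and gives a
comforting and sometimes discomforting sense of realism to the entire piece
Code:
from nltk.corpus import stopwords
from nltk.tokenize.toktok import ToktokTokenizer

# Assumed setup (not shown on the slide); needs nltk.download('stopwords') once
tokenizer = ToktokTokenizer()
stopword_list = set(stopwords.words('english'))

def remove_stopwords(text):
    tokens = tokenizer.tokenize(text)
    # Lowercase each token before the lookup so "The" is dropped like "the"
    filtered_tokens = [token for token in tokens if token.lower() not in stopword_list]
    return ' '.join(filtered_tokens)
Cleaned Text:
wonderful little production filming technique unassuming old time BBC fashion
gives comforting sometimes discomforting sense realism entire piece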
Exploratory Data Analysis (EDA)
● Class distribution
● Word counts in positive and negative reviews
● Punctuation counts
● Stopword counts
● URL counts
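The punctuation, stopword, and URL counts listed above could be computed per review roughly as follows. This is a minimal standard-library sketch, not the authors' code: the function name `review_stats`, the URL pattern, and the tiny `STOPWORDS` set are assumptions made for illustration, and a real run would more likely use a full stopword list such as NLTK's.

```python
import re
import string

# Hypothetical, deliberately tiny stopword set for illustration only
STOPWORDS = {"a", "an", "and", "the", "is", "it", "of", "to", "very"}
# Assumed simple pattern: anything starting with http(s):// or www.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+")

def review_stats(text):
    """Return simple per-review counts for exploratory analysis."""
    words = text.split()
    return {
        "words": len(words),
        "punctuation": sum(ch in string.punctuation for ch in text),
        # Strip surrounding punctuation so "the," still counts as "the"
        "stopwords": sum(w.lower().strip(string.punctuation) in STOPWORDS for w in words),
        "urls": len(URL_PATTERN.findall(text)),
    }

sample = "A wonderful little production. See http://example.com for the trailer!"
stats = review_stats(sample)
print(stats)
```

Applying such a function to every review and grouping the results by label would give the per-class distributions that the EDA slides plot.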
Positive review sample
A wonderful little production. <br /><br />The filming technique is very unassuming- very
old-time-BBC fashion and gives a comforting, and sometimes discomforting, sense of
realism to the entire piece. <br /><br />The actors are extremely well chosen- Michael
Sheen not only "has got all the polari" but he has all the voices down pat too! You can
truly see the seamless editing guided by the references to Williams' diary entries, not only
is it well worth the watching but it is a terrific written and performed piece. A masterful
production about one of the great master's of comedy and his life. <br /><br />The
realism really comes home with the little things: the fantasy of the guard which, rather
than use the traditional 'dream' techniques remains solid then disappears. It plays on our
knowledge and our senses, particularly with the scenes concerning Orton and Halliwell
and the sets (particularly of their flat with Halliwell's murals decorating every surface) are
terribly well done.
Negative review sample
This show was an amazing, fresh & innovative idea in the 70's when it first aired. The first
7 or 8 years were brilliant, but things dropped off after that. By 1990, the show was not
really funny anymore, and it's continued its decline further to the complete waste of time
it is today.<br /><br />It's truly disgraceful how far this show has fallen. The writing is
painfully bad, the performances are almost as bad - if not for the mildly entertaining
respite of the guest-hosts, this show probably wouldn't still be on the air. I find it so hard
to believe that the same creator that hand-selected the original cast also chose the band
of hacks that followed. How can one recognize such brilliance and then see fit to replace
it with such mediocrity? I felt I must give 2 stars out of respect for the original cast that
made this show such a huge success. As it is now, the show is just awful. I can't believe
it's still on the air.
Methodology and Techniques
Feature Extraction
● TF-IDF Vectorizer
● Word2Vec Embedding
TF-IDF Vectorizer
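TF-IDF weights each term by how often it occurs in a review, discounted by how many reviews contain it, so words shared by most reviews carry less weight. The sketch below illustrates the idea in plain Python under stated assumptions: the helper name `tfidf`, the whitespace tokenisation, the smoothed idf formula, and the L2 normalisation are one common variant chosen for illustration; the slides do not say which weighting or library settings were actually used.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document, L2-normalised."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: number of documents that contain each term
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        # Assumed smoothed idf: ln((1 + n) / (1 + df)) + 1
        weights = {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1)
                   for t, c in tf.items()}
        # Guard against division by zero for an empty document
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({t: w / norm for t, w in weights.items()})
    return vectors

docs = ["wonderful little production", "painfully bad production"]
vectors = tfidf(docs)
print(vectors[0])
```

In this toy corpus "production" appears in both documents, so it receives a lower weight than "wonderful", which appears in only one.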
Word2Vec Embedding
Models
● MLPClassifier
● Support Vector Machine (SVM)
● Long Short-Term Memory (LSTM)
● Convolutional Neural Network (CNN)
● CNN-LSTM (Hybrid)
● BERT
MLPClassifier
MLPClassifier Results
              precision    recall  f1-score   support
    Positive       0.87      0.87      0.87      4993
    Negative       0.87      0.87      0.87      5007
    accuracy                           0.87     10000
weighted avg       0.87      0.87      0.87     10000
SVM
SVM Results
              precision    recall  f1-score   support
    Positive       0.87      0.86      0.87      4993
    Negative       0.87      0.87      0.87      5007
    accuracy                           0.87     10000
weighted avg       0.87      0.87      0.87     10000
LSTM
LSTM Results
              precision    recall  f1-score   support
    Positive       0.81      0.83      0.82      4964
    Negative       0.83      0.81      0.82      5036
    accuracy                           0.82     10000
weighted avg       0.82      0.82      0.82     10000
CNN
CNN Results
              precision    recall  f1-score   support
    Positive       0.82      0.79      0.81      4964
    Negative       0.83      0.81      0.82      5036
    accuracy                           0.81     10000
weighted avg       0.81      0.81      0.81     10000
CNN-LSTM
CNN-LSTM Results
              precision    recall  f1-score   support
    Positive       0.80      0.85      0.82      4964
    Negative       0.84      0.79      0.81      5036
    accuracy                           0.82     10000
weighted avg       0.82      0.82      0.82     10000
BERT
BERT Results
              precision    recall  f1-score   support
    Positive       0.90      0.91      0.90      4964
    Negative       0.91      0.90      0.90      5036
    accuracy                           0.90     10000
weighted avg       0.90      0.90      0.90     10000
Summary
Model     Feature   Precision  Recall  F1-score  Accuracy
MLP       TF-IDF    0.87       0.87    0.87      0.87
SVM       TF-IDF    0.87       0.87    0.87      0.87
LSTM      Word2Vec  0.82       0.82    0.82      0.82
CNN       Word2Vec  0.81       0.81    0.81      0.81
CNN-LSTM  Word2Vec  0.82       0.82    0.82      0.82
BERT      BERT      0.90       0.90    0.90      0.90
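For readers unfamiliar with the metrics in the tables above, the sketch below shows how precision, recall, F1, and accuracy follow from the four confusion-matrix counts of a binary classifier. The function name `binary_report` and the counts passed to it are hypothetical, chosen only to illustrate the arithmetic; they are not taken from the reported experiments.

```python
def binary_report(tp, fp, fn, tn):
    """Precision, recall, and F1 for the positive class, plus overall accuracy."""
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

# Hypothetical confusion counts, used only to illustrate the formulas
report = binary_report(tp=80, fp=10, fn=20, tn=90)
print({name: round(value, 2) for name, value in report.items()})
```

The per-class rows in each results table correspond to treating first "Positive" and then "Negative" as the positive class in these formulas.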
Conclusion
The main motivation behind this project was to build a sentiment analysis model that
helps us better understand the movie reviews we collected.
We compared the results of different classifiers: MLP, Support Vector Machine (SVM),
LSTM, CNN, hybrid CNN-LSTM, and BERT.
For evaluation, we compared the accuracy achieved by each model.
Of the models evaluated, BERT achieved the highest accuracy, at 90%.
