Innovative Geeks
Presented By- Shekhar Bhardwaj - 03351202716
Shubham Siddhartha - 03451202716
Mukul Sharma - 02051202716
Eckovation Machine Learning
TOPIC: SENTIMENT ANALYSIS ON MOVIE REVIEWS
The Rotten Tomatoes movie review dataset is a corpus of movie reviews
used for sentiment analysis, originally collected by Pang and Lee [1]. In
their work on sentiment treebanks, Socher et al. used Amazon's
Mechanical Turk to create fine-grained labels for all parsed phrases in the
corpus.
DATA DESCRIPTION
• The dataset consists of tab-separated files with phrases from the
Rotten Tomatoes dataset. The train/test split has been preserved for the
purposes of benchmarking, but the sentences have been shuffled from
their original order. Each sentence has been parsed into many phrases
by the Stanford parser. Each phrase has a PhraseId. Each sentence has a
SentenceId. Phrases that are repeated (such as short/common words)
are only included once in the data.
• train.tsv contains the phrases and their associated sentiment labels. We
have additionally provided a SentenceId so that you can track which
phrases belong to a single sentence.
• test.tsv contains just phrases. You must assign a sentiment label to each
phrase.
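A minimal sketch of reading a file in this layout and grouping phrases by SentenceId, using only the Python standard library. The sample rows are made up for illustration, and the column names `Phrase` and `Sentiment` are assumptions (the slides only name `PhraseId` and `SentenceId`).

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows in the train.tsv layout (not real dataset content).
# Column names "Phrase" and "Sentiment" are assumed.
SAMPLE_TSV = (
    "PhraseId\tSentenceId\tPhrase\tSentiment\n"
    "1\t1\tA charming and often affecting journey\t3\n"
    "2\t1\tA charming\t3\n"
    "3\t1\tjourney\t2\n"
    "4\t2\tdull and lifeless\t0\n"
)

def load_phrases(handle):
    """Read tab-separated rows into a list of dicts with typed fields."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for row in reader:
        rows.append({
            "phrase_id": int(row["PhraseId"]),
            "sentence_id": int(row["SentenceId"]),
            "phrase": row["Phrase"],
            "sentiment": int(row["Sentiment"]),
        })
    return rows

def group_by_sentence(rows):
    """Map each SentenceId to the list of phrases parsed from that sentence."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["sentence_id"]].append(row["phrase"])
    return dict(grouped)

rows = load_phrases(io.StringIO(SAMPLE_TSV))
by_sentence = group_by_sentence(rows)
print(by_sentence[1])
```

For the real file, the same functions could be called with `open("train.tsv", encoding="utf-8", newline="")` in place of the in-memory sample.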
THE SENTIMENT LABELS ARE:
• 0 - negative
• 1 - somewhat negative
• 2 - neutral
• 3 - somewhat positive
• 4 - positive
DATA SET-
DATA ANALYSIS
• Data analysis is a process of inspecting, cleansing, transforming,
and modeling data with the goal of discovering useful information,
informing conclusions, and supporting decision-making. Data analysis
has multiple facets and approaches, encompassing diverse techniques
under a variety of names, while being used in different business, science,
and social science domains.
• Data mining is a particular data analysis technique that focuses on
modeling and knowledge discovery for predictive rather than purely
descriptive purposes
EXTRACTING FEATURES FROM DATA SET AND REMOVING IRRELEVANT
DATA-CODE
• We removed stopwords and names from our dataset.
• We also converted uppercase letters to lowercase.
• After that, we counted the number of features in our model so as to
create the DataFrame.
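The cleaning steps above can be sketched roughly as follows. The authors' code is not shown on the slide, so this is only an illustration: the stopword list is a tiny hypothetical one, the bag-of-words counting is an assumed feature scheme, and name removal is left out.

```python
from collections import Counter

# Small illustrative stopword list (an assumption; the slides do not say
# which list was used).
STOPWORDS = {"a", "an", "and", "the", "of", "is", "it", "this", "but"}

def tokenize(phrase):
    """Lowercase a phrase, split on whitespace, and drop stopwords."""
    return [tok for tok in phrase.lower().split() if tok not in STOPWORDS]

def build_vocabulary(phrases):
    """Assign a column index to every distinct token, in sorted order."""
    tokens = sorted({tok for p in phrases for tok in tokenize(p)})
    return {tok: i for i, tok in enumerate(tokens)}

def vectorize(phrase, vocabulary):
    """Turn a phrase into a row of token counts (bag of words)."""
    counts = Counter(tokenize(phrase))
    row = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            row[vocabulary[tok]] = n
    return row

phrases = ["The acting is GOOD", "A dull and lifeless plot", "good plot but dull acting"]
vocabulary = build_vocabulary(phrases)
matrix = [vectorize(p, vocabulary) for p in phrases]
print("number of features:", len(vocabulary))
```

Each row of `matrix` is one phrase and each column one word, which is the shape a feature table for the classifiers would need.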
LOADING DATAFRAME-
DIFFERENT CLASSIFIERS TO FIND THE ACCURACY
OF DATA
• Logistic Regression
• Decision Tree
• Random Forest
• SVM (Support Vector Machine)
• Naïve Bayes
• kNN (k-Nearest Neighbors)
DECISION TREE:
• A decision tree is a supervised learning algorithm (one with a pre-defined target
variable) that is mostly used for classification problems. It works for both categorical
and continuous input and output variables. In this technique, we split the population
or sample into two or more homogeneous sets (or sub-populations) based on the most
significant splitter / differentiator among the input variables.
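To make "most significant splitter" concrete, here is a small sketch that scores every candidate split by Gini impurity and keeps the one that yields the most homogeneous subsets. Gini impurity is one common criterion and is an assumption here; the slides do not say which criterion was used, and the toy data is made up.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 for a pure set, larger for mixed sets."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) for the binary
    split `row[feature] <= threshold` with the lowest weighted impurity."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must produce two non-empty subsets
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best

# Toy data: feature 0 separates the two classes perfectly, feature 1 does not.
rows = [[0, 1], [0, 0], [1, 1], [1, 0]]
labels = [0, 0, 4, 4]
split = best_split(rows, labels)
print(split)
```

A full tree would apply `best_split` recursively to each resulting subset until the subsets are pure or a depth limit is reached.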
CODE:
• On applying the decision tree, we get an accuracy of
0.63696369
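The accuracy figures quoted on these slides are presumably the fraction of phrases whose predicted label equals the true label. A minimal sketch of that calculation, with made-up labels (the function name and data are illustrative, not the authors' code):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Five phrases, four predicted correctly.
print(accuracy([0, 1, 2, 3, 4], [0, 1, 2, 2, 4]))
```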
LOGISTIC REGRESSION-
• Despite its name, it is a classification algorithm, not a regression algorithm. It is used to estimate
discrete values (binary values such as 0/1, yes/no, true/false) from a given set of independent variables.
In simple words, it predicts the probability of an event by fitting the data to a logistic function. Because
the model is linear in the log-odds (the logit), it is also known as logit regression. Since it predicts a
probability, its output values lie between 0 and 1.
• The logistic function is an S-shaped curve with equation:
f(x) = L / (1 + e^(-k(x - x0)))
where
• e = the natural logarithm base (also known as Euler's number),
• x0 = the x-value of the sigmoid's midpoint,
• L = the curve's maximum value, and
• k = the steepness of the curve.
With L = 1, k = 1 and x0 = 0 this is the standard sigmoid used in logistic regression.
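The S-shaped curve with the parameters listed above (e, x0, L, k) is the general logistic function L / (1 + e^(-k(x - x0))). A short sketch of it follows; the branch on the sign of the exponent is only there to avoid floating-point overflow for very negative inputs, and the default parameter values are the standard sigmoid.

```python
import math

def logistic(x, L=1.0, k=1.0, x0=0.0):
    """General logistic function L / (1 + e^(-k (x - x0)))."""
    z = k * (x - x0)
    if z >= 0:
        return L / (1.0 + math.exp(-z))
    # For negative z, use the algebraically equivalent form to avoid overflow.
    ez = math.exp(z)
    return L * ez / (1.0 + ez)

for x in (-4, -2, 0, 2, 4):
    print(x, round(logistic(x), 4))
```

At the midpoint `x = x0` the function returns `L / 2`, and it approaches 0 and `L` at the two extremes, which is why the output can be read as a probability when `L = 1`.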
CODE-
• After applying logistic regression to our data, we get an accuracy of
0.6644664466..
• This is higher than the decision tree's accuracy of 0.63696369.
RANDOM FOREST-
• Random forests or random decision forests are an ensemble learning method
for classification, regression and other tasks, that operate by constructing a multitude
of decision trees at training time and outputting the class that is the mode of the
classes (classification) or mean prediction (regression) of the individual trees. Random
decision forests correct for decision trees' habit of overfitting to their training set.
• On applying the random forest, we get a maximum accuracy of 0.6611661166..
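The aggregation step described above (mode of the trees' classes for classification, mean of their predictions for regression) can be sketched as below. This shows only how per-tree outputs are combined, not how the trees are trained; the function names and the tie-breaking rule are assumptions for illustration.

```python
from collections import Counter
from statistics import mean

def forest_classify(tree_predictions):
    """Return the most common class among the trees' predictions.
    Ties are broken in favour of the smallest label (an arbitrary choice)."""
    counts = Counter(tree_predictions)
    top = max(counts.values())
    return min(label for label, c in counts.items() if c == top)

def forest_regress(tree_predictions):
    """Return the mean of the trees' numeric predictions."""
    return mean(tree_predictions)

# Five hypothetical trees vote on the sentiment label of one phrase.
print(forest_classify([3, 2, 3, 4, 3]))
print(forest_regress([1.0, 2.0, 3.0]))
```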
SVM (SUPPORT VECTOR MACHINE)-
• It is a classification method. In this algorithm, we plot each data item as a point in
n-dimensional space (where n is the number of features), with the value of each
feature being the value of a particular coordinate.
• For example, if we only had two features, such as the height and hair length of an
individual, we would first plot these two variables in two-dimensional space, where each
point has two coordinates. The points that lie closest to the separating boundary are
known as support vectors.
• Next, we find a line that splits the data into the two differently classified groups.
We choose the line for which the distance to the closest point in each of the two
groups is as large as possible.
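The idea of choosing the separating line that lies farthest from the closest points can be illustrated by computing the margin of two candidate lines and comparing them. This sketch does not train an SVM; it only evaluates given lines of the form w·x + b = 0 on made-up 2-D points, and all names are illustrative.

```python
import math

def signed_distance(point, w, b):
    """Signed distance from a 2-D point to the line w[0]*x + w[1]*y + b = 0."""
    norm = math.hypot(w[0], w[1])
    return (w[0] * point[0] + w[1] * point[1] + b) / norm

def margin_and_support_vectors(points, labels, w, b, tol=1e-9):
    """Labels are -1 or +1. Returns the smallest label-corrected distance
    (the margin) and the points that attain it (the closest points)."""
    dists = [y * signed_distance(p, w, b) for p, y in zip(points, labels)]
    margin = min(dists)
    support = [p for p, d in zip(points, dists) if abs(d - margin) <= tol]
    return margin, support

points = [(1.0, 1.0), (2.0, 3.0), (4.0, 1.0), (5.0, 3.0)]
labels = [-1, -1, 1, 1]

# Two candidate separating lines: x = 3 and x = 3.5.
m1, s1 = margin_and_support_vectors(points, labels, w=(1.0, 0.0), b=-3.0)
m2, s2 = margin_and_support_vectors(points, labels, w=(1.0, 0.0), b=-3.5)
print(m1, s1)
print(m2, s2)
```

Both lines separate the two groups, but the first has the larger margin, so it is the one a maximum-margin classifier would prefer among these two candidates.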
CODE-
• On applying SVM to our data, we get an accuracy of 0.572697260
K-NEAREST NEIGHBOUR-
• It can be used for both classification and regression problems, but it is more
widely used for classification in industry. K-nearest neighbors is a simple
algorithm that stores all available cases and classifies new cases by a majority vote of
their k neighbors: a case is assigned to the class most common among its k
nearest neighbors, as measured by a distance function.
• The distance function can be Euclidean, Manhattan, Minkowski or Hamming
distance. The first three are used for continuous variables and the fourth
(Hamming) for categorical variables. If k = 1, the case is simply assigned to the
class of its nearest neighbor. Choosing k can be a challenge when
performing kNN modeling.
• On applying this algorithm to our data, we get an accuracy of 0.644114411.. for
k = 3.
• For k = 10, we get an accuracy of 0.628162816.
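A minimal sketch of the majority-vote rule described above, with a pluggable distance function. The training points and labels are made up, and breaking ties toward the smallest label is an arbitrary assumption.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_predict(train_X, train_y, query, k=3, distance=euclidean):
    """Classify `query` by majority vote among its k nearest training cases."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    top = max(votes.values())
    # Ties are broken in favour of the smallest label (an arbitrary choice).
    return min(label for label, c in votes.items() if c == top)

train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = [0, 0, 0, 4, 4, 4]

print(knn_predict(train_X, train_y, (1, 1), k=3))
print(knn_predict(train_X, train_y, (4, 4), k=3, distance=manhattan))
```

Because every prediction scans all stored cases, the cost grows with the size of the training set, which is one reason the choice of k and of the distance function matters in practice.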
CODE-
GAUSSIAN NAIVE BAYES-
PLOTTING GRAPHS FOR EACH MODEL-
• After applying the different classifiers, we achieve the maximum
accuracy with logistic regression: 0.670517051
COUNTING NUMBER OF WORDS FOR DIFFERENT SENTIMENTS
GRAPH:
MAXIMUM ACCURACY ON KAGGLE-
POSITIVE WORDS-
THANK YOU
