Sentiment Analysis using Machine Learning.pdf

Sentiment Analysis
Presented by: Rebecca Williams

Overview:
1. Abstract
2. Introduction
3. What is Sentiment Analysis?
4. Applications & uses
5. Advantages
6. Step-by-step process of SA
7. Simple example using TextBlob

Abstract
● Triple talaq, also known as talaq-e-biddat or instant divorce, is a form of Islamic divorce used by
Muslims in India. It allows a Muslim man to divorce his wife legally by stating the word
‘Talaq' three times in any form (verbal, written, or electronic).
● Nowadays, a huge amount of data is posted daily on social media platforms. Twitter
is a well-known social networking platform where users can post their views, opinions, and
thoughts freely.
● Sentiment analysis is the process of understanding people's opinions, thoughts, and feelings
about a given subject. This paper analyses tweets posted on Twitter on the subject of Triple Talaq
from the year 2002 to the year 2019.
● We have transformed unstructured data into structured, informative data to gain insight into
people's opinions.
● The main focus of the work is to analyze people's feelings using two well-known APIs,
TextBlob and spaCy. These APIs are based on the lexicon approach.
● This paper classifies sentiment into three classes: positive, negative, and neutral.

Introduction
● In this paper, we apply statistics, natural language processing (NLP), and machine learning
to identify, analyze, and extract important information from tweets.
● The main objective is to observe the tweet authors' feelings, expressions, thoughts, or judgments
about Triple Talaq.
● Sentiment analysis can be done with either a machine learning or a lexicon-based approach. In this
paper, we have applied a lexicon-based approach.
● This is a feasible and practical approach that can analyze tweet text without training a
machine learning model.
● A lexicon is a collection of words, much like a dictionary in which words are arranged
alphabetically. This approach is subdivided into dictionary-based and corpus-based
approaches.
● Here we use a corpus-based approach. A corpus is a large body of words or text. It is used to
derive a set of abstract rules that govern a natural language from texts in that language, and to
examine how that language relates to other languages.

What does Sentiment Analysis mean?
The process of computationally identifying and categorizing opinions
expressed in a piece of text, especially in order to determine whether the
writer's attitude towards a particular topic, product, etc. is positive,
negative, or neutral.

Sentiment analysis can be used for:
● Social media monitoring
● Brand monitoring
● Voice of customer (VoC)
● Customer service
● Workforce analytics and voice of employee
● Product analytics
● Market research and analysis

Advantages
● Scalability:
Sentiment analysis makes it possible to process data at scale in an efficient and cost-effective way.
● Real-time analysis:
A sentiment analysis system can help you identify urgent situations, such as a sudden surge of
negative feedback, immediately and take action.
● Consistent criteria:
By using a centralized sentiment analysis system, companies can apply the same criteria to all of
their data. This helps to reduce errors and improve data consistency.

What is the use of NLP in Sentiment analysis?
● Sentiment analysis, also known as opinion mining, is a field within Natural Language
Processing (NLP) that builds systems to identify and extract opinions from text.
● A sentiment analysis system for text combines natural language processing (NLP)
and machine learning techniques to assign weighted sentiment scores to the entities,
topics, themes, and categories within a sentence or phrase.
● Natural Language Processing (NLP) is a branch of AI that helps computers understand,
interpret, and manipulate human language.

Sentiment Analysis: Step-by-Step Process

Step 1: Tokenization
Tokenization is the process of dividing a large quantity of text into smaller parts
called tokens.
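A minimal sketch of tokenization, assuming a simple regular-expression rule instead of a library tokenizer. The `tokenize` helper is hypothetical and not part of the original slides; real tokenizers handle contractions, URLs, and hashtags more carefully.

```python
import re

def tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    # \w+ matches a run of letters, digits, or underscores;
    # [^\w\s] matches one punctuation character.
    return re.findall(r"\w+|[^\w\s]", text)

tokens = tokenize("Terrible pitching and awful hitting led to another crushing loss.")
print(tokens)
# ['Terrible', 'pitching', 'and', 'awful', 'hitting', 'led', 'to',
#  'another', 'crushing', 'loss', '.']
```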

Step 2: Cleaning the data
● Remove numbers
● Stemming/lemmatization
● Part of speech tagging
● Remove punctuation
● Lowercase
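The simpler cleaning steps above might look like the sketch below, which uses only the Python standard library. The `clean_tokens` helper is a hypothetical name. Stemming/lemmatization and part-of-speech tagging are left out because they need language resources from an NLP library.

```python
import re
import string

# Translation table that deletes every ASCII punctuation character
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)

def clean_tokens(tokens):
    """Lowercase each token and strip digits and punctuation from it."""
    cleaned = []
    for token in tokens:
        token = token.lower()                  # lowercase
        token = re.sub(r"\d+", "", token)      # remove numbers
        token = token.translate(_PUNCT_TABLE)  # remove punctuation
        if token:                              # drop tokens that became empty
            cleaned.append(token)
    return cleaned

print(clean_tokens(["The", "2", "Worst", "part", "!", "scene5"]))
# ['the', 'worst', 'part', 'scene']
```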

Step 3: Removing the stop words
One of the major forms of preprocessing is to filter out useless data. In
natural language processing, words that carry little meaning on their own
(such as "the", "is", and "and") are referred to as stop words.
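A minimal sketch of stop-word removal. The `STOP_WORDS` set is a tiny, hypothetical list chosen for illustration; NLP libraries ship much longer lists. Negation words such as "not" are deliberately left out of the list, since removing them can flip the sentiment of a sentence.

```python
# Hypothetical stop-word list, for illustration only
STOP_WORDS = {"a", "an", "the", "is", "it", "was", "to", "and", "of", "this", "about"}

def remove_stop_words(tokens):
    """Keep only the tokens that are not in the stop-word list."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]

tokens = ["this", "is", "the", "greatest", "screwball", "comedy", "ever", "filmed"]
print(remove_stop_words(tokens))
# ['greatest', 'screwball', 'comedy', 'ever', 'filmed']
```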

Step 4: Classification
● Rule-based systems that perform sentiment
analysis based on a set of manually crafted
rules.
● Automatic systems that rely on machine
learning techniques to learn from data.
● Hybrid systems that combine both rule-based
and automatic approaches.

Step 5: Apply a Supervised Algorithm for Classification

Machine Learning/Automatic
This approach employs a machine-learning technique and diverse features to construct a classifier that
can identify text that expresses sentiment. Nowadays, deep-learning methods are popular because they
learn feature representations directly from the data.
Lexicon-Based/Rule-based
This method uses a set of words annotated with polarity scores to decide the overall sentiment score
of a given text. Its strongest asset is that it does not require any training data,
while its weakest point is that a large number of words and expressions are not included in sentiment
lexicons.
Hybrid
The combination of machine learning and lexicon-based approaches to sentiment analysis is
called hybrid. Though not commonly used, this method often produces more promising results than
either approach on its own.
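The lexicon-based method can be sketched as follows. The `LEXICON` dictionary and its scores are invented for illustration and do not come from any real sentiment lexicon; averaging the matched scores is one simple aggregation choice among several.

```python
# Hypothetical polarity lexicon: word -> score in [-1, 1] (made-up values)
LEXICON = {
    "greatest": 1.0,
    "good": 0.7,
    "disappointing": -0.6,
    "awful": -1.0,
    "terrible": -1.0,
    "worst": -1.0,
}

def lexicon_score(tokens):
    """Average the polarity of the tokens found in the lexicon (0.0 if none)."""
    scores = [LEXICON[t] for t in tokens if t in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0

def classify(tokens):
    """Turn the average score into a positive/negative/neutral label."""
    score = lexicon_score(tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(classify(["terrible", "pitching", "and", "awful", "hitting"]))  # negative
print(classify(["the", "boxing", "scenes"]))                          # neutral
```

The second call illustrates the weak point noted above: none of the words are in the lexicon, so the text falls back to neutral regardless of what it actually expresses.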

Algorithms used:
Three machine learning classification algorithms are predominantly used for sentiment analysis:
● Support Vector Machines (SVMs)
● Naive Bayes
● Decision Trees
Each has its own advantages and drawbacks; however, a few studies have concluded that the Naive Bayes
classifier is the most accurate of the three.
There are also two main variants of the lexicon-based approach:
● Corpus-based
● Dictionary-based
Combining both generally gives the best results. However, today we'll go into one of the more widely
used machine learning algorithms, the Naive Bayes algorithm.
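To make the Naive Bayes idea concrete, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing, written in plain Python. The class name, the toy training data, and the choice to ignore unseen words are all assumptions made for illustration; this is not the method used in the slides, and a real project would more likely use an established library.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocabulary = set()
        for tokens, label in zip(documents, labels):
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_log_prob = None, -math.inf
        for label, doc_count in self.class_counts.items():
            log_prob = math.log(doc_count / total_docs)  # log P(class)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                if token not in self.vocabulary:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][token]
                # log P(token | class) with add-one smoothing
                log_prob += math.log((count + 1) / (total_words + vocab_size))
            if log_prob > best_log_prob:
                best_label, best_log_prob = label, log_prob
        return best_label

# Toy training data, invented for illustration only
train_docs = [
    ["greatest", "comedy", "ever"],
    ["great", "acting", "loved", "it"],
    ["terrible", "pitching", "awful", "hitting"],
    ["worst", "boxing", "scenes", "pathetic"],
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayesClassifier().fit(train_docs, train_labels)
print(model.predict(["awful", "boxing", "scenes"]))  # negative
print(model.predict(["loved", "the", "comedy"]))     # positive
```

Log-probabilities are summed instead of multiplying raw probabilities so that long texts do not underflow to zero.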

Let’s see a simple example:

What is TextBlob?
TextBlob is a Python library that offers a simple API for performing basic
NLP tasks.
The sentiment property of a TextBlob object returns two values: polarity and subjectivity.
Polarity is a float in the range [-1, 1], where 1 means a positive statement and -1 means
a negative statement. Subjective sentences generally refer to personal opinion, emotion, or
judgment, whereas objective sentences refer to factual information. Subjectivity is also a float,
in the range [0, 1], where 0 is very objective and 1 is very subjective.

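Code example:
from textblob import TextBlob

# Sample feedback texts to score
feedback1 = "unbelievably disappointing"
feedback2 = "Terrible pitching and awful hitting led to another crushing loss."
feedback3 = "this is the greatest screwball comedy ever filmed"
feedback4 = "It was pathetic. The worst part about it was the boxing scenes."

# .sentiment returns a Sentiment(polarity, subjectivity) named tuple
blob1 = TextBlob(feedback1)
print(blob1.sentiment)
# Wrap the other feedback strings in TextBlob the same way to score them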

Output
Sentiment(polarity=-0.6, subjectivity=0.7)
Sentiment(polarity=1.0, subjectivity=1.0)
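A polarity score like the ones above can be mapped to the three classes (positive, negative, neutral) with a simple rule. The sketch below is an assumption about how that mapping might be done: the `label_from_polarity` helper and its `threshold` value are hypothetical and not taken from the slides.

```python
def label_from_polarity(polarity, threshold=0.05):
    """Map a polarity score in [-1, 1] to one of three classes.

    The threshold is a hypothetical cut-off; scores within
    [-threshold, threshold] are treated as neutral.
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"

# -0.6 and 1.0 are the polarity values from the output above;
# 0.0 is added to show the neutral case.
for polarity in (-0.6, 1.0, 0.0):
    print(polarity, label_from_polarity(polarity))
# -0.6 negative
# 1.0 positive
# 0.0 neutral
```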

“Just as knowledge makes human
intelligent, data makes software
intelligent.”
- Amarpreet Kalkat, Frrole
