Natural language processing is concerned with programming computers to process and analyze large amounts of natural language data. Sentiment analysis is a technique to detect subjective information in text documents by determining the sentiment of a writer about some aspect of a document. It recognizes the subjectivity and objectivity of text and classifies the opinion orientation. Sentiment analysis works by extracting features from text, preprocessing the text by stemming words and removing stop words, and then classifying the text sentiment using classifiers like Naive Bayes. The benefits of sentiment analysis include determining marketing strategy, improving products and customer service, using user input for mining, and improved decision making.
2. Natural Language Processing
Natural language processing (NLP) is a subfield of artificial
intelligence concerned with the interactions between computers
and human (natural) languages, in particular how to program
computers to process and analyze large amounts of natural
language data.
Some of the most commonly researched tasks in natural language
processing are:
• Part-of-speech tagging
• Sentence breaking
• Stemming
• Optical character recognition (OCR)
3. INTRODUCTION
Opinion mining, or sentiment analysis, is a technique to detect and
extract subjective information in text documents.
In general, sentiment analysis tries to determine the sentiment of a
writer about some aspect of a document.
The art of opinion mining is to recognize the subjectivity and
objectivity of a text and further classify the opinion orientation of
the text.
4. Different Types of Opinions
Regular opinion: It has two main sub-types. A direct opinion is an
opinion expressed directly on an entity, and an indirect opinion is an
opinion expressed indirectly on an entity.
Comparative opinion: This expresses a relation of similarities or
differences between two or more entities, or a preference of the
opinion holder based on those entities.
Explicit opinion: A subjective statement that gives a
regular/comparative opinion, e.g., “Coke tastes better than
Pepsi.”
Implicit opinion: An objective statement implying a
regular/comparative opinion, e.g., “The battery life of Nokia
phones is longer than that of Samsung phones.”
6. FEATURE EXTRACTION
Feature extraction is important in opinion
mining because customers do not usually give an
opinion on a product as a whole, but separately
on its individual features.
Feature selection is used in tasks like image
classification, data mining, cluster analysis,
image retrieval, and pattern recognition.
Features denote properties of textual data in
text classification.
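One common way to turn text into features for classification is a bag-of-words count vector. The sketch below is only an illustration under that assumption; the slides do not say which feature representation is used, and the helper names (`tokenize`, `bag_of_words`, `build_vocabulary`, `to_vector`) and the sample reviews are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(text):
    """Map each token to the number of times it occurs in the text."""
    return Counter(tokenize(text))


def build_vocabulary(documents):
    """Collect the sorted set of all tokens seen across the documents."""
    return sorted({tok for doc in documents for tok in tokenize(doc)})


def to_vector(text, vocabulary):
    """Represent a text as term counts aligned with the vocabulary order."""
    counts = bag_of_words(text)
    return [counts[term] for term in vocabulary]


# Hypothetical sample reviews, used only to exercise the functions.
reviews = [
    "The battery life is great",
    "The screen is dim and the battery is weak",
]
vocabulary = build_vocabulary(reviews)
vectors = [to_vector(review, vocabulary) for review in reviews]

print(vocabulary)
for vector in vectors:
    print(vector)
```

Each position in a vector corresponds to one vocabulary term, so a classifier can treat the counts as numeric properties of the text.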
7. PREPROCESSING
Word stemming is a crude pseudo-linguistic process that
deletes suffixes to reduce words to their word stem.
A common stemming algorithm is the suffix stripper
developed by Porter.
The Arabic language has two different morphological analysis
techniques: stemming and light-stemming.
While stemming reduces a word to its stem, light-stemming
deletes common affixes from a word without reducing it to
its stem.
Stop words refer to a set of terms/words with no inherent
useful information.
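The two preprocessing steps above can be sketched in a few lines. This is a toy illustration, not Porter's algorithm: the suffix list, the minimum stem length, and the stop-word list are all assumptions chosen to keep the example short.

```python
import re

# Hypothetical, deliberately tiny stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "and", "are", "is", "of", "the", "to", "was", "were"}

# Suffixes tried in this order. This is NOT Porter's rule set, only a
# crude stand-in that shows the idea of suffix stripping.
SUFFIXES = ("ing", "ed", "ly", "es", "s")


def crude_stem(word, min_stem_len=3):
    """Strip the first matching suffix if at least min_stem_len letters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Tokenize, drop stop words, then stem the remaining tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(tok) for tok in tokens if tok not in STOP_WORDS]


print(preprocess("The charger worked quickly and lasted for days"))
# prints ['charger', 'work', 'quick', 'last', 'for', 'day']
```

A stripper this simple over-stems many words (for example, it would turn "phones" into "phon"), which is why real systems rely on a fuller rule set such as Porter's.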
8. CLASSIFIERS
Naive Bayes classifier is a probabilistic classifier.
The probability model for a classifier is a
conditional model.
It is a simple classification of words based on Bayes'
theorem, used for sentiment detection, email
spam detection, etc.
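For a document with words w1..wn, Naive Bayes scores each class c as P(c) times the product of P(wi | c) and picks the highest-scoring class. The sketch below is a minimal multinomial variant with add-one (Laplace) smoothing, written under the assumption that this is the flavor intended; the class name `NaiveBayes`, its methods, and the training examples are hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Minimal multinomial Naive Bayes over token lists (illustrative only)."""

    def __init__(self):
        self.class_counts = Counter()            # documents seen per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocabulary = set()

    def train(self, labeled_docs):
        """labeled_docs is an iterable of (tokens, label) pairs."""
        for tokens, label in labeled_docs:
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)

    def log_scores(self, tokens):
        """Return log P(c) + sum of log P(w | c) for every class c."""
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, doc_count in self.class_counts.items():
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                if tok not in self.vocabulary:
                    continue  # ignore words never seen in training
                # Add-one smoothing keeps unseen (word, class) pairs nonzero.
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def classify(self, tokens):
        """Pick the class with the highest log score."""
        scores = self.log_scores(tokens)
        return max(scores, key=scores.get)


# Hypothetical training data, far too small for real use.
model = NaiveBayes()
model.train([
    ("great battery life".split(), "pos"),
    ("love the screen".split(), "pos"),
    ("poor battery".split(), "neg"),
    ("screen is dim and poor".split(), "neg"),
])

print(model.classify("great screen".split()))  # prints pos
print(model.classify("poor battery".split()))  # prints neg
```

Working in log space avoids numeric underflow when many small probabilities are multiplied together.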
• Tokenize the text
• Remove stop words
• Pass the tokens to a sentiment classifier
• Obtain a polarity between -1.0 and 1.0
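The steps above can be sketched end to end. The slides do not say which classifier produces the polarity score, so this example swaps in a simple lexicon lookup that averages per-word scores; the lexicon values, the stop-word list, and the function names are assumptions made for illustration.

```python
import re

# Hypothetical stop-word list and polarity lexicon (values in [-1.0, 1.0]).
STOP_WORDS = {"a", "an", "and", "is", "it", "of", "on", "the", "this", "to"}
LEXICON = {
    "good": 0.7, "great": 0.8, "love": 0.9,
    "bad": -0.7, "poor": -0.6, "terrible": -1.0,
}


def tokenize(text):
    """Step 1: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Step 2: drop tokens that carry no useful information."""
    return [tok for tok in tokens if tok not in STOP_WORDS]


def polarity(tokens):
    """Step 3: average the lexicon scores of the known tokens.

    Every lexicon value lies in [-1.0, 1.0], so the mean does as well.
    Text with no known sentiment words is scored as neutral (0.0).
    """
    scores = [LEXICON[tok] for tok in tokens if tok in LEXICON]
    if not scores:
        return 0.0
    return sum(scores) / len(scores)


def analyze(text):
    """Run the whole pipeline and return a polarity in [-1.0, 1.0]."""
    return polarity(remove_stop_words(tokenize(text)))


for sentence in ("The camera is great",
                 "Terrible battery and poor screen",
                 "It arrived on Monday"):
    print(sentence, "->", analyze(sentence))
```

A trained classifier such as Naive Bayes could replace the `polarity` function without changing the first two steps.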
9. BENEFITS
Sentiment analysis provides benefits
such as:
• Determining marketing strategy
• Improved product messaging and
customer service
• Input from a large number of users
is put to good use in the mining
process
• Improved decision making