Naïve multi label classification of you tube comments using

Naïve Multi-label classification
of YouTube comments
using comparative opinion mining
By-
Nidhi Baranwal
MCA 5th sem

Introduction
• People connect with each other in cyberspace and express their
sentiments in the form of comments. YouTube is regarded as the leading
platform for video sharing.
• In some situations the opinion shared by a user has comparative
content: the user watches a video comparing two options and states
a preference based on some reasoning.
• In this paper, the Naïve Bayes machine learning algorithm is used to
perform multi-label classification to determine the sentiments of the
commenters.
• To reduce the computational requirements, the approach makes a naïve
assumption: the words around the keywords related to a particular option
are enough to understand the user's sentiment.
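
The naïve assumption can be pictured as taking a small window of words around each option keyword. The sketch below is only an illustration: the function name `keyword_context`, the window size, the tokenization rule and the example comment are all assumptions, not details taken from the paper.

```python
import re

def keyword_context(comment, keywords, window=3):
    """Collect the words within `window` positions of each keyword occurrence."""
    # Assumed tokenization: lowercase runs of letters, digits and apostrophes.
    tokens = re.findall(r"[a-z0-9']+", comment.lower())
    contexts = {}
    for i, tok in enumerate(tokens):
        if tok in keywords:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            contexts.setdefault(tok, []).extend(left + right)
    return contexts

# Hypothetical comment comparing the two options.
example = keyword_context(
    "The iPhone camera is great but Android has better battery life",
    keywords={"iphone", "android"},
    window=2,
)
```

Only the words in `example` would be passed on to the classifier for each option, which keeps the feature set small compared with using the whole comment.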

Classification?
• Classification is the task of predicting the class (label) of an instance
based on data
• It is a form of supervised learning
Example: Naïve Bayes
• We give the system a set of labeled instances to learn from
• The system builds a model of their structure
• The system can then classify new instances

Types of Classification
• Binary classification: each instance belongs to exactly one of two
classes
• Multiclass classification: each instance belongs to exactly one of
more than two classes
• Multi-label classification: each instance can belong to multiple
classes at the same time
• Hierarchical multi-label classification: the classes are organized in
a hierarchy
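
One way to see the difference between these problem types is in the shape of the target attached to a single instance. The snippet below is a minimal sketch; the label names and the helper `to_indicator` are hypothetical and chosen only for illustration.

```python
# Hypothetical label set for illustration only.
ALL_LABELS = ["camera", "battery", "price"]

def to_indicator(active_labels, all_labels=ALL_LABELS):
    """Encode a set of active labels as a 0/1 vector, one slot per label."""
    return [1 if label in active_labels else 0 for label in all_labels]

binary_target = 1              # binary: exactly one of two classes (0 or 1)
multiclass_target = "neutral"  # multiclass: exactly one of several classes
# multi-label: several labels can be active at the same time
multilabel_target = to_indicator({"camera", "price"})
```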

Opinion Mining?
• Opinion mining, or sentiment analysis, is concerned with
“how people think about a particular thing, person or idea”.
• It is the process of determining whether a piece of writing is
positive, negative or neutral.
• In comparative sentiment analysis we have to deal with multi-aspect
comments: the commenter compares more than one
thing, person or idea on the basis of some aspects.

Tasks Involved
• Finding relevant comments involves the following tasks:
1. Gathering the data (collecting comments).
2. Removing noisy and irrelevant data.
3. Manually assigning sentiments to the comments to
build a training corpus.
4. Developing and evaluating the classification model.

Naïve Bayes Classifier
• Simple probabilistic classification based on Bayes' theorem.
• It is a “bag of words” approach to analyzing content: text is
represented as the collection of its words, discarding grammar and
word order but keeping multiplicity.
• Applications: sentiment detection, email spam detection,
document categorization, etc.
• Probabilistic analysis of Naïve Bayes: for a document d
and class c, by Bayes' theorem
\[ P(c \mid d) = \frac{P(d \mid c)\, P(c)}{P(d)} \]
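
A small worked example may help make the rule concrete. The sketch below is a from-scratch multinomial Naïve Bayes over bag-of-words counts. It is not the WEKA or MEKA implementation used in the study. The add-one (Laplace) smoothing, the skipping of unseen words and the toy training comments are all assumptions made for illustration. Since P(d) is the same for every class, the code compares only log P(c) plus the summed log P(word | c).

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(docs):
    """docs: list of (tokens, label). Returns priors, per-class word counts, vocabulary."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)  # bag of words: keep multiplicity
        vocab.update(tokens)
    priors = {c: n / len(docs) for c, n in class_counts.items()}
    return priors, word_counts, vocab

def predict(tokens, priors, word_counts, vocab):
    """Return the class with the highest log P(c) + sum of log P(word | c)."""
    scores = {}
    for c, prior in priors.items():
        total = sum(word_counts[c].values())
        score = math.log(prior)
        for t in tokens:
            if t in vocab:  # assumption: words never seen in training are ignored
                # assumption: add-one (Laplace) smoothing
                score += math.log((word_counts[c][t] + 1) / (total + len(vocab)))
        scores[c] = score
    return max(scores, key=scores.get)

# Hypothetical toy training corpus.
training = [
    ("great camera love it".split(), "positive"),
    ("love the screen".split(), "positive"),
    ("terrible battery hate it".split(), "negative"),
    ("hate the lag".split(), "negative"),
]
model = train_naive_bayes(training)
prediction = predict("love the camera".split(), *model)
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together.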

Data Analysis
• The study worked on an iPhone vs Android video, which had
over 8000 comments.
• The comments were then filtered, and only comparative comments
were used in the research.
• The resulting dataset is about 400 comments, which is
roughly 5% of the original dataset.

Methodology followed
• Data collection
• Class assignment (2 labels and 9 classes)
• Difficulties faced when assigning annotations
- handling symbols and short forms
- ambiguity of various types in comments
• Finding the part of speech and neighboring words of keywords in the
comments
• Applying the tools and steps for classification
• Finding better results
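
The "2 labels and 9 classes" can be read as follows, although the slides do not spell this out: each of the two options gets its own label, each label takes one of three sentiment values, and joining the two labels gives 3 × 3 = 9 combined classes. The sketch below shows how the same annotations could be arranged as one joined class per comment or as one single-label target list per option. The annotation format and function names are hypothetical.

```python
from collections import Counter

# Hypothetical annotations: one sentiment per option for each comment.
annotations = [
    {"iphone": "positive", "android": "negative"},
    {"iphone": "neutral", "android": "positive"},
    {"iphone": "positive", "android": "negative"},
]
LABELS = ("iphone", "android")
SENTIMENTS = ("positive", "negative", "neutral")  # assumed value set

def to_joined_class(annotation):
    """Joined-label view: combine both labels into one class name."""
    return "|".join(f"{label}={annotation[label]}" for label in LABELS)

def to_single_label_targets(items):
    """Single-label view: one independent target list per label."""
    return {label: [a[label] for a in items] for label in LABELS}

joined = [to_joined_class(a) for a in annotations]
per_label = to_single_label_targets(annotations)
class_sizes = Counter(joined)  # useful when checking class balance
n_joined_classes = len(SENTIMENTS) ** len(LABELS)
```

With only a few hundred comments spread over nine joined classes, some classes are likely to have very few examples, which is one reason class balancing matters in this setting.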

Tools and Steps used
• We used the specialized software WEKA (single-label classification and
joined-label classification) and MEKA (multi-label classification)
to perform the machine learning tasks
• The following steps were taken to develop the classification model:
- Data processing and class balancing
- Classification
- Naïve Bayes probabilistic classifier

Results obtained
• The results, in terms of the different performance measures, are not
satisfactory, but the naïve assumption about the neighboring
words of keywords performed well compared with the others.
• Single-label and joined-label classification give poorer
results than multi-label classification
