2. Definition of Text Mining
Text mining is a subset of unstructured data
management.
It is the exploration and analysis of textual
data by automatic and semi-automatic means to
discover new knowledge.
3. Definition of Emotion
A strong feeling deriving from one's
circumstances, mood, or relationships with
others.
In psychology, emotion is often defined as a
complex state of feeling that results in
physical and psychological changes that
influence thought and behavior.
Synonyms: feeling, sentiment, sensation
4. Elements
1. Thoughts: Ideas or images that pop into your
head when you are experiencing an emotion.
2. Your Body's Response: The physical changes
you experience (for example, increased heart
rate, feeling queasy) when you experience an
emotion.
3. Behaviours: The things you want or feel an
urge to do when you experience a certain
emotion.
5. Emotions from Text
The purpose is not to identify specific
emotions but rather to find the emotional
state or mindset of the writer at the time of
writing the text.
6. Theories of Emotion
The major theories of emotion can be grouped
into three main categories:
Physiological
Neurological
Cognitive
7. Theories of Emotion
Physiological theories suggest that responses
within the body are responsible for emotions.
Neurological theories propose that activity
within the brain leads to emotional responses.
Cognitive theories argue that thoughts and
other mental activity play an essential role in
the formation of emotions.
9. Positive & Negative Emotions
Positive: any emotion that makes us feel good,
e.g. appreciation, joy, love, passion, freedom,
excitement.
Negative: any emotion that stops us from
thinking and behaving rationally and from seeing
situations in their true perspective, e.g.
jealousy, anger, fear, guilt, shame,
frustration, sadness.
13. 1. Keyword Spotting Technique
Text → Tokenization → Identify Emotion Words →
Analysis of Intensity → Negation Check → Emotion
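The pipeline on this slide can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the keyword lexicon, the intensity weights, the negation list, and the two-token look-back window are all hypothetical choices made for the example.

```python
import re

# Hypothetical toy lexicon (assumption): emotion keyword -> emotion label.
EMOTION_KEYWORDS = {
    "happy": "joy", "hooray": "joy", "love": "joy",
    "angry": "anger", "furious": "anger",
    "afraid": "fear", "scared": "fear",
    "sad": "sadness", "miserable": "sadness",
}
# Hypothetical intensity modifiers (assumption): they scale a nearby keyword.
INTENSIFIERS = {"very": 1.5, "extremely": 2.0, "slightly": 0.5}
# Hypothetical negation words (assumption).
NEGATIONS = {"not", "never", "no"}


def tokenize(text):
    """Step 1: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def spot_emotions(text):
    """Return {emotion: score} for emotion keywords that are not negated."""
    tokens = tokenize(text)
    scores = {}
    for i, token in enumerate(tokens):
        # Step 2: identify emotion words.
        emotion = EMOTION_KEYWORDS.get(token)
        if emotion is None:
            continue
        # Look at up to two tokens before the keyword (assumed window size).
        window = tokens[max(0, i - 2):i]
        # Negation check: a negated keyword is dropped.
        if any(word in NEGATIONS for word in window):
            continue
        # Intensity analysis: scale the keyword by preceding modifiers.
        weight = 1.0
        for word in window:
            weight *= INTENSIFIERS.get(word, 1.0)
        scores[emotion] = scores.get(emotion, 0.0) + weight
    return scores


print(spot_emotions("I am very happy today"))
print(spot_emotions("I am not happy"))
```

Dropping a negated keyword is only one possible policy; a fuller system might map it to an opposite emotion instead.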
14. 2. Lexical Affinity Method
An extension of the keyword spotting technique.
Apart from picking up emotional keywords, it
assigns arbitrary words a probabilistic ‘affinity’
for a particular emotion.
These probabilities are often part of linguistic
corpora.
Disadvantage: the assigned probabilities are
biased toward the genre of the corpus they were
derived from.
15. Example
The word ‘accident’, having been assigned a high
probability of indicating a negative emotion,
would not contribute correctly to the emotional
assessment of phrases like “I avoided an
accident” or “I met my girlfriend by accident”.
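A small sketch of the problem described above. The affinity values are invented for illustration (real systems derive them from a corpus), and averaging the affinities of known words is just one assumed way to combine them.

```python
import re

# Hypothetical affinities (assumption): P(negative emotion | word).
NEGATIVE_AFFINITY = {
    "accident": 0.75,
    "crash": 0.75,
    "lovely": 0.125,
}


def negative_score(text):
    """Average the negative affinity of the known words; None if no word is known."""
    words = re.findall(r"[a-z']+", text.lower())
    known = [NEGATIVE_AFFINITY[w] for w in words if w in NEGATIVE_AFFINITY]
    if not known:
        return None
    return sum(known) / len(known)


# Both phrases get the same high negative score, because the word-level
# affinity of "accident" ignores how the word is used in the sentence.
print(negative_score("I avoided an accident"))
print(negative_score("I met my girlfriend by accident"))
```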
16. 3. Learning based methods
Classify the input texts into different emotions.
Learning-based methods detect emotions using a
previously trained classifier, which applies
machine learning techniques to determine which
emotion category the input text belongs to.
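One possible instance of a learning-based method is a multinomial Naive Bayes classifier with add-one smoothing, sketched below using only the standard library. The slides do not name a specific algorithm, so this choice is an assumption, and the six training sentences are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical labeled training data (assumption), far too small for real use.
TRAINING_DATA = [
    ("i passed my exam today", "joy"),
    ("what a wonderful surprise", "joy"),
    ("i love this song", "joy"),
    ("i failed my exam today", "sadness"),
    ("i lost my keys again", "sadness"),
    ("this is a terrible day", "sadness"),
]


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count words per label, documents per label, and collect the vocabulary."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocabulary = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocabulary.update(tokens)
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-probability under Naive Bayes."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Log prior of the label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # words never seen in training are ignored
            # Add-one (Laplace) smoothed likelihood of the word given the label.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING_DATA)
print(classify("I love this wonderful song", model))
print(classify("What a terrible day", model))
```

Unlike keyword spotting, the classifier can assign an emotion to text that contains no dedicated emotion keyword, as long as its words appeared in the training data.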
17. 4. Hybrid Methods
A combination of the keyword spotting technique
and learning-based methods.
Aims to improve accuracy.
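A rough sketch of one way to combine the two approaches: try the keyword lexicon first and fall back to a learned component only when no keyword is found. The lexicon, the training sentences, and the crude word-vote model standing in for a trained classifier are all assumptions made for this example.

```python
import re
from collections import Counter, defaultdict

# Hypothetical keyword lexicon (assumption).
KEYWORDS = {"hooray": "joy", "happy": "joy", "sad": "sadness", "angry": "anger"}

# Hypothetical labeled examples (assumption) for the learned fallback.
TRAINING_DATA = [
    ("i passed my exam today", "joy"),
    ("we won the match", "joy"),
    ("i failed my exam today", "sadness"),
    ("we lost the match", "sadness"),
]


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def train_word_votes(examples):
    """Learn, for each word, how often it appeared under each label."""
    votes = defaultdict(Counter)
    for text, label in examples:
        for token in tokenize(text):
            votes[token][label] += 1
    return votes


def hybrid_classify(text, votes):
    """Return (label, stage), where stage says which component decided."""
    tokens = tokenize(text)
    # Stage 1: keyword spotting.
    for token in tokens:
        if token in KEYWORDS:
            return KEYWORDS[token], "keyword"
    # Stage 2: learned fallback, summing per-word label votes.
    tally = Counter()
    for token in tokens:
        tally.update(votes.get(token, {}))
    ranked = tally.most_common()
    # No evidence, or a tie between the top two labels: give no answer.
    if not ranked or (len(ranked) > 1 and ranked[0][1] == ranked[1][1]):
        return None, "none"
    return ranked[0][0], "learned"


votes = train_word_votes(TRAINING_DATA)
print(hybrid_classify("Hooray! I passed my exam", votes))
print(hybrid_classify("I passed my exam", votes))
```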
18. Limitations of above methods
Ambiguity in keyword definition
Incapability of recognizing sentences without
keywords
Lack of Linguistic Information
Difficulties in determining emotion indicators
19. 1. Ambiguity in Keyword
Definitions
Although using emotion keywords is a
straightforward way to detect associated
emotions, the meanings of keywords can be
multiple and vague, as most words change their
meanings according to usage and context.
20. 2. Incapability of Recognizing
Sentences without Keywords
The keyword-based approach relies entirely on
the set of emotion keywords.
Sentences without any keyword are therefore
treated as containing no emotion at all, which
is obviously wrong.
For example, “I passed my qualify exam today”
and “Hooray! I passed my qualify exam today”
should imply the same emotion (joy), but the
former without “hooray” could remain
undetected if “hooray” is the only keyword to
detect this emotion.
21. 3. Lack of Linguistic
Information
Syntactic structure and semantics also
influence the emotion expressed.
For example, “I laughed at him” and “He laughed
at me” would suggest different emotions from
the first person’s perspective.
As a result, ignoring linguistic information is
a further problem for keyword-based methods.
23. Example: Social Network
Communication
This research examines the extent to which
emotion is present in MySpace comments, using a
combination of data mining and content analysis,
and explores the roles of age and gender.
Sample: 819 comments to and from USA users.
Classification: Positive and Negative Emotions.
24. Step 1: Data set
Comments from and to active, normal, long-term
US members.
Normal: members having a public profile.
Comments were filtered with regular expressions
to remove standard picture comments, spam, and
chain messages.
The remaining comments formed the raw data.
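The filtering step might look something like the sketch below. The study's actual regular expressions are not given in these slides, so the three patterns here are hypothetical stand-ins for picture comments, link spam, and chain messages.

```python
import re

# Hypothetical filter patterns (assumption); the original patterns are not shown.
FILTER_PATTERNS = [
    # Stock picture comments such as "Nice pic!"
    re.compile(r"^\s*(nice|great|cute|cool)\s+(pic|picture|photo)s?\W*$", re.IGNORECASE),
    # Comments containing links, treated here as spam
    re.compile(r"https?://|www\.", re.IGNORECASE),
    # Chain messages asking the reader to pass the text on
    re.compile(r"\b(repost|forward)\s+this\b", re.IGNORECASE),
]


def keep_comment(comment):
    """Keep a comment only if none of the filter patterns match it."""
    return not any(pattern.search(comment) for pattern in FILTER_PATTERNS)


def filter_comments(comments):
    return [comment for comment in comments if keep_comment(comment)]


comments = [
    "Nice pic!",
    "I miss you, call me soon",
    "Repost this to 10 friends or else",
    "Check out www.example.com for free stuff",
]
print(filter_comments(comments))
```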
25. Step 2: Classification
Positive & Negative comments.
Example: “I Miss You” can be interpreted as a
positive emotion and is almost a synonym of “I
Love You”, even though it suggests sadness.
Classification deals only with the text of an
individual comment rather than the emotional
state of the commenter.
Reasons for choosing a particular comment are
not considered.
26. Results
Females send and receive significantly more
positive emotion than males.
Negative emotion is much rarer than positive
emotion.
Limitation: only public comments were
considered, so the situation for private
messages may be different.