EMOTION MINING IN TEXT
LOVEPREET SINGH
SLIET Longowal
Definition of Text Mining
• Text mining is a subset of unstructured data management.
• It is the exploration and analysis of textual data by automatic and semi-automatic means to discover new knowledge.
Definition of Emotion
• A strong feeling deriving from one's circumstances, mood, or relationships with others.
• In psychology, emotion is often defined as a complex state of feeling that results in physical and psychological changes that influence thought and behavior.
• Synonyms: feeling, sentiment, sensation
Elements
1. Thoughts: Ideas or images that pop into your head when you are experiencing an emotion.
2. Your Body's Response: The physical changes you experience (for example, increased heart rate, feeling queasy) when you experience an emotion.
3. Behaviours: The things you want or feel an urge to do when you experience a certain emotion.
Emotions from Text
• The purpose is not to identify specific emotions but rather to find the emotional state or mindset of the writer at the time of writing the text.
Theories of Emotion
The major theories of emotion can be grouped into three main categories:
• Physiological
• Neurological
• Cognitive
Theories of Emotion
• Physiological theories suggest that responses within the body are responsible for emotions.
• Neurological theories propose that activity within the brain leads to emotional responses.
• Cognitive theories argue that thoughts and other mental activity play an essential role in the formation of emotions.
Positive & Negative Emotions
Positive: any emotion that makes us feel good, e.g. appreciation, joy, love, passion, freedom, excitement.
Negative: emotions that stop us from thinking and behaving rationally and from seeing situations in their true perspective, e.g. jealousy, anger, fear, guilt, shame, frustration, sadness.
Example: Emotion by viewing an object
Factors on which Emotion depends
• Gender
• Situation
• Age
Techniques for Emotion Detection
• Keyword Spotting Technique
• Lexical Affinity Method
• Learning-Based Methods
• Hybrid Methods
1. Keyword Spotting Technique
Text → Tokenization → Identify Emotion Words → Analysis of Intensity → Negation Check → Emotion
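The keyword spotting pipeline on this slide (tokenization, emotion-word identification, intensity analysis, negation check) can be sketched roughly as follows. This is a minimal illustration only: the lexicon `EMOTION_WORDS`, the `INTENSIFIERS` weights, the `NEGATIONS` set, and the two-token negation window are all assumptions made for the example, not values from the slides.

```python
import re

# Hypothetical mini-lexicon: emotion keyword -> emotion class.
EMOTION_WORDS = {
    "happy": "joy", "hooray": "joy", "love": "joy",
    "sad": "sadness", "angry": "anger", "afraid": "fear",
}
# Hypothetical intensity modifiers and negation cues.
INTENSIFIERS = {"very": 1.5, "extremely": 2.0, "slightly": 0.5}
NEGATIONS = {"not", "never", "no"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def spot_emotions(text):
    """Return one (emotion, intensity, negated) tuple per keyword found."""
    tokens = tokenize(text)
    results = []
    for i, tok in enumerate(tokens):
        if tok not in EMOTION_WORDS:
            continue  # not an emotion keyword
        # Intensity: scale by an intensifier directly before the keyword.
        prev = tokens[i - 1] if i > 0 else ""
        intensity = INTENSIFIERS.get(prev, 1.0)
        # Negation check: look at the two tokens before the keyword.
        window = tokens[max(0, i - 2):i]
        negated = any(w in NEGATIONS for w in window)
        results.append((EMOTION_WORDS[tok], intensity, negated))
    return results


print(spot_emotions("I am very happy"))
print(spot_emotions("I am not happy"))
print(spot_emotions("I passed my exam today"))
```

The last call returns an empty list because the sentence contains no keyword from the lexicon, even though it clearly expresses an emotion.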
2. Lexical Affinity Method
• An extension of the keyword spotting technique.
• Apart from picking up emotional keywords, it assigns arbitrary words a probabilistic ‘affinity’ for a particular emotion.
• These probabilities are often part of linguistic corpora.
• Disadvantage: the assigned probabilities are biased toward the genre of texts in the corpus they come from.
Example
• The word ‘accident’, having been assigned a high probability of indicating a negative emotion, would not contribute correctly to the emotional assessment of phrases like “I avoided an accident” or “I met my girlfriend by accident”.
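The 'accident' problem can be reproduced with a small word-level affinity table. The sketch below is illustrative only: the probabilities in `NEGATIVE_AFFINITY` are invented, and scoring a sentence by its highest-affinity word is an assumed aggregation rule, since the slides do not say how word affinities are combined.

```python
# Hypothetical affinity table: word -> P(negative emotion | word).
NEGATIVE_AFFINITY = {"accident": 0.75, "hurt": 0.9, "avoided": 0.2, "met": 0.1}


def negative_score(text):
    """Score a sentence by its highest word-level negative affinity."""
    words = [w.strip(".,!?\"'").lower() for w in text.split()]
    scores = [NEGATIVE_AFFINITY[w] for w in words if w in NEGATIVE_AFFINITY]
    return max(scores) if scores else 0.0


for sentence in ("I was hurt in an accident",
                 "I avoided an accident",
                 "I met my girlfriend by accident"):
    print(sentence, "->", negative_score(sentence))
```

Because the score depends only on individual words, the second and third sentences receive the same high negative score as a genuinely negative one, which is the weakness the example describes.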
3. Learning-Based Methods
• Classify the input texts into different emotions.
• Learning-based methods detect emotions using a previously trained classifier, which applies machine learning techniques to determine which emotion category the input text belongs to.
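As one possible instance of a learning-based method, the sketch below trains a small multinomial Naive Bayes classifier with add-one smoothing on word counts. The slides do not name a specific algorithm, so Naive Bayes is an assumption here, and the four training sentences and their labels are made up for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesEmotionClassifier:
    """Multinomial Naive Bayes with add-one smoothing over word counts."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter(labels)      # label -> number of texts
        self.vocab = set()
        for text, label in zip(texts, labels):
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.vocab.update(words)
        return self

    def predict(self, text):
        words = text.lower().split()
        total_docs = sum(self.label_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each word.
            logp = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for w in words:
                count = self.word_counts[label][w]
                logp += math.log((count + 1) / (total_words + len(self.vocab)))
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


# Toy, made-up training data: (text, emotion label).
TRAIN = [
    ("i passed my exam today", "joy"),
    ("what a wonderful day", "joy"),
    ("i failed my exam today", "sadness"),
    ("i lost my wallet", "sadness"),
]
clf = NaiveBayesEmotionClassifier().fit(*zip(*TRAIN))
print(clf.predict("i passed the test"))
print(clf.predict("i lost my wallet today"))
```

Unlike keyword spotting, the classifier can label a sentence that contains no dedicated emotion word, provided similar wording appeared in the training data.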
4. Hybrid Methods
• Combination of the keyword spotting technique and learning-based methods.
• Improves accuracy.
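One simple way to combine the two approaches is to try keyword spotting first and fall back to a classifier when no keyword is found. This ordering is an assumption for illustration; the slides only say that the methods are combined. `KEYWORDS` is a hypothetical lexicon, and `toy_classifier` is a hard-coded stand-in for a trained model.

```python
# Hypothetical emotion keyword lexicon.
KEYWORDS = {"hooray": "joy", "happy": "joy", "sad": "sadness", "angry": "anger"}


def keyword_emotion(text):
    """Return the emotion of the first keyword found, or None."""
    for word in text.lower().split():
        word = word.strip(".,!?")
        if word in KEYWORDS:
            return KEYWORDS[word]
    return None


def hybrid_emotion(text, fallback_classifier):
    """Use keyword spotting first; defer to the classifier if it finds nothing."""
    label = keyword_emotion(text)
    return label if label is not None else fallback_classifier(text)


def toy_classifier(text):
    """Stand-in for a trained model: a hard-coded rule, for illustration only."""
    return "joy" if "passed" in text.lower() else "neutral"


print(hybrid_emotion("Hooray! I passed my exam", toy_classifier))
print(hybrid_emotion("I passed my exam today", toy_classifier))
```

In practice the fallback would be a real trained classifier rather than the rule used here.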
Limitations of the above methods
• Ambiguity in keyword definitions
• Incapability of recognizing sentences without keywords
• Lack of linguistic information
• Difficulties in determining emotion indicators
1. Ambiguity in Keyword Definitions
• Although using emotion keywords is a straightforward way to detect the associated emotions, the meanings of keywords can be multiple and vague, since most words change their meaning according to usage and context.
2. Incapability of Recognizing Sentences without Keywords
• The keyword-based approach relies entirely on the set of emotion keywords.
• Sentences without any keyword are therefore treated as containing no emotion at all, which is obviously wrong.
• For example, “I passed my qualifying exam today” and “Hooray! I passed my qualifying exam today” should imply the same emotion (joy), but the former, without “hooray”, could remain undetected if “hooray” is the only keyword used to detect this emotion.
3. Lack of Linguistic Information
• Syntactic structure and semantics also influence the emotion expressed.
• For example, “I laughed at him” and “He laughed at me” suggest different emotions from the first person’s perspective.
• As a result, ignoring linguistic information also poses a problem for keyword-based methods.
Architecture
[Diagram: Input text → Emotion Detector → Output text. Other components shown: Emotion Word Ontology, Emotion Class]
Example: Social Network Communication
• This research examines the extent to which emotion is present in MySpace comments, using a combination of data mining and content analysis, and explores the role of age and gender.
• Sample: 819 comments to and from US users.
• Classification: positive and negative emotions.
Step 1: Data set
• Comments from and to active, normal, long-term US members.
• "Normal" members are those with a public profile.
• Standard picture comments, spam, and chain messages were filtered out using regular expressions.
• The remaining comments formed the raw data.
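The regular-expression filtering step can be sketched as follows. The slides do not give the expressions that were actually used, so the three patterns in `FILTER_PATTERNS` are hypothetical examples of what picture-comment, spam, and chain-message filters might look like.

```python
import re

# Hypothetical patterns; the patterns used in the study are not given.
FILTER_PATTERNS = [
    # Standard picture comments such as "Nice pic!"
    re.compile(r"^\s*(nice|great|cool|cute)\s+(pic|picture|photo)s?\W*$",
               re.IGNORECASE),
    # Spam: here, any comment containing a link
    re.compile(r"https?://", re.IGNORECASE),
    # Chain messages asking the reader to pass them on
    re.compile(r"\b(repost|forward)\s+this\b", re.IGNORECASE),
]


def keep_comment(comment):
    """Return True if no filter pattern matches the comment."""
    return not any(p.search(comment) for p in FILTER_PATTERNS)


def filter_comments(comments):
    """Drop picture comments, spam, and chain messages."""
    return [c for c in comments if keep_comment(c)]


raw = [
    "Nice pic!",
    "I miss you",
    "Cheap pills at http://spam.example",
    "Repost this to 10 friends",
]
print(filter_comments(raw))
```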
Step 2: Classification
• Comments are classified as positive or negative.
• Example: “I Miss You” can be interpreted as a positive emotion and is almost a synonym of “I Love You”, even though it suggests sadness.
• Classification deals only with the text of an individual comment rather than with the emotional state of the commenter.
• Reasons for choosing a particular comment are not considered.
Results
• Females send and receive significantly more positive emotions than males.
• Negative emotion is much rarer than positive emotion.
• Limitation: only public comments are considered, so the situation for private messages may be different.