1
A SENTIMENT ANALYSIS AND CLASSIFICATION ALGORITHM
UTILIZING AN INDEPENDENT TERM MATCHING SCHEME
SENSITIVE TO WORD COUNT PATTERNS
Authors:
Asoka Korale, Ph.D., C.Eng., MIET
Chanuka Perera, Dip., ABE(UK)
Eranda Adikari, B.Sc., C.Eng., MIESL
Nadeesha Ekanayake, B.Sc.,
2
Business Drivers of “Sentiment Analysis” & Classification
Devise a Customer focused Corporate Strategy
Help Determine Areas of Future Investments
Analysis of Customer Feedback for Decision making
Insights on Corporate Image, Service Level and Performance
Business Process Improvement …
3
Objective of the Modeling
Prioritize Comments by Sentiment (Severity of Feedback)
Classify Comments to Pre Defined Categories
Rate Sentiment contained in Feedback
Analyze Feedback Comments, Prioritize and Classify for Timely Action
Direct each Class to Appropriate Authority in Priority Order for Timely action
4
“Sentiment” a Definition
Concise “Comments” give insight to “Emotional” content of message
Emotional Dimensions of Words
Valence (Happiness), Activation (Arousal), Dominance
An Opinion, View held or Expressed
Only “Select” words convey “Emotion”
Dictionaries of rated Words across each Emotional Dimension
Account separately for “Negations”
Words rated for “Sentiment” by Human agents via large Surveys
Introduce Local Language Support
5
Feedback Comment Classification Process
Supervised Methods employ “Training Sequences”
Technique uses word Combinations, Patterns, Frequencies
Grouping comments on a “Theme” or Criteria in to “Classes”
Requires Pre Classified Comments
Suitable for classifying large texts
6
Sentiment Analysis via Independent Term Matching
Assumptions -
Each term in a comment is independent of the others
Valence, Activation and Dominance components of each word are drawn from a
Normal Distribution with specified Mean and Standard Deviation
Combined overall sentiment rating of matched words occurs at the
maximum of the sum of the individual Normal Densities
Overall Sentiment in a comment is represented by the combined effect of
the sentiment of individual words in the comment
Suitable for small text data (Twitter, FB & Customer comments)
Ref: http://www.csc.ncsu.edu/faculty/healey/tweet_viz/
7
Algorithm – Sentiment Score for each Comment
I. Comments in Series: Each Analyzed Separately
II. Select a Comment, Convert words to Lower case and Remove Punctuation
III. Find match in Dictionary for each word in selected comment and get corresponding mean and standard deviation
IV. Extract Mean and Standard Deviation of “Valence” and “Activation” attributes of each matched word from Dictionary
V. Compute a Normal Density Function with Mean and Standard Deviation corresponding to each Attribute of each matched word by scaling a Standard Normal Random Variable
VI. Compute the sum of the Density functions corresponding to each attribute of all matched words in the comment
VII. Determine Maximum point “max-GMM” of the sum of the Density functions to arrive at an average score for the effect of that attribute across all words in the comment
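$$\mu = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_n \end{bmatrix}, \qquad \sigma = \begin{bmatrix} \sigma_1 \\ \sigma_2 \\ \vdots \\ \sigma_n \end{bmatrix}$$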
Comment Words         Valence Rating       Activation Rating
(Dictionary Value)    Mean     Std Dev     Mean     Std Dev
'service'             6.83     1.54        2.95     2.09
'good'                7.89     1.24        3.66     2.72
'late'                3.32     1.17        5.57     2.56
Simple Average        6.01     1.32        4.06     2.46
max-GMM               7.5                  3.7
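Steps II to IV of the algorithm can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the `DICTIONARY` holds only the three example rows from the table, and the names `normalize` and `match_words` are hypothetical.

```python
import string

# Hypothetical three-word dictionary built from the example table:
# word -> {attribute: (mean, std dev)}
DICTIONARY = {
    "service": {"valence": (6.83, 1.54), "activation": (2.95, 2.09)},
    "good":    {"valence": (7.89, 1.24), "activation": (3.66, 2.72)},
    "late":    {"valence": (3.32, 1.17), "activation": (5.57, 2.56)},
}

def normalize(comment):
    """Step II: lower-case the comment and remove ASCII punctuation."""
    table = str.maketrans("", "", string.punctuation)
    return comment.lower().translate(table).split()

def match_words(comment, dictionary=DICTIONARY):
    """Steps III-IV: keep only words found in the dictionary, with their ratings."""
    return [(w, dictionary[w]) for w in normalize(comment) if w in dictionary]

matched = match_words("The service was good, but was late.")
print([w for w, _ in matched])
```

Words that are not in the dictionary ("the", "was", "but") are simply ignored, which is what makes the scheme an independent term matching approach.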
8
Gaussian Mixtures in Rating “Total Sentiment”



Consider the cumulative effect of all matched sentiment bearing words via the sum of the
individual probability densities:
$$f(x; \theta) = \sum_{k=1}^{N} p_k \, g(x; m_k, \sigma_k), \qquad p_k = \frac{1}{N}$$
$$g(x; m_k, \sigma_k) = \frac{1}{\sqrt{2\pi}\,\sigma_k} \, e^{-\frac{1}{2}\left(\frac{x - m_k}{\sigma_k}\right)^2}$$
where x represents the sentiment score, N the number of matched words in a comment, and
$m_k$, $\sigma_k$ the mean and standard deviation of the Normal Distribution of the ratings of each
matched word (collected in $\theta$).
The overall sentiment $x_{comment}$ of a comment in a particular dimension is then determined as
$$x_{comment} = \arg\max_{x} f(x; \theta)$$
which is the point at which the probability density of the mixture of distributions is
a maximum, and so is the most likely value for the overall sentiment of
a comment composed of several words.
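The maximization can be illustrated with a simple grid search over the equal-weight mixture. This is a sketch under stated assumptions: the 1 to 9 rating range and the 0.01 grid step are choices made here and are not given on the slides, and `max_gmm` is a hypothetical name.

```python
import math

def normal_pdf(x, mean, std):
    """Normal density g(x; m_k, sigma_k)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def max_gmm(means, stds, lo=1.0, hi=9.0, step=0.01):
    """Return the grid point in [lo, hi] that maximizes the equal-weight mixture.

    The range [lo, hi] and the step size are assumptions for this sketch.
    """
    n = len(means)
    best_x, best_f = lo, -1.0
    steps = int(round((hi - lo) / step))
    for i in range(steps + 1):
        x = lo + i * step
        f = sum(normal_pdf(x, m, s) for m, s in zip(means, stds)) / n  # p_k = 1/N
        if f > best_f:
            best_x, best_f = x, f
    return best_x

# Ratings of 'service', 'good' and 'late' from the example table
valence = max_gmm([6.83, 7.89, 3.32], [1.54, 1.24, 1.17])
activation = max_gmm([2.95, 3.66, 5.57], [2.09, 2.72, 2.56])
print(valence, activation)
```

For the three example words the search lands close to the max-GMM values of 7.5 (Valence) and 3.7 (Activation) quoted in the example table. A grid search is used only for clarity; any one-dimensional optimizer could replace it.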
9
Overall Valence (Happiness) and Activation (Arousal) of a comment
Comment Words         Valence Rating       Activation Rating
(Dictionary Value)    Mean     Std Dev     Mean     Std Dev
'service'             6.83     1.54        2.95     2.09
'good'                7.89     1.24        3.66     2.72
'late'                3.32     1.17        5.57     2.56
Simple Average        6.01     1.32        4.06     2.46
max-GMM               7.5                  3.7
Figure 1: Gaussian Mixtures of matched words in the Valence Dimension
Figure 2: Gaussian Mixtures of matched words in the Activation Dimension
10
IMPACT OF “NEGATIONS” ON TOTAL RATING
“the service was good but was late”
Comment Words         Valence Rating       Activation Rating
(Dictionary Value)    Mean     Std Dev     Mean     Std Dev
'service'             6.83     1.54        2.95     2.09
'good'                7.89     1.24        3.66     2.72
'late'                3.32     1.17        5.57     2.56
Simple Average        6.01     1.32        4.06     2.46
max-GMM               7.5                  3.7

“the service was not good and late”
Comment Words         Valence Rating       Activation Rating
(Dictionary Value)    Mean     Std Dev     Mean     Std Dev
'service'             6.83     1.54        2.95     2.09
Not 'good'            6.65     1.24        6.38     2.72
'late'                3.32     1.17        5.57     2.56
Simple Average        5.6      1.32        4.97     2.46
max-GMM               6.7                  4.5

• Account for Negations by adjusting the sentiment score of the word immediately following the negation in a
direction opposite in polarity to its matched dictionary sentiment value.
• The magnitude of the adjustment corresponds to the standard deviation of the particular rating value
being adjusted (e.g. the Valence of 'good' moves from 7.89 to 7.89 - 1.24 = 6.65).
• The magnitude of the adjustment can also be user definable
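The negation rule can be sketched as below. Several details are assumptions made for illustration only: "opposite in polarity" is taken to mean a shift toward the other side of a scale midpoint of 5.0, the `NEGATIONS` word list is hypothetical, and `scale` stands in for the user-definable magnitude.

```python
NEGATIONS = {"not", "no", "never"}  # hypothetical list of negation words

def apply_negation(mean, std, midpoint=5.0, scale=1.0):
    """Shift a rating by scale * std toward the opposite side of the midpoint.

    The midpoint of 5.0 is an assumption; the slides do not state the scale.
    """
    shift = scale * std
    return mean - shift if mean >= midpoint else mean + shift

def adjusted_means(words, ratings):
    """Return the means of matched words, adjusting a word that directly follows a negation."""
    means = []
    negate_next = False
    for w in words:
        if w in NEGATIONS:
            negate_next = True
            continue
        if w in ratings:
            mean, std = ratings[w]
            if negate_next:
                mean = apply_negation(mean, std)
            means.append(mean)
        negate_next = False  # only the immediately following word is affected
    return means

valence = {"service": (6.83, 1.54), "good": (7.89, 1.24), "late": (3.32, 1.17)}
means = adjusted_means("the service was not good and late".split(), valence)
print(means, sum(means) / len(means))
```

With the example ratings this reproduces the adjusted Valence of about 6.65 for "not good" and a simple average of about 5.6, and `apply_negation(3.66, 2.72)` gives the adjusted Activation of about 6.38.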
11
Variance in Max GMM and Simple Average Measure
• For the Valence attribute, the max-GMM and Simple Average scores differ by no more than
+/- 0.5 for 90% of the samples.
• The CDF of the difference in the Activation attribute
is tightly centered on the origin, indicating hardly any
variance.
• This also indicates that most comments
convey sentiments of a single polarity and only a
few comments (less than 10%) have words with
conflicting emotional content.
Figure 1: Variance between GMM and Simple Average
measures for estimating overall comment sentiment
A measure of the degree of disparate emotions in the comments
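The comparison between the two measures amounts to reading one point off the empirical CDF of their difference. A minimal sketch, with a hypothetical function name and made-up scores used purely as placeholders (they are not results from the study):

```python
def fraction_within(gmm_scores, avg_scores, tol=0.5):
    """Fraction of comments whose max-GMM and simple-average scores differ by at most tol."""
    diffs = [g - a for g, a in zip(gmm_scores, avg_scores)]
    inside = sum(1 for d in diffs if abs(d) <= tol)
    return inside / len(diffs)

# Placeholder scores for four comments (illustrative only)
gmm = [7.5, 6.0, 3.0, 5.0]
avg = [6.01, 6.2, 3.1, 5.4]
print(fraction_within(gmm, avg))
```

A comment with a large difference, such as the first one here, is one whose matched words pull in conflicting directions.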
12
Sample Comments for Rating and Classification
1.HOTLINE ISSUES - DELAY IN ANSWERING - CX SERVICE ASSISTANCE
Today morning CX has called to the 444 HL for Movie Ticket & he has waited
for more than 10 mins in the line, regarding this now CX was very
disappointed on our service. So pls be kind enough to chk on ths & give the
call back to the CX ASAP. * Note: - Regarding this issue CX need the call
back from one of our manager & CX has requested not to charge a single
rupee from his no for this issue.
2.Yes,man magea prshnaya kiyapu gaman eyaa magea prshnea wisaduwaa
he's a good
3.Yes kad pin nambar signal
4.Wenath ayathana wala mema pahasukam nomati nisa
5.very good service
6.uparimaya
7.Uparima
8.think so
9.thanks
10.Super
11.Solved
12.She resolved my problem.
13.Service nallam
14.Sambanda weemata boho welawak giya nisa
15.recharge
16.Prashnayata pilithura hodin pahadili kara dima
17. Payak athulatha gataluwa nirakaranaya karanwa kiuwa. Thawamath
gataluwa nirakaranaya kara natha.
18.oba ayathanaya sewawan sadaha ihala mudalak ayakarana nisa
19.no mms setting laba dunnada save kala nohaka
20.nam apahu e tika ewanna
21.Mata awashshaya u pilithurau pahadili lesa laba ganemata hakiuna.
22.mage parshnata pilithuru dunna.
23.lotari SMS stop
24.Its professional
25.ing tone sewawa ain kirima
26.I submitted Xtv reg form on 27th oct at yr crescat arcade. They told to call
me on 28th wed to give the AC No
27.Hot line eka answer karapu girlge voice eka and care eka good
28.Hi kohomada? Mama mea dawas wala plan karagena yanawa mage next
music video eka karanna. Song eka "Mata Rawana" :-)
29.harima pehediliwa mage getaluwa nirakaranaya kala thanks
30.Good service but shortcomings due to some arrogant customer care
officers
31.good men
32Good
33.getaluwa hadunagenimata noheki wiya..
34.First of all its great to be treated as a privilege customer. Reason is simple.
I'm using X mobile connection and XTV, because dialog has the better
35.durakathanayata pilithuru denda epai eke hoda naraka kiyanna.
36.Cx need to add the CHU CHU TV which is a kids channel to the channel
list.Since this channel is available on another TV connection.Cx need this
channel to activate for XTV aswell.Please check on this and do the needfull.
Thank you
37.Customer service personal have to be trained better cause they can't think
out of the box.
38.bashawa wenaskaranna
13
Sentiment Aggregates on Sample Comments
Fig 1: Heat Map of Sentiment rated sample comments
Fig 2: Sentiment Dimensions of sample comments
14
A Novel Association Rule Mining Algorithm
• Initialize (at level L1) by determining set of all Items {I} that meet minimum support criteria
• Determine support for all pairs of items {Ii,Ij} (i ~= j) in {I}
• Determine rules for all pairs of items of the form Ii->Ij
• At each subsequent level (Lp), p > 1
• Determine item combinations that meet minimum support criteria
• Items at subsequent stages selected from rules of previous stage that met min support
criteria
• Antecedent at subsequent level (Lp+1) is formed by merging the antecedent and
consequent terms of the rules that meet the minimum support criteria at level Lp
• Stop when combined terms no longer meet min support criteria
Deriving likely word combinations (Keyword Selection)
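• Selection Measures
$$\mathrm{Support}(A \rightarrow B) = \frac{N(A \,\&\, B)}{N}$$
$$\mathrm{Confidence}(A \rightarrow B) = \frac{\mathrm{Support}(A \rightarrow B)}{\mathrm{Support}(A)} = \frac{P(E_A \,\&\, E_B)}{P(E_A)} = P(E_B \mid E_A)$$
where N(A & B) is the number of comments containing both A and B, and N is the total number of comments.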
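The selection measures and the first level of the rule search can be sketched as follows. This illustrates standard support and confidence on a toy set of comments; it is not the authors' algorithm, and the function names and the example words are hypothetical.

```python
from itertools import permutations

def support(itemset, transactions):
    """Fraction of comments (as word sets) that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Support(A and B) / Support(A), i.e. an estimate of P(E_B | E_A)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

def pair_rules(transactions, min_support):
    """Level L1: frequent single items, then rules Ii -> Ij for frequent pairs."""
    items = sorted({w for t in transactions for w in t})
    frequent = [w for w in items if support([w], transactions) >= min_support]
    rules = {}
    for a, b in permutations(frequent, 2):
        if support([a, b], transactions) >= min_support:
            rules[(a, b)] = confidence([a], [b], transactions)
    return rules

# Toy comments represented as sets of words (illustrative only)
comments = [
    {"hotline", "delay"},
    {"hotline", "delay", "call"},
    {"hotline", "good"},
    {"good", "service"},
]
print(pair_rules(comments, min_support=0.5))
```

Here only the pair (delay, hotline) meets the minimum support, giving the rules delay -> hotline with confidence 1.0 and hotline -> delay with confidence 2/3. At the next level the antecedent would be formed by merging the two terms of a surviving rule, as described in the bullets.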
15
Simplifying Assumptions of the Naïve Bayes Technique
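Probability of a sequence of words {Xi} in a comment given class Cj:
$$P(X_1, \ldots, X_N \mid C_j) = \frac{P(X_1, X_2, \ldots, X_N, C_j)}{P(C_j)}$$
$$= \frac{P(X_1 \mid X_2, \ldots, X_N, C_j)\, P(X_2, X_3, \ldots, X_N, C_j)}{P(C_j)}$$
$$= \frac{P(X_1 \mid X_2, \ldots, X_N, C_j) \cdots P(X_N \mid C_j)\, P(C_j)}{P(C_j)}$$
Under the assumption of conditional independence of word Xi given class Cj:
$$P(X_i \mid X_{i+1}, \ldots, X_N, C_j) = P(X_i \mid C_j)$$
$$P(X_1, X_2, \ldots, X_N \mid C_j) = P(X_1 \mid C_j)\, P(X_2 \mid C_j) \cdots P(X_N \mid C_j)$$
Probability of class C given a set of words X = {X1, X2, …, XN}: since P(X) is the same for every class, the most probable class is
$$\hat{C} = \arg\max_{C_j} P(C_j \mid X) = \arg\max_{C_j} \{ P(X \mid C_j)\, P(C_j) \}$$
$$= \arg\max_{C_j} \{ P(X_1 \mid C_j)\, P(X_2 \mid C_j) \cdots P(X_N \mid C_j)\, P(C_j) \}$$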
16
Classification via Naïve Bayes
Assumptions -
The words {Xi} in a comment are treated as independent of one another, and of their order, given
the class {Cj}
A class is determined solely on the specific words in a comment and
their frequency of occurrence in that comment
Conditional Independence of the words in a comment given the class of
the comment
a “bag of words model”
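A bag-of-words Naïve Bayes classifier built on these assumptions can be sketched as below. This is not the authors' implementation: add-one (Laplace) smoothing and log-probabilities are assumptions introduced here to keep the example numerically safe, and the class labels and training comments are made up.

```python
import math
from collections import Counter, defaultdict

def train(labelled_comments):
    """Count class frequencies and per-class word frequencies."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in labelled_comments:
        class_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab

def classify(words, model):
    """Return the class maximizing log P(Cj) + sum_i log P(Xi | Cj)."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # prior P(Cj)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in words:
            # Add-one smoothing (an assumption, not stated on the slides)
            score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy pre-classified comments (illustrative only)
training = [
    (["hotline", "delay", "call"], "hotline"),
    (["call", "waiting", "delay"], "hotline"),
    (["channel", "tv", "add"], "tv"),
    (["tv", "channel", "activate"], "tv"),
]
model = train(training)
print(classify(["delay", "call"], model), classify(["tv", "channel"], model))
```

Because only word counts enter the score, shuffling the words of a comment leaves its predicted class unchanged, which is the bag-of-words property stated above.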
17
Performance of the Classification Algorithm
Accuracy greater than 75% on predicted classes
Accuracy greater than 90% on training samples
Performance will further increase with preprocessing and filtering
single word comments don’t convey meaningful category information
Use misclassified comments to “Retrain” algorithm
Key Words for classification via Association Rules
18
Algorithm Implementation & Results
• Algorithm designed and built from first principles using the MATLAB programming language
• Local Language Support by updating Dictionary with Sinhala and Tamil words conveying emotion
• 59,000 comments analyzed, Rated for Sentiment and Classified / Binned into six categories
• Improved Classification by word relationships (key words) derived from Association Rule Mining
• 3000 Training comments used with six classes for Training Model
• Fast implementation processing all comments in a few hours
• A Word vs. Frequency Analysis used to determine which new words to add to the Dictionary
• The Sentiment rating is a means to “prioritize” the handling of the sorted and binned comments
• Performance improvement by “re-classifying” misclassified comments and reusing them in Training
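The word vs. frequency analysis used to extend the dictionary can be illustrated with a simple count of unmatched words. The function name is hypothetical and the two-word dictionary is a placeholder; the comments are taken from the sample comments slide.

```python
import string
from collections import Counter

def unmatched_word_frequencies(comments, dictionary_words):
    """Count how often each word that is missing from the dictionary occurs."""
    table = str.maketrans("", "", string.punctuation)
    counts = Counter()
    for comment in comments:
        for w in comment.lower().translate(table).split():
            if w not in dictionary_words:
                counts[w] += 1
    return counts

comments = ["very good service", "Service nallam", "Uparima", "uparimaya", "uparima"]
counts = unmatched_word_frequencies(comments, {"good", "service"})
print(counts.most_common(3))
```

Frequent unmatched words, such as local-language terms or common misspellings, are the natural candidates to rate and add to the dictionary.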
19
Conclusion
• Pre Processing – improved performance by retaining only the words and word combinations
relevant to the classification and the business purpose of the analysis
• Spelling mistakes will cause problems as words will not match those in dictionary
• Update Dictionary with new words and misspelled words
• Introduce limits on the minimum number of words that should be matched for a comment to
be analyzed – for increased reliability
• Independent Term Matching – doesn’t necessarily capture “meaning” of comment
• short comments can be analyzed to assess overall sentiment
• Rate the emotional content in a comment
• Algorithm can provide other segmentations by matching words specific to the purpose of routing
• Naïve Bayes gave good classification accuracy
• The severity of sentiment in the classified comment used to prioritize comment handling
• Simple averaging of the attribute values to arrive at the combined effect of all matched words in a
comment can also be considered and may give results that are not that far off from the assumption
of Normality
20
THANK YOU

Sentiment Analysis for IET ATC 2016
