A talk presented at CCWI 2016 on using sentiment analysis of social media content to provide a KPI that can be used in constructing a league table. The case study is of water companies in the UK, but the method can be applied to any set of individuals using Twitter for which a league table is required.
Toward multi-criteria Analysis of Water Company Performance using Sentiment Analysis of Social Media Content
1. Toward multi-criteria analysis of water company
performance using sentiment analysis of social media
content
David J. Walker
Centre for Water Systems
College of Engineering, Mathematics and Physical Sciences
University of Exeter
D.J.Walker@exeter.ac.uk
November 2016
David J. Walker November 2016 1 / 8
2. Measuring Performance
Performance is frequently evaluated according to a set of key
performance indicators (single measurable indicators of an
individual’s performance)
KPIs are collected according to a schedule (often annual) and
aggregated into league tables
Limitations of league tables
Data is often provided by the individuals themselves
KPIs are relatively “static”
Collecting data might be expensive
A solution
Perform sentiment analysis on social media content relating to an
individual to assess their performance
4. Sentiment Analysis
Sentiment analysis is used to infer people’s attitudes or opinions from
written text
“I went on holiday and had a really brilliant time”
“My last holiday was completely terrible,
the service was horrendous”
Often relies on a sentiment lexicon – a list of lexical features
classified as positive or negative (or neutral)
Alternatives use machine learning, e.g., Naive Bayes classifiers or
support vector machines
Widely applied in social media
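To make the lexicon idea concrete, here is a minimal sketch of a lexicon-based scorer. The three-word lexicon and the function names (`TOY_LEXICON`, `lexicon_score`, `classify`) are hypothetical, chosen only to cover the two example sentences above; a real lexicon holds thousands of rated entries and is not what this talk used.

```python
import re

# Hypothetical toy lexicon for illustration only: word -> rating.
TOY_LEXICON = {"brilliant": 1, "terrible": -1, "horrendous": -1}


def lexicon_score(text, lexicon=TOY_LEXICON):
    """Sum the ratings of the words in `text`; unknown words count as 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)


def classify(text, lexicon=TOY_LEXICON):
    """Label text as positive, negative or neutral from the sign of its score."""
    score = lexicon_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

With this toy lexicon the first example sentence is labelled positive and the second negative; words such as "really" and "completely" are ignored, which is one reason practical tools add intensity heuristics.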
5. VADER: Valence Aware Dictionary for sEntiment
Reasoning
VADER reports on the polarity and intensity of sentiment
Heuristics used to describe text characteristics emphasising sentiment
(e.g., punctuation, capitalisation...)
Competitive with other state-of-the-art sentiment analysis tools
evaluated on a range of datasets
Python implementation – nltk
“I went on holiday and had a really brilliant time”
Positive: 0.531, Neutral: 0.469, Negative: 0.0, Compound: 0.7778
“My last holiday was completely terrible, the service was horrendous”
Positive: 0.16, Neutral: 0.414, Negative: 0.426, Compound: -0.6697
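Scores like those above can be obtained from the VADER implementation in nltk. The sketch below assumes nltk is installed and that the `vader_lexicon` resource has been fetched once with `nltk.download("vader_lexicon")`; the helper names `vader_scores` and `format_scores` are illustrative, not part of nltk. Exact values may differ between library versions.

```python
def vader_scores(text):
    """Return VADER's scores as a dict with keys 'neg', 'neu', 'pos', 'compound'.

    The import is local so that the formatting helper below can be used
    even where nltk is not installed.
    """
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    return SentimentIntensityAnalyzer().polarity_scores(text)


def format_scores(scores):
    """Render a VADER score dict in the layout used on the slide."""
    return (
        "Positive: {pos}, Neutral: {neu}, Negative: {neg}, "
        "Compound: {compound}".format(**scores)
    )


# Usage (requires nltk and the vader_lexicon resource):
#   print(format_scores(vader_scores("I went on holiday and had a really brilliant time")))
```

The compound value is a single normalised score in [-1, 1], which makes it a natural candidate for a per-tweet sentiment coefficient.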
6. Case study – UK water companies
Twitter data for nine UK water companies (A-I) obtained for the
period 1st September 2016 – 31st October 2016
Tweets are from each company’s timeline – their own tweets, and
those of other users they have retweeted
Varying numbers of tweets: between 1,000 and 5,000 for seven of the
companies; the two exceptions had 211 and 637 tweets, respectively
7. A Sentiment-based KPI
Given a sentiment coefficient c_t for each timestep t (a day), defined as the
mean sentiment coefficient of the tweets for that day, the basic KPI is
$$s_n = \frac{1}{T} \sum_{t=1}^{T} c_t$$
where t = 1 for the census day, t = 2 for the day prior to the census, and
t = T for the oldest day in the period
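The basic KPI can be sketched in a few lines. This is an illustration under stated assumptions, not the talk's implementation: tweets are assumed to arrive as hypothetical `(date, score)` pairs, days with no tweets are simply skipped (so T counts only days that have tweets), and the function names are invented here.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def daily_coefficients(scored_tweets):
    """Return the daily mean scores c_1..c_T, most recent day first.

    `scored_tweets` is an iterable of (date, score) pairs. Days without
    tweets do not appear in the result (an assumption of this sketch).
    """
    by_day = defaultdict(list)
    for day, score in scored_tweets:
        by_day[day].append(score)
    # Newest day first, so list index 0 corresponds to t = 1.
    return [mean(by_day[day]) for day in sorted(by_day, reverse=True)]


def basic_kpi(coefficients):
    """Unweighted mean of the daily coefficients: (1/T) * sum of c_t."""
    return sum(coefficients) / len(coefficients)


# Made-up scores, for illustration only.
example = [
    (date(2016, 10, 31), 0.5),
    (date(2016, 10, 31), 0.1),
    (date(2016, 10, 30), -0.2),
]
example_kpi = basic_kpi(daily_coefficients(example))
```

Averaging per day first means a day with many tweets carries the same weight as a day with few.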
Incorporating “history”
Use an exponential decay factor to weight sentiment coefficients so that
recent sentiment is given more weight than historical sentiment
$$s_n = \frac{1}{T} \sum_{t=1}^{T} \frac{1}{\exp(t)} c_t$$
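As a minimal sketch of the decayed sum, assuming the daily coefficients are supplied most recent first (so the first list element is the census day, t = 1); the function name `history_kpi` is hypothetical.

```python
import math


def history_kpi(coefficients):
    """Decay-weighted KPI: (1/T) * sum over t of c_t / exp(t).

    `coefficients[0]` is taken to be the census day (t = 1) and the last
    element the oldest day (t = T).
    """
    T = len(coefficients)
    weighted = (c / math.exp(t) for t, c in enumerate(coefficients, start=1))
    return sum(weighted) / T
```

Because the weight shrinks by a factor of e per day, the most recent few days dominate the score: a coefficient from ten days ago contributes roughly 0.005% of its value.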
9. UK water companies – sentiment league table
Company   Basic Rank   History Rank
A         1            1
B         5            2
C         2            3
D         3            4
E         6            5
F         4            6
G         7            7
H         8            8
I         9            9
[Figures: Company A; Company I]
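Turning per-company KPI values into ranks like those in the table is a simple sort. The sketch below assumes that higher sentiment means a better (lower-numbered) rank and that ties keep their input order; the company labels and values in the test data are made up and are not the study's results.

```python
def league_table(kpis):
    """Map each company to its rank, where rank 1 is the highest KPI.

    `kpis` is a dict of company -> KPI value. Ties keep input order,
    because Python's sort is stable.
    """
    ordered = sorted(kpis, key=kpis.get, reverse=True)
    return {company: rank for rank, company in enumerate(ordered, start=1)}


# Hypothetical KPI values, for illustration only.
example_ranks = league_table({"X": 0.2, "Y": 0.5, "Z": -0.1})
```

Running the same function on the basic and the decay-weighted KPIs gives the two rank columns, which can then be compared company by company.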
11. Summary & Future Work
Summary
Social media provides a rich corpus of data from which performance
information can be inferred – sentiment analysis
The example herein shows a basic use of Twitter data for illustrating
performance
Fast to compute (order of seconds)
Future Work
Understand the relationship between sentiment KPIs and existing
KPIs (e.g., Ofwat SIM survey)
Provide further insight by classifying tweets by reason for contact
(e.g., billing query, water status...)
Compare lexicon-based and machine learning-based approaches to
sentiment analysis
(Much) more extensive use of social media aspects – how often is a
tweet retweeted?