Online Reputation Monitoring in Twitter from an Information Access Perspective

Damiano Spina
damiano@lsi.uned.es
@damiano10

UNED NLP & IR Group
January 29, 2014
FdI UCM, Madrid, Spain

In Collaboration with

UNED NLP & IR Group
● Julio Gonzalo
● Enrique Amigó
● Jorge Carrillo de Albornoz
● Irina Chugur
● Tamara Martín

University of Amsterdam
● Maarten de Rijke
● Edgar Meij (Yahoo! Barcelona)
● Mª Hendrike Peetz

Llorente & Cuenca
● Ana Pitart
● Vanessa Álvarez
● Adolfo Corujo

LiMoSINe EU Project
www.limosine-project.eu

Arab Spring in Egypt, Jan 2011

Online Reputation Monitoring (ORM)
● Reputation/public image is key for entities:
– Companies, Organizations, Personalities
● Social Media:
– Necessity (and opportunity) of handling the public image of entities on the Web
● Online Reputation Managers/Analysts
– Handle the reputation of an entity of interest (i.e., the customer)
– Among other tasks, monitoring Social Media (manually!)
– Early detection of issues/conversations/topics that may damage the reputation of the entity of interest

Automatic Tools for ORM
Information Access (IA) techniques for
– Tracking Relevant Mentions
– Sentiment Analysis
– Discovering Keywords/Topics

Problem
● Lack of standard benchmarks for evaluation
● It is hard for the analysts to know how automatic tools will perform on their real data

Goals
● Formalize the Online Reputation Monitoring problem as scientific challenges
– Build standard test collections
– Organize international evaluation campaigns
– Bring together ORM and IA experts from industrial and academic communities
● Propose automatic solutions that may assist the reputation manager, reducing the effort in their daily work

Outline
● Formalization from an Information Access perspective
– Tasks Definition
– Evaluation Framework
● How much of the problem can be solved automatically?
– Filtering
– Topic Detection
● Putting the Human in the Loop: A Semi-Automatic ORM Assistant

Online Reputation Monitoring in Twitter
● Analysts' daily work
– Focus on a given entity of interest
– Recall oriented
  ● They have to check all potential mentions!
  ● They also filter out non-relevant mentions manually
– They make a summary to report to the client periodically
● Summary
– What is being said about the entity in Twitter?
– What are the topics that may damage its reputation?

Why Twitter?
● (Bad) news spreads earlier, faster, and more unpredictably than on any other source on the Web
● Most popular microblogging service
– >230M monthly active users
– 5k tweets published per second
● Challenging for Information Access
– Little context (only 140 characters)
– Non-standard, SMS-like language


Problem Formalization
ORM from an Information Access Perspective

Filtering Task
● Is the tweet related to the entity of interest?
● Example: Suzuki (tweets labeled related / unrelated)
● Input: Entity of interest (name + representative URL) + tweets that potentially mention the entity
● Output: Binary classification at tweet-level (relevant/not relevant)

Polarity for Reputation Task
● Does the tweet affect the reputation of the entity positively or negatively?
● Example: Goldman Sachs
● Input: Entity of interest (name + representative URL) + stream of tweets that potentially mention the entity
● Output: Multi-class classification at tweet-level (positive/negative/neutral)

Topic Detection Task
● What are the topics discussed in the tweets?
● Input: Entity of interest (name + representative URL) + stream of tweets that mention the entity
● Output: Topics (clusters of tweets)

Topic Priority Task
● What is the priority of each topic in terms of reputational issues?
● Input: Topics
● Output: Ranking of Topics
– Alerts go first
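
The ranking output can be illustrated with a small sketch. The slide only names "alerts"; the other two priority levels, the `Topic` structure, and the function name below are assumptions made for illustration, not the task's official format.

```python
from dataclasses import dataclass

# Assumed priority scale: only "alert" appears on the slide,
# the remaining levels are hypothetical placeholders.
PRIORITY_ORDER = {"alert": 0, "mildly_important": 1, "unimportant": 2}


@dataclass
class Topic:
    label: str        # short description of the topic
    priority: str     # one of the keys in PRIORITY_ORDER
    tweet_ids: list   # tweets clustered into this topic


def rank_topics(topics):
    """Return topics ordered by priority, so that alerts go first.

    sorted() is stable, so topics with the same priority keep
    their original relative order.
    """
    return sorted(topics, key=lambda t: PRIORITY_ORDER[t.priority])


topics = [
    Topic("new model launch", "unimportant", [1]),
    Topic("safety recall", "alert", [2, 3]),
    Topic("sponsorship deal", "mildly_important", [4]),
]
ranking = [t.label for t in rank_topics(topics)]
```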

Evaluation Framework
● Reusable Test Collections
● Evaluation Measures
– Compare systems to annotated ground truth
● Evaluation Campaigns
– Involve community
– Compare different approaches

RepLab: Evaluating Online Reputation Management Systems
● Organized as a CLEF Lab (Cross-Language Evaluation Forum)
● 2 editions so far (+1 this year)
– RepLab 2012
  ● Filtering and Polarity for Reputation
  ● Topic Detection and Topic Priority as Monitoring Pilot Task
– RepLab 2013
– RepLab 2014 (in progress)

E. Amigó, J. Carrillo de Albornoz, I. Chugur, A. Corujo, J. Gonzalo, T. Martín, E. Meij, M. de Rijke, D. Spina
Overview of RepLab 2013: Evaluating Online Reputation Monitoring Systems
Proceedings of the Fourth International Conference of the CLEF initiative. 2013.

Why Do We Need All This Stuff?
● To evaluate automatic systems
● To be able to answer the questions:
– Which system performs better?
– Can the tasks be solved automatically?

Automatic Solutions for ORM:
Filtering + Topic Detection

Evaluation: Filtering Task
Automatic systems can significantly help when there is enough training data for each entity (750 tweets)
How?
● Supervised learning
● POPSTAR (Univ. of Porto): features include Twitter metadata, textual features, keyword similarity, and external resources such as the entity's homepage, Freebase and Wikipedia

Evaluation: Topic Detection
Much more difficult than the Filtering Task
What performed best in RepLab?
● UNED_ORM: Clustering of wikified tweets
– Tweets are represented as a Bag of Wikipedia Concepts
– Tweet content is linked to Wikipedia concepts based on intra-Wikipedia links

Topic Detection Approach
● Tweet -> Set of Wikipedia Concepts/Articles
● Clustering: Tweets sharing x% of identified Wikipedia articles are grouped together

D. Spina, J. Carrillo de Albornoz, T. Martín, E. Amigó, J. Gonzalo, F. Giner
UNED Online Reputation Monitoring Team at RepLab 2013
CLEF 2013 Labs and Workshops Notebook Papers. 2013.
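
A minimal sketch of the grouping step, under stated assumptions: the slide does not say how "sharing x% of articles" is measured or how groups are merged, so this example assumes Jaccard overlap between concept sets and transitive (single-link) merging. The function names and the toy tweets are hypothetical.

```python
def jaccard(a, b):
    """Fraction of concepts shared by two sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_tweets(tweet_concepts, threshold=0.5):
    """Group tweets whose concept sets overlap by at least `threshold`.

    tweet_concepts: dict mapping tweet id -> set of Wikipedia concepts.
    Returns a list of sets of tweet ids. Merging is transitive
    (single-link), implemented with a small union-find structure.
    """
    ids = list(tweet_concepts)
    parent = {t: t for t in ids}

    def find(t):
        # Follow parent pointers to the cluster representative,
        # shortening the path along the way.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if jaccard(tweet_concepts[a], tweet_concepts[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for t in ids:
        clusters.setdefault(find(t), set()).add(t)
    return list(clusters.values())


tweets = {
    "t1": {"Ferrari", "Formula One"},
    "t2": {"Ferrari", "Formula One", "Fernando Alonso"},
    "t3": {"Stock market"},
    "t4": set(),  # no concept identified: stays in its own cluster
}
clusters = cluster_tweets(tweets, threshold=0.5)
```

The pairwise comparison is quadratic in the number of tweets, which is acceptable for a per-entity collection of a few thousand tweets but would need indexing for larger streams.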

Wikification: Commonness probability
WP concept c, n-gram q

q = "ferrari"

COMMONNESS("Ferrari S.p.A.", "ferrari") = 4 / (4 + 2 + 1) = 0.57
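
The arithmetic above can be reproduced with a short sketch. It assumes that commonness is the share of Wikipedia links with anchor text q that point to concept c, and that 4, 2 and 1 are such link counts. Only "Ferrari S.p.A." is named on the slide; the other two concept names are hypothetical placeholders.

```python
def commonness(link_counts, concept):
    """Share of the links with a given anchor text that point to `concept`.

    link_counts: dict mapping candidate concept -> number of Wikipedia
    links that use the anchor text and point to that concept.
    """
    total = sum(link_counts.values())
    if total == 0:
        return 0.0
    return link_counts.get(concept, 0) / total


# Counts for the anchor text "ferrari", taken from the slide.
# Concept names other than "Ferrari S.p.A." are made up for illustration.
ferrari_links = {
    "Ferrari S.p.A.": 4,
    "Scuderia Ferrari": 2,   # hypothetical name
    "Enzo Ferrari": 1,       # hypothetical name
}
score = round(commonness(ferrari_links, "Ferrari S.p.A."), 2)
```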

Building Semi-Automatic Tools for
ORM

ORMA: A Semi-Automatic Tool for
Online Reputation Monitoring

J. Carrillo de Albornoz, E. Amigó, D. Spina, J. Gonzalo
ORMA: A Semi-Automatic Tool for Online Reputation Monitoring in Twitter
36th European Conference on Information Retrieval (ECIR). 2014.

Basic Filtering Approach
● Support Vector Machines (SVM): Related/Unrelated
● Input: training tweets + test tweets (unknown label)
● Bag of Words: Tokenization + Preprocessing + Term Weighting
● Filtering Classifier: 0.42 F, similar to the best RepLab system
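
The bag-of-words step can be sketched as follows. The slide does not specify the tokenizer, the preprocessing, or the weighting scheme, so the choices here (lowercasing, URL removal, TF-IDF weights) and the function names are assumptions. The resulting sparse vectors are the kind of input a linear SVM would be trained on; the classifier itself is not shown.

```python
import math
import re
from collections import Counter


def tokenize(tweet):
    """Lowercase, drop URLs, keep word tokens (with #hashtags and @mentions)."""
    tweet = re.sub(r"https?://\S+", " ", tweet.lower())
    return re.findall(r"[#@]?\w+", tweet)


def tfidf_vectors(tweets):
    """Turn tweets into sparse {term: weight} vectors using TF-IDF.

    weight = term frequency * (log(N / document frequency) + 1)
    """
    docs = [Counter(tokenize(t)) for t in tweets]
    n = len(docs)
    # Iterating a Counter yields each distinct term once per tweet,
    # so this counts in how many tweets each term occurs.
    df = Counter(term for d in docs for term in d)
    idf = {term: math.log(n / df[term]) + 1.0 for term in df}
    return [{term: tf * idf[term] for term, tf in d.items()} for d in docs]


vectors = tfidf_vectors([
    "Suzuki launches new bike http://t.co/x",
    "Ichiro Suzuki hits a home run",
])
```

Terms that occur in every tweet, such as the entity name itself, get the lowest weight, while terms that separate the two senses ("bike" vs. "run") are weighted higher.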

Active Learning for Filtering

M. H. Peetz, D. Spina, M. de Rijke, J. Gonzalo
Towards an Active Learning System for Company Name Disambiguation in Microblog Streams
CLEF 2013 Labs and Workshops Notebook Papers. 2013.

● Margin Sampling (confidence of the classifier)
● After inspecting 2% of test data (30 out of 1500 tweets):
– 0.42 -> 0.52 F(R,S) (19.2% improvement)
– Higher than the best RepLab contribution
● The cost of initial training data can be reduced substantially:
– Effectiveness: 10% training + 10% test for feedback = 100% training
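
A minimal sketch of margin sampling and of one feedback round, under stated assumptions: the classifier is assumed to expose a signed score per tweet (for an SVM, the distance to the separating hyperplane), and `train` and `oracle` are hypothetical callables standing in for the retraining step and the analyst's judgment.

```python
def margin_sampling(decision_scores, k=1):
    """Return the k tweet ids whose scores are closest to the boundary.

    decision_scores: dict mapping tweet id -> signed classifier score.
    A small absolute score means low confidence.
    """
    ranked = sorted(decision_scores, key=lambda t: abs(decision_scores[t]))
    return ranked[:k]


def active_learning_round(labeled, pool, train, oracle, k=1):
    """One feedback round: retrain, query the least certain tweets, relabel.

    labeled: dict tweet id -> label collected so far
    pool:    set of unlabeled tweet ids
    train:   callable(labeled) -> scoring function (hypothetical)
    oracle:  callable(tweet id) -> label given by the analyst (hypothetical)
    """
    score = train(labeled)
    scores = {t: score(t) for t in pool}
    for t in margin_sampling(scores, k):
        labeled[t] = oracle(t)
        pool.remove(t)
    return labeled, pool


toy_scores = {"t1": 1.8, "t2": -0.05, "t3": 0.4, "t4": -2.1}
queried = margin_sampling(toy_scores, k=2)
```

Repeating the round 30 times with k=1 would correspond to the 30 inspected tweets mentioned above, although the actual batch size and retraining schedule are not stated on the slide.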

Conclusions
● Formalized as Information Access Tasks
– Systematic Evaluation
● Filtering: Almost solved with enough training data (0.49 F, 0.91 accuracy)
● Topic Detection: Systems are useful but not perfect
● We need the expert in the loop
– With a substantial reduction of manual effort
