SlideShare a Scribd company logo
PHEME http://www.pheme.eu
Detection and Resolution
of Rumours
in Social Media
Kalina Bontcheva
University of Sheffield
@kbontcheva
© The University of Sheffield, 2014-2017
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike Licence
PHEME http://www.pheme.eu
Veracity: the 4th
V of Big Data
I coined the term phemes
memes are thematic motifs that spread through social
media in ways analogous to genetic traits
phemes add truthfulness and deception to the mix
named after ancient Greek Pheme,“embodiment of
fame and notoriety, her favour being notability, her wrath
being scandalous rumours"
2
Focus on a fourth crucial challenge: veracity
http://en.wikipedia.org/wiki/Pheme
PHEME http://www.pheme.eu
What is a Rumour?
Our definition
“a circulating story of questionable veracity, which is
apparently credible but hard to verify, and produces
sufficient skepticism and/or anxiety”
3
PHEME http://www.pheme.eu
Social Media is Rife with Phemes
PHEME http://www.pheme.eu
Social Media is Rife with Phemes (2)
PHEME http://www.pheme.eu
Social Media is Rife with Phemes (3)
PHEME http://www.pheme.eu
Rumour analysis: The Challenges
Journalists do it mostly manually
Rumour analysis is challenging
Some rumours could take days, weeks or even months
to die out (or even never!)
Ill-meaning humans can currently outsmart computers
(and humans) and appear genuine
Role of conversational threads in rumour analysis
Tweet-rot and deletions
Real-time analysis of temporally dynamic phemes
PHEME http://www.pheme.eu
What: High Level Goal
Create a computational framework for
automatic discovery and verification of
rumours, at scale and fast
Draw on:
NLP: what’s said
web science: a priori knowledge from Linked Data
social science: who spread it, why and how
information visualisation: visual analytics
PHEME http://www.pheme.eu
Content verification
● A number of projects focussed on detecting
inaccurate reports, which has recently been
relabelled as “fake news”.
● Most projects deal with what is known as
“fact-checking”, comparing sources to validate
reports, with an assumption that other
sources exist.
PHEME http://www.pheme.eu
Work by others on rumours
▪ Emergent.info - the most
similar project to PHEME -
tracking unverified reports
for debunking/confirmation
▪ The project has however
been discontinued
▪ Owing to the project being
totally manual and lacking
automation, maintenance
was not viable
PHEME http://www.pheme.eu
Work by others on rumours
▪ Other projects on rumours largely focussed on
visualisation, tracking rumours input by the user:
PHEME http://www.pheme.eu
Collect Tweet Datasets
PHEME http://www.pheme.eu
Twitter Collector: cloud.gate.ac.uk
Docs & YouTube videos:
https://cloud.gate.ac.uk/info/help/twitter-collector.html
PHEME http://www.pheme.eu
Live Stats and Fine Tuning
Tweets collected in
Amazon buckets for
Download
Processing on GATE
Cloud
PHEME http://www.pheme.eu
GATE Cloud services
24 services, multiple languages
PHEME http://www.pheme.eu
Rumour Annotation
PHEME http://www.pheme.eu
Dataset Overview
Data collection and sampling
Twitter public stream
Retweet threshold to establish salience
Journalists identified rumourous source tweets
and grouped them into stories
PHEME http://www.pheme.eu
Rumour Analysis
PHEME http://www.pheme.eu
Analysis Stages
1)Rumour detection: identify pieces of
information that need to be verified.
2)Rumour stance classification: classify public
stance towards the rumour.
3)Veracity classification: determine if a
rumour is true, false, or remains unverified.
PHEME http://www.pheme.eu
Experimental Methodology
– Leave-one-event-out: train on n-1 event, and
test on the n-th event.
– Cross-validation across events
– not across rumours or tweet sets
PHEME http://www.pheme.eu
Rumour detection
▪Classification problem
– Tweet level
– Classes: rumour or non-rumour
▪Why
– Identify tweets reporting unverified information.
– Necessary for subsequent stance and veracity
classification.
– “Michael Brown was involved in a robbery before
being shot by police” -> needs context
PHEME http://www.pheme.eu
Rumour detection: Results
A. Zubiaga, M. Liakata, R. Procter . Exploiting Context for Rumour Detection in Social Media. International
Conference on Social Informatics, 109-123. 2017.
PHEME http://www.pheme.eu
Rumour Stance classification
▪Classification Problem
– Tweet level
– Classes: support, deny, question, comment
▪Why
– Show journalists the unfolding online debates on a
given pheme
– Aid veracity classification
PHEME http://www.pheme.eu
Stance classification
PHEME http://www.pheme.eu
Rumour Stance & Veracity
▪ Manual Analysis [Mendoza et al. 2010]
▪ True rumours: 95% “support”, 4% “questioning” and
only 0.4% were “denies”
→ True rumours are largely supported by other
Twitter users
▪ False rumours: 38% “denies” and 17% “questioning”
→ Stance is an important factor for verification step
PHEME http://www.pheme.eu
Features: Stance Classification
▪ Syntactic: #negations, POS
▪ Lexical: e.g. bag of words, key words
▪ Lexicon: word embeddings, brown clusters
▪ Indicator: e.g. “this is nonsense”, “is a hoax”
▪ User metadata: e.g. verified, #followers/followees
▪ Semantic: e.g. emoticons, sentiment, NEs
▪ Message metadata: e.g. length of the post, URL
▪ Content formatting: e.g. capital words ratio
▪ Punctuation: e.g. ?, …, !
A. Aker, L. Derczynski, K. Bontcheva. Simple Open Stance Classification for Rumour Analysis. Recent Advances in
Natural Language Processing, 2017.
PHEME http://www.pheme.eu
USFD on RumourEval Data
▪ Applying traditional ML
▪ Focus on “problem specific” features
– Capturing Surprise, Doubt, Certainty and Support towards
rumourous tweets
– Similarity to initial tweet
▪ Random Forest (best performing traditional ML) with
features reported by related work yields to 76.54 accuracy
▪ Adding “problem specific” features yields 79.02 accuracy
(LSTM 78.4).
A. Aker, L. Derczynski, K. Bontcheva. Simple Open Stance Classification for Rumour Analysis. Recent Advances in
Natural Language Processing, 2017.
PHEME http://www.pheme.eu
Stance Classification: Findings
▪ Supporting tweets are more likely to include links.
▪ i.e. provide evidence
▪ Looking at the temporal dimension, S/D/Q tend to occur in
early stages of a rumour, and then mostly comments later
▪ We’ve looked at persistence, finding that supporting users
tend to post more tweets, i.e., further insisting
https://gate.ac.uk/wiki/pheme-stance.html
PHEME http://www.pheme.eu
Veracity classification
▪Classification Problem
– Rumour level
– Classes: true, false or unverified
▪Why
– Help journalists and the public determine if report
is accurate.
– Flagging false rumours to warn users.
PHEME http://www.pheme.eu
Features
▪Features similar to stance classification
▪NB: those features need to be computed for
the entire rumour instead of individual posts
▪E.g. adding feature values over all posts within
the rumour and performing normalisation.
▪ Stance of individual posts is an important feature
A. Aker, L. Derczynski, K. Bontcheva. Simple Open Stance Classification for Rumour Analysis. Recent Advances in
Natural Language Processing, 2017.
PHEME http://www.pheme.eu
Veracity classification
▪Evaluation: determine, after each tweet, ability
to classify veracity of rumour.
Classifier Without Stance With stance over
time
J48 64.61 69.84
Random Forest 68.20 70.76
K-NN 63.69 69.84
PHEME http://www.pheme.eu 32
Journalism dashboard
PHEME http://www.pheme.eu
Tweet Mapping
For geo-tagged tweets
in cluster, map their
origin
PHEME http://www.pheme.eu
Acknowledgements
Project partially supported by the European Union/EU under the
Information and Communication Technologies (ICT) theme of
the 7th Framework Programme
for R&D (FP7), grant PHEME (611233)”
34
This document does not represent the opinion of the European Community, and the European Community is not responsible for any use
that might be made of its content

More Related Content

Similar to Detection and Resolution of Rumours in Social Media

Presentation / invited talk by Kalina Bontcheva at Digilience 2019, Oct 2019
Presentation / invited talk by Kalina Bontcheva at Digilience 2019, Oct 2019Presentation / invited talk by Kalina Bontcheva at Digilience 2019, Oct 2019
Presentation / invited talk by Kalina Bontcheva at Digilience 2019, Oct 2019
Weverify
 
How to cope with new communication formats
How to cope with new communication formatsHow to cope with new communication formats
How to cope with new communication formats
cleenknecht
 
We verify @ meta forum 2020 - Dec 2 2020
We verify @ meta forum 2020 - Dec 2 2020We verify @ meta forum 2020 - Dec 2 2020
We verify @ meta forum 2020 - Dec 2 2020
Weverify
 
Social Media for Event Organisers
Social Media for Event OrganisersSocial Media for Event Organisers
Social Media for Event Organisers
RichieKelly
 
Ten News - Digital workshop
Ten News - Digital workshopTen News - Digital workshop
Ten News - Digital workshop
TenDigital
 
WWT 2010: Female Ferocity
WWT 2010: Female FerocityWWT 2010: Female Ferocity
WWT 2010: Female Ferocity
Rad Campaign and Women Who Tech
 
Video Activism
Video ActivismVideo Activism
Hands on verification training for journalists
Hands on verification training for journalists Hands on verification training for journalists
Hands on verification training for journalists
Weverify
 
Engaging The Conversation ( Homeland Security Conference)
Engaging  The  Conversation ( Homeland  Security  Conference)Engaging  The  Conversation ( Homeland  Security  Conference)
Engaging The Conversation ( Homeland Security Conference)
Fort Bend County Office of Emergency Management
 
Engage and Inspire Language Teachers and Learners with Twitter #InspireTESL
Engage and Inspire Language Teachers and Learners with Twitter #InspireTESLEngage and Inspire Language Teachers and Learners with Twitter #InspireTESL
Engage and Inspire Language Teachers and Learners with Twitter #InspireTESL
faithmarcel
 
Introduction to Social Media
Introduction to Social MediaIntroduction to Social Media
Introduction to Social Media
Michele Martin
 
WeVerify Overview
WeVerify Overview WeVerify Overview
WeVerify Overview
Weverify
 
Detection and resolution of rumours in social media
Detection and resolution of rumours in social mediaDetection and resolution of rumours in social media
Detection and resolution of rumours in social media
ObedullahFahad
 
Social media: Choosing the Most Successful Tools
Social media: Choosing the Most Successful ToolsSocial media: Choosing the Most Successful Tools
Social media: Choosing the Most Successful Tools
Web2LLP
 
Leslie Bradshaw // Online News Association // 9.13.08
Leslie Bradshaw // Online News Association // 9.13.08Leslie Bradshaw // Online News Association // 9.13.08
Leslie Bradshaw // Online News Association // 9.13.08
Leslie Bradshaw
 
Social Media: Should We, Should We Not, or Should We Ignore the Whole Thing
Social Media: Should We, Should We Not, or Should We Ignore the Whole ThingSocial Media: Should We, Should We Not, or Should We Ignore the Whole Thing
Social Media: Should We, Should We Not, or Should We Ignore the Whole Thing
Jim Cahill
 
Hypothesis quick overview 2011-10-19
Hypothesis  quick overview 2011-10-19Hypothesis  quick overview 2011-10-19
Hypothesis quick overview 2011-10-19
dwhly
 
Social Media Trends Summer 2014
Social Media Trends Summer 2014Social Media Trends Summer 2014
Social Media Trends Summer 2014
Daniel Ord Rasmussen
 
Engaging The Conversation
Engaging The ConversationEngaging The Conversation
New business model digital platform
New business model digital platformNew business model digital platform
New business model digital platform
Hendi Kariawan
 

Similar to Detection and Resolution of Rumours in Social Media (20)

Presentation / invited talk by Kalina Bontcheva at Digilience 2019, Oct 2019
Presentation / invited talk by Kalina Bontcheva at Digilience 2019, Oct 2019Presentation / invited talk by Kalina Bontcheva at Digilience 2019, Oct 2019
Presentation / invited talk by Kalina Bontcheva at Digilience 2019, Oct 2019
 
How to cope with new communication formats
How to cope with new communication formatsHow to cope with new communication formats
How to cope with new communication formats
 
We verify @ meta forum 2020 - Dec 2 2020
We verify @ meta forum 2020 - Dec 2 2020We verify @ meta forum 2020 - Dec 2 2020
We verify @ meta forum 2020 - Dec 2 2020
 
Social Media for Event Organisers
Social Media for Event OrganisersSocial Media for Event Organisers
Social Media for Event Organisers
 
Ten News - Digital workshop
Ten News - Digital workshopTen News - Digital workshop
Ten News - Digital workshop
 
WWT 2010: Female Ferocity
WWT 2010: Female FerocityWWT 2010: Female Ferocity
WWT 2010: Female Ferocity
 
Video Activism
Video ActivismVideo Activism
Video Activism
 
Hands on verification training for journalists
Hands on verification training for journalists Hands on verification training for journalists
Hands on verification training for journalists
 
Engaging The Conversation ( Homeland Security Conference)
Engaging  The  Conversation ( Homeland  Security  Conference)Engaging  The  Conversation ( Homeland  Security  Conference)
Engaging The Conversation ( Homeland Security Conference)
 
Engage and Inspire Language Teachers and Learners with Twitter #InspireTESL
Engage and Inspire Language Teachers and Learners with Twitter #InspireTESLEngage and Inspire Language Teachers and Learners with Twitter #InspireTESL
Engage and Inspire Language Teachers and Learners with Twitter #InspireTESL
 
Introduction to Social Media
Introduction to Social MediaIntroduction to Social Media
Introduction to Social Media
 
WeVerify Overview
WeVerify Overview WeVerify Overview
WeVerify Overview
 
Detection and resolution of rumours in social media
Detection and resolution of rumours in social mediaDetection and resolution of rumours in social media
Detection and resolution of rumours in social media
 
Social media: Choosing the Most Successful Tools
Social media: Choosing the Most Successful ToolsSocial media: Choosing the Most Successful Tools
Social media: Choosing the Most Successful Tools
 
Leslie Bradshaw // Online News Association // 9.13.08
Leslie Bradshaw // Online News Association // 9.13.08Leslie Bradshaw // Online News Association // 9.13.08
Leslie Bradshaw // Online News Association // 9.13.08
 
Social Media: Should We, Should We Not, or Should We Ignore the Whole Thing
Social Media: Should We, Should We Not, or Should We Ignore the Whole ThingSocial Media: Should We, Should We Not, or Should We Ignore the Whole Thing
Social Media: Should We, Should We Not, or Should We Ignore the Whole Thing
 
Hypothesis quick overview 2011-10-19
Hypothesis  quick overview 2011-10-19Hypothesis  quick overview 2011-10-19
Hypothesis quick overview 2011-10-19
 
Social Media Trends Summer 2014
Social Media Trends Summer 2014Social Media Trends Summer 2014
Social Media Trends Summer 2014
 
Engaging The Conversation
Engaging The ConversationEngaging The Conversation
Engaging The Conversation
 
New business model digital platform
New business model digital platformNew business model digital platform
New business model digital platform
 

Recently uploaded

TSF - Task 1 - Digital Marketing : Social Media
TSF - Task 1 - Digital Marketing  : Social MediaTSF - Task 1 - Digital Marketing  : Social Media
TSF - Task 1 - Digital Marketing : Social Media
JayaBharne2
 
Using Playlists to Increase YouTube Watch Time
Using Playlists to Increase YouTube Watch TimeUsing Playlists to Increase YouTube Watch Time
Using Playlists to Increase YouTube Watch Time
SocioCosmos
 
Top Google Tools for SEO: Enhance Your Website's Performance
Top Google Tools for SEO: Enhance Your Website's PerformanceTop Google Tools for SEO: Enhance Your Website's Performance
Top Google Tools for SEO: Enhance Your Website's Performance
Elysian Digital Services Pvt. Ltd.
 
On Storytelling & Magic Realism in Rushdie’s Midnight’s Children, Shame, and ...
On Storytelling & Magic Realism in Rushdie’s Midnight’s Children, Shame, and ...On Storytelling & Magic Realism in Rushdie’s Midnight’s Children, Shame, and ...
On Storytelling & Magic Realism in Rushdie’s Midnight’s Children, Shame, and ...
AJHSSR Journal
 
Ahmedabad Call Girls 🔥 9352988975 ❤️ Book High Class Models In Ahmedabad
Ahmedabad Call Girls 🔥 9352988975 ❤️  Book High Class Models In AhmedabadAhmedabad Call Girls 🔥 9352988975 ❤️  Book High Class Models In Ahmedabad
Ahmedabad Call Girls 🔥 9352988975 ❤️ Book High Class Models In Ahmedabad
falaqmalikmodel
 
一比一原版(CSULB毕业证)加州州立大学长滩分校毕业证如何办理
一比一原版(CSULB毕业证)加州州立大学长滩分校毕业证如何办理一比一原版(CSULB毕业证)加州州立大学长滩分校毕业证如何办理
一比一原版(CSULB毕业证)加州州立大学长滩分校毕业证如何办理
exqfuhe
 
Facebook Fan Page Profits to boost your profits today!
Facebook Fan Page Profits  to boost your profits today!Facebook Fan Page Profits  to boost your profits today!
Facebook Fan Page Profits to boost your profits today!
Rohit Gupta
 
TACKLING ILLEGAL LOGGING: PROBLEMS AND CHALLENGES
TACKLING ILLEGAL LOGGING: PROBLEMS AND CHALLENGESTACKLING ILLEGAL LOGGING: PROBLEMS AND CHALLENGES
TACKLING ILLEGAL LOGGING: PROBLEMS AND CHALLENGES
AJHSSR Journal
 
CYBER SECURITY ENHANCEMENT IN NIGERIA. A CASE STUDY OF SIX STATES IN THE NORT...
CYBER SECURITY ENHANCEMENT IN NIGERIA. A CASE STUDY OF SIX STATES IN THE NORT...CYBER SECURITY ENHANCEMENT IN NIGERIA. A CASE STUDY OF SIX STATES IN THE NORT...
CYBER SECURITY ENHANCEMENT IN NIGERIA. A CASE STUDY OF SIX STATES IN THE NORT...
AJHSSR Journal
 
Satta Matka Dpboss Kalyan Fix Game matka
Satta Matka Dpboss Kalyan Fix Game matkaSatta Matka Dpboss Kalyan Fix Game matka
一比一原版(UCSD毕业证)加利福尼亚大学圣迭戈分校毕业证如何办理
一比一原版(UCSD毕业证)加利福尼亚大学圣迭戈分校毕业证如何办理一比一原版(UCSD毕业证)加利福尼亚大学圣迭戈分校毕业证如何办理
一比一原版(UCSD毕业证)加利福尼亚大学圣迭戈分校毕业证如何办理
wozek1
 
一比一原版(lu毕业证书)拉夫堡大学毕业证如何办理
一比一原版(lu毕业证书)拉夫堡大学毕业证如何办理一比一原版(lu毕业证书)拉夫堡大学毕业证如何办理
一比一原版(lu毕业证书)拉夫堡大学毕业证如何办理
aoxfo
 
9861615390 Satta Dpboss Sattamatka matka
9861615390 Satta Dpboss Sattamatka matka9861615390 Satta Dpboss Sattamatka matka
一比一原版(uon毕业证书)澳洲纽卡斯尔大学毕业证如何办理
一比一原版(uon毕业证书)澳洲纽卡斯尔大学毕业证如何办理一比一原版(uon毕业证书)澳洲纽卡斯尔大学毕业证如何办理
一比一原版(uon毕业证书)澳洲纽卡斯尔大学毕业证如何办理
tohufue
 
Pulse Vibes Media.......................
Pulse Vibes Media.......................Pulse Vibes Media.......................
Pulse Vibes Media.......................
davidhasan005
 
Call Girls Hyderabad🔥7023059433🔥Vip Profile Escorts in Hyderabad Available 24/7
Call Girls Hyderabad🔥7023059433🔥Vip Profile Escorts in Hyderabad Available 24/7Call Girls Hyderabad🔥7023059433🔥Vip Profile Escorts in Hyderabad Available 24/7
Call Girls Hyderabad🔥7023059433🔥Vip Profile Escorts in Hyderabad Available 24/7
manji sharman06
 

Recently uploaded (16)

TSF - Task 1 - Digital Marketing : Social Media
TSF - Task 1 - Digital Marketing  : Social MediaTSF - Task 1 - Digital Marketing  : Social Media
TSF - Task 1 - Digital Marketing : Social Media
 
Using Playlists to Increase YouTube Watch Time
Using Playlists to Increase YouTube Watch TimeUsing Playlists to Increase YouTube Watch Time
Using Playlists to Increase YouTube Watch Time
 
Top Google Tools for SEO: Enhance Your Website's Performance
Top Google Tools for SEO: Enhance Your Website's PerformanceTop Google Tools for SEO: Enhance Your Website's Performance
Top Google Tools for SEO: Enhance Your Website's Performance
 
On Storytelling & Magic Realism in Rushdie’s Midnight’s Children, Shame, and ...
On Storytelling & Magic Realism in Rushdie’s Midnight’s Children, Shame, and ...On Storytelling & Magic Realism in Rushdie’s Midnight’s Children, Shame, and ...
On Storytelling & Magic Realism in Rushdie’s Midnight’s Children, Shame, and ...
 
Ahmedabad Call Girls 🔥 9352988975 ❤️ Book High Class Models In Ahmedabad
Ahmedabad Call Girls 🔥 9352988975 ❤️  Book High Class Models In AhmedabadAhmedabad Call Girls 🔥 9352988975 ❤️  Book High Class Models In Ahmedabad
Ahmedabad Call Girls 🔥 9352988975 ❤️ Book High Class Models In Ahmedabad
 
一比一原版(CSULB毕业证)加州州立大学长滩分校毕业证如何办理
一比一原版(CSULB毕业证)加州州立大学长滩分校毕业证如何办理一比一原版(CSULB毕业证)加州州立大学长滩分校毕业证如何办理
一比一原版(CSULB毕业证)加州州立大学长滩分校毕业证如何办理
 
Facebook Fan Page Profits to boost your profits today!
Facebook Fan Page Profits  to boost your profits today!Facebook Fan Page Profits  to boost your profits today!
Facebook Fan Page Profits to boost your profits today!
 
TACKLING ILLEGAL LOGGING: PROBLEMS AND CHALLENGES
TACKLING ILLEGAL LOGGING: PROBLEMS AND CHALLENGESTACKLING ILLEGAL LOGGING: PROBLEMS AND CHALLENGES
TACKLING ILLEGAL LOGGING: PROBLEMS AND CHALLENGES
 
CYBER SECURITY ENHANCEMENT IN NIGERIA. A CASE STUDY OF SIX STATES IN THE NORT...
CYBER SECURITY ENHANCEMENT IN NIGERIA. A CASE STUDY OF SIX STATES IN THE NORT...CYBER SECURITY ENHANCEMENT IN NIGERIA. A CASE STUDY OF SIX STATES IN THE NORT...
CYBER SECURITY ENHANCEMENT IN NIGERIA. A CASE STUDY OF SIX STATES IN THE NORT...
 
Satta Matka Dpboss Kalyan Fix Game matka
Satta Matka Dpboss Kalyan Fix Game matkaSatta Matka Dpboss Kalyan Fix Game matka
Satta Matka Dpboss Kalyan Fix Game matka
 
一比一原版(UCSD毕业证)加利福尼亚大学圣迭戈分校毕业证如何办理
一比一原版(UCSD毕业证)加利福尼亚大学圣迭戈分校毕业证如何办理一比一原版(UCSD毕业证)加利福尼亚大学圣迭戈分校毕业证如何办理
一比一原版(UCSD毕业证)加利福尼亚大学圣迭戈分校毕业证如何办理
 
一比一原版(lu毕业证书)拉夫堡大学毕业证如何办理
一比一原版(lu毕业证书)拉夫堡大学毕业证如何办理一比一原版(lu毕业证书)拉夫堡大学毕业证如何办理
一比一原版(lu毕业证书)拉夫堡大学毕业证如何办理
 
9861615390 Satta Dpboss Sattamatka matka
9861615390 Satta Dpboss Sattamatka matka9861615390 Satta Dpboss Sattamatka matka
9861615390 Satta Dpboss Sattamatka matka
 
一比一原版(uon毕业证书)澳洲纽卡斯尔大学毕业证如何办理
一比一原版(uon毕业证书)澳洲纽卡斯尔大学毕业证如何办理一比一原版(uon毕业证书)澳洲纽卡斯尔大学毕业证如何办理
一比一原版(uon毕业证书)澳洲纽卡斯尔大学毕业证如何办理
 
Pulse Vibes Media.......................
Pulse Vibes Media.......................Pulse Vibes Media.......................
Pulse Vibes Media.......................
 
Call Girls Hyderabad🔥7023059433🔥Vip Profile Escorts in Hyderabad Available 24/7
Call Girls Hyderabad🔥7023059433🔥Vip Profile Escorts in Hyderabad Available 24/7Call Girls Hyderabad🔥7023059433🔥Vip Profile Escorts in Hyderabad Available 24/7
Call Girls Hyderabad🔥7023059433🔥Vip Profile Escorts in Hyderabad Available 24/7
 

Detection and Resolution of Rumours in Social Media

  • 1. PHEME http://www.pheme.eu Detection and Resolution of Rumours in Social Media Kalina Bontcheva University of Sheffield @kbontcheva © The University of Sheffield, 2014-2017 This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike Licence
  • 2. PHEME http://www.pheme.eu Veracity: the 4th V of Big Data I coined the term phemes memes are thematic motifs that spread through social media in ways analogous to genetic traits phemes add truthfulness and deception to the mix named after ancient Greek Pheme,“embodiment of fame and notoriety, her favour being notability, her wrath being scandalous rumours" 2 Focus on a fourth crucial challenge: veracity http://en.wikipedia.org/wiki/Pheme
  • 3. PHEME http://www.pheme.eu What is a Rumour? Our definition “a circulating story of questionable veracity, which is apparently credible but hard to verify, and produces sufficient skepticism and/or anxiety” 3
  • 5. PHEME http://www.pheme.eu Social Media is Rife with Phemes (2)
  • 6. PHEME http://www.pheme.eu Social Media is Rife with Phemes (3)
  • 7. PHEME http://www.pheme.eu Rumour analysis: The Challenges Journalists do it mostly manually Rumour analysis is challenging Some rumours could take days, weeks or even months to die out (or even never!) Ill-meaning humans can currently outsmart computers (and humans) and appear genuine Role of conversational threads in rumour analysis Tweet-rot and deletions Real-time analysis of temporally dynamic phemes
  • 8. PHEME http://www.pheme.eu What: High Level Goal Create a computational framework for automatic discovery and verification of rumours, at scale and fast Draw on: NLP: what’s said web science: a priori knowledge from Linked Data social science: who spread it, why and how information visualisation: visual analytics
  • 9. PHEME http://www.pheme.eu Content verification ● A number of projects focussed on detecting inaccurate reports, which has recently been relabelled as “fake news”. ● Most projects deal with what is known as “fact-checking”, comparing sources to validate reports, with an assumption that other sources exist.
  • 10. PHEME http://www.pheme.eu Work by others on rumours ▪ Emergent.info - the most similar project to PHEME - tracking unverified reports for debunking/confirmation ▪ The project has however been discontinued ▪ Owing to the project being totally manual and lacking automation, maintenance was not viable
  • 11. PHEME http://www.pheme.eu Work by others on rumours ▪ Other projects on rumours largely focussed on visualisation, tracking rumours input by the user:
  • 13. PHEME http://www.pheme.eu Twitter Collector: cloud.gate.ac.uk Docs & YouTube videos: https://cloud.gate.ac.uk/info/help/twitter-collector.html
  • 14. PHEME http://www.pheme.eu Live Stats and Fine Tuning Tweets collected in Amazon buckets for Download Processing on GATE Cloud
  • 15. PHEME http://www.pheme.eu GATE Cloud services 24 services, multiple languages
  • 17. PHEME http://www.pheme.eu Dataset Overview Data collection and sampling Twitter public stream Retweet threshold to establish salience Journalists identified rumourous source tweets and grouped them into stories
  • 19. PHEME http://www.pheme.eu Analysis Stages 1)Rumour detection: identify pieces of information that need to be verified. 2)Rumour stance classification: classify public stance towards the rumour. 3)Veracity classification: determine if a rumour is true, false, or remains unverified.
  • 20. PHEME http://www.pheme.eu Experimental Methodology – Leave-one-event-out: train on n-1 event, and test on the n-th event. – Cross-validation across events – not across rumours or tweet sets
  • 21. PHEME http://www.pheme.eu Rumour detection ▪Classification problem – Tweet level – Classes: rumour or non-rumour ▪Why – Identify tweets reporting unverified information. – Necessary for subsequent stance and veracity classification. – “Michael Brown was involved in a robbery before being shot by police” -> needs context
  • 22. PHEME http://www.pheme.eu Rumour detection: Results A. Zubiaga, M. Liakata, R. Procter . Exploiting Context for Rumour Detection in Social Media. International Conference on Social Informatics, 109-123. 2017.
  • 23. PHEME http://www.pheme.eu Rumour Stance classification ▪Classification Problem – Tweet level – Classes: support, deny, question, comment ▪Why – Show journalists the unfolding online debates on a given pheme – Aid veracity classification
  • 25. PHEME http://www.pheme.eu Rumour Stance & Veracity ▪ Manual Analysis [Mendoza et al. 2010] ▪ True rumours: 95% “support”, 4% “questioning” and only 0.4% were “denies” → True rumours are largely supported by other Twitter users ▪ False rumours: 38% “denies” and 17% “questioning” → Stance is an important factor for verification step
  • 26. PHEME http://www.pheme.eu Features: Stance Classification ▪ Syntactic: #negations, POS ▪ Lexical: e.g. bag of words, key words ▪ Lexicon: word embeddings, brown clusters ▪ Indicator: e.g. “this is nonsense”, “is a hoax” ▪ User metadata: e.g. verified, #followers/followees ▪ Semantic: e.g. emoticons, sentiment, NEs ▪ Message metadata: e.g. length of the post, URL ▪ Content formatting: e.g. capital words ratio ▪ Punctuation: e.g. ?, …, ! A. Aker, L. Derczynski, K. Bontcheva. Simple Open Stance Classification for Rumour Analysis. Recent Advances in Natural Language Processing, 2017.
  • 27. PHEME http://www.pheme.eu USFD on RumourEval Data ▪ Applying traditional ML ▪ Focus on “problem specific” features – Capturing Surprise, Doubt, Certainty and Support towards rumourous tweets – Similarity to initial tweet ▪ Random Forest (best performing traditional ML) with features reported by related work yields to 76.54 accuracy ▪ Adding “problem specific” features yields 79.02 accuracy (LSTM 78.4). A. Aker, L. Derczynski, K. Bontcheva. Simple Open Stance Classification for Rumour Analysis. Recent Advances in Natural Language Processing, 2017.
  • 28. PHEME http://www.pheme.eu Stance Classification: Findings ▪ Supporting tweets are more likely to include links. ▪ i.e. provide evidence ▪ Looking at the temporal dimension, S/D/Q tend to occur in early stages of a rumour, and then mostly comments later ▪ We’ve looked at persistence, finding that supporting users tend to post more tweets, i.e., further insisting https://gate.ac.uk/wiki/pheme-stance.html
  • 29. PHEME http://www.pheme.eu Veracity classification ▪Classification Problem – Rumour level – Classes: true, false or unverified ▪Why – Help journalists and the public determine if report is accurate. – Flagging false rumours to warn users.
  • 30. PHEME http://www.pheme.eu Features ▪Features similar to stance classification ▪NB: those features need to be computed for the entire rumour instead of individual posts ▪E.g. adding feature values over all posts within the rumour and performing normalisation. ▪ Stance of individual posts is an important feature A. Aker, L. Derczynski, K. Bontcheva. Simple Open Stance Classification for Rumour Analysis. Recent Advances in Natural Language Processing, 2017.
  • 31. PHEME http://www.pheme.eu Veracity classification ▪Evaluation: determine, after each tweet, ability to classify veracity of rumour. Classifier Without Stance With stance over time J48 64.61 69.84 Random Forest 68.20 70.76 K-NN 63.69 69.84
  • 33. PHEME http://www.pheme.eu Tweet Mapping For geo-tagged tweets in cluster, map their origin
  • 34. PHEME http://www.pheme.eu Acknowledgements Project partially supported by the European Union/EU under the Information and Communication Technologies (ICT) theme of the 7th Framework Programme for R&D (FP7), grant PHEME (611233)” 34 This document does not represent the opinion of the European Community, and the European Community is not responsible for any use that might be made of its content

Editor's Notes

  1. Phemes are
  2. As soon as a news story broke and the journalists informed us about the event being of interest, we set up the data collection process for that event, tracking the main hashtags and keywords pertaining to the event as a whole. Twitter's public stream was accessed through the 'statuses/filter' endpoint4, making use of the 'track' parameter to specify the keywords. Note that while we may have missed the very early tweets about an event for launching the collection slightly after the news breaks, we still keep collecting subsequent retweets to those early tweets, which makes it much more likely to retrieve the most retweeted tweets from the very first minutes. Once the tweets below the source tweet are scraped, the script performs the following two steps to make sure that the whole conversation started by the source tweet is retrieved: (1) The script also checks if there are more pages with responses (since Twitter pages the responses), and (2) the script then retrieves, recursively, the responses to all those responding tweets, which enables retrieval of nested interactions. This process gives us the tweet IDs of all the tweets responding to the source tweet, with which we can form the whole tree. To collect all the metadata for those tweets, we then access the Twitter API using the tweet IDs; specifically, we use the 'statuses/lookup’ endpoint6, which provides all the metadata for up to 100 tweets at once.