Credibility Ranking of Tweets during High Impact Events
Aditi Gupta & Ponnurangam Kumaraguru
PSOSM@WWW
April 17, 2012
  
Problem Motivation

precog.iiitd.edu.in
IIIT-Delhi
  
Problem Motivation
Information
Opinion
Spam
  
Outline
•  Research statement
•  Architecture
•  Data collection
•  Analysis
•  Results
•  Implementation
•  Future direction
  
Research Statement
•  Identify parameters that affect credibility of content on Twitter
•  Develop a semi-automated algorithm to assess credibility of tweets
  
Terminology
TWEET: A status (140 chars)
HASHTAG
RETWEET
USER PROFILE
URL
USER NAME @screen_name
FOLLOWERS
Tweets
@-MENTIONS
  
Credibility
•  “The quality of being trusted and believed in.”
•  In this research
   –  Assess the credibility of the information in the content of a tweet (message) by a user on Twitter.
   –  A tweet is said to contain credible information about a news event if you trust or believe that information in the tweet to be correct / true.
  
News on Twitter
Topics on Twitter
•  News Events (e.g. #Irene, #Libyacrisis)
•  Chit-Chat (e.g. #nothingwrongwith, #goodmorningtwitter)
News on Twitter
•  Credible Information
•  Non-Credible Information: Fake news / Rumors / Spam / Personal Opinions
  
Our Contributions
•  30% of tweets provide information (17% credible information) and 14% were spam
•  Linear logistic regression
   –  Content based: #unique characters, swear words, pronouns and emoticons
   –  User based: #followers and length of username
•  Present automated algorithm (supervised ML and relevance feedback) to assess credibility in tweets
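As a minimal sketch of the kind of analysis named above, assuming a binary credible / non-credible label and two numeric features, logistic regression can be fit with plain gradient descent. The toy feature values, labels, learning rate, and function names below are illustrative assumptions, not the authors' data or implementation.

```python
import math

def sigmoid(z):
    """Logistic function, mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # derivative of the log loss w.r.t. the linear score
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict_proba(weights, bias, x):
    """Estimated probability that a tweet with features x is credible."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Hypothetical toy data: [number of swear words, log10(number of followers)]
X = [[3, 1.0], [2, 1.5], [0, 4.0], [0, 3.5], [1, 2.0], [0, 5.0]]
y = [0, 0, 1, 1, 0, 1]  # 1 = credible, 0 = not credible (made-up labels)
weights, bias = fit_logistic(X, y)
```

The sign and size of each fitted weight indicate how a feature moves the predicted credibility in this toy setting; on real data the features would normally be scaled first.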
  
Data Statistics
Total tweets                    35,748,136
Total unique users              6,877,320
Tweets with URLs                4,973,457
Number of singleton tweets      22,481,898
Number of re-tweets / replies   13,266,238
Start date                      12th July, 2011
End date                        30th August, 2011
•  High impact events:
   –  More than 25K tweets
   –  More than 48 hours in trending topics
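The two high-impact criteria above can be sketched as a simple filter. This assumes both conditions must hold together; the `Event` type, its field names, and the sample values are hypothetical.

```python
from dataclasses import dataclass

MIN_TWEETS = 25_000        # slide criterion: more than 25K tweets
MIN_TRENDING_HOURS = 48    # slide criterion: more than 48 hours in trending topics

@dataclass
class Event:
    name: str
    tweet_count: int
    trending_hours: float

def is_high_impact(event: Event) -> bool:
    """Assumption: an event qualifies only if it meets both thresholds."""
    return (event.tweet_count > MIN_TWEETS
            and event.trending_hours > MIN_TRENDING_HOURS)

# Hypothetical events, for illustration only
events = [
    Event("event A", 30_000, 50.0),
    Event("event B", 4_000, 60.0),    # trends long enough, too few tweets
    Event("event C", 90_000, 12.0),   # many tweets, trends too briefly
]
high_impact = [e.name for e in events if is_high_impact(e)]
```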
  
  
Data Statistics
Events                             Tweets   Trending Topics
UK Riots                           542,685  #ukriots, #londonriots, #prayforlondon
Libya Crisis                       389,506  libya, tripoli
Earthquake in Virginia             277,604  #earthquake, Earthquake in SF
JanLokPal Bill Agitation           182,692  Anna Hazare, #janlokpal, #anna
Apple CEO Steve Jobs resigns       158,816  Steve Jobs, Tim Cook, Apple CEO
US Downgrading                     148,047  S&P, AAA to AA
Hurricane Irene                    90,237   Hurricane Irene, Tropical Storm Irene
Google acquires Motorola Mobility  68,527   Google, Motorola Mobility
News of the World Scandal          67,602   Rupert Murdoch, #murdoch
Abercrombie & Fitch stocks drop    54,763   Abercrombie & Fitch, A&F
Muppets Bert and Ernie were gay    52,401   Bert and Ernie
Indiana State Fair Tragedy         49,924   Indiana State Fair
Mumbai Blast, 2011                 32,156   #mumbaiblast, Dadar, #needhelp
New Facebook Messenger             28,206   Facebook Messenger
  
Architecture
  
Human Annotation
•  For each tweet:
   –  Tweet contains information about the event. Rate the credibility of information present:
      •  Definitely Credible
      •  Seems Credible
      •  Definitely Incredible
      •  I can’t Decide
   –  Tweet is related to the news event, but contains no information
   –  Tweet is not related to news event
   –  Skip tweet
•  Each tweet annotated by 3 people
•  Inter-annotator agreement (Cronbach Alpha) = 0.748
•  30% of tweets provide information (17% credible information) and 14% were spam
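For readers unfamiliar with the agreement statistic quoted above, here is a minimal sketch of Cronbach's alpha with each annotator treated as one "item". Mapping the answer options to a numeric scale, and the sample ratings themselves, are assumptions made for illustration; the deck does not state how the labels were coded.

```python
from statistics import variance

def cronbach_alpha(ratings_by_annotator):
    """ratings_by_annotator: k lists, one per annotator, each holding that
    annotator's numeric ratings for the same n tweets in the same order."""
    k = len(ratings_by_annotator)
    if k < 2:
        raise ValueError("need at least two annotators")
    # Sum of each annotator's rating variance across tweets
    item_variances = sum(variance(r) for r in ratings_by_annotator)
    # Variance of the per-tweet totals (sum over annotators)
    totals = [sum(scores) for scores in zip(*ratings_by_annotator)]
    total_variance = variance(totals)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical ratings from 3 annotators for 5 tweets on an assumed 1-4 scale
ratings = [
    [4, 3, 1, 2, 4],
    [4, 2, 1, 2, 3],
    [3, 3, 1, 1, 4],
]
alpha = cronbach_alpha(ratings)
```

Values near 1 indicate that annotators rate tweets consistently with one another; identical ratings from every annotator give exactly 1.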
  
ANALYSIS
  
Feature Sets
Message based features
•  Length of the tweet
•  Number of words
•  Number of unique characters
•  Number of hashtags
•  Number of retweets
•  Number of swear language words
•  Number of positive sentiment words
•  Number of negative sentiment words
•  Tweet is a retweet
•  Number of special symbols [$, !]
•  Number of emoticons [:-), :-(]
•  Tweet is a reply
•  Number of @-mentions
•  Time lapse since the query
•  Has URL
•  Number of URLs
•  Use of URL shortener service
Source based features
•  Registration age of the user
•  Number of statuses
•  Number of followers
•  Number of friends
•  Is a verified account
•  Length of description
•  Length of screen name
•  Has URL
•  Ratio of followers to followees
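A few of the features listed above can be computed directly from the tweet text and a user record. The sketch below covers only a subset. The regular expressions, the dictionary keys, and the shape of the `user` record are assumptions for illustration, not the authors' extraction code or the Twitter API schema.

```python
import re

URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")
EMOTICONS = (":-)", ":-(")   # the two emoticons named on the slide
SPECIAL_SYMBOLS = "$!"       # the two symbols named on the slide

def message_features(text):
    """Subset of the message based features, computed from raw tweet text."""
    urls = URL_RE.findall(text)
    return {
        "length": len(text),
        "num_words": len(text.split()),
        "num_unique_chars": len(set(text)),
        "num_hashtags": len(HASHTAG_RE.findall(text)),
        "num_mentions": len(MENTION_RE.findall(text)),
        "num_special_symbols": sum(text.count(s) for s in SPECIAL_SYMBOLS),
        "num_emoticons": sum(text.count(e) for e in EMOTICONS),
        "has_url": bool(urls),
        "num_urls": len(urls),
    }

def source_features(user):
    """Subset of the source based features; `user` is a hypothetical dict."""
    followees = user["friends"]
    return {
        "num_followers": user["followers"],
        "num_friends": followees,
        "screen_name_length": len(user["screen_name"]),
        "is_verified": user["verified"],
        "follower_followee_ratio": user["followers"] / followees if followees else 0.0,
    }

tweet = "Earthquake felt in #Virginia! Stay safe @example_user http://example.com/a :-)"
features = message_features(tweet)
profile = source_features(
    {"followers": 200, "friends": 100, "screen_name": "abc", "verified": False}
)
```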
  
PRF
•  PRF (Pseudo Relevance Feedback)
   –  Extract k ranked documents and then re-rank those documents according to a defined score
   –  Re-ranking based on ‘context’ of the event
   –  Top n unigrams based on BM25 metric
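The PRF steps above can be sketched as follows, under stated assumptions: the top k tweets of an initial ranking are treated as pseudo-relevant, each unigram is weighted by its summed BM25 contribution over those k tweets, and every tweet is then re-scored by how many of the top n unigrams it contains. The deck does not give the exact scoring function, so the overlap score, the whitespace tokenizer, and the parameter values `k1` and `b` are illustrative choices.

```python
import math
from collections import Counter

def tokenize(text):
    """Assumed tokenizer: lowercase and split on whitespace."""
    return text.lower().split()

def bm25_term_weights(feedback_docs, all_docs, k1=1.2, b=0.75):
    """Sum each unigram's BM25 contribution over the feedback documents.
    IDF and average document length come from the whole collection."""
    n_docs = len(all_docs)
    avg_len = sum(len(d) for d in all_docs) / n_docs
    doc_freq = Counter(term for d in all_docs for term in set(d))
    weights = Counter()
    for doc in feedback_docs:
        for term, freq in Counter(doc).items():
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            norm = freq + k1 * (1 - b + b * len(doc) / avg_len)
            weights[term] += idf * freq * (k1 + 1) / norm
    return weights

def prf_rerank(ranked_tweets, k=2, n=3):
    """Re-rank tweets by overlap with the top-n unigrams of the top-k tweets.
    Ties keep their original relative order."""
    docs = [tokenize(t) for t in ranked_tweets]
    weights = bm25_term_weights(docs[:k], docs)
    context = {term for term, _ in weights.most_common(n)}
    scored = [(len(context & set(doc)), -rank, tweet)
              for rank, (doc, tweet) in enumerate(zip(docs, ranked_tweets))]
    scored.sort(reverse=True)
    return [tweet for _, _, tweet in scored]

# Hypothetical initial ranking with an off-topic tweet in third place
initial = [
    "earthquake virginia magnitude",
    "earthquake virginia damage",
    "good morning everyone",
    "earthquake magnitude update",
]
reranked = prf_rerank(initial, k=2, n=3)
```

In this toy run the off-topic tweet shares no unigram with the event context and drops to the bottom of the list.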
  
Algorithm
  
Evaluation Metric
Evaluation Metric: NDCG (Normalized Discounted Cumulative Gain)
NDCG is the standard metric used to evaluate “graded” results
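The formula itself did not survive on this slide, so the sketch below uses one common form of NDCG: the gain is the relevance grade, the discount is log2(rank + 1) with ranks starting at 1, and the result is normalized by the DCG of the ideal ordering. Other variants (for example a gain of 2^rel - 1) exist, and the deck does not say which one was used.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of relevance grades listed in rank order."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(relevances, n=None):
    """NDCG at cutoff n (whole list if n is None), in the range [0, 1]."""
    ranked = relevances[:n] if n is not None else relevances
    # Ideal ordering: the same grades sorted from most to least relevant
    ideal = sorted(relevances, reverse=True)[:len(ranked)]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0

# Hypothetical graded credibility labels for a ranked list of five tweets
score = ndcg([3, 2, 3, 0, 1], n=5)
```

A ranking that already lists tweets from most to least credible scores exactly 1, and placing highly graded tweets lower in the list reduces the score.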
  
Ranking Results
•  Tweet and user based features both contribute to determining credibility: it matters “what you post and who you are”
•  Context based (PRF) ranking greatly enhances the performance (up to 0.74 NDCG)
  
Web-portal Implementation
  
Limitations & Future Work
•  Human input required
   –  Need to develop self-learning (completely automated) solutions
•  Analyze events with a greater temporal variation
•  Understand users' perspective of credibility of content on Twitter
  
Challenges
•  Large volume of data being generated
•  Real-time solutions needed
•  Only 140 characters
•  Informal language
  
Acknowledgements
•  All members of our research group
•  Dept. of Information Technology, Government of India
  
  
Questions?
  
 
	
  
	
  
	
  
	
  
	
  

Thank You!
aditig@iiit.ac.in
pk@iiitd.ac.in
precog.iiitd.edu.in

For any further information, please write to pk@iiitd.ac.in
  

More Related Content

Similar to Credibility Ranking of Tweets during High Impact Events

Privacy and Security on Online Social Media: Workshop on Data Analytics & Its...
Privacy and Security on Online Social Media: Workshop on Data Analytics & Its...Privacy and Security on Online Social Media: Workshop on Data Analytics & Its...
Privacy and Security on Online Social Media: Workshop on Data Analytics & Its...IIIT Hyderabad
 
Seattle Biz-Tech Summit 10-2015 CyberSecurity and the Board
Seattle Biz-Tech Summit 10-2015 CyberSecurity and the BoardSeattle Biz-Tech Summit 10-2015 CyberSecurity and the Board
Seattle Biz-Tech Summit 10-2015 CyberSecurity and the BoardLERNER Consulting
 
Seattle Biz-Tech Summit 10-2015 CyberSecurity and the Board
Seattle Biz-Tech Summit 10-2015 CyberSecurity and the BoardSeattle Biz-Tech Summit 10-2015 CyberSecurity and the Board
Seattle Biz-Tech Summit 10-2015 CyberSecurity and the BoardLERNER Consulting
 
OSINT using Twitter & Python
OSINT using Twitter & PythonOSINT using Twitter & Python
OSINT using Twitter & Python37point2
 
New Albany Twitter Seminar
New Albany Twitter SeminarNew Albany Twitter Seminar
New Albany Twitter SeminarKyle Lacy
 
Virtual Gov Day - Introduction & Keynote - Alan Webber, IDC Government Insights
Virtual Gov Day - Introduction & Keynote - Alan Webber, IDC Government InsightsVirtual Gov Day - Introduction & Keynote - Alan Webber, IDC Government Insights
Virtual Gov Day - Introduction & Keynote - Alan Webber, IDC Government InsightsSplunk
 
Classification of Sentiment Analysis on Tweets Based on Techniques from Machi...
Classification of Sentiment Analysis on Tweets Based on Techniques from Machi...Classification of Sentiment Analysis on Tweets Based on Techniques from Machi...
Classification of Sentiment Analysis on Tweets Based on Techniques from Machi...IRJET Journal
 
“Twitter Will Win — And With the Right Plan of Attack, So Will You”
“Twitter Will Win — And With the Right Plan of Attack, So Will You”“Twitter Will Win — And With the Right Plan of Attack, So Will You”
“Twitter Will Win — And With the Right Plan of Attack, So Will You”Content Marketing World
 
Twitter recruiting McGill Sept 2013
Twitter recruiting McGill Sept 2013Twitter recruiting McGill Sept 2013
Twitter recruiting McGill Sept 2013Philip Youssef
 
IRJET- An Experimental Evaluation of Mechanical Properties of Bamboo Fiber Re...
IRJET- An Experimental Evaluation of Mechanical Properties of Bamboo Fiber Re...IRJET- An Experimental Evaluation of Mechanical Properties of Bamboo Fiber Re...
IRJET- An Experimental Evaluation of Mechanical Properties of Bamboo Fiber Re...IRJET Journal
 
IRJET- Tweet Segmentation and its Application to Named Entity Recognition
IRJET- Tweet Segmentation and its Application to Named Entity RecognitionIRJET- Tweet Segmentation and its Application to Named Entity Recognition
IRJET- Tweet Segmentation and its Application to Named Entity RecognitionIRJET Journal
 
Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hu...
Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hu...Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hu...
Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hu...IIIT Hyderabad
 
Computational Verification Challenges in Social Media
Computational Verification Challenges in Social MediaComputational Verification Challenges in Social Media
Computational Verification Challenges in Social MediaSymeon Papadopoulos
 
Social media and research leadership
Social media and research leadershipSocial media and research leadership
Social media and research leadershipSusie Macfarlane
 
Kokomo Twitter Seminar
Kokomo Twitter SeminarKokomo Twitter Seminar
Kokomo Twitter SeminarKyle Lacy
 
Cyberskills shortage: Where is the cyber workforce of tomorrow
Cyberskills shortage:Where is the cyber workforce of tomorrowCyberskills shortage:Where is the cyber workforce of tomorrow
Cyberskills shortage: Where is the cyber workforce of tomorrowStephen Cobb
 

Similar to Credibility Ranking of Tweets during High Impact Events (20)

Privacy and Security on Online Social Media: Workshop on Data Analytics & Its...
Privacy and Security on Online Social Media: Workshop on Data Analytics & Its...Privacy and Security on Online Social Media: Workshop on Data Analytics & Its...
Privacy and Security on Online Social Media: Workshop on Data Analytics & Its...
 
Seattle Biz-Tech Summit 10-2015 CyberSecurity and the Board
Seattle Biz-Tech Summit 10-2015 CyberSecurity and the BoardSeattle Biz-Tech Summit 10-2015 CyberSecurity and the Board
Seattle Biz-Tech Summit 10-2015 CyberSecurity and the Board
 
Seattle Biz-Tech Summit 10-2015 CyberSecurity and the Board
Seattle Biz-Tech Summit 10-2015 CyberSecurity and the BoardSeattle Biz-Tech Summit 10-2015 CyberSecurity and the Board
Seattle Biz-Tech Summit 10-2015 CyberSecurity and the Board
 
OSINT using Twitter & Python
OSINT using Twitter & PythonOSINT using Twitter & Python
OSINT using Twitter & Python
 
5 Ways To Fight A DDoS Attack
5 Ways To Fight A DDoS Attack5 Ways To Fight A DDoS Attack
5 Ways To Fight A DDoS Attack
 
New Albany Twitter Seminar
New Albany Twitter SeminarNew Albany Twitter Seminar
New Albany Twitter Seminar
 
Virtual Gov Day - Introduction & Keynote - Alan Webber, IDC Government Insights
Virtual Gov Day - Introduction & Keynote - Alan Webber, IDC Government InsightsVirtual Gov Day - Introduction & Keynote - Alan Webber, IDC Government Insights
Virtual Gov Day - Introduction & Keynote - Alan Webber, IDC Government Insights
 
Classification of Sentiment Analysis on Tweets Based on Techniques from Machi...
Classification of Sentiment Analysis on Tweets Based on Techniques from Machi...Classification of Sentiment Analysis on Tweets Based on Techniques from Machi...
Classification of Sentiment Analysis on Tweets Based on Techniques from Machi...
 
“Twitter Will Win — And With the Right Plan of Attack, So Will You”
“Twitter Will Win — And With the Right Plan of Attack, So Will You”“Twitter Will Win — And With the Right Plan of Attack, So Will You”
“Twitter Will Win — And With the Right Plan of Attack, So Will You”
 
Twitter recruiting McGill Sept 2013
Twitter recruiting McGill Sept 2013Twitter recruiting McGill Sept 2013
Twitter recruiting McGill Sept 2013
 
IRJET- An Experimental Evaluation of Mechanical Properties of Bamboo Fiber Re...
IRJET- An Experimental Evaluation of Mechanical Properties of Bamboo Fiber Re...IRJET- An Experimental Evaluation of Mechanical Properties of Bamboo Fiber Re...
IRJET- An Experimental Evaluation of Mechanical Properties of Bamboo Fiber Re...
 
IRJET- Tweet Segmentation and its Application to Named Entity Recognition
IRJET- Tweet Segmentation and its Application to Named Entity RecognitionIRJET- Tweet Segmentation and its Application to Named Entity Recognition
IRJET- Tweet Segmentation and its Application to Named Entity Recognition
 
Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hu...
Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hu...Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hu...
Faking Sandy: Characterizing and Identifying Fake Images on Twitter during Hu...
 
2015 Atlanta CHIME Lead Forum
2015 Atlanta CHIME Lead Forum2015 Atlanta CHIME Lead Forum
2015 Atlanta CHIME Lead Forum
 
2015 Atlanta CHIME Lead Forum
2015 Atlanta CHIME Lead Forum2015 Atlanta CHIME Lead Forum
2015 Atlanta CHIME Lead Forum
 
2015 Atlanta CHIME Lead Forum
2015 Atlanta CHIME Lead Forum 2015 Atlanta CHIME Lead Forum
2015 Atlanta CHIME Lead Forum
 
Computational Verification Challenges in Social Media
Computational Verification Challenges in Social MediaComputational Verification Challenges in Social Media
Computational Verification Challenges in Social Media
 
Social media and research leadership
Social media and research leadershipSocial media and research leadership
Social media and research leadership
 
Kokomo Twitter Seminar
Kokomo Twitter SeminarKokomo Twitter Seminar
Kokomo Twitter Seminar
 
Cyberskills shortage: Where is the cyber workforce of tomorrow
Cyberskills shortage:Where is the cyber workforce of tomorrowCyberskills shortage:Where is the cyber workforce of tomorrow
Cyberskills shortage: Where is the cyber workforce of tomorrow
 

More from IIIT Hyderabad

Responsible & Safe AI Systems at ACM India ROCS at IIT Bombay
Responsible & Safe AI Systems at ACM India ROCS at IIT BombayResponsible & Safe AI Systems at ACM India ROCS at IIT Bombay
Responsible & Safe AI Systems at ACM India ROCS at IIT BombayIIIT Hyderabad
 
International Collaboration: Experiences, Challenges, Success stories
International Collaboration: Experiences, Challenges, Success storiesInternational Collaboration: Experiences, Challenges, Success stories
International Collaboration: Experiences, Challenges, Success storiesIIIT Hyderabad
 
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBias
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBiasResponsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBias
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBiasIIIT Hyderabad
 
Identify, Inspect and Intervene Multimodal Fake News
Identify, Inspect and Intervene Multimodal Fake NewsIdentify, Inspect and Intervene Multimodal Fake News
Identify, Inspect and Intervene Multimodal Fake NewsIIIT Hyderabad
 
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafety
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafetyData Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafety
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafetyIIIT Hyderabad
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...IIIT Hyderabad
 
Beyond the Surface: A Computational Exploration of Linguistic Ambiguity
Beyond the Surface: A Computational Exploration of Linguistic AmbiguityBeyond the Surface: A Computational Exploration of Linguistic Ambiguity
Beyond the Surface: A Computational Exploration of Linguistic AmbiguityIIIT Hyderabad
 
Data Science for Social Good: #LegalNLP #AlgorithmicBias...
Data Science for Social Good:                      #LegalNLP #AlgorithmicBias...Data Science for Social Good:                      #LegalNLP #AlgorithmicBias...
Data Science for Social Good: #LegalNLP #AlgorithmicBias...IIIT Hyderabad
 
How to Write a (Good) Research Paper
How to Write a (Good) Research Paper How to Write a (Good) Research Paper
How to Write a (Good) Research Paper IIIT Hyderabad
 
Data Science for Social Good: #LegalNLP #AlgorithmicBias
Data Science for Social Good: #LegalNLP #AlgorithmicBiasData Science for Social Good: #LegalNLP #AlgorithmicBias
Data Science for Social Good: #LegalNLP #AlgorithmicBiasIIIT Hyderabad
 
Social Computing Research in India
Social Computing Research in IndiaSocial Computing Research in India
Social Computing Research in IndiaIIIT Hyderabad
 
Social Computing Research in India
Social Computing Research in IndiaSocial Computing Research in India
Social Computing Research in IndiaIIIT Hyderabad
 
Modeling Online User Interactions and their Offline effects on Socio-Technica...
Modeling Online User Interactions and their Offline effects on Socio-Technica...Modeling Online User Interactions and their Offline effects on Socio-Technica...
Modeling Online User Interactions and their Offline effects on Socio-Technica...IIIT Hyderabad
 
Privacy. Winter School on “Topics in Digital Trust”. IIT Bombay
Privacy. Winter School on “Topics in Digital Trust”. IIT BombayPrivacy. Winter School on “Topics in Digital Trust”. IIT Bombay
Privacy. Winter School on “Topics in Digital Trust”. IIT BombayIIIT Hyderabad
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...IIIT Hyderabad
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...IIIT Hyderabad
 
Leveraging Social Media for Financial Advice
Leveraging Social Media for Financial AdviceLeveraging Social Media for Financial Advice
Leveraging Social Media for Financial AdviceIIIT Hyderabad
 
Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...IIIT Hyderabad
 
A Framework for Automatic Question Answering in Indian Languages
A Framework for Automatic Question Answering in Indian LanguagesA Framework for Automatic Question Answering in Indian Languages
A Framework for Automatic Question Answering in Indian LanguagesIIIT Hyderabad
 

More from IIIT Hyderabad (20)

Responsible & Safe AI Systems at ACM India ROCS at IIT Bombay
Responsible & Safe AI Systems at ACM India ROCS at IIT BombayResponsible & Safe AI Systems at ACM India ROCS at IIT Bombay
Responsible & Safe AI Systems at ACM India ROCS at IIT Bombay
 
International Collaboration: Experiences, Challenges, Success stories
International Collaboration: Experiences, Challenges, Success storiesInternational Collaboration: Experiences, Challenges, Success stories
International Collaboration: Experiences, Challenges, Success stories
 
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBias
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBiasResponsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBias
Responsible & Safe AI: #LegalBias #Inconsistency #BiasinLLMs #MultiModalBias
 
Identify, Inspect and Intervene Multimodal Fake News
Identify, Inspect and Intervene Multimodal Fake NewsIdentify, Inspect and Intervene Multimodal Fake News
Identify, Inspect and Intervene Multimodal Fake News
 
#ChatGPT #ResponsibleAI
#ChatGPT #ResponsibleAI#ChatGPT #ResponsibleAI
#ChatGPT #ResponsibleAI
 
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafety
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafetyData Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafety
Data Science for Social Good: #MentalHealth #CodeMix #LegalNLP #AISafety
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...
 
Beyond the Surface: A Computational Exploration of Linguistic Ambiguity
Beyond the Surface: A Computational Exploration of Linguistic AmbiguityBeyond the Surface: A Computational Exploration of Linguistic Ambiguity
Beyond the Surface: A Computational Exploration of Linguistic Ambiguity
 
Data Science for Social Good: #LegalNLP #AlgorithmicBias...
Data Science for Social Good:                      #LegalNLP #AlgorithmicBias...Data Science for Social Good:                      #LegalNLP #AlgorithmicBias...
Data Science for Social Good: #LegalNLP #AlgorithmicBias...
 
How to Write a (Good) Research Paper
How to Write a (Good) Research Paper How to Write a (Good) Research Paper
How to Write a (Good) Research Paper
 
Data Science for Social Good: #LegalNLP #AlgorithmicBias
Data Science for Social Good: #LegalNLP #AlgorithmicBiasData Science for Social Good: #LegalNLP #AlgorithmicBias
Data Science for Social Good: #LegalNLP #AlgorithmicBias
 
Social Computing Research in India
Social Computing Research in IndiaSocial Computing Research in India
Social Computing Research in India
 
Social Computing Research in India
Social Computing Research in IndiaSocial Computing Research in India
Social Computing Research in India
 
Modeling Online User Interactions and their Offline effects on Socio-Technica...
Modeling Online User Interactions and their Offline effects on Socio-Technica...Modeling Online User Interactions and their Offline effects on Socio-Technica...
Modeling Online User Interactions and their Offline effects on Socio-Technica...
 
Privacy. Winter School on “Topics in Digital Trust”. IIT Bombay
Privacy. Winter School on “Topics in Digital Trust”. IIT BombayPrivacy. Winter School on “Topics in Digital Trust”. IIT Bombay
Privacy. Winter School on “Topics in Digital Trust”. IIT Bombay
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...
 
It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...It is our choices, Harry, that show what we truly are, far more than our abil...
It is our choices, Harry, that show what we truly are, far more than our abil...
 
Leveraging Social Media for Financial Advice
Leveraging Social Media for Financial AdviceLeveraging Social Media for Financial Advice
Leveraging Social Media for Financial Advice
 
Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...Development of Stress Induction and Detection System to Study its Effect on B...
Development of Stress Induction and Detection System to Study its Effect on B...
 
A Framework for Automatic Question Answering in Indian Languages
A Framework for Automatic Question Answering in Indian LanguagesA Framework for Automatic Question Answering in Indian Languages
A Framework for Automatic Question Answering in Indian Languages
 

Credibility Ranking of Tweets during High Impact Events

  • 1. Credibility Ranking of Tweets during High Impact Events
       Aditi Gupta & Ponnurangam Kumaraguru
       PSOSM@WWW, April 17, 2012
  • 2. Problem Motivation
       precog.iiitd.edu.in, IIIT-Delhi
  • 3. Problem Motivation: Information, Opinion, Spam
  • 4. Outline
       •  Research statement
       •  Architecture
       •  Data collection
       •  Analysis
       •  Results
       •  Implementation
       •  Future direction
  • 5. Research Statement
       •  Identify parameters that affect the credibility of content on Twitter
       •  Develop a semi-automated algorithm to assess the credibility of tweets
  • 6. Terminology
       Labels on an annotated Twitter screenshot: TWEET (a status, 140 chars), HASHTAG, RETWEET, USER PROFILE, URL, USER NAME (@screen_name), FOLLOWERS, Tweets, @-MENTIONS
  • 7. Credibility
       •  “The quality of being trusted and believed in.”
       •  In this research
          –  Assess the credibility of the information in the content of a tweet (message) by a user on Twitter.
          –  A tweet is said to contain credible information about a news event if you trust or believe the information in the tweet to be correct / true.
  • 8. News on Twitter
       Topics on Twitter:
       –  News events (e.g. #Irene, #Libyacrisis)
          •  Credible information
          •  Non-credible information: fake news / rumors / spam / personal opinions
       –  Chit-chat (e.g. #nothingwrongwith, #goodmorningtwitter)
  • 9. Our Contributions
       •  30% of tweets provided information (17% credible information) and 14% were spam
       •  Linear logistic regression
          –  Content based: number of unique characters, swear words, pronouns and emoticons
          –  User based: number of followers and length of username
       •  Present an automated algorithm (supervised ML and relevance feedback) to assess credibility of tweets
  • 10. Data Statistics
       Total tweets: 35,748,136
       Total unique users: 6,877,320
       Tweets with URLs: 4,973,457
       Number of singleton tweets: 22,481,898
       Number of re-tweets / replies: 13,266,238
       Start date: 12th July, 2011
       End date: 30th August, 2011
       •  High impact events:
          –  More than 25K tweets
          –  More than 48 hours in trending topics
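The two selection criteria on slide 10 can be written as a small filter. This is only a sketch: the `Event` record, its field names and the example values are hypothetical, and the slides do not say how the time spent in trending topics was measured.

```python
from dataclasses import dataclass

MIN_TWEETS = 25_000        # "more than 25K tweets" (slide 10)
MIN_TRENDING_HOURS = 48    # "more than 48 hours in trending topics" (slide 10)


@dataclass
class Event:
    """Hypothetical record for one candidate event."""
    name: str
    tweet_count: int
    trending_hours: float  # assumed to be measured in hours


def is_high_impact(event: Event) -> bool:
    """Return True when the event meets both thresholds from the slide."""
    return (event.tweet_count > MIN_TWEETS
            and event.trending_hours > MIN_TRENDING_HOURS)


# Illustrative values only, not taken from the data set.
candidates = [Event("example-a", 60_000, 72.0), Event("example-b", 90_000, 12.0)]
selected = [e.name for e in candidates if is_high_impact(e)]
```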
  • 11. Data Statistics
  • 12. Data Statistics
       Events | Tweets | Trending Topics
       UK Riots | 542,685 | #ukriots, #londonriots, #prayforlondon
       Libya Crisis | 389,506 | libya, tripoli
       Earthquake in Virginia | 277,604 | #earthquake, Earthquake in SF
       JanLokPal Bill Agitation | 182,692 | Anna Hazare, #janlokpal, #anna
       Apple CEO Steve Jobs resigns | 158,816 | Steve Jobs, Tim Cook, Apple CEO
       US Downgrading | 148,047 | S&P, AAA to AA
       Hurricane Irene | 90,237 | Hurricane Irene, Tropical Storm Irene
       Google acquires Motorola Mobility | 68,527 | Google, Motorola Mobility
       News of the World Scandal | 67,602 | Rupert Murdoch, #murdoch
       Abercrombie & Fitch stocks drop | 54,763 | Abercrombie & Fitch, A&F
       Muppets Bert and Ernie were gay | 52,401 | Bert and Ernie
       Indiana State Fair Tragedy | 49,924 | Indiana State Fair
       Mumbai Blast, 2011 | 32,156 | #mumbaiblast, Dadar, #needhelp
       New Facebook Messenger | 28,206 | Facebook Messenger
  • 13. Architecture
  • 14. Human Annotation
       •  For each tweet:
          –  Tweet contains information about the event. Rate the credibility of the information present:
             •  Definitely Credible
             •  Seems Credible
             •  Definitely Incredible
             •  I can’t Decide
          –  Tweet is related to the news event, but contains no information
          –  Tweet is not related to the news event
          –  Skip tweet
       •  Each tweet annotated by 3 people
       •  Inter-annotator agreement (Cronbach Alpha) = 0.748
       •  30% of tweets provided information (17% credible information) and 14% were spam
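Slide 14 reports inter-annotator agreement as a Cronbach alpha of 0.748. As a rough sketch of how that statistic is computed, the function below treats each annotator as one "item" and each tweet as one observation. The mapping from annotation labels to numeric scores is an assumption; the slides do not give it.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a list of rows, one row per tweet.

    Each row holds the numeric scores given by the k annotators
    (k = 3 on the slide). The numeric coding of the labels is assumed.
    """
    k = len(ratings[0])
    # Sample variance of each annotator's scores across all tweets.
    item_variances = [variance([row[j] for row in ratings]) for j in range(k)]
    # Sample variance of the per-tweet total score.
    total_variance = variance([sum(row) for row in ratings])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Illustrative scores only: three annotators who always agree.
perfect = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(perfect)
```

Values close to 1 indicate that the annotators rate tweets consistently with one another.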
  • 15. ANALYSIS
  • 16. Feature Sets
       Message based features:
       –  Length of the tweet
       –  Number of words
       –  Number of unique characters
       –  Number of hashtags
       –  Number of retweets
       –  Number of swear language words
       –  Number of positive sentiment words
       –  Number of negative sentiment words
       –  Tweet is a retweet
       –  Number of special symbols [$, !]
       –  Number of emoticons [:-), :-(]
       –  Tweet is a reply
       –  Number of @-mentions
       –  Time lapse since the query
       –  Has URL
       –  Number of URLs
       –  Use of URL shortener service
       Source based features:
       –  Registration age of the user
       –  Number of statuses
       –  Number of followers
       –  Number of friends
       –  Is a verified account
       –  Length of description
       –  Length of screen name
       –  Has URL
       –  Ratio of followers to followees
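A few of the message based features on slide 16 can be computed directly from the tweet text. The sketch below is illustrative rather than the authors' extractor: the whitespace tokenisation, the URL pattern and the "RT @" retweet convention are all assumptions, and features that need external resources (swear word or sentiment lexicons, user metadata) are left out.

```python
import re

SPECIAL_SYMBOLS = "$!"          # the slide lists [$, !]
EMOTICONS = (":-)", ":-(")      # the slide lists [:-), :-(]
URL_PATTERN = re.compile(r"https?://\S+")  # assumed URL pattern


def message_features(text: str) -> dict:
    """Compute a subset of the message based features for one tweet."""
    words = text.split()  # assumed: whitespace tokenisation
    urls = URL_PATTERN.findall(text)
    return {
        "length": len(text),
        "num_words": len(words),
        "num_unique_chars": len(set(text)),
        "num_hashtags": sum(w.startswith("#") for w in words),
        "num_mentions": sum(w.startswith("@") for w in words),
        "num_special_symbols": sum(text.count(c) for c in SPECIAL_SYMBOLS),
        "num_emoticons": sum(text.count(e) for e in EMOTICONS),
        "has_url": bool(urls),
        "num_urls": len(urls),
        "is_retweet": text.startswith("RT @"),  # assumed convention
    }


# Made-up example tweet.
features = message_features(
    "RT @user: Earthquake in Virginia! #earthquake http://t.co/abc :-)"
)
```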
  • 17. PRF
       •  PRF (Pseudo Relevance Feedback)
          –  Extract k ranked documents and then re-rank those documents according to a defined score
          –  Re-ranking based on the ‘context’ of the event
          –  Top n unigrams based on the BM25 metric
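The PRF step on slide 17 can be sketched as follows: take the top k tweets of an initial ranking, pick the n unigrams with the highest BM25 weight in that set, and re-rank every tweet by its BM25 score against those unigrams. The slides do not give the exact scoring, so the parameters `K1` and `B`, the idf variant, the whitespace tokeniser and the way scores are summed are all assumptions.

```python
import math
from collections import Counter

K1, B = 1.2, 0.75  # common BM25 defaults, assumed here


def tokenize(text):
    return text.lower().split()  # assumed tokeniser


class BM25:
    def __init__(self, docs):
        self.docs = [Counter(tokenize(d)) for d in docs]
        self.lengths = [sum(c.values()) for c in self.docs]
        self.avgdl = sum(self.lengths) / len(self.docs)
        self.n = len(self.docs)
        # Document frequency: number of docs containing each term.
        self.df = Counter(t for c in self.docs for t in c)

    def idf(self, term):
        df = self.df[term]
        return math.log((self.n - df + 0.5) / (df + 0.5) + 1)

    def term_score(self, term, i):
        """BM25 contribution of one term to document i."""
        tf = self.docs[i][term]
        norm = tf + K1 * (1 - B + B * self.lengths[i] / self.avgdl)
        return self.idf(term) * tf * (K1 + 1) / norm


def prf_rerank(docs, initial_order, k=2, n=3):
    """Re-rank docs using the top-n BM25 unigrams of the top-k docs."""
    index = BM25(docs)
    weights = Counter()
    for i in initial_order[:k]:          # pseudo-relevant set
        for term in index.docs[i]:
            weights[term] += index.term_score(term, i)
    context = [t for t, _ in weights.most_common(n)]
    scores = [sum(index.term_score(t, i) for t in context)
              for i in range(len(docs))]
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return order, context


# Made-up tweets; the initial order stands in for the supervised ranking.
tweets = ["earthquake hits virginia", "earthquake felt in virginia today",
          "good morning everyone", "lol nothing to do"]
order, context = prf_rerank(tweets, initial_order=[0, 1, 2, 3])
```

In this toy run the unigrams drawn from the two top-ranked tweets pull event-related tweets ahead of chit-chat, which is the 'context' effect the slide describes.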
  • 18. Algorithm
  • 19. Evaluation Metric
       Evaluation metric: NDCG (Normalized Discounted Cumulative Gain)
       NDCG is the standard metric used to evaluate “graded” results
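The formula on slide 19 did not survive in the transcript. A minimal sketch of one widely used NDCG definition is given below, assuming the exponential gain (2^rel - 1) and a log2 position discount; the slides do not confirm which variant was used, and the graded relevance values are illustrative.

```python
import math


def dcg(relevances, k):
    """Discounted cumulative gain over the first k results."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances[:k]))


def ndcg(relevances, k):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


# Illustrative graded labels in ranked order (higher means more credible).
score = ndcg([2, 3, 0, 1], k=4)
```

A ranking that already lists tweets in decreasing order of their graded label scores 1.0, and any other ordering scores lower.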
  • 20. Ranking Results
       •  Tweet and user based features both contribute to determining credibility: it matters “what you post and who you are”
       •  Context based (PRF) ranking greatly enhances performance (up to 0.74 NDCG)
  • 21. Web-portal Implementation
  • 22. Limitations & Future Work
       •  Human input required
          –  Need to develop self-learning (completely automated) solutions
       •  Analyze events with greater temporal variation
       •  Understand users’ perspective of the credibility of content on Twitter
  • 23. Challenges
       •  Large volume of data being generated
       •  Real-time solutions needed
       •  Only 140 characters
       •  Informal language
  • 24. Acknowledgements
       •  All members of our research group
       •  Dept. of Information Technology, Government of India
  • 25. References
       •  C. Castillo, M. Mendoza, and B. Poblete. Information Credibility on Twitter. In WWW, pages 675–684, 2011.
       •  J. Chen, R. Nairn, L. Nelson, M. Bernstein, and E. Chi. Short and tweet: experiments on recommending content from information streams. CHI ’10, pages 1185–1194, 2010.
       •  J. Ratkiewicz, M. Conover, M. Meiss, B. Gonçalves, S. Patil, A. Flammini, and F. Menczer. Truthy: mapping the spread of astroturf in microblog streams. WWW ’11.
       •  S. E. Robertson, S. Walker, and M. Beaulieu. Okapi at TREC-7: automatic ad hoc, filtering, VLC and interactive track. IN, 1999.
       •  T. Sakaki, M. Okazaki, and Y. Matsuo. Earthquake shakes Twitter users: real-time event detection by social sensors. WWW ’10, 2010.
       •  S. Verma, S. Vieweg, W. J. Corvey, L. Palen, J. H. Martin, M. Palmer, A. Schram, and K. M. Anderson. NLP to the rescue? Extracting “situational awareness” tweets during mass emergency. ICWSM, 2011.
  • 26. Questions?
  • 27. Thank You!
       aditig@iiit.ac.in
       pk@iiitd.ac.in
       precog.iiitd.edu.in
  • 28. For any further information, please write to pk@iiitd.ac.in
       precog.iiitd.edu.in