
Credibility, Identity Resolution, Privacy, and Policing in Online Social Media


Dr. PK gave an ACM Distinguished Speaker talk at IIT Guwahati.

Published in: Education

  1. Credibility, Identity Resolution, Privacy, and Policing on Online Social Media
     IIT Guwahati, Sept 26, 2016
     Ponnurangam Kumaraguru ("PK"), Associate Professor, ACM Distinguished Speaker
     fb/ponnurangam.kumaraguru, @ponguru
  2. Who am I?
     – Associate Professor, IIIT-Delhi
     – Ph.D. from School of Computer Science, Carnegie Mellon University (CMU)
     – Research interests
       - Social Computing, Computational Social Science, Complex Networks pertaining to Human Behavior, specifically in the context of Security & Privacy
     – Co-ordinate and manage Precog, precog.iiitd.edu.in
     – ACM Distinguished Speaker
  3. https://www.youtube.com/channel/UCHWDvGDh4QjWbV79bM2neSg
  4. What we dabble with!
  5. Non-trustworthy Content: FAKE, $, RUMORS
  6. Methodology
  7. Training Data
     – 500 Tweets per event
     – Used CrowdFlower

     | Event | Tweets | Users |
     |---|---|---|
     | Boston Marathon Blasts (2013) | 7,888,374 | 3,677,531 |
     | Typhoon Haiyan / Yolanda (2013) | 671,918 | 368,269 |
     | Cyclone Phailin (2013) | 76,136 | 34,776 |
     | Washington Navy Yard shootings (2013) | 484,609 | 257,682 |
     | Polar vortex cold wave (2014) | 143,959 | 116,141 |
     | Oklahoma Tornadoes (2013) | 809,154 | 542,049 |
     | Total | 10,074,150 | 4,996,448 |
  8. Credibility Modeling

     | Feature set | Features (45) |
     |---|---|
     | Tweet meta-data | Number of seconds since the tweet; Source of tweet (mobile / web / etc.); Tweet contains geo-coordinates |
     | Tweet content (simple) | Number of characters; Number of words; Number of URLs; Number of hashtags; Number of unique characters; Presence of stock symbol; Presence of happy smiley; Presence of sad smiley; Tweet contains 'via'; Presence of colon symbol |
     | Tweet content (linguistic) | Presence of swear words; Presence of negative emotion words; Presence of positive emotion words; Presence of pronouns; Mention of self words in tweet (I; my; mine) |
     | Tweet author | Number of followers; friends; time since the user has been on Twitter; etc. |
     | Tweet network | Number of retweets; Number of mentions; Tweet is a reply; Tweet is a retweet |
     | Tweet links | WOT score for the URL; Ratio of likes / dislikes for a YouTube video |
  9. Implementation
  10. Feedback by Users
  11. v
  12. Harvard (1839) – Harvard – Harvard – Harvard – MIT – Northwestern – UIUC – WUSL – CMU (2009) – IIITD (2015)
  13. http://twitdigest.iiitd.edu.in/TweetCred/
  14. De-duplicating audience
      Social audience = 437,632 + 153,000 + 805,097 or less??
  15. Challenges
      – Heterogeneous OSNs: Professional, Opinion, Dating, Personal
      – Degree of Details: quality and descriptive personal and professional information on some OSNs; little personal information and descriptive opinions on others
      – Attribute Evolution over Time: registration with the same information on both OSNs {paridhij, New Delhi}; information evolved on one but not on the other {jainpari, Bangalore}
  16. Generic Identity Resolution
      – IDENTITY SEARCH: extract available & discriminative features to obtain candidate identities
      – IDENTITY LINKING: pairwise comparisons
  17. Heuristic Identity Search (cerc.iiitd.ac.in)
      – Search: Profile, Content, Self-mention, Network
      – Linking: Syntactic and Image
      – Decision: self-identified / returned by more than one search method? (Yes / No)
      – Candidate Identities
      – Attributes: name, location, username; mobile no, post; friends, followers

      Paridhi Jain, Ponnurangam Kumaraguru, and Anupam Joshi. 2013. @I seek 'fb.me': Identifying Users across Multiple Online Social Networks. In Proceedings of the 22nd International Conference on World Wide Web, WWW '13 Companion. ACM, New York, NY, USA, 1259-1268. DOI=http://dx.doi.org/10.1145/2487788.2488160 [Honorable Mention Award]
  18. Harvard (1839) – Harvard – Harvard – Harvard – MIT – Northwestern – UIUC – WUSL – CMU (2009) – IIITD (2016)
  19.
  20. How many of you have posted mobile numbers on Online Social Networks?
      How many of you have seen mobile numbers being posted on Online Social Networks?
  21. Sample posts
  22. Sample posts
  23. Sample posts
  24. Sample posts
  25. Data statistics
      – Twitter: 12th October 2012 to 20th October 2013
      – Facebook: 16th November 2012 to 20th April 2013

      | Numbers | Category +91 (Twitter) | Category +91 (Facebook) | Category 0 (Twitter) | Category 0 (Facebook) | Category void (Twitter) | Category void (Facebook) | Total (Twitter) | Total (Facebook) |
      |---|---|---|---|---|---|---|---|---|
      | Mobile numbers | 885 | 2,191 | 14,909 | 8,873 | 25,566 | 25,294 | 41,360 | 36,358 |
      | User profiles | 1,074 | 2,663 | 17,913 | 9,028 | 31,149 | 25,406 | 49,817 | 36,588 |
  26.
  27. SocialCaller App
      https://play.google.com/store/apps/details?id=com.ayush.socialcaller&hl=en
  28. http://precog.iiitd.edu.in/research/ocean/
  29. Takeaways
      – Online Social Media is a different beast in terms of privacy, identity, and credibility
        - Research / technologies should be developed
      – Much interesting research, engineering, and innovation is waiting to be done
      – Interested in hosting students: B.Tech., M.Tech., Ph.D.
  30. https://www.facebook.com/PreCog.IIITD/
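
Slide 8 lists the "Tweet content (simple)" features by name only. As a rough illustration of how such features could be computed from raw tweet text, here is a minimal sketch. The function name `simple_content_features`, the regular expressions, and the smiley patterns are all assumptions made for this example; the talk does not say how TweetCred actually extracts them.

```python
import re

# Illustrative patterns only (assumptions, not taken from the talk).
HAPPY = re.compile(r"[:;]-?[)D]")        # e.g. :) ;-) :D
SAD = re.compile(r":-?\(")               # e.g. :( :-(
STOCK = re.compile(r"\$[A-Za-z]{1,5}\b") # e.g. $AAPL
URL = re.compile(r"https?://\S+")
HASHTAG = re.compile(r"#\w+")
SELF_WORDS = {"i", "my", "mine"}         # self words named on slide 8


def simple_content_features(text: str) -> dict:
    """Compute a few of the simple content features named on slide 8."""
    words = text.split()
    # Lowercased tokens with surrounding punctuation removed.
    tokens = {w.strip(".,!?;:\"'()").lower() for w in words}
    return {
        "num_chars": len(text),
        "num_words": len(words),
        "num_urls": len(URL.findall(text)),
        "num_hashtags": len(HASHTAG.findall(text)),
        "num_unique_chars": len(set(text)),
        "has_stock_symbol": bool(STOCK.search(text)),
        "has_happy_smiley": bool(HAPPY.search(text)),
        "has_sad_smiley": bool(SAD.search(text)),
        "has_via": "via" in tokens,
        "has_colon": ":" in text,
        "has_self_words": bool(tokens & SELF_WORDS),
    }


if __name__ == "__main__":
    print(simple_content_features("I saw this via @cnn: http://t.co/abc #boston :)"))
```

Author, network, and link features (followers, retweets, WOT score) need API or third-party data and are left out of this sketch.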
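
Slide 14 asks whether the social audience is the plain sum 437,632 + 153,000 + 805,097 "or less". The sketch below makes that arithmetic explicit: the sum is only an upper bound, the largest single audience is a lower bound, and the true figure is the size of the union once accounts are resolved to the same person. The function names and the toy person identifiers are hypothetical, and the slide does not say which platforms the three numbers come from.

```python
def audience_bounds(*platform_sizes: int) -> tuple[int, int]:
    """Return (lower, upper) bounds on the number of distinct people.

    Lower bound: everyone on the smaller platforms also follows on the largest.
    Upper bound: no person appears on more than one platform.
    """
    return max(platform_sizes), sum(platform_sizes)


def deduplicated_audience(*follower_sets: set) -> int:
    """Count distinct people, assuming accounts were already resolved to person ids."""
    return len(set().union(*follower_sets))


if __name__ == "__main__":
    # Figures from slide 14.
    print(audience_bounds(437_632, 153_000, 805_097))

    # Toy data: 8 follower accounts in total, but only 5 distinct people.
    a = {"p1", "p2", "p3"}
    b = {"p2", "p4"}
    c = {"p1", "p4", "p5"}
    print(deduplicated_audience(a, b, c))
```

Computing the union requires knowing which accounts belong to the same person, which is the identity resolution problem the later slides address.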
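
Slide 17 includes a decision step: is a candidate self-identified, or returned by more than one search method? The following is a minimal sketch of that test only. It assumes each search method has already produced a set of candidate account ids; the function name, the method labels, and the account ids are hypothetical. The slide does not spell out what happens on each branch, so the sketch simply partitions the candidates.

```python
from collections import Counter


def flag_candidates(search_results: dict[str, set[str]],
                    self_identified: set[str]) -> tuple[set[str], set[str]]:
    """Split candidates into (meets_rule, remaining).

    search_results maps a search-method name to the candidate ids it returned.
    A candidate meets the rule if the user self-identified it, or if more than
    one search method returned it.
    """
    hits = Counter()
    for candidates in search_results.values():
        hits.update(candidates)  # each method contributes at most one hit per id

    all_candidates = set(hits) | self_identified
    meets_rule = {c for c in all_candidates
                  if c in self_identified or hits[c] > 1}
    return meets_rule, all_candidates - meets_rule


if __name__ == "__main__":
    results = {
        "profile": {"fb:paridhij", "fb:pjain"},
        "content": {"fb:paridhij"},
        "network": {"fb:jainp"},
    }
    strong, rest = flag_candidates(results, self_identified={"fb:jainp"})
    print(sorted(strong), sorted(rest))
```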
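
Slide 25 groups the collected mobile numbers into categories "+91", "0", and "void". One plausible reading is that these are the prefixes with which a number was posted, with "void" meaning no prefix; that reading is an assumption. A minimal sketch under it follows, further assuming an Indian mobile number is 10 digits starting with 6 to 9. The pattern and function name are illustrative and not the method used in the study.

```python
import re

# Assumption: optional "+91" or "0" prefix, optional space or hyphen,
# then a 10-digit number starting with 6-9, not embedded in a longer digit run.
MOBILE = re.compile(r"(?<!\d)(\+91|0)?[\s-]?([6-9]\d{9})(?!\d)")


def categorize_numbers(post: str) -> list[tuple[str, str]]:
    """Return (category, number) pairs for each mobile number found in a post."""
    found = []
    for match in MOBILE.finditer(post):
        prefix = match.group(1)
        if prefix == "+91":
            category = "+91"
        elif prefix == "0":
            category = "0"
        else:
            category = "void"  # posted without any prefix
        found.append((category, match.group(2)))
    return found


if __name__ == "__main__":
    post = "Call me on +91 9876543210 or 09812345678, office 9123456789"
    print(categorize_numbers(post))
```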
