@I seek 'fb.me': Identifying Users across Multiple Online Social Networks

An online user joins multiple social networks to enjoy different services. On each network she creates an identity made up of three major dimensions: profile, content, and connection network. She largely governs how her identity is formed on any social network and can therefore manipulate multiple aspects of it. With no global identifier to mark her presence uniquely in the online domain, her online identities remain unlinked, isolated, and difficult to search. The literature has proposed identity search methods based on profile attributes but has left the other identity dimensions, such as content and network, unexplored. In this work, we introduce two novel identity search algorithms based on content and network attributes, and we improve on the traditional identity search algorithm based on a user's profile attributes. We apply the proposed identity search algorithms to find a user's identity on Facebook, given her identity on Twitter. We report that a combination of the proposed identity search algorithms found the Facebook identity for 39% of the Twitter users searched, while the traditional method based on profile attributes found the Facebook identity for only 27.4%. Each proposed identity search algorithm accesses publicly accessible attributes of a user on a social network. We deploy an identity resolution system, Finding Nemo, which uses the proposed identity search methods to find a Twitter user's identity on Facebook. We conclude that including more than one identity search algorithm, each exploiting distinct dimensional attributes of an identity, helps improve the accuracy of an identity resolution process.

  1. @I seek ‘fb.me’: Identifying Users across Multiple Online Social Networks
     Workshop on Web of Linked Entities (WoLE)
     Paridhi Jain¶, Ponnurangam Kumaraguru¶, Anupam Joshi*
     ¶Indraprastha Institute of Information Technology (IIIT-Delhi)
     *University of Maryland, Baltimore County (UMBC)
  2. Motivation
     – Multiple OSNs, multiple identities
     – Social aggregation site
     – Difficult to manage? Difficult to find?
     – Friend finder? Malicious user? Influential user? User of interest?
     – Identity Resolution Problem
     13/05/13 @I seek ‘fb.me’: Identifying Users across Multiple Online Social Networks
  6. Identity Resolution
     • For a user $I$, given a user identity $I_A$ on a social network A, find the user identity $I_B$ on social network B.
     [Diagram: $\{I_A\}$ (Alice) and $\{I_B\}$ (??)]
  7. Identity Resolution = Identity Search + Identity Matching
     • Identity Search: for a user $I$, given her identity $I_A$ on a social network A and a search parameter $S$, find the set of identities $I_{B_j}$ on social network B such that $S(I_A) \simeq S(I_{B_j})$.
       $\{I_A, S\} \rightarrow \{I_{B_1}, \dots, I_{B_j}, \dots, I_{B_N}\} = Q$
     • Identity Matching: given a user identity $I_A$ on a social network A, a set of candidate identities $Q$ on social network B, and a match function $M$, locate an identity pair $(I_A, I_{B_j})$ such that $M(I_A, I_{B_j}) = \max\{M(I_A, I_{B_1}), \dots, M(I_A, I_{B_N})\}$.
       $\{I_A, Q, M\} \rightarrow \{I_A, I_{B_j}\}$
  8. Research Gaps?
     – Till now, the focus has been on better identity matching algorithms
     – Only profile attributes (private and public) are used for identity search
     – Limitations of profile search:
       – Restrictive search, owing to the non-availability of common attributes across networks [gender on Facebook, but not on Twitter]
       – Search with limited attributes → large candidate set size → intensive identity matching computation
       – Users may choose different profile attributes → the correct identity is missed in the candidate set
     – Little research on using content and network attributes to search for candidate identities
     – Extensive use of both private and public attributes; user authorization is needed for identity search
  11. Proposal
      – Include content and network attributes as search parameters
      – Access only publicly accessible attributes
      – Focus on two popular social networks: Twitter and Facebook
  12. Contribution
      – Proposed novel identity search methods on social networks
      – Our identity resolution methods return the correct Facebook identity for 39% of Twitter users within the top-2 ranks
      – We observe an increase of 11.6 percentage points in identity resolution accuracy (from 27.4% to 39%), owing to the inclusion of content and network identity search along with improved profile search
  13. Methodology
      [Flowchart: Search → Candidate Identities → decision "If self-identified / returned by more than one search method" (Yes / No); remaining boxes: Syntactic and Image Match, Manual Verification]
  14. Identity Matching
      – Syntactic matching
        – Jaro distance comparison between username and name
        – Example: {alice123, jane_alice}, {Alice Naura, Alice N. Janice}
      – Image matching
        [Equation image] where $h_{I_A}$ and $h_{I_{B_j}}$ are the RGB histograms of the profile images and $N_s$ represents the histogram size of $I_A$
  15. Profile Search: Self-Identification
  17. Content Search
  19. Self-mention Search
  21. Network Search
  22. Instance: public friend list of a user extracted from public feeds
  24. Integrated System -
  26. Evaluation
      – Dataset: Social Graph API, 543 users
      – Accuracy by search method (543 users):
        – Profile (P): 205 users, 37.7% accurate
        – Content (C + SM): 34 users, 6.3% accurate
        – Network (N): 1 user, 0.2% accurate
        – Finding Nemo: 212 users, 39% accurate
      – Accuracy by search algorithm:
        – P (without URL): 149 users identified, 27.4% accuracy
        – P (with URL) + C + N + SM: 149 + 56 + 6 + 1 = 212 users identified, 27.4% + 11.6% = 39.0% accuracy
  27. Mean Average Precision
      Matching algorithm: MAP score
      – Image (profile image): 0.83
      – Syntactic (username): 0.76
      – Syntactic (name): 0.80
  28. Demo: http://www.youtube.com/watch?v=-AFsCtKwO0c
  29. Take away: Inclusion of content and network attributes for identity search not only improves identity resolution accuracy but also returns the correct Facebook identity within the top-2 ranks for the majority of the Twitter users.
  30. Current and Future Work
      – Extend the set of social networks on which a given identity is searched, for example Google+, Foursquare, etc.
      – Extend the search methods to include social-network-specific features
      – Find multiple (fake) identities of users within social networks
  31. Questions?
      paridhij@iiitd.ac.in, pk@iiitd.ac.in, joshi@cs.umbc.edu
      precog.iiitd.edu.in
      Paper: http://precog.iiitd.edu.in/publications.html
  32. For any further information, please write to pk@iiitd.ac.in
      precog.iiitd.edu.in
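
The identity matching step defined on the "Identity Search + Identity Matching" slide, combined with the Jaro comparison named on the "Identity Matching" slide, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `jaro_similarity` and `best_match`, the lowercasing step, and the sample usernames are all assumptions made for the example. The match function M is taken to be plain Jaro similarity on usernames, and the slide's image matching is not covered.

```python
from typing import List, Tuple


def jaro_similarity(s1: str, s2: str) -> float:
    """Return the standard Jaro similarity of two strings, in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Two characters match only if equal and at most `window` positions apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (
        matches / len1 + matches / len2 + (matches - transpositions) / matches
    ) / 3


def best_match(source_name: str, candidates: List[str]) -> Tuple[str, float]:
    """Pick the candidate with the highest score (hypothetical match function M)."""
    scored = [
        (c, jaro_similarity(source_name.lower(), c.lower())) for c in candidates
    ]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical candidate set Q returned by some identity search for "alice123".
candidate, score = best_match("alice123", ["bob_smith", "alice_123"])
print(candidate, round(score, 3))
```

Taking the maximum over the candidate set mirrors the definition of identity matching on the slide; a real system would presumably combine several match functions rather than rely on one string score.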

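The accuracy figures on the Evaluation slide can be checked with a few lines of arithmetic. The sketch below assumes that each percentage is the number of users identified divided by the 543 users in the dataset, and that the 56 + 6 + 1 increments listed on the slide are the users added beyond profile search without URLs. The helper name `accuracy_pct` is made up for the example.

```python
TOTAL_USERS = 543  # evaluation dataset size reported on the slide


def accuracy_pct(identified: int, total: int = TOTAL_USERS) -> float:
    """Percentage of users whose identity was found, rounded to one decimal."""
    return round(100.0 * identified / total, 1)


profile_without_url = 149          # P (without URL), from the slide
additional_users = 56 + 6 + 1      # increments listed on the slide
combined = profile_without_url + additional_users

baseline = accuracy_pct(profile_without_url)
overall = accuracy_pct(combined)
gain = round(overall - baseline, 1)  # difference in percentage points

print(combined, baseline, overall, gain)
```

Under these assumptions the combined count equals the 212 users reported for Finding Nemo, and the gain is a difference in percentage points rather than a relative improvement.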
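The Mean Average Precision slide reports MAP scores without stating how they were computed. A minimal sketch of the standard definition follows, assuming each query is a Twitter identity, each ranked list holds candidate Facebook identities, and the relevant set holds the correct identity. The identifiers and data below are hypothetical and are not taken from the evaluation.

```python
from typing import List, Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average of precision@k over the ranks k at which a relevant item appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(queries: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of average precision across all queries."""
    if not queries:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)


# Hypothetical ranked candidate lists, one correct identity per query.
example_queries = [
    (["fb_alice", "fb_alicia"], {"fb_alice"}),  # correct identity at rank 1
    (["fb_rob", "fb_bob"], {"fb_bob"}),         # correct identity at rank 2
]
print(mean_average_precision(example_queries))
```

With a single correct identity per query, average precision reduces to the reciprocal of the rank at which that identity appears, so a higher MAP means correct identities tend to rank nearer the top.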