
User Identities Across Social Networks: Quantifying Linkability and Nudging Users to control Linkability


The online social network (OSN) landscape has transformed significantly over the past few years with the emergence of new networks. The primary capabilities of these networks differ. A few of the leading categories are relationship networks (Facebook), media sharing networks (Instagram), online reviews (Zomato), discussion forums (Quora), and social publishing platforms (Twitter). To use these services, users end up creating multiple identities across these platforms. For each OSN, a user defines their identity with a different set of attributes, genre of content and friends to suit the purpose of using that OSN. Researchers have proposed numerous techniques to resolve multiple such identities of a user across different platforms. However, the ability to link different identities poses a threat to users’ privacy; users may or may not want their identities to be linkable across networks. In this study, we model the notion of linkability as the probability of an adversary (who is part of the user’s network) being able to link two profiles on different platforms to the same real user. The major factors that increase linkability across social networks are similar profile attributes and cross-posting across the networks. To make users aware of their linkability across multiple social networks, we have developed, as part of the thesis, a framework that assists users in controlling their linkability. It has two components: a linkability calculator that uses three state-of-the-art identity resolution techniques to compute a normalized linkability measure for each pair of social network platforms used by a user, and a soft paternalistic nudge. The user configures the desired linkability score range for each pair of networks.
There are two types of nudge. The Attribute-driven Notification Nudge alerts the user through a pop-up notification if any of their activity violates their preferred linkability score range. The Content-driven Color Nudge changes the color of the box bounding the post update from black to red if the content being posted is similar to content the user has already posted on a different social network. We evaluate the effectiveness of the nudge through a controlled user study with privacy-conscious users who maintain accounts on Facebook, Twitter, and Instagram. The user study confirmed that the proposed framework helped 75% of participants make informed decisions, thereby preventing inadvertent exposure of their personal information across social network services. In addition, the content-driven color nudge deterred a few participants from posting updates.

Published in: Engineering

  1. User Identities across Social Networks: Quantifying Linkability and Nudging Users to control Linkability. Dr. Ponnurangam Kumaraguru (Advisor), Srishti Chandok
  2. Thesis Committee: Dr. Arun Balaji Buduru, IIIT-Delhi; Dr. Anuja Arora, JIIT-Noida; Dr. Ponnurangam Kumaraguru (Chair), IIIT-Delhi
  3. Publication: Our work has been accepted as a short paper (8 pages) at Social Informatics 2017.
  4. Why do people join multiple OSNs? Type of content being shared: ✦ Images ✦ Videos ✦ Short messages ✦ Combination of messages, video and images ✦ Online reviews ✦ Discussion forums. Type of network being offered: ✦ Professional network ✦ Personal network
  5. Notion of Linkability: Linkability is a metric that quantifies the closeness between two identities belonging to the same user on different social networks. First identity: Username: RaineRamirez1; Name: Chris Raine Ramirez; Location: Caloocan City; Website: NULL. Second identity: Username: Rainevouz; Name: Christopher Delgar Ramirez Bakunawa; Location: San Jose del Monte, Bulacan; Website: NULL. Linkability Score = 0.31: there is a 31% chance that Rainevouz and RaineRamirez1 are the same person.
  6. Motivation: De-duplicating audience - finding linkability across OSNs. ✦ Social audience = 437,632 + 153,000 + 805,097 or less? ✦ Targeted marketing using aggregated data
  7. Motivation: Social Engineering - Aggregation of information makes attacking easy
  9. (1) Hacker studies Karla who is active on (2) Information aggregation: • Likes playing guitar and softball • Works on making reports at the office and recently received an award • Likes wine • Nick is her boss and Curt is her colleague (3) Hacker crafts an email (4) Downloaded on Karla’s machine (5) Hacker installs a remote access tool on the machine
  10. Motivation: Cracking passwords - Personal data across multiple social networks to gather answers for password recovery questions. Security Question: Street where you grew up? Primary school name found on an OSN: ‘xyz’ school; streets close to ‘xyz’ school.
  11. Research Aim: “Develop a real-time system which can help users to maintain their linkability across social networks.” Parts of the aim: Linkability Score Computation; Linkability Nudge Design
  12. Current State of Art: Identity Resolution (Linkability Score); Privacy Nudges (Linkability Nudge)
  13. Current State of Art: Identity Resolution; Privacy Nudges
  14. Identity Resolution (Contd..): ✦ MOBIUS - ACM SIGKDD (2013), 91% accuracy ✦ NEMO - ACM HyperText (2015), 41% accuracy ✦ HYDRA - ACM SIGMOD (2014)
  15. Current State of Art: Identity Resolution; Privacy Nudges
  16. Privacy Nudges (Contd..): ✦ Wang, Yang, et al. Privacy nudges for social media: an exploratory Facebook study - Profile picture nudge, timer nudge and sentiment nudge
  17. Privacy Nudges (Contd..): ✦ Ronald. Context is everything sociality and privacy in online social network sites - Segregation of audience for profile attributes of a user on OSNs so that their visibility is controllable.
  18. Novelty: Nudging users to prevent disclosures owing to the resolution of their multiple identities. Identity Resolution; Privacy Nudges
  19. Architecture Diagram: Activities performed on OSNs → Recomputation → Linkability Score → Linkability score exceeds range? (YES → Linkability Nudge; NO)
  20. Linkability Score (architecture diagram repeated): 1. Baseline Methods (Weighted Sum, Probabilistic) 2. Reformed Linkability Score
  21. Linkability Score (architecture diagram repeated): 1. Baseline Methods (Weighted Sum, Probabilistic) 2. Reformed Linkability Score
  22. Baseline Methods: 1. Data Collection 2. Methods: Weighted Sum Method; Probabilistic Method
  23. Data Collection
  24. Data Collection (Contd..): Streaming API → Tweets → Fb.me links → Link Expander → Database. Example: http://fb.me/8dR49RHpQ expands to https://www.facebook.com/christie.andresen/posts/10210420711856356
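The step after link expansion, recovering the Facebook identity that a tweet points to, might look like the minimal sketch below. It only parses the expanded URL; the function name `facebook_identity` and the assumed URL shape `/<profile>/posts/<post-id>` are illustrative choices based on the single example on the slide, not the authors' implementation.

```python
from urllib.parse import urlparse


def facebook_identity(expanded_url):
    """Return the profile segment of an expanded Facebook post URL, or None.

    Assumes (hypothetically) the shape
    https://www.facebook.com/<profile>/posts/<post-id>.
    """
    parsed = urlparse(expanded_url)
    if parsed.netloc.lower() not in ("facebook.com", "www.facebook.com"):
        return None
    # Drop empty segments produced by leading or trailing slashes.
    segments = [s for s in parsed.path.split("/") if s]
    if len(segments) >= 3 and segments[1] == "posts":
        return segments[0]
    return None


example = "https://www.facebook.com/christie.andresen/posts/10210420711856356"
print(facebook_identity(example))
```

Pairing the returned profile name with the Twitter account that posted the fb.me link would give one candidate (Facebook, Twitter) identity pair.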
  25. Data Collection (Contd..)
      Description of Data | Count
      Positive Data | 23,985
      Negative Data (Type I) | 96,130
      Negative Data (Type II) | 24,560
      Positive Data: <IFb> = <ITw>, identities are the same, e.g. [bob12, bob12]
      Negative Data (Type I): <IFb> ≠ <ITw>, but the identities appear to be similar, e.g. [bob_c, bob_d]
      Negative Data (Type II): <IFb> ≠ <ITw>, and the identities appear to be dissimilar, e.g. [bob, alice]
  26. Weighted Sum Method: Feature Extractor → Metric Calculator → Weighted Sum Calculator → Linkability Score. $\text{Linkability Score} = \frac{\sum_i w_i f_i}{\sum_i w_i}$
  27. Weighted Sum Method (example): Feature Extractor (Username, Name, Location, Website) → Metric Calculator (Longest Common Subsequence, Edit Distance, etc.) → Weighted Sum Calculator → Linkability Score. First identity: Username: RaineRamirez1; Name: Chris Raine Ramirez; Location: Caloocan City; Website: NULL. Second identity: Username: Rainevouz; Name: Christopher Delgar Ramirez Bakunawa; Location: San Jose del Monte, Bulacan; Website: NULL. Feature scores: Username = 0.39, Name = 0.28, Location = 0.6, Website = 0. Linkability Score = 0.31, with $\text{Linkability Score} = \frac{\sum_i w_i f_i}{\sum_i w_i}$
  28. Need for different Weights: Feature weights = 1, 1, 1, 1 for username, name, geo-location and website/URL, respectively. Feature weights = 2, 3, 4, 1 for username, name, geo-location and website/URL, respectively. The non-uniform weights increase the difference between positive data and negative data, thereby ensuring that a negative data (Type I) identity is not mistaken for a positive identity.
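As a worked illustration of the weighted sum formula, the sketch below applies it to the four feature scores from the RaineRamirez1 / Rainevouz example under both weight settings. The function and variable names are illustrative; the slides do not state which weights produced the reported 0.31.

```python
def weighted_sum_linkability(features, weights):
    """Compute sum(w_i * f_i) / sum(w_i) over the given features."""
    total_weight = sum(weights[name] for name in features)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted = sum(weights[name] * score for name, score in features.items())
    return weighted / total_weight


# Feature scores taken from the example slide.
features = {"username": 0.39, "name": 0.28, "location": 0.6, "website": 0.0}

# The two weight settings listed on the slides.
uniform_weights = {"username": 1, "name": 1, "location": 1, "website": 1}
tuned_weights = {"username": 2, "name": 3, "location": 4, "website": 1}

uniform_score = weighted_sum_linkability(features, uniform_weights)
tuned_score = weighted_sum_linkability(features, tuned_weights)
print(round(uniform_score, 4), round(tuned_score, 4))
```

With uniform weights the score is 1.27 / 4 = 0.3175, which is in line with the 0.31 shown on the example slide; with weights 2, 3, 4, 1 it is 4.02 / 10 = 0.402.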
  29. Probabilistic Method: Feature Extractor → Metric Calculator → Probability Finder → Linkability Score. $\text{Linkability Score} = \frac{\Pr(pd)}{\Pr(pd) + \Pr(nd_1) + \Pr(nd_2)}$
  30. Probabilistic Method (example): Feature Extractor (Username, Name, Location, Website) → Metric Calculator (Longest Common Subsequence, Edit Distance, etc.) → Probability Finder → Linkability Score. First identity: Username: RaineRamirez1; Name: Chris Raine Ramirez; Location: Caloocan City; Website: NULL. Second identity: Username: Rainevouz; Name: Christopher Delgar Ramirez Bakunawa; Location: San Jose del Monte, Bulacan; Website: NULL. Feature scores: Username = 0.39, Name = 0.28, Location = 0.6, Website = 0. Linkability Score = 0.12, with $\text{Linkability Score} = \frac{\Pr(pd)}{\Pr(pd) + \Pr(nd_1) + \Pr(nd_2)}$
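The normalization in the probabilistic formula can be sketched as below. Here pd, nd1 and nd2 presumably stand for positive data, negative data (Type I) and negative data (Type II); the slides do not say how the Probability Finder estimates each term, so the three input values are purely hypothetical placeholders.

```python
def probabilistic_linkability(pr_pd, pr_nd1, pr_nd2):
    """Return Pr(pd) / (Pr(pd) + Pr(nd1) + Pr(nd2))."""
    total = pr_pd + pr_nd1 + pr_nd2
    if total <= 0:
        raise ValueError("at least one probability must be positive")
    return pr_pd / total


# Hypothetical values, not taken from the thesis data.
score = probabilistic_linkability(pr_pd=0.2, pr_nd1=0.5, pr_nd2=0.3)
print(round(score, 4))
```

Because the denominator sums over all three classes, the score stays in [0, 1] and shrinks whenever either negative class explains the observed feature values better than the positive class does.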
  31. Comparison of Baseline methods: Accuracy is 87% when threshold value = 0.39 [feature weights used are 2, 3, 4 and 1 for username, name of user, location and website features, respectively]. Accuracy is 32% when threshold value = 0.71.
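Accuracy at a given threshold can be computed as in the following sketch, which assumes that a pair is predicted to be the same user when its linkability score is at or above the threshold. That decision rule, the function name and the toy data are assumptions for illustration; the reported 87% and 32% come from the authors' dataset, not from this example.

```python
def accuracy_at_threshold(scored_pairs, threshold):
    """Fraction of pairs whose thresholded prediction matches the label.

    scored_pairs: iterable of (linkability_score, is_same_user) tuples.
    """
    pairs = list(scored_pairs)
    if not pairs:
        raise ValueError("need at least one scored pair")
    correct = sum((score >= threshold) == is_same for score, is_same in pairs)
    return correct / len(pairs)


# Toy labeled pairs (hypothetical scores).
toy_pairs = [(0.82, True), (0.45, True), (0.30, False), (0.10, False)]

print(accuracy_at_threshold(toy_pairs, 0.39))
print(accuracy_at_threshold(toy_pairs, 0.71))
```

Sweeping the threshold over a grid and keeping the value with the highest accuracy is one plausible way to arrive at operating points such as 0.39 or 0.71.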
  32. Limitations of Baseline Methods: The Probabilistic method did not produce the anticipated results; its accuracy was quite low. Both methods employ a small set of features, namely name, username, geo-location and website. They fail to capture the user’s content sharing behavior.
  33. Take-aways from Baseline Methods: The Weighted Sum method performed better than the Probabilistic method. We will enhance the feature set using well-known identity resolution techniques and use our proposed Weighted Sum method to compute linkability scores.
  34. Linkability Score (architecture diagram repeated): 1. Baseline Methods (Weighted Sum, Probabilistic) 2. Reformed Linkability Score
  35. Reformed Linkability Score: Leverages features from three state-of-the-art identity resolution techniques (MOBIUS, NEMO, HYDRA) and uses the Weighted Sum method to calculate linkability scores. MOBIUS: http://www.public.asu.edu/~huanliu/papers/kdd2013.pdf NEMO: http://precog.iiitd.edu.in/Publications_files/19wole01-jain.pdf HYDRA: http://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=3650&context=sis_research
  36. Reformed Linkability Score: Authorize and Fetch Data → Feature Extraction (MOBIUS, NEMO, HYDRA) → Linkability Score
  37. MOBIUS (example): srishti.chandok, srishti.chandok, SrishtiChandok; score = 1
  38. NEMO (example): Username: Srishti.chandok & SrishtiChandok. Name: Srishti Chandok & Srishti Chandok. Location: New Delhi, India & Delhi, India. Scores combined: Username Similarity Score, Name Similarity Score, Location Similarity Score, Profile Image Similarity Score, Content Similarity Score → Weighted Sum. Values shown on the slide: 1; 0.75, 0; 1; 0.98; 1; 0.79
  39. HYDRA (example): Name: Srishti Chandok & Srishti Chandok. Education: 'Salwan Public School, Rajinder Nagar, New Delhi 110060', 'I P University', 'maharaja agarsen institute of technology', 'IIIT Delhi' & MTech from IIIT Delhi, BTech from Maharaja Agrasen Institute of Technology, IPU, SPS. Profession: IIIT Delhi Teaching Assistant & MTech from IIIT Delhi, BTech from Maharaja Agrasen Institute of Technology, IPU, SPS. Website: http://www.twitter.com/SrishtiChandok & twitter.com/srishtichandok. Scores combined: Name Similarity Score, Education Similarity Score, Profession Similarity Score, Website Similarity Score, Content Similarity Score → Weighted Sum. Values shown on the slide: 0.32; 0.24; 1; 1; 1; 0.5
  40. Linkability Nudge (architecture diagram repeated): 1. Baseline Methods (Weighted Sum, Probabilistic) 2. Reformed Linkability Score
  41. Linkability Nudge: Soft paternalistic intervention. Alerts users whenever user behavior leads to a change in linkability score beyond the pre-configured range.
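The trigger condition for the nudge, a recomputed score falling outside the range the user configured for a pair of networks, can be sketched as follows. The data layout (a dictionary keyed by network pair holding `(low, high)` tuples) and the function name are assumptions made for illustration only.

```python
def should_nudge(new_score, preferred_range):
    """Return True when the recomputed score leaves the preferred range."""
    low, high = preferred_range
    return not (low <= new_score <= high)


# Hypothetical per-pair configuration; frozenset makes the pair order-free.
preferred_ranges = {
    frozenset({"facebook", "twitter"}): (0.0, 0.4),
    frozenset({"facebook", "instagram"}): (0.2, 0.6),
}

pair = frozenset({"twitter", "facebook"})
print(should_nudge(0.31, preferred_ranges[pair]))
print(should_nudge(0.55, preferred_ranges[pair]))
```

Treating the bounds as inclusive is a design choice here; a score exactly at the configured limit does not trigger a nudge.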
  42. Components of Linkability Nudge: Browser Extension; Nudge Server; Linkability Compute Server
  43. 1. Browser Extension: Maintains the user's identity across the entire user session. Captures the user's posting activity and changes in profile attributes on all configured OSM platforms. Displays the linkability nudge in various forms (notifications and color). Diagram: the user downloads the Chrome browser extension; Browser Extension → Nudge Server: user’s activity information; Nudge Server → Browser Extension: linkability score and pie charts.
  44. 2. Nudge Server: Intermediary between the browser extension and the linkability compute server. Receives the user's access tokens from the browser extension and sends them to the linkability compute server to obtain the user's data. Passes information about the user's activities, such as making a post or changing a profile attribute, to the linkability compute server. Sends the newly computed linkability scores to the browser extension from time to time, based upon the user's activities. Diagram: Browser Extension → Nudge Server: access tokens for various OSNs; Nudge Server → Linkability Compute Server: forwarded access tokens and user’s activity information; Linkability Compute Server → Nudge Server → Browser Extension: linkability score.
  45. 3. Linkability Compute Server: Fetches the user's data from the API endpoints. Implements the identity resolution methods to compute linkability scores. Receives every user’s activity information (post or profile attribute), recomputes linkability scores and sends them back to the nudge server. Diagram: the Linkability Compute Server (identity resolution algorithms: NEMO | HYDRA | MOBIUS) fetches the user’s data and returns the linkability score to the Nudge Server.
  46. Nudge Design: Content-driven Color Nudge - ✦ Similar post → Red color nudge ✦ Dissimilar post → Green color nudge. Attribute-driven Notification Nudge - ✦ Profile attribute update → Linkability score crosses range → Notification pop-up nudge
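The content-driven color decision can be illustrated with a minimal sketch that compares a draft post against the user's earlier posts on another network. It swaps in token-level Jaccard similarity and a 0.5 threshold as stand-ins; the slides do not specify the similarity measure or cut-off actually used, so both are assumptions.

```python
import re


def tokens(text):
    """Lowercased word tokens of a post."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity between the token sets of two posts."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def color_nudge(draft, earlier_posts, threshold=0.5):
    """Return 'red' if the draft resembles any earlier post, else 'green'."""
    best = max((jaccard(draft, post) for post in earlier_posts), default=0.0)
    return "red" if best >= threshold else "green"


posts_on_other_network = ["loving the sunset at india gate today"]
print(color_nudge("Loving the sunset at India Gate", posts_on_other_network))
print(color_nudge("Reading a new book", posts_on_other_network))
```

In the first call the draft shares six of seven distinct tokens with the earlier post, so the sketch returns red; the second draft shares none and returns green.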
  47. Content-driven Color Nudge
  48. Attribute-driven Notification Nudge
  49. Demo
  50. User Evaluation of the Nudge: Control Period - no exposure to linkability nudge; tasks (post and profile updates). Treatment Period - exposure to linkability nudge; tasks (post and profile updates).
  51. Analysis of User Evaluation: 58% of the participants understood the broad concept of linkability score. 42% of participants were more aware of the linkability of their multiple identities across OSNs. 84% of the participants noticed the factors contributing to their linkability scores. 83% of the participants liked the color nudge and pie charts more. Figure: activities performed by one of the participants.
  52. Limitations: Time delay (2-5 seconds) while making a post during the treatment period. Uniform weights were used for computing linkability scores. The system works for three social networks. The nudge was evaluated with a small number of participants.
  53. Conclusions: Leverage features from well-known methods for identity resolution (NEMO, HYDRA and MOBIUS) and use the proposed baseline method, Weighted Sum, to compute the linkability scores. Identify the factors (profile attributes and content) that have contributed to the computed linkability score. Design and develop the linkability nudge, a soft intervention which alerts users whenever user behavior leads to a change in linkability score beyond the preconfigured range. Perform a detailed user study in a controlled lab experiment setting to assess the effectiveness and utility of the proposed linkability nudge.
  54. Acknowledgements: Rishabh Kaushal, PhD, IIIT-Delhi; Committee Members; Sonal and Sonu, Precogers; Precog members, family and friends
  55. Thank You!
  57. Appendix
  58. Features and Metrics for Baseline Methods
      Feature Name | Metrics
      username | Hamming Distance, Longest Common Subsequence, Edit Distance, Cosine Distance, Jaccard Distance, Jaro Winkler Distance
      name | Length of Common Substring, Length of Common Prefix & Common Suffix
      location | Length of Common Substring, Geo-location (Latitude & Longitude)
      website | Canonical URL matching
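Two of the username metrics listed above, Longest Common Subsequence and Edit Distance, can be sketched with standard dynamic programming as below. Dividing by the longer string's length to map each metric into [0, 1] is an assumption for illustration; the slides do not state how the raw metrics are normalized into feature scores.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two strings."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, 1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def edit_distance(a, b):
    """Levenshtein distance (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, 1):
        curr = [i]
        for j, ch_b in enumerate(b, 1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def lcs_similarity(a, b):
    """LCS length divided by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    return lcs_length(a, b) / longest if longest else 1.0


def edit_similarity(a, b):
    """1 minus edit distance over the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    return 1 - edit_distance(a, b) / longest if longest else 1.0


print(lcs_similarity("bob_c", "bob_d"), edit_similarity("bob_c", "bob_d"))
```

For the Type I negative example [bob_c, bob_d], both normalized scores come out at 0.8, which shows why near-identical usernames alone cannot separate such pairs from true matches.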
  59. NEMO
  60. HYDRA
