Privacy-Enhanced Personalization


Talk given by Alfred Kobsa at the IHM'10 conference.




  • 1. Privacy-Enhanced Personalization
    Alfred Kobsa
    Donald Bren School of Information and Computer Sciences, University of California, Irvine
  • 2. “Traditional” personalization on the World Wide Web
    – Tailored email alerts: 64%
    – Customized content: 48%
    – Account access: 48%
    – Personal productivity tools: 23%
    – Wish lists: 23%
    – Product recommendations: 23%
    – Saved links: 20%
    – Express transactions: 16%
    – Targeted marketing/advertising: 11%
    – Custom pricing: 9%
    – Personalized content through non-PC devices: 7%
    – News clipping services: 5%
    Percent of 44 companies interviewed (multiple responses accepted). Source: Forrester Research
  • 3. More recent deployed personalization
    – Personalized search
    – Web courses that tailor their teaching strategy to each individual student
    – Information and recommendations by portable devices that consider users’ location and habits
    – Personalized news (to mobile devices)
    – Product descriptions whose complexity is geared towards the presumed level of user expertise
    – Tailored presentations that take into account the user’s preferences regarding product presentation and media types (text, graphics, video)
  • 4. Current personalization methods (in 30 seconds)
    Data sources
    – Explicit user input
    – User interaction logs
    Methods
    – Assignment to user groups
    – Rule-based inferences
    – Machine learning
    Storage of data about users
    – Persistent user profile
    – Updated over time
  • 5. Web personalization delivers benefits for both users and web vendors
    Jupiter Communications, 1998: Personalization at 25 consumer e-commerce sites increased the number of new customers by 47% in the first year, and revenues by 52%.
    Nielsen NetRatings, 1999:
    – Registered visitors to portal sites spend over 3 times longer at their home portal than other users, and view 3 to 4 times more pages at their portal
    – E-commerce sites offering personalized services convert significantly more visitors into buyers than those that don’t.
    Choicestream 2004, 2005:
    – 80% interested in personalized content
    – 60% willing to spend at least 2 minutes answering questions about themselves
    Downside: personalized sites collect significantly more personal data than regular websites, and often do so in a very inconspicuous manner.
  • 6. Many computer users are concerned about their privacy online
    Number of users who reported:
    – being extremely or very concerned about divulging personal information online: 67% (Forrester 1999), 74% (AARP 2000)
    – being (extremely) concerned about being tracked online: 77% (AARP 2000)
    – leaving web sites that required registration information: 41% (Boston Consulting 1997)
    – having entered fake registration information: 40% (GVU 1998), 27% (Boston Consulting 1997), 32% (Forrester 1999)
    – having refrained from shopping online due to privacy concerns, or bought less: 32% (Forrester 1999), 32%/35%/54% (IBM 1999), 24% (AARP 2000)
    – wanting internet sites to ask for permission to use personal data: 81% (Pew 2000)
    – being willing to give out personal data in return for something valuable: 31% (GVU 1998), 30% (Forrester 1999), 51% (Personalization Consortium)
  • 7. Privacy surveys do not seem to predict people’s privacy-related actions very well (Harper and Singleton 2001; Personalization Consortium)
    – In several privacy studies in e-commerce contexts, discrepancies have been observed between users stating high privacy concerns but subsequently disclosing personal data carelessly.
    – Several authors therefore challenge the genuineness of such reported privacy attitudes and emphasize the need for experiments that allow for an observation of actual online disclosure behavior.
  • 8. Either personalization or privacy?
    – Personal data of computer users are indispensable for personalized interaction
    – Computer users are reluctant to give out personal data
    ☛ Tradeoff between privacy and personalization?
  • 9. The tension between privacy and personalization is more complex than that…
    – Indirect relationship between privacy and personalization
    – Situation-dependent
    – Many mitigating factors
    People use a “privacy calculus” to decide whether or not to disclose personal data, e.g. for personalization purposes
  • 10. Privacy-Enhanced Personalization
    Can we have good personalization and good privacy at the same time?
    How can personalized systems maximize their personalization benefits, while at the same time complying with the privacy constraints that are in effect?
  • 11. Privacy constraints, and how to deal with them
    Privacy constraints
    A. People’s privacy preferences in a given situation (and factors that influence them)
    B. Privacy norms (laws, self-regulation, principles)
    Reconciliation of privacy and personalization
    1. Use of privacy-enhancing technology
    2. Privacy-minded user interaction design
  • 12. Privacy constraints, and how to deal with them
    Privacy constraints
    A. People’s privacy preferences in a given situation (and factors that influence them)
    B. Privacy norms (laws, self-regulation, principles)
    Reconciliation of privacy and personalization
    1. Use of privacy-enhancing technology
    2. Privacy-minded user interaction design
  • 13. A. Factors that influence people’s willingness to disclose information
    Information type
    – Basic demographic and lifestyle information, personal tastes, hobbies
    – Internet behavior and purchases
    – Extended demographic information
    – Financial and contact information
    – Credit card and social security numbers
    Data values
    – Willingness to disclose certain data decreases with deviance from the group average
    – Confirmed for age, weight, salary, spousal salary, credit rating and amount of savings
  • 14. Valuation of personalization
    “the consumers’ value for personalization is almost two times […] more influential than the consumers’ concern for privacy in determining usage of personalization services” (Chellappa and Sin 2005)
    What types of personalization do users value?
    – time savings, monetary savings, pleasure
    – customized content, remembering preferences
  • 15. Trust
    Trust positively affects both people’s stated willingness to provide personal information to websites, and their actual information disclosure to experimental websites.
    Antecedents of trust
    – Past experience (of oneself and of others); established, long-term relationships are especially important
    – Design and operation of a website
    – Reputation of the website operator
    – Presence of a privacy statement / seal
    – Awareness of and control over the use of personal data
  • 16. Privacy constraints, and how to deal with them
    Privacy constraints
    A. People’s privacy preferences in a given situation (and factors that influence them)
    B. Privacy norms (laws, self-regulation, principles)
    Reconciliation of privacy and personalization
    1. Use of privacy-enhancing technology
    2. Privacy-minded user interaction design
  • 17. B. Privacy norms
    – Privacy laws: more than 40 countries and 100 states and provinces worldwide
    – Industry self-regulation: companies, industry sectors (NAI)
    – Privacy principles:
      – supra-national (OECD, APEC)
      – national (Australia, Canada, New Zealand…)
      – member organizations (ACM)
    Several privacy laws disallow a number of frequently used personalization methods unless the user consents
  • 18. Privacy constraints, and how to deal with them
    Privacy constraints
    A. People’s privacy preferences in a given situation (and factors that influence them)
    B. Privacy norms (laws, self-regulation, principles)
    Reconciliation of privacy and personalization
    1. Use of privacy-enhancing technology
    2. Privacy-minded user interaction design
  • 19. 1. Privacy-enhancing personalization technology
    Sometimes variants of personalization methods are available that are more privacy-friendly, with little or even no loss in personalization quality
  • 20. Privacy-Enhancing Technology for Personalized Systems
    Caveats
    – PETs are not “complete solutions” to the privacy problems posed by personalized systems
    – Their presence is also unlikely to “charm away” users’ privacy concerns
    → PETs should be used judiciously in a user-oriented system design that takes both privacy attitudes and privacy norms into account
  • 21. Pseudonymous users and user models
    Users of personalized systems can receive full personalization and remain anonymous (users are unidentifiable, unlinkable and unobservable for third parties, but linkable for the personalized system)
    Pros
    – In most cases, privacy laws no longer apply when users cannot be identified with reasonable means.
    – Pseudonymous users may be more inclined to disclose personal data (but this still needs to be shown!), leading to better personalization.
    Cons
    – Anonymity is currently difficult and/or tedious to preserve when payments, physical goods and non-electronic services are exchanged.
    – Harbors the risk of misuse.
    – Hinders vendors from cross-channel personalization.
    – Anonymity of database entries, web trails, query terms, ratings and texts can be surprisingly well defeated by an attacker who has identified data available that can be partly matched with the “anonymous” data.
  • 22. Client-side personalization
    Users’ data are all located at the client rather than the server side. All personalization processes that rely on these data are carried out exclusively at the client side.
    Pros
    – A website with personalization at the client side only is generally not subject to privacy laws.
    – Users may be more inclined to disclose their personal data if personalization is performed locally, upon locally stored data only.
    Cons
    – Popular user modeling and personalization methods that rely on an analysis of data from the whole user population cannot be applied any more, or will have to be radically redesigned.
    – Server-side personalization algorithms often incorporate confidential rules/methods, and must be protected from unauthorized access. → Trusted computing platforms
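As a minimal sketch of the client-side idea (all names and data here are hypothetical, not from the talk), the interest profile and the matching logic can both live on the user's device, while the server only ships a catalog that is identical for every user:

```python
# Hypothetical sketch of client-side personalization: the profile stays on
# the client; the server only delivers non-personal catalog data.

CATALOG = [  # shipped by the server, identical for every user
    {"title": "Intro to Sailing", "topics": {"sports", "travel"}},
    {"title": "Privacy Law Primer", "topics": {"law", "privacy"}},
    {"title": "Bayesian Methods", "topics": {"statistics", "ml"}},
]

def recommend_locally(profile_interests, catalog, k=2):
    """Rank catalog items by overlap with the locally stored interest
    profile. No interest data ever leaves the client."""
    scored = sorted(catalog,
                    key=lambda item: len(item["topics"] & profile_interests),
                    reverse=True)
    return [item["title"] for item in scored[:k]]

local_profile = {"privacy", "law"}  # stored only on the user's device
print(recommend_locally(local_profile, CATALOG))  # 'Privacy Law Primer' first
```

Nothing identifying is transmitted, which is why such a site is generally outside the scope of privacy laws; the price, as the slide notes, is that population-wide methods such as collaborative filtering no longer apply directly.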
  • 23. Privacy-enhancing techniques for collaborative filtering
    – Traditional collaborative filtering systems collect large amounts of user data in a central repository (product ratings, purchased products, visited web pages).
    – Central repositories may not be trustworthy and can be targets for attacks.
    Privacy-protecting techniques
    Distribution: no central repository, but interacting distributed clusters that contain information about a few users only.
    Aggregation of encrypted data
    – Users privately maintain their own individual ratings
    – A community of users computes an aggregate of their private data with homomorphic encryption and peer-to-peer communication
    – The aggregate allows personalized recommendations to be generated at the client side
    Perturbation: ratings are systematically altered before submission to the central server, to hide users’ true values from the server.
    Obfuscation
    – A certain percentage of users’ ratings are replaced by different values before the ratings are submitted to a central server for collaborative filtering.
    – Users can then “plausibly deny” the accuracy of any of their data.
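The perturbation and obfuscation ideas above can be sketched in a few lines; this is a simplified illustration of the two transformations, not the exact algorithms from the cited literature:

```python
import random

def perturb(ratings, noise=1.0, lo=1.0, hi=5.0):
    """Perturbation: add bounded random noise to every rating before it is
    submitted, so the server never sees the user's true values."""
    return [min(hi, max(lo, r + random.uniform(-noise, noise)))
            for r in ratings]

def obfuscate(ratings, fraction=0.2, lo=1, hi=5):
    """Obfuscation: replace a random fraction of ratings with random values,
    so the user can plausibly deny the accuracy of any single rating."""
    out = list(ratings)
    for i in random.sample(range(len(out)), int(len(out) * fraction)):
        out[i] = random.randint(lo, hi)
    return out

true_ratings = [4, 2, 5, 3, 1]
print(perturb(true_ratings))         # noisy copy, still within the 1..5 scale
print(obfuscate(true_ratings, 0.4))  # two of the five ratings replaced
```

In both cases the server can still compute useful population-level aggregates (the noise averages out over many users), while any individual submitted value is unreliable as evidence about that user.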
  • 24. Privacy constraints, and how to deal with themPrivacy constraintsA. People’s privacy preferences in a given situation (and factors that influence them)B. Privacy norms (laws, self-regulation, principles)Reconciliation of privacy and personalization1. Use of privacy-enhancing technology2. Privacy-minded user interaction design
  • 25. 2. Privacy-minded user interaction design
    Two experiments on the privacy-minded communication of websites’ privacy practices
  • 26. Current “industry standard”: privacy statements
    Privacy statements…
    – are regarded as very important by 76% (DTI 2001)
    – make 55% more comfortable providing personal information (Roy Morgan 2001, Gartner 2001)
    – are claimed to be viewed by 73% [“always” by 26%] (Harris Interactive 2001)
    – are effectively viewed by only 0.5% (Kohavi 2001), 1% (Reagan 2001)
    – are several readability levels too difficult (Lutz 2004; Jensen & Potts 2004)
  • 27. Our counterproposal: a design pattern for personalized websites that collect user data
    Design patterns are descriptions of best practices based on research and application experience. They give designers guidelines for the efficient and effective design of user interfaces.
    Every personalized site that collects user data should include the following elements on every page:
    1. Traditional link to a global privacy statement
       – Still necessary for legal reasons
    2. Additionally, contextualized local communication of privacy practices and personalization benefits
       – Break long privacy policies into small, understandable pieces
       – Relate them specifically to the current context
       – Explain privacy practices as well as personalization benefits
  • 28. An example webpage based on the proposed design pattern
    (screenshot with callouts: traditional link to a privacy statement; explanation of personalization benefits; explanation of privacy practices)
  • 29. The tested “experimental new version of an online bookstore”
    (screenshot with callouts: links to the original privacy statement, split into privacy, security and personalization notices; “selection counter”; contextualized short description of relevant privacy practices, taken from the original privacy statement; contextualized short description of relevant personalization benefits, derived from the original privacy statement)
  • 30. Traditional version, with link to privacy statement only
    (screenshot with callouts: links to the original privacy statement, split into privacy, security and personalization notices; “selection counter”)
  • 31. Why not simply ask users what they like better?
    – Inquiry-based empirical studies reveal aspects of users’ rationale that cannot be inferred from mere observation
    – Observational empirical studies reveal actual user behavior, which may differ from users’ stated behavior
    → This divergence seems to be quite substantial in the case of stated privacy preferences vs. actual privacy-related behavior
  • 32. Experimental procedures (partly based on deception)
    1. Instructions to subjects
       – “Usability test with a new version of a well-known online book retailer”
       – Answering questions to allegedly obtain better book recommendations
       – No obligation to answer any question, but answering helps obtain better recommendations
       – Data that subjects entered would purportedly be available to the company
       – Possibility to buy one of the recommended books at a 70% discount
       – Reminder that if they buy a book, their ID card and bank/credit card will be checked (subjects were instructed beforehand to bring these documents if they wish to buy)
    2. Answering interest questions in order to “filter the selection set” (anonymous)
       – 32 questions with 86/64 answer options are presented (some free-text)
       – Most questions were about users’ interests (a few were fairly sensitive)
       – All “make sense” in the context of filtering books that are interesting for readers
       – Answering questions decreased the “selection counter” in a systematic manner
       – After nine pages of data entry, users are encouraged to review their entries, and then to view those books that purportedly match their interests
  • 33. Experimental procedures (cont’d)
    3. “Recommendation” of 50 books (anonymous)
       – 50 predetermined and invariant books are displayed (popular fiction, politics, travel, sex and health advisories)
       – Selected based on their low price and their presumable attractiveness for students
       – Prices of all books are visibly marked down by 70%, resulting in out-of-pocket expenses between €2 and €12 for a book purchase
       – Extensive information on every book is available
    4. Purchase of one book (identified)
       – Subjects may purchase one book if they wish
       – Those who do are asked for their names, shipping and payment data (bank account or credit card charge)
    5. Completing questionnaires
    6. Verification of name, address and bank data (if a book was purchased)
  • 34. Subjects and experimental design
    – 58 economics/business/MIS students of Humboldt University, Berlin, Germany (data of 6 students were eventually discarded)
    – Randomly assigned to one of two system versions:
      a) 26 used the traditional version (that only had links to global descriptions of privacy, personalization and security)
      b) 26 used the contextualized version (with additional local information on privacy practices and personalization benefits for each entry field)
    Hypotheses:
    – Users will be more willing to share personal data
    – Users will regard the enhanced site more favorably in terms of privacy
  • 35. Results (control → with contextual explanations)
    Observations
    – Questions answered: 84% → 91% (+8%, p<0.001)
    – Answers given: 56% → 67% (+20%, p<0.001)
    – Book buyers: 58% → 77% (+33%, p=0.07)
    Perception
    – “Privacy has priority”: 3.34 → 3.95 (+18%, p=0.01)
    – “Data is used responsibly”: 3.62 → 3.91 (+8%, p=0.12)
    – “Data allowed store to select better books”: 2.85 → 3.40 (+19%, p=0.035)
  • 36. Privacy, trust and personalization
    (diagram of positive influence paths among: understanding, control of own data, perceived privacy, trust, perceived benefits, quality of personalization, willingness to disclose personal data, and purchases)
  • 37. Other factors to be studied
    – Site reputation
    – Is privacy or benefit disclosure more important?
    – Stringency of privacy practices
    – Permanent visibility of contextual explanations
    – References to the full privacy policy
    (slide also repeats the results table from slide 35)
  • 38. Privacy Finder (Cranor et al.)
    P3P
    – P3P is a W3C standard that allows a website to create a machine-readable version of its privacy policy
    – P3P policies can be automatically downloaded and evaluated against an individual’s privacy preferences
    Privacy Finder is a search engine that
    – displays the search results
    – evaluates the P3P policies of the retrieved sites
    – compares them with the user’s individual preferences
    – displays this information graphically
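The core of such an evaluation can be sketched as a comparison of a site's stated practices against the user's preferences. The data model below is a deliberately simplified, hypothetical stand-in; real P3P policies are XML documents with a much richer vocabulary of purposes, recipients and retention classes:

```python
def policy_acceptable(policy, prefs):
    """Accept a site only if every stated data-use purpose is one the user
    tolerates, and the stated retention period is short enough."""
    return (policy["purposes"] <= prefs["accepted_purposes"]
            and policy["retention_days"] <= prefs["max_retention_days"])

# Illustrative site policy and user preferences:
site_policy = {"purposes": {"current", "admin"}, "retention_days": 90}
user_prefs = {"accepted_purposes": {"current", "admin", "develop"},
              "max_retention_days": 365}
print(policy_acceptable(site_policy, user_prefs))  # True: within preferences
```

Privacy Finder runs this kind of check automatically for every retrieved site and surfaces the verdict next to the search result, so the comparison costs the user no extra reading effort.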
  • 39. Will online shoppers pay a premium for privacy? (Tsai et al., WEIS 2007)
    – Participants were adults from the general Pittsburgh population (the 12.5% who were privacy-unconcerned were screened out)
    – They received a flat fee of $45, and had to buy a battery pack and a sex toy online (each about $15 with shipping). They could keep the rest.
    – Three conditions, each with 16 subjects
    – Subjects had to enter a pre-determined search expression, and were shown a list of real search results that were artificially ranked and supplemented with additional information:
      – condition no-info: total price only (with shipping)
      – condition irrelevant-info: total price, accessibility rating
      – condition privacy-info: total price, privacy rating
  • 40. Condition “privacy info”
  • 41. Condition “privacy info”
  • 42. Results
    1. Between conditions no-info and irrelevant-info: no significant difference in average purchase price
    2. Between conditions no-info and privacy-info: users with privacy info paid an average premium of $0.59 for batteries and $0.62 for sex toys (p<0.0001)
    3. In the absence of prominent privacy information, people purchase where the price is lowest. Percentage of users who bought at the cheapest site:
       – no-info: 83% (batteries), 80% (sex toys)
       – irrelevant-info: 75% (batteries), 67% (sex toys)
       – privacy-info: 22% (batteries), 33% (sex toys)
  • 43. Ratio of purchases by level of privacy
  • 44. There is no magic bullet for reconciling personalization with privacy
    The effort is comparable to…
    – making systems secure
    – making systems fast
    – making systems reliable
  • 45. Privacy-Enhanced Personalization: need for a process approach
    1. Gain the user’s trust
       – Respect the user’s privacy attitude (and let the user know)
         • Respect privacy laws / industry privacy agreements
       – Provide benefits (including optimal personalization within the given privacy constraints)
       – Increase the user’s understanding (don’t do magic)
       – Use trust-enhancing methods
       – Give users control
       – Use privacy-enhancing technology (and let the user know)
    2. Then be patient: most users will incrementally come forward with personal data and permissions if the usage purpose of the data and the ensuing benefits are clear and valuable enough to them.
  • 46. Existing approaches for catering to privacy constraints
    – Largest permissible denominator (e.g., Disney)
      – Infeasible if a large number of jurisdictions is involved, since the largest permissible denominator would be very small
      – Individual preferences not taken into account
    – Different country/region versions (e.g., IBM)
      – Infeasible as soon as the number of countries/regions, and hence the number of different versions of the personalized system, increases
      – Individual preferences not taken into account
    – Anonymous personalization (users are not identified)
      – Nearly full personalization possible
      – Harbors the risk of misuse
      – Slightly difficult to implement if physical shipments are involved
      – Practical extent of protection unclear
      – Individual user preferences not taken into account
  • 47. Our approach
    Seeking a mechanism to dynamically select user modeling components that comply with the currently prevailing privacy constraints
  • 48. User modeling component pool
    Each component uses some combination of demographic data, user-supplied data, and visited pages:
    – UMC1: clustering
    – UMC2: rule-based reasoning
    – UMC3: fuzzy reasoning with uncertainty
    – UMC4: rule-based reasoning
    – UMC5: fuzzy reasoning with uncertainty
    – UMC6: incremental machine learning
    – UMC7: one-time machine learning across sessions
    – UMC8: one-time machine learning + fuzzy reasoning with uncertainty
    Different methods differ in their data requirements, quality of predictions, and also their privacy implications
  • 49. Product line architecture
    – “The common architecture for a set of related products or systems developed by an organization.” [Bosch, 2000]
    – A PLA includes
      – Stable core: basic functionalities
      – Options: optional features/qualities
      – Variants: alternative features/qualities
    – Dynamic runtime selection (van der Hoek 2002): a particular architecture instance is selected from the product-line architecture
  • 50. Our approach: selection component (Fink & Kobsa 2003)
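The selection idea can be sketched as filtering the user modeling component pool against the data sources that the prevailing privacy constraints permit. Component names follow slide 48, but the data requirements assigned below are illustrative, not taken from the original table:

```python
# Sketch of dynamic component selection: from the user modeling component
# pool, keep only the components whose data requirements are permitted under
# the privacy constraints currently in effect (laws or user preferences).

COMPONENT_POOL = {
    "UMC1": {"method": "clustering", "data": {"demographic"}},
    "UMC2": {"method": "rule-based reasoning", "data": {"user-supplied"}},
    "UMC6": {"method": "incremental machine learning",
             "data": {"user-supplied", "visited-pages"}},
}

def select_components(pool, permitted_data):
    """Return the components that use only permitted data sources."""
    return {name: comp for name, comp in pool.items()
            if comp["data"] <= permitted_data}

# E.g., constraints that forbid clickstream (visited-pages) logging:
allowed = select_components(COMPONENT_POOL, {"demographic", "user-supplied"})
print(sorted(allowed))  # ['UMC1', 'UMC2'] — UMC6 needs visited pages
```

Because the check runs at selection time, the same personalized system can instantiate different component subsets per jurisdiction or per user, instead of shipping one hard-coded version.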
  • 51. Example: .com with privacy
  • 52. The privacy constraints
  • 53. Roadmap for Privacy-Enhanced Personalization research
    – Study the impacts of privacy laws, industry regulations and individual privacy preferences on the admissibility of personalization methods
    – Provide optimal personalization while respecting privacy constraints
    – Apply state-of-the-art industry practice for managing the combinatorial complexity of privacy constraints
  • 54. Readings…
    – A. Kobsa: Privacy-Enhanced Web Personalization. In P. Brusilovsky, A. Kobsa, W. Nejdl, eds.: The Adaptive Web: Methods and Strategies of Web Personalization. Springer Verlag.
    – A. Kobsa: Privacy-Enhanced Personalization. Communications of the ACM, Aug. 2007.
  • 55. Survey of privacy laws