Achieving Optimal Privacy in Trust-Aware Collaborative Filtering Recommender Systems

  • 30 minutes = 25 min presentation (P) + 5 min Q&A (Q) = max 25 slides
  • Shilling attacks: An underhanded and cheap way to increase recommendation frequency is to manipulate or trick the system into doing so. This can be done by having a group of users (human or agent) use the recommender system and provide specially crafted "opinions" that cause it to make the desired recommendation more often. For example, it has been shown that a number of book reviews published on Amazon.com are actually written by the author of the book being reviewed. A consumer trying to decide which book to purchase could be misled by such reviews into believing that the book is better than it really is. This is known as a shilling attack, and recommender systems should protect against such attacks.
  • Here we can briefly describe the proposed method. Normalizing the data is a critical step for increasing the privacy level: normalization is the process of isolating statistical error in repeatedly measured data, and it is often based on a property of the data's distribution, such as its mean and standard deviation.

    We utilize the z-score transformation for normalizing data. Since z-score values have zero mean, we can hide them by adding random numbers drawn from a distribution with zero mean and a predefined standard deviation. As a result, all users make computations with their masked z-scores instead of their actual ratings; a minimal sketch follows below.
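    The slides do not prescribe an implementation, so the following Python sketch is only illustrative; the function name `mask_ratings` is hypothetical, and the reading of β as the Gaussian noise standard deviation and δ as the uniform noise half-width (matching the (δ, β) notation used later) is an assumption:

```python
import numpy as np

def mask_ratings(ratings, beta=1.0, delta=1.0, dist="gaussian", rng=None):
    """Z-score normalize a user's ratings, then add zero-mean random noise.

    dist="gaussian" draws noise from N(0, beta^2);
    dist="uniform" draws noise from U(-delta, +delta).
    Both distributions have zero mean, so the noise hides the z-scores
    without shifting them on average.
    """
    rng = rng or np.random.default_rng()
    z = (ratings - ratings.mean()) / ratings.std()  # zero mean, unit variance
    if dist == "gaussian":
        noise = rng.normal(0.0, beta, size=z.shape)
    else:
        noise = rng.uniform(-delta, delta, size=z.shape)
    return z + noise  # masked data used in all subsequent computations

# Example: masking one user's ratings on a 5-point scale
masked = mask_ratings(np.array([4.0, 2.0, 5.0, 3.0, 4.0]))
```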
  • To protect the private data, the level of perturbation is vital. If the amount of noise is too low, the masked data still discloses a considerable amount of information; if it is too high, accuracy will be very low.

    If we take into account the configurations that affect the privacy mechanism on the one hand, and the configurations affecting trust on the other,
    we can argue that an optimal setting can be defined where privacy and accuracy are both maintained at the same time.

  • If the goal is achieving acceptable accuracy and acceptable privacy at the same time, the optimization problem becomes multi-objective. As a result, the problem of achieving a trade-off between accuracy and privacy in the current context becomes a Pareto optimization problem; a sketch of the Pareto filtering step follows below.
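    As a minimal illustration of what Pareto optimality means here (names and the sample numbers are hypothetical, not taken from the paper), each configuration can be scored on two objectives to be minimized, privacy loss and MAE, and the non-dominated set retained:

```python
def pareto_front(points):
    """Return the non-dominated (privacy_loss, mae) pairs.

    A point is dominated if another point is no worse on both
    objectives and strictly better on at least one (lower is better).
    """
    def dominates(a, b):
        return a[0] <= b[0] and a[1] <= b[1] and a != b
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical measurements over different (delta, beta, n, t) settings
points = [(0.2, 0.88), (0.4, 0.80), (0.5, 0.90), (0.6, 0.79)]
print(pareto_front(points))  # (0.5, 0.90) is dominated by (0.4, 0.80)
```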
  • Other applicable optimization approaches include the Skyline operator and the Maximal Vector Problem.
  • We perturb the overall user data using Gaussian and uniform distributions parameterized by (δ, β);
    (δ, β) = (1, 1) yields the best results, as it exhibits the minimal privacy loss.

  • Comparing the MAE results of the framework under masked data,
    the setting (n, t) = (3, 100), with (δ, β) fixed at (1, 1), yields reasonable accuracy while privacy is maintained; MAE is computed as sketched below.
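    For reference, MAE here is the standard mean absolute error between predicted and actual ratings; a minimal Python version (a hypothetical helper, not from the paper's code):

```python
import numpy as np

def mae(predicted, actual):
    """Mean Absolute Error: average of |prediction - true rating|."""
    predicted, actual = np.asarray(predicted), np.asarray(actual)
    return float(np.abs(predicted - actual).mean())

print(mae([3.8, 2.1], [4.0, 2.0]))  # ≈ 0.15
```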
  • So we evaluate different intervals of f, with the system fixed at the (δ, β, n, t) configuration from the previous step.
    By observing consistent accuracy across the different f intervals, we can fine-tune the configuration from the previous step and infer an optimal privacy configuration; a sketch of the filling step follows below.
    Taking the results into account (Fig. 4),
    we observe a consistent increase across the intervals of f, which finalizes the choice of n, t, δ, β and yields the ordered set n=3, t=100, δ=1, β=1, f = [0, d], supporting both accurate and private recommendations.
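    The filling step can be sketched as follows (illustrative Python; the function name and the reading of f as the fraction of unrated cells to fill are assumptions, since the slides express f as an interval [0, d]):

```python
import numpy as np

def fill_unrated(masked_z, rated_mask, f, beta=1.0, rng=None):
    """Fill a share f of unrated cells with zero-mean random values.

    The filled cells look just like masked z-scores, so an observer
    cannot tell which items the user actually rated.
    """
    rng = rng or np.random.default_rng()
    out = masked_z.copy()
    unrated = np.flatnonzero(~rated_mask)
    chosen = rng.choice(unrated, size=int(f * unrated.size), replace=False)
    out[chosen] = rng.normal(0.0, beta, size=chosen.size)
    return out
```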
  • We proposed a framework for enabling privacy-preserving, trust-aware recommendation generation.

    A Pareto set can always be found that makes a trade-off between privacy and accuracy, and we
    presented a heuristic that experimentally infers this set.

    We also showed that privacy increases under the proposed framework, while even under optimal privacy our framework outperforms the base framework in its best configuration. As a result, privacy can be introduced into trust recommenders and can be optimized to avoid private data loss while at the same time producing accurate recommendations.

  • Achieving Optimal Privacy in Trust-Aware Collaborative Filtering Recommender Systems

    1. 1. Achieving Optimal Privacy in Trust-Aware Social Recommender Systems Nima Dokoohaki, Cihan Kaleli, Huseyin Polat, Mihhail Matskin The Second International Conference on Social Informatics (SocInfo’10) 27-29 October, 2010, Laxenburg, Austria
    2. 2. Emergence of Trust in Social Recommender Systems • Most successful recommenders employ well-known collaborative filtering (CF) techniques - Social Recommender Systems (SRS) are those CF-based recommenders that use a social network as a backbone. • CF automates the word-of-mouth process - finding users similar to the user receiving the recommendation and suggesting items rated highly in the past by users with similar taste. • Shortcoming: sparsity of the user-rating matrix - there are always numerous items and the ratings scored by users are sparse, so the step of finding similar users often fails. Trust is proposed as a remedy.
    3. 3. Extending Social Recommender Systems with a Trust Metric • Extend CF recommenders as follows: - utilize a trust metric, which enables a trust-based heuristic to propagate and find users who are trustworthy with respect to the active user for whom we are gathering/generating recommendations. • Trust has been shown to improve the accuracy of recommenders (Golbeck, Ziegler, Massa, ...). • For a complete list of problems addressed by trust recommenders, see - Massa, P., & Avesani, P. Trust Metrics in Recommender Systems. In Computing with Social Trust (pp. 259-285), 2009.
    4. 4. Problems with Existing Trust-Aware Recommenders • Privacy and lack of decentralization ... • Growing concern about vulnerability to shilling attacks: - current implementations are centralized or not tested in a decentralized fashion. • Current research has paid little attention to clearly addressing the privacy issues surrounding the architecture and components of trust recommenders.
    5. 5. Privacy Issues with Social Recommender Systems • CF systems, including social-network-based ones, have several advantages. However, they fail to protect users' privacy. • ... Also, data collected for CF can be used for unsolicited marketing, government surveillance, profiling users, etc. • Users who remain concerned about their privacy ... - users might decide to give false data, which affects the production of truthful recommendations. • This in turn leads to a decrease in the accuracy of the recommender system.
    6. 6. Motivation and Contributions • Emphasizing the importance of dealing with privacy issues surrounding the architecture and components of trust-aware recommender systems. - Extending the architecture of trust recommenders with a privacy-preserving module. - Proposing the use of data perturbation techniques to protect users' privacy while still providing accurate recommendations. • Dealing with the conflict between privacy goals and trust goals through agent mechanisms - utilizing Pareto efficiency.
    7. 7. Trust-Aware Recommender System Architecture Taken from Massa, P., & Avesani, P. “Trust Metrics in Recommender Systems”. In Computing with Social Trust (pp. 259-285), 2009.
    8. 8. Private Trust-Aware Recommender Architecture [diagram: inputs are a Trust matrix [N×N] and a Rating matrix [N×M] (N users, M items); first step: a Private Trust Metric produces Estimated Private Trust [N×N]; second step: a Trust-aware Similarity Metric produces User Similarity [N×N], and a Private Rating Predictor over Disguised Ratings [N×M] outputs Predicted Ratings [N×M]; together these form Private Trust-Aware Collaborative Filtering on top of pure CF]
    9. 9. Privacy Protection Methodology: Data Normalization with z-score • Normalization of data is critical for increasing the privacy level. • For privacy protection, users employ data perturbation techniques. We propose to use a normalized version of the actual ratings to improve the privacy level. • As a result, z-score values are utilized. • *The z-score of an item indicates how far, and in what direction, that item deviates from its distribution's mean, expressed in units of its distribution's standard deviation. *W. Du and H. Polat. Privacy-preserving collaborative filtering. International Journal of Electronic Commerce, 9(4):9-36, 2005.
    10. 10. Privacy Protection Methodology: Random Perturbations • To disguise data, users add random numbers to z-scores. They select such random numbers from two different distributions: Gaussian and uniform. • Since adding random numbers hides only the ratings of rated items, users also add random ratings to hide unrated items. • After disguising their private data, users compute trust between each other.
    11. 11. Private Trust Estimation: Trust Formalization • Assume there are two users, ua and ub. We formalize the trust between them as follows: [trust formula shown as an image on the slide]
    12. 12. Private Trust Estimation: Trust Estimation
    13. 13. Private Recommendation Process: Producing Referrals
    14. 14. Mutual Effects of Trust and Privacy: Notion of Conflict • Privacy and accuracy are conflicting goals. • Conflict - trust metrics, at each step of trust estimation, increase or maintain the accuracy of predictions. - Increasing the amount of perturbation leads to further information loss. • Dealing with the conflict through optimization - we can argue that an optimal setting can be defined where privacy and accuracy are both maintained at the same time.
    15. 15. Optimization Design Space • PCS (privacy configuration set) • TCS (trust configuration set) • The problem space consists of all possible combinations of PCS and TCS settings (see the sketch below).
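    A minimal sketch of enumerating this design space (Python; the grids shown are hypothetical examples that loosely echo the parameter values appearing in the experiment slides):

```python
from itertools import product

# Hypothetical grids: PCS = privacy settings, TCS = trust settings
PCS = [("gaussian", beta) for beta in (1, 2, 3, 4)] + \
      [("uniform", delta) for delta in (1, 2, 3, 4)]
TCS = [(n, t) for n in (2, 3, 5) for t in (0, 100)]

# The problem space is every combination of a privacy and a trust configuration
design_space = list(product(PCS, TCS))
print(len(design_space))  # 8 * 6 = 48 candidate configurations
```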
    16. 16. Mapping Design Space to a Pareto Optimization Space
    17. 17. Inferring the Optimal Privacy Set • Heuristic. To infer the OPS, the following heuristic is used (sketched below): 1. Perturb the overall user data using different PCS settings; 2. Observe the framework under variations of TCS (steps 2 and 3 are interchangeable depending on the goals at hand); 3. Perturb the sparse user data with the PCS inferred from step 2, which allows inferring the OPS and finalizing the Pareto-optimal setting.
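    The heuristic can be sketched as a staged grid search (illustrative Python; `evaluate` is a hypothetical callback returning the MAE of the framework under a given setting, standing in for the experiments on the following slides; scoring by accuracy alone is a simplification, since the paper also weighs privacy loss when reading the charts):

```python
def infer_ops(pcs_grid, tcs_grid, f_grid, evaluate):
    """Staged heuristic from the slide, as a simple grid search.

    evaluate(pcs, tcs, f) -> MAE of the framework under that setting.
    """
    # Step 1: perturb the overall data; keep the PCS with the best accuracy
    pcs = min(pcs_grid, key=lambda p: evaluate(p, tcs_grid[0], f=None))
    # Step 2: vary the trust settings under the chosen PCS
    tcs = min(tcs_grid, key=lambda t: evaluate(pcs, t, f=None))
    # Step 3: perturb the sparse data (vary f) with (PCS, TCS) fixed
    f = min(f_grid, key=lambda x: evaluate(pcs, tcs, x))
    return pcs, tcs, f  # the finalized Pareto-optimal setting
```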
    18. 18. Evaluating the Recommendation Framework: Dataset • Two sets of experiments: - the first set demonstrates the effect of inserting random data on the accuracy of the predictions generated as output of the recommendation system; - the second set demonstrates how filling unrated items with varying f values affects the overall accuracy of the recommender system. • MovieLens dataset, http://www.grouplens.org/node/73 - 943 user rating profiles with more than 100,000 rating values; ratings are on a 5-point scale.
    19. 19. Evaluating the Recommendation Framework: Recommender • We used the trust recommender from: - S. Fazeli, A. Zarghami, N. Dokoohaki, and M. Matskin, "Elevating Prediction Accuracy in Trust-aware Collaborative Filtering Recommenders through T-index Metric and TopTrustee Lists," Journal of Emerging Technologies in Web Intelligence (JETWI), 2010. • A decentralized trust-aware recommender: - T-index as a trust metric for filtering trust between users; unlike previous approaches, - a trust network between users can automatically be built from existing ratings; - a Distributed Hash Table (DHT)-like list of trustees, the TopTrusteeList (TTL) [19], that wraps around items rated similarly to those of the current user.
    20. 20. MAE of the recommendation framework without adding any perturbations [chart: MAE from ≈0.85 to ≈0.92 over n ∈ {2, 3, 5, 10, 20, 50} for T ∈ {0, 25, 50, 100, 200, 500, 1000}] Zarghami, A., Fazeli, S., Dokoohaki, N., & Matskin, M. (2009). Social Trust-Aware Recommendation System: A T-Index Approach. In Web Intelligence and Intelligent Agent Technology, IEEE/WIC/ACM International Conference on (Vol. 3, pp. 85-90). IEEE Computer Society. doi: 10.1109/WI-IAT.2009.237.
    21. 21. MAE with perturbations added to user data under a Gaussian distribution [chart: MAE from ≈0.42 to ≈2.92 over β ∈ {1, 2, 3, 4} for (N, T) ∈ {(2, 0), (2, 100), (3, 0), (3, 100), (5, 0)}]
    22. 22. MAE with perturbations added to user data under a uniform distribution [chart: MAE from ≈0.62 to ≈1.62 over δ ∈ {1, 2, 3, 4} for (N, T) ∈ {(2, 0), (2, 100), (3, 0), (3, 100), (5, 0)}]
    23. 23. Perturbing the overall user data using Gaussian and uniform distributions (δ, β) [both charts from slides 21-22 shown side by side, highlighting (δ, β) = (1, 1)]
    24. 24. Comparing the MAE results of the framework under masked data [charts from slides 21-22 restricted to (N, T) = (3, 100), highlighting (δ, β) = (1, 1)]
    25. 25. Filling sparse data with a random Gaussian distribution with respect to f [chart: MAE from ≈0.74 to ≈0.80 at half, full, and double density]
    26. 26. Fine-Tuning the Privacy • Perturb the sparse user data with the (δ, β, n, t) inferred from the previous step to fine-tune the privacy. • We observe a consistent increase across intervals of f, which finalizes the choice of n, t, δ, β. • We finalize the results in the ordered set n=3, t=100, δ=1, β=1, f = [0, d], which will be the Pareto front.
    27. 27. Inferring the Optimality Set: Comparison with Non-Masked Results • Optimality holds under masked data. • Comparing the MAE results of the non-masked framework with the framework under masking: - we inferred the optimum values β=1, n=3, and t=100, for which MAE = 0.7994, while for similar parameters without adding perturbations we obtain MAE = 0.881. • The MAE results under masking are still lower than the MAE results without perturbations: - without perturbations, the best result is MAE = 0.863 at (n, t) = (50, 100), which is still greater than our optimum value.
    28. 28. MAE results under masking are still better than the MAE results without adding perturbations.
    29. 29. Conclusions • A framework addressing the problem of privacy in trust recommenders is proposed. • Privacy and accuracy are conflicting goals. • Through experiments we showed that we can infer a setting whose accuracy holds up even against the trust recommender without privacy measures. • As a result, privacy can be introduced into trust recommenders and can be optimized to avoid private data loss while at the same time producing accurate recommendations.
    30. 30. Thank you Nima Dokoohaki http://web.it.kth.se/~nimad/
