Resource recommendation vs privacy enhancement

1/21
On Content-Based Recommendation and Users Privacy in Social Tagging Systems
Silvia Puglisi (silvia.puglisi@upc.edu)
Research Seminar, Master in Telematics Engineering, UPC
Departament d'Enginyeria Telemàtica
Barcelona, UPC, 2013

2/21
What is social tagging?
Social tagging is the activity that allows users to assign keywords (tags) to
web-based resources.

3/21
Tagging and tags
Tag: a label attached to someone or something for identification or other
information

4/21
Scenario
Social tagging enables semantic interoperability in web applications.
Recommendation and information filtering systems have been developed to
predict users' preferences.
Users hence reveal their personal preferences on social tagging platforms.
Privacy-enhancing techniques (PET) have been developed to protect user
privacy to a certain extent, at the expense of semantic loss.

5/21
Objective
This study takes as its starting point research in the field of recommender
systems [1] and PET [2].
Its objective is to evaluate the impact of two PETs, tag forgery and
suppression, on the performance of a recommendation system, using real-world
application data.
[1] Bellogín, Alejandro, Iván Cantador, and Pablo Castells. "A comparative study of heterogeneous item recommendations in social systems." Information Sciences (2012).
[2] Parra-Arnau, Javier, David Rebollo-Monedero, and Jordi Forné. "A privacy-protecting architecture for collaborative filtering via forgery and suppression of ratings." Data Privacy Management and Autonomous Spontaneous Security (2012): 42-57.

6/21
Dataset
Among the social bookmarking platforms considered, Delicious was identified as
a representative application rich in collaborative tagging information.
Delicious is a social bookmarking platform for web resources.
The Delicious dataset was obtained from those made publicly available at the
2nd International Workshop on Information Heterogeneity and Fusion in
Recommender Systems.

7/21
Delicious

8/21
Techniques
Modelling the User/Item Profile
The simplest approach to model users and items is to count the number of
times a tag has been used:
•By a user to annotate different items in the same category.
•Or by the community to annotate the item.
The user/item profile is then described as a histogram of the relative
frequencies of tags within a predefined set of categories of interest.
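As an illustration of the histogram described above, here is a minimal Python sketch. It assumes each tag has already been mapped to one category of interest; the function name `tag_profile` and the example categories are hypothetical and are not taken from the study.

```python
from collections import Counter


def tag_profile(tag_categories, categories):
    """Relative frequency of tags per category (hypothetical helper).

    tag_categories: one category label per tag assignment.
    categories: the predefined, ordered set of categories of interest.
    """
    counts = Counter(tag_categories)
    total = sum(counts[c] for c in categories)
    if total == 0:
        # No tags in any category of interest: return an all-zero profile.
        return [0.0] * len(categories)
    return [counts[c] / total for c in categories]


# Illustrative data only: four tag assignments over three categories.
categories = ["tech", "news", "music"]
user_tags = ["tech", "tech", "music", "tech"]
profile = tag_profile(user_tags, categories)
```

The same function can describe an item, by passing the categories of the tags that the community attached to it.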

9/21
Techniques
Histogram of a user profile

10/21
Techniques
Privacy Metric
The Kullback-Leibler (KL) divergence has been adopted as the privacy criterion,
following Jaynes' rationale on entropy maximization methods.
Since the KL divergence may be regarded as a generalization of the entropy of a
distribution relative to another, it is often referred to as relative entropy.
$$D(p \,\|\, q) = \mathrm{E}_p \log \frac{p(x)}{q(x)} = \sum_x p(x) \log \frac{p(x)}{q(x)}$$
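The KL divergence between two profiles can be computed directly from its definition. The sketch below is illustrative only: the logarithm base is not stated in the slides, so base 2 (bits) is an assumption, and the function name `kl_divergence` is hypothetical.

```python
import math


def kl_divergence(p, q):
    """D(p || q) in bits for two discrete distributions of equal length.

    Terms with p(x) = 0 contribute nothing; if p(x) > 0 where q(x) = 0,
    the divergence is infinite.
    """
    total = 0.0
    for px, qx in zip(p, q):
        if px == 0:
            continue
        if qx == 0:
            return math.inf
        total += px * math.log2(px / qx)
    return total


# Illustrative profiles only.
same = kl_divergence([0.5, 0.5], [0.5, 0.5])
skewed = kl_divergence([1.0, 0.0], [0.5, 0.5])
```

Identical profiles give a divergence of zero, and the value grows as the first profile departs from the second.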

11/21
Techniques
Utility Metric
A measure of how useful an item is for a certain user is needed.
We can say that an item is useful if its profile is similar to the user
profile.
Hence we need a measure of similarity.
Content-based recommender models are defined as similarity measures
between user and item profiles. The measure used here is the cosine-based
similarity.
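A minimal sketch of cosine similarity between a user profile and an item profile follows. It uses the plain textbook form (dot product divided by the product of the norms); any tag weighting used in the study is not shown in the slides, so this is an assumption, and `cosine_similarity` is a hypothetical name.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two profile vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # An empty profile is treated as dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)


# Illustrative profiles only.
user_profile = [0.75, 0.0, 0.25]
item_profile = [0.5, 0.5, 0.0]
score = cosine_similarity(user_profile, item_profile)
```

Items can then be ranked for a user by decreasing score.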

12/21
Techniques
Performance Metric
The recommender system is evaluated in a content retrieval scenario
where a user is provided with a ranked list of N recommended items.
The performance metric adopted is therefore one commonly used for
ranked list prediction, namely precision at top N.
In the field of Information Retrieval, precision is defined as the fraction of
recommended items that are relevant for a target user.
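Precision at top N can be sketched in a few lines. The function name `precision_at_n` and the example items are hypothetical; how relevance is decided for a user is outside the scope of this sketch.

```python
def precision_at_n(ranked_items, relevant_items, n):
    """Fraction of the top-n recommended items that are relevant."""
    if n <= 0:
        raise ValueError("n must be positive")
    top_n = ranked_items[:n]
    hits = sum(1 for item in top_n if item in relevant_items)
    # Divide by n, so a list shorter than n is penalised.
    return hits / n


# Illustrative data only: a ranked list and the user's relevant items.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c", "x"}
p_at_4 = precision_at_n(ranked, relevant, 4)
```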

13/21
Techniques
Tag Forgery and Suppression
Tag suppression and forgery are privacy-enhancing techniques that help users
who tag resources online to avoid revealing sensitive information to a possible
attacker.

14/21
Techniques
Tag Forgery and Suppression Rates
The tag forgery rate ρ ∈ [0,1) represents the ratio of forged items.
The tag suppression rate σ ∈ [0,1) is the proportion of items that the user
consents to eliminate.

15/21
Techniques
The Privacy-Forgery-Suppression Function
Consistently the privacy-forgery-suppression function can be defined:
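$$P(\rho,\sigma) = \max_{r,s} D\!\left(\frac{q + r - s}{1 + \rho - \sigma}\right)$$
subject to
$$r_i \ge 0,\quad \sum_{i=1}^{n} r_i = \rho,\qquad q_i \ge s_i \ge 0,\quad \sum_{i=1}^{n} s_i = \sigma$$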
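To make the argument of the function concrete, the sketch below computes the apparent profile (q + r − s) / (1 + ρ − σ) for one given pair of forgery and suppression strategies, after checking the constraints. It does not solve the optimization over r and s; the function name `apparent_profile` and the example numbers are hypothetical.

```python
def apparent_profile(q, r, s, tol=1e-9):
    """Profile seen by an observer after forgery r and suppression s.

    q: genuine profile (relative frequencies, sums to 1).
    r: forgery strategy, r_i >= 0, with sum(r) = rho.
    s: suppression strategy, 0 <= s_i <= q_i, with sum(s) = sigma.
    """
    if any(ri < 0 for ri in r):
        raise ValueError("forgery components must be non-negative")
    if any(si < 0 or si > qi + tol for qi, si in zip(q, s)):
        raise ValueError("suppression must satisfy 0 <= s_i <= q_i")
    rho, sigma = sum(r), sum(s)
    scale = 1 + rho - sigma
    return [(qi + ri - si) / scale for qi, ri, si in zip(q, r, s)]


# Illustrative values only: suppress 0.1 in the first category and
# forge 0.1 in the third, so rho = sigma = 0.1.
q = [0.5, 0.3, 0.2]
r = [0.0, 0.0, 0.1]
s = [0.1, 0.0, 0.0]
t = apparent_profile(q, r, s)
```

The resulting profile is again a probability distribution, flatter than the genuine one in this example.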

16/21
Evaluation

17/21
Evaluation
Statistics about the dataset
Statistic                   Value
Categories                  11
Items                       69226
Item-category tuples        98998
Avg. items per category     81044
Avg. categories per item    1.4
Users                       1867
Avg. tags per user          477.75
Tags per item               13.06

18/21
Results
Relative Risk Reduction with forgery - Utility
$$100 \times \frac{D_{\text{init}}(u_m \,\|\, P) - D_{\rho,\sigma}(u_m \,\|\, P)}{D_{\text{init}}(u_m \,\|\, P)}$$
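The relative risk reduction above is a percentage change between two divergence values. A minimal sketch, with a hypothetical function name and made-up input values:

```python
def relative_risk_reduction(d_init, d_perturbed):
    """Percentage reduction of the divergence after forgery/suppression.

    d_init: divergence of the genuine user profile from the population.
    d_perturbed: divergence of the perturbed profile from the population.
    """
    if d_init <= 0:
        raise ValueError("initial divergence must be positive")
    return 100.0 * (d_init - d_perturbed) / d_init


# Illustrative values only, not results from the study.
reduction = relative_risk_reduction(2.0, 1.0)
```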

19/21
Results
Relative Risk Reduction with suppression - Utility

20/21
Conclusions
Tag suppression and forgery are simple privacy-enhancing techniques able to
protect users' privacy at the cost of some semantic loss.
This study shows, with a simple experimental evaluation in a real-world
application scenario, that the performance degradation of a recommender
system is small compared to the privacy risk reduction offered by
these techniques.

21/21
Thank you!
