Research Seminar. Silvia Puglisi
Departament d'Enginyeria Telemàtica
Silvia Puglisi
silvia.puglisi@upc.edu
“Research Seminar”
Master in Telematics Engineering-UPC
On Content-Based Recommendation and Users Privacy in Social Tagging Systems
Silvia Puglisi
Barcelona, UPC, 2013
What is social tagging?
Social tagging is the activity by which users assign keywords (tags) to
web-based resources.
Tagging and tags
Tag: a label attached to someone or something for identification or other
information
Scenario
Social tagging enables semantic interoperability in web applications.
Recommendation and information filtering systems have been developed to
predict users' preferences.
Users hence reveal their personal preferences on social tagging platforms.
Privacy-enhancing techniques (PETs) have been developed to protect user
privacy to a certain extent, at the expense of some semantic loss.
Objective
This study takes as its starting point existing research on recommender
systems [1] and PETs [2].
The objective is to evaluate the impact of two PETs, tag forgery and tag
suppression, on the performance of a recommender system, using real-world
application data.
[1] Bellogín, Alejandro, Iván Cantador, and Pablo Castells. "A comparative study of heterogeneous item
recommendations in social systems." Information Sciences (2012)
[2] Parra-Arnau, Javier, David Rebollo-Monedero, and Jordi Forné. "A privacy-protecting architecture for collaborative
filtering via forgery and suppression of ratings." Data Privacy Management and Autonomous Spontaneous Security
(2012): 42-57.
Dataset
Among the social bookmarking platforms considered, Delicious was identified as
representative of applications rich in collaborative tagging information.
Delicious is a social bookmarking platform for web resources.
The Delicious dataset was obtained from those made publicly available at the
2nd International Workshop on Information Heterogeneity and Fusion in
Recommender Systems.
Delicious
Techniques
Modelling the User/Item Profile
The simplest approach to modelling users and items is to count the number of
times a tag has been used:
• by a user to annotate different items in the same category, or
• by the community to annotate the item.
The user/item profile is then described as a histogram of the relative
frequencies of tags within a predefined set of categories of interest.
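The profile described above can be sketched as a normalised histogram. This is a minimal illustration, not the code used in the study: it assumes each tag has already been mapped to one of the predefined categories, and the category names and the function name `tag_profile` are hypothetical.

```python
from collections import Counter


def tag_profile(tagged_categories, categories):
    """Return the relative frequency of tags in each predefined category.

    tagged_categories: one category label per tag assignment (assumed to be
    the result of some tag-to-category mapping not shown here).
    categories: the fixed, ordered list of categories of interest.
    """
    counts = Counter(tagged_categories)
    total = sum(counts[c] for c in categories)
    if total == 0:
        # No tags in any category of interest: return an all-zero profile.
        return [0.0] * len(categories)
    return [counts[c] / total for c in categories]


# Hypothetical example: three categories and four tag assignments by one user.
categories = ["news", "tech", "music"]
user_tags = ["tech", "tech", "music", "tech"]
profile = tag_profile(user_tags, categories)
print(profile)
```

The same function can be applied to the tags the community assigned to an item, which yields an item profile over the same categories.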
Techniques
Histogram of a user profile
Techniques
Privacy Metric
The Kullback-Leibler (KL) divergence has been adopted as the privacy criterion,
following Jaynes' rationale on entropy maximization methods.
Since the KL divergence may be regarded as a generalization of the entropy of a
distribution relative to another, it is often referred to as relative entropy.
$$ D(p \,\|\, q) = \mathrm{E}_p \log \frac{p(x)}{q(x)} = \sum_x p(x) \log \frac{p(x)}{q(x)} $$
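As an illustration of the definition above, the sum can be computed directly for two discrete profiles. This is a sketch under stated assumptions: the logarithm base is not given on the slide, so base 2 (bits) is assumed here, and the example distributions are made up.

```python
import math


def kl_divergence(p, q):
    """D(p || q) = sum_x p(x) * log2(p(x) / q(x)) for discrete distributions.

    Base-2 logarithm is an assumption; the slide does not state the base.
    """
    total = 0.0
    for px, qx in zip(p, q):
        if px == 0.0:
            continue  # the term 0 * log(0 / q) is taken as 0 by convention
        if qx == 0.0:
            return math.inf  # p has mass where q has none
        total += px * math.log2(px / qx)
    return total


# Hypothetical profiles over three categories.
user = [0.5, 0.5, 0.0]
population = [0.25, 0.25, 0.5]
print(kl_divergence(user, population))
```

A divergence of zero means the two profiles coincide; larger values mean the first profile stands out more from the second.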
Techniques
Utility Metric
A measure of how useful an item is to a given user is needed.
We can say that an item is useful if its profile is similar to the
user profile.
Hence we need a measure of similarity.
Content-based recommender models are defined as similarity measures
between user and item profiles. Here this is provided by the cosine-based
similarity measure.
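The formula on this slide did not survive extraction. The sketch below uses the standard definition of cosine similarity (dot product divided by the product of the Euclidean norms); whether the study used exactly this form or a weighted variant is an assumption, and the function name is illustrative.

```python
import math


def cosine_similarity(u, v):
    """Standard cosine similarity between two equal-length profile vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty profile is treated as similar to nothing
    return dot / (norm_u * norm_v)


# Hypothetical user and item profiles over the same three categories.
user_profile = [0.0, 0.75, 0.25]
item_profile = [0.1, 0.6, 0.3]
print(cosine_similarity(user_profile, item_profile))
```

Items can then be ranked for a user by decreasing similarity between the item profile and the user profile.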
Techniques
Performance Metric
The recommender system is evaluated in a content retrieval scenario
where a user is provided with a ranked list of N recommended items.
The performance metric adopted is therefore one commonly used for
ranked list prediction: precision at top N.
In Information Retrieval, precision is defined as the fraction of
recommended items that are relevant to the target user.
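Precision at top N can be sketched as follows. The item identifiers and the relevance set are hypothetical, and dividing by N rather than by the length of a shorter list is a common convention assumed here.

```python
def precision_at_n(ranked_items, relevant_items, n):
    """Fraction of the top-n recommended items that are relevant."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    top_n = ranked_items[:n]
    hits = sum(1 for item in top_n if item in relevant_items)
    return hits / n


# Hypothetical ranked recommendations and ground-truth relevant items.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c", "x"}
print(precision_at_n(ranked, relevant, 4))
```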
Techniques
Tag Forgery and Suppression
Tag suppression and forgery are privacy-enhancing techniques that help users
who tag resources online avoid revealing sensitive information to a possible
attacker.
Techniques
Tag Forgery and Suppression Rates
The tag forgery rate ρ ∈ [0,1) represents the ratio of forged items.
The tag suppression rate σ ∈ [0,1) is the proportion of items that the user
consents to eliminate.
Techniques
The Privacy-Forgery-Suppression Function
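Consistently with these rates, the privacy-forgery-suppression function can be
defined as:
$$ P(\rho,\sigma) = \max_{r,s} \; D\!\left( \frac{q + r - s}{1 + \rho - \sigma} \right) $$
subject to
$$ r_i \ge 0, \quad \sum_{i=1}^{n} r_i = \rho, \qquad q_i \ge s_i \ge 0, \quad \sum_{i=1}^{n} s_i = \sigma. $$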
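The argument of D in the function above, the apparent profile (q + r − s)/(1 + ρ − σ), can be computed for a given choice of forgery vector r and suppression vector s. The sketch below only builds that profile and checks the stated constraints; it does not solve the optimization over r and s. The function name and the example vectors are hypothetical.

```python
def perturbed_profile(q, r, s):
    """Apparent profile (q + r - s) / (1 + rho - sigma).

    q: genuine profile, r: forged mass per category, s: suppressed mass per
    category, with rho = sum(r) and sigma = sum(s).
    """
    if not (len(q) == len(r) == len(s)):
        raise ValueError("q, r and s must have the same length")
    if any(ri < 0 for ri in r):
        raise ValueError("forgery components must satisfy r_i >= 0")
    if any(si < 0 or si > qi for qi, si in zip(q, s)):
        raise ValueError("suppression components must satisfy q_i >= s_i >= 0")
    rho = sum(r)
    sigma = sum(s)
    scale = 1.0 + rho - sigma
    if scale <= 0.0:
        raise ValueError("1 + rho - sigma must be positive")
    return [(qi + ri - si) / scale for qi, ri, si in zip(q, r, s)]


# Hypothetical example: forge 0.25 in category 2, suppress 0.25 in category 1.
q = [0.5, 0.25, 0.25]
r = [0.0, 0.25, 0.0]
s = [0.25, 0.0, 0.0]
t = perturbed_profile(q, r, s)
print(t)
```

Dividing by 1 + ρ − σ renormalises the result, so the apparent profile is again a probability distribution whenever q is one.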
Evaluation
Statistics about the dataset
Categories                 11       Users                     1867
Item-Category Tuples       98998    Avg. tags per user        477.75
Items                      69226    Avg. items per category   81044
Avg. categories per item   1.4      Tags per item             13.06
Results
Relative Risk Reduction with forgery - Utility
$$ 100 \times \frac{D_{\mathrm{init}}(u_m \,\|\, P) - D_{\rho,\sigma}(u_m \,\|\, P)}{D_{\mathrm{init}}(u_m \,\|\, P)} $$
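The relative risk reduction above is a percentage change between two divergence values. A minimal sketch of the arithmetic follows; the numbers in the example are invented for illustration and are not results from the study.

```python
def relative_risk_reduction(d_init, d_perturbed):
    """100 * (D_init - D_perturbed) / D_init, as a percentage.

    d_init: divergence of the genuine user profile from the population.
    d_perturbed: divergence after applying forgery and/or suppression.
    """
    if d_init <= 0.0:
        raise ValueError("d_init must be positive")
    return 100.0 * (d_init - d_perturbed) / d_init


# Illustrative values only (not taken from the evaluation).
print(relative_risk_reduction(2.0, 1.5))
```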
Results
Relative Risk Reduction with suppression - Utility
Conclusions
Tag suppression and forgery are simple privacy-enhancing techniques able to
protect users' privacy at the cost of some semantic loss.
This study shows, through a simple experimental evaluation in a real-world
application scenario, that the performance degradation of a recommender
system is small compared with the privacy risk reduction offered by
these techniques.
Thank you!

Resource recommendation vs privacy enhancement

  • 1. 1/21 Research Seminar. Silvia Puglisi Departament d'Enginyeria Telemàtica Silvia Puglisi silvia.puglisi@upc.edu “Research Seminar” Master in Telematics Engineering-UPC On Content-Based Recommendation and Users Privacy in Social Tagging Systems Silvia Puglisi Barcelona, UPC, 2013
  • 2. 2/21 Research Seminar. Silvia Puglisi Departament d'Enginyeria Telemàtica Social tagging is the activity that allows users to assign keywords (tags) to web based resources. What is social tagging?
  • 3. 3/21 Research Seminar. Silvia Puglisi Departament d'Enginyeria Telemàtica Tagging and tags Tag: a label attached to someone or something for identification or other information
  • 4. 4/21 Research Seminar. Silvia Puglisi Departament d'Enginyeria Telemàtica Scenario Social tagging enables semantic interoperability in web applications. Recommendation and information filtering systems have been developed to predict users preferences. Users hence reveal their personal preferences on social tagging platforms. Privacy enhancing techniques (PET) have been developed to protect user privacy to a certain extent, at the expense of semantic loss.
  • 5. 5/21 Research Seminar. Silvia Puglisi Departament d'Enginyeria Telemàtica Objective Using as starting point research done in the field of recommendations systems [1] and PET [2]. The objective of this study is evaluate the impact of two PET, tag forgery and suppression, on the performance of a recommendation system, on real world application data. [1] Bellogín, Alejandro, Iván Cantador, and Pablo Castells. "A comparative study of heterogeneous item recommendations in social systems." Information Sciences (2012) [2] Parra-Arnau, Javier, David Rebollo-Monedero, and Jordi Forné. "A privacy-protecting architecture for collaborative filtering via forgery and suppression of ratings." Data Privacy Management and Autonomous Spontaneus Security (2012): 42-57.
  • 6. 6/21 Research Seminar. Silvia Puglisi Departament d'Enginyeria Telemàtica Dataset Considering different social bookmarking platform, Delicious was identified as a representative system of an application rich in collaborative tagging information. Delicious is a social bookmarking platform for web resources. The dataset containing Delicious data was obtained from the ones publicly available at the 2nd International Workshop on Information Heterogeneity and Fusion in Recommender Systems.
  • 7. 7/21 Research Seminar. Silvia Puglisi Departament d'Enginyeria Telemàtica Delicious
  • 8. 8/21 Research Seminar. Silvia Puglisi Departament d'Enginyeria Telemàtica Techniques Modelling the User/Item Profile The simplest approach to model users and items is to count the number of times a tag has been used: •By a user to annotate different items in the same category. •Or by the community to annotate the item. The user/item profile is then described as a histogram of the relative frequencies of tags within a predefined set of categories of interest.
  • 9. Techniques: Histogram of a user profile
  • 10. Techniques: Privacy Metric. The Kullback-Leibler (KL) divergence has been adopted as the privacy criterion, following Jaynes' rationale on entropy maximization methods. Since the KL divergence may be regarded as a generalization of the entropy of a distribution, relative to another, it is often referred to as relative entropy. $D(p \,\|\, q) = \mathrm{E}_p\!\left[\log \frac{p(x)}{q(x)}\right] = \sum_x p(x) \log \frac{p(x)}{q(x)}$
  • 11. Techniques: Utility Metric. A measure of how useful an item is to a given user is needed. We can say that an item is useful if its profile is similar to the user profile, hence we need a measure of similarity. Content-based recommender models are defined as similarity measures between user and item profiles. Here the similarity is provided by the cosine-based similarity measure between the user profile and the item profile.
  • 12. Techniques: Performance Metric. The recommender system is evaluated in a content retrieval scenario where a user is provided with a ranked list of N recommended items. The performance metric adopted is therefore one commonly used for ranked-list prediction, namely precision at top N. In the field of Information Retrieval, precision can be defined as the fraction of recommended items that are relevant for a target user.
  • 13. Techniques: Tag Forgery and Suppression. Tag suppression and forgery are privacy enhancing techniques that help users who tag resources online avoid revealing sensitive information to a possible attacker.
  • 14. Techniques: Tag Forgery and Suppression Rates. The tag forgery rate, $\rho \in [0,1)$, represents the ratio of forged items. The tag suppression rate, $\sigma \in [0,1)$, is the proportion of items that the user consents to eliminate.
  • 15. Techniques: The Privacy-Forgery-Suppression Function. Accordingly, the privacy-forgery-suppression function can be defined as $P(\rho,\sigma) = \max_{r,s} D\!\left(\frac{q + r - s}{1 + \rho - \sigma}\right)$, subject to $r_i \ge 0$, $\sum_{i=1}^{n} r_i = \rho$, $q_i \ge s_i \ge 0$, $\sum_{i=1}^{n} s_i = \sigma$.
  • 16. Evaluation
  • 17. Evaluation: Statistics about the dataset. Categories: 11; Users: 1867; Item-Category Tuples: 98998; Avg. tags per user: 477.75; Items: 69226; Avg. Items per Category: 81044; Avg. categories per item: 1.4; Tags per item: 13.06.
  • 18. Results: Relative Risk Reduction with forgery - Utility. $100 \times \frac{D_{\mathrm{init}}(u_m \,\|\, P) - D_{\rho,\sigma}(u_m \,\|\, P)}{D_{\mathrm{init}}(u_m \,\|\, P)}$
  • 19. Results: Relative Risk Reduction with suppression - Utility
  • 20. Conclusions. Tag suppression and forgery are simple privacy enhancing techniques able to protect users' privacy at the cost of some semantic loss. This study shows, with a simple experimental evaluation in a real-world application scenario, that the performance degradation of a recommender system is small compared to the privacy risk reduction offered by these techniques.
  • 21. Thank you!
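
The recommender side of the slides (the tag histogram of slide 8, the cosine-based utility of slide 11 and precision at top N of slide 12) can be sketched roughly as follows. This is only an illustrative sketch, not the implementation used in the study: the function names, the category names and the toy data are all assumptions introduced here.

```python
from collections import Counter
from math import sqrt


def tag_profile(tag_categories, categories):
    """Histogram of relative tag frequencies over a fixed list of categories.

    tag_categories holds one category label per tag assignment (hypothetical input format).
    """
    counts = Counter(tag_categories)
    total = sum(counts[c] for c in categories)
    if total == 0:
        return [0.0] * len(categories)
    return [counts[c] / total for c in categories]


def cosine_similarity(u, v):
    """Cosine of the angle between two profiles; 0.0 if either profile is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(user_profile, item_profiles, n):
    """Return the ids of the n items whose profiles are most similar to the user's."""
    ranked = sorted(
        item_profiles,
        key=lambda item: cosine_similarity(user_profile, item_profiles[item]),
        reverse=True,
    )
    return ranked[:n]


def precision_at_n(recommended, relevant, n):
    """Fraction of the top-n recommended items that are relevant (n must be > 0)."""
    hits = sum(1 for item in recommended[:n] if item in relevant)
    return hits / n


# Toy data (made up for illustration only).
categories = ["arts", "science", "sports"]
user = tag_profile(["science", "science", "arts", "science"], categories)
items = {
    "a": [0.0, 1.0, 0.0],
    "b": [1.0, 0.0, 0.0],
    "c": [0.0, 0.0, 1.0],
}
top2 = recommend(user, items, 2)
print(user, top2, precision_at_n(top2, {"a", "c"}, 2))
```

Because the profiles are non-negative histograms, the similarity stays in [0, 1], which matches the remark on positive space in the editor's notes.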

Editor's Notes

  1. In information systems, a tag is a non-hierarchical keyword or term assigned to a piece of information (such as an Internet bookmark, digital image, or computer file). This kind of metadata helps describe an item and allows it to be found again by browsing or searching. Tags are generally chosen informally and personally by the item's creator or by its viewer, depending on the system. Labeling and tagging are carried out to perform functions such as aiding in classification, marking ownership, noting boundaries, and indicating online identity. They may take the form of words, images, or other identifying marks. An analogous example of tags in the physical world is museum object tagging. In the organization of information and objects, the use of textual keywords as part of identification and classification long predates computers. However, computer-based searching made the use of keywords a rapid way of exploring records.
  2. Tagging has gained wide popularity due to the growth of social networking, photo sharing and bookmarking sites. These sites allow users to create and manage labels (or "tags") that categorize content using simple keywords. The use of keywords as part of an identification and classification system long predates computers. In the early days of the web, keyword meta tags were used by web page designers to tell search engines what the web page was about. Today's tagging takes the meta keywords concept and re-uses it: the users add the tags, and the tags are clearly visible and are themselves links to other items that share that keyword tag. Users annotate items that are relevant to them. Tags describe interests, tastes, and needs.
  3. Tagging is popular. Anyone using web or mobile apps tags resources online. Many blog systems allow authors to add free-form tags to a post, along with (or instead of) placing the post into categories. For example, a post may display that it has been tagged with baseball and tickets. Each of those tags is usually a web link leading to an index page listing all of the posts associated with that tag. The blog may have a sidebar listing all the tags in use on that blog, with each tag leading to an index page. To reclassify a post, an author edits its list of tags. All connections between posts are automatically tracked and updated by the blog software; there is no need to relocate the page within a complex hierarchy of categories.
  4. One of the most popular privacy criteria in database anonymisation is k-anonymity, i.e., each combination of key attribute values is shared by at least k records in the set. K-anonymity is vulnerable to similarity attacks. An attacker will be able to compromise user privacy as long as the apparent user profile diverges from a reference probability measure. In probability theory and information theory, the Kullback–Leibler divergence (also information divergence, information gain, relative entropy, or KLIC) is a non-symmetric measure of the difference between two probability distributions P and Q. Specifically, the Kullback–Leibler divergence of Q from P, denoted $D_{\mathrm{KL}}(P \,\|\, Q)$, is a measure of the information lost when Q is used to approximate P: it measures the expected number of extra bits required to code samples from P when using a code based on Q, rather than a code based on P. Typically P represents the "true" distribution of data, observations, or a precisely calculated theoretical distribution. The measure Q typically represents a theory, model, description, or approximation of P. Although it is often intuited as a metric or distance, the KL divergence is not a true metric; for example, it is not symmetric: the KL from P to Q is generally not the same as the KL from Q to P. However, its infinitesimal form, specifically its Hessian, is a metric tensor: it is the Fisher information metric.
  5. Cosine similarity is a measure of similarity between two vectors of an inner product space that measures the cosine of the angle between them. The cosine of 0° is 1, and it is less than 1 for any other angle. It is thus a judgement of orientation and not magnitude: two vectors with the same orientation have a cosine similarity of 1, two vectors at 90° have a similarity of 0, and two diametrically opposed vectors have a similarity of -1, independent of their magnitude. Cosine similarity is particularly used in positive space, where the outcome is neatly bounded in [0,1].
  6. Two possible strategies can be contemplated for the user: a mixed strategy, where forgery and suppression are used in conjunction, and a pure strategy, where either forgery or suppression is applied. Only the pure strategy is evaluated in this study. q is introduced as the probability distribution of the known items of a particular user, i.e. the distribution capturing the actual preferences of the user.
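
On the privacy side, the KL divergence of slide 10, the apparent profile (q + r - s) / (1 + ρ - σ) of slide 15 and the relative risk reduction of slide 18 can be sketched as below. This is a hedged illustration only: the forgery vector r and the suppression vector s are picked by hand rather than obtained from the optimisation in the privacy-forgery-suppression function, the reference distribution is assumed to be uniform, the logarithm is taken in base 2, and all function names are hypothetical.

```python
from math import log2


def kl_divergence(p, q):
    """D(p || q) in bits; terms with p(x) = 0 contribute nothing."""
    total = 0.0
    for px, qx in zip(p, q):
        if px > 0:
            if qx == 0:
                return float("inf")
            total += px * log2(px / qx)
    return total


def apparent_profile(q, r, s):
    """Profile seen by an observer after forging r and suppressing s.

    Enforces the constraints r_i >= 0 and q_i >= s_i >= 0 from the slides.
    """
    if any(ri < 0 for ri in r):
        raise ValueError("forgery components must be non-negative")
    if any(si < 0 or si > qi for qi, si in zip(q, s)):
        raise ValueError("suppression components must satisfy 0 <= s_i <= q_i")
    rho, sigma = sum(r), sum(s)  # forgery and suppression rates
    scale = 1 + rho - sigma
    return [(qi + ri - si) / scale for qi, ri, si in zip(q, r, s)]


def relative_risk_reduction(d_init, d_after):
    """Percentage reduction of the divergence, as in the results slides."""
    return 100 * (d_init - d_after) / d_init


# Toy example (made-up numbers): a user interested in only two of four categories.
reference = [0.25, 0.25, 0.25, 0.25]  # assumed uniform reference profile
q = [0.5, 0.5, 0.0, 0.0]              # actual user profile
r = [0.0, 0.0, 0.25, 0.25]            # pure forgery strategy, rho = 0.5
s = [0.0, 0.0, 0.0, 0.0]              # no suppression, sigma = 0

t = apparent_profile(q, r, s)
d_init = kl_divergence(q, reference)
d_after = kl_divergence(t, reference)
print(t, d_init, d_after, relative_risk_reduction(d_init, d_after))
```

Forging tags in the categories the user never uses pulls the apparent profile towards the reference, so the divergence, and with it the privacy risk as measured here, goes down.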