Diversified Recommendation on Graphs: Pitfalls, Measures, and Algorithms

Onur Kucuktunc (presenter), Erik Saule, Kamer Kaya, Umit V. Catalyurek. WWW'13 presentation, Research Track: Recommender Systems and OSN.

Kucuktunc et al. “Diversified Recommendation on Graphs: Pitfalls, Measures, and Algorithms”, WWW’13 1/25
Diversified Recommendation on Graphs:
Pitfalls, Measures, and Algorithms
Onur Küçüktunç (1,2)
Erik Saule (1)
Kamer Kaya (1)
Ümit V. Çatalyürek (1,3)
WWW 2013, May 13–17, 2013, Rio de Janeiro, Brazil.
(1) Dept. of Biomedical Informatics
(2) Dept. of Computer Science and Engineering
(3) Dept. of Electrical and Computer Engineering
The Ohio State University
Outline
•  Problem definition
–  Motivation
–  Result diversification algorithms
•  How to measure diversity
–  Classical relevance and diversity measures
–  Bicriteria optimization?!
–  Combined measures
•  Best Coverage method
–  Complexity, submodularity
–  A greedy solution, relaxation
•  Experiments
Problem definition
G = (V, E), Q ⊆ V, R ⊂ V
Online shopping
•  G: product co-purchasing
•  Q: one product, previous purchases, page visit history
•  R: product recommendations (“you might also like…”)
Academic
•  G: paper-to-paper citations, collaboration network
•  Q: paper/field of interest, set of references, researcher himself/herself
•  R: references for related work, new collaborators
Social
•  G: friendship network
•  Q: user himself/herself, set of people
•  R: friend recommendations (“you might also know…”)
Let G = (V, E) be an undirected graph.
Given a set of m seed nodes
Q = {q1, . . . , qm} s.t. Q ⊆ V,
and a parameter k, return top-k items
which are relevant to the ones in Q,
but diverse among themselves, covering
different aspects of the query.
Problem definition
Let G = (V, E) be an undirected graph.
Given a set of m seed nodes
Q = {q1, . . . , qm} s.t. Q ⊆ V,
and a parameter k, return top-k items
which are relevant to the ones in Q,
but diverse among themselves, covering
different aspects of the query.
•  We assume that the graph itself is the only information we have, and
no categories or intents are available
•  no comparisons to intent-aware algorithms [Agrawal09,Welch11,etc.]
•  but we will compare against intent-aware measures
•  Relevance scores are obtained with Personalized PageRank (PPR)
[Haveliwala02]
\[ p^*(v) = \begin{cases} 1/m, & \text{if } v \in Q \\ 0, & \text{otherwise.} \end{cases} \]
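The relevance step above can be sketched as a plain power iteration. This is a minimal illustration, not the authors' implementation: the function name `personalized_pagerank`, the damping factor 0.85, the iteration count, and the toy graph are all assumptions made for the example. Only the restart distribution p* (1/m on the seeds, 0 elsewhere) comes from the slide.

```python
def personalized_pagerank(adj, seeds, damping=0.85, iters=100):
    """Power iteration for PPR on an undirected graph {node: [neighbors]}.

    damping and iters are illustrative defaults, not values from the slides.
    """
    m = len(seeds)
    # Restart distribution p*: 1/m on each seed node, 0 on every other node.
    prior = {v: (1.0 / m if v in seeds else 0.0) for v in adj}
    rank = dict(prior)
    for _ in range(iters):
        nxt = {v: (1.0 - damping) * prior[v] for v in adj}
        for u, nbrs in adj.items():
            if not nbrs:
                continue  # isolated vertex: its score is simply dropped in this sketch
            share = damping * rank[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        rank = nxt
    return rank


# Hypothetical toy graph: a path a-b-c-d with an extra leaf e attached to b.
adj = {"a": ["b"], "b": ["a", "c", "e"], "c": ["b", "d"], "d": ["c"], "e": ["b"]}
scores = personalized_pagerank(adj, seeds={"a"})
top = sorted(adj, key=scores.get, reverse=True)
```

Sorting by `scores` gives the relevance-only ranking that the diversification methods on the following slides start from.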
Result diversification algorithms
•  GrassHopper [Zhu07]
–  ranks the graph k times
•  turns the highest-ranked vertex into a sink node at each iteration
[Figure 1: (a) A toy data set. (b) The stationary distribution π reflects centrality; the highest-ranked item is selected as the first item g1. (c) The expected number of visits v to each node after g1 becomes an absorbing state. (d) After both g1 and g2 become absorbing states. Note the diversity: the selected items come from different groups.]
[Slide annotations: the highest-ranked vertex gives R = {g1}; g1 is turned into a sink node, and the highest-ranked vertex in the next step gives R = {g1,g2}, then R = {g1,g2,g3}.]
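The loop described on this slide (rank, turn the winner into a sink, rank again) can be sketched as follows. This is a deliberately simplified stand-in, not the algorithm of [Zhu07]: GrassHopper ranks the later items by expected visit counts in an absorbing Markov chain, whereas this sketch just reruns a PPR-style iteration in which already selected vertices pass no score on. The names `ppr_with_sinks` and `sink_based_top_k`, the damping value, the exclusion of seed nodes from the result, and the toy graph are assumptions.

```python
def ppr_with_sinks(adj, prior, sinks, damping=0.85, iters=100):
    """PPR-style iteration where sink vertices receive score but pass none on."""
    rank = dict(prior)
    for _ in range(iters):
        nxt = {v: (1.0 - damping) * prior[v] for v in adj}
        for u, nbrs in adj.items():
            if u in sinks or not nbrs:
                continue  # a sink absorbs the walk, so nothing flows out of it
            share = damping * rank[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        rank = nxt
    return rank


def sink_based_top_k(adj, seeds, k):
    """Rank the graph k times; after each round the winner becomes a sink."""
    prior = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in adj}
    selected = []
    for _ in range(k):
        rank = ppr_with_sinks(adj, prior, set(selected))
        # Assumption: seed nodes themselves are not recommended.
        candidates = [v for v in adj if v not in seeds and v not in selected]
        if not candidates:
            break
        # Ties are broken by vertex name so the result is deterministic.
        selected.append(max(candidates, key=lambda v: (rank[v], v)))
    return selected


# Hypothetical toy graph: seed q sits between two small groups around h1 and h2.
adj = {
    "q": ["h1", "h2"],
    "h1": ["q", "a1", "a2"], "a1": ["h1"], "a2": ["h1"],
    "h2": ["q", "b1", "b2"], "b1": ["h2"], "b2": ["h2"],
}
picks = sink_based_top_k(adj, seeds={"q"}, k=3)
```

Because a sink stops feeding its neighbours, vertices close to an already selected item tend to lose score in the next round, which is the intuition behind the diversity shown in the figure.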
Result diversification algorithms
•  GrassHopper [Zhu07]
–  ranks the graph k times
•  turns the highest-ranked vertex into a sink node at each iteration
•  DivRank [Mei10]
–  based on vertex-reinforced random walks (VRRW)
•  adjusts the transition matrix based on the number of visits to the
vertices (rich-gets-richer mechanism)
[Figure: (a) An illustrative network (sample graph). (b) Weighting with PageRank (slide label: weighting with PPR). (c) A diverse weighting.]
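The rich-gets-richer idea can be illustrated with a small reinforced walk. This is a sketch in the spirit of the DivRank bullet, not the method of [Mei10] as published: the current score of a vertex stands in for its visit count, and the function name `reinforced_walk_rank`, the parameters `alpha` and `damping`, their values, and the toy graph are all assumptions.

```python
def reinforced_walk_rank(adj, prior, alpha=0.25, damping=0.9, iters=50):
    """Random walk whose transitions are re-weighted by the targets' current scores."""
    rank = {v: 1.0 / len(adj) for v in adj}
    for _ in range(iters):
        nxt = {v: (1.0 - damping) * prior[v] for v in adj}
        for u, nbrs in adj.items():
            # Unreinforced transition from u: stay with weight 1 - alpha,
            # otherwise move to a uniformly chosen neighbor.
            organic = {v: alpha / len(nbrs) for v in nbrs}
            organic[u] = organic.get(u, 0.0) + (1.0 - alpha)
            # Reinforcement: targets that already score high attract more of
            # u's score (the current score is a proxy for the visit count).
            weights = {v: p * rank[v] for v, p in organic.items()}
            total = sum(weights.values())
            for v, w in weights.items():
                nxt[v] += damping * rank[u] * w / total
        rank = nxt
    return rank


# Hypothetical toy graph and a uniform prior.
adj = {
    "q": ["h1", "h2"],
    "h1": ["q", "a1", "a2"], "a1": ["h1"], "a2": ["h1"],
    "h2": ["q", "b1", "b2"], "b1": ["h2"], "b2": ["h2"],
}
prior = {v: 1.0 / len(adj) for v in adj}
weights = reinforced_walk_rank(adj, prior)
```

The re-weighting lets a strong vertex draw score away from its neighbours, so the final weighting concentrates on fewer, more spread-out vertices than plain PageRank would.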

  • 7. Result diversification algorithms
      •  GrassHopper [Zhu07]
          –  ranks the graph k times
          •  turns the highest-ranked vertex into a sink node at each iteration
      •  DivRank [Mei10]
          –  based on vertex-reinforced random walks (VRRW)
          •  adjusts the transition matrix based on the number of visits to the vertices (rich-gets-richer mechanism)
      •  Dragon [Tong11]
          –  based on optimizing the goodness measure
          •  penalizes the score when two neighbors are included in the results
  • 8. Measuring diversity
      Relevance measures
      •  Normalized relevance: $\mathrm{rel}(S) = \dfrac{\sum_{v \in S} \pi_v}{\sum_{i=1}^{k} \hat{\pi}_i}$
      •  Difference ratio: $\mathrm{diff}(S, \hat{S}) = 1 - \dfrac{|S \cap \hat{S}|}{|S|}$
      •  nDCG: $\mathrm{nDCG}_k = \dfrac{\pi_{s_1} + \sum_{i=2}^{k} \frac{\pi_{s_i}}{\log_2 i}}{\hat{\pi}_1 + \sum_{i=2}^{k} \frac{\hat{\pi}_i}{\log_2 i}}$
      Diversity measures
      •  ℓ-step graph density: $\mathrm{dens}_\ell(S) = \dfrac{\sum_{u,v \in S,\, u \neq v} d_\ell(u, v)}{|S| \times (|S| - 1)}$
      •  ℓ-expansion ratio: $\sigma_\ell(S) = \dfrac{|N_\ell(S)|}{n}$, where $N_\ell(S) = S \cup \{v \in (V \setminus S) : \exists u \in S,\ d(u, v) \le \ell\}$
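The measures on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes an undirected graph stored as an adjacency dict, a dict `pi` of relevance scores, that the hatted scores are the top-k scores in non-increasing order, and that d_ℓ(u, v) is 1 when v is within ℓ hops of u and 0 otherwise. The slide does not define d_ℓ, so that last reading is an assumption. All function names and the toy path graph are hypothetical.

```python
import math
from collections import deque

def n_l(adj, S, l):
    """N_l(S): S plus every vertex within l hops of some member of S."""
    seen = set(S)
    frontier = deque((s, 0) for s in S)
    while frontier:
        u, depth = frontier.popleft()
        if depth == l:
            continue
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                frontier.append((w, depth + 1))
    return seen

def rel(S, pi, k):
    """Normalized relevance: score mass of S over the mass of the top-k scores."""
    top = sorted(pi.values(), reverse=True)[:k]
    return sum(pi[v] for v in S) / sum(top)

def diff(S, S_hat):
    """Difference ratio between a result set S and the top-k set S_hat."""
    return 1 - len(set(S) & set(S_hat)) / len(S)

def ndcg(S, pi):
    """nDCG of the ordered list S against the ideal (top-|S|) ordering."""
    def dcg(scores):
        return scores[0] + sum(s / math.log2(i)
                               for i, s in enumerate(scores[1:], start=2))
    ideal = sorted(pi.values(), reverse=True)[:len(S)]
    return dcg([pi[v] for v in S]) / dcg(ideal)

def dens(adj, S, l):
    """l-step graph density (needs |S| >= 2); d_l(u, v) assumed to be 0/1 reachability."""
    S = list(S)
    close_pairs = sum(1 for u in S for v in S
                      if u != v and v in n_l(adj, [u], l))
    return close_pairs / (len(S) * (len(S) - 1))

def expansion_ratio(adj, S, l):
    """l-expansion ratio: fraction of the graph within l hops of S."""
    return len(n_l(adj, S, l)) / len(adj)

# Toy path graph 0-1-2-3-4 with made-up relevance scores.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
pi = {0: 0.4, 1: 0.3, 2: 0.15, 3: 0.1, 4: 0.05}
```

On the toy graph, the set [0, 4] trades relevance for coverage: it has lower `rel` than the top-2 set [0, 1] but zero 1-step density, whereas [0, 1] is maximally dense.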
  • 9. Bicriteria optimization measures
      •  aggregate a relevance and a diversity measure
      •  [Carbonell98]: $f_{MMR}(S) = (1 - \lambda) \sum_{v \in S} \pi_v - \lambda \sum_{u \in S} \max_{v \in S,\, u \neq v} \mathrm{sim}(u, v)$
      •  [Li11]: $f_{L}(S) = \sum_{v \in S} \pi_v + \lambda \dfrac{|N(S)|}{n}$
      •  [Vieira11]: $f_{MSD}(S) = (k - 1)(1 - \lambda) \sum_{v \in S} \pi_v + 2\lambda \sum_{u \in S} \sum_{v \in S,\, u \neq v} \mathrm{div}(u, v)$
      •  max-sum diversification, max-min diversification, k-similar diversification set, etc. [Gollapudi09]
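To make the aggregation concrete, here is a small Python sketch of the three objectives listed on this slide. It is illustrative only: the trade-off weight `lam`, its placement in each formula, and the pluggable `sim` and `div` callables are assumptions, because the transcript lost several symbols from the original equations. The function names are hypothetical.

```python
def f_mmr(S, pi, sim, lam):
    """MMR-style objective: relevance minus each item's worst-case similarity."""
    relevance = sum(pi[v] for v in S)
    redundancy = sum(max((sim(u, v) for v in S if v != u), default=0.0)
                     for u in S)
    return (1 - lam) * relevance - lam * redundancy

def f_l(S, pi, neighborhood_size, n, lam):
    """Relevance plus a weighted fraction of the graph covered by N(S)."""
    return sum(pi[v] for v in S) + lam * neighborhood_size / n

def f_msd(S, pi, div, lam):
    """Max-sum diversification: scaled relevance plus pairwise diversity."""
    k = len(S)
    relevance = sum(pi[v] for v in S)
    diversity = sum(div(u, v) for u in S for v in S if u != v)
    return (k - 1) * (1 - lam) * relevance + 2 * lam * diversity

# Toy inputs: two results with constant pairwise similarity and diversity.
pi = {"a": 0.5, "b": 0.3}
S = ["a", "b"]
mmr_score = f_mmr(S, pi, lambda u, v: 0.2, lam=0.5)
msd_score = f_msd(S, pi, lambda u, v: 0.8, lam=0.5)
```

Each function adds a term that depends only on relevance to a term that depends only on diversity.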
  • 10. Bicriteria optimization is not the answer
      •  Objective: diversify top-10 results
      •  Two query-oblivious algorithms:
          –  top-% + random
          –  top-% + greedy-σ2
  • 11. Bicriteria optimization is not the answer
      •  normalized relevance and 2-step graph density
      •  evaluating result diversification as a bicriteria optimization problem with
          –  a relevance measure that ignores diversity, and
          –  a diversity measure that ignores relevancy.
      [Figures: scatter plots of rel (0 to 1) against dens2 (0 to 1) and of rel against σ2 (0 to 0.3), each with an arrow marking the "better" direction. Highlighted series: top-90%+random, top-75%+random, top-50%+random, top-25%+random, All random; top-90%+greedy-σ2, top-75%+greedy-σ2, top-50%+greedy-σ2, top-25%+greedy-σ2, All greedy-σ2; all remaining points are labeled "other algorithms" (10-RLM, BC1, BC1V, BC2, BC2V, BC1100, BC11000, BC150, BC2100, BC21000, BC250, CDivRank, DRAGON, GRASSHOPPER, GSPARSE, IL1, IL2, LM, PDivRank, PR, k-RLM).]
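The slides name the query-oblivious baselines without spelling them out. One plausible reading of "top-% + random", sketched below in Python, keeps the given percentage of the top-k results and fills the remaining slots with vertices drawn uniformly at random from the rest of the ranking. This interpretation, the function name, and its parameters are assumptions for illustration, not the authors' code.

```python
import random

def top_percent_plus_random(ranked, k, percent, rng):
    """Keep the first percent% of the top-k, fill the rest at random.

    ranked  -- all vertices, ordered by non-increasing relevance
    k       -- size of the result set
    percent -- share of the result set taken from the head of the ranking
    rng     -- a random.Random instance, passed in for reproducibility
    """
    keep = int(round(k * percent / 100))
    head = ranked[:keep]
    pool = ranked[keep:]  # everything not already taken from the head
    return head + rng.sample(pool, k - keep)

# Toy ranking of 100 vertices, where vertex 0 is the most relevant.
ranked = list(range(100))
result = top_percent_plus_random(ranked, k=10, percent=70, rng=random.Random(0))
```

Under this reading, the random part of the result never looks at the query.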
  • 12. A better measure? Combine both
      •  We need a combined measure that tightly integrates both relevance and diversity aspects of the result set
      •  goodness [Tong11]: $f_G(S) = 2 \sum_{i \in S} \pi_i - d \sum_{i,j \in S} A(j, i)\,\pi_j - (1 - d) \sum_{j \in S} \pi_j \sum_{i \in S} p^{*}(i)$
          –  the first term is the max-sum relevance
          –  the remaining terms penalize the score when two results share an edge
          –  downside: highly dominated by relevance
  • 13. Proposed measure: ℓ-step expanded relevance
      •  a combined measure of
          –  ℓ-step expansion ratio (σ2)
          –  relevance scores (π)
      •  quantifies the relevance of the covered region of the graph
      •  sanity-check the new measure against the query-oblivious algorithms
      ℓ-step expanded relevance: $\mathrm{exprel}_\ell(S) = \sum_{v \in N_\ell(S)} \pi_v$, where $N_\ell(S)$ is the ℓ-step expansion set of the result set S, and π is the PPR scores of the items in the graph.
      [Figures: exprel2 (0 to 1) against k (5, 10, 20, 50, 100). Highlighted series: top-90%+random, top-75%+random, top-50%+random, top-25%+random, All random; top-90%+greedy-σ2, top-75%+greedy-σ2, top-50%+greedy-σ2, top-25%+greedy-σ2, All greedy-σ2; other algorithms.]
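The definition of expanded relevance translates directly into code. The sketch below is a minimal Python illustration that assumes an adjacency-dict graph and a dict `pi` of PPR-like scores. The helper names and the toy path graph are hypothetical.

```python
from collections import deque

def expansion_set(adj, S, l):
    """N_l(S): S plus every vertex within l hops of some member of S."""
    seen = set(S)
    frontier = deque((s, 0) for s in S)
    while frontier:
        u, depth = frontier.popleft()
        if depth == l:
            continue
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                frontier.append((w, depth + 1))
    return seen

def exprel(adj, pi, S, l):
    """l-step expanded relevance: total score of the region covered by S."""
    return sum(pi[v] for v in expansion_set(adj, S, l))

# Toy path graph 0-1-2-3-4; relevance is concentrated near vertex 0.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
pi = {0: 0.4, 1: 0.3, 2: 0.15, 3: 0.1, 4: 0.05}

near_query = exprel(adj, pi, [0], 1)   # covers {0, 1}
far_away = exprel(adj, pi, [4], 1)     # covers {3, 4}
```

Both single-vertex sets cover two vertices, so their expansion ratios are identical, yet the set near the relevant region scores far higher. This is the property the sanity check on this slide relies on.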
  • 14. Correlations of the measures
      [Figure: correlation matrix of the measures, grouped into relevance and diversity measures.]
      •  goodness is dominated by the relevancy measures
      •  exprel has no high correlations with other relevance or diversity measures
  • 15. Proposed algorithm: Best Coverage
      •  Can we use ℓ-step expanded relevance as an objective function?
      •  Define the exprel_ℓ-diversified top-k ranking (DTR_ℓ): $S = \operatorname{argmax}_{S' \subseteq V,\ |S'| = k} \mathrm{exprel}_\ell(S')$
      •  Complexity: generalization of the weighted maximum coverage problem
          –  NP-hard!
          –  but exprel_ℓ is a submodular function (Lemma 4.2)
          –  a greedy solution (Algorithm 1) that selects the item with the highest marginal utility $g(v, S) = \sum_{v' \in N_\ell(\{v\}) \setminus N_\ell(S)} \pi_{v'}$ at each step is the best possible polynomial-time approximation (proof based on [Nemhauser78])
      •  Relaxation: computes BestCoverage on the highest-ranked vertices to improve runtime
      ALGORITHM 1: BestCoverage
          Input: k, G, π, ℓ
          Output: a list of recommendations S
          S = ∅
          while |S| < k do
              v* ← argmax_v g(v, S)
              S ← S ∪ {v*}
          return S
      ALGORITHM 2: BestCoverage (relaxed)
          Input: k, G, π, ℓ
          Output: a list of recommendations S
          S = ∅
          Sort(V) w.r.t. π_i non-increasing
          S1 ← V[1..k'], i.e., top-k' vertices where k' = k·δ̄^ℓ
          ∀v ∈ S1, g(v) ← g(v, ∅)
          ∀v ∈ S1, c(v) ← Uncovered
          while |S| < k do
              v* ← argmax_{v ∈ S1} g(v)
              S ← S ∪ {v*}
              S2 ← N_ℓ({v*})
              for each v' ∈ S2 do
                  if c(v') = Uncovered then
                      S3 ← N_ℓ({v'})
                      ∀u ∈ S3, g(u) ← g(u) − π_{v'}
                      c(v') ← Covered
          return S
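A runnable sketch of the two procedures may help. It is an illustration under stated assumptions, not the authors' implementation: the graph is an undirected adjacency dict, ties are broken by vertex order, the relaxed variant takes the candidate count `k_prime` as an explicit (hypothetical) parameter instead of deriving it from the graph, and already selected vertices are excluded from the argmax explicitly.

```python
from collections import deque

def ball(adj, v, l):
    """N_l({v}): v plus all vertices within l hops of v."""
    seen = {v}
    frontier = deque([(v, 0)])
    while frontier:
        u, depth = frontier.popleft()
        if depth == l:
            continue
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                frontier.append((w, depth + 1))
    return seen

def best_coverage(adj, pi, k, l):
    """Greedy selection: repeatedly add the vertex with the largest marginal gain."""
    balls = {v: ball(adj, v, l) for v in adj}
    S, covered = [], set()
    while len(S) < min(k, len(adj)):
        candidates = [v for v in sorted(adj) if v not in S]
        # g(v, S): score of the vertices v would newly cover.
        v_star = max(candidates,
                     key=lambda v: sum(pi[w] for w in balls[v] - covered))
        S.append(v_star)
        covered |= balls[v_star]
    return S

def best_coverage_relaxed(adj, pi, k, l, k_prime):
    """Same greedy idea, restricted to the k_prime highest-scored vertices,
    with gains updated incrementally as vertices become covered."""
    S1 = sorted(adj, key=lambda v: -pi[v])[:k_prime]
    gain = {v: sum(pi[w] for w in ball(adj, v, l)) for v in S1}
    S, covered = [], set()
    while len(S) < min(k, len(S1)):
        v_star = max((v for v in S1 if v not in S), key=gain.get)
        S.append(v_star)
        for w in ball(adj, v_star, l):
            if w not in covered:
                covered.add(w)
                # w no longer counts toward any candidate that could cover it.
                for u in ball(adj, w, l):
                    if u in gain:
                        gain[u] -= pi[w]
    return S

# Toy graph: two small clusters {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
adj = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2, 4, 5], 4: [3], 5: [3]}
pi = {0: 0.3, 1: 0.2, 2: 0.15, 3: 0.15, 4: 0.1, 5: 0.1}

greedy = best_coverage(adj, pi, k=2, l=1)
relaxed = best_coverage_relaxed(adj, pi, k=2, l=1, k_prime=3)
```

On this toy graph the plain top-2 by score would be [0, 1], which lies entirely in one cluster. The greedy procedure instead picks one vertex per cluster. The relaxed variant with three candidates cannot reach the second cluster's hub, which shows the quality it gives up in exchange for a smaller search space.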
  • 16. Experiments
      •  5 target application areas, 5 graphs from SNAP
      •  Queries generated based on 3 scenario types
          –  one random vertex
          –  random vertices from one area of interest
          –  multiple vertices from multiple areas of interest
      Dataset            |V|      |E|      δ̄      D    D90%   CC
      amazon0601         403.3K   3.3M     16.8   21   7.6    0.42
      ca-AstroPh         18.7K    396.1K   42.2   14   5.1    0.63
      cit-Patents        3.7M     16.5M    8.7    22   9.4    0.09
      soc-LiveJournal1   4.8M     68.9M    28.4   18   6.5    0.31
      web-Google         875.7K   5.1M     11.6   22   8.1    0.60
  • 17. Results – relevance
      •  Methods should trade off relevance for better diversity
      •  Normalized relevance of the top-k set is always 1
      •  DRAGON always returns results that share 70% of their items with the top-k, with a rel score above 80%
      [Figures: normalized relevance (rel) and difference ratio (diff) against k (5, 10, 20, 50, 100) on amazon0601 (combined) and soc-LiveJournal1 (combined). Legend: PPR (top-k), GrassHopper, Dragon, PDivRank, CDivRank, k-RLM, GSparse, BC1, BC2, BC1 (relaxed), BC2 (relaxed). Annotated curves: top-k, DRAGON.]
  • 18. Results – coverage
      •  ℓ-step expansion ratio (σ2) gives the graph coverage of the result set: better coverage = better diversity
      •  BestCoverage and DivRank variants, especially BC2 and PDivRank, have the highest coverage
      [Figure 3: Coverage (σ2) of the algorithms with varying k. BestCoverage and DivRank variants have the highest coverage on the graphs while Dragon, GSparse, and k-RLM have similar coverages to top-k results. Panels: amazon0601 (combined), ca-AstroPh (combined), soc-LiveJournal1 (combined). Legend: PPR (top-k), GrassHopper, Dragon, PDivRank, CDivRank, k-RLM, GSparse, BC1, BC2, BC1 (relaxed), BC2 (relaxed).]
  • 19. Results – expanded relevance
      •  combined measure for relevance and diversity
      •  BestCoverage variants and GrassHopper perform better
      •  Although PDivRank gives the highest coverage on the amazon graph, it fails to cover the relevant parts!
      [Figure 4: Expanded relevance (exprel2) with varying k. BC1 and BC2 variants mostly score the best, GrassHopper performs high in soc-LiveJournal1. Although PDivRank gives the highest coverage on amazon0601 (Fig. 3), it fails to cover the relevant parts. Panels: amazon0601 (combined), soc-LiveJournal1 (combined). Annotated curves: BC2, BC1, PDivRank, GrassHopper.]
  • 20. Results – efficiency
      •  BC1 always performs better, with a running time less than DivRank and GrassHopper
      •  BC1 (relaxed) offers reasonable diversity, with very little overhead on top of the PPR computation
      [Figure 7: Running times of the algorithms with varying k. The BC1 method always performs better, with a running time less than GrassHopper and DivRank variants, while the relaxed versions score similarly with a slight overhead on top of the PPR computation. Panels: ca-AstroPh (combined), soc-LiveJournal1 (combined), cit-Patents (scenario 1), web-Google (scenario 1); y-axis: time (sec), log scale. Annotated curves: BC1, DivRank, BC1 (relaxed).]
  • 21. Results – intent aware experiments
      •  evaluation of intent-oblivious algorithms against intent-aware measures
      •  two measures
          –  group coverage [Li11]
          –  S-recall [Zhai03]
      •  the cit-Patents dataset has categorical information
          –  426 class labels, belonging to 36 subtopics
  • 22. Results – intent aware experiments
      •  group coverage [Li11]
          –  How many different groups are covered by the results?
          –  omits the actual intent of the query
      •  top-k results are not diverse enough
      •  AllRandom results cover the largest number of groups
      •  PDivRank and BC2 follow
      [Figure 8 (a)–(c): Intent-aware results on cit-Patents dataset with scenario-3 queries. Panels: (a) Class coverage, (b) Subtopic coverage, (c) Topic coverage, each against k (5, 10, 20, 50, 100). Legend: PPR (top-k), Dragon, PDivRank, CDivRank, k-RLM, BC1, BC2, BC1 (relaxed), BC2 (relaxed), AllRandom. Annotated curves: top-k, random, BC2.]
  • 23. Results – intent aware experiments
      •  S-recall [Zhai03], Intent-coverage [Zhu11]
          –  percentage of relevant subtopics covered by the result set
          –  the intent is given with the classes of the seed nodes
      •  AllRandom brings irrelevant items from the search space
      •  top-k results do not have the necessary diversity
      •  BC2 variants and BC1 perform better than DivRank
      •  BC1 (relaxed) and DivRank score similarly, but BC1 (relaxed) is much faster
      [Figure 8: Intent-aware results on cit-Patents dataset with scenario-3 queries. Panels: (a) Class coverage, (b) Subtopic coverage, (c) Topic coverage, (d) S-recall on classes, (e) S-recall on subtopics, (f) S-recall on topics. Legend: PPR (top-k), Dragon, PDivRank, CDivRank, k-RLM, BC1, BC2, BC1 (relaxed), BC2 (relaxed), AllRandom. Annotated curves: top-k, random, DivRank, BC.]
      Excerpt from the paper shown alongside the figure: the results of all of the diversification algorithms reside between those two extremes, where PDivRank covers the most and Dragon covers the least number of groups. However, the S-recall index measures whether a covered group was actually useful or not. Obviously, AllRandom scores the lowest as it dismisses the actual query (the S-recall on topics may be omitted since there are only 6 groups at this granularity level).
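The two intent-aware measures can be sketched as follows, using only what the slides state: group coverage counts the distinct groups among the results, and S-recall is the fraction of the query's groups (taken here as the labels of the seed nodes) that the results cover. The function names and the toy labels are hypothetical.

```python
def group_coverage(results, label):
    """Number of distinct groups that appear among the results."""
    return len({label[v] for v in results})

def s_recall(results, seeds, label):
    """Fraction of the seed nodes' groups that the results cover."""
    intent = {label[q] for q in seeds}
    covered = {label[v] for v in results} & intent
    return len(covered) / len(intent)

# Toy labels: vertex -> class. The seeds define an intent of {"A", "B"}.
label = {1: "A", 2: "A", 3: "B", 4: "C", 5: "D", 6: "E"}
seeds = [1, 3]

focused = [2, 3]          # stays within the query's classes
scattered = [4, 5, 6]     # many classes, none of them relevant
```

The scattered set wins on group coverage (3 versus 2) but has an S-recall of 0, while the focused set reaches an S-recall of 1. This is the same contrast the slides draw between AllRandom and the diversification algorithms.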
  • 24. Conclusions
      •  Result diversification should not be evaluated as a bicriteria optimization problem with
          –  a relevance measure that ignores diversity, and
          –  a diversity measure that ignores relevancy
      •  ℓ-step expanded relevance is a simple measure that combines both relevance and diversity
      •  BestCoverage, a greedy solution that maximizes exprel_ℓ, is a (1−1/e)-approximation of the optimal solution
      •  BestCoverage variants perform better than the others, and the relaxation is extremely efficient
      •  goodness in DRAGON is dominated by relevancy
      •  DivRank variants implicitly optimize the expansion ratio
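The approximation claim rests on two standard facts, written out below as a short LaTeX sketch. The first line is the diminishing-returns argument for the marginal utility g, assuming non-negative scores. The second is the classical greedy guarantee for monotone submodular maximization under a cardinality constraint. The notation S_greedy and S^* (the greedy and the optimal size-k sets) is introduced here for illustration.

```latex
% Diminishing returns: for S \subseteq T \subseteq V and any vertex v,
g(v, S) = \sum_{v' \in N_\ell(\{v\}) \setminus N_\ell(S)} \pi_{v'}
  \;\ge\; \sum_{v' \in N_\ell(\{v\}) \setminus N_\ell(T)} \pi_{v'} = g(v, T),
\quad \text{since } N_\ell(S) \subseteq N_\ell(T) \text{ and } \pi_{v'} \ge 0.

% Greedy guarantee for a monotone submodular objective with |S| = k:
\mathrm{exprel}_\ell(S_{\mathrm{greedy}})
  \;\ge\; \left(1 - \frac{1}{e}\right) \mathrm{exprel}_\ell(S^{*})
```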
  • 25. Thank you
      •  For more information visit
      •  http://bmi.osu.edu/hpc
      •  Research at the HPC Lab is funded by