A Random-Walk Based Scoring Algorithm with Application to Recommender Systems for Large-Scale E-Commerce Marco Gori and  A...
Recommending Problem <ul><li>Too many resources (products, movies, books, papers, web pages...) </li></ul><ul><li>Resource...
Problem Ingredients <ul><li>A set of users: U = {  u 1   ,  u 2  , ... ,  u |U|  } </li></ul><ul><li>A set of items: P = {...
Goal <ul><li>User  u  preferences </li></ul><ul><li>Other users’ preferences </li></ul><ul><li>Item correlation matrix  C ...
The Idea <ul><li>Correlation Graph obtained by Correlation Matrix </li></ul><ul><li>Spread user preferences over the Corre...
ItemRank Algorithm <ul><li>Iterative equation: </li></ul><ul><li>IR ItemRank values vector for user  u </li></ul><ul><li>I...
ItemRank for Movies <ul><li>Movie Graph </li></ul><ul><li>User  u  rates some movies (non-zero entries in  d ) </li></ul><...
Movie DataSet <ul><li>943 users </li></ul><ul><li>1,682 movies </li></ul><ul><li>100,000 ratings </li></ul><ul><li>Ratings...
Movie DataSet WebKDD 2006 Workshop on Knowledge Discovery on the Web, Aug. 20, 2006, at KDD 2006, Philadelphia, PA, USA
Experimental Results <ul><li>Macro-DOA and difference with MaxF (in %): </li></ul><ul><li>ItemRank with binary graph: </li...
ItemRank for Papers <ul><li>Paper Graph </li></ul><ul><li>User  u  selects some interesting papers (non-zero entries in  d...
Paper DataSet WebKDD 2006 Workshop on Knowledge Discovery on the Web, Aug. 20, 2006, at KDD 2006, Philadelphia, PA, USA
Experimental Results WebKDD 2006 Workshop on Knowledge Discovery on the Web, Aug. 20, 2006, at KDD 2006, Philadelphia, PA,...
Performance Issues <ul><li>BAD NEWS </li></ul><ul><li>ItemRank computation is user dependent </li></ul><ul><li>Correlation...
Future Works <ul><li>Scalability improvements: </li></ul><ul><ul><ul><li>Users soft-clustering </li></ul></ul></ul><ul><ul...
Upcoming SlideShare
Loading in …5
×

Webkdd2006

436 views

Published on

"A Random-Walk Based Scoring Algorithm Applied to Recommender Engines"
Presentation at WebKDD2006 conference.

Published in: Technology
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
436
On SlideShare
0
From Embeds
0
Number of Embeds
3
Actions
Shares
0
Downloads
6
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

Webkdd2006

  1. 1. A Random-Walk Based Scoring Algorithm with Application to Recommender Systems for Large-Scale E-Commerce Marco Gori and Augusto Pucci Dipartimento di Ingegneria dell’Informazione University of Siena Via Roma 56, 53100 Siena (ITALY) WebKDD 2006 Workshop on Knowledge Discovery on the Web, Aug. 20, 2006, at KDD 2006, Philadelphia, PA, USA
  2. 2. Recommending Problem <ul><li>Too many resources (products, movies, books, papers, web pages...) </li></ul><ul><li>Resource filtering is time consuming </li></ul><ul><li>Help users during the filtering process </li></ul><ul><li>Personalized ranking of resources </li></ul><ul><li>Suggest useful resources to a given user </li></ul>WebKDD 2006 Workshop on Knowledge Discovery on the Web, Aug. 20, 2006, at KDD 2006, Philadelphia, PA, USA
  3. 3. Problem Ingredients <ul><li>A set of users: U = { u 1 , u 2 , ... , u |U| } </li></ul><ul><li>A set of items: P = { p 1 , p 2 , ... , p |P| } </li></ul><ul><li>A set of opinions: t i,k = ( u i , p k , r i,k ) </li></ul><ul><li>Correlation between items: </li></ul><ul><li>C i,k = corr ( p i , p k ) </li></ul><ul><li>NOTE: the correlation matrix has to be normalized </li></ul>WebKDD 2006 Workshop on Knowledge Discovery on the Web, Aug. 20, 2006, at KDD 2006, Philadelphia, PA, USA
  4. 4. Goal <ul><li>User u preferences </li></ul><ul><li>Other users’ preferences </li></ul><ul><li>Item correlation matrix C </li></ul>Properly rank items in set P with respect to: WebKDD 2006 Workshop on Knowledge Discovery on the Web, Aug. 20, 2006, at KDD 2006, Philadelphia, PA, USA
  5. 5. The Idea <ul><li>Correlation Graph obtained by Correlation Matrix </li></ul><ul><li>Spread user preferences over the Correlation Graph </li></ul>WebKDD 2006 Workshop on Knowledge Discovery on the Web, Aug. 20, 2006, at KDD 2006, Philadelphia, PA, USA
  6. 6. ItemRank Algorithm <ul><li>Iterative equation: </li></ul><ul><li>IR ItemRank values vector for user u </li></ul><ul><li>IR i ItemRank value for item p i </li></ul><ul><li>d preference vector for user u </li></ul><ul><li>About 20 iterations to converge </li></ul>WebKDD 2006 Workshop on Knowledge Discovery on the Web, Aug. 20, 2006, at KDD 2006, Philadelphia, PA, USA
  7. 7. ItemRank for Movies <ul><li>Movie Graph </li></ul><ul><li>User u rates some movies (non-zero entries in d ) </li></ul><ul><li>Correlation Graph built by co-occurrence of movies in user preference lists </li></ul><ul><li>ItemRank suggests good movies to be watched </li></ul>WebKDD 2006 Workshop on Knowledge Discovery on the Web, Aug. 20, 2006, at KDD 2006, Philadelphia, PA, USA
  8. 8. Movie DataSet <ul><li>943 users </li></ul><ul><li>1,682 movies </li></ul><ul><li>100,000 ratings </li></ul><ul><li>Ratings from 1 to 5 </li></ul><ul><li>5 standard training/test splittings </li></ul>WebKDD 2006 Workshop on Knowledge Discovery on the Web, Aug. 20, 2006, at KDD 2006, Philadelphia, PA, USA
  9. 9. Movie DataSet WebKDD 2006 Workshop on Knowledge Discovery on the Web, Aug. 20, 2006, at KDD 2006, Philadelphia, PA, USA
  10. 10. Experimental Results <ul><li>Macro-DOA and difference with MaxF (in %): </li></ul><ul><li>ItemRank with binary graph: </li></ul>
  11. 11. ItemRank for Papers <ul><li>Paper Graph </li></ul><ul><li>User u selects some interesting papers (non-zero entries in d ) </li></ul><ul><li>Correlation Graph is a weighted citation graph connecting papers </li></ul><ul><li>ItemRank suggests good reference candidates </li></ul>WebKDD 2006 Workshop on Knowledge Discovery on the Web, Aug. 20, 2006, at KDD 2006, Philadelphia, PA, USA
  12. 12. Paper DataSet WebKDD 2006 Workshop on Knowledge Discovery on the Web, Aug. 20, 2006, at KDD 2006, Philadelphia, PA, USA
  13. 13. Experimental Results WebKDD 2006 Workshop on Knowledge Discovery on the Web, Aug. 20, 2006, at KDD 2006, Philadelphia, PA, USA
  14. 14. Performance Issues <ul><li>BAD NEWS </li></ul><ul><li>ItemRank computation is user dependent </li></ul><ul><li>Correlation Graph can be huge </li></ul><ul><li>GOOD NEWS </li></ul><ul><li>User clustering is possible (also soft-clustering) </li></ul><ul><li>C matrix is sparse: efficient way to compute PageRank-like algorithms </li></ul>WebKDD 2006 Workshop on Knowledge Discovery on the Web, Aug. 20, 2006, at KDD 2006, Philadelphia, PA, USA
  15. 15. Future Works <ul><li>Scalability improvements: </li></ul><ul><ul><ul><li>Users soft-clustering </li></ul></ul></ul><ul><ul><ul><li>Efficient computation with sparse matrices </li></ul></ul></ul><ul><li>Correlation matrix learning </li></ul><ul><li>Comparison with graph regularization frameworks </li></ul><ul><li>Applications to new datasets </li></ul>WebKDD 2006 Workshop on Knowledge Discovery on the Web, Aug. 20, 2006, at KDD 2006, Philadelphia, PA, USA

×