- 1. Factorization Machines with libFM • Steffen Rendle, University of Konstanz • ACM TIST, May 2012 • WUME Reading Group • Liangjie Hong
- 2. Outline • Motivations • Model • Experiments
- 3. Motivation Factorization models show superior performance • Collaborative filtering ▫ Movie recommendation ▫ Tag recommendation • Link prediction
- 4. Motivation • (Too) many factorization models ▫ General form: matrix factorization [Srebro and Jaakkola 2003], tensor factorization [Tucker 1966, Harshman 1970] ▫ Specific tasks: SVD++ [Koren 2008], STE [Ma et al. 2011], timeSVD++ [Koren 2009b], BPTF [Xiong et al. 2010]
- 5. Motivation • Each task requires re-design ▫ model ▫ inference algorithm
- 6. Motivation • What we want! ▫ Simple, easy to use, like libSVM, Weka… ▫ Feed in feature vectors ▫ Keep factorizations!
- 7. Proposed method • Factorization Machines ▫ like libSVM… ▫ enjoy the benefits of factorized interactions between variables (2nd- to n-th-order interactions) ▫ can mimic many successful models ▫ three major inference algorithms: SGD, ALS, MCMC
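
To make "feed in feature vectors, like libSVM" concrete, here is a minimal sketch of writing one training example in the sparse libSVM-style text format (`target index:value ...`). The helper name `to_libsvm_line` and the example values are assumptions for illustration, not part of libFM itself. Check the libFM manual for the indexing convention it expects.

```python
def to_libsvm_line(target, features):
    """Format one example as 'target idx:val idx:val ...'.

    `features` is a dict {feature_index: value}; zero values are skipped
    because the format is sparse. (Hypothetical helper for illustration.)
    """
    parts = [f"{target:g}"]
    for idx in sorted(features):
        val = features[idx]
        if val != 0:
            parts.append(f"{idx}:{val:g}")
    return " ".join(parts)


# One example with target 4 and two non-zero features.
line = to_libsvm_line(4, {3: -7.9, 0: 1.5})
print(line)
```

Any model that can be expressed as a real-valued feature vector can be written this way, which is what lets one tool stand in for many task-specific factorization models.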
- 8. Proposed method • Similar approaches ▫ Regression-based latent factor models ▫ SVD-feature model ▫ MF with Gaussian process/Dirichlet mixture process
- 9. Roadmap • Model ▫ properties • Probabilistic Interpretation • Relationships with other Factorization models ▫ matrix factorizations ▫ pairwise interaction tensor factorization ▫ SVD++ and FPMC ▫ BPTF and TimeSVD++ ▫ NN ▫ attribute-aware models • Inference algorithms ▫ SGD, ALS, MCMC
- 10. Model
- 11. Model
- 12. Model • Factorization model with degree = 2
- 13. Model • Factorization model with degree = 2 ▫ global “bias” ▫ regression coefficients: strength of the j-th variable ▫ pairwise interactions: factorized!
- 14. Model • Factorization model with degree = 2
- 15. Model
- 16. Model
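
The equations on the model slides did not survive extraction. As a hedged sketch, the degree-2 model from the FM papers is ŷ(x) = w0 + Σ_j w_j x_j + Σ_{j<j'} ⟨v_j, v_j'⟩ x_j x_j'. The code below evaluates it twice, once by summing over all pairs and once with the well-known linear-time reformulation. The function names and the random test data are assumptions for illustration.

```python
import numpy as np


def fm_predict_naive(x, w0, w, V):
    """Degree-2 FM by direct summation over all pairs j < j' (O(k n^2))."""
    n = len(x)
    y = w0 + float(np.dot(w, x))
    for j in range(n):
        for jp in range(j + 1, n):
            y += float(np.dot(V[j], V[jp])) * x[j] * x[jp]
    return y


def fm_predict_fast(x, w0, w, V):
    """Same model in O(k n), using
    sum_{j<j'} <v_j, v_j'> x_j x_j'
      = 0.5 * sum_f [ (sum_j v_jf x_j)^2 - sum_j v_jf^2 x_j^2 ].
    """
    xv = V.T @ x                    # shape (k,): sum_j v_jf x_j
    x2v2 = (V ** 2).T @ (x ** 2)    # shape (k,): sum_j v_jf^2 x_j^2
    return w0 + float(np.dot(w, x)) + 0.5 * float(np.sum(xv ** 2 - x2v2))


# Random parameters and input, only to check that both forms agree.
rng = np.random.default_rng(0)
n, k = 6, 3
x = rng.normal(size=n)
w0, w, V = 0.5, rng.normal(size=n), rng.normal(size=(n, k))

y_naive = fm_predict_naive(x, w0, w, V)
y_fast = fm_predict_fast(x, w0, w, V)
print(abs(y_naive - y_fast) < 1e-9)
```

With sparse inputs, the sums only need to run over the non-zero entries of `x`, which is why the fast form matters for recommender data.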
- 29. Relationships to other models • Matrix factorization • Pairwise interaction tensor factorization • SVD++ and FPMC • BPTF and TimeSVD++ • NN • Attribute-aware models • SVM • Others
- 31. Relationships to other models • Matrix factorization
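
The matrix factorization case can be checked numerically. This sketch assumes the input is the concatenation of a one-hot user indicator and a one-hot item indicator. Under that encoding a degree-2 FM reduces to biased MF, w0 + w_u + w_i + ⟨v_u, v_i⟩. All names and sizes below are illustrative assumptions.

```python
import numpy as np


def one_hot_user_item(u, i, n_users, n_items):
    """Concatenate a one-hot user block and a one-hot item block."""
    x = np.zeros(n_users + n_items)
    x[u] = 1.0
    x[n_users + i] = 1.0
    return x


def fm_predict(x, w0, w, V):
    """Degree-2 FM prediction (linear-time form)."""
    xv = V.T @ x
    x2v2 = (V ** 2).T @ (x ** 2)
    return w0 + float(w @ x) + 0.5 * float(np.sum(xv ** 2 - x2v2))


def biased_mf_predict(u, i, w0, w, V, n_users):
    """Biased MF written with the same parameters: bias terms plus dot product."""
    return w0 + w[u] + w[n_users + i] + float(V[u] @ V[n_users + i])


rng = np.random.default_rng(1)
n_users, n_items, k = 4, 5, 3
w0 = 3.2
w = rng.normal(size=n_users + n_items)
V = rng.normal(size=(n_users + n_items, k))

u, i = 2, 4
y_fm = fm_predict(one_hot_user_item(u, i, n_users, n_items), w0, w, V)
y_mf = biased_mf_predict(u, i, w0, w, V, n_users)
print(abs(y_fm - y_mf) < 1e-9)
```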
- 32. Relationships to other models • Pairwise Interaction Tensor Factorization ▫ [Rendle and Schmidt-Thieme 2010]
- 33. Relationships to other models • Pairwise Interaction Tensor Factorization
- 34. Relationships to other models • Pairwise Interaction Tensor Factorization ▫ Tucker Decomposition
- 35. Relationships to other models • Pairwise Interaction Tensor Factorization ▫ Canonical Decomposition (CD)
- 36. Relationships to other models • Pairwise Interaction Tensor Factorization ▫ Pairwise Decomposition
- 37. Relationships to other models • Pairwise Interaction Tensor Factorization
- 38. Relationships to other models • Pairwise Interaction Tensor Factorization
- 39. Relationships to other models • Pairwise Interaction Tensor Factorization
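
The formulas on the PITF slides are missing from this transcript. As a hedged reconstruction following the FM papers, assume x is the concatenation of one-hot indicators for user u, item i, and tag t. The degree-2 FM then expands as below. For ranking tags within a fixed (u, i) pair, the terms that do not depend on t cancel.

```latex
\hat{y}(\mathbf{x}) = w_0 + w_u + w_i + w_t
  + \langle \mathbf{v}_u, \mathbf{v}_i \rangle
  + \langle \mathbf{v}_u, \mathbf{v}_t \rangle
  + \langle \mathbf{v}_i, \mathbf{v}_t \rangle

% Ranking tags for a fixed (u, i): drop terms independent of t
\hat{y}(\mathbf{x}) \simeq w_t
  + \langle \mathbf{v}_u, \mathbf{v}_t \rangle
  + \langle \mathbf{v}_i, \mathbf{v}_t \rangle
```

This is close to PITF. The remaining differences are the tag bias w_t and the fact that the FM shares one tag factor v_t across the user-tag and item-tag interactions, whereas PITF keeps two separate tag factor matrices.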
- 40. Relationships to other models • SVD++ ▫ SVD++ [Koren 2008]
- 41. Relationships to other models • SVD++
- 42. Relationships to other models • SVD++
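
To illustrate how an SVD++-style model can be fed to an FM, this sketch builds a feature vector with three blocks: one-hot user, one-hot item, and the set of items the user has rated, scaled by 1/sqrt(|N_u|). The block layout, the helper name, and the scaling choice are assumptions based on the usual description of this encoding. An FM on such input also contains interactions (for example among the implicit items) that SVD++ itself does not have.

```python
import math

import numpy as np


def svdpp_style_features(u, i, rated_items, n_users, n_items):
    """Feature vector [user one-hot | item one-hot | normalized implicit set].

    Hypothetical helper: `rated_items` is the set N_u of items the user rated.
    """
    x = np.zeros(n_users + 2 * n_items)
    x[u] = 1.0                      # active user
    x[n_users + i] = 1.0            # active item
    items = sorted(set(rated_items))
    if items:
        scale = 1.0 / math.sqrt(len(items))
        for l in items:
            x[n_users + n_items + l] = scale   # implicit-feedback block
    return x


x = svdpp_style_features(u=1, i=0, rated_items=[0, 2, 3], n_users=2, n_items=4)
print(x)
```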
- 43. Relationships to other models • Bayesian Probabilistic Tensor Factorization ▫ [Xiong et al. 2010] • TimeSVD++ ▫ [Koren 2009b] • Capture temporal effects
- 44. Relationships to other models
- 45. Relationships to other models • Nearest neighbor Models ▫ Factorized nearest neighbor model [Koren 2010] ▫ Non-factorized nearest neighbor model [Koren 2008b]
- 46. Relationships to other models • Nearest neighbor Models
- 47. Relationships to other models • Nearest neighbor Models
- 48. Relationships to other models • Attribute-aware models
- 49. Relationships to other models • Attribute-aware models ▫ [Agarwal and Chen 2009] ▫ [Gantner et al. 2010] • Cold-start problem
- 50. Relationships to other models • Attribute-aware models
- 51. Relationships to other models • Attribute-aware models
- 52. Relationships to other models • Attribute-aware models
- 53. Relationships to other models • SVM
- 54. Relationships to other models • SVM ▫ Linear kernel
- 55. Relationships to other models • SVM ▫ Linear kernel • identical to 1st order FM
- 56. Relationships to other models • SVM ▫ Polynomial kernel
- 57. Relationships to other models • SVM ▫ Polynomial kernel
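
The kernel formulas are not in this transcript. As a sketch following the FM papers, and assuming the inhomogeneous polynomial kernel K(x, z) = (⟨x, z⟩ + 1)^2, the degree-2 SVM model and the degree-2 FM can be written side by side:

```latex
% SVM with polynomial kernel, d = 2: every pairwise weight is a free parameter
\hat{y}(\mathbf{x}) = w_0 + \sqrt{2} \sum_{j=1}^{n} w_j x_j
  + \sum_{j=1}^{n} w^{(2)}_{j,j} x_j^2
  + \sqrt{2} \sum_{j=1}^{n} \sum_{j'=j+1}^{n} w^{(2)}_{j,j'} x_j x_{j'}

% FM, d = 2: pairwise weights are factorized
\hat{y}(\mathbf{x}) = w_0 + \sum_{j=1}^{n} w_j x_j
  + \sum_{j=1}^{n} \sum_{j'=j+1}^{n}
    \langle \mathbf{v}_j, \mathbf{v}_{j'} \rangle x_j x_{j'}
```

The key difference is that each w^{(2)}_{j,j'} in the SVM is independent and needs training examples where both x_j and x_j' are non-zero, while the FM ties the pairwise weights together through the shared factors v_j, so it can estimate interactions under sparsity.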
- 58. Relationships to other models • SVM vs. FM
- 59. Relationships to other models • SVM vs. FM
- 60. Experiments • Rating prediction ▫ Netflix data ▫ RMSE • Context-aware recommendation ▫ Yahoo! Webscope data ▫ RMSE • Tag recommendation ▫ ECML/PKDD data ▫ F1 measure
- 65. Conclusion • libFM is available. • (potentially) integrate many more models. • A simple way to combine features & latent factors
- 66. Conclusion • libFM is available. • (potentially) integrate many more models. • A simple way to combine features & latent factors • Placed 4th in both tracks (T1 and T2) of KDD Cup 2012
- 67. References • Steffen Rendle. Factorization machines with libFM. ACM Transactions on Intelligent Systems and Technology, 3(3):57:1–57:22, May 2012. • Steffen Rendle. Factorization machines. In Proceedings of the 2010 IEEE International Conference on Data Mining, pages 995–1000, Washington, DC, USA, 2010. IEEE Computer Society. • Steffen Rendle, Zeno Gantner, Christoph Freudenthaler, and Lars Schmidt-Thieme. Fast context-aware recommendations with factorization machines. In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), pages 635–644, New York, NY, USA, 2011. ACM. • Christoph Freudenthaler, Lars Schmidt-Thieme, and Steffen Rendle. Bayesian factorization machines. In Workshop on Sparse Representation and Low-rank Approximation, Neural Information Processing Systems (NIPS), Granada, Spain, 2011.