Utilizing additional information in factorization methods (research overview, April 2014)

This presentation contains the main points of my recommender systems related research. It describes the arc of my research, starting from improving matrix factorization, through the development of my context-aware algorithms and addressing scalability issues, to developing a general factorization framework and dealing with context dimension modeling. The slides were presented at the Delft University of Technology, where I was invited to give this introductory talk as part of the collaboration between participants of the CrowdRec project. The presentation was given on 11 April 2014.



  1. Utilizing additional information in factorization methods - Research overview - 11 April 2014 - Balázs Hidasi - balazs@hidasi.eu
  2. About me • Data mining researcher at Gravity R&D • PhD student at BME (BUTE) • Research interests: • Machine learning & data mining • Algorithm research and development • Currently: recommender systems • Previously: time series classification
  3. Gravity R&D • Recommender service provider, based in Hungary • Founded by team Gravity after the Netflix Prize • Started working there: January 2010 • Data analysis • Algorithm development & implementation • Research
  4. Budapest University of Technology and Economics • Leading tech university in Hungary • Faculty of Electrical Engineering and Informatics • Computer science and engineering B.Sc./M.Sc. • Ph.D. student since September 2011 • Department of Telecommunications and Media Informatics • Data Science and Content Technologies Laboratory (DC Lab)
  5. RecSys research – aims & roots • Aims: developing novel algorithms that enable the usage of additional information with factorization to improve recommendation accuracy for implicit feedback based recommendation tasks • Roots: • Implicit feedback • Context • Factorization • In addition: • ALS learning • Recall based evaluation
  6. Implicit feedback • Transactions provide no explicit user preference • View, buy, etc. • Presence of an event → noisy positive feedback • Absence of an event → ? • Negative feedback is not available
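The implicit feedback setting above is commonly turned into training data by treating the presence of events as a (noisy) positive preference with a confidence weight that grows with the number of events, and absence as an assumed weak negative. A minimal sketch, assuming made-up event data and a Hu/Koren/Volinsky-style `alpha` confidence scaling (both are illustrative, not from the slides):

```python
import numpy as np

# (user, item) interaction events, e.g. views or purchases (made-up data)
events = [
    (0, 1), (0, 1), (0, 2),
    (1, 0), (1, 2), (1, 2), (1, 2),
]

n_users, n_items = 2, 3
counts = np.zeros((n_users, n_items))
for u, i in events:
    counts[u, i] += 1.0

# Presence of any event -> preference 1, absence -> 0 (assumed negative)
preference = (counts > 0).astype(float)

# More events -> higher confidence that the positive feedback is real;
# alpha = 40 is an illustrative choice, not a value from the slides.
alpha = 40.0
confidence = 1.0 + alpha * counts
```

Note that absent events still get confidence 1, so they act as weak negatives rather than being ignored; this is one common way to cope with the missing negative feedback mentioned above.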
  7. Context
  8. Factorization
  9. ALS based learning
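One ALS half-step for the implicit feedback case can be sketched as follows: with the item factors fixed, each user vector is the solution of a regularized, confidence-weighted least-squares problem. This follows the general iALS recipe (Hu, Koren & Volinsky); the dimensions, regularization and random data are illustrative assumptions, not the exact setup from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k, lam = 4, 6, 3, 0.1

P = rng.normal(scale=0.1, size=(n_users, k))            # user factors
Q = rng.normal(scale=0.1, size=(n_items, k))            # item factors
pref = rng.integers(0, 2, size=(n_users, n_items)).astype(float)
conf = 1.0 + 40.0 * pref                                # confidence weights

def update_users(P, Q, pref, conf, lam):
    """One ALS half-step: exact LS solve per user with items fixed."""
    k = Q.shape[1]
    for u in range(P.shape[0]):
        Cu = np.diag(conf[u])
        A = Q.T @ Cu @ Q + lam * np.eye(k)   # (Q^T C_u Q + lambda*I)
        b = Q.T @ Cu @ pref[u]               # Q^T C_u p(u)
        P[u] = np.linalg.solve(A, b)
    return P

P = update_users(P, Q, pref, conf, lam)
```

The symmetric item update alternates with this one until convergence; the exact per-row solve is what the later CG/CD slides replace with cheaper approximations.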
  10. Recall based evaluation • Recall: the number of items that are both relevant and recommended, divided by the number of relevant items • @N: only the top-N items are considered • Nowadays less common in RecSys • MAP, NDCG • Practical point of view • Rank does not matter as long as the item is shown • Top-N list presented in chunks • Top-N list should contain the relevant items • Holds for many practical scenarios, though there are exceptions
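The recall@N metric described above can be computed in a few lines; rank within the list is deliberately ignored, matching the "rank does not matter as long as the item is shown" point. The data here is made up for illustration:

```python
def recall_at_n(recommended, relevant, n):
    """Fraction of relevant (held-out) items present in the top-N
    recommendations. `recommended` is a ranked item list, `relevant`
    a collection of held-out test items."""
    if not relevant:
        return 0.0
    top_n = set(recommended[:n])
    return len(top_n & set(relevant)) / len(relevant)

# Example: 2 of the 3 relevant items appear in the top-5
recs = [5, 3, 9, 1, 7, 2]
relevant = {3, 7, 8}
score = recall_at_n(recs, relevant, 5)
```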
  11. RecSys research – overview • Injecting additional info into MF (through initialization) • Context-aware methods: iTALS, iTALSx • Scalability improvement: CD/CG learning • General factorization framework • Modeling context • Pairwise ranking loss with ALS
  12. Context-aware methods
  13. Approximate ALS learning (CG/CD)
  14. CD learning
  15. CG learning
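The idea behind the CG variant is to replace the exact per-row least-squares solve in ALS with a few conjugate-gradient iterations on the same normal equations, which is cheaper when the number of features K is large. Below is a generic CG sketch on a tiny symmetric positive-definite system, not the exact implementation from the papers:

```python
import numpy as np

def cg_solve(A, b, x0, n_iter=2):
    """Run n_iter conjugate-gradient steps on A x = b (A must be SPD).
    With few iterations this approximates the exact LS solution."""
    x = x0.copy()
    r = b - A @ x            # residual
    p = r.copy()             # search direction
    for _ in range(n_iter):
        rr = r @ r
        if rr < 1e-12:       # already converged
            break
        Ap = A @ p
        alpha = rr / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        beta = (r @ r) / rr
        p = r + beta * p
    return x

# Tiny SPD system for illustration; CG is exact after 2 steps in 2-D
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = cg_solve(A, b, np.zeros(2), n_iter=2)
```

In the ALS setting, `x0` would typically be the factor vector from the previous epoch, so even N_I = 2 inner iterations (as in the comparison on the next slide) can track the exact solution closely.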
  16. LS/CD/CG comparison • Little to no degradation in recall • Training time: CG < CD < LS • CD is unstable with models using higher-order members [Chart: running time (s) vs. number of features (K = 5 to 100) for iTALS, iTALS-CG (N_I = 2) and iTALS-CD (N_I = 2)]
  17. General factorization framework • Goal: fully flexible framework that allows experimentation with arbitrary linear factorization models • State-of-the-art methods use fixed models/model classes • Designed for implicit feedback but supports explicit feedback as well • wRMSE + ALS based learning • Approximate LS with CG for better scaling • No restrictions on the number and meaning of the used dimensions • Even items and/or users can be omitted • Duplication of dimensions is allowed
  18. General factorization framework
  19. General factorization framework
  20. User-item-context relations • Basically 3 types: • UCI: the user-item relation is reweighted by the feature vector of the current context • IC: context-dependent item bias • UC: context-dependent user bias • Does not play a role in ranking • Different context dimensions suit different roles
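The three relation types above can be illustrated with K-dimensional factor vectors for the user, item and context state. The exact model forms in iTALS/iTALSx differ; this sketch only shows the roles each term plays, with random vectors standing in for learned factors:

```python
import numpy as np

rng = np.random.default_rng(1)
k = 4
p_u = rng.normal(size=k)   # user factor vector
q_i = rng.normal(size=k)   # item factor vector
c_t = rng.normal(size=k)   # feature vector of the current context state

# UCI: user-item relation reweighted elementwise by the context vector
uci = np.sum(p_u * q_i * c_t)

# IC: context-dependent item bias
ic = np.dot(q_i, c_t)

# UC: context-dependent user bias; constant over items for a fixed user
# and context, so it does not affect the ranking of items
uc = np.dot(p_u, c_t)

score = uci + ic + uc
```

Since `uc` is identical for every item scored for a given (user, context) pair, only the UCI and IC terms can change the recommended top-N list, which is why the UC role "does not play a role in ranking".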
  21. Context modeling – utility of standard context dimensions • Quality of a context dimension • Huge impact on accuracy • Can we measure it? • Which context for which role? • CA item bias / CA user bias / reweighting user-item relations • Can it be predetermined? • Usefulness of a context dimension • Given a number of already defined dimensions • Can it be measured without training?
  22. Context modeling – non-standard context dimensions • Composite context • E.g. transactions of the current session • The general factorization framework handles it • Continuous context (& ordered context) • E.g. time or distance based context • Problems: • Context-state rigidness • Context-state ordinality • Context-state continuity • A solution: to be presented Sunday at CaRR 2014
  23. Summary • Context-aware factorization methods, mainly for the implicit feedback based problem • From improved MF, • through context-aware tensor methods, • to a fully flexible general framework • On the way: • Improving scalability • Future: • Context modeling • Automatic model learning • Option for pairwise ranking loss
  24. Thank you for your attention! Papers & slides available through my website: http://hidasi.eu [Diagram: research roadmap from MF initialization, through iTALS / iTALSx and scalability (CG/CD), to the general framework and future work (model learning, pairwise ranking loss, context utility estimation, continuous context modeling); foundations: implicit feedback, context, factorization, (ALS)]
