Upcoming SlideShare
×

# Enhancing Matrix Factorization Through Initialization for Implicit Feedback Databases

1,679 views

Published on

Published in: Technology, Education
1 Like
Statistics
Notes
• Full Name
Comment goes here.

Are you sure you want to Yes No
• Be the first to comment

Views
Total views
1,679
On SlideShare
0
From Embeds
0
Number of Embeds
3
Actions
Shares
0
25
0
Likes
1
Embeds 0
No embeds

No notes for slide

### Enhancing Matrix Factorization Through Initialization for Implicit Feedback Databases

1. 1. ENHANCING MATRIX FACTORIZATIONTHROUGH INITIALIZATION FORIMPLICIT FEEDBACK DATABASES Balázs Hidasi Domonkos Tikk Gravity R&D Ltd. Budapest University of Technology and Economics
2. 2. OUTLINE Matrix factoriztion Initialization concept Methods  Naive  SimFactor Results Discussion 2/19
3. 3. MATRIX FACTORIZATION Collaborative Filtering One of the most common approaches Approximates the rating matrix as product of low- rank matrices Items Q Users R ≈ P 3/19
4. 4. MATRIX FACTORIZATION Initialize P and Q with small random numbers Teach P and Q  Alternating Least Squares  Gradient Descent  Etc. Transforms the data to a feature space  Separately for users and items  Noise reduction  Compression  Generalization 4/19
5. 5. IMPLICIT FEEDBACK No ratings User-item interactions (events) Much noisier  Presence of an event  might not be positive feedback  Absence of an event  does not mean negative feedback  No negative feedback is available! More common problem MF for implicit feedback  Less accurate results due to noise  Mostly ALS is used 5/19  Scalability problems (rating matrix is dense)
6. 6. CONCEPT Good MF model  The feature vectors of similar entities are similar  If data is too noisy  similar entities won’t be similar by their features Start MF from a „good” point  Feature vector similarities are OK Data is more than just events  Metadata  Info about items/users  Contextual data  In what context did the event occured  Can we incorporate those to help implicit MF? 6/19
7. 7. NAIVE APPROACH Describe items using any data we have (detailed later)  Long, sparse vectors for item description Compress these vectors to dense feature vectors  PCA, MLP, MF, …  Length of desired vectors = Number of features in MF Use these features as starting points 7/19
8. 8. NAIVE APPROACH Compression and also noise reduction Does not really care about similarities But often feature similarities are not that bad If MF is used  Half of the results is thrown out Descriptors features Descriptor features ≈ Items Item Description of items 8/19
9. 9. SIMFACTOR ALGORITHM Try to preserve similarities better Starting from an MF of item description Descriptors Descriptors features features Description of items ≈ Items Item (D) Similarities of items: DD’ Description of items  Some metrics require transformation on D (D’) Item Description of items similarities (S) = (D) 9/19
10. 10. SIMFACTOR ALGORITHM Descriptors features features Item Descriptors features Item ≈ Item (X)similarities (Y’) features (S) (X’) (Y) Similarity approximation features Item Item Item ≈ (X) similarities Y’Y features (S) (X’) Y’Y  KxK symmetric 10/19  Eigendecomposition
11. 11. SIMFACTOR ALGORITHMY’Y = U λ U’ λ diagonal  λ = SQRT(λ) * SQRT(λ) features Item Item ≈ Item SQRT SQRT (X)similarities U (λ) (λ) U’ features (S) (X’) X*U*SQRT(λ) = (SQRT(λ)*U’*X’)’=F F is MxK matrix S  F * F’  F used for initialization Itemsimilarities ≈ F’ 11/19 F (S)
12. 12. CREATING THE DESCRIPTION MATRIX „Any” data about the entity  Vector-space reprezentation For Items:  Metadata vector (title, category, description, etc)  Event vector (who bought the item)  Context-state vector (in which context state was it bought)  Context-event (in which context state who bought it) For Users:  All above except metadata Currently: Choose one source for D matrix 12/19 Context used: seasonality
13. 13. EXPERIMENTS: SIMILARITY PRESERVATION Real life dataset: online grocery shopping events SimFactor RMSE improvement over naive in similarity approximation 52,36% 48,70% 26,22% 16,86% 13,39% 12,38% 10,81%Item context state User context state Item context- User context- Item event data User event data Item metadata event event 13/19 SimFactor approximates similarities better
14. 14. EXPERIMENTS: INITIALIZATION Using different description matrices And both naive and SimFactor initialization Baseline: random init Evaluation metric: recall@50 14/19
15. 15. EXPERIMENTS: GROCERY DB Up to 6% improvement Best methods use SimFactor and user context data Top5 methods on Grocery DB 5,71% 4,88% 4,30% 4,12% 4,04% 15/19 User context state User context state User context event User event data User context event (SimFactor) (Naive) (SimFactor) (SimFactor) (Naive)
16. 16. EXPERIMENTS: „IMPLICITIZED” MOVIELENS Keeping 5 star ratings  implicit events Up to 10% improvement Best methods use SimFactor and item context data Top5 methods on MovieLens DB 10% 9,17% 9,17% 9,17% 9,17% 16/19 Item context state User context state Item context event Item context event Item context state (SimFactor) (SimFactor) (SimFactor) (Naive) (Naive)
17. 17. DISCUSSION OF RESULTS SimFactor yields better results than naive Context information yields better results than other descriptions Context information separates well between entities  Grocery: User context  People’s routines  Different types of shoppings in different times  MovieLens: Item context  Different types of movies watched on different hours Context-based similarity 17/19
18. 18. WHY CONTEXT? Grocery example  Correlation between context states by users low ITEM USER Mon Tue Wed Thu Fri Sat Sun Mon Tue Wed Thu Fri Sat SunMon 1,00 0,79 0,79 0,78 0,76 0,70 0,74 Mon 1,00 0,36 0,34 0,34 0,35 0,29 0,19Tue 0,79 1,00 0,79 0,78 0,76 0,69 0,73 Tue 0,36 1,00 0,34 0,34 0,33 0,29 0,19Wed 0,79 0,79 1,00 0,79 0,76 0,70 0,74 Wed 0,34 0,34 1,00 0,36 0,35 0,27 0,17Thu 0,78 0,78 0,79 1,00 0,76 0,71 0,74 Thu 0,34 0,34 0,36 1,00 0,39 0,30 0,16Fri 0,76 0,76 0,76 0,76 1,00 0,71 0,72 Fri 0,35 0,33 0,35 0,39 1,00 0,32 0,16Sat 0,70 0,69 0,70 0,71 0,71 1,00 0,71 Sat 0,29 0,29 0,27 0,30 0,32 1,00 0,33Sun 0,74 0,73 0,74 0,74 0,72 0,71 1,00 Sun 0,19 0,19 0,17 0,16 0,16 0,33 1,00 Why can context aware algorithms be efficient?  Different recommendations in different context states 18/19  Context differentiates well between entities  Easier subtasks
19. 19. CONCLUSION & FUTURE WORK SimFactor  Similarity preserving compression Similarity based MF initialization:  Description matrix from any data  Apply SimFactor  Use output as initial features for MF Context differentitates between entities well Future work:  Mixed description matrix (multiple data sources)  Multiple description matrix  Using different context information  Using different similarity metrics 19/19
20. 20. THANKS FOR YOUR ATTENTION!Questions?