ENHANCING MATRIX FACTORIZATION
THROUGH INITIALIZATION FOR
IMPLICIT FEEDBACK DATABASES

Balázs Hidasi
Domonkos Tikk

Gravity R&D Ltd.
Budapest University of Technology and Economics

CaRR WORKSHOP, 14TH FEBRUARY 2012, LISBON
OUTLINE
 Matrix factorization
 Initialization concept
 Methods
     Naive
     SimFactor
 Results
 Discussion

2/19
MATRIX FACTORIZATION
 Collaborative Filtering
 One of the most common approaches
 Approximates the rating matrix as the product of low-rank matrices

[Figure: the users × items rating matrix R is approximated as the product of a user feature matrix P and an item feature matrix Q: R ≈ PQ]
MATRIX FACTORIZATION
 Initialize P and Q with small random numbers
 Train P and Q
     Alternating least squares
     Gradient descent
     Etc.
 Transforms the data to a feature space
     Separately for users and items
     Noise reduction
     Compression
     Generalization
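The training loop described above can be sketched in a few lines. This is a minimal gradient-descent variant over the observed entries only; the feature count, learning rate, regularization weight, and initialization scale are illustrative, not the deck's settings:

```python
import numpy as np

def mf_sgd(R, k=2, epochs=500, lr=0.05, reg=0.02, seed=42):
    """Approximate R (users x items) as P @ Q.T.
    P and Q start as small random matrices and are trained by
    stochastic gradient descent over the observed (nonzero) entries."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = 0.1 * rng.standard_normal((n_users, k))
    Q = 0.1 * rng.standard_normal((n_items, k))
    for _ in range(epochs):
        for u, i in zip(*R.nonzero()):
            err = R[u, i] - P[u] @ Q[i]          # prediction error on one entry
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P, Q
```

Random initialization is exactly the step the rest of the deck replaces: everything else stays the same, only the starting P and Q change.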
IMPLICIT FEEDBACK
 No ratings
 User-item interactions (events)
 Much noisier
     Presence of an event → might not be positive feedback
     Absence of an event → does not mean negative feedback
     No negative feedback is available!
 More common problem
 MF for implicit feedback
     Less accurate results due to noise
     Mostly ALS is used
     Scalability problems (the rating matrix is dense)
CONCEPT
 A good MF model
     The feature vectors of similar entities are similar
     If the data is too noisy → similar entities won't be similar by their features
 Start MF from a "good" point
     Feature vector similarities are OK
 Data is more than just events
     Metadata
         Info about items/users
     Contextual data
         In what context did the event occur
     Can we incorporate these to help implicit MF?
NAIVE APPROACH
   Describe items using any data we have (detailed
    later)
       Long, sparse vectors for item description
   Compress these vectors to dense feature vectors
     PCA, MLP, MF, …
     Length of desired vectors = Number of features in MF

 Use these features as starting points
NAIVE APPROACH
 Compression and also noise reduction
 Does not really care about similarities
 But often the feature similarities are not that bad
 If MF is used → half of the result (the descriptor-side factors) is thrown out

[Figure: the items × descriptors "description of items" matrix is factorized into item features and descriptor features; only the item features are kept]
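As a concrete instance of the naive approach, a truncated SVD can play the role of the inner compression step (the deck also names PCA, MLP, and MF as options; this choice is an assumption for illustration). The descriptor-side factors are the half that gets thrown away:

```python
import numpy as np

def naive_init(D, k):
    """Naive initialization: factorize the item description matrix
    D (items x descriptors) and keep only the item-side factors
    as the starting item features for the MF.
    Here the inner factorization is a truncated SVD."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    # keep the k strongest components; Vt (descriptor side) is discarded
    return U[:, :k] * s[:k]              # items x k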
SIMFACTOR ALGORITHM
 Try to preserve similarities better
 Starting from an MF of the item description matrix
 Similarities of items: S = DD'
     Some metrics require a transformation on D

[Figure: the items × descriptors description matrix D is factorized into item features (X) and descriptor features (Y'); the item similarity matrix S is the product of D and D']
SIMFACTOR ALGORITHM
 Substituting D ≈ XY' into S = DD' gives the similarity approximation S ≈ X(Y'Y)X'
 Y'Y → K×K symmetric
     Eigendecomposition
SIMFACTOR ALGORITHM
 Y'Y = UλU'
 λ diagonal → λ = SQRT(λ) * SQRT(λ)
 S ≈ X U SQRT(λ) SQRT(λ) U' X'
 X*U*SQRT(λ) = (SQRT(λ)*U'*X')' = F
 F is an M×K matrix
 S ≈ F * F' → F is used for initialization
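The steps above translate almost directly into code. A minimal sketch, assuming the factorization D ≈ XY' is already available from some inner MF:

```python
import numpy as np

def simfactor(X, Y):
    """SimFactor: given D ~= X @ Y.T (X: M x K item factors,
    Y: K-column descriptor factors), return F (M x K) such that the
    item similarity matrix S = D @ D.T is approximated by F @ F.T.
    F is then used to initialize the item features of the implicit MF."""
    G = Y.T @ Y                     # Y'Y: K x K, symmetric
    lam, U = np.linalg.eigh(G)      # eigendecomposition: G = U diag(lam) U'
    lam = np.clip(lam, 0.0, None)   # guard against tiny negative rounding errors
    return (X @ U) * np.sqrt(lam)   # F = X U sqrt(lam)
```

When D equals XY' exactly, F F' reproduces S exactly; in general the quality of the similarity approximation is inherited from the inner factorization.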
CREATING THE DESCRIPTION MATRIX
 "Any" data about the entity
     Vector-space representation
 For items:
     Metadata vector (title, category, description, etc.)
     Event vector (who bought the item)
     Context-state vector (in which context state it was bought)
     Context-event vector (in which context state who bought it)
 For users:
     All of the above except metadata
 Currently: choose one source for the D matrix
 Context used: seasonality
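For example, the context-state description vector under seasonality can be built as a simple item × context-state count matrix; the event tuple format and the choice of weekday states are illustrative assumptions:

```python
import numpy as np

def context_state_matrix(events, n_items, n_states=7):
    """Build the description matrix D from (user, item, state) events:
    D[i, c] counts how often item i occurred in context state c
    (e.g. day of week when seasonality is the context)."""
    D = np.zeros((n_items, n_states))
    for _user, item, state in events:
        D[item, state] += 1
    return D
```

The other description sources listed above fit the same mold, only the column space changes (metadata terms, users, or user × state pairs).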
EXPERIMENTS: SIMILARITY PRESERVATION
 Real-life dataset: online grocery shopping events

[Chart: SimFactor RMSE improvement over naive in similarity approximation]
     Item context state: 52.36%
     User context state: 10.81%
     Item context-event: 16.86%
     User context-event: 26.22%
     Item event data: 13.39%
     User event data: 48.70%
     Item metadata: 12.38%

 SimFactor approximates similarities better
EXPERIMENTS: INITIALIZATION
 Using different description matrices
 And both naive and SimFactor initialization

 Baseline: random init

 Evaluation metric: recall@50




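A minimal sketch of the evaluation metric: for one user, recall@N is the fraction of the held-out test items that make it into the top N of the ranked list (names and shapes here are illustrative):

```python
import numpy as np

def recall_at_n(scores, test_items, n=50):
    """scores[i] is the predicted score of item i for one user;
    recall@N = |top-N recommended items ∩ held-out test items| / |test items|."""
    top_n = np.argsort(-scores)[:n]               # indices of the N highest scores
    return len(set(top_n) & set(test_items)) / len(test_items)
```

The reported numbers average this quantity over users, comparing each initialization against the random-init baseline.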
EXPERIMENTS: GROCERY DB
 Up to 6% improvement
 Best methods use SimFactor and user context data

[Chart: Top5 methods on the Grocery DB, improvement over random initialization]
     User context state (SimFactor): 5.71%
     User context state (Naive): 4.88%
     User context event (SimFactor): 4.30%
     User event data (SimFactor): 4.12%
     User context event (Naive): 4.04%
EXPERIMENTS: "IMPLICITIZED" MOVIELENS
 Keeping 5-star ratings → implicit events
 Up to 10% improvement
 Best methods use SimFactor and item context data

[Chart: Top5 methods on the MovieLens DB, improvement over random initialization]
     Item context state (SimFactor): 10%
     User context state (SimFactor): 9.17%
     Item context event (SimFactor): 9.17%
     Item context event (Naive): 9.17%
     Item context state (Naive): 9.17%
DISCUSSION OF RESULTS
 SimFactor yields better results than naive
 Context information yields better results than other descriptions
 Context information separates entities well
     Grocery: user context
         People's routines
         Different types of shopping at different times
     MovieLens: item context
         Different types of movies watched at different hours
 Context-based similarity
WHY CONTEXT?
 Grocery example
     Correlation between context states computed from user data is low

                 ITEM                                        USER
      Mon  Tue  Wed  Thu  Fri  Sat  Sun          Mon  Tue  Wed  Thu  Fri  Sat  Sun
Mon  1.00 0.79 0.79 0.78 0.76 0.70 0.74    Mon  1.00 0.36 0.34 0.34 0.35 0.29 0.19
Tue  0.79 1.00 0.79 0.78 0.76 0.69 0.73    Tue  0.36 1.00 0.34 0.34 0.33 0.29 0.19
Wed  0.79 0.79 1.00 0.79 0.76 0.70 0.74    Wed  0.34 0.34 1.00 0.36 0.35 0.27 0.17
Thu  0.78 0.78 0.79 1.00 0.76 0.71 0.74    Thu  0.34 0.34 0.36 1.00 0.39 0.30 0.16
Fri  0.76 0.76 0.76 0.76 1.00 0.71 0.72    Fri  0.35 0.33 0.35 0.39 1.00 0.32 0.16
Sat  0.70 0.69 0.70 0.71 0.71 1.00 0.71    Sat  0.29 0.29 0.27 0.30 0.32 1.00 0.33
Sun  0.74 0.73 0.74 0.74 0.72 0.71 1.00    Sun  0.19 0.19 0.17 0.16 0.16 0.33 1.00

 Why can context-aware algorithms be efficient?
     Different recommendations in different context states
     Context differentiates well between entities
         Easier subtasks
CONCLUSION & FUTURE WORK
 SimFactor → similarity-preserving compression
 Similarity-based MF initialization:
     Build the description matrix from any data
     Apply SimFactor
     Use the output as initial features for MF
 Context differentiates between entities well
 Future work:
     Mixed description matrix (multiple data sources)
     Multiple description matrices
     Using different context information
     Using different similarity metrics
THANKS FOR YOUR ATTENTION!




For more of my recommender-systems research, visit my website:
http://www.hidasi.eu
