Hybrid Solution of the Cold-Start Problem in Context-Aware Recommender Systems

UMAP Doctoral Consortium - July 2014, Aalborg, Denmark
Matthias Braunhofer

Free University of Bozen - Bolzano

Piazza Domenicani 3, 39100 Bolzano, Italy

mbraunhofer@unibz.it

Outline
• Context-Aware Recommenders and the Cold-Start Problem
• Related Work
• Basic Context-Aware Rating Prediction Models
• Hybrid Context-Aware Rating Prediction Models
• Conclusions and Open Issues

Context-Aware Recommender Systems
• Context-Aware Recommender Systems (CARSs) aim to provide better
recommendations by exploiting contextual information (e.g., weather)

• Rating prediction function: R: Users × Items × Context → Ratings

• Three basic approaches:
  • Contextual pre-filtering
  • Contextual post-filtering
  • Contextual modelling

Cold-Start Problem
• CARSs suffer from the cold-start problem
• New user problem: How do you recommend to a new user?

• New item problem: How do you recommend a new item with no ratings?

• New context problem: How do you recommend in a new context?
[Figure: example rating matrices in which "?" marks unknown ratings, illustrating the cold-start cases]

Our Solution: Hybrid CARS
• Ultimate goal: design and development of hybrid CARSs that combine
different CARS algorithms depending on their estimated strengths and
weaknesses to predict a user's rating for an item given a particular cold-start
situation

• Example: a (user, item, context) tuple is scored by CARS 1 and CARS 2, and
the hybrid CARS combines the two scores into the final score

[Figure: (user, item, context) tuple → CARS 1 and CARS 2 → Score, Score → Combination → Final score]

Key Steps
• Identify candidate basic context-aware rating prediction models

• Analyse candidate rating prediction models (what are their strengths and
weaknesses under cold-start situations?)

• Design, develop and evaluate novel hybrid CARSs

• Integrate the best-performing hybrid CARS into our STS (South Tyrol
Suggests) mobile app

• Evaluate it through a live user study


Related Work
Cold-starting CARSs
• … using additional data
  • Active learning (Elahi et al., 2013)
  • Cross-domain recommendation (Enrich et al., 2013)
  • User / item attributes (Woerndl et al., 2009)
• … better processing known data
  • Context similarities (Zheng et al., 2013; Codina et al., 2013)


CAMF-CC (Baltrunas et al., 2011)
• CAMF-CC (Context-Aware Matrix Factorization for item categories) is a
variant of CAMF that extends standard Matrix Factorization (MF) by
incorporating baseline parameters for contextual condition-item category
pairs

$$\hat{r}_{u i c_1, \dots, c_k} = q_i^T p_u + \mu + b_i + b_u + \sum_{t \in T(i)} \sum_{j=1}^{k} b_{t c_j}$$

q_i       latent factor vector of item i
p_u       latent factor vector of user u
μ         overall average rating
b_i       baseline for item i
b_u       baseline for user u
T(i)      set of categories associated to item i
b_{t c_j} baseline for item category-contextual condition pair (t, c_j)
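
To make the prediction rule above concrete, here is a minimal Python sketch that evaluates it for one (user, item, context) tuple. The function name, the dictionary layout of the category-condition baselines, and all numeric values are assumptions made for illustration; they do not come from the slides or from the original CAMF-CC implementation, and no training step is shown.

```python
from typing import Dict, List, Sequence, Tuple


def predict_camf_cc(
    q_i: Sequence[float],
    p_u: Sequence[float],
    mu: float,
    b_i: float,
    b_u: float,
    categories: List[str],
    conditions: List[str],
    b_tc: Dict[Tuple[str, str], float],
) -> float:
    """Predicted rating of user u for item i under conditions c_1..c_k."""
    # Latent factor interaction q_i^T p_u.
    interaction = sum(q * p for q, p in zip(q_i, p_u))
    # Sum the baselines of every (item category, contextual condition) pair.
    # Assumption: pairs without a learnt baseline contribute 0.
    context_bias = sum(
        b_tc.get((t, c), 0.0) for t in categories for c in conditions
    )
    return interaction + mu + b_i + b_u + context_bias


# Toy parameter values, invented for illustration only.
rating = predict_camf_cc(
    q_i=[0.5, -0.2],
    p_u=[1.0, 0.5],
    mu=3.5,
    b_i=0.2,
    b_u=-0.1,
    categories=["museum"],
    conditions=["rainy", "weekend"],
    b_tc={("museum", "rainy"): 0.3, ("museum", "weekend"): 0.1},
)
```

Because the contextual baselines are attached to item categories rather than to individual items, an item inherits the context effect learnt for its categories.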

SPF (Codina et al., 2013)
• SPF (Semantic Pre-Filtering) is a contextual pre-filtering method that, given
a target contextual situation, uses a standard MF model learnt from all the
ratings tagged with contextual situations identical or similar to the target one

• Conjecture: addresses cold-start problems caused by exact pre-filtering
• Key step: similarity calculation

Condition-to-item co-occurrence matrix:
   1    -0.5    2     1
  -2     0.5   -2    -1.5
  -2     0.5   -1    -1

Cosine similarity between conditions:
   1     -0.96  -0.84
  -0.96   1      0.96
  -0.84   0.96   1
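
The similarity step can be checked with a few lines of Python. The sketch below treats each row of the condition-to-item matrix shown on this slide as one condition vector and computes pairwise cosine similarities; the helper name `cosine` and the variable names are assumptions for illustration, and this is not the SPF implementation itself.

```python
import math
from typing import List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equally long vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# One row per contextual condition, one column per item (values from the slide).
condition_vectors = [
    [1, -0.5, 2, 1],
    [-2, 0.5, -2, -1.5],
    [-2, 0.5, -1, -1],
]

# Pairwise similarities, rounded to two decimals as on the slide.
similarity = [
    [round(cosine(a, b), 2) for b in condition_vectors]
    for a in condition_vectors
]
```

Rounded to two decimals, the result reproduces the similarity matrix on the slide: the second and third conditions are close to each other (0.96) and both are nearly opposite to the first.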

Category-based CAMF-CC
• It is a novel variant of CAMF-CC that incorporates additional sources of
information about the items, i.e., category or genre information

• Conjecture: alleviates the new item problem of CAMF-CC

$$\hat{r}_{u i c_1, \dots, c_k} = \Big(q_i + \sum_{t \in T(i)} x_t\Big)^T p_u + \mu + b_i + b_u + \sum_{t \in T(i)} \sum_{j=1}^{k} b_{t c_j}$$

x_t       latent factor vector of item category t






Demographics-based CAMF-CC
• It is a novel variant of CAMF-CC that profiles users through known user
attributes (e.g., age group, gender, personality traits)

• Conjecture: alleviates the new user problem of CAMF-CC

$$\hat{r}_{u i c_1, \dots, c_k} = q_i^T \Big(p_u + \sum_{a \in A(u)} y_a\Big) + \mu + b_i + b_u + \sum_{t \in T(i)} \sum_{j=1}^{k} b_{t c_j}$$

A(u)      set of user attributes
y_a       latent factor vector of user attribute a
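
The following sketch illustrates why attribute factors may help for new users: even when p_u carries no information, the sum of the attribute vectors y_a still gives the user a non-zero representation. The attribute names, the all-zero p_u for a new user, and all values are assumptions chosen for illustration; only the interaction term of the formula above is computed.

```python
from typing import Dict, List, Sequence


def user_vector(
    p_u: Sequence[float],
    attributes: List[str],
    y: Dict[str, Sequence[float]],
) -> List[float]:
    """Return p_u plus the sum of y_a over the user's known attributes A(u)."""
    vec = list(p_u)
    for a in attributes:
        for f, value in enumerate(y[a]):
            vec[f] += value
    return vec


# Hypothetical attribute factors (two latent dimensions).
y = {"age:18-25": [0.5, 0.0], "gender:f": [0.25, -0.5]}
q_i = [1.0, 2.0]

# Assumption: a new user has no learnt factors yet, so p_u is all zeros.
new_user = user_vector([0.0, 0.0], ["age:18-25", "gender:f"], y)

# Interaction term q_i^T (p_u + sum of y_a); the baselines are omitted here.
interaction = sum(q * p for q, p in zip(q_i, new_user))
```

The category-based variant has the same shape on the item side, with the category vectors x_t added to q_i instead.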






Evaluation
Discussion
• Offline evaluation of cold-start performance of CARSs is a complex task:
• Not done before

• Requires large (enough) contextually-tagged rating datasets with user and
item attributes

• Must consider multiple perspectives: new users, new items, new
contextual situations, mixtures of elementary cold-start cases, different
degrees of coldness, different types of user and item attribute information
available

Evaluation
Used Datasets
• 2 contextually-tagged rating datasets

                        STS      LDOS-CoMoDa (Odić et al., 2013)
Domain                  POIs     Movies
Rating scale            1-5      1-5
Ratings                 2,422    2,296
Users                   305      121
Items                   238      1,232
Contextual factors      14       12
Contextual conditions   57       49
Contextual situations   880      1,969
User attributes         7        4
Item features           1        7

Evaluation
Evaluation Procedure
• Five-fold cross-validation in which only the relevant subset of each testing
fold is used, depending on the cold-start situation under consideration
  • Divide the ratings into five cross-validation folds
  • For each fold k = 1, 2, …, 5
    • Use all ratings except those in fold k to train the prediction models
    • Calculate the Mean Absolute Error (MAE) on those ratings in fold k that
    come from new users, new items and new contextual situations, respectively
  • Users, items or contextual situations are new if they have at most n ratings
  in the training set, with n ranging from 0 to 10

• Advantage: allows testing for different degrees of coldness

• Drawback: the already small testing sets are filtered and become even smaller
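
The filtering step of this procedure can be sketched as follows for a single fold. The tuple layout of a rating, the function name `cold_start_mae`, and the constant toy predictor are assumptions for illustration; a real run would call this once per fold and per value of n with trained models.

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple

Rating = Tuple[str, str, str, float]  # (user, item, context, rating)


def cold_start_mae(
    train: List[Rating],
    test: List[Rating],
    predict: Callable[[str, str, str], float],
    max_profile_size: int,
    entity_index: int = 0,
) -> Optional[float]:
    """MAE on the test ratings whose user (0), item (1) or context (2)
    has at most max_profile_size ratings in the training set."""
    profile_sizes = Counter(r[entity_index] for r in train)
    errors = [
        abs(predict(u, i, c) - r)
        for (u, i, c, r) in test
        if profile_sizes[(u, i, c)[entity_index]] <= max_profile_size
    ]
    # None signals that no test rating matched the cold-start filter.
    return sum(errors) / len(errors) if errors else None


# Toy fold, invented for illustration.
train = [
    ("u1", "i1", "sunny", 4.0),
    ("u1", "i2", "rainy", 2.0),
    ("u2", "i1", "sunny", 5.0),
]
test = [("u3", "i1", "sunny", 3.0), ("u1", "i3", "rainy", 4.5)]


def constant_predictor(user: str, item: str, context: str) -> float:
    return 4.0


new_user_mae = cold_start_mae(train, test, constant_predictor, 0, entity_index=0)
new_item_mae = cold_start_mae(train, test, constant_predictor, 0, entity_index=1)
all_users_mae = cold_start_mae(train, test, constant_predictor, 2, entity_index=0)
```

The `None` return value makes the drawback noted above visible: for small datasets, some combinations of fold and n leave no test ratings at all.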

Evaluation
Obtained Results (1/3)
MAEs for new users
[Figure: two plots (CoMoDa and STS) of MAE against user profile size (0 to 10) for MF, CAMF-CC, SPF, Category-based CAMF-CC and Demographics-based CAMF-CC]

Evaluation
MAEs for new items
[Figure: two plots (CoMoDa and STS) of MAE against item profile size (0 to 10)]

Evaluation
MAEs for new contextual situations
[Figure: two plots (CoMoDa and STS) of MAE against context profile size (0 to 10)]


Heuristic Switching*
• Main idea: use a stable heuristic to switch between the basic CARS
algorithms depending on the encountered cold-start situation

[Figure: decision tree that routes a (user, item, context) tuple through the tests "New user?", "New item?" and "New context?" to one of CAMF-CC, Content-CAMF-CC, Demogr.-CAMF-CC, or the combination Content-CAMF-CC & Demogr.-CAMF-CC; the score of the selected branch is the final score]
* Described in our short paper submitted to ACM RecSys 2014
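
A switching hybrid of this kind could look like the Python sketch below. The exact branch order of the decision tree is not recoverable from the slide, so the rules here are an assumption: they follow the reported finding that the content-based variant suits new items and the demographics-based variant suits new users and new contexts. Averaging the two scores in the combined case is likewise an assumption.

```python
from typing import Callable, Dict

Predictor = Callable[[str, str, str], float]


def switching_hybrid(
    user: str,
    item: str,
    context: str,
    is_new_user: bool,
    is_new_item: bool,
    is_new_context: bool,
    models: Dict[str, Predictor],
) -> float:
    """Pick one basic CARS algorithm (or a pair) for the given cold-start case."""
    if is_new_user and is_new_item:
        # Assumed combination rule: plain average of the two scores.
        content = models["Content-CAMF-CC"](user, item, context)
        demographic = models["Demogr.-CAMF-CC"](user, item, context)
        return (content + demographic) / 2
    if is_new_item:
        return models["Content-CAMF-CC"](user, item, context)
    if is_new_user or is_new_context:
        return models["Demogr.-CAMF-CC"](user, item, context)
    # No cold-start condition detected: fall back to plain CAMF-CC.
    return models["CAMF-CC"](user, item, context)


# Stand-in predictors returning fixed scores, for illustration only.
models: Dict[str, Predictor] = {
    "CAMF-CC": lambda u, i, c: 3.5,
    "Content-CAMF-CC": lambda u, i, c: 4.0,
    "Demogr.-CAMF-CC": lambda u, i, c: 2.0,
}

both_new = switching_hybrid("u", "i", "c", True, True, False, models)
new_item_only = switching_hybrid("u", "i", "c", False, True, False, models)
new_context_only = switching_hybrid("u", "i", "c", False, False, True, models)
warm = switching_hybrid("u", "i", "c", False, False, False, models)
```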

Adaptive Weighted*
• Main idea: adaptively weight each basic CARS algorithm based on how well
it performs for the user, item and contextual situation in question

[Figure: three-stage architecture. Algorithms layer: CAMF-CC, SPF, Content-CAMF-CC and Demogr.-CAMF-CC each score the (user, item, context) tuple. Adaptive layer: one adapter per algorithm turns its score into a (score, weight) pair. Aggregation: the pairs are summed (∑) into the final score]
* Described in our paper submitted to ACM RecSys 2014 Doctoral Symposium
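
The aggregation stage can be sketched as a weighted combination of (score, weight) pairs. The slide only shows a sum, so normalising by the total weight and falling back to a plain average when all weights are zero are assumptions; how the adapters compute their weights is not specified here and is left out.

```python
from typing import List, Tuple


def aggregate(scored: List[Tuple[float, float]]) -> float:
    """Combine (score, weight) pairs from the adapters into one final score."""
    total_weight = sum(w for _, w in scored)
    if total_weight == 0:
        # Assumed fallback: no adapter trusts its algorithm, use a plain average.
        return sum(s for s, _ in scored) / len(scored)
    return sum(s * w for s, w in scored) / total_weight


# Hypothetical adapter outputs for CAMF-CC, SPF, Content-CAMF-CC, Demogr.-CAMF-CC.
final_score = aggregate([(4.0, 0.5), (3.0, 0.25), (5.0, 0.25), (2.0, 0.0)])
```

Dividing by the total weight keeps the final score on the same 1-5 scale as the individual predictions, which a bare weighted sum would not guarantee.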

Evaluation
Used Datasets
• 3 contextually-tagged rating datasets

                        STS      LDOS-CoMoDa             Music
                                 (Odić et al., 2013)     (Baltrunas et al., 2011)
Domain                  POIs     Movies                  Music
Rating scale            1-5      1-5                     1-5
Ratings                 2,534    2,296                   4,012
Users                   325      121                     139
Items                   249      1,232                   139
Contextual factors      14       12                      8
Contextual conditions   57       49                      26
Contextual situations   931      1,969                   26
User attributes         7        4                       10
Item features           1        7                       2

Evaluation
Evaluation Procedure
• Randomly split users / items / contexts into training set and testing set →
creates a set of users / items / contexts in the testing set that have no ratings
in the training set

• Advantage: the entire rating dataset can be used

• Drawback: cannot test for different degrees of coldness

[Figure: example rating matrices divided into training set and testing set for the new user test, the new item test and the new context test]
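
A minimal sketch of such an entity-level split is shown below. The rating tuple layout, the function name `split_by_entity`, the 20% test fraction and the fixed seed are assumptions for illustration; the slides do not state the split ratio.

```python
import random
from typing import List, Tuple

Rating = Tuple[str, str, str, float]  # (user, item, context, rating)


def split_by_entity(
    ratings: List[Rating],
    entity_index: int,
    test_fraction: float = 0.2,
    seed: int = 0,
) -> Tuple[List[Rating], List[Rating]]:
    """Hold out whole users (0), items (1) or contexts (2), so that held-out
    entities have no ratings at all in the training set."""
    entities = sorted({r[entity_index] for r in ratings})
    rng = random.Random(seed)
    rng.shuffle(entities)
    n_test = max(1, int(len(entities) * test_fraction))
    held_out = set(entities[:n_test])
    train = [r for r in ratings if r[entity_index] not in held_out]
    test = [r for r in ratings if r[entity_index] in held_out]
    return train, test


# Toy data, invented for illustration.
ratings = [
    ("u1", "i1", "sunny", 4.0),
    ("u1", "i2", "rainy", 2.0),
    ("u2", "i1", "sunny", 5.0),
    ("u3", "i3", "rainy", 3.0),
    ("u4", "i2", "sunny", 1.0),
    ("u5", "i1", "rainy", 4.0),
]
train, test = split_by_entity(ratings, entity_index=0)
```

Every rating ends up in exactly one of the two sets, which is the advantage noted above; the cost is that each held-out entity is completely cold, so intermediate profile sizes cannot be tested.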

Evaluation
Summary of Obtained Results
• Significant differences in normalised Discounted Cumulative Gain
(nDCG) and MAE between basic CARS algorithms across different
cold-start cases
• Content-based CAMF-CC works best for the new item situation

• Demographics-CAMF-CC works best both for the new user and new
context situation

• Hybridisation techniques can improve performance
• In almost all cases, they outperformed the state-of-the-art CARS
algorithms (i.e., CAMF-CC and SPF), thus easing the problem of model
selection
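
For readers unfamiliar with the two metrics named above, the sketch below computes MAE and one common formulation of nDCG for a single user's test items. The slides do not say which nDCG variant or cut-off was used, so the linear gain and the log2 discount here are assumptions, as are the function names.

```python
import math
from typing import List


def mae(predicted: List[float], actual: List[float]) -> float:
    """Mean Absolute Error between predicted and true ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def dcg(relevances: List[float]) -> float:
    """Discounted cumulative gain with linear gain and log2 discount."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(predicted: List[float], actual: List[float]) -> float:
    """DCG of the ranking induced by the predictions, divided by the ideal DCG."""
    order = sorted(range(len(actual)), key=lambda idx: predicted[idx], reverse=True)
    ranked = [actual[idx] for idx in order]
    ideal_dcg = dcg(sorted(actual, reverse=True))
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


# Toy ratings, invented for illustration.
true_ratings = [5.0, 1.0, 3.0]
good_predictions = [4.0, 2.0, 3.0]   # same ordering as the true ratings
bad_predictions = [1.0, 5.0, 3.0]    # best and worst items swapped

error = mae(good_predictions, true_ratings)
ndcg_good = ndcg(good_predictions, true_ratings)
ndcg_bad = ndcg(bad_predictions, true_ratings)
```

MAE measures how far the predicted ratings are from the true ones, while nDCG only depends on the order the predictions induce, so the two can rank algorithms differently.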


Conclusions
• Basic CARS algorithms perform very differently in the different cold-start
situations

• Knowledge of strengths and weaknesses of each basic CARS algorithm in the
various cold-start situations allows the development of hybrid techniques

• The first hybrid CARS algorithms developed and tested are able to outperform
the state-of-the-art CARS algorithms (i.e., CAMF-CC and SPF)

Open Issues
• Review additional knowledge sources which may be used to incorporate
additional information about users, items and contextual situations

• Check the availability of large-scale, contextually-tagged datasets with item
and user attributes

• Revise the evaluation procedure and evaluation metrics used

• Identify the best-performing hybridisation method for cold-start situations

• Design and execute a live user study

Questions or Comments?
Thank you.
