The document discusses the development of a music recommender system using the Beep Tunes dataset, outlining various approaches such as collaborative filtering and content-based filtering. It emphasizes the importance of user and item profiling, data analysis, and different model evaluation techniques including mean absolute error. The system's evaluation demonstrated significant improvements in recommendation accuracy by optimizing parameters and incorporating additional data features.
Building a Music Recommender System from Scratch on the Beep Tunes Dataset
Niloufar Farajpour, Mohamadreza Kiani, Mohamadreza Fereydooni, Tadeh Alexani
Rahnema College - Winter 2020
"As a data scientist, question the results, because often there is something you missed."
- Frank Kane
Collaborative Filtering (CF)

Memory Based: Find similar users based on cosine similarity or Pearson correlation and take a weighted average of their ratings.
- Advantage: Easy creation and explainability of results
- Disadvantage: Performance reduces when data is sparse, so it is not scalable

Model Based: Use machine learning to find user ratings of unrated items, e.g. PCA, SVD, Neural Nets, Matrix Factorization.
- Advantage: Dimensionality reduction deals with missing/sparse data
- Disadvantage: Inference is intractable because of hidden/latent factors
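The memory-based approach above can be sketched in a few lines. This is a minimal illustration on a hypothetical ratings matrix (not Beep Tunes data): find users who rated the target item, weight their ratings by cosine similarity to the target user, and take the weighted average.

```python
import numpy as np

# rows = users, columns = items; 0 means "not rated" (toy data, not Beep Tunes)
R = np.array([
    [5.0, 4.0, 0.0, 1.0],
    [4.0, 5.0, 1.0, 0.0],
    [1.0, 0.0, 5.0, 4.0],
])

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def predict(R, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    sims, rats = [], []
    for other in range(R.shape[0]):
        if other != user and R[other, item] > 0:
            sims.append(cosine(R[user], R[other]))
            rats.append(R[other, item])
    sims, rats = np.array(sims), np.array(rats)
    return float(sims @ rats / sims.sum())

print(predict(R, user=0, item=2))  # predicted rating for user 0 on item 2
```

Because the prediction loops over all users per item, the cost grows with the user base, which is exactly the scalability problem noted above.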
EDA
We considered two time parameters as well.
- Action-Publish:
  - If it is less than 10 days: 1.5
  - If it is more than 10 days: 1
- Action-Today:
  - 1 + (Action Year - 2011) / 10 + Action Month * 0.025
  - Example: 2016-5: 1 + (2016 - 2011)/10 + (5 * 0.025) = 1.625
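The two coefficients above translate directly into code. This is a sketch with function names of our own choosing:

```python
from datetime import date

def action_publish_coeff(action: date, publish: date) -> float:
    """Boost actions that happen within 10 days of the track's publish date."""
    return 1.5 if (action - publish).days < 10 else 1.0

def action_today_coeff(action: date) -> float:
    """1 + (action year - 2011) / 10 + action month * 0.025"""
    return 1 + (action.year - 2011) / 10 + action.month * 0.025

print(action_today_coeff(date(2016, 5, 1)))  # 1.625, matching the slide's example
```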
Evaluation: Outline of Online and Offline

[Diagram: two evaluation flows.]
- Online: the model produces a recommendation list (1- Track A, 2- Track B, 3- Track C); after the impression, the tracks the user actually interacts with are used to evaluate the list.
- Offline: the model is trained on a dataset for recommendation, produces a recommendation list, and that list is compared against held-out correct data.
Evaluation
- Split to train/test data using date (e.g. a year)
- From ~70,000,000 action records from 2011 to 2020:
  - Train data -> 2019 -> ~10,000,000 actions
  - Test data -> 2020 -> ~1,000,000 actions
Model: Model-Based CF
- Compute a correlation score for every column pair in the matrix
- This gives us a correlation score between every pair of tracks
- Problems:
  - Too long to compute
  - Sparseness
  - Scalability
Model: Matrix Factorization of User-Item Matrix

[Figure: a sparse user-item rating matrix (users A-D, items W-Z) factorized as User Matrix (4x2) x Item Matrix (2x4), e.g. user rows such as (1.2, 0.8) multiplied by item columns such as (1.5, 1.7).]
Model: Matrix Factorization of User-Item Matrix
- Latent factors are the features in the lower-dimensional latent space projected from the user-item interaction matrix.
- Matrix factorization is one of the most effective dimension-reduction techniques in machine learning.
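As a sketch of the idea, we can build a rank-2 user-item matrix from a 4x2 user-factor matrix and a 2x4 item-factor matrix (factor values taken from the slide's illustration), then recover factors of the same rank with a truncated SVD:

```python
import numpy as np

# factor values from the slide's illustration
user_true = np.array([[1.2, 0.8],
                      [1.4, 0.9],
                      [1.5, 1.0],
                      [1.2, 0.8]])
item_true = np.array([[1.5, 1.2, 1.0, 0.8],
                      [1.7, 0.6, 1.1, 0.4]])
R = user_true @ item_true            # 4x4 user-item matrix (rank 2)

U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 2                                # number of latent factors
user_factors = U[:, :k] * s[:k]      # 4x2 user matrix
item_factors = Vt[:k, :]             # 2x4 item matrix

# because R has rank 2, two latent factors reconstruct it exactly
assert np.allclose(user_factors @ item_factors, R)
```

On real data the matrix is sparse and the factorization only approximates the observed entries, but the unobserved entries of the reconstruction serve as predicted ratings.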
Model: ALS
Alternating Least Squares (ALS) is also a matrix factorization algorithm, and it runs in a parallel fashion.
Model: ALS
- Solves scalability and sparseness of the ratings data
- It's simple and scales well to very large datasets
Optimization: Evaluation Result
Old result:
- Using 2019 data as train (~10M)
- Using 2020 data as test (~1M)
- Total Users: 116,526
- Mean Score: 0.01%
- Max Score: 40%
New result:
- Using all data (~70M)
- Added date coefficients
- Found best parameters
- Total Users: 577,457
- Mean Score: 1.1%
- Max Score: 50%
~110x improvement in mean score based on the new data
Model: Content-Based Filtering
- Item profile: for each track we should construct a vector based on its features, like the tags and artists it has
- User profile: for each user we need a vector that shows their interests, based on ratings or likes and downloads
Model: Content-Based Filtering / User Profile
- User has rated items with profiles i1, i2, i3, ..., in
- One approach is a weighted average of the rated item profiles
Model: Content-Based Filtering / User Profile
- Items are songs; the only feature is "tag"
- Item profile: vector with 0 or 1 for each tag
- Suppose user x has downloaded or liked 5 songs:
  - 2 songs featuring TAG A
  - 3 songs featuring TAG B
- User profile = mean of item profiles:
  - Feature A's weight = 2/5 = 0.4
  - Feature B's weight = 3/5 = 0.6
Pros
- No need for data on OTHER users
- Able to recommend to users with unique tastes
- Able to recommend new & unpopular items
- No first-rater problem
- Explanations for recommended items
Cons
- Finding the appropriate features is hard
- Overspecialization
- Never recommends items outside the user's content profile