Collaborative filtering (CF) is a successful approach commonly used by many recommender systems. Conventional
CF-based methods use the ratings given to items by users
as the sole source of information for learning to make recommendations. However, the ratings are often very sparse in
many applications, causing CF-based methods to degrade
significantly in their recommendation performance. To address this sparsity problem, auxiliary information such as
item content information may be utilized. Collaborative
topic regression (CTR) is an appealing recent method taking
this approach which tightly couples the two components that
learn from two different sources of information. Nevertheless, the latent representation learned by CTR may not be
very effective when the auxiliary information is very sparse.
To address this problem, we generalize recent advances in
deep learning from i.i.d. input to non-i.i.d. (CF-based) input and propose in this paper a hierarchical Bayesian model
called collaborative deep learning (CDL), which jointly performs deep representation learning for the content information and collaborative filtering for the ratings (feedback)
matrix. Extensive experiments on three real-world datasets
from different domains show that CDL can significantly advance the state of the art.
What is a Recommendation System?
A recommendation system is an information filtering technique that provides users
with information they may be interested in.
Motivation
Collaborative Deep Learning for Recommender Systems
Hao Wang, Naiyan Wang, Dit-Yan Yeung
Hong Kong University of Science and Technology
hwangaz@cse.ust.hk, winsty@gmail.com, dyyeung@cse.ust.hk
arXiv:1409.2944v2 [cs.LG], 18 Jun 2015
Data sparsity
In a recommendation system, data sparsity is the inability to find a sufficient
number of good-quality neighbors to aid in the prediction process, due to
insufficient overlap of ratings between the active user and his or her neighbors.
We can tackle sparsity using various algorithms such as
collaborative filtering, matrix factorization (the SVD technique),
and the K-means model.
Collaborative Filtering
CF can be memory based or model based.
Our approach is model based: we apply different models to our data
and compare the accuracy of each model.
Collaborative filtering model
Matrix factorization (SVD)
K-means algorithm
New approach: collaborative deep learning
The mean squared error (MSE), or mean squared deviation (MSD), of an estimator (of
a procedure for estimating an unobserved quantity) measures the average of the
squares of the errors, that is, the differences between the estimator and
what is estimated.
Regularization, in mathematics and statistics, and particularly in the fields of
machine learning and inverse problems, is a process of introducing additional
information in order to solve an ill-posed problem or to prevent overfitting.
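These two ideas combine into the regularized squared-error objective that the CF algorithm on the next slide minimizes. A minimal sketch of the standard matrix-factorization form, assuming the notation r_ij for observed ratings, theta_i for user parameters, x_j for item features, and lambda for the regularization weight (this notation is illustrative, not from the slides):

```latex
\min_{\theta,\, x}\;
\frac{1}{2}\sum_{(i,j):\, r_{ij}\ \text{observed}} \left( \theta_i^{T} x_j - r_{ij} \right)^2
\;+\; \frac{\lambda}{2}\sum_i \lVert \theta_i \rVert^2
\;+\; \frac{\lambda}{2}\sum_j \lVert x_j \rVert^2
```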
CF Algorithm
INITIALIZE the features and parameters to small random values.
MINIMIZE the regularized squared error using GRADIENT DESCENT.
Then, for a user with parameters THETA and a movie with learned features X, predict the star rating as THETA^T X (see the sketch below).
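A minimal NumPy sketch of this procedure; the names `train_cf`, `mask`, `k`, `lam`, and `lr` are illustrative assumptions, not from the slides:

```python
import numpy as np

def train_cf(R, mask, k=10, lam=0.1, lr=0.01, epochs=200):
    """Matrix-factorization CF trained with full-batch gradient descent.

    R    : (n_users, n_items) rating matrix
    mask : same shape, 1 where a rating is observed, 0 otherwise
    """
    n_users, n_items = R.shape
    theta = 0.01 * np.random.randn(n_users, k)   # user parameters (initialized small)
    X = 0.01 * np.random.randn(n_items, k)       # item (movie) features

    for _ in range(epochs):
        pred = theta @ X.T                       # predicted ratings
        err = mask * (pred - R)                  # error only on observed entries
        grad_theta = err @ X + lam * theta       # gradient w.r.t. user parameters
        grad_X = err.T @ theta + lam * X         # gradient w.r.t. item features
        theta -= lr * grad_theta
        X -= lr * grad_X
    return theta, X

# Predicted star rating for user i and movie j: theta[i] @ X[j]
```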
Solving Sparsity Using State-of-the-Art Methods: K-means (Algorithm)
1. Assume two initial mean points for the given clusters (k = 2).
2. Using the Euclidean distance formula, calculate the distance of each point to each mean:
   $\mathrm{dist}(x, a) = \sqrt{\sum_i (x_i - a_i)^2}$.
3. Tabulate the data with reference to the clusters (assign each point to its nearest mean).
4. Display the clusters.
5. Recalculate the mean for the new clusters and repeat steps 2 to 4.
6. Stop when the same clusters are formed in successive iterations (see the sketch below).
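A minimal NumPy sketch of these steps; the function and variable names (`kmeans`, `points`, `means`) are illustrative assumptions:

```python
import numpy as np

def kmeans(points, k=2, max_iters=100):
    """Plain k-means: assign points to the nearest mean, then recompute the means."""
    # Step 1: assume k initial mean points (here: k random data points)
    means = points[np.random.choice(len(points), k, replace=False)]
    for _ in range(max_iters):
        # Step 2: Euclidean distance from every point to every mean
        dists = np.linalg.norm(points[:, None, :] - means[None, :, :], axis=2)
        # Step 3: assign each point to its nearest cluster
        labels = dists.argmin(axis=1)
        # Step 5: recalculate the mean of each new cluster (keep old mean if empty)
        new_means = np.array([
            points[labels == c].mean(axis=0) if np.any(labels == c) else means[c]
            for c in range(k)
        ])
        # Step 6: stop when the clusters no longer change
        if np.allclose(new_means, means):
            break
        means = new_means
    return labels, means
```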
Why do we need K-medoids?
1. K-means is a widely used and very efficient method, but it has some weaknesses (e.g., sensitivity to outliers).
2. The K-medoids method works along similar lines to K-means.
3. It forms k clusters of the present data set.
4. It picks points from the data set randomly for the first iteration.
5. It uses an actual data point (the medoid) as the center of each cluster
   rather than the distance mean (see the sketch below).
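A minimal sketch of the medoid update that replaces the mean step; the function name `update_medoid` is an illustrative assumption:

```python
import numpy as np

def update_medoid(cluster_points):
    """Pick the actual data point with the smallest total distance to all
    other points in the cluster (the medoid), instead of the arithmetic mean."""
    # Pairwise Euclidean distances within the cluster
    dists = np.linalg.norm(cluster_points[:, None, :] - cluster_points[None, :, :], axis=2)
    # The medoid minimizes the summed distance to every other point
    return cluster_points[dists.sum(axis=1).argmin()]
```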
Problems faced in the K-means method
1. Improper picking of the initial points.
2. Missing out on boundary points.
Solutions:
1. Sampling
2. Picking "dispersed" points as initial centers (see the sketch below)
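A minimal sketch of picking dispersed initial points using a farthest-first heuristic; the function name `dispersed_init` is an illustrative assumption:

```python
import numpy as np

def dispersed_init(points, k):
    """Pick k initial centers that are spread far apart: start from a random
    point, then repeatedly add the point farthest from the centers chosen so far."""
    centers = [points[np.random.randint(len(points))]]
    for _ in range(k - 1):
        # Distance of every point to its nearest already-chosen center
        dists = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[dists.argmax()])
    return np.array(centers)
```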
Best Approach: Collaborative Deep Learning
All the previously discussed algorithms do not perform well when the data is sparse,
i.e., their accuracy drops.
Hence we introduce a new hierarchical Bayesian model
called collaborative deep learning (CDL), which significantly advances the state of
the art.
We first present a Bayesian formulation of a deep learning model called the
stacked denoising autoencoder (SDAE).
By performing deep learning collaboratively, CDL can
simultaneously extract an effective deep feature representation from content
and capture the similarity and implicit relationship between items (and users).
Collaborative Deep Learning: SDAE
The SDAE is a feedforward neural network for learning representations (encodings) of the input
data by learning to predict the clean input itself in the output.
SDAE solves the following optimization problem.
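The formula on the original slide did not survive the export; a sketch of the SDAE objective roughly as formulated in the CDL paper, where X_0 is the corrupted input, X_c the clean input, X_L the output of the L-layer network, W_l and b_l the layer weights and biases, and lambda a regularization parameter (the exact regularization terms may differ slightly):

```latex
\min_{\{W_l\},\,\{b_l\}} \;
\lVert X_c - X_L \rVert_F^2
\;+\; \lambda \sum_l \lVert W_l \rVert_F^2
```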
MAXIMUM A POSTERIORI (MAP) ESTIMATES
In Bayesian statistics, a maximum a posteriori probability
(MAP) estimate is an estimate of an unknown quantity that
equals the mode of the posterior distribution. The MAP can be
used to obtain a point estimate of an unobserved quantity on
the basis of empirical data.
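A short formula making this concrete, with theta the unknown quantity and D the observed data (standard notation, not from the slides):

```latex
\hat{\theta}_{\mathrm{MAP}}
\;=\; \arg\max_{\theta}\, p(\theta \mid D)
\;=\; \arg\max_{\theta}\, p(D \mid \theta)\, p(\theta)
```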
Collaborative Deep Learning
Viewed from the perspective of neural networks (NN), when λ_s approaches
positive infinity, training the probabilistic graphical model of CDL in
Figure 1 (left) degenerates to simultaneously training two neural
networks overlaid together with a common input layer (the corrupted input) but
different output layers, as shown in Figure 3.
Predicted ratings: $E[R_{ij} \mid D] \approx E[u_i \mid D]^T \left( E[f_e(X_{0,j*}, W^+)^T \mid D] + E[\epsilon_j \mid D] \right)$
Approximated as: $R^*_{ij} \approx (u_i^*)^T \left( f_e(X_{0,j*}, W^{+*})^T + \epsilon_j^* \right) = (u_i^*)^T v_j^*$
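A minimal sketch of this prediction step; the names `predict_rating`, `encode`, `U`, `X0`, and `eps` are illustrative assumptions:

```python
import numpy as np

def predict_rating(U, X0, eps, encode, i, j):
    """Approximate CDL prediction: the item vector is the SDAE encoding of the
    item's (corrupted) content plus a latent offset, and the rating is its
    dot product with the user vector."""
    v_j = encode(X0[j]) + eps[j]   # item latent vector v_j*
    return U[i] @ v_j              # predicted rating (u_i*)^T v_j*
```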
Collaborative Deep Learning
EVALUATION SCHEME
We randomly select P items associated with each user to form the
training set and use all the rest of the dataset as the test set.
We use recall as the performance measure for all our training algorithms (a sketch of the recall computation follows).
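A minimal sketch of recall@M for one user, in the sense used by the CDL paper's evaluation; the function and variable names are illustrative assumptions:

```python
def recall_at_m(ranked_items, test_items, M):
    """recall@M = (# of held-out test items that appear in the top-M recommendations)
                  / (total # of held-out test items for this user)."""
    top_m = set(ranked_items[:M])
    hits = sum(1 for item in test_items if item in top_m)
    return hits / len(test_items)

# Example: items ranked by predicted rating, 3 held-out test items
# recall_at_m([5, 2, 9, 7, 1], test_items=[2, 7, 4], M=3) -> 1/3
```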