Recommender Systems:
Model-based collaborative filtering
AAA-Python Edition
Plan
● 1- SVD filtering: With Surprise
● 2- SVD Filtering: More details
● 3- Filtering with SVM Classification
● 4- Some Tests
● 5- Predictions with Custom Data: Preparation
● 6- Predictions with Custom Data: Prediction
1- SVD filtering: With Surprise
[By Amina Delali]
Prediction
● The estimated review (rating) is equal to 4.16.
● Slightly better performance compared with neighborhood filtering.
Concept
● Assume that there are factors (characteristics) related to each item. Each item can be described by the degree to which each characteristic is present in that item. At the same time, each user can have a different degree of interest in each of those characteristics.
● These two relationships can be modeled by two matrices:
➢ P(m,f): models the interest of each user u in the f characteristics as a row vector $p_u$
➢ Q(n,f): models the extent to which each characteristic is present in an item i as a row vector $q_i$
● The interaction between each user and item is computed by:
➢ $q_i^T \cdot p_u$, which could estimate the rating of the user u for the item i
➢ The estimate is enhanced by other parameters that account for the bias in ratings (the overall mean $\mu$, the user bias $b_u$ and the item bias $b_i$):
$$\hat{r}_{ui} = \mu + b_u + b_i + q_i^T \cdot p_u$$
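As a minimal sketch of the estimate above, the following Python function computes the mean, the two biases and the factor interaction for one user-item pair. The function name and the numeric values are illustrative assumptions and do not come from the slides.

```python
def predict_rating(mu, b_u, b_i, q_i, p_u):
    """Estimate r_ui = mu + b_u + b_i + q_i . p_u for one user/item pair."""
    if len(q_i) != len(p_u):
        raise ValueError("q_i and p_u must have the same number of factors")
    # Dot product between the item factors and the user factors.
    interaction = sum(q * p for q, p in zip(q_i, p_u))
    return mu + b_u + b_i + interaction


# Hypothetical values with f = 2 factors.
estimate = predict_rating(mu=3.5, b_u=0.25, b_i=-0.5,
                          q_i=[1.0, 0.5], p_u=[0.5, 1.0])
print(estimate)  # prints 4.25
```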
Computation
● Singular value decomposition (SVD) could be used to extract the matrices P and Q. The bias values could also be estimated from the ratings: the mean of all the ratings, the mean of the ratings of each user, and the mean of the ratings of each item.
● The problem is that not all the ratings of all the users for all the items are available. This is why we have to find another way to estimate these values.
● The estimated values should minimize the following expression:
$$\sum_{r_{ui} \in R_{train}} (r_{ui} - \hat{r}_{ui})^2 + \lambda \left( b_i^2 + b_u^2 + \|q_i\|^2 + \|p_u\|^2 \right)$$
➢ The sum over $r_{ui} \in R_{train}$ considers only the available ratings.
➢ $\lambda$ is a regularization parameter (a constant value).
➢ $\|q_i\|^2$ is the square of the norm of the vector $q_i$. The norm of $q_i$ is the square root of the sum of the squares of the $q_i$ values.
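The expression above can be sketched in plain Python as follows. This is only an illustration: the function name, the dictionary-based data layout and the toy values are assumptions, and the penalty is applied once per available rating, which is one common reading of the formula.

```python
def regularized_loss(ratings, mu, b_user, b_item, p, q, lam):
    """Squared error over the known ratings plus the L2 penalty.

    ratings: list of (user, item, rating) tuples (available ratings only)
    b_user, b_item: dicts of biases; p, q: dicts of factor lists
    lam: the regularization constant (lambda)
    """
    total = 0.0
    for u, i, r_ui in ratings:
        dot = sum(a * b for a, b in zip(q[i], p[u]))
        r_hat = mu + b_user[u] + b_item[i] + dot
        sq_norm_q = sum(v * v for v in q[i])  # ||q_i||^2
        sq_norm_p = sum(v * v for v in p[u])  # ||p_u||^2
        penalty = b_item[i] ** 2 + b_user[u] ** 2 + sq_norm_q + sq_norm_p
        total += (r_ui - r_hat) ** 2 + lam * penalty
    return total


# Hypothetical toy values: one known rating, two factors.
loss = regularized_loss(
    ratings=[("u1", "i1", 4.0)],
    mu=3.0,
    b_user={"u1": 0.5},
    b_item={"i1": 0.0},
    p={"u1": [1.0, 0.0]},
    q={"i1": [0.5, 2.0]},
    lam=0.5,
)
print(loss)  # prints 2.75 (zero error, penalty 0.5 * 5.5)
```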
2- SVD Filtering: More details
Stochastic Gradient Descent
● Gradient descent is an iterative algorithm that tries to find the minimum (or a local minimum) of a function. In machine learning, variants of gradient descent are used to estimate a model's parameters by minimizing a cost function, repeatedly updating these parameters.
● SGD (stochastic gradient descent) is a variant in which, in one iteration (epoch), the parameters are updated for each sample (in our case, for each rating). So in one epoch the parameters can be updated several times:
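➢ The 4 parameters ($b_u$, $b_i$, $p_u$, $q_i$) are initialized.
➢ For each rating $r_{ui}$, a prediction $\hat{r}_{ui}$ is made and the difference $e_{ui} = r_{ui} - \hat{r}_{ui}$ is computed.
➔ Then, the difference is used to update the parameter values as follows:
$$b_u \leftarrow b_u + \gamma (e_{ui} - \lambda b_u)$$
$$b_i \leftarrow b_i + \gamma (e_{ui} - \lambda b_i)$$
$$p_u \leftarrow p_u + \gamma (e_{ui} \cdot q_i - \lambda p_u)$$
$$q_i \leftarrow q_i + \gamma (e_{ui} \cdot p_u - \lambda q_i)$$
➢ $\gamma$ is the learning rate: another constant value.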
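The four update rules can be written as a single step for one rating. The sketch below is a plain-Python illustration, with a hypothetical function name and toy values. It assumes that both factor updates use the values from before the step.

```python
def sgd_step(r_ui, mu, b_u, b_i, p_u, q_i, gamma, lam):
    """Apply the four update rules for a single rating and return new values."""
    dot = sum(q * p for q, p in zip(q_i, p_u))
    e_ui = r_ui - (mu + b_u + b_i + dot)  # prediction error
    new_b_u = b_u + gamma * (e_ui - lam * b_u)
    new_b_i = b_i + gamma * (e_ui - lam * b_i)
    # Both factor updates are computed from the old p_u and q_i.
    new_p_u = [p + gamma * (e_ui * q - lam * p) for p, q in zip(p_u, q_i)]
    new_q_i = [q + gamma * (e_ui * p - lam * q) for p, q in zip(p_u, q_i)]
    return new_b_u, new_b_i, new_p_u, new_q_i, e_ui


# Hypothetical toy values: the model predicts 3.0 while the true rating is 5.0.
b_u, b_i, p_u, q_i, e_ui = sgd_step(
    r_ui=5.0, mu=3.0, b_u=0.0, b_i=0.0,
    p_u=[1.0, 0.0], q_i=[0.0, 1.0], gamma=0.1, lam=0.5,
)
print(e_ui, b_u, b_i, p_u, q_i)
```

A positive error pushes both biases up and moves each factor vector toward the other, while the λ term shrinks every parameter slightly.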
Stochastic Gradient Descent (continued)
➢ The process is repeated for a certain number of iterations in order to find a local minimum of the previous expression.
● In the Surprise library, the parameters are as follows:
➢ The parameters $b_u$ and $b_i$ (also called baselines) are initialized to 0.
➢ The user and item factors $p_u$ and $q_i$ are randomly initialized according to a normal distribution defined by the mean init_mean and the standard deviation init_std_dev parameters.
➢ $\gamma$ (lr_all) is set by default to 0.005, and $\lambda$ (reg_all) to 0.02.
➢ By default, the number of factors (n_factors) is 100.
➢ The number of iterations (n_epochs) is set by default to 20.
➢ To use the bias (baseline) parameters, the biased parameter is set by default to True.
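To make the role of these settings concrete, here is a library-free sketch of the whole training procedure. It is not Surprise's implementation. The keyword names mirror the library's documented parameters and defaults (n_factors=100, n_epochs=20, lr_all=0.005, reg_all=0.02, init_mean=0, init_std_dev=0.1), while the seed argument, the function name and the toy ratings are assumptions added for illustration.

```python
import random


def train_svd(ratings, n_factors=100, n_epochs=20, lr_all=0.005,
              reg_all=0.02, init_mean=0.0, init_std_dev=0.1, seed=0):
    """Train a biased matrix-factorization model with SGD.

    ratings: list of (user, item, rating) tuples (known ratings only).
    Returns predict(user, item), valid for users and items seen in training.
    """
    rng = random.Random(seed)  # hypothetical parameter, for reproducibility
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    mu = sum(r for _, _, r in ratings) / len(ratings)

    # Baselines start at 0; factors are drawn from a normal distribution.
    b_u = {u: 0.0 for u in users}
    b_i = {i: 0.0 for i in items}
    p = {u: [rng.gauss(init_mean, init_std_dev) for _ in range(n_factors)]
         for u in users}
    q = {i: [rng.gauss(init_mean, init_std_dev) for _ in range(n_factors)]
         for i in items}

    for _ in range(n_epochs):
        for u, i, r in ratings:  # one update per rating
            dot = sum(a * b for a, b in zip(q[i], p[u]))
            err = r - (mu + b_u[u] + b_i[i] + dot)
            b_u[u] += lr_all * (err - reg_all * b_u[u])
            b_i[i] += lr_all * (err - reg_all * b_i[i])
            for f in range(n_factors):
                p_uf, q_if = p[u][f], q[i][f]
                p[u][f] += lr_all * (err * q_if - reg_all * p_uf)
                q[i][f] += lr_all * (err * p_uf - reg_all * q_if)

    def predict(user, item):
        dot = sum(a * b for a, b in zip(q[item], p[user]))
        return mu + b_u[user] + b_i[item] + dot

    return predict


# Hypothetical toy ratings: 3 users, 2 items.
toy = [("a", "x", 5.0), ("a", "y", 4.0), ("b", "x", 2.0),
       ("b", "y", 1.0), ("c", "x", 4.0), ("c", "y", 3.0)]
predict = train_svd(toy, n_factors=5, n_epochs=200)
print(round(predict("a", "x"), 2))
```

With so few ratings, more epochs than the default are used in the call so that the biases have time to move away from 0.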
Another example with GridSearchCV
● Measures used: Root Mean Square Error (RMSE) and Mean Absolute Error (MAE)
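For reference, the two measures can be computed as in the sketch below. The function names and the (true rating, estimated rating) pairs are illustrative assumptions.

```python
import math


def rmse(pairs):
    """Root Mean Square Error over (true_rating, estimated_rating) pairs."""
    return math.sqrt(sum((r - e) ** 2 for r, e in pairs) / len(pairs))


def mae(pairs):
    """Mean Absolute Error over (true_rating, estimated_rating) pairs."""
    return sum(abs(r - e) for r, e in pairs) / len(pairs)


# Hypothetical predictions: errors are 1, 0, 2 and -2.
pairs = [(4.0, 3.0), (2.0, 2.0), (5.0, 3.0), (1.0, 3.0)]
print(rmse(pairs))  # prints 1.5
print(mae(pairs))   # prints 1.25
```

RMSE penalizes large errors more heavily than MAE, which is why it is the larger of the two here.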
3- Filtering with SVM Classification
Concept
● The other way to perform model-based collaborative filtering is to train a model on users' reviews, and then to use that model to predict new ones for new items.
● In this lesson we will present an implementation using an SVM (Support Vector Machine). More precisely, we will use a linear SVM classifier to predict the new reviews.
● As described in [Xia et al., 2006], there are two ways to consider the problem:
➢ Each item represents a class, and the training set is the users' ratings for each item other than that item.
➢ Each user represents a class, and the training set is the items' ratings according to each user other than that user.
● The problem here is that the matrices representing the ratings will not be complete. So we will use default values for the missing ratings.
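Filling the incomplete matrix with a default value could look like the sketch below. The slides do not say which default is used, so the value 0.0 is an assumption, as are the function name and the toy ratings.

```python
def build_rating_matrix(ratings, users, items, default=0.0):
    """Return one row per item and one column per user, with gaps filled."""
    known = {(u, i): r for u, i, r in ratings}
    return [[known.get((u, i), default) for u in users] for i in items]


# Hypothetical ratings: user "u2" did not rate item "i2".
ratings = [("u1", "i1", 4.0), ("u2", "i1", 3.0), ("u1", "i2", 5.0)]
matrix = build_rating_matrix(ratings, users=["u1", "u2"], items=["i1", "i2"])
print(matrix)  # prints [[4.0, 3.0], [5.0, 0.0]]
```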
The original data
● We will use the data we already downloaded using the Dataset module from Surprise. But first, we will directly access the downloaded dataset file to see its content.
The features and labels
● We will apply an SVC classifier for one user, and the classes will be the different ratings.
● We have to construct the feature matrix corresponding to the ratings of each item rated by the user "226", and construct the corresponding label vector using the ratings of that user.
● It is more convenient to use the data built by the Surprise library than the original file.
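One way to read this construction: each item rated by the active user becomes a sample, the other users' ratings of that item are its features, and the active user's rating is its label. The sketch below follows that reading. The function name, the neighbouring user ids, the toy matrix and the use of 0.0 as the marker for a missing rating are all assumptions.

```python
def features_and_labels(matrix, users, active_user, missing=0.0):
    """Split an item-by-user rating matrix into features X and labels y.

    matrix: one row per item, one column per user (same order as `users`).
    Rows where the active user has no rating are skipped.
    """
    col = users.index(active_user)
    X, y = [], []
    for row in matrix:
        if row[col] == missing:  # the active user did not rate this item
            continue
        X.append(row[:col] + row[col + 1:])  # drop the active user's column
        y.append(int(row[col]))              # the rating is the class label
    return X, y


# Hypothetical item-by-user matrix for users "225", "226" and "227".
users = ["225", "226", "227"]
matrix = [[4.0, 5.0, 0.0],
          [0.0, 0.0, 3.0],
          [2.0, 3.0, 1.0]]
X, y = features_and_labels(matrix, users, active_user="226")
print(X)  # prints [[4.0, 0.0], [2.0, 1.0]]
print(y)  # prints [5, 3]
```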
The features and labels (continued)
● All these values are unavailable ratings, which means that the corresponding users didn't rate the corresponding items.
Prediction for one item
● A linear SVM classifier
● After dropping the column corresponding to the user 218 (“226”)
● All the models we used to predict the user's rating for that item gave predicted values either approaching 4 or slightly greater than 4
4- Some Tests
Splitting the data
● We will just split the data that we have already created, using 2 methods:
➢ split into test and training sets
➢ split into folds (cross-validation)
● We will not run our tests on all the data as in the previous examples.
● We will use only the 50 items related to the (active) user “226”.
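The two splitting methods can be illustrated without any library, as in the sketch below. The function names, the 25% test ratio, the number of folds and the use of integers as stand-ins for the 50 items are assumptions.

```python
import random


def split_train_test(rows, test_ratio=0.25, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def k_folds(rows, k=5):
    """Yield (train, test) pairs; each row is in exactly one test fold."""
    for fold in range(k):
        test = rows[fold::k]
        train = [r for idx, r in enumerate(rows) if idx % k != fold]
        yield train, test


items = list(range(50))  # stand-ins for the 50 items of the active user
train, test = split_train_test(items)
print(len(train), len(test))  # prints 38 12
for fold_train, fold_test in k_folds(items, k=5):
    print(len(fold_train), len(fold_test))  # prints 40 10, five times
```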
The prediction with the train/test split
● The missing label is not represented
Prediction with cross-validation
● To see the available measures (scoring)
● Same results as with KNN collaborative filtering
5- Predictions with Custom Data: Preparation
The data
● We will use the data available at: Artificial Intelligence with Python GitHub Repository
● No rating is available for the movie “Ranging Bull” by “Bill Duffy”.
● The way the data is organized is not convenient for Surprise, so we will have to rearrange the data.
● A user's name: later it will be the user's raw_id
● Movie names
Prepare the data
● To be used with Surprise, the DataFrame must have its columns organized this way: user_id, item_id and ratings, which is not the case in our DataFrame.
● Now, the movie names are in a column.
● All the users and the corresponding ratings are in 2 columns (wide-to-long conversion).
Prepare the data (continued)
● Reorder the columns
● Drop the rows corresponding to the missing user-item ratings
● The rating scale will be from 1 to 5
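The preparation steps (wide-to-long conversion, column order, dropping missing ratings, checking the scale) can be sketched in plain Python as follows. In pandas the same result is typically obtained with melt and dropna. The function name, the nested-dictionary layout and the toy users, movies and ratings are assumptions and are not taken from the dataset.

```python
RATING_SCALE = (1, 5)  # scale stated in the slides


def wide_to_long(wide):
    """Convert {user: {movie: rating or None}} into (user, movie, rating) rows.

    Rows with a missing rating (None) are dropped, and every kept rating
    must lie within RATING_SCALE.
    """
    low, high = RATING_SCALE
    long_rows = []
    for user, movies in wide.items():
        for movie, rating in movies.items():
            if rating is None:  # missing user-item rating: drop the row
                continue
            if not low <= rating <= high:
                raise ValueError(f"rating {rating} is outside {RATING_SCALE}")
            long_rows.append((user, movie, float(rating)))
    return long_rows


# Hypothetical wide data: user_a has no rating for movie_y.
wide = {
    "user_a": {"movie_x": 4.5, "movie_y": None},
    "user_b": {"movie_x": 3.0, "movie_y": 5.0},
}
for row in wide_to_long(wide):
    print(row)
```

Each output row follows the user, item, rating order that the slides require for Surprise.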
6- Predictions with Custom Data: Prediction
Predict a review for one item
● We will use the SVD technique to predict the review of the user Adam Cohen for the movie Ranging Bull.
● If we wanted to use an SVM classifier, we would:
➢ Use the original DataFrame, and select only the rows corresponding to the movies rated by “Adam”
➢ Use the Ranging Bull raw values for prediction
➢ The NaN values must be replaced by a default value
● Load the data from the DataFrame we already prepared.
Make a list of recommendations
● The user Chris Duncan rated only 2 movies. We will make a list of recommendations of movies he didn't rate by:
➢ predicting his reviews for these movies
➢ ordering the predicted reviews
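The two steps (predict, then order) can be sketched independently of the model used. In the example below, the function name, the movie ids and the stub predictor with fixed scores are assumptions standing in for a trained model.

```python
def recommend(user, all_items, rated_items, predict, top_n=3):
    """Rank the items the user has not rated by their predicted rating."""
    candidates = [i for i in all_items if i not in rated_items]
    scored = [(item, predict(user, item)) for item in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # best first
    return scored[:top_n]


# Hypothetical stub: fixed scores instead of a trained model.
fake_scores = {"m2": 3.2, "m3": 4.1}


def stub_predict(user, item):
    return fake_scores[item]


recs = recommend("some_user", ["m1", "m2", "m3", "m4"],
                 rated_items={"m1", "m4"}, predict=stub_predict)
print(recs)  # prints [('m3', 4.1), ('m2', 3.2)]
```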
References
● [Buitinck et al., 2013] Buitinck, L., Louppe, G., Blondel, M., Pedregosa, F., Mueller, A., Grisel, O., Niculae, V., Prettenhofer, P., Gramfort, A., Grobler, J., Layton, R., VanderPlas, J., Joly, A., Holt, B., and Varoquaux, G. (2013). API design for machine learning software: experiences from the scikit-learn project. In ECML PKDD Workshop: Languages for Data Mining and Machine Learning, pages 108–122.
● [Francesco et al., 2011] Francesco, R., Lior, R., Bracha, S., and Paul B., K., editors (2011). Recommender Systems Handbook. Springer Science+Business Media.
● [Hug, 2017] Hug, N. (2017). Surprise, a Python library for recommender systems. http://surpriselib.com.
● [Xia et al., 2006] Xia, Z., Dong, Y., and Xing, G. (2006). Support vector machines for collaborative filtering. In Proceedings of the 44th annual Southeast regional conference, pages 169–174. ACM.
Thank you for all your time!

More Related Content

Similar to Aaa ped-20-Recommender Systems: Model-based collaborative filtering

Aaa ped-19-Recommender Systems: Neighborhood-based Filtering
Aaa ped-19-Recommender Systems: Neighborhood-based FilteringAaa ped-19-Recommender Systems: Neighborhood-based Filtering
Aaa ped-19-Recommender Systems: Neighborhood-based Filtering
AminaRepo
 
Movie recommendation Engine using Artificial Intelligence
Movie recommendation Engine using Artificial IntelligenceMovie recommendation Engine using Artificial Intelligence
Movie recommendation Engine using Artificial Intelligence
Harivamshi D
 
Hadoop France meetup Feb2016 : recommendations with spark
Hadoop France meetup  Feb2016 : recommendations with sparkHadoop France meetup  Feb2016 : recommendations with spark
Hadoop France meetup Feb2016 : recommendations with spark
Modern Data Stack France
 
Aaa ped-12-Supervised Learning: Support Vector Machines & Naive Bayes Classifer
Aaa ped-12-Supervised Learning: Support Vector Machines & Naive Bayes ClassiferAaa ped-12-Supervised Learning: Support Vector Machines & Naive Bayes Classifer
Aaa ped-12-Supervised Learning: Support Vector Machines & Naive Bayes Classifer
AminaRepo
 
Scalable gradientbasedtuningcontinuousregularizationhyperparameters ppt
Scalable gradientbasedtuningcontinuousregularizationhyperparameters pptScalable gradientbasedtuningcontinuousregularizationhyperparameters ppt
Scalable gradientbasedtuningcontinuousregularizationhyperparameters ppt
Ruochun Tzeng
 
Certification Study Group - Professional ML Engineer Session 3 (Machine Learn...
Certification Study Group - Professional ML Engineer Session 3 (Machine Learn...Certification Study Group - Professional ML Engineer Session 3 (Machine Learn...
Certification Study Group - Professional ML Engineer Session 3 (Machine Learn...
gdgsurrey
 
Recommender Systems: Advances in Collaborative Filtering
Recommender Systems: Advances in Collaborative FilteringRecommender Systems: Advances in Collaborative Filtering
Recommender Systems: Advances in Collaborative Filtering
Changsung Moon
 
Recommender Systems from A to Z – The Right Dataset
Recommender Systems from A to Z – The Right DatasetRecommender Systems from A to Z – The Right Dataset
Recommender Systems from A to Z – The Right Dataset
Crossing Minds
 
[CIKM 2014] Deviation-Based Contextual SLIM Recommenders
[CIKM 2014] Deviation-Based Contextual SLIM Recommenders[CIKM 2014] Deviation-Based Contextual SLIM Recommenders
[CIKM 2014] Deviation-Based Contextual SLIM Recommenders
YONG ZHENG
 
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
HostedbyConfluent
 
Aaa ped-14-Ensemble Learning: About Ensemble Learning
Aaa ped-14-Ensemble Learning: About Ensemble LearningAaa ped-14-Ensemble Learning: About Ensemble Learning
Aaa ped-14-Ensemble Learning: About Ensemble Learning
AminaRepo
 
Week 12 Dimensionality Reduction Bagian 1
Week 12 Dimensionality Reduction Bagian 1Week 12 Dimensionality Reduction Bagian 1
Week 12 Dimensionality Reduction Bagian 1
khairulhuda242
 
Ch5-DataFlowTesting.ppt
Ch5-DataFlowTesting.pptCh5-DataFlowTesting.ppt
Ch5-DataFlowTesting.ppt
roshymans1
 
Naïve Bayes Classifier Algorithm.pptx
Naïve Bayes Classifier Algorithm.pptxNaïve Bayes Classifier Algorithm.pptx
Naïve Bayes Classifier Algorithm.pptx
PriyadharshiniG41
 
Evaluate deep q learning for sequential targeted marketing with 10-fold cross...
Evaluate deep q learning for sequential targeted marketing with 10-fold cross...Evaluate deep q learning for sequential targeted marketing with 10-fold cross...
Evaluate deep q learning for sequential targeted marketing with 10-fold cross...
Jian Wu
 
11 whiteboxtesting
11 whiteboxtesting11 whiteboxtesting
11 whiteboxtesting
asifusman1998
 
Aaa ped-24- Reinforcement Learning
Aaa ped-24- Reinforcement LearningAaa ped-24- Reinforcement Learning
Aaa ped-24- Reinforcement Learning
AminaRepo
 
[SAC 2015] Improve General Contextual SLIM Recommendation Algorithms By Facto...
[SAC 2015] Improve General Contextual SLIM Recommendation Algorithms By Facto...[SAC 2015] Improve General Contextual SLIM Recommendation Algorithms By Facto...
[SAC 2015] Improve General Contextual SLIM Recommendation Algorithms By Facto...
YONG ZHENG
 
Empirical Evaluation of Active Learning in Recommender Systems
Empirical Evaluation of Active Learning in Recommender SystemsEmpirical Evaluation of Active Learning in Recommender Systems
Empirical Evaluation of Active Learning in Recommender Systems
University of Bergen
 

Similar to Aaa ped-20-Recommender Systems: Model-based collaborative filtering (20)

Aaa ped-19-Recommender Systems: Neighborhood-based Filtering
Aaa ped-19-Recommender Systems: Neighborhood-based FilteringAaa ped-19-Recommender Systems: Neighborhood-based Filtering
Aaa ped-19-Recommender Systems: Neighborhood-based Filtering
 
Movie recommendation Engine using Artificial Intelligence
Movie recommendation Engine using Artificial IntelligenceMovie recommendation Engine using Artificial Intelligence
Movie recommendation Engine using Artificial Intelligence
 
Hadoop France meetup Feb2016 : recommendations with spark
Hadoop France meetup  Feb2016 : recommendations with sparkHadoop France meetup  Feb2016 : recommendations with spark
Hadoop France meetup Feb2016 : recommendations with spark
 
Aaa ped-12-Supervised Learning: Support Vector Machines & Naive Bayes Classifer
Aaa ped-12-Supervised Learning: Support Vector Machines & Naive Bayes ClassiferAaa ped-12-Supervised Learning: Support Vector Machines & Naive Bayes Classifer
Aaa ped-12-Supervised Learning: Support Vector Machines & Naive Bayes Classifer
 
Scalable gradientbasedtuningcontinuousregularizationhyperparameters ppt
Scalable gradientbasedtuningcontinuousregularizationhyperparameters pptScalable gradientbasedtuningcontinuousregularizationhyperparameters ppt
Scalable gradientbasedtuningcontinuousregularizationhyperparameters ppt
 
Computer Engineer Master Project
Computer Engineer Master ProjectComputer Engineer Master Project
Computer Engineer Master Project
 
Certification Study Group - Professional ML Engineer Session 3 (Machine Learn...
Certification Study Group - Professional ML Engineer Session 3 (Machine Learn...Certification Study Group - Professional ML Engineer Session 3 (Machine Learn...
Certification Study Group - Professional ML Engineer Session 3 (Machine Learn...
 
Recommender Systems: Advances in Collaborative Filtering
Recommender Systems: Advances in Collaborative FilteringRecommender Systems: Advances in Collaborative Filtering
Recommender Systems: Advances in Collaborative Filtering
 
Recommender Systems from A to Z – The Right Dataset
Recommender Systems from A to Z – The Right DatasetRecommender Systems from A to Z – The Right Dataset
Recommender Systems from A to Z – The Right Dataset
 
[CIKM 2014] Deviation-Based Contextual SLIM Recommenders
[CIKM 2014] Deviation-Based Contextual SLIM Recommenders[CIKM 2014] Deviation-Based Contextual SLIM Recommenders
[CIKM 2014] Deviation-Based Contextual SLIM Recommenders
 
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
 
Aaa ped-14-Ensemble Learning: About Ensemble Learning
Aaa ped-14-Ensemble Learning: About Ensemble LearningAaa ped-14-Ensemble Learning: About Ensemble Learning
Aaa ped-14-Ensemble Learning: About Ensemble Learning
 
Week 12 Dimensionality Reduction Bagian 1
Week 12 Dimensionality Reduction Bagian 1Week 12 Dimensionality Reduction Bagian 1
Week 12 Dimensionality Reduction Bagian 1
 
Ch5-DataFlowTesting.ppt
Ch5-DataFlowTesting.pptCh5-DataFlowTesting.ppt
Ch5-DataFlowTesting.ppt
 
Naïve Bayes Classifier Algorithm.pptx
Naïve Bayes Classifier Algorithm.pptxNaïve Bayes Classifier Algorithm.pptx
Naïve Bayes Classifier Algorithm.pptx
 
Evaluate deep q learning for sequential targeted marketing with 10-fold cross...
Evaluate deep q learning for sequential targeted marketing with 10-fold cross...Evaluate deep q learning for sequential targeted marketing with 10-fold cross...
Evaluate deep q learning for sequential targeted marketing with 10-fold cross...
 
11 whiteboxtesting
11 whiteboxtesting11 whiteboxtesting
11 whiteboxtesting
 
Aaa ped-24- Reinforcement Learning
Aaa ped-24- Reinforcement LearningAaa ped-24- Reinforcement Learning
Aaa ped-24- Reinforcement Learning
 
[SAC 2015] Improve General Contextual SLIM Recommendation Algorithms By Facto...
[SAC 2015] Improve General Contextual SLIM Recommendation Algorithms By Facto...[SAC 2015] Improve General Contextual SLIM Recommendation Algorithms By Facto...
[SAC 2015] Improve General Contextual SLIM Recommendation Algorithms By Facto...
 
Empirical Evaluation of Active Learning in Recommender Systems
Empirical Evaluation of Active Learning in Recommender SystemsEmpirical Evaluation of Active Learning in Recommender Systems
Empirical Evaluation of Active Learning in Recommender Systems
 

More from AminaRepo

Aaa ped-23-Artificial Neural Network: Keras and Tensorfow
Aaa ped-23-Artificial Neural Network: Keras and TensorfowAaa ped-23-Artificial Neural Network: Keras and Tensorfow
Aaa ped-23-Artificial Neural Network: Keras and Tensorfow
AminaRepo
 
Aaa ped-22-Artificial Neural Network: Introduction to ANN
Aaa ped-22-Artificial Neural Network: Introduction to ANNAaa ped-22-Artificial Neural Network: Introduction to ANN
Aaa ped-22-Artificial Neural Network: Introduction to ANN
AminaRepo
 
Aaa ped-21-Recommender Systems: Content-based Filtering
Aaa ped-21-Recommender Systems: Content-based FilteringAaa ped-21-Recommender Systems: Content-based Filtering
Aaa ped-21-Recommender Systems: Content-based Filtering
AminaRepo
 
Aaa ped-18-Unsupervised Learning: Association Rule Learning
Aaa ped-18-Unsupervised Learning: Association Rule LearningAaa ped-18-Unsupervised Learning: Association Rule Learning
Aaa ped-18-Unsupervised Learning: Association Rule Learning
AminaRepo
 
Aaa ped-17-Unsupervised Learning: Dimensionality reduction
Aaa ped-17-Unsupervised Learning: Dimensionality reductionAaa ped-17-Unsupervised Learning: Dimensionality reduction
Aaa ped-17-Unsupervised Learning: Dimensionality reduction
AminaRepo
 
Aaa ped-16-Unsupervised Learning: clustering
Aaa ped-16-Unsupervised Learning: clusteringAaa ped-16-Unsupervised Learning: clustering
Aaa ped-16-Unsupervised Learning: clustering
AminaRepo
 
Aaa ped-15-Ensemble Learning: Random Forests
Aaa ped-15-Ensemble Learning: Random ForestsAaa ped-15-Ensemble Learning: Random Forests
Aaa ped-15-Ensemble Learning: Random Forests
AminaRepo
 
Aaa ped-11-Supervised Learning: Multivariable Regressor & Classifers
Aaa ped-11-Supervised Learning: Multivariable Regressor & ClassifersAaa ped-11-Supervised Learning: Multivariable Regressor & Classifers
Aaa ped-11-Supervised Learning: Multivariable Regressor & Classifers
AminaRepo
 
Aaa ped-10-Supervised Learning: Introduction to Supervised Learning
Aaa ped-10-Supervised Learning: Introduction to Supervised LearningAaa ped-10-Supervised Learning: Introduction to Supervised Learning
Aaa ped-10-Supervised Learning: Introduction to Supervised Learning
AminaRepo
 
Aaa ped-9-Data manipulation: Time Series & Geographical visualization
Aaa ped-9-Data manipulation: Time Series & Geographical visualizationAaa ped-9-Data manipulation: Time Series & Geographical visualization
Aaa ped-9-Data manipulation: Time Series & Geographical visualization
AminaRepo
 
Aaa ped-Data-8- manipulation: Plotting and Visualization
Aaa ped-Data-8- manipulation: Plotting and VisualizationAaa ped-Data-8- manipulation: Plotting and Visualization
Aaa ped-Data-8- manipulation: Plotting and Visualization
AminaRepo
 
Aaa ped-8- Data manipulation: Data wrangling, aggregation, and group operations
Aaa ped-8- Data manipulation: Data wrangling, aggregation, and group operationsAaa ped-8- Data manipulation: Data wrangling, aggregation, and group operations
Aaa ped-8- Data manipulation: Data wrangling, aggregation, and group operations
AminaRepo
 
Aaa ped-6-Data manipulation: Data Files, and Data Cleaning & Preparation
Aaa ped-6-Data manipulation:  Data Files, and Data Cleaning & PreparationAaa ped-6-Data manipulation:  Data Files, and Data Cleaning & Preparation
Aaa ped-6-Data manipulation: Data Files, and Data Cleaning & Preparation
AminaRepo
 
Aaa ped-5-Data manipulation: Pandas
Aaa ped-5-Data manipulation: Pandas Aaa ped-5-Data manipulation: Pandas
Aaa ped-5-Data manipulation: Pandas
AminaRepo
 
Aaa ped-4- Data manipulation: Numpy
Aaa ped-4- Data manipulation: Numpy Aaa ped-4- Data manipulation: Numpy
Aaa ped-4- Data manipulation: Numpy
AminaRepo
 
Aaa ped-3. Pythond: advanced concepts
Aaa ped-3. Pythond: advanced conceptsAaa ped-3. Pythond: advanced concepts
Aaa ped-3. Pythond: advanced concepts
AminaRepo
 
Aaa ped-2- Python: Basics
Aaa ped-2- Python: BasicsAaa ped-2- Python: Basics
Aaa ped-2- Python: Basics
AminaRepo
 
Aaa ped-1- Python: Introduction to AI, Python and Colab
Aaa ped-1- Python: Introduction to AI, Python and ColabAaa ped-1- Python: Introduction to AI, Python and Colab
Aaa ped-1- Python: Introduction to AI, Python and Colab
AminaRepo
 

More from AminaRepo (18)

Aaa ped-23-Artificial Neural Network: Keras and Tensorfow
Aaa ped-23-Artificial Neural Network: Keras and TensorfowAaa ped-23-Artificial Neural Network: Keras and Tensorfow
Aaa ped-23-Artificial Neural Network: Keras and Tensorfow
 
Aaa ped-22-Artificial Neural Network: Introduction to ANN
Aaa ped-22-Artificial Neural Network: Introduction to ANNAaa ped-22-Artificial Neural Network: Introduction to ANN
Aaa ped-22-Artificial Neural Network: Introduction to ANN
 
Aaa ped-21-Recommender Systems: Content-based Filtering
Aaa ped-21-Recommender Systems: Content-based FilteringAaa ped-21-Recommender Systems: Content-based Filtering
Aaa ped-21-Recommender Systems: Content-based Filtering
 
Aaa ped-18-Unsupervised Learning: Association Rule Learning
Aaa ped-18-Unsupervised Learning: Association Rule LearningAaa ped-18-Unsupervised Learning: Association Rule Learning
Aaa ped-18-Unsupervised Learning: Association Rule Learning
 
Aaa ped-17-Unsupervised Learning: Dimensionality reduction
Aaa ped-17-Unsupervised Learning: Dimensionality reductionAaa ped-17-Unsupervised Learning: Dimensionality reduction
Aaa ped-17-Unsupervised Learning: Dimensionality reduction
 
Aaa ped-16-Unsupervised Learning: clustering
Aaa ped-16-Unsupervised Learning: clusteringAaa ped-16-Unsupervised Learning: clustering
Aaa ped-16-Unsupervised Learning: clustering
 
Aaa ped-15-Ensemble Learning: Random Forests
Aaa ped-15-Ensemble Learning: Random ForestsAaa ped-15-Ensemble Learning: Random Forests
Aaa ped-15-Ensemble Learning: Random Forests
 
Aaa ped-11-Supervised Learning: Multivariable Regressor & Classifers
Aaa ped-11-Supervised Learning: Multivariable Regressor & ClassifersAaa ped-11-Supervised Learning: Multivariable Regressor & Classifers
Aaa ped-11-Supervised Learning: Multivariable Regressor & Classifers
 
Aaa ped-10-Supervised Learning: Introduction to Supervised Learning
Aaa ped-10-Supervised Learning: Introduction to Supervised LearningAaa ped-10-Supervised Learning: Introduction to Supervised Learning
Aaa ped-10-Supervised Learning: Introduction to Supervised Learning
 
Aaa ped-9-Data manipulation: Time Series & Geographical visualization
Aaa ped-9-Data manipulation: Time Series & Geographical visualizationAaa ped-9-Data manipulation: Time Series & Geographical visualization
Aaa ped-9-Data manipulation: Time Series & Geographical visualization
 
Aaa ped-Data-8- manipulation: Plotting and Visualization
Aaa ped-Data-8- manipulation: Plotting and VisualizationAaa ped-Data-8- manipulation: Plotting and Visualization
Aaa ped-Data-8- manipulation: Plotting and Visualization
 
Aaa ped-8- Data manipulation: Data wrangling, aggregation, and group operations
Aaa ped-8- Data manipulation: Data wrangling, aggregation, and group operationsAaa ped-8- Data manipulation: Data wrangling, aggregation, and group operations
Aaa ped-8- Data manipulation: Data wrangling, aggregation, and group operations
 
Aaa ped-6-Data manipulation: Data Files, and Data Cleaning & Preparation
Aaa ped-6-Data manipulation:  Data Files, and Data Cleaning & PreparationAaa ped-6-Data manipulation:  Data Files, and Data Cleaning & Preparation
Aaa ped-6-Data manipulation: Data Files, and Data Cleaning & Preparation
 
Aaa ped-5-Data manipulation: Pandas
Aaa ped-5-Data manipulation: Pandas Aaa ped-5-Data manipulation: Pandas
Aaa ped-5-Data manipulation: Pandas
 
Aaa ped-4- Data manipulation: Numpy
Aaa ped-4- Data manipulation: Numpy Aaa ped-4- Data manipulation: Numpy
Aaa ped-4- Data manipulation: Numpy
 
Aaa ped-3. Pythond: advanced concepts
Aaa ped-3. Pythond: advanced conceptsAaa ped-3. Pythond: advanced concepts
Aaa ped-3. Pythond: advanced concepts
 
Aaa ped-2- Python: Basics
Aaa ped-2- Python: BasicsAaa ped-2- Python: Basics
Aaa ped-2- Python: Basics
 
Aaa ped-1- Python: Introduction to AI, Python and Colab
Aaa ped-1- Python: Introduction to AI, Python and ColabAaa ped-1- Python: Introduction to AI, Python and Colab
Aaa ped-1- Python: Introduction to AI, Python and Colab
 

Recently uploaded

GBSN - Microbiology (Lab 4) Culture Media
GBSN - Microbiology (Lab 4) Culture MediaGBSN - Microbiology (Lab 4) Culture Media
GBSN - Microbiology (Lab 4) Culture Media
Areesha Ahmad
 
GBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram StainingGBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram Staining
Areesha Ahmad
 
platelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptxplatelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptx
muralinath2
 
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdfSCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SELF-EXPLANATORY
 
Nutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technologyNutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technology
Lokesh Patil
 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
University of Maribor
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Ana Luísa Pinho
 
extra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdfextra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdf
DiyaBiswas10
 
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdfUnveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Erdal Coalmaker
 
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
muralinath2
 
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...


Aaa ped-20-Recommender Systems: Model-based collaborative filtering

  • 2. Plan
    ● 1- SVD filtering: With Surprise
    ● 2- SVD Filtering: More details
    ● 3- Filtering with SVM Classification
    ● 4- Some Tests
    ● 5- Predictions with Custom Data: Preparation
    ● 6- Predictions with Custom Data: Prediction
  • 3. 1- SVD filtering: With Surprise [By Amina Delali]. Prediction
    ● The estimation of the review is equal to 4.16.
    ● Slightly better performance compared with neighborhood filtering.
  • 4. 1- SVD filtering: With Surprise. Concept
    ● Assume that there are factors (characteristics) related to each item. Each item can be described by the degree to which each characteristic is present in it. At the same time, each user can have a different degree of interest in each of those characteristics.
    ● These two relationships can be modeled by two matrices:
      ➢ P (m × f): models the interest of each user u in the f characteristics as a row vector $p_u$
      ➢ Q (n × f): models the extent to which each characteristic is present in an item i as a row vector $q_i$
    ● The interaction between each user and item is computed by:
      ➢ $q_i^T \cdot p_u$, which can estimate the rating of the user u for the item i
      ➢ The estimate is enhanced by other parameters that account for the bias in ratings: $\hat{r}_{ui} = \mu + b_u + b_i + q_i^T \cdot p_u$
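The prediction rule on this slide can be sketched in a few lines of NumPy. This is only an illustration under assumed values: the sizes `m`, `n`, `f`, the global mean `mu`, and the randomly filled `P` and `Q` are hypothetical, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, f = 4, 5, 3                      # users, items, factors (illustrative sizes)
P = rng.normal(0, 0.1, size=(m, f))    # row u holds the user factors p_u
Q = rng.normal(0, 0.1, size=(n, f))    # row i holds the item factors q_i
mu = 3.5                               # assumed global mean of the ratings
b_user = np.zeros(m)                   # user biases b_u
b_item = np.zeros(n)                   # item biases b_i


def predict(u, i):
    """Estimated rating: mu + b_u + b_i + q_i^T . p_u."""
    return mu + b_user[u] + b_item[i] + Q[i] @ P[u]


r_hat = predict(1, 2)                  # estimate for user 1 and item 2
```

With zero biases and small random factors, every estimate stays close to `mu`; the factors and biases only become informative once they are fitted to observed ratings.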
  • 5. 1- SVD filtering: With Surprise. Computation
    ● Singular value decomposition (SVD) could be used to extract the matrices P and Q. The ratings could also be used to estimate the bias values from the mean of all the ratings, the mean of each user's ratings, and the mean of each item's ratings.
    ● The problem is that not all the ratings of all the users for all the items are available. This is why we have to find another way to estimate these values.
    ● The estimated values should minimize the following expression: $\sum_{r_{ui} \in R_{train}} (r_{ui} - \hat{r}_{ui})^2 + \lambda (b_i^2 + b_u^2 + \|q_i\|^2 + \|p_u\|^2)$
      ➢ The sum considers only the available ratings ($r_{ui} \in R_{train}$).
      ➢ $\lambda$ is a regularization parameter: a constant value.
      ➢ $\|q_i\|^2$ is the square of the norm of the vector $q_i$. The norm of $q_i$ is the square root of the sum of the squares of the values of $q_i$.
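As a rough sketch, the quantity to minimize can be evaluated as below. The function name `regularized_loss`, the toy ratings, and the choice to add the penalty once per available rating are assumptions made for illustration.

```python
import numpy as np


def regularized_loss(ratings, mu, b_user, b_item, P, Q, lam):
    """Squared error plus L2 penalty, summed over the available ratings only."""
    total = 0.0
    for u, i, r in ratings:
        r_hat = mu + b_user[u] + b_item[i] + Q[i] @ P[u]
        # Squared norms are written as dot products: ||q_i||^2 = q_i . q_i
        penalty = b_item[i] ** 2 + b_user[u] ** 2 + Q[i] @ Q[i] + P[u] @ P[u]
        total += (r - r_hat) ** 2 + lam * penalty
    return float(total)


# Hypothetical data: two available ratings as (user, item, rating) triples.
ratings = [(0, 0, 4.0), (1, 1, 2.0)]
b_user = np.array([1.0, 0.0])
b_item = np.zeros(2)
P = np.zeros((2, 2))
Q = np.zeros((2, 2))
loss = regularized_loss(ratings, 3.0, b_user, b_item, P, Q, lam=0.02)
```

Here the first rating is predicted exactly (3 + 1 = 4) and contributes only the penalty on its user bias, while the second contributes a squared error of 1.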
  • 6. 2- SVD Filtering: More details. Stochastic Gradient Descent
    ● Gradient descent is an iterative algorithm that tries to find a (local) minimum of a function. In machine learning, variants of gradient descent are used to estimate a model's parameters by minimizing a cost function, iteratively updating these parameters.
    ● SGD (stochastic gradient descent) is a variant in which, in one iteration (epoch), the parameters are updated for each sample (in our case, for each rating). So in one epoch the parameters can be updated several times:
      ➢ The 4 parameters are initialized.
      ➢ For each rating $r_{ui}$, a prediction $\hat{r}_{ui}$ is made and the difference $e_{ui} = r_{ui} - \hat{r}_{ui}$ is computed.
      ➢ Then, the difference $e_{ui}$ is used to update the parameter values this way:
        $b_u \leftarrow b_u + \gamma (e_{ui} - \lambda b_u)$
        $b_i \leftarrow b_i + \gamma (e_{ui} - \lambda b_i)$
        $p_u \leftarrow p_u + \gamma (e_{ui} \cdot q_i - \lambda p_u)$
        $q_i \leftarrow q_i + \gamma (e_{ui} \cdot p_u - \lambda q_i)$
      ➢ $\gamma$ is the learning rate: another constant that defines the …
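The update rules can be tried on a toy example. The sketch below is a plain NumPy illustration, not Surprise's implementation; the ratings, the sizes, and the helper names `sgd_epoch` and `sse` are made up for the example.

```python
import numpy as np


def sgd_epoch(ratings, mu, b_user, b_item, P, Q, gamma, lam):
    """One epoch: the parameters are updated once for every available rating."""
    for u, i, r in ratings:
        e = r - (mu + b_user[u] + b_item[i] + Q[i] @ P[u])   # e_ui
        b_user[u] += gamma * (e - lam * b_user[u])
        b_item[i] += gamma * (e - lam * b_item[i])
        p_old = P[u].copy()            # keep p_u so q_i is updated with the old value
        P[u] += gamma * (e * Q[i] - lam * P[u])
        Q[i] += gamma * (e * p_old - lam * Q[i])


def sse(ratings, mu, b_user, b_item, P, Q):
    """Sum of squared errors over the available ratings."""
    return float(sum((r - (mu + b_user[u] + b_item[i] + Q[i] @ P[u])) ** 2
                     for u, i, r in ratings))


# Hypothetical (user, item, rating) triples for 3 users and 3 items.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (2, 1, 1.0), (2, 2, 2.0)]
mu = float(np.mean([r for _, _, r in ratings]))
rng = np.random.default_rng(0)
b_user, b_item = np.zeros(3), np.zeros(3)
P = rng.normal(0, 0.1, size=(3, 2))
Q = rng.normal(0, 0.1, size=(3, 2))

sse_before = sse(ratings, mu, b_user, b_item, P, Q)
for _ in range(20):
    sgd_epoch(ratings, mu, b_user, b_item, P, Q, gamma=0.005, lam=0.02)
sse_after = sse(ratings, mu, b_user, b_item, P, Q)
```

On this toy data the squared error on the training ratings shrinks after the 20 epochs, which is the behaviour the slide describes.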
  • 7. 2- SVD Filtering: More details. Stochastic Gradient Descent (continued)
      ➢ The process is repeated for a certain number of iterations in order to find a local minimum of the previous expression.
    ● In the Surprise library, the parameters are as follows:
      ➢ The parameters $b_u$ and $b_i$ (also called baselines) are initialized to 0.
      ➢ The user and item factors $p_u$ and $q_i$ are randomly initialized according to a normal distribution defined by the mean init_mean and the standard deviation init_std_dev parameters.
      ➢ The learning rate $\gamma$ (lr_all) is set by default to 0.005, and the regularization term $\lambda$ (reg_all) to 0.02.
      ➢ By default, the number of factors (n_factors) is 100.
      ➢ The number of iterations (n_epochs) is set by default to 20.
      ➢ To use the bias (baseline) parameters, the biased parameter is set by default to True.
  • 8. 2- SVD Filtering: More details. Another example with GridSearchCV
    ● Measures used: Root Mean Square Error (RMSE) and Mean Absolute Error (MAE)
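The two measures named on this slide can be computed by hand as follows. The functions and the sample values are illustrative only and are not the library's own implementation.

```python
import math


def rmse(true_ratings, predicted):
    """Root mean square error: square root of the mean squared difference."""
    squared = [(t - p) ** 2 for t, p in zip(true_ratings, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(true_ratings, predicted):
    """Mean absolute error: mean of the absolute differences."""
    absolute = [abs(t - p) for t, p in zip(true_ratings, predicted)]
    return sum(absolute) / len(absolute)


# Hypothetical true and predicted ratings.
true_ratings = [4, 3, 5]
predicted = [3, 3, 4]
rmse_value = rmse(true_ratings, predicted)
mae_value = mae(true_ratings, predicted)
```

RMSE penalizes large errors more heavily than MAE, so the two can rank parameter combinations differently.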
  • 9. 3- Filtering with SVM Classification. Concept
    ● The other way to perform model-based collaborative filtering is to train a model on users' reviews, and then use that model to predict new reviews for new items.
    ● In this lesson we present an implementation using an SVM (Support Vector Machine). More precisely, we use a linear SVM classifier to predict the new reviews.
    ● As described in [Xia et al., 2006], there are two ways to consider the problem:
      ➢ Each item represents a class, and the training set is the users' ratings for each item other than that item.
      ➢ Each user represents a class, and the training set is the items' ratings according to each user other than that user.
    ● The problem here is that the matrices representing the ratings will not be complete. So, we will use default values for missing ratings.
  • 10. 3- Filtering with SVM Classification. The original data
    ● We will use the data we already downloaded using the Dataset module from Surprise. But first, we will directly access the downloaded dataset file to see its content.
  • 11. 3- Filtering with SVM Classification. The features and labels
    ● We will apply an SVC classifier for one user, and the classes will be the different ratings.
    ● We have to construct the features matrix corresponding to the ratings of each item rated by the user "226", and construct the corresponding label vector using the ratings of that user.
    ● It is more convenient to use the data built by the Surprise library than the original file.
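A minimal sketch of this construction with scikit-learn is shown below. The small rating matrix is invented, column 0 stands in for the active user, and 0 is an assumed default value for missing ratings; none of these come from the slides' dataset.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Hypothetical ratings: rows = items, columns = users, NaN = not rated.
R = np.array([
    [5, 4, np.nan, 5],
    [4, np.nan, 4, 4],
    [1, 2, 1, np.nan],
    [2, 1, np.nan, 2],
    [5, 5, 4, np.nan],
    [np.nan, 1, 2, 1],
], dtype=float)

active = 0                                  # column of the active user
rated = ~np.isnan(R[:, active])             # items the active user has rated

# Features: the other users' ratings, with missing values set to a default (0).
X = np.nan_to_num(np.delete(R, active, axis=1), nan=0.0)
# Labels: the active user's ratings, used as classes.
y = R[rated, active].astype(int)

clf = LinearSVC(max_iter=10000, random_state=0)
clf.fit(X[rated], y)
pred = clf.predict(X[~rated])               # predicted class for the unrated item
```

Because the ratings are treated as classes, the classifier can only output rating values that the active user has already given.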
  • 12. 3- Filtering with SVM Classification. The features and labels (continued)
    ● All these values are unavailable ratings, which means that the corresponding users didn't rate the corresponding items.
  • 13. 3- Filtering with SVM Classification. Prediction for one item
    ● A linear SVM classifier.
    ● After dropping the column corresponding to the user 218 ("226").
    ● All the models we used to predict the rating of the user for that item gave predicted values either approaching 4 or slightly bigger than 4.
  • 14. 4- Some Tests. Splitting the data
    ● We will just split the data that we have already created, using 2 methods:
      ➢ split into test and training sets
      ➢ split into folds (cross-validation)
    ● We will not run our tests on all the data as in the previous examples.
    ● We will use only the 50 items related to the (active) user "226".
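Both splitting methods can be sketched with scikit-learn. The feature matrix and labels below are random placeholders (50 items, 20 other users, 0 as the assumed default for a missing rating), so the resulting scores carry no meaning beyond showing the mechanics.

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.integers(0, 6, size=(50, 20)).astype(float)   # synthetic features
y = rng.integers(1, 6, size=50)                        # synthetic ratings 1..5

# Method 1: a single split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
clf = LinearSVC(max_iter=10000, random_state=0).fit(X_train, y_train)
test_accuracy = clf.score(X_test, y_test)

# Method 2: split into folds (cross-validation), one score per fold.
folds = KFold(n_splits=5, shuffle=True, random_state=0)
cv_scores = cross_val_score(
    LinearSVC(max_iter=10000, random_state=0), X, y, cv=folds, scoring="accuracy")
```

The `scoring` argument selects the measure; accuracy is used here only because the ratings are treated as classes.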
  • 15. 4- Some Tests. The prediction with the test/train split
    ● The missing label is not represented.
  • 16. 4- Some Tests. Prediction with cross-validation
    ● To see the available measures (scoring).
    ● Same results as with KNN collaborative filtering.
  • 17. 5- Predictions with Custom Data: Preparation. The data
    ● We will use the data available at: Artificial Intelligence with Python GitHub Repository
    ● No rating is available for the movie "Raging Bull" by "Bill Duffy".
    ● The way the data is organized is not convenient for Surprise, so we will have to rearrange the data.
    ● A user's name: later it will be the user's raw_id.
    ● Movie names.
  • 18. 5- Predictions with Custom Data: Preparation. Prepare the data
    ● To use it with Surprise, the dataframe must have its columns organized this way: user_id, item_id and ratings, which is not the case in our DataFrame.
    ● Now, the movie names are in a column.
    ● All the users and the corresponding ratings are in 2 columns (wide-to-long conversion).
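The wide-to-long conversion can be sketched with pandas `melt`. The tiny frame below is hypothetical (placeholder user and movie names) and only mimics the layout described on the slides: movies as the index and one column per user.

```python
import numpy as np
import pandas as pd

# Hypothetical wide layout: one row per movie, one column per user.
wide = pd.DataFrame(
    {"User A": [5.0, 3.0, np.nan], "User B": [4.0, np.nan, 2.0]},
    index=["Movie 1", "Movie 2", "Movie 3"],
)

# Move the movie names into a column, then stack the user columns
# into two columns: the user name and the corresponding rating.
long_df = (
    wide.rename_axis("item_id")
        .reset_index()
        .melt(id_vars="item_id", var_name="user_id", value_name="rating")
)
```

Each (movie, user) pair becomes one row, including the pairs with no rating, which still hold NaN at this stage.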
  • 19. 5- Predictions with Custom Data: Preparation. Prepare the data (continued)
    ● Reorder the columns.
    ● Drop the rows corresponding to the missing user-item ratings.
    ● The rating scale will be from 1 to 5.
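These last preparation steps might look like the following in pandas. The long-format frame is again a made-up stand-in, and the column names `user_id`, `item_id` and `rating` are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format data with one missing rating.
long_df = pd.DataFrame({
    "item_id": ["Movie 1", "Movie 2", "Movie 1", "Movie 2"],
    "user_id": ["User A", "User A", "User B", "User B"],
    "rating": [5.0, np.nan, 3.0, 2.0],
})

ratings = (
    long_df[["user_id", "item_id", "rating"]]   # reorder: user, item, rating
    .dropna(subset=["rating"])                  # drop the missing user-item ratings
    .reset_index(drop=True)
)

# Check that every remaining rating lies on the 1 to 5 scale.
in_scale = bool(ratings["rating"].between(1, 5).all())
```

A frame in this shape can then be passed to Surprise through `Dataset.load_from_df(ratings, Reader(rating_scale=(1, 5)))`.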
  • 20. 6- Predictions with Custom Data: Prediction. Predict a review for one item
    ● We will use the SVD technique to predict the review of the user Adam Cohen for the movie Raging Bull.
    ● If we wanted to use an SVM classifier, we would:
      ➢ Use the original dataframe, and select only the rows corresponding to the movies rated by "Adam"
      ➢ Use the Raging Bull raw values for prediction
      ➢ Replace the NaN values with a default value
    ● Load the data from the dataframe we already prepared.
  • 21. 6- Predictions with Custom Data: Prediction. Make a list of recommendations
    ● The user Chris Duncan rated only 2 movies. We will make a list of recommendations of movies he didn't rate by:
      ➢ predicting his reviews of these movies
      ➢ ordering the predicted reviews
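The two steps can be sketched independently of any particular model. In the example below, `recommend` and `fake_predict` are hypothetical helpers, the movie names are placeholders, and the scores are invented; any function that returns an estimated rating for a (user, item) pair could be plugged in.

```python
def recommend(user, all_items, rated_items, predict, top_n=3):
    """Predict a rating for each unrated item, then order by predicted rating."""
    candidates = [item for item in all_items if item not in rated_items]
    scored = [(item, predict(user, item)) for item in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)   # highest estimate first
    return scored[:top_n]


# Invented estimates standing in for a trained model's predictions.
fake_scores = {"Movie C": 3.2, "Movie D": 4.5, "Movie E": 2.1}


def fake_predict(user, item):
    return fake_scores[item]


all_items = ["Movie A", "Movie B", "Movie C", "Movie D", "Movie E"]
rated_items = {"Movie A", "Movie B"}        # the 2 movies the user already rated
recommendations = recommend("some user", all_items, rated_items, fake_predict)
```

With a fitted Surprise algorithm `algo`, the predictor could be `lambda u, i: algo.predict(u, i).est`.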
  • 22. References
    ● [Buitinck et al., 2013] Buitinck, L., Louppe, G., Blondel, M., Pedregosa, F., Mueller, A., Grisel, O., Niculae, V., Prettenhofer, P., Gramfort, A., Grobler, J., Layton, R., VanderPlas, J., Joly, A., Holt, B., and Varoquaux, G. (2013). API design for machine learning software: experiences from the scikit-learn project. In ECML PKDD Workshop: Languages for Data Mining and Machine Learning, pages 108–122.
    ● [Francesco et al., 2011] Francesco, R., Lior, R., Bracha, S., and Paul B., K., editors (2011). Recommender Systems Handbook. Springer Science+Business Media.
    ● [Hug, 2017] Hug, N. (2017). Surprise, a Python library for recommender systems. http://surpriselib.com.
    ● [Xia et al., 2006] Xia, Z., Dong, Y., and Xing, G. (2006). Support vector machines for collaborative filtering. In Proceedings of the 44th annual Southeast regional conference, pages 169–174. ACM.