This second meetup will be about training different models for our recommender system. We will review the simple models we can build as a baseline. After that, we will present the recommender system as an optimization problem and discuss different training losses. We will mention linear models and matrix factorization techniques. We will end the presentation with a simple introduction to non-linear models and deep learning.
Recommender Systems from A to Z – Model Evaluation (Crossing Minds)
The third meetup will be about evaluating different models for our recommender system. We will review the strategies we have to check whether a model is underfitting or overfitting. After that, we will present and analyze the losses that are typically used in recommendation systems to train models. We will compare regression, classification, and rank-based losses and when it is convenient to use each one. Finally, we will cover the metrics that are typically used to evaluate the performance of different recommendation systems and how to verify that the models are giving good results in production.
What really are recommendation engines nowadays?
This presentation introduces the foundations of recommendation algorithms and covers common approaches as well as some of the most advanced techniques. Although it focuses more on efficiency than on theoretical properties, basics of matrix algebra and optimization-based machine learning are used throughout the presentation.
Table of Contents:
1. Collaborative Filtering
1.1 User-User
1.2 Item-Item
1.3 User-Item
* Matrix Factorization
* Stochastic Gradient Descent (SGD)
* Truncated Singular Value Decomposition (SVD)
* Alternating Least Squares (ALS)
* Deep Learning
2. Content Extraction
* Item-Item Similarities
* Deep Content Extraction: NLP, CNN, LSTM
3. Hybrid Models
4. In Production
4.1 Challenges
4.2 Solutions
4.3 Tools
Recommender Systems represent one of the most widespread and impactful applications of predictive machine learning models.
Amazon, YouTube, Netflix, Facebook and many other companies generate a significant fraction of their revenue thanks to their ability to model and accurately predict user ratings and preferences.
In this presentation we cover the following points:
→ introduction to recommender systems
→ working with explicit vs implicit feedback
→ content-based vs collaborative filtering approaches
→ user-based and item-item methods
→ machine learning and deep learning models
→ pros & cons of the methods: scalability, accuracy, explainability
In this lecture, I will first cover recent advances in neural recommender systems, such as autoencoder-based and MLP-based recommender systems. Then, I will introduce recent achievements in automatic playlist continuation for music recommendation.
Talk with Yves Raimond at the GPU Tech Conference on March 28, 2018 in San Jose, CA.
Abstract:
In this talk, we will survey how Deep Learning methods can be applied to personalization and recommendations. We will cover why standard Deep Learning approaches don't perform better than typical collaborative filtering techniques. Then we will go over recently published research at the intersection of Deep Learning and recommender systems, looking at how they integrate new types of data, explore new models, or change the recommendation problem statement. We will also highlight some of the ways that neural networks are used at Netflix and how we can use GPUs to train recommender systems. Finally, we will highlight promising new directions in this space.
With the explosive growth of online information, recommender systems have become an effective tool to overcome information overload and promote sales. In recent years, deep learning's revolutionary advances in speech recognition, image analysis and natural language processing have gained significant attention. Meanwhile, recent studies also demonstrate its efficacy in coping with information retrieval and recommendation tasks. Applying deep learning techniques to recommender systems has been gaining momentum due to its state-of-the-art performance. In this talk, I will present recent developments in deep learning based recommender models and highlight some future challenges and open issues of this research field.
Overview of the recommender system (recommendation system). RFM concepts in brief. Collaborative filtering, both item-based and user-based. Content-based recommendation is also described, as well as the Product Association Recommender System. Stereotype recommendation is described with its advantages and limitations. Customer Lifetime. Recommender System Analysis and Solving Cycle.
Past, present, and future of Recommender Systems: an industry perspective (Xavier Amatriain)
Keynote for the ACM Intelligent User Interface conference in 2016 in Sonoma, CA. I start with the past by talking about the Recommender Problem and the Netflix Prize. Then I go into the present and the future by talking about approaches that go beyond rating prediction and ranking, and finish with some of the most important lessons learned over the years. Throughout my talk I put special emphasis on the relation between algorithms and the User Interface.
Recommender systems support the decision making processes of customers with personalized suggestions. These widely used systems influence the daily life of almost everyone across domains like ecommerce, social media, and entertainment. However, the efficient generation of relevant recommendations in large-scale systems is a very complex task. In order to provide personalization, engines and algorithms need to capture users’ varying tastes and find mostly nonlinear dependencies between them and a multitude of items. Enormous data sparsity and ambitious real-time requirements further complicate this challenge. At the same time, deep learning has been proven to solve complex tasks like object or speech recognition where traditional machine learning failed or showed mediocre performance.
Explore a use case for vehicle recommendations at mobile.de, Germany’s biggest online vehicle market. Marcel shares a novel regularization technique for the optimization criterion and evaluates it against various baselines. To achieve high scalability, he combines this method with strategies for efficient candidate generation based on user and item embeddings—providing a holistic solution for candidate generation and ranking.
The proposed approach outperforms collaborative filtering and hybrid collaborative-content-based filtering by 73% and 143% for MAP@5. It also scales well for millions of items and users returning recommendations in tens of milliseconds.
Presentation at the Netflix Expo session at RecSys 2020 virtual conference on 2020-09-24. It provides an overview of recommendation and personalization at Netflix and then highlights some of the things we’ve been working on as well as some important open research questions in the field of recommendations.
Recommender systems are software tools and techniques providing suggestions for items to be of interest to a user. Recommender systems have proved in recent years to be a valuable means of helping Web users by providing useful and effective recommendations or suggestions.
In this talk we will explain some of the main challenges that we faced at OLX Europe while trying to prove the value of a deep learning based recommender system, and later productionize it with a high level of automation.
We'll talk about:
* Modern Recommender Systems
* Deep Learning
* Neural Item Embeddings
* Similarity Search
* Proving value through Experimentation
* From POC to PRD
* Lessons Learned
About the speakers:
Cristian Martinez works as Lead Data Scientist at OLX Group, mainly focused on Search and Recommenders, and has been working for more than a decade in different companies solving business problems with Machine Learning.
Ilia Ivanov is a Data Scientist at OLX Europe (online marketplace) with 4 years of experience in DS, focusing on recommendations and NLP.
https://github.com/telecombcn-dl/dlmm-2017-dcu
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of big annotated data and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which had been addressed until now with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or text captioning.
Recommender Systems from A to Z – Model Training
1.
2. Recommender Systems from A to Z
Part 1: The Right Dataset
Part 2: Model Training
Part 3: Model Evaluation
Part 4: Real-Time Deployment
3. Recommender Systems from A to Z
Part 1: The Right Dataset
Part 2: Model Training
Part 3: Model Evaluation
Part 4: Real-Time Deployment
4. 1. Introduction
Optimization problem, linear regression and Stochastic Gradient Descent (SGD)
2. Baseline models
Global average, user average and item-item models
3. Basic linear models
Least Squares (LS)
Regularized Least Squares (RLS)
4. Matrix factorization
Matrix Factorization, analytical solution and numerical solution
5. Non-linear models
Basic and Complex Deep Learning models
6. Model training – Introduction
Explicit vs Implicit feedback
Explicit feedback (users’ ratings)
Implicit feedback (users’ clicks)
7. Model training – Introduction
Explicit vs Implicit feedback
Explicit feedback (users’ ratings)
Implicit feedback (users’ clicks)

Explicit feedback | Implicit feedback
Example Domains: Movies, TV-Shows, Music | Marketplaces, Businesses
Example Data type: Like/Dislike, Stars | Clicks, Play-time, Purchases
Complexity: Clean, Costly, Easy to interpret | Dirty, Cheap, Difficult to interpret
8. Model training – Introduction
Recommendation engine types
Recommendation engine:
* Content-based
* Collaborative-filtering
  * Memory-based: Item-Item, User-User
  * Model-based: User-Item
* Hybrid engine
9. Model training – Introduction
Recommendation engine types
Recommendation engine:
* Content-based
* Collaborative-filtering
  * Memory-based: Item-Item, User-User
  * Model-based: User-Item
* Hybrid engine

Model | When? | Linear problem definition | Solution strategies
Content-based | Item cold start | (formula) | Least Squares, Deep Learning
Item-Item | n_users >> n_items | (formula) | Affinity Matrix
User-User | n_users << n_items | (formula) | KNN, Affinity Matrix
User-Item | Better performance | (formula) | Matrix Factorization, Deep Learning
10. Model training – Introduction
Recommendation engine types
Recommendation engine:
* Content-based
* Collaborative-filtering
  * Memory-based: Item-Item, User-User
  * Model-based: User-Item
* Hybrid engine

Model | When? | Linear problem definition | Solution strategies
Content-based | Item cold start | (formula) | Least Squares, Deep Learning
Item-Item | n_users >> n_items | (formula) | Affinity Matrix
User-User | n_users << n_items | (formula) | KNN, Affinity Matrix
User-Item | Better performance | (formula) | Matrix Factorization, Deep Learning
12. Model training – Introduction - Optimization
Optimization problem (definitions)
R: sparse matrix of ratings with m users and n items
U: dense matrix of user embeddings
I: dense matrix of item embeddings
13. Model training – Introduction - Optimization
Optimization problem (definitions)
R: sparse matrix of ratings with m users and n items; each row holds the ratings of one user (User #1 through User #m) over items up to Item #n
U: dense matrix of user embeddings (e.g. the embedding of User #1)
I: dense matrix of item embeddings (e.g. the embedding of Item #1)
14. Model training – Introduction - Optimization
Optimization problem (definitions)
Available dataset: R, the sparse matrix of ratings with m users and n items
Unknowns (to be learned): U, the dense matrix of user embeddings, and I, the dense matrix of item embeddings
15. Model training – Introduction - Optimization
Optimization problem (basic formulation with RMSE)
Our goal is to find U and I such that the difference between each datapoint in R and the product between the corresponding user and item embeddings is minimal.
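As a sketch of what this objective might look like written out (notation assumed, not verbatim from the slides: r_ui is an observed entry of R, u_u and i_i are the embeddings of user u and item i):

% assumed reconstruction of the basic squared-error objective over observed ratings
\min_{U, I} \; \sum_{(u,i)\,\text{observed}} \left( r_{ui} - \mathbf{u}_u^{\top} \mathbf{i}_i \right)^2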
16. Model training – Introduction - Optimization
Optimization problem (more complex formulation)
Content-based
Content-based with Regularization
17. Model training – Introduction - Optimization
Optimization problem (more complex formulation)
Content-based
Content-based with Regularization
Available data
Regularization to
avoid overfitting
18. Model training – Introduction - Optimization
Optimization problem (more complex formulation)
Content-based
Content-based with Regularization
Take home
● In content-based models we already know I (the item features)
● We can find a linear solution to this problem using Least Squares
Available data
Regularization to
avoid overfitting
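The objective formulas on slides 16–18 are images and are missing from this transcript; a plausible reconstruction in the deck's notation (I the known item features, U the learned weights, \lambda the regularization strength; the transpose convention is assumed) is:

% content-based
\min_{U} \; \| R - U I^{\top} \|_F^2
% content-based with regularization (Available data term + regularization term to avoid overfitting)
\min_{U} \; \| R - U I^{\top} \|_F^2 + \lambda \| U \|_F^2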
19. Model training – Introduction - Optimization
Optimization problem (more complex formulation)
Collaborative-filtering
Collaborative-filtering with Regularization
20. Model training – Introduction - Optimization
Optimization problem (more complex formulation)
Collaborative-filtering
Collaborative-filtering with Regularization
Available data
Regularization to
avoid overfitting
21. Model training – Introduction - Optimization
Optimization problem (more complex formulation)
Collaborative-filtering
Collaborative-filtering with Regularization
Available data
Regularization to
avoid overfitting
Take home
● In collaborative-filtering we want to find U and I (users and items embeddings)
● We can find a linear solution to this problem using Matrix Factorization and SGD
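Similarly, the collaborative-filtering objectives of slides 19–21 are not in the transcript; a plausible reconstruction (both U and I learned, only observed entries of R contributing; notation assumed) is:

% collaborative-filtering
\min_{U, I} \; \sum_{(u,i)\,\text{observed}} \left( r_{ui} - \mathbf{u}_u^{\top} \mathbf{i}_i \right)^2
% collaborative-filtering with regularization
\min_{U, I} \; \sum_{(u,i)\,\text{observed}} \left( r_{ui} - \mathbf{u}_u^{\top} \mathbf{i}_i \right)^2 + \lambda \left( \| U \|_F^2 + \| I \|_F^2 \right)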
22. Model training – Introduction - Optimization
How to analytically solve an optimization problem?
Let’s start with the simple optimization problem: linear regression without regularization.
With m samples and n features (m > n), we want to find W such that:
23. Model training – Introduction - Optimization
How to analytically solve an optimization problem?
Let’s start with the simple optimization problem: linear regression without regularization.
With m samples and n features (m > n), we want to find W such that:
Add a column of ones to support w0 (the bias term); the targets are scalar numbers.
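The equations on this slide did not survive extraction; under the usual linear-regression setup (X the m x n design matrix with the column of ones appended, y the vector of scalar targets, both assumed names), the analytical solution is given by the normal equations:

\min_{w} \| X w - y \|_2^2
\quad\Rightarrow\quad X^{\top} X \, w = X^{\top} y
\quad\Rightarrow\quad w = \left( X^{\top} X \right)^{-1} X^{\top} y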
24. Model training – Introduction - Optimization
How to numerically solve an optimization problem?
Gradient descent: Start with random values for W and move in the opposite direction of the gradient
By taking just one sample
25. Model training – Introduction - Optimization
How to numerically solve an optimization problem?
Gradient descent: Start with random values for W and move in the opposite direction of the gradient
By taking just one sample
(figure: the cost function J(w))
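Written out, the update described here is (a sketch; \eta denotes the learning rate, an assumed symbol; with SGD the gradient is estimated from a single sample (x_s, y_s)):

w \leftarrow w - \eta \, \nabla_w J(w)
\quad\text{e.g. for the squared error of one sample:}\quad
w \leftarrow w - \eta \, \left( x_s^{\top} w - y_s \right) x_s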
26. Model training – Introduction - Optimization
Gradient Descent algorithm Stochastic Gradient Descent algorithm
for epoch in n_epochs:
● compute the predictions for all the samples
● compute the error between truth and predictions
● compute the gradient using all the samples
● update the parameters of the model
for epoch in n_epochs:
● shuffle the samples
● for sample in n_samples:
○ compute the predictions for the sample
○ compute the error between truth and
predictions
○ compute the gradient using the sample
○ update the parameters of the model
Mini-Batch Gradient Descent algorithm
for epoch in n_epochs:
● shuffle the batches
● for batch in n_batches:
○ compute the predictions for the batch
○ compute the error for the batch
○ compute the gradient for the batch
○ update the parameters of the model
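A minimal NumPy sketch of the mini-batch loop described above, applied to the linear-regression objective of slides 22–25 (X, y, eta, batch_size and n_epochs are illustrative names, not from the deck):

import numpy as np

def minibatch_gd(X, y, eta=0.01, batch_size=32, n_epochs=100):
    m, n = X.shape
    w = np.random.randn(n) * 0.01            # start with random values for W
    for epoch in range(n_epochs):
        idx = np.random.permutation(m)        # shuffle the samples
        for start in range(0, m, batch_size):
            batch = idx[start:start + batch_size]
            pred = X[batch] @ w               # predictions for the batch
            error = pred - y[batch]           # error for the batch
            grad = X[batch].T @ error / batch.size   # gradient for the batch
            w -= eta * grad                   # update the parameters of the model
    return w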
27. Model training – Introduction - Optimization
Gradient Descent comparison
Gradient Descent | Stochastic Gradient Descent | Mini-Batch Gradient Descent
Gradient: computed on all samples | computed on one sample | computed on one batch
Speed: Very Fast (vectorized) | Slow (computed sample by sample) | Fast (vectorized)
Memory: O(dataset) | O(1) | O(batch)
Convergence: Needs more epochs | Needs fewer epochs | Middle point between GD and SGD
Gradient Stability: Smooth updates in params | Noisy updates in params | Middle point between GD and SGD
28. Model training – Introduction - Optimization
A Problem with Implicit Feedback
With datasets that contain only unary positive feedback (e.g. click history)
Negative Sampling
Common fix: add random users and items with r=0
29. Model training – Introduction - Optimization
A Problem with Implicit Feedback
With datasets that contain only unary positive feedback (e.g. click history)
Negative Sampling
Common fix: add random users and items with r=0
(negatives sampled from a uniform distribution over the dataset)
30. Model training – Introduction - Optimization
Negative Sampling
Common fix: add random users and items with rating=0
● Expresses items that are “unknown” to users
● Acts as a regularizer
● Works also for explicit feedback
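A minimal sketch of the negative-sampling fix described above, assuming a set of observed positive (user, item) pairs (all names are illustrative):

import numpy as np

def add_negative_samples(pos_pairs, n_users, n_items, n_negatives):
    """Return positive pairs (r=1) plus uniformly sampled negatives (r=0)."""
    observed = set(map(tuple, pos_pairs))
    negatives = []
    while len(negatives) < n_negatives:       # assumes enough unobserved pairs exist
        u = np.random.randint(n_users)        # random user
        i = np.random.randint(n_items)        # random item
        if (u, i) not in observed:            # keep only unobserved pairs
            negatives.append((u, i))
    users = np.array([u for u, _ in pos_pairs] + [u for u, _ in negatives])
    items = np.array([i for _, i in pos_pairs] + [i for _, i in negatives])
    ratings = np.concatenate([np.ones(len(pos_pairs)), np.zeros(n_negatives)])
    return users, items, ratings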
32. Model Training – Baseline models
Introduction
● Before starting to train models, always compute a baseline
● Baselines are very useful to debug more complex models
● As a general rule:
○ Very basic models can’t capture all the details in the training data and tend to underfit
○ Very complex models capture every detail in the training data and tend to overfit
● Note: During this presentation we will be using RMSE to compare model performance
33. Model Training – Baseline models
Global Average
Average = 3.64
Prediction: every missing rating is predicted as the global average, 3.64
RMSE = sqrt((2 - 3.64)^2 + (1 - 3.64)^2 + …)
RMSE = sqrt(4.13)
34. Model Training – Baseline models
Global average - Numpy code
import numpy as np
from scipy.sparse import csr_matrix

# Toy 6x6 ratings matrix stored in sparse (row, col, value) form
rows = np.array([0,0,0,1,1,2,2,2,2,3,3,3,4,4,5,5,5])
cols = np.array([0,1,5,3,5,0,1,2,4,0,3,5,0,2,1,3,4])
data = np.array([2,5,4,1,5,2,4,5,4,4,5,1,5,2,1,4,2])
ratings = csr_matrix((data, (rows, cols)), shape=(6, 6))

# Random 80/20 train/validation split over the observed ratings
idx = np.random.permutation(data.size)
idx_train = idx[0:int(idx.size * 0.8)]
idx_valid = idx[int(idx.size * 0.8):]

# Baseline: predict the global average of the training ratings
global_avg = data[idx_train].mean()
rmse = np.sqrt(((data[idx_valid] - global_avg) ** 2).mean())  # mean, then sqrt (RMSE)
35. Model Training – Baseline models
User average
Average u1 = 4.50, u2 = 5.00, u3 = 3.67, u4 = 2.50, u5 = 5.00, u6 = 2.50
Prediction: each user’s missing ratings are predicted as that user’s average
RMSE = sqrt((2 - 4.5)^2 + (1 - 5.0)^2 + …)
RMSE = sqrt(6.15)
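Continuing the NumPy snippet from slide 34 (same variable names; a sketch, not taken from the deck), the user-average baseline could be computed as:

# Per-user average computed on the training ratings only
train_matrix = csr_matrix((data[idx_train], (rows[idx_train], cols[idx_train])), shape=(6, 6))
sums = np.asarray(train_matrix.sum(axis=1)).ravel()   # sum of each user's training ratings
counts = np.diff(train_matrix.indptr)                 # number of training ratings per user
user_avg = np.divide(sums, counts,
                     out=np.full(6, data[idx_train].mean()),  # fall back to global average
                     where=counts > 0)

# Predict each validation rating with the corresponding user's average
preds = user_avg[rows[idx_valid]]
rmse = np.sqrt(((data[idx_valid] - preds) ** 2).mean())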
39. Model Training – Basic linear models
Content Based - Standard Least Squares model
● Goal: very basic linear model
● Data: the matrix of items features I (may be sparse)
● Pre-processing: use PCA to reduce the dimension of I
● Solve:
● Solution is Least Squares:
40. Model Training – Basic linear models
Content Based - Standard Least Squares model
● Goal: very basic linear model
● Data: the matrix of items features I (may be sparse)
● Pre-processing: use PCA to reduce the dimension of I
● Solve:
● Solution is Least Squares:
Never compute the inverse!
(1) Use numpy:
numpy.linalg.solve(I*I.T, I*R.T)
(2) Use a Cholesky decomposition:
(I * I.T) is a positive definite matrix!
41. Model Training – Basic linear models
Content Based - Regularized Least Squares model
● Goal: avoid overfitting
● Method: Tikhonov Regularization (a.k.a. Ridge Regression)
● Solve:
● Solution is Regularized Least Squares:
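The closed-form solution is not in the transcript; following the conventions of slide 40 (solve against I*I.T rather than inverting), a regularized version would plausibly look like the sketch below (lam is the regularization weight; dense NumPy arrays are assumed):

import numpy as np

def regularized_least_squares(I, R, lam):
    """Solve (I I^T + lam * Id) W = I R^T, i.e. ridge regression, without computing an inverse."""
    d = I.shape[0]                   # assumes I is (features x items), as on slide 40
    return np.linalg.solve(I @ I.T + lam * np.eye(d), I @ R.T)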
43. Model Training – Matrix Factorization
Matrix Factorization
● If we don’t have I, to find a linear solution to our problem we need to use Matrix Factorization
techniques.
● Now we want to solve the following optimization problem:
Solutions: Analytical (SVD) | Numerical (ALS, SGD)
44. Model Training – Matrix Factorization
Matrix Factorization - Graphical interpretation
45. Model Training – Matrix Factorization
Matrix Factorization - Graphical interpretation
46. Model Training – Matrix Factorization
Matrix Factorization - Graphical interpretation
47. Model Training – Matrix Factorization
Analytical solution - Singular Value Decomposition (SVD)
● Optimal Solution
● Closed Form, readily available in scikit-learn
● O(n^3) algorithm, does not scale
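A minimal scikit-learn sketch of the truncated SVD factorization mentioned above, reusing the sparse `ratings` matrix from slide 34 (n_components is an assumed hyper-parameter):

from sklearn.decomposition import TruncatedSVD

svd = TruncatedSVD(n_components=2)   # k latent factors
U = svd.fit_transform(ratings)       # user embeddings, shape (n_users, k)
I = svd.components_                  # item factors, shape (k, n_items)
R_hat = U @ I                        # reconstructed (dense) rating matrix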
48. Model Training – Matrix Factorization
Numerical solution - Alternating Least Squares (ALS)
Initialize:
Iterate:
● Solving least squares is easy
● Scales to big datasets
● Distributed implementations are available (e.g. on Spark)
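The “Initialize / Iterate” formulas are images in the original deck; a common form of the alternating updates (a sketch that ignores the handling of missing entries for brevity; \lambda and the embedding size k are assumed) is:

\text{Initialize } I \text{ (n x k) randomly, then iterate:}
U \leftarrow R \, I \, \left( I^{\top} I + \lambda \, \mathrm{Id}_k \right)^{-1}
I \leftarrow R^{\top} U \, \left( U^{\top} U + \lambda \, \mathrm{Id}_k \right)^{-1}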
49. Model Training – Matrix Factorization
Numerical solution - Stochastic Gradient Descent (SGD)
We are using SGD -> one sample at a time
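The per-sample update itself is shown as an image on the slide; a standard form of it (a sketch; e_ui is the prediction error, \eta the learning rate, both assumed symbols) is:

e_{ui} = r_{ui} - \mathbf{u}_u^{\top} \mathbf{i}_i
\mathbf{u}_u \leftarrow \mathbf{u}_u + \eta \left( e_{ui} \, \mathbf{i}_i - \lambda \, \mathbf{u}_u \right)
\mathbf{i}_i \leftarrow \mathbf{i}_i + \eta \left( e_{ui} \, \mathbf{u}_u - \lambda \, \mathbf{i}_i \right)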
52. Model Training – Non-linear models
Simple Deep Learning model for collaborative filtering
53. Model Training – Non-linear models
Simple Deep Learning model for collaborative filtering
54. Model Training – Basic Deep Learning model
Simple Deep Learning model for collaborative filtering
55. Model Training – Complex Deep Learning problem
More complex Deep Learning model for collaborative filtering
56. Model Training – Complex Deep Learning problem
Training with Deep Learning
● Use Deep Learning Framework (e.g. PyTorch, TensorFlow)
● ...or at least Analytical Gradient Libraries (e.g. Theano, Chainer)
● Acceleration Heuristics (e.g. AdaGrad, Nesterov, RMSProp, Adam, NAdam)
● DropOut / BatchNorm
● Watch out for Sparse Momentum Updates! Most Deep Learning frameworks don’t support them
● Hyper-parameter Optimization and Architecture Search (e.g. Gaussian Processes)
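As an illustration of what the “simple Deep Learning model for collaborative filtering” of slides 52–54 might look like (a PyTorch sketch, not the deck’s exact architecture; all names, layer sizes and the Adam choice are illustrative):

import torch
import torch.nn as nn

class SimpleCFModel(nn.Module):
    """Embed users and items and predict a rating from their interaction."""
    def __init__(self, n_users, n_items, k=32):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, k)   # plays the role of U
        self.item_emb = nn.Embedding(n_items, k)   # plays the role of I
        self.mlp = nn.Sequential(nn.Linear(2 * k, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, user_ids, item_ids):
        u = self.user_emb(user_ids)
        i = self.item_emb(item_ids)
        return self.mlp(torch.cat([u, i], dim=-1)).squeeze(-1)

# Training setup sketch: Adam is one of the acceleration heuristics listed above
model = SimpleCFModel(n_users=6, n_items=6)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()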
58. Model Training – Conclusions
Conclusions
Model: Global Avg | User Avg | Item-Item | Linear | Linear + Reg | Matrix Fact | Deep Learning
Domains: Baseline | Baseline | users >> items | Known “I” | Known “I” | Unknown “I” | Extra datasets
Model Complexity: Trivial | Trivial | Simple | Linear | Linear | Linear | Non-linear
Time Complexity: + | + | +++ | ++++ | ++++ | ++++ | ++
Overfit/Underfit: Underfit | Underfit | May Underfit | May Overfit | May Perform Bad | May Overfit | Can Overfit
Hyper-Params: 0 | 0 | 0 | 1 | 2 | 2–3 | many
Implementation: Numpy | Numpy | Numpy | Numpy | Numpy | LightFM, Spark | NNet libraries
59. Model Training – Conclusions
Take home
● Always start with the simplest, stupidest models
● Spend time on simple interpretable models to debug your codebase and clean your data
● Gradually increase the complexity of your models
● Add more regularization as soon as a complex model performs worse than a simpler model