How Can Pre-Training Help
Solve the Cold Start Problem?
Lokesh Vadlamudi
San Jose State University
Why do we need to solve the cold start problem?
• Modern recommendation systems suffer from a major issue: a lack of
interaction data. This is generally known as the cold start problem
(data sparsity).
• The two types of models that leverage pre-training are:
• 1. Feature-based models
• 2. Fine-tuning models
• In feature-based models, pre-trained models extract features for
users and items from side information (knowledge graphs and item
content).
• In fine-tuning models, the model is first pre-trained on user-item
interaction data and later fine-tuned to suit the needs of specific
recommendation tasks.
The different types of feature-based models are:
• 1. Content-Based Recommendation:
• Item-to-item interaction data helps in recommending items to users.
Pre-trained models help here by extracting useful features from
text, images, etc.
• 2. Knowledge Graph-Based Recommendation:
• A knowledge graph contains connections between users, items, etc. It
generally includes user profiles, item attributes, and cross-domain item
relations.
• 3. Social Recommendation:
• This type of recommendation needs a social graph, which in turn is
based on the relations between users. A user may like items that their
friends already liked. In this method, pre-trained social network
embeddings can help improve the recommendation model.
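As a rough illustration of the content-based, feature-based idea above, the sketch below ranks items that have no interaction history by comparing their content features with the items a user already liked. The feature vectors are made-up placeholders standing in for the output of some pre-trained text or image encoder, and the names item_features, user_profile and rank_cold_items are hypothetical.

```python
import math

# Placeholder vectors; in practice these would come from a pre-trained
# text or image encoder applied to each item's content (an assumption).
item_features = {
    "movie_a": [0.9, 0.1, 0.0],
    "movie_b": [0.8, 0.2, 0.1],
    "new_movie_x": [0.85, 0.15, 0.05],  # new item, no interactions yet
    "new_movie_y": [0.0, 0.1, 0.9],     # new item, no interactions yet
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def user_profile(liked_items):
    """Average the content features of the items a user liked."""
    vectors = [item_features[item] for item in liked_items]
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def rank_cold_items(liked_items, candidates):
    """Order candidate items by similarity to the user's profile."""
    profile = user_profile(liked_items)
    return sorted(
        candidates,
        key=lambda item: cosine(profile, item_features[item]),
        reverse=True,
    )

ranking = rank_cold_items(["movie_a", "movie_b"], ["new_movie_y", "new_movie_x"])
print(ranking)
```

Because the ranking depends only on content features, a brand-new item can be scored before anyone has interacted with it.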
Fine-tuning models:
• Shallow Neural Networks:
• Shallow neural networks, such as shallow MLPs and recurrent neural
networks, serve as the base models for many knowledge transfer
experiments.
• Ni et al. (2018) proposed the DUPN model, which uses richer
pre-training tasks. User representations are captured by an LSTM and an
attention layer. The model is pre-trained with multiple task objectives,
including click-through rate prediction, price prediction, and shop
prediction, from which it can learn universal user representations.
Although the results were impressive, with accurate predictions, the model
requires a lot of extra information on user preferences for its enhanced
pre-training tasks.
BERT-based Models
• Masked Item Prediction
• Given the input sequence, some of the items are randomly masked
with the special token [MASK]. The model should reconstruct the masked
items. The interaction sequence is the list of items a user interacted
with, in chronological order.
• This task uses the user interaction sequence by considering the
whole context for representations, unlike the left-to-right next-item
prediction task commonly used in session-based recommendation
systems. Hence models pre-trained with masked item prediction (MIP) can
provide accurate results.
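The masking step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact procedure of any particular paper: the function name mask_items, the mask_prob value and the item IDs are all hypothetical.

```python
import random

MASK = "[MASK]"

def mask_items(sequence, mask_prob=0.2, seed=None):
    """Randomly replace items with MASK.

    Returns the masked sequence and a dict mapping each masked position
    to the original item the model would be trained to reconstruct.
    """
    rng = random.Random(seed)
    masked, targets = [], {}
    for position, item in enumerate(sequence):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[position] = item
        else:
            masked.append(item)
    # Guarantee at least one training target for a non-empty sequence.
    if sequence and not targets:
        position = rng.randrange(len(sequence))
        targets[position] = sequence[position]
        masked[position] = MASK
    return masked, targets

# A user's interaction history in chronological order (made-up IDs).
history = ["item_12", "item_7", "item_31", "item_7", "item_90"]
masked, targets = mask_items(history, mask_prob=0.3, seed=0)
print(masked)
print(targets)
```

A model trained on such pairs sees items both before and after each masked position, which is what separates this objective from left-to-right next-item prediction.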
BERT for Recommendation System
• BERT is a multi-layered bidirectional Transformer. Each Transformer
layer comprises two sub-layers: a multi-head attention sub-layer
and a point-wise feed-forward network.
• Multi-head Attention
• The Transformer uses multi-head self-attention, which jointly attends
to information from different vector sub-spaces. Specifically, this
mechanism first linearly projects the input sequence into several
sub-spaces and then produces the output representation by applying
attention functions in each of them.
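The mechanism just described (project into sub-spaces, apply attention in each, combine) can be sketched in plain Python. This is a toy illustration with randomly initialized weights and hypothetical function names, written with lists for self-containment; a real implementation would use a tensor library and learned parameters.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Scaled dot-product attention for one head (no masking, so every
    position attends to every other position in both directions)."""
    scale = math.sqrt(len(q[0]))
    k_transposed = [list(col) for col in zip(*k)]
    scores = matmul(q, k_transposed)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def multi_head_attention(x, num_heads, rng):
    """Project x into num_heads sub-spaces, attend in each, then combine."""
    d_model = len(x[0])
    d_head = d_model // num_heads
    heads = []
    for _ in range(num_heads):
        w_q = random_matrix(d_model, d_head, rng)
        w_k = random_matrix(d_model, d_head, rng)
        w_v = random_matrix(d_model, d_head, rng)
        head_out, _ = attention(matmul(x, w_q), matmul(x, w_k), matmul(x, w_v))
        heads.append(head_out)
    # Concatenate the heads position by position, then mix them linearly.
    concat = [sum((head[i] for head in heads), []) for i in range(len(x))]
    w_o = random_matrix(num_heads * d_head, d_model, rng)
    return matmul(concat, w_o)

rng = random.Random(0)
x = random_matrix(4, 8, rng)  # 4 sequence positions, model width 8
out = multi_head_attention(x, num_heads=2, rng=rng)
print(len(out), len(out[0]))
```

The output keeps the input shape (one vector of width d_model per position), so layers of this kind can be stacked.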
• Point-wise Feed-Forward Network
• The point-wise feed-forward network gives the model its
non-linearity. A fully connected feed-forward network is applied
individually and identically to each position. The sub-layer consists of
two linear transformations with a ReLU activation in between.
• Chen et al. fine-tune BERT4RS on a content-based click-through
prediction task. The pre-trained BERT produces the user representation
from the user's historical behavior sequence, and the item
representation is produced from the item's content.
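For the point-wise feed-forward network described above, a minimal sketch is given below. The weights are small hand-picked placeholders (a trained model would learn them), and the helper names linear, relu and pointwise_ffn are hypothetical.

```python
def linear(vector, weights, bias):
    """Compute vector @ weights + bias, with weights given as rows."""
    return [
        sum(x * w for x, w in zip(vector, column)) + b
        for column, b in zip(zip(*weights), bias)
    ]

def relu(vector):
    return [max(0.0, x) for x in vector]

def pointwise_ffn(sequence, w1, b1, w2, b2):
    """Apply linear -> ReLU -> linear to every position independently."""
    return [linear(relu(linear(h, w1, b1)), w2, b2) for h in sequence]

# Placeholder parameters: model width 2, hidden width 3.
w1 = [[1.0, -1.0, 0.5], [0.0, 2.0, -1.0]]
b1 = [0.0, 0.0, 0.0]
w2 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b2 = [0.1, -0.1]

# Positions 0 and 2 hold the same vector, so their outputs must match.
sequence = [[1.0, 2.0], [0.5, -1.0], [1.0, 2.0]]
outputs = pointwise_ffn(sequence, w1, b1, w2, b2)
print(outputs)
```

Since the same parameters are applied at every position, identical input vectors always yield identical outputs regardless of where they sit in the sequence; mixing information across positions is left to the attention sub-layer.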
• Yang et al. (2019) follow BERT4RS for the next-basket recommendation
task, in which the model is pre-trained with masked item prediction (MIP)
and next basket prediction (NBP) tasks. In reality, a user usually buys or
browses a series of items (a basket) at a time.
• Parameter-Efficient Pre-trained Model:
• Fine-tuning a separate model for each task can be computationally
expensive. To solve this issue, Yuan et al. proposed PeterRec, which uses a
grafting neural network, also known as a model patch. With model patching,
all pre-trained parameters can be kept unchanged during fine-tuning.
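The general idea of keeping pre-trained parameters frozen while training only a small added module can be sketched as below. This is a toy illustration of the concept and not the PeterRec architecture itself: the layer sizes, the residual bottleneck form of the patch, and all names and values are assumptions.

```python
def matvec(x, w):
    """Row vector x times matrix w (given as a list of rows)."""
    return [sum(xi * wij for xi, wij in zip(x, col)) for col in zip(*w)]

# Frozen "pre-trained" layer; values are arbitrary placeholders.
W_PRETRAINED = [[0.5, -0.2], [0.1, 0.3]]

# Small patch with a bottleneck of width 1. The up-projection starts at
# zero, so the patched layer initially behaves like the pre-trained one.
patch_down = [[0.4], [0.6]]
patch_up = [[0.0, 0.0]]

def forward(x):
    """Pre-trained output plus the residual correction from the patch."""
    base = matvec(x, W_PRETRAINED)
    hidden = [max(0.0, v) for v in matvec(x, patch_down)]
    delta = matvec(hidden, patch_up)
    return [b + d for b, d in zip(base, delta)], hidden

def loss(prediction, target):
    return 0.5 * sum((p - t) ** 2 for p, t in zip(prediction, target))

def train_patch_step(x, target, lr=0.1):
    """One gradient step on the squared loss. Only patch_up is updated;
    patch_down is kept fixed here purely to keep the sketch short."""
    prediction, hidden = forward(x)
    error = [p - t for p, t in zip(prediction, target)]
    for k in range(len(patch_up)):
        for j in range(len(patch_up[k])):
            patch_up[k][j] -= lr * hidden[k] * error[j]

x, target = [1.0, 2.0], [1.0, 0.0]
frozen_copy = [row[:] for row in W_PRETRAINED]
loss_before = loss(forward(x)[0], target)
for _ in range(20):
    train_patch_step(x, target)
loss_after = loss(forward(x)[0], target)
print(loss_before, loss_after)
```

The pre-trained weights can be shared across tasks, with each task storing only its own small patch.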
Experimentation
• The experiment is done on the MovieLens dataset. The Caser model and
the BERT4Rec model are used to check the usefulness of pre-training in
recommendation. Deep knowledge transfer performs best with the deep
BERT4Rec model. Next-item predictions are better with the Caser
model. When external knowledge is injected, BERT4Rec performs
better than Caser. Thus we can conclude that pre-training does help
improve recommendations where cold start is present.
