Online Tuning of Large-Scale
Recommendation Systems
Team: Yafei Wang, Yunbo Ouyang, Kinjal Basu, Ajith Muralidharan,
Shaunak Chatterjee, Shipeng Yu
Session Goal
• Motivate the need for tuning parameters
• Notifications (use-case)
• Current Approaches
• Gaussian Processes Primer
• Modelling various online metrics using Gaussian Processes
• Use Thompson sampling to find optimal solution
• Other use-cases at LinkedIn.
• Summary
Problem
• Balancing multiple metrics is a core problem at LinkedIn.
• Notifications/Email
• Maximize sessions
• Minimize send volume and app disablement rate
• PYMK (People You May Know)
• Increase invite accept rate and engagement
• Keep invitations below a certain threshold
Problem
[Diagram: m models each take input features f1, f2, …, fn, apply a transform T1(f1), T1(f2), …, T1(fn), and output Score(1), …, Score(m). The scores are weighted by w1, …, wm and combined into a final score, e.g. w1*s1 + w2*s2.]
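The final-score combination on this slide (w1*s1 + w2*s2) can be sketched as a weighted sum. This is a minimal illustration, not the production scorer: the function name `final_score` and the example numbers are hypothetical.

```python
from typing import Sequence


def final_score(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Combine per-model scores s1..sm with tunable weights w1..wm."""
    if len(scores) != len(weights):
        raise ValueError("need exactly one weight per score")
    return sum(w * s for w, s in zip(weights, scores))


# Two models, as in w1*s1 + w2*s2 (values are made up for illustration).
example = final_score(scores=[0.5, 0.25], weights=[1.0, 2.0])
```

The weights are the hyper-parameters that the rest of the talk is about tuning; the scores come from the individual models.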
LinkedIn Connects the World's Professionals
Remain updated about the
activities of their connections
through newsfeed
Activity Based Notifications
Non-transactional messages, time-sensitive content
Goal: drive member engagement
while creating delightful experiences
Feeds & Events Notification
Mobile App Uses Notifications to Inform
[Diagram labels: Actor, Feeds, Push, InApp, Badging, Recipients, Response]
Notifications Problem Setup
Probability of a click (member, content, recipient): pClick
Probability of a session: pVisit
Final decision: send if pClick + alpha * pVisit > T
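As a minimal sketch of this decision rule, assuming that a notification is sent when the combined score exceeds the threshold (the function and argument names are hypothetical):

```python
def should_send(p_click: float, p_visit: float, alpha: float, threshold: float) -> bool:
    """Apply the rule pClick + alpha * pVisit > T for one candidate notification."""
    return p_click + alpha * p_visit > threshold


# alpha trades off the session utility against the click utility;
# a larger threshold T means fewer notifications pass the rule.
decision = should_send(p_click=0.25, p_visit=0.5, alpha=0.5, threshold=0.4)
```

Both alpha and T change which notifications pass the rule, which is why they are tuned jointly.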
Notifications Optimization Problem
• The weight vector x = (alpha, T) needs to be tuned.
• We are interested in solving an optimization problem over the online metrics, viewed as functions of x.
Here CTR and Sends are online metrics, which are different from the utility models.
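One plausible shape for such a problem is sketched below. It is an assumption, not the formula from the slide: the choice of Sessions as the objective and the bounds c_CTR and c_Sends are hypothetical, suggested only by the earlier goals of maximizing sessions while limiting send volume.

```latex
\begin{aligned}
\max_{x = (\alpha, T)} \quad & \mathrm{Sessions}(x) \\
\text{subject to} \quad & \mathrm{CTR}(x) \ge c_{\mathrm{CTR}}, \\
& \mathrm{Sends}(x) \le c_{\mathrm{Sends}}
\end{aligned}
```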
Current Approaches
• Online Method: We try several choices of weights
• Launch A/B experiments with different values of weights and monitor the
progress of the experiments.
• Searching for the optimal weights can take 1-2 months.
• Pain Points
• Extremely poor model iteration velocity
• Hampers developer productivity
Proposed Solution
Use Bayesian Optimization via Thompson Sampling.
• Remove the human in the loop: Fully automatic process to find the optimal
parameters.
• Drastically improves developer productivity.
Primer on Gaussian Processes
• Given training data D = {(x1, y1), (x2, y2), …, (xn, yn)} and a test point xt, find yt
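For reference, the standard Gaussian process regression result can be written as follows. The notation is an assumption (the usual textbook form, zero prior mean): K is the kernel matrix over the training inputs, K_* holds the kernel values between the training inputs and x_t, K_{**} is the kernel value of x_t with itself, sigma^2 is the observation noise variance, and f_t is the latent function value at x_t.

```latex
\begin{bmatrix} y \\ f_t \end{bmatrix}
\sim \mathcal{N}\!\left( 0,\;
\begin{bmatrix} K + \sigma^2 I & K_* \\ K_*^\top & K_{**} \end{bmatrix} \right)

f_t \mid D \;\sim\; \mathcal{N}\!\left(
K_*^\top (K + \sigma^2 I)^{-1} y,\;\;
K_{**} - K_*^\top (K + \sigma^2 I)^{-1} K_* \right)
```

The predictive mean for a noisy observation y_t is the same; its variance has sigma^2 added.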
Model
• Let a binary indicator denote whether the i-th member, for the j-th notification served under parameter x, performs action k. Here k = clicking the notification.
Thompson Sampling Algorithm
• Consider a Gaussian process prior on each utility, i.e. Clicks, Sends
• The user provides a search region over alpha, T as input.
• Explore Step (the number of exploration iterations is a user input and is use-case dependent)
• Randomly sample values to observe the metrics (Clicks, Sends). Collect training data (x1, y1), (x2, y2), …, (xn, yn)
• Exploit Step
• Fit the Gaussian process: learn kernel parameters by maximizing P(y|x)
• Sample functions from the posterior distribution
• Get the next distribution of hyperparameters by optimizing the objective
• Continue the Explore-Exploit steps until convergence
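The explore/exploit loop above can be sketched in plain Python under several simplifying assumptions that are not from the talk: a single parameter on a fixed grid, one unconstrained metric to maximize, a squared-exponential kernel with a fixed length scale (the kernel-fitting step is skipped), and one posterior sample per iteration rather than a distribution over candidates. All function names and the toy metric are hypothetical.

```python
import math
import random


def rbf(a, b, length=0.2):
    """Squared-exponential kernel; the length scale is fixed, not learned."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def cholesky(A):
    """Lower-triangular L with L L^T = A (diagonal clamped for stability)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / L[j][j]
    return L


def forward_solve(L, b):
    """Solve L z = b for lower-triangular L."""
    z = []
    for i, row in enumerate(L):
        z.append((b[i] - sum(row[k] * z[k] for k in range(i))) / row[i])
    return z


def gp_posterior(xs, ys, grid, noise):
    """Posterior mean and covariance of a zero-mean GP at the grid points."""
    n, m = len(xs), len(grid)
    K = [[rbf(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    w = forward_solve(L, ys)  # L^{-1} y
    V = [forward_solve(L, [rbf(x, g) for x in xs]) for g in grid]  # L^{-1} k_*
    mean = [sum(V[p][i] * w[i] for i in range(n)) for p in range(m)]
    cov = [[rbf(grid[p], grid[q]) - sum(V[p][i] * V[q][i] for i in range(n))
            for q in range(m)] for p in range(m)]
    return mean, cov


def thompson_pick(xs, ys, grid, noise, rng):
    """Draw one function from the posterior and return its maximizer."""
    mean, cov = gp_posterior(xs, ys, grid, noise)
    m = len(grid)
    for p in range(m):
        cov[p][p] += 1e-6  # jitter so the covariance can be factorized
    C = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(m)]
    sample = [mean[p] + sum(C[p][q] * z[q] for q in range(p + 1))
              for p in range(m)]
    return grid[max(range(m), key=lambda p: sample[p])]


def tune(observe, grid, n_explore=5, n_exploit=10, noise=0.01, seed=0):
    """Explore with random points, then exploit with Thompson sampling."""
    rng = random.Random(seed)
    xs = [rng.choice(grid) for _ in range(n_explore)]
    ys = [observe(x, rng) for x in xs]
    for _ in range(n_exploit):
        x_next = thompson_pick(xs, ys, grid, noise, rng)
        xs.append(x_next)
        ys.append(observe(x_next, rng))
    return xs, ys


def toy_metric(x, rng):
    """Hypothetical noisy metric whose true optimum is at x = 0.6."""
    return -(x - 0.6) ** 2 + rng.gauss(0.0, 0.1)


grid = [i / 20 for i in range(21)]
xs, ys = tune(toy_metric, grid)
```

In a real setting `observe` would be an online experiment serving traffic with parameter x, and constraints on other metrics would be handled by sampling those metrics' posteriors as well.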
Thompson Sampling Algorithm
Overall Model Tuning Architecture
Plots (Synthetic Data)
[Figure: two panels. Left: training data. Right: mean of the posterior.]
Results
Here the red line indicates the data and the blue lines are samples from the posterior.
PYMK Problem
Setup
• Recommend members that have a high probability of connection.
• Engage members through interactions with
connections (e.g. member engagement)
Detailed Setup - PYMK
The online metrics Accept, Invite and Engage are functions of the parameters being tuned, and we would like to solve an optimization problem over them.
Library API
Library API
Summary
• Doing A/B experiments to search for optimal hyper-parameters can sometimes take 1-2 months.
• We have found that experimentation velocity increases for several use-cases, including Notifications, PYMK, ads and feed (1-2 weeks).
• This method relies on a good search range for the weights that need to be tuned.
• High variance in the metrics may make convergence difficult.
• A very generic technique that can be used for modelling non-linear functions.
Next Steps
• Time varying Gaussian Processes
• We assumed a metric doesn’t change day over day. In reality, weekend traffic is different from weekday traffic.
• Grey Box Optimization
• We assumed each metric in the optimization is a function of all the parameters.
• We may know which metrics are affected by which subset of parameters.
• This can help us get faster convergence


Editor's Notes

  1. Hello everyone, I work in the communications AI team at LinkedIn. Today I am going to be talking about how we have applied online tuning to several recommendation system problems at LinkedIn.
  2. The running example is going to be Notifications. We will take a brief detour into a GP primer, which will set the stage to discuss the model. Towards the end we will discuss other use-cases that are using this technique at LinkedIn.
  3. Any recommendation system problem has a bunch of metrics we care about. Balancing multiple metrics is a core problem. Notifications: maximize sessions, minimize disables. PYMK: recommend members so that you are delighted to establish a connection and have meaningful conversations, increasing overall engagement on the platform.
  4. Multiply S1 and S2 by certain weights w1 and w2. The weights here are the hyper-parameters, and we may want to learn them to achieve certain metrics.
  5. We will look at the notifications ecosystem. This will help us understand the optimization problem we would like to solve online.
  6. LinkedIn’s mission is to connect the world’s professionals and make them more productive and successful. As is the case with most social media platforms, our users access LinkedIn through the mobile app. On the app, they remain updated about the activities of their professional connections through products like the newsfeed.
  7. One way to make people aware of new content that's created on the platform is through notifications. For instance, there is an article shared about the Lyft cofounder talking about his vision of a driverless future. If I am someone who is interested in the self-driving space, this can be a good notification candidate for me. Such notifications fall in the category of activity based notifications (because they are triggered by user activity). We realized that in certain situations users do not want to miss out on these and would like to be informed in a timely manner. For example, my manager was mentioned in the news, or my close co-worker just posted or shared an interesting article. I would like to be informed on time so that I can promptly join the conversation about the post.
  8. Among those activity events, we realized that some are so important that our users cannot miss them and should be informed in a timely manner. For example, my manager was mentioned in the news, or my close co-worker just posted or shared an interesting article. I would like to be informed on time so that I can promptly join the conversation about the post. On the other hand, there are some less important events or time-insensitive content, for example recommended courses or people you may know at LinkedIn. These are not important enough to require informing the member in real time.
  9. This slide shows how a social activity becomes a notification. An actor takes some action such as a comment, like, post, or repost on the newsfeed. Then, as decided by the relevance model, we fan out to the recipients to be informed and send either a push or an in-app notification. Either one will generate a badge update on top of the app icon. On seeing the pop-up message or badge update, the recipient will click the notification tab and check what is in notifications. Then, the user may respond to them depending on the quality of the notification. Content producer / animation.
  10. The system diagram may look like the following. Single utility models: pClick and pVisit.
  11. This means the online metrics CTR and Sends are functions of (alpha, T).
  12. Given a bunch of training data (x1, …, xn, y1, …, yn) and a test point xt, predict yt. Any subset of points follows a multivariate normal distribution. This is non-parametric. Here K* is the similarity between the test and training points.
  13. The next step is to collect data, i.e. get training data to fit the function. Learning the kernel parameters is done by maximizing the log likelihood of the data. Once we know the kernel parameters theta, we can write the posterior on test points. We just saw that this has a closed form. We can sample from this posterior to find the x’s that solve the problem. When you take multiple samples you get an empirical distribution on x. You can use the new distribution on x as new inputs to collect data on.
  14. The dotted line is the ground truth. In the beginning we know nothing about the ground truth, but as we proceed we see our prediction getting closer to the ground truth.
  15. Let's look at the online part first. For a given user we fetch the right parameters from the datastore and use them in online scoring to make decisions. As the user is exposed to traffic, certain tracking events are generated, e.g. certain content was impressed, clicked, or liked.
  16. A very generic technique to fit non-linear functions. On the left we are looking at the training data (12 points). On the right we have the mean of the posterior (1000 points). We are able to fit the pattern very well just by looking at a few data points.
  17. On the left we have a plot of the CTR metric. The red line shows the real data and the blue lines show 10 samples from the posterior. The black line indicates the control metric. This means everything above the line is feasible.
  18. These are utility models but the true metrics we care about are Accepts, Invites, Engage.
  19. The user specifies the treatment model that needs tuning and the control model.
  20. Spending time doing A/B tests isn't productive. Using online tuning certainly helps. This approach does rely on the search range.
  21. For example, metrics for a certain cohort depend only on the parameters of that cohort.