The document discusses using Gaussian processes and Thompson sampling for online optimization of recommendation system parameters at LinkedIn. It presents notifications and People You May Know (PYMK) as use cases where balancing multiple metrics is important. Current approaches using A/B testing are slow, taking 1-2 months. The proposed solution models metrics as Gaussian processes and uses Thompson sampling for automatic, fully online optimization, improving developer productivity. It provides details on setting up the optimization problems, modeling metrics, and the Thompson sampling algorithm. Results on synthetic data and increased experimentation velocity for various use cases at LinkedIn are presented.
Online Tuning of Large Scale Recommendation Systems
1. Online Tuning of Large-Scale Recommendation Systems
Team: Yafei Wang, Yunbo Ouyang, Kinjal Basu, Ajith Muralidharan, Shaunak Chatterjee, Shipeng Yu
2. Session Goal
• Motivate the need for tuning parameters
• Notifications (use-case)
• Current Approaches
• Gaussian Processes Primer
• Modelling various online metrics using Gaussian Processes
• Use Thompson sampling to find the optimal solution
• Other use-cases at LinkedIn
• Summary
3. Problem
• Balancing multiple metrics is a core problem at LinkedIn.
• Notifications/Email
• Maximize sessions
• Minimize send volume and the app-disablement rate
• PYMK (People You May Know)
• Increase the invite-accept rate and engagement
• Keep invitation volume below a certain threshold
5. Session Goal
• Motivate the need for tuning parameters
• Notifications (use-case)
• Current Approaches
• Gaussian Processes Primer
• Modelling various online metrics using Gaussian Processes
• Use Thompson sampling to find the optimal solution
• Other use-cases at LinkedIn
• Summary
6. LinkedIn Connects the World's Professionals
Professionals remain updated about the activities of their connections through the newsfeed.
8. Mobile App Uses Notifications to Inform
[Diagram: an actor's activity on the feed fans out to recipients via Push, InApp, and Badging notifications, which generate a response.]
9. Notifications Problem Setup
Probability of a click (member, content, recipient): pClick
Probability of a session: pVisit
Final decision: pClick + alpha * pVisit > T
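The final decision rule above can be sketched directly; this is a minimal illustration, with the function name and the example scores being hypothetical:

```python
def should_notify(p_click: float, p_visit: float, alpha: float, threshold: float) -> bool:
    """Final send decision: combine the click and visit utilities with
    tuning weight alpha and compare against threshold T."""
    return p_click + alpha * p_visit > threshold

# With alpha = 0.5 and T = 0.4, a candidate scoring pClick = 0.3 and
# pVisit = 0.3 has a combined score of 0.45 and is sent.
send = should_notify(0.3, 0.3, alpha=0.5, threshold=0.4)
```

The point of the next slides is that alpha and T are exactly the weights that need tuning.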
10. Notifications Optimization Problem
• The weight vector x = (alpha, T) needs to be tuned.
• We are interested in maximizing CTR while keeping the send volume constrained.
Here CTR and Sends are online metrics, distinct from the utility models pClick and pVisit.
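Written out, the constrained problem this slide refers to plausibly takes the following form; the send-volume budget c is an assumption on my part:

```latex
\max_{x = (\alpha, T)} \; \mathrm{CTR}(x)
\quad \text{subject to} \quad \mathrm{Sends}(x) \le c
```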
11. Session Goal
• Motivate the need for tuning parameters
• Notifications (use-case)
• Current Approaches
• Gaussian Processes Primer
• Modelling various online metrics using Gaussian Processes
• Use Thompson sampling to find the optimal solution
• Other use-cases at LinkedIn
• Summary
12. Current Approaches
• Online method: we try several choices of weights.
• Launch A/B experiments with different weight values and monitor the progress of the experiments.
• Searching for the optimal weights can take 1-2 months.
• Pain points:
• Extremely poor model-iteration velocity
• Hampers developer productivity
13. Proposed Solution
Use Bayesian optimization via Thompson sampling.
• Remove the human from the loop: a fully automatic process to find the optimal parameters.
• Drastically improves developer productivity.
14. Session Goal
• Motivate the need for tuning parameters
• Notifications (use-case)
• Current Approaches
• Gaussian Processes Primer
• Modelling various online metrics using Gaussian Processes
• Use Thompson sampling to find the optimal solution
• Other use-cases at LinkedIn
• Summary
15. Primer on Gaussian Processes
• Given training data D = {(x1, y1), ..., (xn, yn)} and a test point xt, predict yt.
• Any finite subset of points follows a multivariate normal distribution, so the posterior mean and variance at xt are available in closed form.
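The closed-form posterior described above can be sketched in a few lines of NumPy; this assumes a fixed RBF kernel with unit hyperparameters (in practice the kernel parameters are learned by maximizing the marginal likelihood, as the Thompson sampling slide notes):

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel k(a, b) = v * exp(-|a - b|^2 / (2 l^2))."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-6):
    """Closed-form GP posterior mean and variance at the test points."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_star = rbf_kernel(x_test, x_train)   # K*: similarity of test vs. training points
    K_ss = rbf_kernel(x_test, x_test)
    K_inv = np.linalg.inv(K)
    mean = K_star @ K_inv @ y_train
    cov = K_ss - K_star @ K_inv @ K_star.T
    return mean, np.diag(cov)

# Fit a noiseless sine: the posterior mean should track sin(x) near the data.
x = np.linspace(0, 3, 10)
y = np.sin(x)
mu, var = gp_posterior(x, y, np.array([1.5]))
```

K_star plays the role of K*, the similarity between test and training points; the posterior is non-parametric and fully determined by the kernel and the observed data.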
16. Notifications Optimization Problem
• The weight vector x = (alpha, T) needs to be tuned.
• We are interested in maximizing CTR while keeping the send volume constrained.
Here CTR and Sends are online metrics, distinct from the utility models pClick and pVisit.
17. Model
• Let an indicator denote whether the i-th member performs action k on the j-th notification served with parameter x. Here k = clicking the notification.
18. Thompson Sampling Algorithm
• Place a Gaussian process prior on each utility, i.e. Clicks and Sends.
• The user provides a search region over (alpha, T) as input.
• Explore step (the number of exploration iterations is a user input and use-case dependent):
• Randomly sample parameter values and observe the metrics (Clicks, Sends). Collect training data (x1, y1), ..., (xn, yn).
• Exploit step:
• Fit the Gaussian process: learn the kernel parameters by maximizing P(y|x).
• Sample functions from the posterior distribution.
• Obtain the next distribution of hyperparameters by optimizing the objective.
• Continue explore-exploit until convergence.
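The explore/exploit loop above can be illustrated on a toy 1-D problem; the metric function here is a hypothetical stand-in for an online metric like CTR, and the real system tunes (alpha, T) under constraints rather than a single unconstrained parameter:

```python
import numpy as np

rng = np.random.default_rng(0)

def metric(x):
    """Hypothetical online metric, unknown to the tuner; peaks at x = 0.7."""
    return np.exp(-(x - 0.7) ** 2 / 0.1)

def kernel(a, b, ls=0.2):
    """RBF kernel with a fixed length scale (learned in the real system)."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

grid = np.linspace(0, 1, 50)          # user-provided search region
x_obs, y_obs = [], []

# Explore step: a few randomly sampled parameter values.
for x in rng.uniform(0, 1, 5):
    x_obs.append(x)
    y_obs.append(metric(x) + 0.01 * rng.normal())

# Exploit step: draw one function from the GP posterior, act on its argmax.
for _ in range(20):
    X, Y = np.array(x_obs), np.array(y_obs)
    K = kernel(X, X) + 1e-4 * np.eye(len(X))
    Ks = kernel(grid, X)
    Kinv = np.linalg.inv(K)
    mean = Ks @ Kinv @ Y
    cov = kernel(grid, grid) - Ks @ Kinv @ Ks.T
    f = rng.multivariate_normal(mean, cov + 1e-8 * np.eye(len(grid)))
    x_next = grid[np.argmax(f)]       # Thompson sample: optimize the draw
    x_obs.append(x_next)
    y_obs.append(metric(x_next) + 0.01 * rng.normal())

best = x_obs[int(np.argmax(y_obs))]
```

Each exploit iteration draws one function from the posterior and acts on its argmax; the argmaxes accumulated across draws form the empirical distribution of hyperparameters mentioned on the slide.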
23. PYMK Problem Setup
• Recommend members that have a high probability of connection.
• Engage members through interactions with connections (e.g. member engagement).
24. Detailed Setup - PYMK
The online metrics Accept, Invite and Engage are functions of the tuning parameters, and we would like to solve an optimization problem of the same form: maximize accepts and engagement while keeping invitations below a threshold.
27. Summary
• Running A/B experiments to search for an optimal hyper-parameter can sometimes take 1-2 months.
• We have seen experimentation velocity increase for several use-cases, including Notifications, PYMK, ads and feed (1-2 weeks).
• This method relies on a good search range for the weights that need to be tuned.
• High variance in the metrics may make convergence difficult.
• This is a very generic technique and can be used for modelling non-linear functions.
28. Next Steps
• Time-varying Gaussian Processes
• We assumed the metric doesn't change day over day; in reality, weekend traffic is different from weekday traffic.
• Grey-Box Optimization
• We assumed each metric in the optimization is a function of all the parameters.
• We may know which metrics are affected by only a subset of the parameters.
• This can help us achieve faster convergence.
Editor's Notes
Hello everyone, I work in the Communications AI team at LinkedIn. Today I am going to be talking about how we have applied online tuning to several recommendation system problems at LinkedIn.
The running example is going to be Notifications.
We take a brief detour into a GP primer, which will set the stage to discuss the model.
Towards the end we will discuss other use-cases that are using this technique at LinkedIn.
Any recommendation system problem has a bunch of metrics we care about. Balancing multiple metrics is a core problem.
NOTIFICATIONS - maximize sessions, minimize disables.
PYMK - recommend members so you are delighted to establish a connection and have meaningful conversations, increasing overall engagement on the platform.
Multiply S1 and S2 by certain weights w1 and w2.
The weights here are the hyper-parameters, and we may want to learn them to achieve certain metrics.
We will look at the notifications ecosystem. This will help us understand the optimization problem we would like to solve online.
LinkedIn's mission is to connect the world's professionals and make them more productive and successful. As is the case with most social media platforms, our users access LinkedIn through the mobile app. On the app, they stay updated about the activities of their professional connections through products like the newsfeed.
One way to make people aware of new content created on the platform is through notifications.
For instance: there is a shared article about the Lyft cofounder talking about his vision of a driverless future.
If I am someone who is interested in the self-driving space, this can be a good notification candidate for me.
Such notifications fall in the category of activity-based notifications (because they are triggered by user activity).
Among those activity events, we realized that some of them are so important that our users cannot miss them and should be informed in a timely manner. For example, my manager was mentioned in the news, or a close co-worker just posted or shared an interesting article; I would like to be informed on time so that I can join the conversation about the post. On the other hand, there are some less important events or time-insensitive content, for example recommended courses or people you may know at LinkedIn. These are not as important and need not be delivered in real time.
This slide shows how a social activity becomes a notification. An actor takes an action such as a comment/like/post/repost on the newsfeed. Then, guided by a relevance model, we fan out to the recipients to be informed and send either a push or an in-app notification. Either one generates a badge update on top of the app icon. On seeing the pop-up message or badge update, the recipient taps the notification tab and checks what is in their notifications. The user may then respond, depending on the quality of the notification.
The system diagram looks like the following.
Single utility models: pClick and pVisit.
This means online metrics CTR and Sends are functions of (alpha, T)
Given a bunch of training data (x1, ..., xn, y1, ..., yn) and a test point xt, predict yt.
Any subset of points follows a multivariate normal distribution.
This is non-parametric. Here K* is the similarity between the test and training points.
The next step is to collect data; this is getting training data to fit the function.
Learning the kernel parameters is done by maximizing the log likelihood of the data.
Once we know the kernel parameters theta, we can write the posterior on the test points. We just saw this has a closed form.
We can sample from this function to find the x's that satisfy the problem. When you take multiple samples you get an empirical distribution over x.
You can use the new distribution over x as new input points to collect data on.
The dotted line is the ground truth.
In the beginning we know nothing about the ground truth,
but as we collect data our prediction gets closer to it.
Let's look at the online part first:
for a given user we fetch the right parameters from the datastore and use them in online scoring to make decisions.
As the user is exposed to traffic, tracking events are generated, e.g. certain content was impressed, clicked, or liked.
This is a very generic technique to fit non-linear functions.
On the left we are looking at the training data (12 points).
On the right we have the mean of the posterior (1000 points).
We are able to fit the pattern very well just by looking at a few data points.
On the left we have a plot of the CTR metric. The red line shows the real data and the blue lines show 10 samples from the posterior.
The black line indicates the control metric, meaning everything above the line is feasible.
These are utility models but the true metrics we care about are Accepts, Invites, Engage.
The user specifies the treatment model that needs tuning, and the control model.
Spending time doing A/B tests isn't productive.
Using online tuning certainly helps.
This approach does rely on the search range.
For example, metrics for a certain cohort may only depend on the parameters of that cohort.