This document summarizes a presentation given by Xavier Amatriain from Netflix on their recommendation system and personalization techniques. Netflix uses a variety of machine learning models like SVD, RBMs, and linear regression to make personalized recommendations. They also personalize other aspects of the user experience like rankings, genres, and similar item suggestions. Netflix collects massive amounts of user data from ratings, searches, and streaming to train these models. The goal is to provide high quality recommendations that are accurate, novel, diverse, and increase user engagement.
Deeper Things: How Netflix Leverages Deep Learning in Recommendations and Se... - Sudeep Das, Ph.D.
In this talk, we will provide an overview of Deep Learning methods applied to personalization and search at Netflix. We will set the stage by describing the unique challenges faced at Netflix in the areas of recommendations and information retrieval. Then we will delve into how we leverage a blend of traditional algorithms and emergent deep learning methods and new types of embeddings, especially hyperbolic space embeddings, to address these challenges.
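Hyperbolic (Poincaré-ball) embeddings are attractive for hierarchical catalog data because distances grow explosively toward the boundary of the ball, letting tree-like structures embed with low distortion. As a rough illustration of the geometry (not code from the talk), the Poincaré distance between two points inside the unit ball is:

```python
import math

def poincare_distance(u, v):
    """Hyperbolic distance between two points strictly inside the unit ball
    (Poincare ball model). Distances blow up near the boundary, which is
    what lets tree-like hierarchies embed with low distortion."""
    norm_u = sum(x * x for x in u)
    norm_v = sum(x * x for x in v)
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - norm_u) * (1.0 - norm_v)))
```

For example, the distance from the origin to (0.5, 0) is about 1.10, more than twice the Euclidean 0.5, and it diverges as either point approaches the boundary.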
A Multi-Armed Bandit Framework For Recommendations at Netflix - Jaya Kawale
In this talk, we present a general multi-armed bandit framework for recommendations on the Netflix homepage. We present two example case studies using MABs at Netflix - a) Artwork Personalization to recommend personalized visuals for each of our members for the different titles and b) Billboard recommendation to recommend the right title to be watched on the Billboard.
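As a minimal sketch of the bandit idea behind both case studies (illustrative only; the production systems, arm definitions, and reward signals are far richer), an epsilon-greedy policy over hypothetical artwork variants might look like:

```python
import random

def epsilon_greedy_select(counts, rewards, epsilon=0.1, rng=random):
    """Pick an arm (e.g. an artwork variant): with probability epsilon
    explore a random arm, otherwise exploit the best empirical mean reward."""
    arms = list(counts)
    if rng.random() < epsilon:
        return rng.choice(arms)
    return max(arms, key=lambda a: rewards[a] / counts[a] if counts[a] else 0.0)

def record(counts, rewards, arm, reward):
    """Log one impression and its observed reward (e.g. 1.0 for a play)."""
    counts[arm] += 1
    rewards[arm] += reward
```

The production systems described in the talk use more statistically efficient policies (e.g. Thompson sampling and contextual bandits), but the explore/exploit trade-off above is the common core.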
Talk with Yves Raimond at the GPU Tech Conference on March 28, 2018 in San Jose, CA.
Abstract:
In this talk, we will survey how Deep Learning methods can be applied to personalization and recommendations. We will cover why standard Deep Learning approaches don't perform better than typical collaborative filtering techniques. Then we will go over recently published research at the intersection of Deep Learning and recommender systems, looking at how they integrate new types of data, explore new models, or change the recommendation problem statement. We will also highlight some of the ways that neural networks are used at Netflix and how we can use GPUs to train recommender systems. Finally, we will highlight promising new directions in this space.
Past, Present & Future of Recommender Systems: An Industry Perspective - Justin Basilico
Slides from our talk at the RecSys 2016 conference in Boston, MA on 2016-09-18, giving our perspective on important areas for future work in recommender systems.
(Presented at the Deep Learning Re-Work SF Summit on 01/25/2018)
In this talk, we go through the traditional recommendation systems set-up, and show that deep learning approaches in that set-up don't bring a lot of extra value. We then focus on different ways to leverage these techniques, most of which rely on breaking away from that traditional set-up: providing additional data to your recommendation algorithm, modeling different facets of user/item interactions, and, most importantly, re-framing the recommendation problem itself. In particular, we show a few results obtained by casting the problem as a contextual sequence prediction task, and using it to model time (a very important dimension in most recommendation systems).
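A toy stand-in for the contextual sequence-prediction framing (the talk uses far richer models, e.g. recurrent networks; this is only to make the re-framing concrete) is a next-item table keyed on context plus the previous title:

```python
from collections import Counter, defaultdict

class ContextualNextItem:
    """Toy contextual sequence model: predict the next title from the previous
    one plus a context key (e.g. time-of-day). A deliberately simple stand-in
    for the sequence models discussed in the talk, not Netflix's code."""

    def __init__(self):
        self.table = defaultdict(Counter)

    def fit(self, sessions):
        # sessions: iterable of (context, [item1, item2, ...]) viewing sessions
        for context, items in sessions:
            for prev, nxt in zip(items, items[1:]):
                self.table[(context, prev)][nxt] += 1

    def predict(self, context, prev):
        counts = self.table.get((context, prev))
        return counts.most_common(1)[0][0] if counts else None
```

Even this trivial model shows why the re-framing matters: the same previous title can lead to different predictions in different contexts.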
Personalizing "The Netflix Experience" with Deep Learning - Anoop Deoras
These are the slides from my talk presented at the AI Next Con conference in Seattle in Jan 2019. Here I talk in a bit more detail about the intuition behind collaborative filtering and go a bit deeper into the details of non-linear deep-learned models.
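For readers new to the collaborative-filtering intuition mentioned here, a bare-bones matrix factorization trained by SGD (an illustrative sketch with made-up ratings, not the talk's models) captures the core idea that the deeper, non-linear models generalize:

```python
import random

def factorize(ratings, n_users, n_items, k=2, lr=0.05, reg=0.02, epochs=200, seed=0):
    """Plain matrix factorization by SGD: approximate r_ui ~ p_u . q_i.
    ratings is a list of (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = sum(P[u][f] * Q[i][f] for f in range(k))
            err = r - pred
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)   # regularized SGD step
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q
```

Non-linear deep models replace the inner product `p_u . q_i` with a learned function of the user and item representations, but the latent-factor intuition is the same.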
Presentation at the Netflix Expo session at RecSys 2020 virtual conference on 2020-09-24. It provides an overview of recommendation and personalization at Netflix and then highlights some of the things we’ve been working on as well as some important open research questions in the field of recommendations.
Tutorial on Deep Learning in Recommender Systems, LARS Summer School 2019 - Anoop Deoras
I had a fun time giving a tutorial on the topic of deep learning in recommender systems at the Latin America School on Recommender Systems (LARS) in Fortaleza, Brazil.
Déjà Vu: The Importance of Time and Causality in Recommender Systems - Justin Basilico
Talk at RecSys 2017 in Como, Italy on 2017-08-29.
Abstract:
Time plays a key role in recommendation. Handling it properly is especially critical when using recommender systems in real-world applications, which may not be as clear when doing research with historical data. In this talk, we will discuss some of the important challenges of handling time in recommendation algorithms at Netflix. We will focus on challenges related to how our users, items, and systems all change over time. We will then discuss some strategies for tackling these challenges, which revolve around proper treatment of causality in our systems.
A Human Perspective on Algorithmic Similarity (RecSys 2020, September 2020) - Zachary Schendel
In the Netflix user interface (UI), when a row or UI element is named “Because you Watched...”, “More Like This”, or “Because you added to your list”, the overarching goal is to recommend a movie or TV show that a member might like based on the fact that they took a meaningful action on a source item. We have employed similar recommendations in many UI elements: on the homepage as a row of recommendations, after you click into a title, or as a piece of information about why a member should watch a title.
From an algorithmic perspective, there are many ways to define a “successful” similar recommendation. We sought to broaden that definition of success. To this end, the Consumer Insights team recently completed a suite of research projects to explore the intricacies of member perceptions of similar recommendations. The Netflix Consumer Insights team employs qualitative (e.g., in-depth interviews) and quantitative (e.g., surveys) research methods, interfacing directly with Netflix members to uncover pain points that can inspire new product innovation. The research concluded that, while the typical member believes movies are broadly similar when they share a common genre or theme, similarity is more complex, nuanced, and personal than we might have imagined. The vernacular we use in the UI implies that there should be at least some kind of relationship between the source item and the recommendations that follow. Many of our similar recommendations felt “out of place”, mostly because the relationship between the source item and the recommendation was unclear or absent. When similar recommendations tell a completely misleading, incorrect, or confusing story, member trust can be broken.
We will structure the presentation around three new insights that our research found to have an influence on the perception of similarity in the context of Netflix as well as the research methods used to uncover those insights. First, the reason a member loves a given movie will vary. For example, do you want to watch other baseball movies like Field of Dreams, or would you prefer other romances like Field of Dreams? Second, members are more or less flexible about how similar a recommendation actually needs to be depending on the properties of and their interactions with the canvas containing the recommendation. For example, a Because You Watched row on the homepage implies vaguer similarity while a More Like This gallery behind a click into the source item implies stricter similarity. Finally, even when we held the UI element constant, we found that similar recommendations are only valuable in some contexts. After finishing a movie, a member might prefer a similar recommendation one day and a change of pace the next. Research methods discussed will include Inverse Multi-Dimensional Scaling [1], survey experimentation, and ways to apply qualitative research to improve algorithmic recommendations.
At Netflix, we take the context of the member seriously.
In this keynote talk, we will see how modeling contextual factors such as time or device can help members find the right content at the right moment.
In the end, the goal is to maximize member satisfaction and retention.
These slides go through which contextual factors matter for the video service and why we chose to use them or not.
At Netflix, we try to provide the best personalized video recommendations to our members. To do this, we need to adapt our recommendations for each contextual situation, which depends on information such as time or device. In this talk, I will describe how state of the art Contextual Recommendations are used at Netflix. A first example of contextual adaptation is the model that powers the Continue Watching row. It uses a feature-based approach with a carefully constructed training set to learn how to adapt to the context of the member. Next, I will dive into more modern approaches such as Tensor Factorization and LSTMs and share some results from deployments of these methods. I will highlight lessons learned and some common pitfalls of using these powerful methods in industrial scale systems. Finally, I will touch upon system reliability, choice of optimization metrics, hidden costs, risks and benefits of using highly adaptive systems.
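To make the tensor-factorization idea concrete (a toy CP-style sketch with hypothetical data, not the deployed model), a (user, item, context) triple can be scored as the inner product of three factor vectors, trained by SGD just like plain matrix factorization:

```python
import random

def tensor_factorize(obs, n_users, n_items, n_ctx, k=2, lr=0.05, reg=0.02,
                     epochs=400, seed=0):
    """CP-style tensor factorization: score(u, i, c) ~ sum_f U[u][f]*V[i][f]*C[c][f].
    obs is a list of (user, item, context, rating) tuples."""
    rng = random.Random(seed)
    U = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_items)]
    C = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_ctx)]
    for _ in range(epochs):
        for u, i, c, r in obs:
            pred = sum(U[u][f] * V[i][f] * C[c][f] for f in range(k))
            err = r - pred
            for f in range(k):
                uf, vf, cf = U[u][f], V[i][f], C[c][f]
                U[u][f] += lr * (err * vf * cf - reg * uf)
                V[i][f] += lr * (err * uf * cf - reg * vf)
                C[c][f] += lr * (err * uf * vf - reg * cf)
    return U, V, C
```

The context factor vector lets the same user/item pair receive different scores in different contexts, which is exactly the adaptation the talk describes (e.g. morning vs. evening, TV vs. phone).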
Personalized Page Generation for Browsing Recommendations - Justin Basilico
Talk from First Workshop on Recommendation Systems for TV and Online Video at RecSys 2014 in Foster City, CA on 2014-10-10 about how we personalize the layout of the Netflix homepage to make it easier for people to browse the recommendations to quickly find something to watch and enjoy.
Crafting Recommenders: the Shallow and the Deep of it! - Sudeep Das, Ph.D.
I present a brief review, and an outlook on the rapid changes happening in the field of recommendation engine research on the heels of the deep learning revolution!
Talk from QCon SF on 2018-11-05
For many years, the main goal of the Netflix personalized recommendation system has been to get the right titles in front of each of our members at the right time. With a catalog spanning thousands of titles and a diverse member base spanning over a hundred million accounts, recommending the titles that are just right for each member is crucial. But the job of recommendation does not end there. Why should you care about any particular title we recommend? What can we say about a new and unfamiliar title that will pique your interest? How do we convince you that a title is worth watching? Answering these questions is critical in helping our members discover great content, especially for unfamiliar titles. One way to do this is to consider the artwork or imagery we use to visually portray each title. If the artwork representing a title captures something compelling to you, then it acts as a gateway into that title and gives you some visual “evidence” for why the title might be good for you. Selecting good artwork is important because it may be the first time a member becomes aware of a title (and sometimes the only time), so it must speak to them in a meaningful way. In this talk, we will present an approach for personalizing the artwork we show for each title on the Netflix homepage. We will look at how to frame this as a machine learning problem using contextual multi-armed bandits in a recommendation system setting. We will also describe the algorithmic and system challenges involved in getting this type of approach for artwork personalization to succeed at Netflix scale. Finally, we will discuss some of the future opportunities that we see to expand and improve upon this approach.
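A heavily simplified sketch of the contextual-bandit framing (per-arm linear reward models with epsilon-greedy exploration; the arm names, features, and the far more involved production policy are all hypothetical here) looks like:

```python
import random

class ContextualEpsilonGreedy:
    """Contextual bandit sketch: one linear reward model per arm (e.g. per
    artwork variant), trained online, with epsilon-greedy exploration."""

    def __init__(self, arms, d, epsilon=0.1, lr=0.1, seed=0):
        self.w = {a: [0.0] * d for a in arms}  # per-arm weight vectors
        self.epsilon = epsilon
        self.lr = lr
        self.rng = random.Random(seed)

    def score(self, arm, x):
        return sum(wi * xi for wi, xi in zip(self.w[arm], x))

    def select(self, x):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.w))
        return max(self.w, key=lambda a: self.score(a, x))

    def update(self, arm, x, reward):
        # One SGD step toward the observed reward for the shown arm.
        err = reward - self.score(arm, x)
        self.w[arm] = [wi + self.lr * err * xi
                       for wi, xi in zip(self.w[arm], x)]
```

The context vector `x` would encode member and situational features, so different members (or the same member in different situations) can be shown different artwork for the same title.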
Shallow and Deep Latent Models for Recommender Systems - Anoop Deoras
In this presentation, we survey latent models, starting with shallow and progressing towards deep, as applied to personalization and recommendations. After providing an overview of the Netflix recommender system, we discuss research at the intersection of deep learning, natural language processing and recommender systems and how they relate to traditional collaborative filtering techniques. We will present case studies in the space of deep latent variable models applied to recommender systems.
Personalization at Netflix - Making Stories Travel - Sudeep Das, Ph.D.
I give a high level overview of how personalization at Netflix helps our members find titles that spark joy, as well as help stories travel across the world.
Recommendation systems today are widely used across many applications such as multimedia content platforms, social networks, and ecommerce, to provide suggestions to users that are most likely to fulfill their needs, thereby improving the user experience. Academic research, to date, largely focuses on the performance of recommendation models in terms of ranking quality or accuracy measures, which often don’t directly translate into improvements in the real world. In this talk, we present some of the most interesting challenges that we face in the personalization efforts at Netflix. The goal of this talk is to shine a light on challenging research problems in industrial recommendation systems and start a conversation about exciting areas of future research.
Artwork Personalization at Netflix (RecSys 2018) - Fernando Amat
For many years, the main goal of the Netflix personalized recommendation system has been to get the right titles in front of our members at the right time. But the job of recommendation does not end there. The homepage should be able to convey to the member enough evidence of why a title may be good for her, especially for shows that the member has never heard of. One way to address this challenge is to personalize the way we portray the titles on our service. An important aspect of how to portray titles is through the artwork or imagery we display to visually represent each title. The artwork may highlight an actor that you recognize, capture an exciting moment like a car chase, or contain a dramatic scene that conveys the essence of a movie or show. It is important to select good artwork because it may be the first time a member becomes aware of a title (and sometimes the only time), so it must speak to them in a meaningful way. In this talk, we will present an approach for personalizing the artwork we use on the Netflix homepage. The system selects an image for each member and video to give better visual evidence for why the title might be appealing to that particular member.
Building Large-scale Real-world Recommender Systems - RecSys 2012 Tutorial - Xavier Amatriain
There is more to recommendation algorithms than rating prediction. And, there is more to recommender systems than algorithms. In this tutorial, given at the 2012 ACM Recommender Systems Conference in Dublin, I review things such as different interaction and user feedback mechanisms, offline experimentation and AB testing, or software architectures for Recommender Systems.
In the world of recommendation systems, there are various theories and algorithms that work together to give the best results. Among these, the core recommendation algorithm is crucial. This paper will provide an introduction to some fundamental algorithms used in recommendation systems. These algorithms are like building blocks that help make recommendations more effective.
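As one example of such a building block (a generic illustration, not taken from this paper), item-based collaborative filtering recommends the item whose rating vector is most similar to a seed item:

```python
import math

def cosine(a, b):
    """Cosine similarity between two rating vectors (0.0 where a user skipped)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def most_similar(item, item_vectors):
    """Item-based CF building block: among the other items, return the one
    whose rating vector is most similar to the seed item's."""
    return max((other for other in item_vectors if other != item),
               key=lambda other: cosine(item_vectors[item], item_vectors[other]))
```

Here each vector holds one rating per user; items rated highly by the same users end up close in cosine similarity, which is the basis of "more like this" style recommendations.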
Facets and Pivoting for Flexible and Usable Linked Data Exploration - Roberto García
The success of Open Data initiatives has increased the amount of data available on the Web. Unfortunately, most of this data is only available in raw tabular form, which makes analysis and reuse quite difficult for non-experts. Linked Data principles allow for a more sophisticated approach by making explicit both the structure and semantics of the data. However, from the end-user viewpoint, these datasets continue to be monolithic files that are completely opaque, or that can only be explored through tedious semantic queries. Our objective is to help the user grasp what kinds of entities are in the dataset, how they are interrelated, which are their main properties and values, and so on. Rhizomer is a tool for data publishing whose interface provides a set of components borrowed from Information Architecture (IA) that facilitate awareness of the dataset at hand. It automatically generates navigation menus and facets based on the kinds of things in the dataset and how they are described through metadata properties and values. Moreover, motivated by recent tests with end-users, it also provides the possibility to pivot among the faceted views created for each class of resources in the dataset.
Understanding Content Using Deep Learning for NLP - Jaya Kawale
Tubi is an ad-supported video-on-demand service that allows its users to watch content online. For much of the content, there is a large amount of textual data in the form of user reviews, synopses, title plots, and even Wikipedia articles. Furthermore, there is a large amount of metadata in the form of actors, ratings, year of release, studio, etc. In this talk, I will present some of the challenges in understanding this data and present our platform for content understanding.
Efficient Filtering in Pub-Sub Systems Using BDD - Nabeel Yoosuf
Slides prepared based on the paper "Efficient Filtering in Publish-Subscribe Systems using BDD" by Alexis Campailla, Sagar Chaki, Edmund Clarke, Somesh Jha, and Helmut Veith.
[SOCRS2013] Differential Context Modeling in Collaborative Filtering - Yong Zheng
Abstract: Context-aware recommender systems (CARS) try to adapt their recommendations to users’ specific contextual situations. In many recommender systems, particularly those based on collaborative filtering (CF), the additional contextual constraints may lead to increased sparsity in the user preference data, thus fewer matches between the current user context and previous situations. Our earlier work proposed two approaches to deal with this problem – differential context relaxation (DCR) and differential context weighting (DCW) – and we have successfully examined them using user-based collaborative filtering (UBCF). In this paper, we put DCR and DCW into one framework called differential context modeling (DCM). As a general framework, DCM can be applied to recommendation algorithms other than UBCF. We expand the application of DCM to two other CF approaches: item-based CF and the slope one recommender. Predictive performance is evaluated on two real-world data sets, and experimental results demonstrate that applying DCM to those two algorithms improves predictive accuracy compared with our baselines: context-free CF algorithms and contextual pre-filtering algorithms.
Data/AI driven product development: from video streaming to telehealthXavier Amatriain
Healthcare is different from any other application domain, or is it not? While it is true that there are specific aspects, such as high-stakes decisions and a complex regulatory framework, that make healthcare somewhat different, it is also the case that many of the lessons learned from building data-driven products in other domains translate remarkably well into healthcare. This is particularly so because healthcare is also a user-facing domain, where users can be patients or healthcare professionals. Given that data has been shown to improve user experience while ensuring quality and scalability, few would argue that healthcare cannot benefit from being much more data-driven than it has traditionally been.
In this talk, I describe how decades of experience building impactful data and AI solutions into user-facing products can be leveraged to revolutionize telehealth. At Curai, we combine approaches such as state-of-the-art large language models with expert systems in areas such as NLP, vision, and automated diagnosis to augment and scale doctors, and to improve user experience and healthcare outcomes. We will see some of those applications while analyzing the role of data and ML algorithms in making them possible.
AI-driven product innovation: from Recommender Systems to COVID-19Xavier Amatriain
AI/Machine Learning has become an integral part of many household tech products, from Netflix to our phones. In this talk I will draw from my experience driving AI teams at some of those companies to showcase how AI can positively impact products as different as Netflix and Curai, an online telehealth service.
With half of the world’s population lacking access to healthcare services, and 30% of the adult population in the US having health insurance coverage inadequate for even basic access to services, it should have been clear that a pandemic like COVID-19 would strain the global healthcare system well beyond its maximum capacity. In this context, many are trying to embrace and encourage the use of telehealth as a way to provide safe and convenient access to care. However, telehealth by itself cannot scale to cover all our needs unless we improve scalability and efficiency through AI and automation.
In this talk, we will describe how our work on combining the latest AI advances with medical experts and online access has the potential to change the landscape of healthcare access and provide 24/7 quality healthcare. By combining areas such as NLP, vision, and automatic diagnosis, we can augment and scale doctors. We will describe our work on combining expert systems with deep learning to build state-of-the-art medical diagnostic models that are also able to model the unknowns. We will also show our work on using language models for medical Q&A. More importantly, we will describe how those approaches have been used to address the urgent and immediate needs of the current pandemic.
AI for COVID-19: An online virtual care approachXavier Amatriain
Slides for the talk I gave at the AI and COVID-19 virtual conference at Stanford. Video here: https://hai.stanford.edu/events/covid-19-and-ai-virtual-conference/video-archive
From one to zero: Going smaller as a growth strategyXavier Amatriain
This talk was designed for engineering managers. Having been at companies of all sizes, I recommend that managers who want to grow go smaller. At the same time, I reflect on which important things remain constant regardless of size and context, and which don't.
Deep learning has accomplished impressive feats in areas such as voice recognition, image processing, and natural language processing. Deep learning enthusiasts have rushed to predict that this family of algorithms is likely to take over most other applications in the near future. This focus on deep architectures seems to have cast a shadow over more “traditional” machine learning and data science approaches, leaving researchers and practitioners alike wondering whether there is any point in investing in feature engineering or simpler models.
In this talk, I will go over what deep learning can and cannot do for you, both now and in the near future. I will also describe how different approaches will continue to be needed, and why their demand will likely grow despite the rise of deep learning. I will support my claims not only by looking at recent publications, but also by using practical examples drawn from my experience at companies at the forefront of machine learning applications, such as Quora.
Past, present, and future of Recommender Systems: an industry perspectiveXavier Amatriain
Keynote for the ACM Intelligent User Interface conference in 2016 in Sonoma, CA. I start with the past by talking about the Recommender Problem, and the Netflix Prize. Then I go into the Present and the Future by talking about approaches that go beyond rating prediction and ranking and by finishing with some of the most important lessons learned over the years. Throughout my talk I put special emphasis on the relation between algorithms and the User Interface.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, companies that fail to adapt and embrace new ideas often struggle to keep up with the competition. However, fostering a culture of innovation takes a great deal of work. It takes vision, leadership, and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud, and open source: exploring how these areas are likely to mature and develop over the short and long term, and considering how organisations can position themselves to adapt and thrive.
DevOps and Testing slides at DASA ConnectKari Kakkonen
Slides by Rik Marselis and me from the DASA Connect conference on 30 May 2024. We discuss what testing is, what agile testing is, and finally what testing in DevOps looks like. We also held a lovely workshop in which the participants explored different ways to think about quality and testing in different parts of the DevOps infinity loop.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Search and Society: Reimagining Information Access for Radical FuturesBhaskar Mitra
The field of information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build, inspired by diverse, explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies need to be explicitly articulated, and we need to develop theories of change in the context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
1. Netflix Recommendations: Beyond the 5 Stars
ACM SF-Bay Area, October 22, 2012
Xavier Amatriain
Personalization Science and Engineering, Netflix
@xamat
2. Outline
1. The Netflix Prize & the Recommendation Problem
2. Anatomy of Netflix Personalization
3. Data & Models
4. And…
a) Consumer (Data) Science
b) Or Software Architectures
4. SVD / RBM
What we were interested in:
§ High quality recommendations
Proxy question:
§ Accuracy in predicted rating
§ Improve by 10% = $1 million!
Results:
§ Top 2 algorithms (SVD and RBM) still in production
5. What about the final prize ensembles?
§ Our offline studies showed they were too computationally intensive to scale
§ Expected improvement not worth the engineering effort
§ Plus… focus had already shifted to other issues that had more impact than rating prediction.
14. Genre rows
§ Personalized genre rows focus on user interest
§ Also provide context and “evidence”
§ Important for member satisfaction - moving personalized rows to the top on devices increased retention
§ How are they generated?
§ Implicit: based on user’s recent plays, ratings, & other interactions
§ Explicit taste preferences
§ Hybrid: combine the above
§ Also take into account:
§ Freshness - has this been shown before?
§ Diversity - avoid repeating tags and genres, limit number of TV genres, etc.
21. Similars
§ Displayed in many different contexts
§ In response to user actions/context (search, queue add…)
§ “More like…” rows
22. Anatomy of a Personalization - Recap
§ Everything is a recommendation: not only rating prediction, but also ranking, row selection, similarity…
§ We strive to make it easy for the user, but…
§ We want the user to be aware of and involved in the recommendation process
§ Deal with implicit/explicit and hybrid feedback
§ Add support/explanations for recommendations
§ Consider issues such as diversity or freshness
24. Big Data @Netflix
§ Almost 30M subscribers
§ Ratings: 4M/day
§ Searches: 3M/day
§ Plays: 30M/day
§ 2B hours streamed in Q4 2011
§ 1B hours in June 2012
25. Smart Models
§ Logistic/linear regression
§ Elastic nets
§ SVD and other MF models
§ Restricted Boltzmann Machines
§ Markov Chains
§ Different clustering approaches
§ LDA
§ Association Rules
§ Gradient Boosted Decision Trees
§ …
26. SVD
X[m x n] = U[m x r] S[r x r] (V[n x r])^T
§ X: m x n matrix (e.g., m users, n videos)
§ U: m x r matrix (m users, r concepts)
§ S: r x r diagonal matrix (strength of each ‘concept’; r: rank of the matrix)
§ V: n x r matrix (n videos, r concepts), so V^T is r x n
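The decomposition on this slide can be sketched with NumPy; the ratings matrix below is a made-up toy example, and keeping only the top-r singular values gives the low-rank "concepts" approximation:

```python
import numpy as np

# Hypothetical toy ratings matrix: 4 users x 3 videos (values are invented).
X = np.array([[5.0, 3.0, 0.0],
              [4.0, 0.0, 1.0],
              [1.0, 1.0, 5.0],
              [0.0, 1.0, 4.0]])

# Full SVD: X = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Keep only the top-r "concepts" for a low-rank approximation.
r = 2
X_approx = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

# The rank-r approximation is optimal in the least-squares sense.
error = np.linalg.norm(X - X_approx)
print(f"Reconstruction error with r={r}: {error:.3f}")
```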
27. Simon Funk’s SVD
§ One of the most interesting findings during the Netflix Prize came out of a blog post
§ Incremental, iterative, and approximate way to compute the SVD using gradient descent
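A minimal sketch of the Funk-style approach: instead of computing a full SVD, learn user and item factor vectors by stochastic gradient descent over the observed ratings only. The data, learning rate, and regularization below are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, r = 4, 3, 2
# (user, item, rating) triples -- toy data
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (2, 2, 5.0), (3, 2, 4.0)]

P = 0.1 * rng.standard_normal((n_users, r))   # user factors
Q = 0.1 * rng.standard_normal((n_items, r))   # item factors
lr, reg = 0.01, 0.02

for epoch in range(500):
    for u, v, rating in ratings:
        err = rating - P[u] @ Q[v]            # prediction error on this rating
        pu = P[u].copy()                      # keep old value for the Q update
        P[u] += lr * (err * Q[v] - reg * P[u])
        Q[v] += lr * (err * pu - reg * Q[v])

print("Predicted rating for (user 0, item 0):", P[0] @ Q[0])
```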
28. SVD for Rating Prediction
§ User factor vectors pu ∈ ℝ^f and item factor vectors qv ∈ ℝ^f
§ Baseline buv = µ + bu + bv (user & item deviation from average)
§ Predict rating as r'uv = buv + pu^T qv
§ SVD++ (Koren et al.): asymmetric variation with implicit feedback
r'uv = buv + qv^T ( |R(u)|^(-1/2) Σ_{j∈R(u)} (ruj - buj) xj + |N(u)|^(-1/2) Σ_{j∈N(u)} yj )
§ Where:
§ qv, xv, yv ∈ ℝ^f are three item factor vectors
§ Users are not parametrized, but rather represented by:
§ R(u): items rated by user u
§ N(u): items for which the user has given implicit preference (e.g. rated vs. not rated)
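The baseline-plus-factors prediction r'uv = buv + pu^T qv can be sketched as follows. This shows the plain SVD predictor, not the full SVD++ variant, and all biases and factor values are invented for illustration:

```python
import numpy as np

mu = 3.6                      # global mean rating (invented)
b_u = {0: 0.3}                # user 0 rates 0.3 above average
b_v = {7: -0.2}               # item 7 is rated 0.2 below average

rng = np.random.default_rng(1)
f = 8                         # number of latent factors
p = {0: 0.1 * rng.standard_normal(f)}   # user factor vector p_u
q = {7: 0.1 * rng.standard_normal(f)}   # item factor vector q_v

def predict(u, v):
    # r'uv = (mu + b_u + b_v) + p_u . q_v
    baseline = mu + b_u[u] + b_v[v]
    return baseline + p[u] @ q[v]

print(predict(0, 7))
```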
29. Artificial Neural Networks - 4 generations
§ 1st - Perceptrons (~60s)
§ Single layer of hand-coded features
§ Linear activation function
§ Fundamentally limited in what they can learn to do.
§ 2nd - Back-propagation (~80s)
§ Back-propagate error signal to get derivatives for learning
§ Non-linear activation function
§ 3rd - Belief Networks (~90s)
§ Directed acyclic graph composed of (visible & hidden) stochastic variables with weighted connections.
§ Infer the states of the unobserved variables & learn interactions between variables to make the network more likely to generate the observed data.
30. Restricted Boltzmann Machines
§ Restrict the connectivity to make learning easier.
§ Only one layer of hidden units (although multiple layers are possible).
§ No connections between hidden units.
§ Hidden units are independent given the visible states.
§ So we can quickly get an unbiased sample from the posterior distribution over hidden “causes” when given a data-vector.
§ RBMs can be stacked to form Deep Belief Nets (DBN) - the 4th generation of ANNs
[Figure: bipartite graph with a visible layer (units i) fully connected to a hidden layer (units j), with no within-layer connections]
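The conditional-independence property on this slide is what makes inference in an RBM cheap: given a visible vector, every hidden unit's activation probability p(h_j = 1 | v) = sigmoid(b_j + Σ_i v_i W_ij) can be computed (and sampled) in a single matrix product. A minimal sketch with made-up weights:

```python
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 3
W = 0.1 * rng.standard_normal((n_visible, n_hidden))  # toy weight matrix
b_hidden = np.zeros(n_hidden)                          # hidden biases

v = np.array([1, 0, 1, 1, 0, 0])  # a binary data vector

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# One pass, no iteration: hidden units are independent given v.
p_hidden = sigmoid(b_hidden + v @ W)
h_sample = (rng.random(n_hidden) < p_hidden).astype(int)
print(p_hidden, h_sample)
```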
32. Ranking
Key algorithm; sorts titles in most contexts
33. Ranking
§ Ranking = Scoring + Sorting + Filtering bags of movies for presentation to a user
§ Goal: find the best possible ordering of a set of videos for a user within a specific context in real-time
§ Objective: maximize consumption
§ Aspirations: played & “enjoyed” titles have best score
§ Akin to CTR forecast for ads/search results
§ Factors: accuracy, novelty, diversity, freshness, scalability, …
34. Ranking
§ Popularity is the obvious baseline
§ Ratings prediction is a clear secondary data input that allows for personalization
§ We have added many other features (and tried many more that have not proved useful)
§ What about the weights?
§ Based on A/B testing
§ Machine-learned
35. Example: Two features, linear model
Linear model: frank(u,v) = w1 p(v) + w2 r(u,v) + b
[Figure: titles 1-5 plotted with Popularity on the x-axis and Predicted Rating on the y-axis; the linear model combines the two features into the final ranking]
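The two-feature linear model on this slide can be sketched as follows; the weights and candidate titles are invented for illustration:

```python
# frank(u, v) = w1 * popularity(v) + w2 * predicted_rating(u, v) + b
w1, w2, b = 0.4, 0.6, 0.0   # toy weights (in practice A/B-tested or learned)

# (title, popularity score, predicted rating for this user) -- toy values
candidates = [("A", 0.9, 3.1), ("B", 0.2, 4.8), ("C", 0.5, 4.0)]

def frank(popularity, predicted_rating):
    return w1 * popularity + w2 * predicted_rating + b

# Final ranking: sort candidates by the combined score, best first.
ranked = sorted(candidates, key=lambda t: frank(t[1], t[2]), reverse=True)
print([title for title, _, _ in ranked])
```

A highly rated but unpopular title ("B" above) can outrank a popular one once the rating weight dominates, which is the point of blending the two features.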
40. Learning to rank
§ Machine learning problem: the goal is to construct a ranking model from training data
§ Training data can have partial order or binary judgments (relevant/not relevant)
§ Resulting order of the items typically induced from a numerical score
§ Learning to rank is a key element for personalization
§ You can treat the problem as a standard supervised classification problem
41. Learning to Rank Approaches
1. Pointwise
§ Ranking function minimizes a loss function defined on individual relevance judgments
§ Ranking score based on regression or classification
§ Ordinal regression, Logistic regression, SVM, GBDT, …
2. Pairwise
§ Loss function is defined on pair-wise preferences
§ Goal: minimize the number of inversions in the ranking
§ The ranking problem is then transformed into a binary classification problem
§ RankSVM, RankBoost, RankNet, FRank…
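The pairwise objective, minimizing the number of inversions, can be illustrated by counting the pairs whose score order disagrees with their relevance order. The scores and labels below are toy values:

```python
from itertools import combinations

# Each item is (model_score, true_relevance); values are invented.
items = [(2.5, 3), (1.0, 2), (3.0, 1)]

def count_inversions(items):
    inv = 0
    for (s1, r1), (s2, r2) in combinations(items, 2):
        # An inversion: the relevance order and the score order disagree.
        if (r1 - r2) * (s1 - s2) < 0:
            inv += 1
    return inv

print(count_inversions(items))
```

Pairwise methods such as RankSVM or RankNet train a classifier to get each such pair the right way around, which drives this count toward zero.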
42. Learning to rank - metrics
§ Quality of ranking measured using metrics such as:
§ Normalized Discounted Cumulative Gain: NDCG = DCG / IDCG, where DCG = relevance_1 + Σ_{i=2}^{n} relevance_i / log2 i
§ Mean Reciprocal Rank: MRR = (1/|H|) Σ_{h∈H} 1/rank(h)
§ Fraction of Concordant Pairs: FCP = Σ_{i≠j} CP(x_i, x_j) / (n(n-1)/2)
§ Others…
§ But, it is hard to optimize machine-learned models directly on these measures (they are not differentiable)
§ Recent research on models that directly optimize ranking measures
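The metric definitions above can be sketched directly, following the slide's DCG formula (which discounts from position 2 by log2 i); the relevance lists and ranks are toy inputs:

```python
import math

def dcg(relevances):
    # DCG = rel_1 + sum_{i>=2} rel_i / log2(i); list is in ranked order.
    return relevances[0] + sum(r / math.log2(i)
                               for i, r in enumerate(relevances[1:], start=2))

def ndcg(relevances):
    # Normalize by the DCG of the ideal (descending-relevance) ordering.
    ideal = sorted(relevances, reverse=True)
    return dcg(relevances) / dcg(ideal)

def mrr(first_relevant_ranks):
    # MRR = (1/|H|) * sum of 1 / rank of the first relevant item per query.
    return sum(1.0 / r for r in first_relevant_ranks) / len(first_relevant_ranks)

print(ndcg([3, 2, 3, 0, 1]), mrr([1, 3, 2]))
```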
43. Learning to Rank Approaches
3. Listwise
a. Indirect Loss Function
§ RankCosine: similarity between ranking list and ground truth as loss function
§ ListNet: KL-divergence as loss function by defining a probability distribution
§ Problem: optimization of listwise loss function may not optimize IR metrics
b. Directly optimizing IR measures (difficult since they are not differentiable)
§ Directly optimize IR measures through Genetic Programming
§ Directly optimize measures with Simulated Annealing
§ Gradient descent on a smoothed version of the objective function (e.g. CLiMF presented at RecSys 2012 or TFMAP at SIGIR 2012)
§ SVM-MAP relaxes the MAP metric by adding it to the SVM constraints
§ AdaRank uses boosting to optimize NDCG
44. Similars
§ Different similarities computed from different sources: metadata, ratings, viewing data…
§ Similarities can be treated as data/features
§ Machine Learned models improve our concept of “similarity”
45. Data & Models - Recap
§ All sorts of feedback from the user can help generate better recommendations
§ Need to design systems that capture and take advantage of all this data
§ The right model is as important as the right data
§ It is important to come up with new theoretical models, but also need to think about application to a domain, and practical issues
§ Rating prediction models are only part of the solution to recommendation (think about ranking, similarity…)
46. More data or better models?
Really?
Anand Rajaraman: Stanford & Senior VP at Walmart Global eCommerce (former Kosmix)
47. More data or better models?
Sometimes, it’s not about more data
48. More data or better models?
[Banko and Brill, 2001]
Norvig: “Google does not have better Algorithms, only more Data”
Many features / low-bias models
49. More data or better models?
[Plot: model performance vs. sample size for an actual Netflix system; x-axis 0 to 6M training examples, y-axis 0 to 0.09]
Sometimes, it’s not about more data
50. More data or better models?
Data without a sound approach = noise
52. Consumer Science
§ Main goal is to effectively innovate for customers
§ Innovation goals
§ “If you want to increase your success rate, double your failure rate.” - Thomas Watson, Sr., founder of IBM
§ The only real failure is the failure to innovate
§ Fail cheaply
§ Know why you failed/succeeded
53. Consumer (Data) Science
1. Start with a hypothesis:
§ Algorithm/feature/design X will increase member engagement with our service, and ultimately member retention
2. Design a test
§ Develop a solution or prototype
§ Think about dependent & independent variables, control, significance…
3. Execute the test
4. Let data speak for itself
54. Offline/Online testing process
[Diagram: Offline testing (days) → on success → Online A/B testing (weeks to months) → on success → Rollout feature to all users; on failure, return to the start]
55. Offline testing
§ Optimize algorithms offline
§ Measure model performance, using metrics such as:
§ Mean Reciprocal Rank, Normalized Discounted Cumulative Gain, Fraction of Concordant Pairs, Precision/Recall & F-measures, AUC, RMSE, Diversity…
§ Offline performance used as an indication to make informed decisions on follow-up A/B tests
§ A critical (and unsolved) issue is how well offline metrics correlate with A/B test results
§ Extremely important to define a coherent offline evaluation framework (e.g. how to create training/testing datasets is not trivial)
56. Executing A/B tests
§ Many different metrics, but ultimately we trust user engagement (e.g. hours of play and customer retention)
§ Think about significance and hypothesis testing
§ Our tests usually have thousands of members and 2-20 cells
§ A/B tests allow you to try radical ideas or test many approaches at the same time
§ We typically have hundreds of customer A/B tests running
§ Decisions on the product are always data-driven
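Significance testing for an A/B cell can be sketched with a two-proportion z-test on, say, retention rates in control vs. treatment; the counts below are invented for illustration:

```python
import math

def two_proportion_z(success_a, n_a, success_b, n_b):
    # z-test for the difference between two proportions (pooled variance).
    p_a, p_b = success_a / n_a, success_b / n_b
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Toy example: 84.0% retention in control vs. 87.0% in the treatment cell.
z = two_proportion_z(success_a=840, n_a=1000, success_b=870, n_b=1000)
print(f"z = {z:.2f}")  # |z| > 1.96 would be significant at the 5% level
```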
57. What to measure
§ OEC: Overall Evaluation Criterion
§ In an A/B test framework, the measure of success is key
§ Short-term metrics do not always align with long-term goals
§ E.g. CTR: generating more clicks might mean that our recommendations are actually worse
§ Use long-term metrics such as LTV (lifetime value) whenever possible
§ At Netflix, we use member retention
58. What to measure
§ Short-term metrics can sometimes be informative, and may allow for faster decision-making
§ At Netflix we use many, such as hours streamed by users or % hours from a given algorithm
§ But be aware of several caveats of using early decision mechanisms
[Chart annotation: initial effects appear to trend]
See “Trustworthy Online Controlled Experiments: Five Puzzling Outcomes Explained” [Kohavi et al., KDD 2012]
59. Consumer Data Science - Recap
§ Consumer Data Science aims to innovate for the customer by running experiments and letting data speak
§ This is mainly done through online A/B testing
§ However, we can speed up innovation by experimenting offline
§ But, both for online and offline experimentation, it is important to choose the right metric and experimental framework
64. Event & Data Distribution
• UI devices should broadcast many different kinds of user events
• Clicks
• Presentations
• Browsing events
• …
• Events vs. data
• Some events only need to be propagated and trigger an action (low latency, low information per event)
• Others need to be processed and “turned into” data (higher latency, higher information quality)
• And… there are many in between
• Real-time event flow managed through an internal tool (Manhattan)
• Data flow mostly managed through Hadoop.
66. Offline Jobs
• Two kinds of offline jobs
• Model training
• Batch offline computation of recommendations/intermediate results
• Offline queries either in Hive or Pig
• Need a publishing mechanism that solves several issues
• Notify readers when the result of a query is ready
• Support different repositories (S3, Cassandra…)
• Handle errors, monitoring…
• We do this through Hermes
68. Computation
• Two ways of computing personalized results
• Batch/offline
• Online
• Each approach has pros/cons
• Offline
+ Allows more complex computations
+ Can use more data
- Cannot react to quick changes
- May result in staleness
• Online
+ Can respond quickly to events
+ Can use most recent data
- May fail because of SLA
- Cannot deal with “complex” computations
• It’s not an either/or decision
• Both approaches can be combined
70. Signals & Models
• Both offline and online algorithms are based on three different inputs:
• Models: previously trained from existing data
• (Offline) Data: previously processed and stored information
• Signals: fresh data obtained from live services
• User-related data
• Context data (session, date, time…)
72. Results
• Recommendations can be serviced from:
• Previously computed lists
• Online algorithms
• A combination of both
• The decision on where to service the recommendation from can respond to many factors, including context
• Also important to think about the fallbacks (what if plan A fails?)
• Previously computed lists/intermediate results can be stored in a variety of ways
• Cache
• Cassandra
• Relational DB
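The fallback idea above ("what if plan A fails") can be sketched as a chain of recommendation sources tried in order; every function here is a hypothetical stand-in, not Netflix's actual services:

```python
def online_recommendations(user_id):
    # Simulate the online path missing its SLA.
    raise TimeoutError("online service missed its SLA")

def precomputed_recommendations(user_id):
    # e.g. a previously computed list fetched from a cache or Cassandra.
    return ["title_42", "title_7", "title_19"]

def popularity_fallback():
    # Last resort: an unpersonalized popularity list.
    return ["top_1", "top_2", "top_3"]

def serve(user_id):
    for source in (online_recommendations, precomputed_recommendations):
        try:
            return source(user_id)
        except Exception:
            continue  # plan A failed; fall through to the next source
    return popularity_fallback()

print(serve(user_id=123))
```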
73. Alerts and Monitoring
§ A non-trivial concern in large-scale recommender systems
§ Monitoring: continuously observe the quality of the system
§ Alerting: fast notification if the quality of the system goes below a certain pre-defined threshold
§ Questions:
§ What do we need to monitor?
§ How do we know something is “bad enough” to alert on?
74. What to monitor
§ Staleness
§ Monitor time since last data update
[Chart annotation: “Did something go wrong here?”]
75. What to monitor
§ Algorithmic quality
§ Monitor different metrics by comparing what users do and what your algorithm predicted they would do
76. What to monitor
§ Algorithmic quality
§ Monitor different metrics by comparing what users do and what your algorithm predicted they would do
[Chart annotation: “Did something go wrong here?”]
77. What to monitor
§ Algorithmic source for users
§ Monitor how users interact with different algorithms
[Chart annotations: “Algorithm X”, “New version”, “Did something go wrong here?”]
78. When to alert
§ Alerting thresholds are hard to tune
§ Avoid unnecessary alerts (the “learn-to-ignore problem”)
§ Avoid important issues being noticed before the alert happens
§ Rules of thumb
§ Alert on anything that will impact user experience significantly
§ Alert on issues that are actionable
§ If a noticeable event happens without an alert… add a new alert for next time
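Combining the staleness monitor from slide 74 with a pre-defined alerting threshold might look like the sketch below; the 6-hour threshold is invented for illustration:

```python
import time

# Alert when the last data update is older than a pre-defined threshold.
STALENESS_THRESHOLD_SECONDS = 6 * 3600  # illustrative: alert if data is >6h old

def staleness_alert(last_update_ts, now=None):
    now = now if now is not None else time.time()
    age = now - last_update_ts
    return age > STALENESS_THRESHOLD_SECONDS

# Fixed "now" so the example is deterministic.
now = 1_000_000.0
print(staleness_alert(now - 7 * 3600, now=now))  # 7h old -> True (alert)
print(staleness_alert(now - 1 * 3600, now=now))  # 1h old -> False (fresh)
```

Tuning the threshold is the hard part the slide points at: too low and alerts get ignored, too high and users notice problems before the alert fires.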
80. The Personalization Problem
§ The Netflix Prize simplified the recommendation problem to predicting ratings
§ But…
§ User ratings are only one of the many data inputs we have
§ Rating predictions are only part of our solution
§ Other algorithms such as ranking or similarity are very important
§ We can reformulate the recommendation problem
§ Function to optimize: the probability that a user chooses something and enjoys it enough to come back to the service
81. More data +
Better models +
More accurate metrics +
Better approaches & architectures
Lots of room for improvement!