This document summarizes some of the key topics and presentations from the Recsys 2018 conference. It discusses the growing popularity of deep learning and reinforcement learning in recommender systems. It provides an overview of Netflix's use of reinforcement learning for artwork recommendations. It also summarizes several papers presented at the conference, including ones on calibrated recommendations, reciprocal recommenders, the Recsys challenge on playlist continuation, and evaluating metrics for top-N recommendations. Finally, it discusses some mixed methods approaches and tutorials presented at the conference.
2. Recsys Overview
- Deep learning is “omnipresent” now (no more specialized workshop or DL-specific track)
- Reinforcement Learning gaining popularity (industry mostly)
- User-centric papers (calibration, diversity…)
- Evaluation and Metrics
- Recsys Challenge (Spotify) - LTR
- Tutorials (material & slides here)
- Mixed Methods (Spotify)
- Sequence-aware RS (Politecnico di Milano + Pandora)
- OpenRec (open-source and modular library for NN algorithms)
- Deep Learning (Flipkart)
- Emotions and Personality in Recommender Systems
- Many authors are making their code available
3. Reinforcement Learning
Traditionally used in robotics, games (AlphaGo), self-driving cars...
Why RL in RS? RS have 2 competing goals:
● Recommend items with the highest user predicted engagement (exploit)
● Recommend items with uncertain predicted user engagement to gather more
information (explore)
=> Traditional RS focus on exploiting only.
Exploration is important in settings with new users, new items and dynamic user
preferences.
4. RS problem framed as a RL problem
RL: Sequential decision making problem
At step t, an agent must perform an action a(t) in an uncertain environment, which
presents the agent with a new reward r(t+1) and a new state s(t+1).
Example:
● action: recommend a product
● reward: click or no click (binary)
The goal of the agent is to learn a policy that picks the action maximising the
total reward collected. Common strategies: greedy, epsilon-greedy, upper confidence bound (UCB),
Thompson sampling...
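A minimal sketch of three of these strategies, framed as a bandit with the binary click reward from the example. It is not taken from the slides: the function names, the click-through rates, and the simulation loop are hypothetical, and a production recommender would also condition on user and item context.

```python
import math
import random

def epsilon_greedy(clicks, shows, epsilon, rng):
    """Explore a random item with probability epsilon, else exploit the best observed CTR."""
    if rng.random() < epsilon:
        return rng.randrange(len(shows))
    rates = [c / s if s else 0.0 for c, s in zip(clicks, shows)]
    return max(range(len(rates)), key=rates.__getitem__)

def ucb1(clicks, shows, rng):
    """Pick the item with the highest upper confidence bound (UCB1 rule)."""
    for i, s in enumerate(shows):
        if s == 0:
            return i  # show every item once before using the bound
    total = sum(shows)
    bounds = [c / s + math.sqrt(2 * math.log(total) / s)
              for c, s in zip(clicks, shows)]
    return max(range(len(bounds)), key=bounds.__getitem__)

def thompson(clicks, shows, rng):
    """Sample a CTR from each item's Beta posterior and pick the largest sample."""
    samples = [rng.betavariate(1 + c, 1 + s - c) for c, s in zip(clicks, shows)]
    return max(range(len(samples)), key=samples.__getitem__)

def simulate(policy, true_ctr, steps=5000, seed=0):
    """Run a policy against simulated users; return per-item clicks and impressions."""
    rng = random.Random(seed)
    clicks = [0] * len(true_ctr)
    shows = [0] * len(true_ctr)
    for _ in range(steps):
        item = policy(clicks, shows, rng)                   # action: recommend an item
        reward = 1 if rng.random() < true_ctr[item] else 0  # reward: click or no click
        shows[item] += 1
        clicks[item] += reward
    return clicks, shows

# Hypothetical click-through rates for three items.
ctr = [0.02, 0.05, 0.30]
clicks, shows = simulate(thompson, ctr)
print("impressions per item (Thompson sampling):", shows)
```

Epsilon-greedy takes an extra parameter, so it can be wrapped before simulating, e.g. `simulate(lambda c, s, r: epsilon_greedy(c, s, 0.1, r), ctr)`.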
5. Netflix: Artwork recommendation using RL
Personalise artwork of movie titles so users can better decide whether to watch
something or not
8. RL @ Recsys 2018
- REVEAL: workshop on Offline Evaluation for Recommender Systems
- BEARS (Insight, Athena): Evaluation framework to test bandit-based RS
- Netflix: Artwork recommendation using RL
- Pandora: Rank modules to show to users (i.e. “friday mood”, ...)
- Spotify: Jointly learn what items and explanations to show to users
- Criteo: Causal Embeddings (best paper)
- Deal with data confounded by the recommender by combining a large sample of biased
feedback data with a small sample of unbiased feedback data.
9. However...
Deep Reinforcement Learning Doesn't Work Yet (Feb 2018)
Reinforcement Learning never worked, and 'deep' only helped a bit (Feb 2018)
RL Researchers
11. Calibrated Recommendations (Steck)
- Nominated paper by Harald Steck (Netflix)
- RS trained on accuracy tend to focus on user’s main interests (unbalanced
recommendations)
- E.g. if a user’s items are 70% romance and 30% action → an uncalibrated algorithm
would recommend mostly romance items
- The work proposes a calibration metric and performs an evaluation on MovieLens
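A rough sketch of how such a calibration metric can be computed. It assumes the metric is a KL divergence between the genre distribution of the user's history (p) and that of the recommended list (q), with q smoothed towards p so the logarithm stays finite. The function names, the smoothing value alpha = 0.01, and the toy data are assumptions for illustration, not details from the slide.

```python
import math

def genre_distribution(items):
    """Average the per-item genre weights of a list of items.

    Each item is a dict mapping genre -> weight, with weights summing to 1.
    """
    totals = {}
    for genres in items:
        for genre, weight in genres.items():
            totals[genre] = totals.get(genre, 0.0) + weight
    return {genre: weight / len(items) for genre, weight in totals.items()}

def calibration_kl(p, q, alpha=0.01):
    """KL divergence of history distribution p from recommendation distribution q.

    q is smoothed as (1 - alpha) * q + alpha * p; 0 means perfectly calibrated.
    """
    total = 0.0
    for genre, p_g in p.items():
        if p_g == 0.0:
            continue  # genres absent from the history contribute nothing
        q_g = (1 - alpha) * q.get(genre, 0.0) + alpha * p_g
        total += p_g * math.log(p_g / q_g)
    return total

# Toy history: 7 romance titles and 3 action titles.
history = [{"romance": 1.0}] * 7 + [{"action": 1.0}] * 3
p = genre_distribution(history)

calibrated = genre_distribution([{"romance": 1.0}] * 7 + [{"action": 1.0}] * 3)
romance_only = genre_distribution([{"romance": 1.0}] * 10)

print("calibrated list:", calibration_kl(p, calibrated))
print("romance-only list:", calibration_kl(p, romance_only))
```

Lower is better: a list that mirrors the 70/30 split scores close to zero, while a romance-only list is penalised for dropping the user's action interest.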
15. Calibration vs Diversity
If we have 2 genres, romance and action, the most diverse list would contain 50%
romance and 50% action movies.
But that list does not consider the accuracy-diversity tradeoff for each user (some
users may not want diverse recommendations) -> calibration ~ personalized diversity
Extension taking into account diversity: introduce a new param beta that controls
the calibration-diversity tradeoff by mixing a diversity-promoting prior with the
calibration target.
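One plausible reading of this extension, sketched below, is a convex mix of the two distributions: beta times the diversity-promoting prior plus (1 - beta) times the user's calibration target. The exact form is an assumption here, as are the function name and the uniform prior.

```python
def blended_target(p_user, prior, beta):
    """Mix the user's genre distribution with a diversity-promoting prior.

    beta = 0 keeps pure calibration, beta = 1 ignores the user's history.
    """
    genres = set(p_user) | set(prior)
    return {
        genre: beta * prior.get(genre, 0.0) + (1 - beta) * p_user.get(genre, 0.0)
        for genre in genres
    }

# Calibration target from the running example, and a uniform prior over 2 genres.
p_user = {"romance": 0.7, "action": 0.3}
uniform_prior = {"romance": 0.5, "action": 0.5}

for beta in (0.0, 0.5, 1.0):
    print(beta, blended_target(p_user, uniform_prior, beta))
```

With a uniform prior, increasing beta moves the target from the user's 70/30 split towards the 50/50 list described as most diverse.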
17. Reciprocal Recommender Systems (Kleinerman et al)
Reciprocal -> online dating, jobs… (recommending people), marketplaces (items)
For each user receiving a recommendation, the system finds the optimal balance
of:
● Likelihood of the user accepting the recommendation
● Likelihood of the recommended user positively responding
Evaluation on an online dating site (app)
18. Reciprocal Recommender Systems (Kleinerman et al)
Approach based on combining:
● Collaborative filtering -> score for each user pair CF(x,y)
● AdaBoost classifier that predicts the probability that a user will respond to
another user based on features from the sender and the receiver (content +
popularity features) -> PR(y,x)
Classifier AUC = 0.83
Baseline = CF (on its own)
Learn a weight that balances CF and PR
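A small sketch of the balancing step, assuming the two scores are combined linearly with a single weight. The linear form, the function names, and the toy scores are assumptions; the slide only states that a weight balancing CF and PR is learned.

```python
def reciprocal_score(cf_xy, pr_yx, weight):
    """Blend x's predicted interest in y with y's predicted probability of replying."""
    return weight * cf_xy + (1 - weight) * pr_yx

def recommend(user, candidates, cf, pr, weight, top_n=3):
    """Rank candidates for `user` by the blended reciprocal score."""
    scored = [
        (reciprocal_score(cf[(user, c)], pr[(c, user)], weight), c)
        for c in candidates
    ]
    scored.sort(reverse=True)
    return [c for _, c in scored[:top_n]]

# Hypothetical scores: CF(x, y) and PR(y, x) for three candidates.
cf = {("x", "a"): 0.9, ("x", "b"): 0.6, ("x", "c"): 0.2}
pr = {("a", "x"): 0.1, ("b", "x"): 0.8, ("c", "x"): 0.9}

print(recommend("x", ["a", "b", "c"], cf, pr, weight=1.0))  # CF only (the baseline)
print(recommend("x", ["a", "b", "c"], cf, pr, weight=0.5))  # balanced
```

A weight of 1 reproduces the CF-only baseline; lowering it favours candidates who are likely to respond.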
20. Automatic Playlist Continuation
Given a playlist, what songs should be played after?
Input: A user-created playlist, represented by:
● Playlist metadata
● A list of the K tracks in the playlist, with K ∈ {0, 1, 5, 10, 25, 100}
Output: A list of 500 recommended tracks, ordered by relevance
21. Top 3 Teams:
1st place: Two-stage model:
● 1st: WRMF + CNN + user-user + item-item neighborhood models.
● 2nd: gradient boosting model used to re-rank the retrieved songs
2nd place: Multimodal CF that uses an autoencoder and a character-level CNN
3rd place: Two-stage model:
● 1st: LightFM
● 2nd: gradient boosting
22. Winning Team Model Architecture
Two-stage Model for Automatic Playlist Continuation at Scale [Volkovs et al]
23. Winning Team Model Architecture
First Stage: Linear weighted ensemble (blend) of the first-stage model scores
Second Stage: Learning to rank using gradient boosting trees (GBT)
● first stage scores (s_blend, s_wrmf…) + other engineered features
● pairwise ranking loss
● 150 trees; depth = 10 (XGBoost library)
Cold-start handled as a separate problem
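A simplified sketch of how the two stages connect, assuming the blend is a weighted sum of per-model scores. The weights, model names, and track ids are made up, and the gradient boosting re-ranker itself is left out; only the feature rows it would consume are built.

```python
def blend_scores(model_scores, weights):
    """First stage: weighted sum of each model's score per track."""
    blended = {}
    for model, scores in model_scores.items():
        for track, score in scores.items():
            blended[track] = blended.get(track, 0.0) + weights[model] * score
    return blended

def top_candidates(blended, k):
    """Keep the k highest-scoring tracks (ties broken by track id)."""
    return sorted(blended, key=lambda t: (-blended[t], t))[:k]

def second_stage_features(track, model_scores, blended):
    """Feature row for the re-ranker: blended score plus each model's own score."""
    row = {"s_blend": blended[track]}
    for model, scores in model_scores.items():
        row["s_" + model] = scores.get(track, 0.0)
    return row

# Hypothetical first-stage outputs for one playlist.
model_scores = {
    "wrmf": {"t1": 0.2, "t2": 0.9},
    "item_item": {"t1": 0.8, "t3": 0.5},
}
weights = {"wrmf": 0.3, "item_item": 0.7}

blended = blend_scores(model_scores, weights)
for track in top_candidates(blended, k=2):
    print(track, second_stage_features(track, model_scores, blended))
```

In the described setup these rows would be extended with other engineered features and passed to a gradient boosting model trained with a pairwise ranking loss.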
24. Main Findings
● Blending (1st stage) already produced high performance (recall of 90% at 20K
candidate songs, 60% in the top 1K).
● Neighbourhood based approaches outperformed CNN and WRMF (CNN was
the worst performing)
● Re-scaling similarity scores using inverse popularity [Verstrepen et al]
significantly improved the accuracy of the 2 neighbourhood-based approaches
● First few trees of GBM (2nd stage) already beat performance of blending (1st
stage) because GBM uses output of blender (scores of the different models)
as input
25. 2nd place: Multimodal Collaborative Filtering
Multimodal approach that uses:
1. an autoencoder using the playlist
and its categorical contents
2. a character-level CNN that only
uses the playlist title
MMCF: Multimodal Collaborative Filtering for
Automatic Playlist Continuation. [Yang et al.]
26. 2nd place: Multimodal Collaborative Filtering
Combine the two models using a linear combination of output vectors, weighted by w_item
and w_title
The autoencoder can capture the characteristics of a given playlist more precisely
as the number of input items increases → give more weight to w_item when the number
of items is higher
N([p; a_p]): number of items; I(T_p): importance of the playlist title
27. 3rd place: hybrid two-stage recommender
1st stage: LightFM used for generating candidates
2nd stage:
● LightFM features: score(p, t), and also b_p, b_t and <q_p, q_t> (p = playlist, t = track)
● Co-occurrence features related to playlist p and candidate track t
● e.g. number of playlists containing both track t_i and candidate track t, calculated for
each track t_i in the playlist and aggregated with mean, min, max, and median statistics
over the tracks in the playlist
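The co-occurrence feature described above can be sketched as follows. The function names and toy playlists are hypothetical, and the quadratic pair counting is only practical for small data.

```python
from statistics import mean, median

def cooccurrence_counts(playlists):
    """For every unordered track pair, count the playlists containing both tracks."""
    counts = {}
    for playlist in playlists:
        tracks = sorted(set(playlist))
        for i, a in enumerate(tracks):
            for b in tracks[i + 1:]:
                counts[(a, b)] = counts.get((a, b), 0) + 1
    return counts

def cooccurrence_features(playlist, candidate, counts):
    """Aggregate co-occurrence of the candidate with each track t_i in the playlist."""
    values = [counts.get(tuple(sorted((t, candidate))), 0) for t in playlist]
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "median": median(values),
    }

# Toy training playlists.
playlists = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "d"]]
counts = cooccurrence_counts(playlists)

print(cooccurrence_features(["a", "b", "d"], "c", counts))
```

An empty playlist (the K = 0 case) has no tracks to aggregate over, so this sketch would need a separate fallback there.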
A hybrid two-stage recommender system for automatic playlist continuation
29. On the Robustness and Discriminative Power
of IR Metrics for Top-N Recommendation (Valcarce et al)
● Studies the robustness and discriminative power of several ranking metrics
(originally used in IR) when applied to the top-N recommendation task.
● A desirable metric for recommendation should be robust to incompleteness in
the test set.
● Assess robustness by simulating sparsity and popularity bias in the test set
(removing at random or removing top popular) and recalculating the metrics.
● Compare the rankings of recommenders obtained on the original and the modified test
sets with Kendall’s correlation
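A sketch of the comparison step, assuming the simplest variant of Kendall's correlation (tau-a, which ignores ties). The variant choice, the system names, and the metric values are assumptions made for illustration.

```python
from itertools import combinations

def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two score dicts over the same set of systems."""
    concordant = discordant = 0
    for x, y in combinations(sorted(scores_a), 2):
        product = (scores_a[x] - scores_a[y]) * (scores_b[x] - scores_b[y])
        if product > 0:
            concordant += 1   # both test sets order the pair the same way
        elif product < 0:
            discordant += 1   # the two test sets disagree on the pair
    n = len(scores_a)
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical metric values for four recommenders.
full_test_set = {"A": 0.30, "B": 0.25, "C": 0.10, "D": 0.05}
sparse_test_set = {"A": 0.20, "B": 0.22, "C": 0.08, "D": 0.03}

print(kendall_tau(full_test_set, sparse_test_set))
```

A value near 1 means the metric ranks the recommenders in almost the same order after the test set is degraded, which is the sense of robustness studied here.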
30. On the Robustness and Discriminative Power
of IR Metrics for Top-N Recommendation (Valcarce et al)
3 datasets and 21 recommendation algorithms (All Items methodology)
● Precision is very robust to sparsity and popularity biases
● NDCG → high robustness to the sparsity bias and moderate robustness to
the popularity bias.
● MRR: performs poorly in RS evaluation.
Interesting approach to evaluate the robustness of our metrics for our own
datasets
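To try this kind of robustness check on our own data, the three metrics named above can be sketched with binary relevance as follows. The function names and cutoff handling are assumptions, and the paper's exact definitions may differ.

```python
import math

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k

def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: discounted gain normalised by the ideal ordering."""
    dcg = sum(1 / math.log2(pos + 2)
              for pos, item in enumerate(ranked[:k]) if item in relevant)
    ideal = sum(1 / math.log2(pos + 2) for pos in range(min(k, len(relevant))))
    return dcg / ideal if ideal else 0.0

def reciprocal_rank(ranked, relevant):
    """1 / position of the first relevant item (0 if none); averaged over users for MRR."""
    for pos, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1 / pos
    return 0.0

# Toy recommendation list and held-out relevant items for one user.
ranked = ["a", "b", "c", "d"]
relevant = {"b", "d"}

print(precision_at_k(ranked, relevant, 4))
print(ndcg_at_k(ranked, relevant, 4))
print(reciprocal_rank(ranked, relevant))
```

Sparsity can then be simulated by dropping items from `relevant` (at random, or the most popular ones) and recomputing each metric for every recommender.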
31. Judging Similarity: A user-centric study of related
item recommendations (Yao et al.)
Evaluate item similarity / Evaluate recommendation quality
32. Judging Similarity: A user-centric study of related
item recommendations (Yao et al.)
● User-centric evaluation of 6 related item algorithms: random,
content-based (tag genome), content-based (user reviews), SVD,
item2vec, ARM
● 700 participants (MovieLens users invited by email)
● 2 research questions:
○ Which related item algorithms best match user perceptions of relatedness and
recommendation quality?
○ How should related item algorithms be designed to improve the user experience?
● Survey and responses are publicly available
33. Stratified sampling strategy
● 100 source items sampled from the top 2500 most popular items
○ Easier for people to know about the movies
○ 2500 most popular account for 80% of user ratings
● Stratified -> split 2500 into 10 groups, pick 10 random movies from each
● For each algorithm, retrieve top 10 neighbours
● Top 10,000 items considered as target items
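The source-item sampling can be sketched as below, assuming the 10 groups are equal-sized bands of the popularity ranking. The equal band size, the function name, and the fixed seed are assumptions.

```python
import random

def stratified_sample(items_by_popularity, n_groups=10, per_group=10, seed=0):
    """Split a popularity-ordered list into equal bands and sample from each band."""
    rng = random.Random(seed)
    group_size = len(items_by_popularity) // n_groups
    sample = []
    for g in range(n_groups):
        band = items_by_popularity[g * group_size:(g + 1) * group_size]
        sample.extend(rng.sample(band, per_group))  # sampling without replacement
    return sample

# Stand-in for the 2500 most popular movies, ordered by popularity rank.
top_movies = list(range(2500))
source_items = stratified_sample(top_movies)

print(len(source_items), "source items, e.g.", source_items[:5])
```

Sampling per band keeps the less popular part of the top 2500 represented, which plain random sampling over ratings would tend to miss.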
34. High correlation between similarity and recommendation quality (0.80 Spearman
rank-order correlation)
CB approaches outperform CF-based ones in terms of user
expectations for similarity and recommendation quality
35. Results and Conclusions of the Study
- Item similarity from content-based approaches matches user expectations better
than item similarity from CF approaches
- Perceived recommendation quality is also better.
- Users said they want something in between “similar to their interests” and “not
obvious”
- Based on the user’s feedback the authors suggest that related item
recommendations should combine item similarity with other factors such as
diversity and serendipity.
36. Interpreting User Inaction (Zhao et al.)
Most work focuses on user’s interactions with items. This work focuses on
studying the lack of interaction / inaction through a live user survey on MovieLens.
Inaction doesn’t always mean negative feedback. E.g. the “explore later” type of
inaction is positive user feedback.
Research questions:
● What causes inaction?
● Can inaction reason be predicted?
● Can we improve recommender systems using an inaction model?
37. Interpreting User Inaction (Zhao et al.)
7 categories of inaction:
● Not Noticed / Lack of attention (38.6%)
● Not Now (18.2%)
● Already Watched (14.6%)
● Others Better Titles (9.5%)
● Explore Later / Need more info to make a decision (6.9%)
● Would Not Enjoy / Not matching user’s taste (5.8%)
● Already Decided To Watch (5.8%)
User’s lack of attention → UI design should try to optimize user attention
38. Interpreting User Inaction (Zhao et al.)
Used a multinomial regression model to predict the type / category of user
inaction.
Main Findings:
=> Reason for inaction is hard to predict → overall poor classification performance,
but some categories have high accuracy.
=> Predicted probability of “Not Now” could be used to decide when to “skip” showing a
recommendation and wait for a future session to show it.
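A sketch of how the “Not Now” probability could drive such a skip rule. It assumes a multinomial (softmax) model that already outputs one score per inaction category. The scores, the threshold of 0.5, and the function names are hypothetical, and only three of the seven categories are shown.

```python
import math

def softmax(scores):
    """Turn per-category scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {name: math.exp(value - peak) for name, value in scores.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}

def should_defer(scores, threshold=0.5):
    """Hold a recommendation back for a later session if "Not Now" is likely."""
    return softmax(scores)["Not Now"] >= threshold

# Hypothetical model scores for one recommendation the user did not act on.
scores = {"Not Now": 2.0, "Not Noticed": 0.0, "Would Not Enjoy": 0.0}

print(softmax(scores))
print("defer to a future session:", should_defer(scores))
```

Given the poor overall classification performance reported, the threshold would need tuning on held-out data before relying on a rule like this.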
41. Mixed Methods (Discover Weekly @ Spotify)
Qualitative:
How to do interviews and surveys (best practices)
E.g. scales shouldn’t have numbers, should have words
Quantitative:
How to collect data: attention, interaction, task-success