Recommender systems are software agents that analyze a user's preferences through transactions and provide personalized recommendations accordingly. There are several recommendation paradigms including non-personalized rules, personalized rules based on user data, and transaction-based collaborative filtering that learns from user interactions. Context-based recommender systems also consider additional information like time, location, or device to provide adaptive recommendations. Common techniques used in recommender systems include content-based filtering that recommends similar items, collaborative filtering that finds users with similar tastes, and demographic-based recommendations.
Recommender Systems represent one of the most widespread and impactful applications of predictive machine learning models.
Amazon, YouTube, Netflix, Facebook and many other companies generate a significant fraction of their revenue from their ability to model and accurately predict users' ratings and preferences.
In this presentation we cover the following points:
→ introduction to recommender systems
→ working with explicit vs implicit feedback
→ content-based vs collaborative filtering approaches
→ user-based and item-item methods
→ machine learning and deep learning models
→ pros & cons of the methods: scalability, accuracy, explainability
Overview of recommender (recommendation) systems: RFM concepts in brief; item- and user-based collaborative filtering; content-based recommendation; product-association recommender systems; stereotype recommendation with its advantages and limitations; customer lifetime; and the recommender-system analysis and solving cycle.
In this lecture, I will first cover recent advances in neural recommender systems, such as autoencoder-based and MLP-based recommender systems. Then, I will introduce recent achievements in automatic playlist continuation for music recommendation.
What really are recommendations engines nowadays?
This presentation introduces the foundations of recommendation algorithms and covers common approaches as well as some of the most advanced techniques. Although the focus is on efficiency rather than theoretical properties, basics of matrix algebra and optimization-based machine learning are used throughout the presentation.
Table of Contents:
1. Collaborative Filtering
1.1 User-User
1.2 Item-Item
1.3 User-Item
* Matrix Factorization
* Stochastic Gradient Descent (SGD)
* Truncated Singular Value Decomposition (SVD)
* Alternating Least Squares (ALS)
* Deep Learning
2. Content Extraction
* Item-Item Similarities
* Deep Content Extraction: NLP, CNN, LSTM
3. Hybrid Models
4. In Production
4.1 Problems
4.2 Solutions
4.3 Tools
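The "Matrix Factorization" and "Stochastic Gradient Descent (SGD)" entries above can be illustrated with a minimal sketch; the hyperparameters and toy ratings are invented for the example, not taken from any of the decks:

```python
import numpy as np

def mf_sgd(ratings, n_factors=8, lr=0.02, reg=0.02, epochs=500, seed=0):
    """Factorize a sparse (user, item, rating) list into user factors P and item factors Q."""
    rng = np.random.default_rng(seed)
    n_users = 1 + max(u for u, _, _ in ratings)
    n_items = 1 + max(i for _, i, _ in ratings)
    P = 0.1 * rng.standard_normal((n_users, n_factors))
    Q = 0.1 * rng.standard_normal((n_items, n_factors))
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]                   # error on this known rating
            P[u] += lr * (err * Q[i] - reg * P[u])  # gradient step with L2 regularization
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P, Q

# Toy ratings: (user index, item index, rating)
ratings = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 4.5), (1, 2, 2.0), (2, 1, 1.5)]
P, Q = mf_sgd(ratings)
predicted = P[0] @ Q[2]   # predicted rating of user 0 for unseen item 2
```

The missing entries of the rating matrix (like user 0's score for item 2) are filled by the dot product of the learned factor vectors.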
Recommender systems are software tools and techniques providing suggestions for items to be of interest to a user. Recommender systems have proved in recent years to be a valuable means of helping Web users by providing useful and effective recommendations or suggestions.
The goal of a recommender system is to predict the degree to which a user will like or dislike a set of items, such as movies or TV shows.
Most recommender systems use a combination of different approaches, but broadly speaking there are three different methods that can be used: Content analysis, Social recommendations and Collaborative filtering.
Recommendation systems, also known as recommendation engines, are a type of information system whose purpose is to suggest, or recommend items or actions to users.
The recommendations may consist of:
-> retail items (movies, books, etc.) or
-> actions, such as following other users in a social network.
It can be said that recommendation engines are essentially an automated form of the “shop counter guy”: you ask him for a product, and he shows you not only that product but also related ones you could buy. He is well trained in cross-selling and up-selling. So are our recommendation engines.
We have built an online movie recommender system based on the analysis of users' rating history for several movies and their demographic information. We used data from the MovieLens website. Collaborative filtering and matrix factorization techniques were used for the implementation. The end result is a web application where a user is recommended their top 20 movies.
Codebase: http://goo.gl/nM7RMy
Demo Video: http://goo.gl/VgZ2uI
Recommendations are everywhere: music, movies, books, social media, e-commerce websites… The Web is leaving the era of search and entering one of discovery. This quick introduction will help you understand this vast topic and why you should use it.
The Hive Think Tank: Machine Learning at Pinterest, by Jure Leskovec (The Hive)
Machine learning is at the core of Pinterest. Pinterest personalizes and ranks 1B+ pins, 700+ million boards for 100M+ users all over the world, using data gathered from collaborative filtering, user curation, web crawling, and more. At Pinterest we model relationships between pins, handle cold-start problems and deal with real-time recommendations.
In this presentation Jure gave an overview of the problems and effective solutions developed at Pinterest. He focused on the systems and engineering choices made to enable productive machine learning development and to let multiple engineers effectively develop, test, and deploy machine-learned models.
Modern Perspectives on Recommender Systems and their Applications in Mendeley (Kris Jack)
Presentation given for one of Pearson's Data Research teams. It motivates the use of recommender systems, describes common approaches to building and evaluating them and gives examples of how they are used in Mendeley. Thanks to Maya Hristakeva for creating some of the slides.
Recommender Systems meet Finance - A literature review (David Zibriczky)
The present work overviews the application of recommender systems in various financial domains. The relevant literature is investigated along two directions. First, a domain-based categorization is discussed, focusing on those recommendation problems where the existing literature is significant. Second, the application of various recommendation algorithms and data mining techniques is summarized. The purpose of this paper is to provide a basis for recommender-system and financial experts to work out further scientific contributions in this field.
EPG content recommendation in large scale: a case study on interactive TV platform (David Zibriczky) -- ICMLA 2013 - Machine Learning with Multimedia Data (7th December 2013, Miami, FL)
Personalized recommendation of linear content on interactive TV platforms (David Zibriczky) -- International Workshop on TV and multimedia personalization (July 16th 2012, Montreal, Canada)
About ImpressTV
• Netflix Prize (2006-2009)
• The Ensemble: Runner-up from 40K teams
• Gravity team: Members of The Ensemble
• Gravity R&D
› Hungarian start-up, launched in 2007, B2B business
› On-site personalization solution provider company for e-commerce, IPTV, OTT and classified media
› 100M+ recommendations per day
• ImpressTV Limited
› The IPTV and OTT business was acquired from Gravity R&D by British investors in July 2014
› HQ in Budapest, international corporate clients
• About me
› Joined Gravity R&D in January 2010, transferred to ImpressTV Limited in July 2014
› Current position: Head of Data Science at ImpressTV Limited
• Too many existing items, too many options
› YouTube videos, Amazon books, Netflix movies, Pandora music, …
• High item publishing intensity, hard to follow the flow of information
› Google News, Facebook posts, Twitter tweets, …
• Users are not qualified enough
› They know what they want to watch/buy, but they don’t know whether a given item would satisfy their needs
› E.g.: Grandma would like a new computer and knows what she wants to do with it, but has no idea of the computer parts and what they do
• One may not even know what they would like
› First time in a Chinese restaurant, no experience with Chinese food.
• How to improve user satisfaction?
Information overload I. – Consumers
• Challenge in handling the ever increasing number of contents and corresponding data
• Challenge in handling transactional data
• Increasing amount of inner information
• What is useful?
• How to determine the usefulness of data?
• Competition in the business market
› keeping and converting consumers is essential
› keeping the market advantage
› increasing revenue
• How to extract and use the collected information to improve business success?
Information overload II. – Content Providers
“Recommender Systems (RS) are software agents that elicit the interests and preferences of individual consumers […] and make recommendations accordingly. They have the potential to support and improve the quality of the decisions consumers make while searching for and selecting products online.” 1
What are Recommender Systems?
1 Xiao, Bo, and Izak Benbasat, 2007, E-commerce product recommendation agents: Use, characteristics, and impact, MIS Quarterly 31, 137-209.
Recommender Systems
Recommender Systems as software agents
(Diagram: Users and Items are the inputs to the Recommender System, which outputs “Recommend item X to user A”.)
• News (What to read now?)
• Dating sites (Which girl to contact?)
• Coupon portals (Which coupon is good for me?)
• Restaurants (Where to eat?)
• Vacations (Where to travel and what to see?)
• Retailers (How many products to supply?)
• Financial assets (loan, portfolio)
• Advertisements
• …etc
Other Examples
• For the consumers (users)
› Helping the users to find useful contents to satisfy their needs
› Reducing the time of content searching
› Providing relevant information from the massive information flow
› Exploring new preferences, trust in recommender system
• For the business
› Improving business success indicators (or key performance indicators, KPI)
• Increasing revenue, CTR, watching/listening duration
• Increasing conversion rate and user engagement, reducing churn
• Cross-selling, upselling, advertisement
› Reducing popularity effect, less popular contents are also consumed
› Promotions, targeting, campaign
Goals and benefits of Recommender Systems
• Goals for consumers are not necessarily equal to those of the business!
• Simplified example 1: YouTube (free contents)
› Goal of the business: Receive more income from advertisements
How: More videos watched, more advertisements seen
› Goal of the users: Having good/useful time by watching/listening videos
How: Clicking on recommendations, using search engine
› The goals align: more videos watched is good for both.
• Simplified example 2: Netflix (DVD/Blu-ray or Video On Demand rental)
› Goal of the business: Increase income
How: More expensive contents, improving user engagement
› Goal of the users: Buying interesting movies, spending less time searching
How: Using the recommendation engine to make it easier
› The goals differ. Netflix wants the users to pay more; the users basically don’t want to spend more, unless they find it worth it.
Difference between the goals of consumers and business
• Items: Entities that are recommended (movies, music, books, news, coupons, restaurants, etc...)
• Item data: Descriptive information about the items (e.g. genre, category, price)
• Users: People to whom we recommend (Who is the user? Member, cookie, unidentified?)
• Paradigm: Recommendations are calculated by predefined non-personalized rules
• How: Setting item data based rules (recommending the latest movies for all users)
• Properties:
› Non-personalized static recommendations
› Requires manual work (e.g. editorial pick)
Recommendation paradigms / Non-personalized rules
(Diagram: Item data and Items feed the Recommender System, which produces recommendations for Users.)
• User data: Descriptive information about the users (e.g. age, gender, location)
• Paradigm: Recommendations are calculated by predefined user-data-based rules
• How: Setting item- and user-data-based rules (men between 45-55 → expensive cars)
• Properties:
› Semi-personalized
› Requires manual work (e.g. rule construction)
› Interpretable
Recommendation paradigms / Personalized rules
(Diagram: Item data, Items and User data feed the Recommender System, which produces recommendations for Users.)
• Transactions: Interactions between users and items
• Transaction types: Numerical ratings, ordinal ratings, binary ratings, unary ratings, textual reviews
• Explicit feedback: The user quantifies his preference for an item (rating)
• Implicit feedback: Events that indicate but do not quantify the user's preference for an item
› Positive: buy, watch, like, add to favourites
› Negative: dislike, remove from favourites
• In practice, implicit feedback is less valuable than explicit feedback, but significantly more of it is available
Recommendation paradigms / Transaction based personalization I.
(Diagram: Item data, Items, User data and Transactions feed the Recommender System, which produces recommendations for Users.)
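A minimal sketch of how the feedback types above might be collected into one sparse user-item matrix; the event names and weights are invented assumptions, not the deck's actual scheme:

```python
# Hypothetical event log: (user, item, event). Event names and weights are
# invented for the sketch.
events = [
    ("alice", "matrix", "watch"),
    ("alice", "matrix2", "like"),
    ("bob", "twilight", "buy"),
    ("bob", "matrix", "dislike"),
]
IMPLICIT_WEIGHT = {"watch": 1.0, "buy": 1.0, "like": 1.0, "dislike": -1.0}

explicit = {("alice", "titanic"): 4.0}   # explicit ratings enter the matrix as-is
implicit = {}                            # implicit events are mapped to weights
for user, item, event in events:
    implicit[(user, item)] = implicit.get((user, item), 0.0) + IMPLICIT_WEIGHT[event]

# One sparse user-item feedback "matrix" (explicit entries win on conflicts).
feedback = {**implicit, **explicit}
```

The sketch shows the asymmetry from the slide: explicit ratings quantify preference directly, while implicit events only indicate it and must be mapped to a scale.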
• Paradigm: Recommendations are calculated based on the users' interactions
• How: Learning on interactional data (collaborative filtering)
• Properties:
› Personalized
› Adaptive
› Less interpretable
Recommendation paradigms / Transaction based personalization II.
(Diagram: Item data, Items, User data and Transactions feed the Recommender System, which produces recommendations for Users.)
• Context: Information that can be observed at the time of recommendation
• Types:
› Time-based temporal information (Recos of sports equipment should differ in summer and in winter)
› Mood (Different TV programs should be recommended to the same person based on her current mood)
› Device (Different content types based on device type, e.g. movie trailers in TV, music in phone)
› Location
› Event sequence information (If the user browses the TV category of a webshop, it is reasonable to recommend DVD players, but not if the user browses the laptop category)
Recommendation paradigms / Context based personalization I.
(Diagram: Item data, Items, User data, Context and Transactions feed the Recommender System, which produces recommendations for Users.)
• Paradigm: Recommendations are calculated based on the users' interactions and current context
• How: Learning on contextual data (e.g. the user usually watches news in the morning, action at night)
• Properties:
› Context-sensitive, fully adaptive
› More interpretable
› More complex
Recommendation paradigms / Context based personalization II.
(Diagram: Item data, Items, User data, Context and Transactions feed the Recommender System, which produces recommendations for Users.)
Cross-domain recommendation

Task          | Goal
Multi-domain  | Cross-selling, Diversity, Serendipity
Linked-domain | Accuracy
Cross-domain  | Cold-start (new users, new items)

• Multi-domain: You watched movies and bought books, we recommend movies or books
• Linked-domain: You bought books only, we recommend books using both movie and book consumption patterns
• Cross-domain: You watched movies only, we recommend books, based on movie-book consumption relationships

Domain    | Example              | Ratio
Attribute | Comedy → Thriller    | 12%
Type      | Movies → Books       | 9%
Item      | Movies → Restaurants | 55%
System    | Netflix → MovieLens  | 24%
• Item to User
› Conventional recommendation task: „You may like these items”.
• Item to Item
› Recommending items that are somewhat similar to the item currently viewed by the user
› Personalized or non-personalized similarity
• User to User
› Recommending other users to the user based on metadata or activity
› Social recommendations (who to follow, who to connect) or similar users
• User to Item
› Promoting items that the seller wants to sell to the most probable buyers (e.g. newsletters, notifications)
› „Who would buy this item?”
• Group recommendations
› Recommending groups of items or users
Recommendation types
• Content-based Filtering (CBF)
› Recommend items that are similar to the ones that the user liked in the past.
› Similarity is based on the metadata of the items.
› E.g.: If the user likes romantic movies, recommend her more of the same
• Collaborative Filtering (CF)
› Recommend items that are liked by users who have a similar taste to the current user
› Similarity between users is calculated from the transaction history of users
› Only uses the transaction data → domain independent
• Demographic
› Recommendations made based on the demographic profile.
› Realizes a simple personalization.
› E.g.: Recommend computer parts to young people who study informatics.
Recommendation techniques I.
• Knowledge-based
› Uses extensive domain-specific knowledge to generate recs.
› User requirements collected (problem), items as possible solutions to the specific problems
› Example: user would like to find a digital camera, she provides her needs and the level of her skills, etc.
› Pure knowledge-based systems (without learning) tend to perform better at the beginning of deployment, but they fall behind later.
• Community-based
› Recommendations based on the preferences of the user’s friends.
› People tend to accept recommendations of their friends.
› Use of social networks → social recommender systems.
• Hybrid recommendation systems
› Combination of the techniques above.
› Trying to use the advantages and fix the disadvantages of the different techniques.
Recommendation techniques II.
1. Setting the recommendation type (e.g. item2user recommendation)
2. Requesting a recommendation
3. Selecting the recommendable items, filtering (e.g. new movies)
4. Selecting the algorithm for scoring these items (e.g. a collaborative filtering algorithm)
5. The algorithm provides a score for each item
6. Ordering the list by the scores
7. Post-processing the item list (e.g. randomizing the top N, selecting one episode per series, etc…)
8. Selecting the first N items to send back as the response to the recommendation request
Recommendation data flow
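The eight steps above can be sketched as a generic item2user pipeline; `is_recommendable`, `score` and the toy data are placeholders, not the deck's actual engine:

```python
import random

def recommend(user, items, is_recommendable, score, n=5, shuffle_top=False, seed=42):
    """Generic item2user flow: filter -> score -> order -> post-process -> top N."""
    candidates = [i for i in items if is_recommendable(user, i)]  # step 3: filtering
    scored = [(score(user, i), i) for i in candidates]            # steps 4-5: scoring
    scored.sort(reverse=True)                                     # step 6: ordering
    ranked = [item for _, item in scored]
    if shuffle_top:                                               # step 7: post-processing
        top = ranked[:n]
        random.Random(seed).shuffle(top)
        ranked[:n] = top
    return ranked[:n]                                             # step 8: top N response

# Toy request: recommend even-numbered items, scored by their id.
top = recommend("u1", range(1, 11),
                is_recommendable=lambda u, i: i % 2 == 0,
                score=lambda u, i: i)
```

Steps 1-2 correspond to choosing which filter/score functions to plug in and calling `recommend`; any scoring backend (rules, CF, CBF) fits the same flow.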
• Key features of a good recommender system that should be considered
› Accuracy (efficiency of modeling the user preference)
› Adaptation, context awareness (ability to detect changes in user behavior)
› Diversity, coverage (avoiding monotone recommendations and preference cannibalization)
› Novelty, serendipity (improving the surprise factor, exploitation vs. exploration)
› Trust, explanation (the users should understand and trust the recommender system)
› Scalability, responsivity, availability (recommendations should be provided in reasonable time)
• Tradeoffs in recommender systems
› Accuracy vs. Diversity
› Discovery vs. Continuation
› Depth vs. Coverage
› Freshness vs. Stability
› Recommendations vs. Tasks
Properties of recommender system
• Method
› Splitting the data set into disjoint train and test sets
› By time, at random, by users, or by user history
› Training on train set, measuring on test set
• Evaluation metrics
› Accuracy: Rating (RMSE), TopN (nDCG@N, Recall@N, Precision@N)
› Coverage: Ratio of the recommended items
› Diversity: Entropy, Gini-index
› Novelty: Ratio of the long tail items
› Serendipity: Ratio of the less
› Training time
› Accuracy is the typical primary metric, others are secondary
Offline Evaluation
(Diagram: the data set is split into train and test sets.)
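The TopN accuracy metrics listed above (Precision@N and Recall@N) can be computed directly; the toy recommendation and test lists are invented:

```python
def precision_recall_at_n(recommended, relevant, n):
    """Top-N accuracy: how many of the first n recommendations are relevant."""
    hits = len(set(recommended[:n]) & set(relevant))
    return hits / n, hits / len(relevant)

# Toy example: 2 of the top-4 recommendations appear in the user's held-out test set.
p, r = precision_recall_at_n(["a", "b", "c", "d"], relevant=["b", "d", "e"], n=4)
```

Averaging these per-user values over the test set gives the offline Precision@N / Recall@N reported for an algorithm.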
• Method
› The effects of recommender systems are measured
› A/B testing: Splitting the user base into equally valuable subsets, with different recommendations for the subsets
› A/B tests avoid
› Online improvement can be measured
• Evaluation metrics
› Some effects cannot be evaluated before the recommender system is set live
› Click Through Rate (CTR, number of clicks per recommendation)
› Average Revenue Per User (ARPU), Total Revenue Increase
› Page impression (number of page views)
› Conversion Rate, Churn Rate
Online Evaluation
1. Understanding the business needs
› Business goals, Key Performance Indicators for the live recommender system
› Recommendation scenarios, placeholders
› Data understanding
2. Integration (for vendors only)
› Developing the data integration method (the way the customer provides its data)
› Setting up the recommendation request/response interface (how the customer requests a recommendation and the vendor provides the response, e.g. a JSON or REST API)
3. Data preparation
› Data enrichment from external sources (e.g. crawling additional item metadata)
› Data transformation (shall we handle a complete series as one item, or as a group of episodes?)
Workflow of Recommender System integration
4. Data Mining and Offline Experiments
Workflow of Recommender System integration
5. Online Evaluation / Deployment
› Setting the recommendation engine live (starting to provide recommendations for real end users)
› Measuring online performance metrics (e.g. CTR, ARPU or page impressions)
› Measuring response times (is the algorithm fast enough in the live service?)
› Analyzing the correlation between the offline and online metrics (what to optimize in offline experiments)
6. Optimization
› Implementing additional algorithms (Step 4)
› A/B testing (is the new algorithm better than the original one?)
› Statistical significance based A/B selection (e.g. a t-test between the performance of algorithms A and B)
7. Reporting and follow-up
› Reporting performance (weekly, monthly)
› Monitoring response times and availability (if the algorithm is not robust enough, it may result in an outage)
› Monitoring changes in user behavior and the size of the user base
› Adapting to new features (e.g. new placeholders on the website)
Workflow of Recommender System integration
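Step 6's significance-based A/B selection can be sketched with a two-proportion z-test on CTR (used here instead of the t-test named on the slide, since CTR is a proportion; all counts are invented):

```python
from math import sqrt

def ctr_z_test(clicks_a, recs_a, clicks_b, recs_b):
    """Two-proportion z-test: is the CTR of algorithm B different from A's?"""
    p_a, p_b = clicks_a / recs_a, clicks_b / recs_b
    p_pool = (clicks_a + clicks_b) / (recs_a + recs_b)   # pooled click rate
    se = sqrt(p_pool * (1 - p_pool) * (1 / recs_a + 1 / recs_b))
    return (p_b - p_a) / se   # |z| > 1.96 -> significant at the 5% level

# Invented A/B counts: B's CTR (1.6%) vs A's (1.2%), 10k recommendations each.
z = ctr_z_test(clicks_a=120, recs_a=10_000, clicks_b=160, recs_b=10_000)
```

With these counts the lift is significant, so algorithm B would be selected; a per-user t-test on revenue metrics like ARPU would follow the same pattern.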
• Content-based Filtering (CBF)
› Recommend items that are similar to the ones that the user liked in the past
› Similarity based on the metadata of the items
› E.g.: If the user likes horror movies, recommend her horror movies
• Collaborative Filtering (CF)
• Demographic
• Knowledge-based
• Community-based
• Hybrid recommendation systems
Recommendation Techniques
Recommending an item to a user based upon a description of the item and a profile of the user's interests
• Items are represented by their metadata (e.g. genre or description)
• Users are represented by their transaction history (e.g. items A and B were rated 5)
• We generate user profiles from the user transaction data and the metadata of the items in the user's transactions
• The user profiles are compared to the representations of the items
• The items similar to the user profile are recommended to the user
CBF Method
(Example figure: the user rated The Matrix and The Matrix 2 with 5; ratings for The Matrix 3 and Twilight are unknown; genres shown: Sci-Fi, Romance, Adventure.)
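A minimal sketch of the CBF method above, using the slide's movie example (the titles are from the slide; the binary genre vectors and ratings are invented for illustration):

```python
# Items represented by their metadata (binary genre vectors).
items = {
    "The Matrix":   {"sci-fi": 1},
    "The Matrix 2": {"sci-fi": 1},
    "The Matrix 3": {"sci-fi": 1},
    "Twilight":     {"romance": 1, "adventure": 1},
}
ratings = {"The Matrix": 5, "The Matrix 2": 5}   # the user's transaction history

# User profile: rating-weighted sum of the metadata of the rated items.
profile = {}
for title, r in ratings.items():
    for genre, w in items[title].items():
        profile[genre] = profile.get(genre, 0) + r * w

def match(title):
    """Dot product between the user profile and an item's metadata vector."""
    return sum(profile.get(g, 0) * w for g, w in items[title].items())

scores = {t: match(t) for t in items if t not in ratings}
```

The sci-fi-heavy profile makes "The Matrix 3" outscore "Twilight", which is exactly the behavior the slide's rating matrix illustrates.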
Content Analyzer
• Preprocessing module
• Input: Item metadata
• Methods:
› Text mining
› Semantic Analysis
› Natural Language Processing
› Feature extraction
› Metadata enrichment
› Auto tagging
• Output: Item models
• We create terms from the item metadata
› Standard text mining preprocessing steps
• Filtering stopwords
• Filtering too rare/common terms
• Stemming
• Items are represented by a (sparse) vector of terms
› The vector of each item (=document) contains weights for each term.
› Weighting is often done by the TF-IDF scheme
• Rare terms are not less relevant than frequent terms (IDF).
• Multiple occurrences in a document are not less relevant than a single occurrence (TF).
• Invariant to the length of the document (normalization).
• Similarity measurement
› Most common: cosine similarity
• Scalar product of the L2-normalized vectors
Content Analyzer / Vector space model (VSM)
sim(i1, i2) = Σ_{k=1..T} w_{k,i1} · w_{k,i2} / ( √(Σ_{k=1..T} w_{k,i1}²) · √(Σ_{k=1..T} w_{k,i2}²) )
where T is the number of terms and w_{k,i} is the weight of term k in item i.
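The TF-IDF weighting and cosine similarity described in the bullets above can be sketched in a few lines; the toy item metadata below is purely illustrative:

```python
import math
from collections import Counter

# Toy item metadata (illustrative); each item is a "document" of terms.
docs = {
    "The Matrix":   ["sci-fi", "action", "machines"],
    "The Matrix 2": ["sci-fi", "action", "machines", "sequel"],
    "Twilight":     ["romance", "vampires"],
}

n_docs = len(docs)
df = Counter(t for terms in docs.values() for t in set(terms))  # document frequency

def tfidf(terms):
    # TF * IDF weight for each term of one item
    tf = Counter(terms)
    return {t: tf[t] * math.log(n_docs / df[t]) for t in tf}

def cosine(a, b):
    # scalar product of the L2 normalized term vectors
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0

vecs = {name: tfidf(terms) for name, terms in docs.items()}
# The two Matrix films share terms; Twilight shares none with them.
```

Because the vectors are sparse, the cosine only iterates over the terms that actually occur in both items.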
47. 47
• Semantic analysis in order to extract more information about the items
• Domain-specific knowledge and taxonomy is used
Content Analyzer / Semantic analysis, ontologies
48. 48
• The meta data provided by clients is usually not enough
• Meta data enrichment: Crawling additional data about the items from public web
• Publicly available knowledge sources
› Open Directory Project
› Yahoo! Web Directory
› Wikipedia
› Freebase
› Tribune
› Etc…
• Goals:
› Fixing missing or wrong data
› Better characterization of items
› Improving the accuracy of CBF algorithms
Content Analyzer / Meta data enrichment
49. 49
Profile Learner
• User preference modeling
• Inputs:
› Item models
› User transactional data
› User meta data
• Methods:
› Meta data weighting
› Machine learning
• Output: User profile
51. 51
• Negative and positive examples collected
› Explicit case: Items rated under 3 counts negative, above positive (variants: above overall average, user average…)
› Implicit case: Items viewed counts positive, others negative
• Approach 1
› User profile: weighted average of item vectors (Rocchio’s algorithm)
› Similarity between user profile and item vectors as relevance score
• Approach 2
› Negative and positive user profiles
› Items that match negative profile are filtered
› Similarity between the positive profile and the item’s vector as relevance score
• Approach 3
› Rule based classifiers (decision trees, decision lists) learning on the examples
› New items are judged by the classifier
• Approach 4
› Nearest Neighbor methods
Profile Learner / User profile in VSM model
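Approach 1 (the Rocchio-style weighted average) can be sketched as follows; the item vectors and ratings below are illustrative, not real data:

```python
import math

# Toy item term vectors and a user's ratings (illustrative).
item_vecs = {
    "The Matrix":   {"sci-fi": 1.0, "action": 0.8},
    "The Matrix 2": {"sci-fi": 1.0, "action": 0.9},
    "Twilight":     {"romance": 1.0, "vampires": 0.7},
}
ratings = {"The Matrix": 5, "The Matrix 2": 5}  # user's transactions

def build_profile(ratings, item_vecs):
    # Rating-weighted average of the rated items' term vectors
    profile, total = {}, sum(ratings.values())
    for item, r in ratings.items():
        for term, w in item_vecs[item].items():
            profile[term] = profile.get(term, 0.0) + r * w / total
    return profile

def score(profile, vec):
    # Cosine between the user profile and an item vector as relevance score
    dot = sum(profile.get(t, 0.0) * w for t, w in vec.items())
    n = math.sqrt(sum(v * v for v in profile.values())) * math.sqrt(sum(v * v for v in vec.values()))
    return dot / n if n else 0.0

profile = build_profile(ratings, item_vecs)
# Matrix-like items score near 1, Twilight scores 0 for this user.
```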
52. 52
• The terms in the item metadata are handled independently
• Semantic information is lost in the process, we have no idea of the meaning of the terms
• It is hard to handle expressions containing more than one word
• Data sparsity
• Needs diverse user feedback to improve accuracy
Profile Learner / Problems with the VSM model
55. 55
Architecture / Filtering Component
• Recommendation list filtering
• Inputs:
› User profile model
› Item models
› Recommendable items
• Methods:
› Filtering non-relevant
items
› Similarity based ranking
• Output: Recommended items
56. 56
• User independence
› Profiles of the users are built independently
› A user cannot influence the recommendations for other users (no attacks on the recommender system)
• Interpretable
› Recommendations can be easily explained (we recommend this comedy because you watched other comedies)
› Explanations can be easily built; user profiles are easy to understand (user keyword model)
• Item cold start
› Solves the item cold start problem
› Capable of recommending items that were never rated/viewed by anybody
Advantages of Content-based Filtering
57. 57
• Limited content analysis
› Natural limit in the number and type of features that are associated with the recommended items
› Domain knowledge is often needed
• Over-specialization
› Hard to improve serendipity (recommending something unexpected)
› Recommendations mirror the user history
› Monotone recommendations: „The winner takes it all” effect
› Exploitation over exploration
• Meaning
› word meaning disambiguation (apple = company or the fruit?)
• User cold start
› Needs some ratings before the profile learner can build an accurate user model
› There will be no reliable recommendations when only a few ratings available
Disadvantages of Content-based Filtering
59. 59
• Deep learning (Sentiment Analysis, Paragraph Vector Model)
• User generated contents (Tagging, Folksonomies)
• Serendipity problem
› Using randomness or genetic algorithms
› Filtering too similar items
› Balance between exploration and exploitation
› Using poor similarity measures to produce anomalies and exceptions
• Centralized knowledgebase for meta data enrichment
Research topics in Content-Based Filtering
61. 61
• Content-based Filtering (CBF)
• Collaborative Filtering (CF)
› Recommend items that are liked by users that have similar taste as the
current user
› Similarity between users is calculated by the transaction history of users
› Only uses the transaction data → domain independent
• Demographic
• Knowledge-based
• Community-based
• Hybrid recommendation systems
Recommendation Techniques
63. 63
[Illustration: a user's rating row over The Matrix, The Matrix 2, Twilight, The Matrix 3 — one rating known (5), the others unknown (?)]
• Classic item recommendation (Netflix) on explicit feedback (ratings)
• Rating problem: the goal is to predict how the user would rate the items
• Accuracy metrics:
› Root Mean Squared Error (RMSE)
› Mean Absolute Error (MAE)
RMSE = √( Σ_{(u,i,r)∈R_test} ( r̂_{u,i} − r_{u,i} )² / |R_test| )
MAE = Σ_{(u,i,r)∈R_test} | r̂_{u,i} − r_{u,i} | / |R_test|
Rating prediction problem
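Both error metrics can be computed directly; a minimal sketch on a toy test set:

```python
import math

# Toy test set of (user, item, rating) triples and model predictions (illustrative).
test = [("u1", "i1", 5.0), ("u1", "i2", 3.0), ("u2", "i1", 4.0)]
pred = {("u1", "i1"): 4.5, ("u1", "i2"): 3.5, ("u2", "i1"): 4.0}

errors = [pred[(u, i)] - r for u, i, r in test]
rmse = math.sqrt(sum(e * e for e in errors) / len(test))  # penalizes large errors more
mae = sum(abs(e) for e in errors) / len(test)             # linear penalty
```

RMSE squares the errors before averaging, so a single large miss hurts it more than it hurts MAE.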
64. 64
• Global average (average of all ratings)
• User average (average of user ratings)
• Item average (average of ratings given to the item)
Baseline Methods
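The three baselines can be sketched in a few lines, assuming toy (user, item, rating) triples:

```python
# Toy (user, item, rating) triples; names are illustrative.
ratings = [("John", "The Matrix", 5), ("John", "Twilight", 1),
           ("Paul", "The Matrix", 4), ("Suzy", "Twilight", 2)]

global_avg = sum(r for _, _, r in ratings) / len(ratings)  # average of all ratings

def user_avg(u):
    # Average of the user's own ratings, falling back to the global average
    rs = [r for user, _, r in ratings if user == u]
    return sum(rs) / len(rs) if rs else global_avg

def item_avg(i):
    # Average of the ratings given to the item, same fallback
    rs = [r for _, item, r in ratings if item == i]
    return sum(rs) / len(rs) if rs else global_avg
```

The fallback to the global average is one simple way to handle unseen users and items.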
68. 68
• User based neighbor methods
› 1. Find users who have rated item i
› 2. Select k from these users that are the most similar to u
› 3. Calculate R(u,i) from their ratings on i
• Item based neighbor methods
› 1. Find items that have been rated by user u
› 2. Select k from these items that are the most similar to i
› 3. Calculate R(u,i) from their ratings by u
Explicit CF / Neighbor methods
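The three user-based steps can be sketched directly; the ratings and the plain cosine similarity below are illustrative:

```python
import math

# user -> {item: rating} (illustrative data)
R = {
    "u1": {"m1": 5, "m2": 4},
    "u2": {"m1": 4, "m2": 5, "m3": 2},
    "u3": {"m1": 1, "m3": 5},
}

def sim(u, v):
    # Cosine similarity between two users' rating vectors
    common = set(R[u]) & set(R[v])
    dot = sum(R[u][i] * R[v][i] for i in common)
    nu = math.sqrt(sum(r * r for r in R[u].values()))
    nv = math.sqrt(sum(r * r for r in R[v].values()))
    return dot / (nu * nv) if common else 0.0

def predict(u, i, k=2):
    # 1. users who rated i, 2. k most similar to u, 3. similarity-weighted average
    raters = [v for v in R if v != u and i in R[v]]
    top = sorted(raters, key=lambda v: sim(u, v), reverse=True)[:k]
    num = sum(sim(u, v) * R[v][i] for v in top)
    den = sum(sim(u, v) for v in top)
    return num / den if den else None

# predict("u1", "m3") lands between u2's rating (2) and u3's (5),
# pulled toward the more similar neighbor u2.
```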
69. 69
• Similarity between
› Rows of the preference matrix (user based)
› Columns of the preference matrix (item based)
› Dimension reduction on user/item preference vectors → feature vectors
• Similarity properties:
› S(a,a) >= S(a,b) (often S(a,a) = 1 required)
› S(a,b) = S(b,a)
• Similarity measures
› Cosine similarity (CS)
› Pearson correlation (PC)
› Adjusted cosine similarity (ACS)
› Euclidean distance (EUC)
Explicit CF / Neighbor methods / Similarity setup
CS(u, v) = Σ_{i∈I_uv} r_{u,i} · r_{v,i} / ( √(Σ_{i∈I_u} r_{u,i}²) · √(Σ_{j∈I_v} r_{v,j}²) )
PC(u, v) = Σ_{i∈I_uv} (r_{u,i} − r̄_u)(r_{v,i} − r̄_v) / ( √(Σ_{i∈I_uv} (r_{u,i} − r̄_u)²) · √(Σ_{i∈I_uv} (r_{v,i} − r̄_v)²) )
ACS(u, v) = Σ_{i∈I_uv} (r_{u,i} − r̄_i)(r_{v,i} − r̄_i) / ( √(Σ_{i∈I_uv} (r_{u,i} − r̄_i)²) · √(Σ_{i∈I_uv} (r_{v,i} − r̄_i)²) )
EUC(u, v) = √( Σ_{i∈I_uv} (r_{u,i} − r_{v,i})² )
where I_uv is the set of items rated by both u and v, r̄_u is user u's mean rating and r̄_i is item i's mean rating.
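As a worked example of one of the similarity measures listed above, a Pearson correlation restricted to commonly rated items; the rating vectors are illustrative:

```python
import math

def pearson(ru, rv):
    # Pearson correlation over the items rated by both users
    common = set(ru) & set(rv)
    if not common:
        return 0.0
    mu = sum(ru[i] for i in common) / len(common)
    mv = sum(rv[i] for i in common) / len(common)
    num = sum((ru[i] - mu) * (rv[i] - mv) for i in common)
    den = math.sqrt(sum((ru[i] - mu) ** 2 for i in common)) * \
          math.sqrt(sum((rv[i] - mv) ** 2 for i in common))
    return num / den if den else 0.0

u = {"m1": 5, "m2": 3, "m3": 1}
v = {"m1": 4, "m2": 3, "m3": 2}  # same taste with a smaller spread
w = {"m1": 1, "m2": 3, "m3": 5}  # exactly opposite taste
```

Unlike plain cosine, Pearson centers each user's ratings, so users with the same taste but different rating scales still correlate perfectly.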
70. 70
• Advantages
› Simplicity (easy to set up, only a few parameters)
› Justifiability (recommendations can be explained)
› Stability (not very sensitive to new items/users/ratings)
› Good for item2item and user2user recommendation
• Disadvantages
› Computationally expensive in the recommending phase
• Similarities have to be computed at recommendation time
• The similarity matrix may be precomputed before recommendation, but that makes the training phase costly
• The similarity matrix might not fit in memory
› Less accurate than model based methods for personalized recommendation
Advantages and disadvantages of neighbor methods
73. 73
• ALS (Alternating Least Squares)
› Updating P and Q matrices by multivariate linear regression based solver
• BRISMF (Biased Regularized Simultaneous Matrix Factorization)
› Stochastic gradient descent based matrix factorization
› Minimizing the prediction error by iterating on transactions and modifying factors
› Using bias for user and item model
› Prediction:
• SVD (Singular Value Decomposition)
› Decomposes matrix R to three matrices
› S is a diagonal matrix, containing singular values of matrix R
• NSVD1
› Decomposes matrix R to three matrices
› W is a weight matrix
Explicit CF / Matrix Factorization
r̂_{u,i} = b_u + c_i + Σ_{k=1..K} p_{u,k} · q_{k,i}   (BRISMF prediction with user bias b_u and item bias c_i)
R = P · S · Q^T   (SVD)
R ≈ P · W · Q^T   (NSVD1)
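A minimal BRISMF-style sketch: stochastic gradient descent over the transactions with user/item biases. The learning rate, regularization and factor count are illustrative choices, not tuned values:

```python
import random

random.seed(0)
# (user, item, rating) transactions, illustrative toy data.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 4.0)]
n_users, n_items, K = 3, 3, 2
lr, reg = 0.05, 0.02  # learning rate and regularization (illustrative)

p = [[random.gauss(0, 0.1) for _ in range(K)] for _ in range(n_users)]  # user factors
q = [[random.gauss(0, 0.1) for _ in range(K)] for _ in range(n_items)]  # item factors
bu = [0.0] * n_users  # user biases
bi = [0.0] * n_items  # item biases

def predict(u, i):
    return bu[u] + bi[i] + sum(p[u][k] * q[i][k] for k in range(K))

# Iterate on the transactions, modifying biases and factors to reduce the error.
for epoch in range(300):
    for u, i, r in ratings:
        e = r - predict(u, i)
        bu[u] += lr * (e - reg * bu[u])
        bi[i] += lr * (e - reg * bi[i])
        for k in range(K):
            pu, qi = p[u][k], q[i][k]
            p[u][k] += lr * (e * qi - reg * pu)
            q[i][k] += lr * (e * pu - reg * qi)
```

After training, predictions for the observed transactions approach the true ratings; unobserved (u, i) pairs get scores from the learned biases and factors.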
75. 75
Explicit CF / Matrix Factorization / Model Visualization
P
John 1.1 0.5
Paul 0.6 0.9
Suzy 1.0 0.9
QT
Item 1 Item 2 Item 3 Item 4 Item 5
1.0 -0.9 0.5 -0.5 1.3
-0.3 1.6 0.7 1.6 -1.1
[Illustration: users (John, Paul, Suzy) and items 1–5 plotted in the two-dimensional latent factor space]
76. 76
Explicit CF / Matrix Factorization / Clustering
P
John 1.1 0.5
Paul 0.6 0.9
Suzy 1.0 0.9
QT
Item 1 Item 2 Item 3 Item 4 Item 5
1.0 -0.9 0.5 -0.5 1.3
-0.3 1.6 0.7 1.6 -1.1
[Illustration: the same latent factor space with users and items grouped into clusters A, B and C]
77. 77
• Memory based algorithms: neighbor methods
• Model based algorithms: matrix factorization
• Both families have explicit feedback based and implicit feedback based variants
Hierarchy of Collaborative Filtering methods
79. 79
[Illustration: a user's implicit feedback row over The Matrix, The Matrix 2, Twilight, The Matrix 3 — all preferences unknown (?)]
• Classic item recommendation on implicit feedback (view, buy, like, add to favourite)
• Preference problem: The goal is to predict the probability that the user would choose that item
• Accuracy metrics:
› Ranking error instead of prediction error
› Recall@N
› Precision@N
Preference prediction problem
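Recall@N and Precision@N for a single user can be sketched as follows; the lists are illustrative:

```python
# Precision@N and Recall@N for one user; the lists below are illustrative.
def precision_recall_at_n(recommended, relevant, n):
    top_n = recommended[:n]
    hits = len(set(top_n) & set(relevant))
    return hits / n, hits / len(relevant)

recommended = ["m1", "m4", "m2", "m5"]  # ranked recommendation list
relevant = {"m1", "m2", "m3"}           # items the user actually chose

prec, rec = precision_recall_at_n(recommended, relevant, 3)
# 2 of the top-3 are hits -> precision 2/3; 2 of 3 relevant items found -> recall 2/3
```

In an evaluation these values would be averaged over all test users.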
80. 80
• Most popular items
• Most recent items
• Most popular items in the user’s favourite category
Baseline Methods
83. 83
• Zero value for unknown preference (zero example). In practice: many 0s, few 1s.
• c_ui: confidence for known feedback (constant, or a function of the context of the event)
• Zero examples are less important, but still carry information.
R Item1 Item2 Item3 Item4
User1 1 0 0 0
User2 0 0 1 0
User3 1 1 0 0
User4 0 1 0 1
C Item1 Item2 Item3 Item4
User1 c11 1 1 1
User2 1 1 c23 1
User3 c31 c32 1 1
User4 1 c42 1 c44
Implicit CF / Confidence Matrix
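A common way to build C, following Hu, Koren and Volinsky's implicit-feedback formulation, is c_ui = 1 + α·r_ui; the α value below is illustrative:

```python
# Implicit feedback matrix R from the slide; 1 = observed event, 0 = no event.
R = [[1, 0, 0, 0],
     [0, 0, 1, 0],
     [1, 1, 0, 0],
     [0, 1, 0, 1]]

alpha = 40.0  # illustrative confidence weight for observed events

# Binarized preference P and confidence C: zero examples keep the minimal weight 1.
P = [[1 if r > 0 else 0 for r in row] for row in R]
C = [[1.0 + alpha * r for r in row] for row in R]
```

With event counts instead of binary flags, repeated interactions automatically raise the confidence of that cell.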
85. 85
• User based neighbor methods
› 1. Find users who have interacted with item i
› 2. Select k from these users that are the most similar to u
› 3. Calculate R(u,i) from their feedback on i
• Item based neighbor methods
› 1. Find items that user u has interacted with
› 2. Select k from these items that are the most similar to i
› 3. Calculate R(u,i) from u's feedback on them
Implicit CF / Neighbor methods
86. 86
• Co-occurrence similarity algorithm (most often used in business)
• The algorithm:
1. For each item i:
1. Count the number of users that interacted with item i: supp(i)
2. For each user u that interacted with item i:
1. For each other item j that user u interacted with:
1. Increment the co-occurrence counter of items i and j: supp(i, j)
2. The similarity between item i and item j:
S(i, j) = supp(i, j) / ( (supp(i) + γ)^(1−α) · (supp(j) + γ)^α )
• α: popularity factor
• γ: regularization factor
3. The prediction for user u and item i:
r̂(u, i) = Σ_{j: c_uj > 0} c_uj · S(i, j) / Σ_{j: c_uj > 0} c_uj
Implicit CF / Item based neighbor methods
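The counting steps can be sketched as follows; the events and the α, γ values are illustrative:

```python
from collections import Counter
from itertools import combinations

# user -> set of items interacted with (illustrative events)
events = {"u1": {"i1", "i2"}, "u2": {"i1", "i2", "i3"}, "u3": {"i1", "i3"}}
alpha, gamma = 0.5, 1.0  # popularity and regularization factors (illustrative)

supp = Counter()  # supp(i): number of users who interacted with item i
co = Counter()    # supp(i, j): co-occurrence counts
for items in events.values():
    supp.update(items)
    for i, j in combinations(sorted(items), 2):
        co[(i, j)] += 1
        co[(j, i)] += 1

def sim(i, j):
    # Penalize popular items via alpha; gamma damps similarities of rare items.
    return co[(i, j)] / ((supp[i] + gamma) ** (1 - alpha) * (supp[j] + gamma) ** alpha)
```

With α = 0.5 the normalization is symmetric; pushing α toward 1 penalizes the popularity of j more strongly.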
95. 95
[Illustration: slice R1 of the user × item × time preference tensor]
• Tensor Factorization
• Different preferences during the day
• Context: time period
• Time period 1: 06:00-14:00
96. 96
[Illustration: slices R1 and R2 of the user × item × time preference tensor]
• Tensor Factorization
• Different preferences during the day
• Context: time period
• Time period 2: 14:00-22:00
97. 97
[Illustration: slices R1, R2 and R3 of the user × item × time preference tensor]
• Tensor Factorization
• Different preferences during the day
• Context: time period
• Time period 3: 22:00-06:00
99. 99
• Relies on feedback
› Explicit feedback is more reliable than implicit feedback
› But explicit feedback is often not provided, only implicit feedback
› Does not need heterogeneous data sources
• Extracts latent information from consumption behavior
• More accurate than content based filtering
• Domain independent
Advantages of Collaborative Filtering
100. 100
• Cold start problem
› Items without feedback cannot be recommended
› CF algorithms cannot provide personalized recommendations for users without feedback
• Warmup period
› Requires numerous events to characterize the user properly
› Inaccurate for users with little feedback
› Inaccurate for domains with weak collaboration
• Vulnerable to attacks
• Harder to explain the recommendations
• Unable to provide cross-domain recommendations without overlapping data
Disadvantages of Collaborative Filtering
102. 102
Other applications / Hybrid Filtering
• Combination of collaborative and content-based filtering
• Advantages
› Transforms latent behavioral knowledge to meta data level
› Solves cold start problem
› Used for weighting meta data, able to improve the efficiency of CBF
› Behavioral knowledge can be interpreted by meta data
• Disadvantages
› Requires both meta data and transactional data
› More complex than CF and CBF
› Less developed than CF and CBF, hot topic
› Challenging to provide mixed recommendations with new and old items
› Heterogeneous data issues
103. 103
Other applications / Explanation
• Approaches
› Item explanation: explaining why those items were recommended
› User explanation: Description about the user profile
• Types of the item explanations
› Non-personalized: „… because this content is trending”
› Explicit features: „… because you bought horror movies”
› Explicit user-to-user links: „… because your friends love this movie”
› Explicit user relations: „… because some users similar to you like this movie”
› Implicit features: „… because actor X is similar to your preferred actor Y”
• Types of user explanations
› User tag cloud: „the user consumes 80% horror and 20% comedy”
› User similarity: „the user is similar to another user who is a fan of Star Wars”
• Goal: Increasing trust in recommender system
• Difficulties: Hard to optimize how the recommendations should be explained
104. 104
Other applications / Recommender strategies
• The most common way to recommend from prediction scores is to order items by them
• How to avoid monotony and „the winner takes it all” effect?
• How to avoid preference cannibalization?
• Recommender strategies: How to select items from the scored list?
› Best match
› One episode per series
› At most N items per category
› Different categories should follow each other
• Exploration vs. Exploitation with Multi-Armed Bandits
• Entropy maximization per recommendation box for new users
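One of the strategies above, „at most N items per category”, can be sketched as a simple re-ranking pass over the scored list; the items and scores are illustrative:

```python
# Re-rank a scored list so no category exceeds its quota; data is illustrative.
def rerank(scored_items, max_per_category=2):
    # scored_items: (item, category, score) tuples
    taken, out = {}, []
    for item, cat, _ in sorted(scored_items, key=lambda x: -x[2]):
        if taken.get(cat, 0) < max_per_category:
            out.append(item)
            taken[cat] = taken.get(cat, 0) + 1
    return out

scored = [("m1", "horror", 0.9), ("m2", "horror", 0.8),
          ("m3", "horror", 0.7), ("m4", "comedy", 0.6)]
# m3 is skipped once the horror quota is full, so m4 gets exposure.
```

This is the simplest form of diversification; the other strategies (one episode per series, alternating categories) fit the same filter-while-ranking pattern.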