The document discusses recommender systems and describes several techniques used in collaborative filtering recommender systems including k-nearest neighbors (kNN), singular value decomposition (SVD), and similarity weights optimization (SWO). It provides examples of how these techniques work and compares kNN to SWO. The document aims to explain state-of-the-art recommender system methods.
An overview of recommender (recommendation) systems: RFM concepts in brief; item-based and user-based collaborative filtering; content-based recommendation; product association recommender systems; stereotype recommendation, with its advantages and limitations; customer lifetime; and the recommender system analysis and solving cycle.
We have built an online movie recommender system based on the analysis of users' rating histories for movies and their demographic information. We used data from the MovieLens website. Collaborative filtering and matrix factorization techniques were used for the implementation. The end result is a web application that recommends the top 20 movies to a user.
Codebase: http://goo.gl/nM7RMy
Demo Video: http://goo.gl/VgZ2uI
The goal of a recommender system is to predict the degree to which a user will like or dislike a set of items, such as movies or TV shows.
Most recommender systems use a combination of different approaches, but broadly speaking there are three different methods that can be used: Content analysis, Social recommendations and Collaborative filtering.
Recommender Systems represent one of the most widespread and impactful applications of predictive machine learning models.
Amazon, YouTube, Netflix, Facebook and many other companies generate an important fraction of their revenues thanks to their ability to model and accurately predict users' ratings and preferences.
In this presentation we cover the following points:
→ introduction to recommender systems
→ working with explicit vs implicit feedback
→ content-based vs collaborative filtering approaches
→ user-based and item-item methods
→ machine learning and deep learning models
→ pros & cons of the methods: scalability, accuracy, explainability
In this lecture, I will first cover the recent advances in neural recommender systems such as autoencoder-based and MLP-based recommender systems. Then, I will introduce the recent achievement for automatic playlist continuation in music recommendation.
Recommender systems are software tools and techniques providing suggestions for items to be of interest to a user. Recommender systems have proved in recent years to be a valuable means of helping Web users by providing useful and effective recommendations or suggestions.
What really are recommendations engines nowadays?
This presentation introduces the foundations of recommendation algorithms, and covers common approaches as well as some of the most advanced techniques. Although more focused on efficiency than theoretical properties, basics of matrix algebra and optimization-based machine learning are used through the presentation.
Table of Contents:
1. Collaborative Filtering
1.1 User-User
1.2 Item-Item
1.3 User-Item
* Matrix Factorization
* Stochastic Gradient Descent (SGD)
* Truncated Singular Value Decomposition (SVD)
* Alternating Least Square (ALS)
* Deep Learning
2. Content Extraction
* Item-Item Similarities
* Deep Content Extraction: NLP, CNN, LSTM
3. Hybrid Models
4. In Production
4.1 Problematics
4.2 Solutions
4.3 Tools
• Performed memory-based collaborative filtering techniques like Cosine similarities, Pearson’s r & model-based Matrix Factorization techniques like Alternating Least Squares (ALS) method
• Studied the scalability of these methods on local machines & on Hadoop clusters
Decision Forest: Twenty Years of Research (Lior Rokach)
A decision tree is a predictive model that recursively partitions the covariate's space into subspaces such that each subspace constitutes a basis for a different prediction function. Decision trees can be used for various learning tasks including classification, regression and survival analysis. Due to their unique benefits, decision trees have become one of the most powerful and popular approaches in data science. Decision forest aims to improve the predictive performance of a single decision tree by training multiple trees and combining their predictions.
Past, present, and future of Recommender Systems: an industry perspective (Xavier Amatriain)
Keynote for the ACM Intelligent User Interface conference in 2016 in Sonoma, CA. I start with the past by talking about the Recommender Problem, and the Netflix Prize. Then I go into the Present and the Future by talking about approaches that go beyond rating prediction and ranking and by finishing with some of the most important lessons learned over the years. Throughout my talk I put special emphasis on the relation between algorithms and the User Interface.
An overview of some deep learning methods for recommender systems along with an intro to the relevant deep learning methods such as convolutional neural networks (CNN's), recurrent neural networks (RNN's), autoencoders, restricted boltzmann machines (RBM's) and more.
This is a tutorial about recommender systems for CS410 @ UIUC. It summarizes some good research papers about how user profiles and tags can improve recommender systems.
Toward Better Interactions in Recommender Systems: Cycling and Serpentining A... (Qian Zhao)
An experience design perspective on recommenders: There is a tradeoff between serving come-and-go users vs. encouraging deeper interaction/engagement!
Better understanding of the trade-off between efficiency vs. engagement can help design a better recommender user experience!
Cycling and serpentining top-N recommendation lists have benefits (higher engagement) but also costs (negative perception)!
More work combining algorithms and user experience is needed!
Building a Large Scale SEO/SEM Application with Apache Solr (Rahul Jain)
Slides from my talk on "Building a Large Scale SEO/SEM Application with Apache Solr" at Lucene/Solr Revolution 2014, where I talk about how we handle indexing/search of 40 billion records (documents)/month in Apache Solr with 4.6 TB of compressed index data.
Abstract: We are working on building an SEO/SEM application where an end user searches for a "keyword" or a "domain" and gets all the insights about these, including search engine ranking, CPC/CPM, search volume, No. of Ads, competitors' details etc., in a couple of seconds. To have this intelligence, we get huge web data from various sources, and after intensive processing it is 40 billion records/month in a MySQL database with 4.6 TB of compressed index data in Apache Solr.
Due to large volume, we faced several challenges while improving indexing performance, search latency and scaling the overall system. In this session, I will talk about our several design approaches to import data faster from MySQL, tricks & techniques to improve the indexing performance, Distributed Search, DocValues(life saver), Redis and the overall system architecture.
Which Vertical Search Engines are Relevant? Understanding Vertical Relevance ... (Mounia Lalmas-Roelleke)
Aggregating search results from a variety of heterogeneous
sources, so-called verticals, such as news, image and video,
into a single interface is a popular paradigm in web search.
Current approaches that evaluate the effectiveness of aggregated
search systems are based on rewarding systems that
return highly relevant verticals for a given query, where this
relevance is assessed under different assumptions. It is difficult
to evaluate or compare those systems without fully
understanding the relationship between those underlying assumptions.
To address this, we present a formal analysis and
a set of extensive user studies to investigate the effects of various
assumptions made for assessing query vertical relevance.
A total of more than 20,000 assessments on 44 search tasks
across 11 verticals are collected through Amazon Mechanical
Turk and subsequently analysed. Our results provide
insights into various aspects of query vertical relevance and
allow us to explain in more depth as well as questioning the
evaluation results published in the literature.
Work with Ke (Adam) Zhou, Ronan Cummins and Joemon Jose.
Presented at WWW 2013, Rio de Janeiro.
In this presentation I will talk about the design of scalable recommender systems and its similarity with advertising systems. The problem of generating and delivering recommendations of content/products to appropriate audiences and ultimately to individual users at scale is largely similar to the matching problem in computational advertising, especially in the context of dealing with self and cross promotional content. In this analogy with online advertising a display opportunity triggers a recommendation. The actors are the publisher (website/medium/app owner) and the advertiser (content owner or promoter), whereas the ads or creatives represent the items being recommended that compete for the display opportunity and may have different monetary value to the actors. To effectively control what is recommended to whom, targeting constraints need to be defined over an attribute space, typically grouped by type (Audience, Content, Context, etc.) where some associated values are not known until decisioning time. In addition to constraints, there are business objectives (e.g. delivery quota) defined by the actors. Both constraints and objectives can be encapsulated into and expressed as campaigns. Finally, there is the concept of relevance, directly related to users' response prediction that is computed using the same attribute space used as signals.
As in advertising, recommendation systems require a serving platform where decisioning happens in real-time (few milliseconds) typically selecting an optimal set of items to display to the user from hundreds, sometimes thousands or millions of items. User actions are then taken as feedback and used to learn models that dynamically adjust order to meet business objectives.
This is a radical departure from the traditional item-based and user-based collaborative filtering approach to recommender systems, which fails to factor-in context, such as time-of-day, geo-location or category of the surrounding content to generate more accurate recommendations. Traditional approaches also fail to recognize that recommendations don't happen in a vacuum and as such may require the evaluation of business constraints and objectives. All this should be considered when designing and developing true commercial recommender/advertising systems.
Speaker Bio
Joaquin A. Delgado is currently Director of Advertising Technology at Intel Media (a wholly owned subsidiary of Intel Corp.), working on disruptive technologies in the Internet T.V. space. Previous to that he held CTO positions at AdBrite, Lending Club and TripleHop Technologies (acquired by Oracle). He was also Director of Engineering and Sr. Architect Principal at Yahoo! His expertise lies on distributed systems, advertising technology, machine learning, recommender systems and search. He holds a Ph.D in computer science and artificial intelligence from Nagoya Institute of Technology, Japan.
movie recommender system using vectorization and SVD tech (UddeshBhagat)
This system used overall TMDB Vote Count and Vote Averages to build Top Movies Charts, in general and for a specific genre. The IMDB Weighted Rating System was used to calculate ratings on which the sorting was finally performed.
We built two content based engines; one that took movie overview and taglines as input and the other which took metadata such as cast, crew, genre and keywords to come up with predictions. We also devised a simple filter to give greater preference to movies with more votes and higher ratings.
A domain-independent framework for building conversational recommender systems.
Slides presented @ Google Workshop on Conversational Search and Recommendation, London, 28-29 August 2019
2. About Me
Prof. Lior Rokach
Department of Information Systems Engineering
Faculty of Engineering Sciences
Head of the Machine Learning Lab
Ben-Gurion University of the Negev
Email: liorrk@bgu.ac.il
http://www.ise.bgu.ac.il/faculty/liorr/
PhD (2004) from Tel Aviv University
3. Are You Being Served?
What are you looking for?
Demographic – Age, Gender, etc.
Context-
Casual/Event
Season
Gift
Purchase History
Loyal Customer
What is the customer currently wearing?
Style
Color
Social
Friends and Family
Companion
4. Recommender Systems
A recommender system (RS) helps people who lack sufficient personal experience or competence to evaluate the potentially overwhelming number of alternatives offered by a Web site.
In their simplest form, RSs recommend personalized and ranked lists of items to their users
Provide consumers with information to help them decide which items to purchase
7. What movie should I watch?
• The Internet Movie Database (IMDb)
provides information about
actors, films, television shows, television
stars, video games and production crew
personnel.
• Owned by Amazon.com since 1998
• 796,328 titles and 2,127,371 people
• More than 50M users per month.
8. The Netflix Prize story
In October 2006, Netflix announced it would give $1 million to
whoever created a movie-recommending algorithm 10% better than its
own.
Within two weeks, the DVD rental company had received 169
submissions, including three that were slightly superior to
Cinematch, Netflix's recommendation software
After a month, more than a thousand programs had been entered, and
the top scorers were almost halfway to the goal
But what started out looking simple suddenly got hard. The rate of
improvement began to slow. The same three or four teams clogged
the top of the leader-board.
Progress was almost imperceptible, and people began to say a 10
percent improvement might not be possible.
Three years later, on 21st of September 2009, Netflix announced the
winner.
30.07.2012
10. Where should I spend my vacation?
Tripadvisor.com
I would like to escape from this ugly and tedious work life and
relax for two weeks in a sunny place. I am fed up with
these crowded and noisy places … just the sand and the
sea … and some “adventure”.
I would like to bring my wife and my children on a
holiday … it should not be too expensive. I prefer
mountainous places… not too far from home.
Children parks, easy paths and good cuisine are a
must.
I want to experience the contact with a completely different
culture. I would like to be fascinated by the people and
learn to look at my life in a totally different way.
11.
12. Usage in the market/products Recommendation Procedure SWOT
State-of-the-art solutions
Methods Summary
Model Analysis
Examined Solutions
Method Commonness
Jinni, Taste Kid, Nanocrowd, Clerkdogs, Criticker, IMDb, Flixster, Movielens, Netflix, Shazam, Pandora, LastFM, YooChoose, Think Analytics, Itunes, Amazon
Collaborative Filtering v v v v v v v v v v v v
Content-Based Techniques v v v v v v v v v v v
Knowledge-Based Techniques v v v v v v v
Stereotype-Based Recommender Systems v v v v v v v
Ontologies and Semantic Web Technologies for Recommender Systems v v v
Hybrid Techniques v v v v v v v
Ensemble Techniques for Improving Recommendation v future
Context Dependent Recommender Systems v v v v v v
Conversational/Critiquing Recommender Systems v v
Community Based Recommender Systems and Recommender Systems 2.0 v v v v v
14. Recom Next Steps.
Presenting the Three selected methods
1 Collaborative Filtering: “Customers who bought this Item also bought…”
2 Ensemble: “The wisdom of crowds”
3 Context Based: “Tell me the music that I want to listen NOW”
15. Recom Next Steps.
Presenting the Three selected methods
4 Cross Domain: “Can movies and books collaborate?”
5 Community: “Tell me who your friends are, and I will tell you who you are.”
6 Group: “Can you recommend a movie for me and my friends?”
17. Method 1
Collaborative Filtering
1 Collaborative Filtering
Description
The method of making automatic predictions (filtering) about the interests of a user by collecting taste information from many users (collaborating). The underlying assumption of the CF approach is that those who agreed in the past tend to agree again in the future.
Selected Techniques
kNN - Nearest Neighbor
SVD – Matrix Factorization
Similarity Weights Optimization (SWO)
18. Collaborative Filtering
Overview
The Idea
Trying to predict the opinion the user will have on the different items and be able to recommend the “best” items to each user based on: the user’s previous likings and the opinions of other like-minded users
Figure labels: Negative Rating, ?, Positive Rating
19. Collaborative Filtering
How does it work?
“People who liked this also liked…”
User-to-User
Recommendations are made by finding users with similar tastes. Jane and Tim both liked Item 2 and disliked Item 3; it seems they might have similar taste, which suggests that in general Jane agrees with Tim. This makes Item 1 a good recommendation for Tim. This approach does not scale well for millions of users.
Item-to-Item
Recommendations are made by finding items that have similar appeal to many users. Tom and Sandra are two users who liked both Item 1 and Item 4. That suggests that, in general, people who liked Item 4 will also like Item 1, so Item 1 will be recommended to Tim. This approach is scalable to millions of users and millions of items.
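The two strategies on this slide can be sketched in a few lines of Python. This is only a minimal illustration, not the presenters' implementation. The like/dislike table is hypothetical: the slide does not state that Jane liked Item 1 or that Tim liked Item 4, so both are assumed here to reproduce the slide's conclusions. The function names `user_to_user` and `item_to_item` are also made up for this sketch.

```python
# Hypothetical like (1) / dislike (0) data; unrated items are simply absent.
ratings = {
    "Jane":   {"Item 1": 1, "Item 2": 1, "Item 3": 0},
    "Tim":    {"Item 2": 1, "Item 3": 0, "Item 4": 1},
    "Tom":    {"Item 1": 1, "Item 4": 1},
    "Sandra": {"Item 1": 1, "Item 4": 1},
}


def user_to_user(ratings, target):
    """Find the user who agrees most with `target`, then suggest what they liked."""
    def agreements(a, b):
        shared = set(ratings[a]) & set(ratings[b])
        return sum(ratings[a][i] == ratings[b][i] for i in shared)

    others = [u for u in ratings if u != target]
    neighbor = max(others, key=lambda u: agreements(target, u))
    suggestions = sorted(
        item for item, liked in ratings[neighbor].items()
        if liked == 1 and item not in ratings[target]
    )
    return neighbor, suggestions


def item_to_item(ratings, target):
    """Score unrated items by how often they are liked together with the target's likes."""
    liked = {i for i, r in ratings[target].items() if r == 1}
    scores = {}
    for user, rated in ratings.items():
        if user == target:
            continue
        user_likes = {i for i, r in rated.items() if r == 1}
        overlap = len(user_likes & liked)
        for item in user_likes - set(ratings[target]):
            scores[item] = scores.get(item, 0) + overlap
    return scores


print(user_to_user(ratings, "Tim"))
print(item_to_item(ratings, "Tim"))
```

The user-to-user function compares the target with every other user at request time, while the co-like counts behind the item-to-item function depend only on item pairs and could be precomputed. That difference is one plausible reading of the slide's scalability remark.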
20. Collaborative Filtering
Rating Matrix
Sample of a matrix
The ratings of users and items are represented in a matrix
All CF methods are based on such a rating matrix
Items: The Items in the system
Users: The Users in the system
Ratings: Each item may have a rating
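As a small illustration of the rating matrix described on this slide, the sketch below builds a users × items table from (user, item, rating) triples. The data, the 1-5 scale, and the helper name `build_rating_matrix` are assumptions made for this example; missing ratings are stored as `None`.

```python
# Hypothetical observations: (user, item, rating on an assumed 1-5 scale).
observations = [
    ("Jane", "Item 1", 5),
    ("Jane", "Item 2", 4),
    ("Tim", "Item 2", 4),
    ("Tim", "Item 3", 1),
]


def build_rating_matrix(observations):
    """Return (users, items, matrix) where matrix[u][i] is a rating or None."""
    users = sorted({user for user, _, _ in observations})
    items = sorted({item for _, item, _ in observations})
    matrix = [[None] * len(items) for _ in users]
    for user, item, rating in observations:
        matrix[users.index(user)][items.index(item)] = rating
    return users, items, matrix


users, items, matrix = build_rating_matrix(observations)
for user, row in zip(users, matrix):
    print(user, row)
```

In practice most cells are empty, so real systems usually keep the matrix in a sparse form; the dense list of lists here is only for readability.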
21. Collaborative Filtering
What is new?
Few words about the techniques
Collaborative filtering is one of the most common recommendation methods in the market today.
Up until two years ago, the kNN (“k” Nearest Neighbor) technique was the norm. SVD (Singular Value Decomposition), which proved successful in the Netflix recommendation competition, became common in the last year. SWO is also a newer technique that seeks to enhance the veteran kNN.
In the following slides the three techniques will be presented. It is important to get acquainted with the techniques as they will be employed by the Ensemble.
24. kNN - Nearest Neighbor
High level explanation
k-nearest neighbors algorithm
A method for classifying objects based on closest training examples in the feature space.
It is assumed that similar samples are grouped together
“k” is the number of neighbors considered; the neighbors are chosen by a proximity measure
Recommendation example
Finding the most relevant song by comparing to a set of already heard ones.
25. kNN - Nearest Neighbor
Schematic example
Figure: the Current User and the Other Users shown as columns of item ratings (1 = Like, 0 = Dislike, ? = Unknown), from the 1st item rate to the 14th item rate.
User Model = interaction history
Unknown Rating: This user did not rate the item. We will try to predict a rating according to his neighbors.
Other Users: There are other users who rated the same item. We are interested in the Nearest Neighbors.
Nearest Neighbor: We are looking for the Nearest Neighbor. The one with the lowest Hamming distance.
Prediction: The prediction was made based on the nearest neighbor.
Hamming Distance: The Hamming distance is named after Richard Hamming. In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different.
Hamming distance: 5 6 6 5 4 8
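The schematic above can be sketched as follows: each user is a vector of like (1) / dislike (0) ratings, the unknown rating is predicted from the user(s) with the lowest Hamming distance, and the distance is computed over all items except the one being predicted. The vectors, the function name `predict_rating`, and the majority-vote rule for k > 1 are assumptions for this illustration; the slide itself only shows the single nearest neighbor.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def predict_rating(current, others, item, k=1):
    """Predict the unknown current[item] from the k nearest other users."""
    # Compare users on every item except the one being predicted.
    def without_item(user):
        return user[:item] + user[item + 1:]

    base = without_item(current)
    nearest = sorted(others, key=lambda o: hamming(base, without_item(o)))[:k]
    likes = sum(o[item] for o in nearest)
    # Assumed rule: predict "like" only on a strict majority of the neighbors.
    return 1 if 2 * likes > len(nearest) else 0


# Hypothetical data: the current user has not rated the item at index 2.
current = [1, 0, None, 1, 0, 1]
others = [
    [1, 0, 1, 1, 0, 1],  # identical on the known items, liked the item
    [0, 1, 0, 0, 1, 0],  # opposite taste
    [1, 0, 0, 0, 0, 1],  # differs on one known item, disliked the item
]

print(predict_rating(current, others, item=2))       # nearest neighbor only
print(predict_rating(current, others, item=2, k=3))  # vote among three users
```

With k=1 the prediction simply copies the closest user's rating, which matches the slide; larger k trades that sensitivity to one user for a vote among several.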
27. SVD - Singular Value Decomposition
Matrix factorization technique
SVD sample matrix
SVD is extraordinarily useful and has many applications such as data analysis, signal processing, pattern recognition, image compression, weather prediction, and Latent Semantic Analysis or LSA
Probably most popular model among Netflix contestants.
Has become the Collaborative Filtering standard
The Singular Value Decomposition (SVD) is a widely used technique to decompose a matrix into several component matrices, exposing many of the useful and interesting properties of the original matrix.
28. SVD - Singular Value Decomposition
Matrix factorization technique
With SVD a matrix is factored into a series of linear approximations that expose the underlying structure of the matrix.
In the Recommendation Systems field, SVD models users and items as vectors of latent features; the dot product of a user vector and an item vector gives the predicted rating of the item by the user.
The goal is to uncover latent features that explain observed ratings.
[Figure: SVD sample matrix]
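A minimal sketch of the latent-feature idea follows. It is not an exact SVD: it assumes the common recommender-system variant in which user and item factor vectors are fitted to the observed ratings by stochastic gradient descent. The toy ratings, learning rate, regularization and function names are all hypothetical.

```python
import math
import random


def dot(p, q):
    """Predicted rating: dot product of a user vector and an item vector."""
    return sum(a * b for a, b in zip(p, q))


def rmse(ratings, P, Q):
    return math.sqrt(sum((r - dot(P[u], Q[i])) ** 2 for u, i, r in ratings) / len(ratings))


def init_factors(n_users, n_items, n_factors, seed=0):
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    return P, Q


def train(ratings, P, Q, lr=0.02, reg=0.02, epochs=500):
    """Fit latent features to the observed ratings only (in place)."""
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - dot(P[u], Q[i])
            for f in range(len(P[u])):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)


# Hypothetical (user, item, rating) triples on a 1-5 scale.
ratings = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1),
           (2, 1, 5), (2, 2, 2), (3, 0, 1), (3, 2, 5)]
P, Q = init_factors(n_users=4, n_items=3, n_factors=2)
rmse_before = rmse(ratings, P, Q)
train(ratings, P, Q)
rmse_after = rmse(ratings, P, Q)
unseen = dot(P[0], Q[2])  # user 0 never rated item 2
print(rmse_before, rmse_after, unseen)
```

Each column of P and Q plays the role of one latent concept; the model is free to choose what the concepts mean, which is why they are called hidden.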
29. Latent Factor Models
Schematic example
[Figure: the SVD process maps Users & Ratings to Latent Concepts or Factors]
Hidden Concept: SVD reveals hidden connections and their strength.
Revealed Concept: males that like watching serious movies.
30. Latent Factor Models
Schematic example
[Figure: Users & Ratings mapped to Latent Concepts or Factors]
Recommendation: SVD revealed a movie this user might like!
31. Latent Factor Models
Concept space
33. Similarity Weights Optimization
SWO vs. Nearest Neighbor
SWO: The similarity function (Pearson, Cosine) is used to determine the neighbors. The weights for the weighted average are found via an optimization process which minimizes the total prediction error.
kNN: The similarity function (Pearson, Cosine) is used for both determining the nearest neighbors and determining the weights in the weighted average of the prediction.
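The contrast between the two columns can be sketched as follows. This is an illustration under assumptions, not the deck's actual model: the neighbors are taken as already chosen, ratings are assumed to be normalized (centered), the similarity values are invented, and the optimization is plain stochastic gradient descent on the squared error.

```python
def predict(weights, neighbor_ratings):
    """Weighted combination of the neighbors' (normalized) ratings."""
    return sum(w * r for w, r in zip(weights, neighbor_ratings))


def total_squared_error(weights, samples):
    return sum((y - predict(weights, x)) ** 2 for x, y in samples)


def similarity_weights(similarities):
    """kNN style: the similarity values themselves become the weights."""
    total = sum(abs(s) for s in similarities)
    return [s / total for s in similarities]


def optimize_weights(samples, start, lr=0.05, epochs=200):
    """SWO style: the weights are free parameters fitted to minimize the error."""
    w = list(start)
    for _ in range(epochs):
        for x, y in samples:
            err = y - predict(w, x)
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
    return w


# Hypothetical training samples: (ratings of 3 fixed neighbors, true rating).
samples = [
    ([1.0, -1.0, 0.5], 0.7),
    ([-2.0, 1.0, 1.0], -1.5),
    ([0.5, 2.0, -1.0], 0.6),
    ([1.5, 0.5, 2.0], 1.25),
    ([-1.0, -1.5, 0.0], -0.95),
]
knn_w = similarity_weights([0.9, 0.8, 0.7])  # hypothetical similarities
swo_w = optimize_weights(samples, start=knn_w)
error_knn = total_squared_error(knn_w, samples)
error_swo = total_squared_error(swo_w, samples)
print(error_knn, error_swo)
```

In this toy data the true ratings depend mostly on the first neighbor, which a similarity-derived weighting cannot discover but the fitted weights can.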
34. Similarity Weights Optimization
Data Normalization
We need to identify relations and mix ratings across items and users.
However, user- and item-specific variability masks the fundamental relationships.
Examples:
Some items are systematically rated higher.
Some items were rated by users that tend to rate low.
Ratings change over time.
Normalization is critical to the success of a kNN approach.
35. Similarity Weights Optimization
Data Normalization
Remove data characteristics that are unlikely to be explained by kNN.
Common practice is to use centering: remove user and item means.
A more comprehensive approach eliminates additional interfering variability, such as time effects.
Here, we normalize by removing the baseline estimates.
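One simple way to compute and remove baseline estimates is sketched below. It assumes the baseline for a rating is the global mean plus an item offset plus a user offset, each estimated by plain averaging; the slides do not say how the baselines were estimated, so this decomposition and all names are assumptions.

```python
from collections import defaultdict


def mean(values):
    return sum(values) / len(values)


def remove_baselines(ratings):
    """Return (mu, b_user, b_item, residuals) for (user, item, rating) triples."""
    mu = mean([r for _, _, r in ratings])

    # Item offset: how far the item's ratings sit from the global mean.
    per_item = defaultdict(list)
    for _, i, r in ratings:
        per_item[i].append(r - mu)
    b_item = {i: mean(v) for i, v in per_item.items()}

    # User offset: what remains for the user after the item offset is removed.
    per_user = defaultdict(list)
    for u, i, r in ratings:
        per_user[u].append(r - mu - b_item[i])
    b_user = {u: mean(v) for u, v in per_user.items()}

    residuals = [(u, i, r - (mu + b_user[u] + b_item[i])) for u, i, r in ratings]
    return mu, b_user, b_item, residuals


# Hypothetical ratings on a 1-5 scale.
ratings = [("a", "x", 5), ("a", "y", 3), ("b", "x", 4), ("b", "y", 2), ("c", "x", 3)]
mu, b_user, b_item, residuals = remove_baselines(ratings)
print(mu, b_user, b_item, residuals)
```

A neighborhood model would then be trained on the residuals, and the baseline would be added back to each prediction.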
36. Similarity Weights Optimization
Neighborhood modeling through global optimization
A basic model
38. Method 2
Ensemble
Description
Ensemble methodology imitates the human nature to seek advice before making any crucial decision.
“Two heads are better than one”.
The innovation is adopting the Ensemble concept from the general machine learning field to the Recommender System domain.
Selected Techniques
Bagging (Breiman, 1996)
AdaBoost (Freund and Schapire, 1996)
Random Parameter Manipulation
39. Ensemble at 30,000 feet
Overview
When important decisions have to be made, society often
places its trust in groups of people. We have parliaments,
juries, committees, and boards of directors, whom we are
happy to have make decisions for us.
Ensemble imitates the human nature to seek advice before
making any crucial decision. It is achieved by weighing the
individual opinions, and combining them before reaching a final
decision, hence the names “The Wisdom of Crowds” and
“Committee of Experts”.
We can ensure that, in the worst case, the ensemble will produce
results no worse than those of the worst classifier in the ensemble.
40. Ensemble
Overview
What is it?
If you think about it, Ensemble is not a question to be answered.
So what is it then?
Ensemble is the answer.
So what is the question?
How to improve results!
41. Ensemble
Improving results…
Why do we care? Because...
Having improved results will prevent cases like this.
42. Ensemble
A short story
Francis Galton
Galton promoted statistics and invented the concept of
correlation.
In 1906 Galton visited a livestock fair and stumbled upon an
intriguing contest.
An ox was on display, and the villagers were invited to guess
the animal's weight.
Nearly 800 gave it a go and, not surprisingly, not one hit the
exact mark: 1,198 pounds.
Astonishingly, however, the average of those 800 guesses came
close - very close indeed. It was 1,197 pounds.
43. Ensemble
Does it always work?
Does Ensemble always work? No
Not all crowds
(groups) are wise.
For example, crazed
investors in a stock
market bubble.
44. Ensemble
Schematic Example
[Figure: Recommender 1, Recommender 2 and Recommender 3 feed a Combined Recommender]
Weak Learners: And they all may be just weak learners.
Problem Example: Linear recommenders cannot solve non-linearly separable problems; however, their combination can.
45. Ensemble
Why use Ensembles?
Statistical Reasons, Risk reduction
Out of many recommender models with similar training / test errors, which one shall we pick? If we just pick one at random, we risk the possibility of choosing a really poor one. Combining / averaging them may prevent us from making one such unfortunate decision.
Computational Reasons
Every time we run a recommendation algorithm, we may find different local optima. Combining their outputs may allow us to find a solution that is closer to the global minimum.
Too little data / too much data
Generating multiple recommenders with the re-sampling of the available data / mutually exclusive subsets of the available data.
Representational Reasons
The recommender space may not contain the solution to a given particular problem. However, an ensemble of such recommenders may.
46. Ensemble
The Diversity Paradox
Diversity vs. Accuracy
On one hand we expect the ensemble members to be as good as possible, so they all target the same goal.
On the other hand they have to be independent, which means different, hence lowering the accuracy.
There’s no real Paradox…
Ideally, all committee members would be right about everything!
If not, they should be wrong about different things.
47. Ensemble
Single-model Ensemble RS
Example configuration
Step 1: Users & items ratings input (the Rating Matrix).
Step 2: Generate different variations of the same input (Rating Matrix 1 to Rating Matrix M).
Step 3: The actual CF method & technique (the inducer, used in training).
Step 4: Produce several recommendations (RS 1 to RS M).
Step 5: Combine the different recommendations (Ensemble RS).
Step 6: Generates more accurate predictions than each individual RS.
48. Netflix Prize
The Competition
The Netflix prize story
In October 2006, Netflix announced it would give $1 million to whoever created a movie-recommending algorithm 10% better than its own.
Within two weeks, the DVD rental company had received 169 submissions, including three that were slightly superior to Cinematch, Netflix's recommendation software.
After a month, more than a thousand programs had been entered, and the top scorers were almost halfway to the goal.
But what started out looking simple suddenly got hard. The rate of improvement began to slow. The same three or four teams clogged the top of the leaderboard.
Progress was almost imperceptible, and people began to say a 10 percent improvement might not be possible.
Three years later, on 21st of September 2009, Netflix announced the winner.
49. Netflix Prize
The winning team used an Ensemble
FACT: Actually, the top 100 solutions were Ensemble based.
50. Netflix Prize
And the winner is…
We have a winner! So why bother?
You may ask yourself, why do we need to further research & develop the Ensemble?
Because it was solved in a manual, tailored way, combining a set of predefined methods.
There is plenty of room for improvements.
51. Netflix Prize
The real winner
The real winner is the method!
One could say that the Ensemble techniques and methods helped tip the scales.
While the algorithms and good knowledge of statistics go a long way, it was ultimately the cross-team collaboration that ended the contest.
It is easy to overlook the fact that many teams were actually committees of experts by themselves.
"The Ensemble" team, appropriately named for the technique they used to merge their results, consists of over 30 people.
Likewise, the winning team is a collaborative effort of several distinct groups that merged their results.
54. Bagging
Overview
Introduced by Breiman (1996).
“Bagging” stands for “bootstrap aggregating”.
It is an ensemble method: a method of combining multiple predictors.
The intuition is that by using only part of the data and making some data (randomly) have more impact, you get a better variety of models that will reduce overfitting.
55. Bagging-based sampling of rating matrix
Schematic example
Bagging in action
Step 1: A random subset of the training set is taken.
56. Bagging-based sampling of rating matrix
Schematic example
Bagging in action
57. Bagging-based sampling of rating matrix
Schematic example
Bagging in action
Step 2: Some of the data in this subset is duplicated several times.
58. Bagging-based sampling of rating matrix
Schematic example
Bagging in action
From here to a recommendation
The input set is given to one of the recommendation methods.
This is repeated until every method has an input set.
The average result (or the most common one) is picked.
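The bagging steps above can be sketched as follows. Sampling with replacement produces both effects described in the slides: some ratings are left out and some are duplicated. The base recommender here is a deliberately simple item-mean predictor, which is an assumption for illustration; any CF method could take its place.

```python
import random
from collections import defaultdict


def bootstrap_sample(ratings, rng):
    """Draw len(ratings) ratings with replacement."""
    return [rng.choice(ratings) for _ in ratings]


def fit_item_means(ratings):
    """Stand-in base recommender: predict the mean rating of the item."""
    per_item = defaultdict(list)
    for _, item, r in ratings:
        per_item[item].append(r)
    overall = sum(r for _, _, r in ratings) / len(ratings)
    means = {item: sum(v) / len(v) for item, v in per_item.items()}
    return lambda item: means.get(item, overall)


def bagged_predictor(ratings, n_models=10, seed=0):
    """Train one model per bootstrap sample and average their predictions."""
    rng = random.Random(seed)
    models = [fit_item_means(bootstrap_sample(ratings, rng)) for _ in range(n_models)]
    return lambda item: sum(m(item) for m in models) / len(models)


# Hypothetical (user, item, rating) triples on a 1-5 scale.
ratings = [("a", "x", 5), ("b", "x", 4), ("c", "x", 2),
           ("a", "y", 1), ("b", "y", 2), ("c", "z", 3)]
sample = bootstrap_sample(ratings, random.Random(1))
predict = bagged_predictor(ratings)
print(predict("x"), predict("y"))
```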
60. AdaBoost
Overview
Introduced by Freund and Schapire, 1996.
“AdaBoost” stands for “Adaptive Boosting”.
Boosting: to boost a “weak” learning algorithm into a “strong” learning algorithm.
It is an ensemble method.
Training samples are weighted differently across the ensemble members.
61. AdaBoost
Overview
The Process
We start with building an initial model.
Next, that model is improved by modifying the input (training) set to emphasize (for example by duplicating) the part of the input where the model was less accurate.
The model is rebuilt and checked for its accuracy.
The process repeats until the error of the model is lower than some bound.
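A minimal sketch of the process, assuming the classic two-class form of AdaBoost with one-dimensional threshold rules ("stumps") as the weak learner. It works on like (+1) / dislike (-1) labels rather than on a rating matrix, uses sample weights instead of duplication, and runs a fixed number of rounds instead of stopping at an error bound; these simplifications and all names are assumptions.

```python
import math


def stump_predict(x, threshold, sign):
    return sign if x >= threshold else -sign


def best_stump(xs, ys, weights):
    """Weak learner: the threshold rule with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for sign in (1, -1):
            err = sum(w for w, x, y in zip(weights, xs, ys)
                      if stump_predict(x, threshold, sign) != y)
            if best is None or err < best[0]:
                best = (err, threshold, sign)
    return best


def adaboost(xs, ys, rounds):
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, sign = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, threshold, sign))
        # Emphasize the samples the current model got wrong.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, sign))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    score = sum(alpha * stump_predict(x, t, s) for alpha, t, s in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical data that no single threshold rule can separate.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
weak = adaboost(xs, ys, rounds=1)
strong = adaboost(xs, ys, rounds=3)
print([predict(weak, x) for x in xs], [predict(strong, x) for x in xs])
```

The single rule misclassifies part of the data, while the weighted vote of three rules fits all six points, which is the weak-to-strong effect the slides describe.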
62. AdaBoost
Schematic example
Step 1: We start with building an initial model.
Step 2: Next, that model is improved by modifying the input set to emphasize the part of the input where the model was less accurate.
Step 3: The model is rebuilt and checked for its accuracy.
Final step: The process repeats until the error of the model is lower than some bound.
[Figure: training leads to the combined recommender]
64. Random Parameter Manipulation
Overview
The idea is to have multiple variations of the same
recommendation technique
The variations are formed by changing the input parameters
systematically
The Ensemble is achieved by combining the modified
recommenders in order to produce a unified prediction
65. Random Parameter Manipulation
Schematic example
Example: Averaging multiple SVD models based on different values of F
Variations of SVD: different F values, 3 to 5
Ensemble: combined recommenders
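The example can be sketched as below, assuming F denotes the number of latent factors and that each SVD variation is a small factorization fitted by stochastic gradient descent. The training routine, the toy ratings and every parameter value are hypothetical; only the idea of averaging models with F from 3 to 5 comes from the slide.

```python
import random


def train_mf(ratings, n_users, n_items, n_factors, lr=0.02, reg=0.02, epochs=300, seed=0):
    """Fit a small latent-factor model and return its prediction function."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(a * b for a, b in zip(P[u], Q[i]))
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return lambda u, i: sum(a * b for a, b in zip(P[u], Q[i]))


def rpm_ensemble(ratings, n_users, n_items, factor_counts=(3, 4, 5)):
    """Same technique, different parameter values, averaged prediction."""
    models = [train_mf(ratings, n_users, n_items, f, seed=f) for f in factor_counts]

    def combined(u, i):
        return sum(m(u, i) for m in models) / len(models)

    return models, combined


# Hypothetical (user, item, rating) triples on a 1-5 scale.
ratings = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1),
           (2, 1, 5), (2, 2, 2), (3, 0, 1), (3, 2, 5)]
models, combined = rpm_ensemble(ratings, n_users=4, n_items=3)
individual = [m(0, 2) for m in models]
averaged = combined(0, 2)
print(individual, averaged)
```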
67. Ensemble
Testing coverage
Coverage Details
Each of the three CF techniques will be tested with an ensemble technique.
There are 9 possible combinations of techniques.
The diagram is color coded for convenience.
69. Method 3
Context-Based
Description
Adapting the recommendations to the specific user context.
“Tell me the music that I want to listen to NOW”.
Selected Techniques
Item Split
Linear Models
70. Context-Based Recommender Systems
Overview
The recommender system uses additional data about the
context of an item consumption.
For example, in the case of a restaurant the time or the
location may be used to improve the recommendation
compared to what could be performed without this
additional source of information.
A restaurant recommendation for a Saturday evening when
you go with your spouse should be different than a restaurant
recommendation on a workday afternoon when you go with
co-workers
71. Context-Based Recommender Systems
Motivation
Motivating Examples
Recommend a vacation
Winter vs. summer
Recommend a purchase (e-retailer)
Gift vs. for yourself
Recommend a movie
To a student who wants to see it on Saturday
night with his girlfriend in a movie theater.
72. Context-Based Recommender Systems
Motivation
Motivating Examples
Recommend music
The music that we like to hear is greatly affected by a context, which can be thought of as a mixture of our feelings (mood) and the situation or location (the theme) we associate it with.
Listening to Bruce Springsteen's "Born in the U.S.A." while driving along the 101.
Listening to Mozart's Magic Flute while walking in Salzburg.
73. Information Discovery: Example
“Tell me the music that I want to listen to NOW”
Musicovery.com
An interactive, personalized web radio.
A mood matrix proposes a relationship between music and mood.
20 genres and time periods, and a popularity scale (hits, less known songs/discovery).
Covers all musical genres, rap to funk via electro, rock, disco… or classical.
Details
Ethnographic studies have shown that people choose music pieces according to their mood or mood change expectation.
Musicovery relied on this principle to build an effective relationship between music and emotion.
74. Context-Based Recommender Systems
Context vs. others
What do simple recommendation techniques ignore?
What is the user when asking for a recommendation?
Where (and when) the user is ?
What does the user (e.g., improve his knowledge
or really buy a product)?
Is the user or with other ?
Are there products to choose or only ?
Is the word economy or ?
75. Context-Based Recommender Systems
Context vs. others
Plain recommendation technologies forget to take into account the user context.
76. Context-Based Recommender Systems
Foundations
Contextual Computing
Contextual computing refers to the enhancement of a user’s interactions by understanding the user, the context, and the applications and information being used, typically across a wide set of user goals.
Actively adapting the computational environment, for each and every user, at each point of computation.
The contextual computing approach focuses on understanding the information consumption patterns of each user.
Contextual computing focuses on the process, not only on the output of the search process. [Pitkow et al., 2002]
77. Context-Based Recommender Systems
Major obstacles
Major obstacles for contextual computing
Obtaining sufficient and reliable data describing the user context.
Selecting the right information, i.e., information relevant to a particular personalization task.
Understanding the impact of contextual dimensions on the personalization process.
Computationally modeling the contextual dimensions in a more classical recommendation technology.
For instance: how to extend Collaborative Filtering to include contextual dimensions?
79. Context-Based Recommender Systems
Item Split approach
Item Split - Intuition and Approach
The same item in different contextual conditions may produce
a different user experience
We consider the same item in different contexts as distinct
items
Research goal: Provide better music recommendations. Improve
Collaborative Filtering accuracy when the user context is known.
80. Context-Based Recommender Systems
Collaborative Filtering
Context in Collaborative Filtering
“Context is any information that can be used to characterize the situation of an entity” [A.K. Dey, 2001]
In the Item Splitting approach, similarly to [Adomavicius et al., 2005], we model the context with a set of dynamic features of the rating, representing conditions that can rapidly change their state.
When a user evaluates an item, the rating is recorded together with the current state of the contextual variables.
CF does not provide a direct method to integrate additional information into the recommendation process.
81. Context-Based Recommender Systems
Reduction-Based Approach
Reduce the problem of multi-dimensional recommendation to the traditional two-dimensional User x Item.
For each “value” of the contextual dimension(s), estimate the missing ratings with a traditional method.
Example
$R : U \times I \times T \to [0,1] \cup \{?\}$ ; User, Item, Time
$R^{D}(u, i, t) = R^{D[T=t]}(u, i)$
The context-dependent estimation for (u, i, t) is computed using a traditional approach, in a two-dimensional setting, but using only the ratings that have T=t.
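The reduction can be sketched as a filter followed by any two-dimensional method. In this illustration the "traditional method" is an item-mean predictor, chosen only to keep the example short; the data, the context values and the function names are hypothetical.

```python
def slice_for_context(ratings, t):
    """Keep only the ratings with T = t and drop the context dimension."""
    return [(u, i, r) for u, i, ctx, r in ratings if ctx == t]


def item_mean_predictor(ratings_2d):
    """Stand-in two-dimensional method: predict the item's mean rating."""
    if not ratings_2d:
        raise ValueError("no ratings available for this context")
    per_item = {}
    for _, item, r in ratings_2d:
        per_item.setdefault(item, []).append(r)
    overall = sum(r for _, _, r in ratings_2d) / len(ratings_2d)

    def predict(user, item):
        values = per_item.get(item)
        return sum(values) / len(values) if values else overall

    return predict


def predict_in_context(ratings, user, item, t):
    """R^D(u, i, t) estimated from the slice D[T = t] only."""
    return item_mean_predictor(slice_for_context(ratings, t))(user, item)


# Hypothetical (user, item, time, rating) tuples with ratings in [0, 1].
ratings = [
    ("ann", "beach", "summer", 1.0),
    ("bob", "beach", "summer", 0.8),
    ("ann", "beach", "winter", 0.2),
    ("cat", "ski", "winter", 0.9),
]
summer = predict_in_context(ratings, "dan", "beach", "summer")
winter = predict_in_context(ratings, "dan", "beach", "winter")
print(summer, winter)
```

The same item receives different estimates in different contexts because each estimate sees only its own slice of the data, which is also why sparse slices are a practical concern for this approach.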
82. Context-Based Recommender Systems
Reduction-Based Approach
[Figure: the Multidimensional Model (user features, product features, ratings) next to the Bi-dimensional Model (user x item)]
We use only the slice for T=t.
The idea is to reduce the problem from here (the multidimensional model) to here (the bi-dimensional model), into a manageable model.
83. Context-Based Recommender Systems
Reduction-Based vs. Item splitting
Reduction Based
Uses cross-validation as goodness of segmentation: expensive.
Segments are the same for all the items.
Prediction is made using only the relevant segment.
Item splitting
Uses external impurity measures (i.e. IG): heuristic based.
Each item is tested for a split separately.
Prediction is made using all the information, including split items.
Bottom Line
The best known method (Reduction Based) is difficult to apply (need to search in a huge space of contextual sectors).
We are proposing a more adaptive and computationally efficient approach.
84. Context-Based Recommender Systems
Item Split technique
Item Split - Intuition and Approach
Each item in the database is a candidate for splitting.
Context defines all possible splits of an item's ratings vector.
We test all the possible splits; we do not have many contextual features.
We choose one split (using a single contextual feature) that maximizes an impurity measure and whose impurity is higher than a threshold.
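The selection of a split can be sketched as follows. The impurity measure used here, the absolute difference between the mean ratings of the two halves, is a stand-in chosen for brevity and is an assumption; the slides mention measures such as IG without giving a formula. The track, context features and threshold values are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def best_split(item_ratings, features, threshold):
    """Pick the single-feature split with the highest impurity score.

    item_ratings: list of (rating, context) pairs, context being a dict.
    Returns (feature, value, score), or None if no split beats the threshold.
    """
    best = None
    for feature in features:
        for value in sorted({ctx[feature] for _, ctx in item_ratings}):
            inside = [r for r, ctx in item_ratings if ctx[feature] == value]
            outside = [r for r, ctx in item_ratings if ctx[feature] != value]
            if not inside or not outside:
                continue
            score = abs(mean(inside) - mean(outside))  # stand-in impurity
            if best is None or score > best[2]:
                best = (feature, value, score)
    if best is None or best[2] <= threshold:
        return None
    return best


def split_item(item_id, item_ratings, split):
    """Relabel the ratings so the two contexts become two distinct items."""
    feature, value, _ = split
    return [
        (f"{item_id}|{feature}={value}" if ctx[feature] == value
         else f"{item_id}|{feature}!={value}", r)
        for r, ctx in item_ratings
    ]


# Hypothetical ratings of one music track with two contextual features.
track = [
    (5, {"weather": "sun", "time": "day"}),
    (5, {"weather": "sun", "time": "night"}),
    (4, {"weather": "sun", "time": "day"}),
    (2, {"weather": "rain", "time": "day"}),
    (1, {"weather": "rain", "time": "night"}),
    (2, {"weather": "rain", "time": "night"}),
]
split = best_split(track, features=["weather", "time"], threshold=1.5)
relabeled = split_item("track42", track, split)
print(split, relabeled)
```

After relabeling, an ordinary CF method can be trained on the enlarged item set without any further changes.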
86. Context-Based Recommender Systems
Contextual Modelling approach
Overview
In these approaches the context data are explicitly used in the prediction model.
There are several possibilities for using the contextual data.
For instance, the context can be used to extend the definition of the distance function in nearest neighbours approaches.
The distance function must now also include a "context distance" aspect, in addition to the user distance (CF) or item distance (CB).
87. Context-Based Recommender Systems
Linear Models approach
Overview
Presents an extension of the Matrix Factorization (MF) rating
prediction technique that incorporates contextual
information to adapt the recommendation to the user target
context.
In this approach one model parameter was introduced for
each contextual factor and music track genre pair.
This allowed learning how the context affects the ratings and
how they deviate from the classical personalized prediction.
88. Context-Based Recommender Systems
Linear Models approach
Example
The standard rating prediction for a user u and item i can be computed by a standard matrix factorization method for collaborative filtering. This is the simple predicted rating for this user and item pair, namely 4.24.
89. Context-Based Recommender Systems
Linear Models approach
Example
In addition to that, the model that we have used estimates context-aware predictions, i.e., predictions where a context is specified: in the figure we have two contexts, c1 and c2 (sun and rain).
90. Context-Based Recommender Systems
Linear Models approach
Example
The model makes these two context-aware rating predictions (4.94 and 3.84) by estimating on the available data two additional parameters that model the influence of the context on the item, bic1 and bic2.
These two parameters describe the modifications to be made to the non context-aware prediction to take into account the context. In the first case the predicted rating must be increased by 0.7 and in the second case decreased by 0.4.
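The arithmetic of the example can be written out directly. This is a sketch that assumes the context parameter is simply added to the non context-aware prediction; the dictionary layout and the item label are hypothetical, while the numbers 4.24, 0.7 and 0.4 are the ones quoted in the example.

```python
def context_aware_prediction(base_prediction, context_bias, item, context=None):
    """Add the learned item-context offset to the plain MF prediction."""
    return base_prediction + context_bias.get((item, context), 0.0)


base = 4.24  # standard matrix factorization prediction for (u, i)
context_bias = {
    ("i", "c1"): 0.7,   # b_ic1: first context raises the rating
    ("i", "c2"): -0.4,  # b_ic2: second context lowers the rating
}

in_c1 = context_aware_prediction(base, context_bias, "i", "c1")
in_c2 = context_aware_prediction(base, context_bias, "i", "c2")
no_context = context_aware_prediction(base, context_bias, "i")
print(round(in_c1, 2), round(in_c2, 2), no_context)
```

When no context is given, the offset is zero and the model falls back to the classical personalized prediction.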
91. Context-Based Recommender Systems
Linear Models approach
Predictive Model
Context Aware Collaborative Filtering
92. Context-Based Recommender Systems
Linear Models approach
Performance comparison by Mean Absolute Error
The largest improvement with respect to the non-personalized model based on the item average is achieved, as expected, by personalizing the recommendations (“MF CF"). This gives an improvement of 5%.
The personalized model can be further improved by contextualization (“MF CF + Context"), producing an improvement of 7% with respect to the item average prediction, and a 3% improvement over the personalized model.
The modeling approach and the rating acquisition process can substantially improve the rating prediction accuracy when taking into account the contextual information.
94. Method 4
Cross Domain
Description
Cross-domain recommenders can recommend products and services of several domains that share resources (e.g., users, items, ratings, features, latent patterns).
Knowledge from one or several domains might be utilized in another domain to improve recommendations.
Selected Techniques
User-model mediation and aggregation
95. Cross-Domain
Overview
The majority of recommender systems (RS) work in a single domain, such as movies, books, tourism, etc.
However, human preferences may span across multiple domains.
Knowledge of a user’s behavior in different domains might improve prediction in a specific domain.
A company might have knowledge of a user in one or more domains other than the target domain of the recommendation and would like to use it.
96. Cross-Domain
Overview
Motivation
Sparsity and cold-start problems: cross-domain algorithms may enrich the training data with data from other domains to prevent sparsity.
User friendly systems: by making use of data that was collected for one domain in other domains, systems can avoid interrupting users to ask for feedback.
Availability of cross domain data: many e-commerce systems and social networks contain information on users' preferences in several domains. Thus, cross-domain information is available, and it is motivating to look for effective algorithms that can make use of this data to improve recommender systems performance (e.g., x-loads domains).
Marketing, cross-selling of new products: Marketing studies found that it is effective to promote products from different domains to a user if they fit her buying patterns across domains.
97. Cross-Domain
Overview
State of the art techniques
User-model mediation and aggregation
This technique was suggested by (Berkovsky et al., 2006, 2007, 2008).
Aims at the sparsity challenge of recommender systems by enriching the UM with data from a remote system.
Requires overlap of users between domains.
Evaluation was performed for sub-domains of the same domain.
Content-based unified user-model
(Ghani and Fano, 2002) proposed generating a content-based user model that can be used across domains.
Extracting semantic features that might be relevant for many domains and are pre-defined by domain experts (e.g., trendiness vs. individualism).
Not implemented or evaluated.
98. Cross-Domain
Overview
State of the art techniques
Transfer learning (TL)
A relatively young research area (since 1995) in Machine Learning.
Aims at extracting knowledge that was learned for one task in a domain and using it for a target task in a different domain.
The TL technique has recently been gaining attention for applications where datasets are available only for specific domains.
100. Cross-Domain
User-model Mediation and Aggregation
Intuition and Approach
This technique was suggested by Berkovsky et al. (2006, 2007, 2008) and aims at the sparsity challenge of recommender systems by enriching the UM with data from a remote (source) system.
The suggested technique was demonstrated for the collaborative filtering approach and is based on mediating user model data from other domains to enrich the user's model.
A similar approach was presented by (Gonzales et al., 2006), who generate a unified UM that aggregates features from different domains and maps the aggregated features to relevant domains.
101. Cross-Domain
User-model Mediation and Aggregation
Intuition and Approach
Application of the mediation suggested above by Berkovsky et al. requires:
Overlapping users: mediation enriches the data about a specific user with data about the same user from another domain (for other items, and maybe also in another context).
Same prediction task: mediation of data from other user models was applied from systems that implemented the same prediction function (collaborative filtering), thus employing the same UM (user's ratings on items).
Similarity between domains: a method to identify such similarity is needed. Similarity should be integrated in the recommender algorithm.
102. Cross-Domain
User-model Mediation and Aggregation
UM Aggregation approaches
Domain 1: Source. Domain 2: Target.
Type 1
K nearest neighbors are computed in the source domain.
These neighbors are utilized to generate recommendations in the target domain.
This method is usable for a user that is new in the target domain and has history in the source domain.
Type 2
K nearest neighbors are computed in the source domain (Ks).
K nearest neighbors are also computed in the target domain (Kt).
The most similar K neighbors are selected from the union of Ks and Kt.
Combine recommendation
Consider the two domains as one integrated domain:
As in Type 1, the set of K from domain 1 presents the nearest neighbors.
But in this case it is aggregated with the set of K nearest neighbors within domain 2.
From the aggregation results, the K users with maximum cosine similarity were selected, and the prediction was made with respect to those K nearest neighbors.
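The Type 1 approach can be sketched as below: neighbors are found from source-domain ratings and their target-domain ratings produce the prediction. The similarity-weighted average, the toy profiles and all names are assumptions made for illustration; Type 2 and the combined variant would additionally need a rule for ranking neighbors that come from two domains, which is not sketched here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse rating profiles (dicts)."""
    common = set(a) & set(b)
    num = sum(a[i] * b[i] for i in common)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0


def source_neighbors(user, source, k):
    """K nearest neighbors computed in the source domain only."""
    scored = [(cosine(source[user], profile), other)
              for other, profile in source.items() if other != user]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]


def predict_in_target(user, item, source, target, k):
    """Similarity-weighted average of the neighbors' target-domain ratings."""
    rated = [(sim, target[other][item])
             for sim, other in source_neighbors(user, source, k)
             if item in target.get(other, {})]
    total = sum(sim for sim, _ in rated)
    if total == 0:
        return None
    return sum(sim * r for sim, r in rated) / total


# Hypothetical profiles: "ann" has history in the source domain only.
source = {
    "ann": {"s1": 5, "s2": 1},
    "bob": {"s1": 5, "s2": 1},
    "cat": {"s1": 1, "s2": 5},
    "dan": {"s1": 4, "s2": 2},
}
target = {"bob": {"t1": 4}, "cat": {"t1": 2}, "dan": {"t1": 5}}

neighbors = [name for _, name in source_neighbors("ann", source, k=2)]
prediction = predict_in_target("ann", "t1", source, target, k=2)
print(neighbors, prediction)
```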
Similarity Weights Optimization: also known by the name "Neighborhood modeling through global optimization". In SWO the similarity function (Pearson, Cosine) is only used to determine the neighbours. The weights for the weighted average are found via an optimization process which minimizes the total prediction error – the weights are the optimized parameter in the error function. The difference between NN CF and SWO (similarity weight optimization) is that in NN CF the similarity function (Pearson, Cosine) is used to both determine the nearest neighbours and determine the weights in the weighted average of the prediction. This technique requires data normalization.
In some situations the system can be asked for a recommendation tailored for a group of people. For example, if a family is sitting together watching TV, the system needs to recommend something that suits the family as a whole. A sports show might be more interesting for the father, but would leave some other members of the family unsatisfied. In some systems the group is dynamic, and the members of the group change over time, which requires constant adjustments on the system's part. The satisfaction of individuals may be a complex matter: for example, if the TV show makes the children happy, then the mother may also be (indirectly) happy just because her children are happy. In some cases multiple items are recommended to the group; for example, in a trip recommender there is time to visit 4 different places within a day's trip, and different members prefer to visit different locations [1,2,3].