Alan Said, Brijnesh J. Jain, Sahin Albayrak
{alan, jain, sahin}@dai-lab.de
CSCW 2013 – San Antonio, TX, USA

Traditional recommender system evaluation only measures one type of quality, e.g. recommendation accuracy or rating prediction error.
We propose to evaluate and benchmark additional recommendation qualities:
 User Requirements
   • recommendation accuracy
   • perceived quality, etc.
 Business Values
   • Retention
   • Churn, etc.
 Technical Constraints
   • Scalability
   • Speed, etc.

[Figure: 3D evaluation space; surviving axis labels: Business Models, User Requirements]

By defining a recommendation scenario, each of the three factors can be represented by a quality important in the specific use case.

In each scenario, different concepts have different importance.

We represent the quality of an algorithm as a function E, a vector of cost functions:

$$E(f) = \left(E_1(f), \ldots, E_p(f)\right)^T$$

In order to allow for simple comparison, we formulate the utility function as:

$$U(f) = w^T E(f) = \sum_i w_i E_i(f)$$

where w is the vector of weights defining the importance of each axis. The resulting value represents the quality of the recommendation algorithm in the defined use case.
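The weighted sum U(f) = wᵀE(f) can be sketched in a few lines of Python. This is only an illustration: the function name `utility`, the three example values in `E_f`, and the weights in `w` are assumptions made up for this sketch and do not come from the poster or the user study.

```python
from typing import Sequence


def utility(weights: Sequence[float], scores: Sequence[float]) -> float:
    """Compute U(f) = w^T E(f) = sum_i w_i * E_i(f) for one algorithm f."""
    if len(weights) != len(scores):
        raise ValueError("weights and scores must have the same length")
    return sum(w * e for w, e in zip(weights, scores))


# Hypothetical per-axis values E_1(f), E_2(f), E_3(f) for a single algorithm
# (user requirement, business value, technical constraint). Made-up numbers.
E_f = [0.8, 0.6, 0.9]

# Hypothetical weights: the user-requirement axis counts twice as much
# as each of the other two axes.
w = [0.5, 0.25, 0.25]

U_f = utility(w, E_f)
print(f"U(f) = {U_f:.3f}")
```

Keeping the weights in a separate vector means the same measurements E(f) can be reused for several use cases by swapping only `w`.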




We conducted a movie recommendation user study with 132 users providing feedback on 3 recommendation algorithms. Each user rated a number of movies and got 10 recommendations provided by one of the 3 algorithms. The algorithms were tuned to provide traditional recommendations, diverse recommendations, or random recommendations, respectively.
Users were asked if they would watch the recommended movies (user requirement) and whether they would consider using the system again (business value). The technical constraint is represented by the time the algorithm took to recommend movies.




[Figure] The results of the user study, shown with different weights, e.g. when all three axes are similarly weighted, when the user requirements are more important, when the business values are more important, and finally when the technical constraints are more important than the other values.
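How a change of weights can reorder algorithms is easiest to see with a small sketch. The per-axis values below are invented placeholders, not the measurements from the user study, and the names `rank_algorithms`, `scores` and `weightings` are hypothetical. Each algorithm is assumed to have three values in [0, 1], ordered as user requirement, business value, technical constraint, with higher meaning better.

```python
from typing import Dict, List


def rank_algorithms(scores: Dict[str, List[float]], weights: List[float]) -> List[str]:
    """Sort algorithm names by descending utility U(f) = sum_i w_i * E_i(f)."""
    def u(name: str) -> float:
        return sum(w * e for w, e in zip(weights, scores[name]))
    return sorted(scores, key=u, reverse=True)


# Made-up values per algorithm: [user requirement, business value, technical constraint].
scores = {
    "traditional": [0.7, 0.6, 0.3],
    "diverse": [0.6, 0.7, 0.4],
    "random": [0.2, 0.2, 0.9],
}

# Four hypothetical weightings mirroring the scenarios named in the caption.
weightings = {
    "equal": [1 / 3, 1 / 3, 1 / 3],
    "user-focused": [0.6, 0.2, 0.2],
    "business-focused": [0.2, 0.6, 0.2],
    "technical-focused": [0.2, 0.2, 0.6],
}

rankings = {name: rank_algorithms(scores, w) for name, w in weightings.items()}
for name, order in rankings.items():
    print(f"{name}: {order}")
```

With these placeholder numbers the top-ranked algorithm changes with the weighting, which is the effect the different weight settings are meant to expose. The actual ordering in the study depends on the measured values.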




We presented a three-dimensional model for evaluation of recommender systems taking user-centric values, technical constraints and business values into consideration. The model simplifies the evaluation and benchmarking of recommendation algorithms in predefined scenarios, e.g. where different qualities of algorithms are sought.

We evaluated the model through a user study comparing 3 different recommendation algorithms and presented different interpretations of the obtained qualities.

Further explanation of the 3D evaluation concept [RUE'12, Said et al. 2012]
Poster abstract [CSCW'13, Said et al. 2013b]
User-Centric Evaluation of a K-Furthest Neighbor Collaborative Filtering Recommender Algorithm [CSCW'13, Said et al. 2013a]
Presentation: Wednesday Feb 27, 10AM. Track 4.



Technische Universität Berlin
www.dai-lab.de

A 3D Approach to Recommender System Evaluation
