The Search for the Best Live
Recommender System
Torben Brodt
plista GmbH
Keynote
SIGIR Conference 2013, Dublin
BARS Workshop - Benchmarking Adaptive
Retrieval and Recommender Systems
August 1st, 2013
recommendations
where
● news websites
● below the article
different types
● content
● advertising
quality is win-win
● happy user
● happy advertiser
● happy publisher
● happy plista*
* the company I work for
some years ago
one recommender
● collaborative
filtering
○ well known algorithm
○ more data means
more knowledge
● parameter tuning
○ time
○ trust
○ mainstream
one recommender = good result
2008
● finished studies
● publication
● plista was born
today
● 5k recs/second
● many publishers
netflix prize
" use as many
recommenders as
possible! "
more recommenders
lost in serendipity
● we have one score
● lucky success? bad
loss?
● we needed to keep
track of the different
recommenders
success: 0.31 %
how to measure success
number of
● clicks
● orders
● engagements
● time on site
● money
BAD
GOOD
evaluation technology
● features
○ SUM
○ INCR
● big data (!!)
● real time
● in memory
evaluation technology
impressions
collaborative filtering 500 +1
most popular 500
text similarity 500
ZINCRBY "impressions" "1" "collaborative_filtering"
ZREVRANGEBYSCORE "impressions" +inf -inf WITHSCORES
evaluation technology
impressions
collaborative filtering 500
most popular 500
text similarity 500
clicks
collaborative filtering 100
most popular 10
... 1
needs division
ZREVRANGEBYSCORE "clicks" +inf -inf WITHSCORES
ZREVRANGEBYSCORE "impressions" +inf -inf WITHSCORES
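The "needs division" step can be sketched in a few lines. This is a minimal illustration, not plista's code: a plain Python dictionary stands in for the two Redis sorted sets, the helper names `zincrby` and `click_through_rates` are hypothetical, and the unnamed third click row on the slide is left out.

```python
from collections import defaultdict

# In-memory stand-in for two Redis sorted sets ("impressions" and "clicks").
counters = {"impressions": defaultdict(float), "clicks": defaultdict(float)}

def zincrby(key, increment, member):
    """Mimic `ZINCRBY key increment member` on the in-memory stand-in."""
    counters[key][member] += increment
    return counters[key][member]

def click_through_rates():
    """Divide clicks by impressions per recommender, best first."""
    rates = {}
    for member, shown in counters["impressions"].items():
        if shown > 0:
            rates[member] = counters["clicks"].get(member, 0.0) / shown
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)

# Counts taken from the slide: 500 impressions each, 100 and 10 clicks.
for name in ("collaborative_filtering", "most_popular", "text_similarity"):
    zincrby("impressions", 500, name)
zincrby("clicks", 100, "collaborative_filtering")
zincrby("clicks", 10, "most_popular")

ranking = click_through_rates()
for name, rate in ranking:
    print(f"{name}: {rate:.2%}")
```

Against a real Redis server the same division would happen client-side, after reading both sorted sets with their scores.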
evaluation results
● CF is "always" the best
recommender
● but "always" is just the average
over all contexts
let's check the context!
t = time
s = success
evaluation context
● our context is limited to the web
● we have URL + HTTP Headers
○ user agent -> device
○ IP address -> geolocation
○ time -> weekday
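How such attributes might be derived from a request can be sketched as follows. This is an assumption-laden illustration: the function `derive_context`, the `MOBILE_HINTS` substrings, and the two device labels are hypothetical, and IP geolocation is omitted because it needs an external lookup database.

```python
from datetime import datetime, timezone

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]
# Assumed, far from complete list of substrings hinting at a mobile device.
MOBILE_HINTS = ("Mobile", "Android", "iPhone", "iPad")

def derive_context(user_agent, unix_time):
    """Map raw request data to coarse context attributes."""
    is_mobile = any(hint in user_agent for hint in MOBILE_HINTS)
    when = datetime.fromtimestamp(unix_time, tz=timezone.utc)
    return {
        "device": "mobile" if is_mobile else "desktop",
        "weekday": WEEKDAYS[when.weekday()],
    }

# Example request on the day of the talk (August 1st, 2013, noon UTC).
request_time = datetime(2013, 8, 1, 12, 0, tzinfo=timezone.utc).timestamp()
ctx = derive_context("Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X)",
                     request_time)
print(ctx)
```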
evaluation context
we use ~60 context attributes
publisher = welt.de
collaborative filtering 689 +1
most popular 420
text similarity 135
weekday = sunday
collaborative filtering 400 +1
most popular 200
... 100
category = archive
text similarity 200
collaborative filtering 10 +1
... 5
evaluation context
publisher = welt.de
collaborative filtering 689
most popular 420
text similarity 135
weekday = sunday
collaborative filtering 400
most popular 200
... 100
category = archive
text similarity 200
collaborative filtering 10
... 5
ZUNIONSTORE "clk" 3 p:welt.de:clk w:sunday:clk c:archive:clk WEIGHTS 4 1 1
ZREVRANGEBYSCORE "clk" +inf -inf WITHSCORES
ZUNIONSTORE "imp" 3 p:welt.de:imp w:sunday:imp c:archive:imp WEIGHTS 4 1 1
ZREVRANGEBYSCORE "imp" +inf -inf WITHSCORES
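The weighted union over context-specific counters can be sketched without a Redis server. In this illustration the helper `zunionstore` is a hypothetical in-memory imitation of a weighted sorted-set union, the slide's example counts are treated as click counts (an assumption), and the impression counts are invented purely so that the division has something to work with.

```python
def zunionstore(sets, weights):
    """Weighted sum of member scores across sets, like a union with WEIGHTS."""
    merged = {}
    for key, weight in weights.items():
        for member, score in sets[key].items():
            merged[member] = merged.get(member, 0.0) + weight * score
    return merged

CF, MP, TS = "collaborative_filtering", "most_popular", "text_similarity"

# Per-context click counters (named rows from the slide, read as clicks).
clicks = {
    "p:welt.de:clk": {CF: 689, MP: 420, TS: 135},
    "w:sunday:clk": {CF: 400, MP: 200},
    "c:archive:clk": {TS: 200, CF: 10},
}
# Per-context impression counters (invented numbers, not from the talk).
impressions = {
    "p:welt.de:imp": {CF: 10000, MP: 10000, TS: 10000},
    "w:sunday:imp": {CF: 5000, MP: 5000, TS: 5000},
    "c:archive:imp": {CF: 1000, MP: 1000, TS: 1000},
}

merged_clicks = zunionstore(
    clicks, {"p:welt.de:clk": 4, "w:sunday:clk": 1, "c:archive:clk": 1})
merged_impressions = zunionstore(
    impressions, {"p:welt.de:imp": 4, "w:sunday:imp": 1, "c:archive:imp": 1})

# Success per recommender for this request's context: clicks / impressions.
success = {name: merged_clicks.get(name, 0.0) / shown
           for name, shown in merged_impressions.items() if shown > 0}
best = max(success, key=success.get)
print(best, round(success[best], 4))
```

The weights express how much each context attribute should count; here the publisher is weighted four times as heavily as weekday or category, as on the slide.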
evaluation context
recap
● added 3rd dimension
result
● better for news:
Collaborative Filtering
● better for content: Text
Similarity
t = time
s = success
c = context
now breathe!
what did we get?
● possibly many recommenders
● know how to measure success
● technology to see success
now breathe!
what is the link to the workshop?
“.. novel, personalization-centric benchmarking
approaches to evaluate adaptive retrieval and
recommender systems”
● Functional: focus on user-centered
utility metrics
● Non-functional: scalability and
reactivity
the ensemble
● realtime evaluation
technology exists
● to choose best
algorithm for
current context we
need to learn
○ multi-armed
bayesian bandit
multi-armed bandit
temporary
success?
No. 1 getting most
local minima?
Interested? Look for Ted Dunning + Bayesian Bandit
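One common reading of "Bayesian bandit" is Thompson sampling with a Beta posterior per arm; whether plista's ensemble works exactly this way is not stated in the talk. The sketch below is a minimal version under that assumption. The class name `BetaBandit` and the simulated click rates in `TRUE_CTR` are hypothetical.

```python
import random

class BetaBandit:
    """Thompson sampling over recommenders with Beta(1, 1) priors."""

    def __init__(self, arms, rng=None):
        self.rng = rng if rng is not None else random.Random()
        # Per arm: [alpha, beta] = [clicks + 1, non-clicks + 1].
        self.stats = {arm: [1, 1] for arm in arms}

    def choose(self):
        """Sample a plausible click rate per arm and pick the highest."""
        samples = {arm: self.rng.betavariate(a, b)
                   for arm, (a, b) in self.stats.items()}
        return max(samples, key=samples.get)

    def update(self, arm, clicked):
        """Record one impression and whether it was clicked."""
        self.stats[arm][0 if clicked else 1] += 1

# Simulated environment with invented click rates per recommender.
TRUE_CTR = {"collaborative_filtering": 0.20,
            "most_popular": 0.05,
            "text_similarity": 0.02}

rng = random.Random(42)
bandit = BetaBandit(TRUE_CTR, rng)
pulls = {arm: 0 for arm in TRUE_CTR}
for _ in range(5000):
    arm = bandit.choose()
    pulls[arm] += 1
    bandit.update(arm, rng.random() < TRUE_CTR[arm])

print(pulls)
```

Because weak arms are still sampled occasionally, a recommender that is only temporarily unlucky keeps a chance to recover, which speaks to the "temporary success?" and "local minima?" questions on the slide.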
the ensemble = better results
● new total / avg is
much better
● thx bandit
● thx ensemble
t = time
s = success
trial and error
● minimal pre-testing
● no risk if a
recommender
crashes
● "bad" code might
find its context
collaboration
● now plista
developers can try
ideas
● and researchers are
allowed to do the
same
big pool of algorithms
Ensemble is able to choose
researcher has idea
.. needs to start the server
... probably hosted by
university, plista or
any cloud provider?
.. api implementation
"message bus"
● event notifications
○ impression
○ click
● error notifications
● item updates
train model from it
[diagram: plista | API | API | research]
{ // json
  "type": "impression",
  "context": {
    "simple": {
      "27": 418, // publisher
      "14": 31721, // widget
      ...
    },
    "lists": {
      "10": [100, 101] // channel
    }
    ...
  }
}
.. package content
api specs hosted at https://sites.google.com/site/newsrec2013/
long term URL to be announced
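A researcher's server would need to decode such messages before training on them. The following is a rough sketch, not the official client: the `raw` string is the slide's example with comments and elisions removed, the `FIELD_NAMES` mapping only covers the three ids annotated on the slide, and `parse_event` is a hypothetical helper.

```python
import json

# Labels for the numeric context ids; only these three appear on the slide.
FIELD_NAMES = {"27": "publisher", "14": "widget", "10": "channel"}

# The slide's example with comments and "..." removed so it is valid JSON.
raw = ('{"type": "impression", "context": {'
       '"simple": {"27": 418, "14": 31721}, '
       '"lists": {"10": [100, 101]}}}')

def parse_event(text):
    """Return the event type and the context with readable keys."""
    message = json.loads(text)
    context = message.get("context", {})
    named = {}
    for section in ("simple", "lists"):
        for key, value in context.get(section, {}).items():
            # Unknown ids keep their numeric key.
            named[FIELD_NAMES.get(key, key)] = value
    return message["type"], named

event_type, named_context = parse_event(raw)
print(event_type, named_context)
```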
[diagram: plista | API | API | research, labelled "Context + Kind"]
.. reply to recommendation requests
{ // json
  "recs": {
    "int": {
      "3": [13010630, 84799192] // 3 refers to content recommendations
    }
    ...
  }
}
generated by researchers
to be shown to real user
api specs hosted at https://sites.google.com/site/newsrec2013/
long term URL to be announced
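Building a reply in the shape shown on the slide could look like the sketch below. Only the fields visible on the slide are produced; the function name `build_reply` is hypothetical, and any further fields the real API expects are not covered here.

```python
import json

def build_reply(item_ids, kind="3"):
    """Serialize recommended item ids; kind "3" is content recommendations."""
    return json.dumps({"recs": {"int": {kind: list(item_ids)}}})

# Item ids taken from the slide's example.
reply = build_reply([13010630, 84799192])
print(reply)
```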
[diagram: recs | API | real user | researcher]
quality is win-win #2
● happy user
● happy researcher
● happy plista
research can profit
● real user feedback
● real benchmark
[diagram: recs | plista | real user | researcher]
quick and fast
● no movies!
● news articles go out of date quickly!
● visitors need the recs NOW
● => handle the data very fast
src: http://en.wikipedia.org/wiki/Flash_(comics)
"send quickly" technologies
● fast web server
● fast network protocol
● fast message queue
● fast storage
or Apache Kafka
"learn quickly" technologies
● use common
frameworks
src: http://en.wikipedia.org/wiki/Pac-Man
comparison to plista
"real-time features feel better in a
real-time world"
we don't need batch! see http://goo.gl/AJntul
our setup
● PHP, it's easy
● Redis, it's fast
● R, it's well known
Overview
Questions?
Torben
http://goo.gl/pvXm5 (Blog)
torben.brodt@plista.com
http://lnkd.in/MUXXuv
xing.com/profile/Torben_Brodt
www.plista.com
News Recommender Challenge
https://sites.google.com/site/newsrec2013/
#sigir2013 #bars2013
@torbenbrodt @plista @BARSws
