Open Recommendation Platform for Researchers and Developers
Living Labs Challenge Workshop
University of Amsterdam
June 6th, 2014
Torben Brodt
plista GmbH
-> http://orp.plista.com
-> http://living-labs.net/llc/
@torbenbrodt

Contents
1. what we built for ourselves
○ recommendation engine
2. how we built it
○ big data math
○ system architecture
3. application for “living labs”
○ for developers, researchers and geeks
Not just opening algorithms to partners,
but opening our platform to algorithms.

What we built for ourselves
Recommendation Engine
where
● news websites
● below the article
● now in NL too!
different types
● content
● advertising
[Diagram labels: Visitors, Publisher]

What we built for ourselves
Recommendation Engine
[Diagram labels: Visitors, Publisher, Request, Engine, Results, Context, Personalized]

What we built for ourselves
Collaborative Filtering
Peter and James have something in common:
they both like football.
Term: User Similarity

What we built for ourselves
Collaborative Filtering
Tennis will be recommended to Peter
because James likes it too.
Item Recommendation from User Similarity

What we built for ourselves
Collaborative Filtering
● more data => more knowledge
● not needed:
○ domain knowledge
○ concrete user
○ concrete article
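
The Peter and James example can be sketched in a few lines of Python. This is only an illustration of user-based collaborative filtering with cosine similarity over sets of liked items; the `likes` data, the extra user "Anna" and the function names are assumptions made for the example, not plista's implementation.

```python
from math import sqrt

# Hypothetical like-data; Peter and James follow the slides, Anna is made up.
likes = {
    "Peter": {"football"},
    "James": {"football", "tennis"},
    "Anna": {"chess"},
}


def user_similarity(a, b):
    """Cosine similarity between the sets of items two users like."""
    if not likes[a] or not likes[b]:
        return 0.0
    shared = len(likes[a] & likes[b])
    return shared / sqrt(len(likes[a]) * len(likes[b]))


def recommend(user, top_n=3):
    """Score unseen items by the similarity of the users who like them."""
    scores = {}
    for other in likes:
        if other == user:
            continue
        sim = user_similarity(user, other)
        if sim == 0.0:
            continue  # users with nothing in common contribute nothing
        for item in likes[other] - likes[user]:
            scores[item] = scores.get(item, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend("Peter"))
```

Note that the code never inspects what "football" or "tennis" mean, which is the point of the "not needed: domain knowledge" bullet.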

What we built for ourselves
More recommenders
Text Similarity
● matches article content with recommendation content
● but which ads to present next to political content?
Most Popular
● premise: what everybody likes is also good for me
etc ...
● e.g. public trends, social likes, wiki data, NLP, matrix factorization
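
As a rough illustration of the text similarity recommender, the sketch below compares bag-of-words vectors with cosine similarity. The slides do not say which text model plista used, so the whitespace tokenizer, the function names and the sample headlines are all assumptions.

```python
from collections import Counter
from math import sqrt


def term_vector(text):
    """Very naive bag of words: lowercase and split on whitespace."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def most_similar(article, candidates):
    """Return the candidate text closest to the article."""
    article_vector = term_vector(article)
    return max(candidates, key=lambda c: cosine(article_vector, term_vector(c)))


# Hypothetical headlines, for illustration only.
print(most_similar(
    "new iphone released today",
    ["apple presents new iphone", "football results from sunday"],
))
```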

What we built for ourselves
good recommendations for...
● User: happy!
● Advertiser: happy!
● Publisher: happy!
● plista: happy!

What we built for ourselves
What are the goals?
high number of...
● clicks
● attention
● orders
● engages/videos
● time on site
● page depth
[Scale graphic: bad to good]

What we built for ourselves
Who wants these goals?
Advertising                Goal
RWE Europe                 500 +1
IBM Germany                500
Intel Austria              500
Recommenders               Goal
collaborative filtering    500 +1
most popular               500
text similarity            500
Content                    Goal
new iphone su...           500 +1
twitter buys p..           500
google has seri.           500
used to A/B test our algorithms

What we built for ourselves
All goals have a context
Ad or Content or Recommender
● user agent > device > mobile
● IP address > geolocation
● referer > origin (search, direct)
● anonymous!

What we built for ourselves
All goals have a context
Questions the context can answer
● Which channel should show this advertising?
● On which publishers do semantic recommendations tend to get clicked?
● Which geolocation is the right one for this content?
Answers to these questions are given by the algorithms

What we built for ourselves
All goals have a context
● answers change each second
● Bayesian bandit approach
[Chart annotations: temporary success? / No. 1 getting most / local minima?]
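
One common way to realise a Bayesian bandit is Thompson sampling with Beta posteriors, sketched below. The slides do not say which variant plista ran, so treat the class, the `simulate` helper and the click rates as assumptions. Because each arm keeps being sampled from its posterior, a recommender that was only temporarily successful loses traffic again once the data turns against it.

```python
import random


class BetaBandit:
    """Thompson sampling over arms with click / no-click feedback."""

    def __init__(self, arms, seed=None):
        self.rng = random.Random(seed)
        self.clicks = {arm: 0 for arm in arms}
        self.misses = {arm: 0 for arm in arms}

    def choose(self):
        # Draw one plausible click rate per arm from its Beta posterior
        # (uniform Beta(1, 1) prior) and play the arm with the highest draw.
        draws = {
            arm: self.rng.betavariate(self.clicks[arm] + 1, self.misses[arm] + 1)
            for arm in self.clicks
        }
        return max(draws, key=draws.get)

    def update(self, arm, clicked):
        if clicked:
            self.clicks[arm] += 1
        else:
            self.misses[arm] += 1


def simulate(true_rates, rounds, seed=0):
    """Run the bandit against made-up click rates and count plays per arm."""
    bandit = BetaBandit(list(true_rates), seed=seed)
    world = random.Random(seed + 1)  # separate source of simulated clicks
    plays = {arm: 0 for arm in true_rates}
    for _ in range(rounds):
        arm = bandit.choose()
        plays[arm] += 1
        bandit.update(arm, world.random() < true_rates[arm])
    return plays


# Hypothetical click rates; the arm names follow the recommenders on the slides.
print(simulate({"collaborative filtering": 0.3, "most popular": 0.05}, rounds=3000))
```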

✓ easy exploration
● minimum pre-testing
● no risk if a recommender crashes
● "bad" code might find its context

How we built it?
numbers in short
● 5k recs per second
● 250 Mbit contextual data
● 100 items per second
quite some scaling issues
● big data math
● message bus

Technology Stack
Message Bus
Events
● new articles
● delivered
● clicks
Subscribers
● algorithms
● payment
● etc
[Diagram label: Visitor]
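
The events and subscribers listed above follow a publish/subscribe pattern. A minimal in-process sketch is shown here; the real message bus is a distributed system whose technology the slides do not name, so the `MessageBus` class, the event payload shape and the handler are assumptions.

```python
class MessageBus:
    """Toy in-process bus: subscribers register per event type."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, event_type, handler):
        self._subscribers.setdefault(event_type, []).append(handler)

    def publish(self, event_type, payload):
        """Deliver the payload to every subscriber; return how many received it."""
        handlers = self._subscribers.get(event_type, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


click_counts = {}


def count_click(event):
    """Example subscriber: an algorithm that counts clicks per item."""
    item = event["item"]
    click_counts[item] = click_counts.get(item, 0) + 1


bus = MessageBus()
bus.subscribe("click", count_click)
bus.publish("click", {"item": "article 1"})
print(click_counts)
```

The publisher does not need to know who listens, so a new algorithm or a payment component can be attached without touching the code that emits events.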

How we built it?
Big Data Math
What math do we need?
number of
● clicks
● orders
● engages
● time on site
● money
[Table residue: Article | 1+1 | 10 ; Article | 100 | 2+5 ; Art...]

How we built it?
Big Data Math
● addition can solve most formulas
● with logarithms, multiplications too
● real-time ready
○ atomic
○ fast
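
The logarithm bullet rests on the identity log(a · b) = log(a) + log(b): a store that only offers atomic increments can still maintain a product by adding logarithms. The `LogProduct` class below is a hypothetical sketch of that idea, not code from the talk.

```python
import math


class LogProduct:
    """Keeps a running product while only ever adding to one stored number."""

    def __init__(self):
        self.log_sum = 0.0  # log(1) = 0, the empty product

    def multiply(self, factor):
        if factor <= 0:
            raise ValueError("logarithms need positive factors")
        self.log_sum += math.log(factor)  # the only write is an addition

    def value(self):
        return math.exp(self.log_sum)


product = LogProduct()
for factor in (2, 3, 4):
    product.multiply(factor)
print(product.value())  # close to 24, up to floating-point rounding
```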

How we built it?
Big Data Math
welt.de_201406
new iphone su...       500 +1
twitter buys p..       400
google has seri...     300
● ZINCRBY (WRITE): "welt.de_201406" "1" "article 1"
● ZUNION (JOIN): "welt.de_201406" "geolocation:NL_201406"
● ZREVRANGEBYSCORE (FETCH)
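
To show what the three sorted-set operations do without requiring a Redis server, here is a small in-memory stand-in. It is not the Redis client API: the `SortedSets` class and its method names are made up for the example, and only the key names come from the slide. The union sums the scores of members that appear in several sets, which is also the default aggregate of Redis' ZUNIONSTORE.

```python
class SortedSets:
    """Tiny in-memory imitation of Redis sorted sets, for illustration only."""

    def __init__(self):
        self.sets = {}

    def zincrby(self, key, increment, member):
        """WRITE: add `increment` to the member's score, creating it if needed."""
        scores = self.sets.setdefault(key, {})
        scores[member] = scores.get(member, 0) + increment
        return scores[member]

    def zunionstore(self, dest, keys):
        """JOIN: store the union of several sets, summing scores per member."""
        merged = {}
        for key in keys:
            for member, score in self.sets.get(key, {}).items():
                merged[member] = merged.get(member, 0) + score
        self.sets[dest] = merged
        return len(merged)

    def top(self, key, limit):
        """FETCH: members with the highest scores first."""
        ranked = sorted(self.sets.get(key, {}).items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:limit]


store = SortedSets()
store.zincrby("welt.de_201406", 500, "article 1")
store.zincrby("welt.de_201406", 1, "article 1")  # one more click
store.zincrby("welt.de_201406", 400, "article 2")
store.zincrby("geolocation:NL_201406", 200, "article 2")
store.zunionstore("tmp", ["welt.de_201406", "geolocation:NL_201406"])
print(store.top("tmp", 2))
```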

Application for Living Labs
Your role in the ORP
● These are your visitors
● This is your data
● Assume this is open!
● This is your challenge
● Message Bus provides YOU with data
[Diagram labels: plista ORP master, YOU!]
● Real-time results are provided by YOU
● ORP master will choose YOU
● User will see YOUR results

Application for Living Labs
YOU, a technology enthusiast
Try latest technologies
● a Mahout implementation exists with Kornakapi
● what will be next? Oryx? MyMediaLite? LensKit? Predict.io?
we have strong open source connections

Application for Living Labs
YOU, a researcher
● try if ideas work
● write papers
● we are at conferences!
○ SIGIR 2013
○ RecSys 2013
○ CLEF 2014
○ … 2015 ?
we have strong university cooperations

Application for Living Labs
YOU, a partner
● plista earns money with recommendations on publishers
● help us -> we help you
● weekly contest with 250 € prizes
http://contest.plista.com
(currently in maintenance)

Application for Living Labs
YOU, a developer
● APIs in PHP and Java exist
● start your own using the API

Application for Living Labs
YOU, a developer
Your server is probably hosted by a university, plista or any cloud provider

Application for Living Labs
YOU, a developer
"message bus"
● event notifications
○ impression
○ click
● error notifications
● item updates
train your model from it

Application for Living Labs
YOU, a developer
{ // json
  "type": "impression",
  "context": {
    "simple": {
      "27": 418, // publisher
      "14": 31721, // widget
      ...
    },
    "lists": {
      "10": [100, 101] // channel
    }
    ...
  }
}
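
A participant's server has to decode such messages before it can train on them. The sketch below parses a cut-down impression and replaces the numeric context keys with readable names. The `CONTEXT_NAMES` table only contains the three ids annotated on the slide, the sample message leaves out everything the slide elides, and the function name is made up; the full list of attributes is in the API specs.

```python
import json

# Only the ids annotated on the slide; the real API defines many more.
CONTEXT_NAMES = {"27": "publisher", "14": "widget", "10": "channel"}


def describe_context(message):
    """Flatten the 'simple' and 'lists' sections into one dict with named keys."""
    context = message.get("context", {})
    named = {}
    for section in ("simple", "lists"):
        for key, value in context.get(section, {}).items():
            named[CONTEXT_NAMES.get(key, key)] = value
    return named


# Strict JSON version of the slide's example (no comments, no elisions).
raw = (
    '{"type": "impression", "context": {'
    '"simple": {"27": 418, "14": 31721}, '
    '"lists": {"10": [100, 101]}}}'
)
message = json.loads(raw)
print(message["type"], describe_context(message))
```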

Application for Living Labs
YOU, a developer
Your response ("recs") is shown to real users
{ // json
  "recs": {
    "int": {
      "3": [13010630, 84799192] // 3 refers to content recommendations
    }
    ...
  }
}
[Diagram labels: YOU, recs, API, Real User]
API specs hosted at http://orp.plista.com
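
Building the response is the mirror image of reading the request. The helper below produces only the part of the structure shown on the slide; any fields hidden behind the slide's "..." are not reproduced, and the function and constant names are assumptions. Check the API specs before relying on this shape.

```python
import json

CONTENT_RECS = "3"  # per the slide, key 3 refers to content recommendations


def build_response(item_ids, limit=None):
    """Serialize recommended item ids in the shape shown on the slide."""
    ids = [int(item_id) for item_id in item_ids]
    if limit is not None:
        ids = ids[:limit]
    return json.dumps({"recs": {"int": {CONTENT_RECS: ids}}})


print(build_response([13010630, 84799192]))
```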

Application for Living Labs
quality is win win
● user, publisher, advertiser, plista
YOU can profit
● real user feedback
● real benchmark with others
[Diagram labels: YOU, recs, Real User]

Application for Living Labs
Overview
● 2012
○ Contest v1
● 2013 October
○ ACM RecSys “News Recommender Challenge”
● 2014 November
○ CLEF News Recommendation Evaluation Labs “newsreel”
Application for Living Labs
Challenge Numbers :)
● during recsys’13:
○ 571,744,114 impressions delivered by researchers
○ 23 registrations => 11 active teams
● news articles of ~13 publishers
● contextual data with ~50 attributes
● cross domain application
Application for Living Labs
Challenge Challenges :(
● what is the benchmark?
○ click per impression?
○ absolute number of clicks?
○ absolute number weighted by time range?
● integration in real application is challenging
○ starting from scratch?
○ having runtime environment?
● papers better match offline data
○ here I can compare against previous work
○ are we working for papers or for passion?
● real users = real privacy issues?
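
The benchmark question in the list above matters because the candidate metrics can rank teams differently. The sketch below compares clicks per impression with the absolute number of clicks on made-up numbers; the team names and counts are purely hypothetical and are not contest results.

```python
def click_through_rate(clicks, impressions):
    """Clicks per impression; 0.0 when nothing was delivered."""
    return clicks / impressions if impressions else 0.0


def best_by_ctr(teams):
    return max(teams, key=lambda team: click_through_rate(*teams[team]))


def best_by_clicks(teams):
    return max(teams, key=lambda team: teams[team][0])


# Hypothetical (clicks, impressions) per team.
teams = {
    "team_a": (120, 10_000),  # few impressions, high click rate
    "team_b": (300, 60_000),  # many impressions, low click rate
}
print(best_by_ctr(teams), best_by_clicks(teams))
```

Here the click rate favours the small, precise team while absolute clicks favour the team that handled more traffic, which is exactly the ambiguity the slide raises.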
Contact
+TorbenBrodt
torben.brodt@plista.com
http://lnkd.in/MUXXuv
xing.com/profile/Torben_Brodt
www.plista.com
Open Recommendation Platform
http://orp.plista.com
@torbenbrodt @plista
questions?

More Related Content

Similar to Living Labs Challenge Workshop

When e-commerce meets Symfony
When e-commerce meets SymfonyWhen e-commerce meets Symfony
When e-commerce meets SymfonyMarc Morera
 
ChatGPT and AI for Web Developers
ChatGPT and AI for Web DevelopersChatGPT and AI for Web Developers
ChatGPT and AI for Web DevelopersMaximiliano Firtman
 
Data Science Stack with MongoDB and RStudio
Data Science Stack with MongoDB and RStudioData Science Stack with MongoDB and RStudio
Data Science Stack with MongoDB and RStudioWinston Chen
 
Creating UI Marketers Won't F*Up
Creating UI Marketers Won't F*UpCreating UI Marketers Won't F*Up
Creating UI Marketers Won't F*UpLOIC BURDET
 
Model-OpenAI-EROLw11-English.pdf
Model-OpenAI-EROLw11-English.pdfModel-OpenAI-EROLw11-English.pdf
Model-OpenAI-EROLw11-English.pdfUGAIA
 
Crawling and Processing the Italian Corporate Web
Crawling and Processing the Italian Corporate WebCrawling and Processing the Italian Corporate Web
Crawling and Processing the Italian Corporate WebSpeck&Tech
 
Engineer as a Leading Role
Engineer as a Leading RoleEngineer as a Leading Role
Engineer as a Leading RoleSATOSHI TAGOMORI
 
Google Assistant Overview
Google Assistant Overview  Google Assistant Overview
Google Assistant Overview AI.academy
 
2020 02 29 TechDay Conf - Getting started with Machine Learning.Net
2020 02 29 TechDay Conf - Getting started with Machine Learning.Net2020 02 29 TechDay Conf - Getting started with Machine Learning.Net
2020 02 29 TechDay Conf - Getting started with Machine Learning.NetBruno Capuano
 
Achieving Technical Excellence in Your Software Teams - from Devternity
Achieving Technical Excellence in Your Software Teams - from Devternity Achieving Technical Excellence in Your Software Teams - from Devternity
Achieving Technical Excellence in Your Software Teams - from Devternity Peter Gfader
 
MongoDB World 2019: MongoDB in Data Science: How to Build a Scalable Product ...
MongoDB World 2019: MongoDB in Data Science: How to Build a Scalable Product ...MongoDB World 2019: MongoDB in Data Science: How to Build a Scalable Product ...
MongoDB World 2019: MongoDB in Data Science: How to Build a Scalable Product ...MongoDB
 
Velocity Conference - What do cats and APIs have in common? They are both awe...
Velocity Conference - What do cats and APIs have in common? They are both awe...Velocity Conference - What do cats and APIs have in common? They are both awe...
Velocity Conference - What do cats and APIs have in common? They are both awe...Stephen Fishman
 
Deconstructing the organic Traffic in the Apple App Store - Hamburg Mobile Su...
Deconstructing the organic Traffic in the Apple App Store - Hamburg Mobile Su...Deconstructing the organic Traffic in the Apple App Store - Hamburg Mobile Su...
Deconstructing the organic Traffic in the Apple App Store - Hamburg Mobile Su...Sven Jürgens
 
ChatGPT and AI for web developers - Maximiliano Firtman
ChatGPT and AI for web developers - Maximiliano FirtmanChatGPT and AI for web developers - Maximiliano Firtman
ChatGPT and AI for web developers - Maximiliano FirtmanWey Wey Web
 
Applied Data Science: Building a Beer Recommender | Data Science MD - Oct 2014
Applied Data Science: Building a Beer Recommender | Data Science MD - Oct 2014Applied Data Science: Building a Beer Recommender | Data Science MD - Oct 2014
Applied Data Science: Building a Beer Recommender | Data Science MD - Oct 2014Austin Ogilvie
 
Adventure in Data: A tour of visualization projects at Twitter
Adventure in Data: A tour of visualization projects at TwitterAdventure in Data: A tour of visualization projects at Twitter
Adventure in Data: A tour of visualization projects at TwitterKrist Wongsuphasawat
 
The Art of the Possible: Machine Learning and WordPress
The Art of the Possible: Machine Learning and WordPressThe Art of the Possible: Machine Learning and WordPress
The Art of the Possible: Machine Learning and WordPressWP Engine
 
ECS2018 - Accelerate success and time to-value for Office 365 with best pract...
ECS2018 - Accelerate success and time to-value for Office 365 with best pract...ECS2018 - Accelerate success and time to-value for Office 365 with best pract...
ECS2018 - Accelerate success and time to-value for Office 365 with best pract...Patrick Guimonet
 
[Guimonet] Accelerate success and time-to-value for Office 365 with best prac...
[Guimonet] Accelerate success and time-to-value for Office 365 with best prac...[Guimonet] Accelerate success and time-to-value for Office 365 with best prac...
[Guimonet] Accelerate success and time-to-value for Office 365 with best prac...European Collaboration Summit
 

Similar to Living Labs Challenge Workshop (20)

When e-commerce meets Symfony
When e-commerce meets SymfonyWhen e-commerce meets Symfony
When e-commerce meets Symfony
 
ChatGPT and AI for Web Developers
ChatGPT and AI for Web DevelopersChatGPT and AI for Web Developers
ChatGPT and AI for Web Developers
 
Data Science Stack with MongoDB and RStudio
Data Science Stack with MongoDB and RStudioData Science Stack with MongoDB and RStudio
Data Science Stack with MongoDB and RStudio
 
Creating UI Marketers Won't F*Up
Creating UI Marketers Won't F*UpCreating UI Marketers Won't F*Up
Creating UI Marketers Won't F*Up
 
Model-OpenAI-EROLw11-English.pdf
Model-OpenAI-EROLw11-English.pdfModel-OpenAI-EROLw11-English.pdf
Model-OpenAI-EROLw11-English.pdf
 
Crawling and Processing the Italian Corporate Web
Crawling and Processing the Italian Corporate WebCrawling and Processing the Italian Corporate Web
Crawling and Processing the Italian Corporate Web
 
Engineer as a Leading Role
Engineer as a Leading RoleEngineer as a Leading Role
Engineer as a Leading Role
 
Google Assistant Overview
Google Assistant Overview  Google Assistant Overview
Google Assistant Overview
 
2020 02 29 TechDay Conf - Getting started with Machine Learning.Net
2020 02 29 TechDay Conf - Getting started with Machine Learning.Net2020 02 29 TechDay Conf - Getting started with Machine Learning.Net
2020 02 29 TechDay Conf - Getting started with Machine Learning.Net
 
Achieving Technical Excellence in Your Software Teams - from Devternity
Achieving Technical Excellence in Your Software Teams - from Devternity Achieving Technical Excellence in Your Software Teams - from Devternity
Achieving Technical Excellence in Your Software Teams - from Devternity
 
MongoDB World 2019: MongoDB in Data Science: How to Build a Scalable Product ...
MongoDB World 2019: MongoDB in Data Science: How to Build a Scalable Product ...MongoDB World 2019: MongoDB in Data Science: How to Build a Scalable Product ...
MongoDB World 2019: MongoDB in Data Science: How to Build a Scalable Product ...
 
Velocity Conference - What do cats and APIs have in common? They are both awe...
Velocity Conference - What do cats and APIs have in common? They are both awe...Velocity Conference - What do cats and APIs have in common? They are both awe...
Velocity Conference - What do cats and APIs have in common? They are both awe...
 
Deconstructing the organic Traffic in the Apple App Store - Hamburg Mobile Su...
Deconstructing the organic Traffic in the Apple App Store - Hamburg Mobile Su...Deconstructing the organic Traffic in the Apple App Store - Hamburg Mobile Su...
Deconstructing the organic Traffic in the Apple App Store - Hamburg Mobile Su...
 
ChatGPT and AI for web developers - Maximiliano Firtman
ChatGPT and AI for web developers - Maximiliano FirtmanChatGPT and AI for web developers - Maximiliano Firtman
ChatGPT and AI for web developers - Maximiliano Firtman
 
Applied Data Science: Building a Beer Recommender | Data Science MD - Oct 2014
Applied Data Science: Building a Beer Recommender | Data Science MD - Oct 2014Applied Data Science: Building a Beer Recommender | Data Science MD - Oct 2014
Applied Data Science: Building a Beer Recommender | Data Science MD - Oct 2014
 
Adventure in Data: A tour of visualization projects at Twitter
Adventure in Data: A tour of visualization projects at TwitterAdventure in Data: A tour of visualization projects at Twitter
Adventure in Data: A tour of visualization projects at Twitter
 
The Art of the Possible: Machine Learning and WordPress
The Art of the Possible: Machine Learning and WordPressThe Art of the Possible: Machine Learning and WordPress
The Art of the Possible: Machine Learning and WordPress
 
ECS2018 - Accelerate success and time to-value for Office 365 with best pract...
ECS2018 - Accelerate success and time to-value for Office 365 with best pract...ECS2018 - Accelerate success and time to-value for Office 365 with best pract...
ECS2018 - Accelerate success and time to-value for Office 365 with best pract...
 
[Guimonet] Accelerate success and time-to-value for Office 365 with best prac...
[Guimonet] Accelerate success and time-to-value for Office 365 with best prac...[Guimonet] Accelerate success and time-to-value for Office 365 with best prac...
[Guimonet] Accelerate success and time-to-value for Office 365 with best prac...
 
Tripletail
TripletailTripletail
Tripletail
 

More from Torben Brodt

Paper the plista dataset
Paper  the plista datasetPaper  the plista dataset
Paper the plista datasetTorben Brodt
 
Algorithmus, Good School, Camp Digital
Algorithmus, Good School, Camp DigitalAlgorithmus, Good School, Camp Digital
Algorithmus, Good School, Camp DigitalTorben Brodt
 
Realtime Recommender with Redis: Hands on
Realtime Recommender with Redis: Hands onRealtime Recommender with Redis: Hands on
Realtime Recommender with Redis: Hands onTorben Brodt
 
Content recommendations
Content recommendationsContent recommendations
Content recommendationsTorben Brodt
 
RecSys2012 inside the plista contest
RecSys2012   inside the plista contestRecSys2012   inside the plista contest
RecSys2012 inside the plista contestTorben Brodt
 
Webhacks am Beispiel PHP + MySQL
Webhacks am Beispiel PHP + MySQLWebhacks am Beispiel PHP + MySQL
Webhacks am Beispiel PHP + MySQLTorben Brodt
 
Google Web Toolkit
Google Web ToolkitGoogle Web Toolkit
Google Web ToolkitTorben Brodt
 
Geld Verdienen Mit Adsense
Geld Verdienen Mit AdsenseGeld Verdienen Mit Adsense
Geld Verdienen Mit AdsenseTorben Brodt
 
Web 2.0 - "Fluch oder Segen"
Web 2.0 - "Fluch oder Segen"Web 2.0 - "Fluch oder Segen"
Web 2.0 - "Fluch oder Segen"Torben Brodt
 

More from Torben Brodt (12)

Paper the plista dataset
Paper  the plista datasetPaper  the plista dataset
Paper the plista dataset
 
Nrs2013 recap
Nrs2013 recapNrs2013 recap
Nrs2013 recap
 
Algorithmus, Good School, Camp Digital
Algorithmus, Good School, Camp DigitalAlgorithmus, Good School, Camp Digital
Algorithmus, Good School, Camp Digital
 
Realtime Recommender with Redis: Hands on
Realtime Recommender with Redis: Hands onRealtime Recommender with Redis: Hands on
Realtime Recommender with Redis: Hands on
 
Content recommendations
Content recommendationsContent recommendations
Content recommendations
 
RecSys2012 inside the plista contest
RecSys2012   inside the plista contestRecSys2012   inside the plista contest
RecSys2012 inside the plista contest
 
Webhacks am Beispiel PHP + MySQL
Webhacks am Beispiel PHP + MySQLWebhacks am Beispiel PHP + MySQL
Webhacks am Beispiel PHP + MySQL
 
GIT / SVN
GIT / SVNGIT / SVN
GIT / SVN
 
Google Web Toolkit
Google Web ToolkitGoogle Web Toolkit
Google Web Toolkit
 
Geld Verdienen Mit Adsense
Geld Verdienen Mit AdsenseGeld Verdienen Mit Adsense
Geld Verdienen Mit Adsense
 
AJAX
AJAXAJAX
AJAX
 
Web 2.0 - "Fluch oder Segen"
Web 2.0 - "Fluch oder Segen"Web 2.0 - "Fluch oder Segen"
Web 2.0 - "Fluch oder Segen"
 

Recently uploaded

Introduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptxIntroduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptxFIDO Alliance
 
CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)Wonjun Hwang
 
Oauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoftOauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoftshyamraj55
 
ADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxFIDO Alliance
 
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...FIDO Alliance
 
State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!Memoori
 
Design Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxDesign Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxFIDO Alliance
 
How to Check GPS Location with a Live Tracker in Pakistan
How to Check GPS Location with a Live Tracker in PakistanHow to Check GPS Location with a Live Tracker in Pakistan
How to Check GPS Location with a Live Tracker in Pakistandanishmna97
 
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptxCyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptxMasterG
 
Google I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGoogle I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGDSC PJATK
 
Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdf
Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdfFrisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdf
Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdfAnubhavMangla3
 
Event-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingEvent-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingScyllaDB
 
ERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctBrainSell Technologies
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceSamy Fodil
 
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)Paige Cruz
 
Working together SRE & Platform Engineering
Working together SRE & Platform EngineeringWorking together SRE & Platform Engineering
Working together SRE & Platform EngineeringMarcus Vechiato
 
How we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfHow we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfSrushith Repakula
 
Generative AI Use Cases and Applications.pdf
Generative AI Use Cases and Applications.pdfGenerative AI Use Cases and Applications.pdf
Generative AI Use Cases and Applications.pdfalexjohnson7307
 
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...ScyllaDB
 

Recently uploaded (20)

Introduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptxIntroduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptx
 
CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)
 
Oauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoftOauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoft
 
Overview of Hyperledger Foundation
Overview of Hyperledger FoundationOverview of Hyperledger Foundation
Overview of Hyperledger Foundation
 
ADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptx
 
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
 
State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!
 
Design Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxDesign Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptx
 
How to Check GPS Location with a Live Tracker in Pakistan
How to Check GPS Location with a Live Tracker in PakistanHow to Check GPS Location with a Live Tracker in Pakistan
How to Check GPS Location with a Live Tracker in Pakistan
 
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptxCyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
 
Google I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGoogle I/O Extended 2024 Warsaw
Google I/O Extended 2024 Warsaw
 
Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdf
Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdfFrisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdf
Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdf
 
Event-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingEvent-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream Processing
 
ERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage IntacctERP Contender Series: Acumatica vs. Sage Intacct
ERP Contender Series: Acumatica vs. Sage Intacct
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM Performance
 
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
 
Working together SRE & Platform Engineering
Working together SRE & Platform EngineeringWorking together SRE & Platform Engineering
Working together SRE & Platform Engineering
 
How we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfHow we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdf
 
Generative AI Use Cases and Applications.pdf
Generative AI Use Cases and Applications.pdfGenerative AI Use Cases and Applications.pdf
Generative AI Use Cases and Applications.pdf
 
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
 

Living Labs Challenge Workshop

  • 1. Open Recommendation Platform For Researchers and Developers Living Labs Challenge Workshop University of Amsterdam June 6th, 2014 Torben Brodt plista GmbH -> http://orp.plista.com -> http://living-labs.net/llc/ @torbenbrodt
  • 2. 1. what we built for ourselves ○ recommendation engine 2. how we built it ○ big data math ○ system architecture 3. application for “living labs” ○ for developers, researchers and geeks Contents @torbenbrodt Not just opening algorithms to partners, But opening our platform to algorithms.
  • 3. where ● news websites ● below the article ● now in NL too! different types ● content ● advertising What we built for ourselves Recommendation Engine Visitors Publisher # @torbenbrodt
  • 4. What we built for ourselves Recommendation Engine Visitors Publisher Results Request Engine @torbenbrodt Context Personalized II
  • 5. What we built for ourselves Collaborative Filtering Peter James Peter and James have sth in common. They both like football Term: User Similarity @torbenbrodt
  • 6. What we built for ourselves Collaborative Filtering Peter James Tennis will be recommendation for Peter, because James likes it too. Item Recommendation from User Similarity @torbenbrodt
  • 7. ● more data => more knowledge ● not needed: ○ domain knowledge ○ concrete user ○ concrete article What we built for ourselves Collaborative Filtering @torbenbrodt
  • 8. Text Similarity What we built for ourselves More recommenders ● article content matching recommendation content ● but which ads to present to political content? Most Popular etc ... ● premise: what everybody likes is also good to me ● e.g. public trends, social likes, wiki data @torbenbrodt
  • 9. Text Similarity What we built for ourselves More recommenders ● article content matching recommendation content ● but which ads to present to political content? Most Popular etc ... ● premise: what everybody likes is also good to me ● e.g. public trends, social likes, wiki data @torbenbrodt
  • 10. Text Similarity What we built for ourselves More recommenders ● article content matching recommendation content ● but which ads to present to political content? Most Popular etc ... ● premise: what everybody likes is also good to me ● e.g. public trends, social likes, wiki data, NLP, Matrix Fac. @torbenbrodt
  • 11. What we built for ourselves good recommendations for... User happy! Advertiser happy! Publisher happy! plista happy! @torbenbrodt
  • 12. What we built for ourselves What are the goals? high number of... ● clicks ● attention ● orders ● engages/videos ● time on site ● page depth bad good @torbenbrodt
  • 13. What we built for ourselves Who wants this goals? Advertising Goal RWE Europe 500 +1 IBM Germany 500 Intel Austria 500 Recommenders Goal collaborative filtering 500 +1 most popular 500 text similarity 500 Content Goal new iphone su... 500 +1 twitter buys p.. 500 google has seri. 500 @torbenbrodt
  • 14. What we built for ourselves Who wants this goals? Advertising Goal RWE Europe 500 +1 IBM Germany 500 Intel Austria 500 Recommenders Goal collaborative filtering 500 +1 most popular 500 text similarity 500 Content Goal new iphone su... 500 +1 twitter buys p.. 500 google has seri. 500 used to A/B test our algorithms @torbenbrodt
  • 16. What we built for ourselves All goals have a context Ad or Content or Recommender ... ... ... ● user agent > device > mobile ● IP address > geolocation ● referer > origin (search, direct) ● anonymous! @torbenbrodt
  • 17. What we built for ourselves All goals have a context Questions the context can answer: ● Which channel to show the Advertising in? ● Which publishers tend to click on Semantic Recommendations? ● Which geolocation is the right one for this Content? Answers to these are given by the algorithms @torbenbrodt
  • 18. What we built for ourselves All goals have a context ● answers change each second ● bayesian bandit approach temporary success? No. 1 getting most local minima? @torbenbrodt
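One common reading of "bayesian bandit" is Thompson sampling with Beta posteriors. The slides do not say which variant plista uses, so the sketch below is an assumption: the `BetaBandit` class, the `simulate` helper, and the click rates passed to it are hypothetical.

```python
import random

class BetaBandit:
    """Thompson sampling: one Beta(alpha, beta) posterior per recommender."""

    def __init__(self, arms, seed=None):
        self.rng = random.Random(seed)
        # [alpha, beta] starts at [1, 1], a uniform prior over the click rate.
        self.stats = {arm: [1, 1] for arm in arms}

    def choose(self):
        # Draw one plausible click rate per arm and pick the highest draw.
        samples = {arm: self.rng.betavariate(a, b)
                   for arm, (a, b) in self.stats.items()}
        return max(samples, key=samples.get)

    def update(self, arm, clicked):
        # A click raises alpha, a non-click raises beta.
        self.stats[arm][0 if clicked else 1] += 1

def simulate(true_ctr, rounds, seed=0):
    """Run the bandit against made-up click rates and count pulls per arm."""
    bandit = BetaBandit(true_ctr, seed=seed)
    environment = random.Random(seed + 1)
    pulls = {arm: 0 for arm in true_ctr}
    for _ in range(rounds):
        arm = bandit.choose()
        pulls[arm] += 1
        bandit.update(arm, environment.random() < true_ctr[arm])
    return pulls
```

Because every arm keeps a nonzero chance of being sampled, a recommender that is temporarily ahead does not lock out the others, which addresses the "temporary success" and "local minima" questions on the slide.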
  • 19. ✓ easy exploration ● minimum pre-testing ● no risk if recommender crashes ● "bad" code might find its context
  • 20. numbers in short ● 5k recs per second ● 250 Mbit contextual data ● 100 items per second quite some scaling issues ● big data math ● message bus How we built it? # @torbenbrodt
  • 21. Events Technology Stack Message Bus Subscribers ● algorithms ● payment ● etc Visitor ● new articles ● delivered ● clicks @torbenbrodt
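The publish/subscribe idea behind the message bus can be sketched in-process. This is a toy model under stated assumptions, not the plista stack: the `MessageBus` class and the handler names are hypothetical, and a production bus would run across machines.

```python
from collections import defaultdict

class MessageBus:
    """Minimal in-process publish/subscribe dispatcher."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self.subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        # Deliver the event to every subscriber of this type.
        handlers = self.subscribers[event_type]
        for handler in handlers:
            handler(payload)
        return len(handlers)

bus = MessageBus()
clicks_seen = []      # stands in for an algorithm that learns from clicks
billable_clicks = []  # stands in for the payment subscriber

bus.subscribe("click", clicks_seen.append)
bus.subscribe("click", billable_clicks.append)
delivered_to = bus.publish("click", {"item": 13010630})
print(delivered_to)
```

The visitor side only publishes events; algorithms and payment each react on their own, so adding a new subscriber does not touch the publisher.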
  • 22. How we built it? Big Data Math Article 1+1 10 Article 100 2+5 Art... @torbenbrodt number of ● clicks ● orders ● engages ● time on site ● money What math do we need?
  • 23. ● Addition can solve most formulas ● with Logarithm also multiplications ● Real-Time Ready ○ atomic ○ fast How we built it? Big Data Math @torbenbrodt
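The remark that logarithms turn multiplications into additions rests on the identity log(a·b) = log(a) + log(b). A minimal sketch, assuming strictly positive factors; the function name `product_via_logs` is made up for the example.

```python
import math

def product_via_logs(factors):
    """Multiply positive numbers using only additions in log space."""
    log_sum = sum(math.log(f) for f in factors)  # one addition per factor
    return math.exp(log_sum)

print(product_via_logs([0.5, 0.2, 4.0]))
```

If a counter stores the running sum of logs, each new factor becomes a single increment, which fits the atomic and fast requirement named on the slide.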
  • 24. How we built it? Big Data Math welt.de_201406 new iphone su... 500 +1 twitter buys p.. 400 google has seri... 300 ZINCRBY (WRITE) "welt.de_201406" "article 1" "1" ZUNION (JOIN) "welt.de_201406" "geolocation:NL_201406" ZREVRANGEBYSCORE (FETCH) @torbenbrodt
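The write, join, and fetch steps can be modelled without a Redis server. The class below is an in-memory stand-in written for illustration; it is not the Redis client API. The key names come from the slide, the scores in the geolocation set are invented, summing scores on union mirrors the default aggregation of Redis sorted-set unions, and the alphabetical tie-break in `top` is an assumption of this sketch.

```python
from collections import defaultdict

class SortedSets:
    """In-memory stand-in for a few sorted-set operations."""

    def __init__(self):
        self.sets = defaultdict(lambda: defaultdict(float))

    def incr(self, key, member, amount=1.0):
        # WRITE: add to a member's score, like ZINCRBY.
        self.sets[key][member] += amount
        return self.sets[key][member]

    def union(self, dest, keys):
        # JOIN: merge several sets, summing the scores of shared members.
        merged = defaultdict(float)
        for key in keys:
            for member, score in self.sets[key].items():
                merged[member] += score
        self.sets[dest] = merged
        return len(merged)

    def top(self, key, count):
        # FETCH: highest scores first (ties broken alphabetically here).
        ranked = sorted(self.sets[key].items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:count]

store = SortedSets()
store.incr("welt.de_201406", "article 1", 500)
store.incr("welt.de_201406", "article 2", 400)
store.incr("welt.de_201406", "article 1", 1)            # one more click
store.incr("geolocation:NL_201406", "article 2", 300)   # hypothetical score
store.union("welt.de_NL", ["welt.de_201406", "geolocation:NL_201406"])
print(store.top("welt.de_NL", 2))
```

Every operation is an addition or a sort over stored sums, which is why the preceding slide can claim that addition covers most of the math.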
  • 25. Application for Living Labs # ● These are your visitors @torbenbrodt ● This is your data ● Assume this is open! ● This is your challenge
  • 26. ● Message Bus provides YOU with data Application for Living Labs Your role in the ORP @torbenbrodt plista ORP master YOU! ● Real-Time Results are provided by YOU ● ORP master will choose YOU ● User will see YOUR results
  • 27. Try latest technologies Application for Living Labs YOU, a technology enthusiast ● Mahout implementation exists with Kornakapi ● what will be next? Oryx? MyMediaLite? LensKit? Predict.io? we have strong open source connections @torbenbrodt
  • 28. ● try if ideas work ● write papers ● we are at conferences! ○ SIGIR 2013 ○ RecSys 2013 ○ CLEF 2014 ○ … 2015 ? we have strong university cooperations Application for Living Labs YOU, a researcher @torbenbrodt
  • 29. ● plista earns money with recommendations on publishers ● help us -> we help you ● weekly contest with 250 € prizes http://contest.plista.com (currently in maintenance) Application for Living Labs YOU, a partner @torbenbrodt
  • 30. Application for Living Labs YOU, a developer ● APIs in PHP and Java exist ● start your own using the API @torbenbrodt
  • 31. Your server is probably hosted by a university, plista or any cloud provider Application for Living Labs YOU, a developer @torbenbrodt
  • 32. "message bus" ● event notifications ○ impression ○ click ● error notifications ● item updates train model from it Application for Living Labs YOU, a developer @torbenbrodt
  • 33. { // json "type": "impression", "context": { "simple": { "27": 418, // publisher "14": 31721, // widget ... }, "lists": { "10": [100, 101] // channel } ... } Application for Living Labs YOU, a developer @torbenbrodt
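A participant's server has to turn the numeric context ids into something readable. The sketch below covers only the three ids annotated on the slide (27 publisher, 14 widget, 10 channel); the function name, the lookup tables, and the sample message without the elided fields are assumptions, and the real format is defined by the API specs.

```python
import json

# Only the ids annotated on the slide; the full attribute list is not shown there.
SIMPLE_ATTRIBUTES = {"27": "publisher", "14": "widget"}
LIST_ATTRIBUTES = {"10": "channel"}

def parse_impression(raw):
    """Extract the known context attributes from one message-bus event."""
    message = json.loads(raw)
    context = message.get("context", {})
    named = {"type": message.get("type")}
    for attr_id, name in SIMPLE_ATTRIBUTES.items():
        if attr_id in context.get("simple", {}):
            named[name] = context["simple"][attr_id]
    for attr_id, name in LIST_ATTRIBUTES.items():
        if attr_id in context.get("lists", {}):
            named[name] = context["lists"][attr_id]
    return named

# Sample built from the slide, with comments and elided parts left out.
sample = ('{"type": "impression", "context": {"simple": {"27": 418, "14": 31721},'
          ' "lists": {"10": [100, 101]}}}')
print(parse_impression(sample))
```

Attributes missing from a message are skipped rather than treated as errors, since the slide shows that the context contains more fields than the ones listed.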
  • 34. recs Your response shown to real users { // json "recs": { "int": { "3": [13010630, 84799192] // 3 refers to content recommendations } ... } API Real User YOU Application for Living Labs YOU, a developer api specs hosted at http://orp.plista.com @torbenbrodt
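Building the reply is the mirror image of parsing the request. This sketch emits only the fields visible on the slide; the function name `build_response` and its `limit` parameter are hypothetical, and any further fields required by the API specs are left out because the slide elides them.

```python
import json

CONTENT_RECOMMENDATIONS = "3"  # per the slide, 3 refers to content recommendations

def build_response(item_ids, limit):
    """Serialize up to `limit` recommended item ids in the shape shown on the slide."""
    chosen = [int(item_id) for item_id in item_ids][:limit]
    return json.dumps({"recs": {"int": {CONTENT_RECOMMENDATIONS: chosen}}})

print(build_response([13010630, 84799192, 555], limit=2))
```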
  • 35. recs Real User YOU ● user, publisher, advertiser, plista YOU can profit ● real user feedback ● real benchmark with others Application for Living Labs quality is win win @torbenbrodt
  • 36. ● 2012 ○ Contest v1 ● 2013 October ○ ACM RecSys “News Recommender Challenge” ● 2014 November ○ CLEF News Recommendation Evaluation Labs “newsreel” Application for Living Labs Overview @torbenbrodt
  • 37. Application for Living Labs Challenge Numbers :) ● during recsys’13: ○ 571,744,114 impressions delivered by researchers ○ 23 registrations => 11 active teams ● news articles of ~13 publishers ● contextual data with ~50 attributes ● cross domain application
  • 38. Application for Living Labs Challenge Challenges :( ● what is the benchmark ○ click per impression? ○ absolute number of clicks? ○ absolute number weighted by time range? ● integration in real application is challenging ○ starting from scratch? ○ having runtime environment? ● papers better match offline data ○ here I can compare against previous work ○ are we working for papers or for passion? ● real users = real privacy issues?
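The benchmark question matters because the candidate metrics can rank teams differently. The numbers below are invented purely to show the effect and are not challenge results; the team names and the helper `click_through_rate` are hypothetical.

```python
def click_through_rate(clicks, impressions):
    """Clicks per impression; 0.0 when nothing was delivered."""
    return clicks / impressions if impressions else 0.0

# Made-up (clicks, impressions) pairs for two imaginary teams.
teams = {
    "team_a": (120, 10_000),  # few impressions, high click rate
    "team_b": (300, 60_000),  # many impressions, low click rate
}

by_ctr = sorted(teams, key=lambda t: click_through_rate(*teams[t]), reverse=True)
by_clicks = sorted(teams, key=lambda t: teams[t][0], reverse=True)

print(by_ctr)
print(by_clicks)
```

Click rate rewards precision on whatever traffic a team receives, while absolute clicks also reward uptime and the share of requests a team manages to answer.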