Build Your Own Recommendation Engine

Build Your Own Recommendation Engine (during a Weekend)
Michal Malohlava
@mmalohlava && @h2oai

MusicService
Clients: iOS/Android/…
Activities: clicks/swipes/likes
Next N recommendations
? REBB* ?
*REBB = Recommendation Engine Black Box

Requirements
• Activities can be >100/s
• REBB should be accessible via REST API
• Recommendations need to be served in <500 ms and should keep users exploring
• AWS infrastructure
• Need to be ready in 2 days!

Requirements
• Recommendations should be served in <500 ms
• ML part should allow quick prototyping & experimentation
• Storage (online/offline): user stats, histories, recommendations
• Scalable
  • frontend receiving requests
  • backend solving ML
  • storage
• Need to be ready in 2 days!

Engine Architecture
Variation of λ-architecture with pluggable ML backend

Engine Architecture
Regular EC2 nodes

API Router
REST API via Spray
Akka Actor accepting and filtering:
• user activities
• recommendation requests
Scalable via HAProxy

API Router
Akka Actor handles POST of user activity:
• publish activity to Redis
• update stats in Redis (quick updates)
• trigger recommendation computation
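
The three POST steps above can be sketched in a few lines. This is a minimal Python illustration, not the Spray/Akka code from the talk: `InMemoryStore` is a hypothetical stand-in for Redis, and the channel names (`activities`, `recommendation-requests`), the `stats:<user>` key layout, and the 202 return value are all assumptions made for the example.

```python
import json
from collections import defaultdict


class InMemoryStore:
    """Hypothetical stand-in for Redis: records published messages and counters."""

    def __init__(self):
        self.published = []  # list of (channel, message) pairs
        self.stats = defaultdict(lambda: defaultdict(int))

    def publish(self, channel, message):
        self.published.append((channel, message))

    def increment(self, key, field, amount=1):
        self.stats[key][field] += amount
        return self.stats[key][field]


def handle_activity(store, activity):
    """Process one posted user activity in the order listed on the slide."""
    user = activity["user"]
    # 1. publish the activity so subscribers (ML runners) can see it
    store.publish("activities", json.dumps(activity, sort_keys=True))
    # 2. quick per-user stats update, e.g. number of likes or swipes
    store.increment(f"stats:{user}", activity["type"])
    # 3. ask the backend to compute a fresh recommendation for this user
    store.publish("recommendation-requests", user)
    return 202  # assumed response code: accepted, computation continues elsewhere
```

Keeping the handler to two publishes and one counter update is what would let the router answer quickly while the expensive work happens in the backend.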

API Router
Akka Actor handles GET recommendation request:
• fetch pre-computed recommendation from Redis if it exists
• OR make a best-effort attempt to provide a “cold-start” recommendation based on the history of user activities
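
A minimal Python sketch of this lookup-then-fallback logic follows. The slides do not say how the cold-start recommendation is built, so the rule used here (rank artists by how often the user liked them, then suggest unseen songs by those artists) is purely an assumption, as are the function names and the plain dictionaries standing in for Redis.

```python
from collections import Counter


def cold_start(history, catalog, n):
    """Assumed fallback: unseen songs by the user's most-liked artists."""
    liked_artists = Counter(artist for artist, _ in history)
    seen = {song for _, song in history}
    picks = []
    for artist, _ in liked_artists.most_common():
        for song in catalog.get(artist, []):
            if song not in seen and song not in picks:
                picks.append(song)
                if len(picks) == n:
                    return picks
    return picks


def recommend(user, precomputed, histories, catalog, n=3):
    """Serve a pre-computed recommendation if present, else fall back."""
    cached = precomputed.get(user)
    if cached:
        return cached[:n]
    return cold_start(histories.get(user, []), catalog, n)
```

Because the fallback only reads data that is already in memory, it stays cheap enough to fit a tight response budget even when nothing has been pre-computed.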

Redis Store
Redis is used as
• events bus:
  • inform subscribers about user activities
  • requests to provide new recommendation for user
• data storage:
  • old/new recommendations
  • statistics (likes/swipes per user)
  • simple persistence model
• computation engine:
  • keep top-N artists, top-N songs per user
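
The "computation engine" role can be illustrated with a small in-memory model of per-user rankings. In Redis this would map naturally onto sorted sets (ZINCRBY to bump a score and a reverse range query to read the top N), but the slides do not say which data structure was used, so treat that mapping as an assumption. The class and method names below are hypothetical, and ties are broken alphabetically only to keep the sketch deterministic.

```python
from collections import defaultdict


class TopNTracker:
    """In-memory model of per-user score boards (artists, songs, ...)."""

    def __init__(self):
        # scores[(user, kind)][item] -> accumulated score
        self.scores = defaultdict(lambda: defaultdict(float))

    def bump(self, user, kind, item, amount=1.0):
        """Increase an item's score, e.g. after a like."""
        self.scores[(user, kind)][item] += amount
        return self.scores[(user, kind)][item]

    def top(self, user, kind, n):
        """Return the n highest-scoring items, best first."""
        board = self.scores[(user, kind)]
        ranked = sorted(board.items(), key=lambda kv: (-kv[1], kv[0]))
        return [item for item, _ in ranked[:n]]
```

Updating the ranking at write time means a read is just a slice of an already ordered structure, which suits the quick-update use described above.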

ML Backend
Language/technology agnostic
• Needs to be flexible enough to prototype different strategies
“Runners” for
• generating recommendations with H2O and Python
• collecting/generating statistics
• clustering users with H2O JVM
“Runners” are subscribed to Redis and process Redis data
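
A runner, as described above, is essentially a loop that consumes requests and writes results back. The sketch below shows that shape with a pluggable strategy function. It uses a standard-library queue and a dictionary as stand-ins for the Redis subscription and the Redis store; the `recs:<user>` key layout and the function name are assumptions, not details from the talk.

```python
import queue


def run_pending(requests, strategy, store):
    """Drain pending recommendation requests and store one result per user.

    requests: queue of user ids (stand-in for a Redis subscription)
    strategy: any callable mapping a user id to a list of recommendations
    store:    dict standing in for the Redis data store
    """
    handled = 0
    while True:
        try:
            user = requests.get_nowait()
        except queue.Empty:
            return handled
        store[f"recs:{user}"] = strategy(user)
        handled += 1
```

Since the strategy is just a callable, swapping in a different model (or a runner written in another language against the same keys) would not require touching the router.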

ML Backend
Final strategy
• identify user cluster based on user activities (aka music styles)
• apply different recommendation strategies inside each cluster
• identify “weird” users (~outliers)
  • adapt recommendations for them
  • needs manual intervention/algorithm tuning
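
The dispatch step of this strategy can be sketched as follows, assuming cluster centroids have already been computed elsewhere (the talk used H2O for clustering; no H2O calls appear here). Treating a user as an outlier when the distance to the nearest centroid exceeds a fixed threshold is an assumption made for illustration, and all names and the threshold value are hypothetical.

```python
import math


def nearest_cluster(vector, centroids):
    """Return (index, distance) of the centroid closest to the user vector."""
    distances = [math.dist(vector, c) for c in centroids]
    best = min(range(len(centroids)), key=distances.__getitem__)
    return best, distances[best]


def pick_strategy(vector, centroids, strategies, fallback, outlier_distance):
    """Choose the per-cluster strategy, or the fallback for outliers."""
    cluster, distance = nearest_cluster(vector, centroids)
    if distance > outlier_distance:
        return fallback  # "weird" user: handled by a separately tuned strategy
    return strategies[cluster]
```

For example, with centroids `[(1.0, 0.0), (0.0, 1.0)]` representing two music styles, a user vector of `(0.9, 0.1)` would be routed to the first cluster's strategy, while `(5.0, 5.0)` would fall through to the fallback for any threshold below its distance to both centroids.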

Results
• Single machine for API Router and Redis
  • peaks of 50 activities/sec, avg 10 activities/sec
  • small memory footprint
• ML Runners spread over EC2 machines
• even simple but different strategies for each user sector and for selected individual users provide surprisingly good results

Learn more at h2o.ai
Follow us at @h2oai
Thank you!
