Velox: Models in Action

Presented by Dan Crankshaw
crankshaw@cs.berkeley.edu
Henry Milner, Joseph Gonzalez, Peter Bailis, Haoyuan Li, Tomer Kaftan,
Zhao Zhang, Ali Ghodsi, Michael Franklin, Michael Jordan, and Ion Stoica
https://amplab.cs.berkeley.edu/projects/velox/

MODELS AT REST
Data
Well Studied
Train
Observe
Predictions Model Predict

Data
Training
Feedback
Open Challenges
Predictions Model Serving

Velox Model Management System
Data
Training
Feedback
Open Challenges

Catify: Music for Cats
Apache Web Server
Node.js App Server
MongoDB

MODELING TASK
Ratings
Songs
Prediction

Data
Training
Feedback

Tachyon + HDFS
Pipeline
CatID   Song   Score
1       16     2.1
1       14     3.7
3       273    4.2
4       14     1.9

Pipeline
Tachyon + HDFS
Apache Web Server
Node.js App Server
MongoDB

Tachyon + HDFS
NGINX
Node.js App Server
MongoDB
Materialize all predictions
Pipeline

Songs
Users
O(users + songs)

Songs
Users
O(users * songs)
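The two slides above contrast storing per-user and per-song factors, O(users + songs), with materializing every prediction, O(users * songs). A rough sketch of that arithmetic follows. The latent dimension `d` and the catalog sizes are hypothetical values chosen only for illustration; none of them come from the talk.

```python
def factor_storage(num_users: int, num_songs: int, d: int) -> int:
    """Floats needed to keep one d-dimensional vector per user and per song."""
    return d * (num_users + num_songs)


def materialized_storage(num_users: int, num_songs: int) -> int:
    """Floats needed to keep one precomputed score per (user, song) pair."""
    return num_users * num_songs


# Hypothetical sizes, not taken from the slides.
users, songs, d = 1_000_000, 100_000, 50
factors = factor_storage(users, songs, d)
materialized = materialized_storage(users, songs)
ratio = materialized / factors  # how many times larger the materialized table is
```

Under these assumed sizes the materialized table is roughly three orders of magnitude larger than the factors, which is the "space inefficient" problem the talk goes on to name.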

Pipeline
Tachyon + HDFS
NGINX
Node.js App Server
MongoDB
Training Data
New Model

What’s wrong?
1. Built from scratch for each application
2. Different systems
3. Space inefficient
4. Stale predictions
5. The T-Swift effect (sample bias)

The Missing Piece
Pipeline
Tachyon + HDFS
Velox
Prediction Service
Model Manager
Web Application

BENEFITS
1. Low-latency and scalable predictions as a service
2. Integrated approach leads to fresher, better predictions
3. Easy translation to production predictions
4. Eases operational pain

PERSONALIZED MODELING
Rating = w_u · f(x; θ)
Personalized User Model (w_u): highly dynamic
Shared Basis Feature Models (f(x; θ)): change slowly

VELOX
Pipeline
Tachyon + HDFS
Velox
Prediction Service
Model Manager
Web Application
Predictions as a service

PREDICTION API
GET /velox/catify/predict?userid=22&song=27632
GET /velox/catify/predict_top_k?userid=22&k=100
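A client could assemble the two request paths shown above with the standard library. This is only a sketch: the helper names `predict_url` and `predict_top_k_url` are hypothetical, and no request is actually sent, since the slides give the paths but no host.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE = "/velox/catify"  # path prefix as shown on the slide


def predict_url(userid: int, song: int) -> str:
    """Path and query string for a single (user, song) prediction."""
    return f"{BASE}/predict?{urlencode({'userid': userid, 'song': song})}"


def predict_top_k_url(userid: int, k: int) -> str:
    """Path and query string for a user's top-k predictions."""
    return f"{BASE}/predict_top_k?{urlencode({'userid': userid, 'k': k})}"


url = predict_url(22, 27632)
top_k = predict_top_k_url(22, 100)
params = parse_qs(urlsplit(url).query)  # round-trip check of the query string
```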

PREDICTIONS
def predict(u: UUID, x: Context)
w_u · f(x; θ)
Look up user weight (w_u)
Compute features (f(x; θ))
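The two steps on the slide (look up the user weight, compute the features) can be sketched as a dot product. The dictionaries below are hypothetical in-memory stand-ins for wherever Velox keeps user weights and feature data, plain integer ids replace the slide's UUID and Context types, and the feature function is reduced to a table lookup for brevity.

```python
from typing import Dict, List

# Hypothetical stand-ins for the user-weight store and the feature model.
user_weights: Dict[int, List[float]] = {22: [0.5, 1.0, -0.25]}
song_features: Dict[int, List[float]] = {27632: [2.0, 1.0, 4.0]}


def features(x: int) -> List[float]:
    """f(x; θ): a table lookup here; in general this could be any feature model."""
    return song_features[x]


def predict(u: int, x: int) -> float:
    w_u = user_weights[u]  # step 1: look up user weight
    f_x = features(x)      # step 2: compute features
    return sum(w * f for w, f in zip(w_u, f_x))  # w_u · f(x; θ)


score = predict(22, 27632)
```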

LOW-LATENCY PREDICTIONS
Partition 0: Velox, Tachyon
Partition 1: Velox, Tachyon
Partition 2: Velox, Tachyon
Partition users
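One simple way to partition users across the three partitions in the diagram is to hash the user id. The slide does not say which scheme Velox uses, so the modulo rule below is an assumption made purely for illustration, as are the function names.

```python
NUM_PARTITIONS = 3  # the slide shows partitions 0, 1 and 2


def partition_for(userid: int, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a user id to a partition (modulo hashing is an assumed scheme)."""
    return userid % num_partitions


def route(userids):
    """Group user ids by the partition that would hold their weight vectors."""
    groups = {p: [] for p in range(NUM_PARTITIONS)}
    for uid in userids:
        groups[partition_for(uid)].append(uid)
    return groups


assignment = route([22, 23, 24, 25])
```

With a fixed mapping like this, every request for a given user lands on the partition that already holds that user's weights, so the lookup stays local.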

Velox
Tachyon
Feature Cache
Features shared between users
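Because features depend on the item and not on the user, a cache keyed by item can serve every user. The sketch below uses `functools.lru_cache` as a stand-in for the feature cache; the feature function itself is a made-up placeholder, and the call counter exists only to show the sharing.

```python
from functools import lru_cache

calls = {"count": 0}  # counts how often the expensive function really runs


@lru_cache(maxsize=1024)
def song_features(song: int) -> tuple:
    """Hypothetical stand-in for an expensive feature computation f(x; θ)."""
    calls["count"] += 1
    return (float(song % 7), float(song % 11))


def predict(w_u, song: int) -> float:
    """w_u · f(x; θ), with the features served from the shared cache."""
    return sum(w * f for w, f in zip(w_u, song_features(song)))


# Two different users score the same song; the features are computed once.
p1 = predict((1.0, 0.0), 27632)
p2 = predict((0.0, 1.0), 27632)
```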

SIMPLE EXPLORATION
Epsilon-greedy
[Plot: Rating against Songs, with Prediction]
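Epsilon-greedy exploration picks a random song with probability epsilon and otherwise the song with the highest predicted rating. A minimal sketch, with hypothetical predicted ratings and an arbitrary epsilon of 0.1:

```python
import random


def epsilon_greedy(predicted, epsilon, rng):
    """Explore uniformly with probability epsilon, else exploit the top prediction."""
    if rng.random() < epsilon:
        return rng.choice(sorted(predicted))
    return max(predicted, key=predicted.get)


predicted = {16: 2.1, 14: 3.7, 273: 4.2}  # hypothetical song id -> predicted rating
rng = random.Random(0)                    # seeded so runs are repeatable
greedy_pick = epsilon_greedy(predicted, 0.0, rng)  # epsilon = 0 never explores
picks = [epsilon_greedy(predicted, 0.1, rng) for _ in range(1000)]
```

The random picks are what counter the sample bias named earlier in the talk: songs the model currently rates low still get served occasionally, so feedback on them keeps arriving.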

ACTIVE LEARNING: LinUCB
[Plot: Rating against Songs, with Prediction and Uncertainty]
Look at upper confidence bound
Li, L., Chu, W., Langford, J., & Schapire, R. E. (2010). A contextual-bandit approach to personalized news article recommendation. WWW '10: Proceedings of the 19th International Conference on World Wide Web. New York, New York, USA: ACM. doi:10.1145/1772690.1772758
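The upper confidence bound in LinUCB adds an uncertainty term to the predicted rating, so items the model knows little about get a boost. The sketch below is a simplified single-model variant written for illustration, not Velox's implementation: it assumes a ridge-regression estimate with an identity prior, keeps the inverse matrix up to date with the Sherman-Morrison formula, and uses a hypothetical `alpha` of 1.0.

```python
import math


class LinUCB:
    def __init__(self, d, alpha=1.0):
        self.alpha = alpha
        # A starts as the identity matrix, so its inverse is the identity too.
        self.A_inv = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
        self.b = [0.0] * d

    def _matvec(self, x):
        return [sum(a * xi for a, xi in zip(row, x)) for row in self.A_inv]

    def ucb(self, x):
        """Predicted rating plus alpha times the uncertainty sqrt(x^T A^-1 x)."""
        theta = self._matvec(self.b)  # theta = A^-1 b
        mean = sum(t * xi for t, xi in zip(theta, x))
        width = math.sqrt(sum(xi * v for xi, v in zip(x, self._matvec(x))))
        return mean + self.alpha * width

    def update(self, x, reward):
        """Apply A += x x^T through Sherman-Morrison, and b += reward * x."""
        v = self._matvec(x)  # A^-1 x (A^-1 is symmetric)
        denom = 1.0 + sum(xi * vi for xi, vi in zip(x, v))
        for i in range(len(x)):
            for j in range(len(x)):
                self.A_inv[i][j] -= v[i] * v[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]


model = LinUCB(d=2, alpha=1.0)
seen, unseen = [1.0, 0.0], [0.0, 1.0]  # hypothetical feature vectors
before = model.ucb(seen)
for _ in range(10):
    model.update(seen, 1.0)
after_seen, after_unseen = model.ucb(seen), model.ucb(unseen)
```

After ten observations along `seen`, its uncertainty term shrinks while the term for `unseen` stays at its initial width.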

Pipeline
Tachyon + HDFS
NGINX
Node.js App Server
MongoDB
Velox
Prediction
Service
Model
Manager

Data
Training
Feedback
Mgmt.

Realtime Learning
Data
Training
Feedback
Mgmt.

USER-FACING API
GET
GET
POST /velox/catify/observe?userid=22&song=27632&score=3.7

ONLINE UPDATES
def observe(u: UUID, x: Context, y: Score)
w_u · f(x; θ)
Update w_u with new training point
Basis functions f(x; θ) stay fixed
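An online update changes only the user's weight vector and leaves the features alone. The slides do not say which update rule Velox applies, so the sketch below assumes a single gradient step on squared error with a hypothetical learning rate; the feature values are invented.

```python
def observe(w_u, f_x, y, learning_rate=0.1):
    """One gradient step on (y - w_u · f_x)^2 / 2; the features f_x stay fixed."""
    error = y - sum(w * f for w, f in zip(w_u, f_x))
    return [w + learning_rate * error * f for w, f in zip(w_u, f_x)]


f_x = [1.0, 2.0]  # hypothetical features f(x; θ) for one song
w_u = [0.0, 0.0]  # a new user's weights
for _ in range(50):
    w_u = observe(w_u, f_x, y=3.7)  # repeated feedback of 3.7 for this song
prediction = sum(w * f for w, f in zip(w_u, f_x))
```

Each step touches only a short vector, which is what makes per-observation updates cheap compared with retraining the shared feature functions.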

Realtime Learning + Offline Retraining
Data
Training
Feedback
Mgmt.

Velox Model Management System
Data
Feedback
Spark

The future of research in scalable learning systems will be in the
integration of the learning lifecycle:
Data
Training
Feedback

SUMMARY
• Model training and predictions rely on ad-hoc, manual processes spread across multiple systems
• The Velox system automatically maintains multiple models while providing low-latency, scalable, and personalized predictions
• Velox is part of BDAS and is coming soon…
• https://amplab.cs.berkeley.edu/projects/velox/

RETRAIN OFFLINE
def retrainOffline(sc: SparkContext, trainingData: RDD)
w_u · f(x; θ)
Retrain feature functions
Use Spark for batch retrain
