12. Problems
• Hard to figure out good features
• Hard to build the pipelines to generate features
• Can’t compute some features in real time
Solution: DSL and Feature Store
● Database of curated and crowd-sourced features
● Make it easy to use and transform these features in ML projects
● Make it easy to discover new useful features
● Batch and realtime serving
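The key property above is one lookup path for both batch and real-time serving. A minimal sketch of what such a feature-store client could look like, with illustrative names (this is not Uber's actual API; the two backing stores are modeled as plain dicts):

```python
# Hypothetical feature-store client: one publish path writes to both the
# offline (batch) store and the online (real-time) store, and one get()
# call serves both training and serving. All names are illustrative.

class FeatureStore:
    def __init__(self):
        # stand-ins for an offline warehouse and an online key-value store
        self._offline = {}
        self._online = {}

    def publish(self, entity_id, features):
        """Write curated features to both the batch and real-time stores."""
        for name, value in features.items():
            self._offline[(entity_id, name)] = value
            self._online[(entity_id, name)] = value

    def get(self, entity_id, feature_names, online=True):
        """Same API for batch training (online=False) and serving (online=True)."""
        store = self._online if online else self._offline
        return {n: store.get((entity_id, n)) for n in feature_names}

fs = FeatureStore()
fs.publish("restaurant:42", {"avg_prep_time": 11.5, "avg_demand_lunch": 87})
print(fs.get("restaurant:42", ["avg_prep_time"]))  # {'avg_prep_time': 11.5}
```

The point of the shared `get()` is training/serving consistency: a model sees features computed by the same code at training time and prediction time.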
13. Data Pipeline For Predictions
[Diagram] Data Lake → Spark or SQL → Basis Features → Feature DSL → Transformed Features → ML Model → Predictions
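The pipeline stages above can be sketched as code: a Spark/SQL job over the data lake yields basis features, which a small feature DSL then transforms before the model sees them. The DSL syntax here is a toy stand-in (a map of named transforms), not the platform's actual DSL:

```python
# Toy sketch of the prediction pipeline stages. A real system would pull
# basis_features from a Spark or SQL job over the data lake; here it is a
# hard-coded row. The "DSL" is an assumed, illustrative representation.

basis_features = {            # stand-in for one row of a Spark/SQL result
    "order_total_cents": 2350,
    "n_items": 3,
}

# Feature DSL: named transforms over basis features
dsl = {
    "order_total_dollars": lambda f: f["order_total_cents"] / 100.0,
    "avg_item_cost": lambda f: f["order_total_cents"] / f["n_items"] / 100.0,
}

# Apply every DSL transform to produce the features the model consumes
transformed = {name: fn(basis_features) for name, fn in dsl.items()}
print(transformed["order_total_dollars"])  # 23.5
```

Expressing transforms in a shared DSL (rather than ad-hoc code per project) is what lets the same feature definitions run in both the batch and real-time paths.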
14. Data Pipeline For Predictions w/ Feature Palette
[Diagram] Data Lake → Spark or SQL → Basis Features → Feature DSL → Transformed Features → ML Model → Predictions, with a Feature Store storing and serving the features
15. Use Case: UberEATS ETD Model Details
[Diagram] UberEATS App → Model: GBT Regression (fed by the feature store) → ETD
● restaurant features
○ location, avg prep time, avg delivery time, avg demand during lunch, ...
● contextual features
○ time of day, day of week, ...
● order features
○ #items, total cost, ...
● near-real-time features
○ info about the past N orders, ...
● ...
● Feature store provides aggregate features for real-time prediction
○ These features are time-consuming to compute in real time
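One way the near-real-time aggregates above can stay cheap at prediction time is to maintain them incrementally as orders arrive, instead of recomputing from raw events per request. A minimal sketch with illustrative names (a rolling average prep time over the past N orders):

```python
# Sketch of a near-real-time aggregate feature: average prep time over
# the past N orders, updated incrementally so the serving path only does
# a cheap lookup. Class and method names are illustrative assumptions.
from collections import deque

class RollingPrepTime:
    def __init__(self, n=100):
        self.window = deque(maxlen=n)  # keeps only the past N orders

    def record_order(self, prep_seconds):
        """Called from the event stream each time an order completes."""
        self.window.append(prep_seconds)

    def avg_prep_time(self):
        """Cheap read at prediction time; None if no orders seen yet."""
        return sum(self.window) / len(self.window) if self.window else None

agg = RollingPrepTime(n=3)
for s in (600, 720, 660):
    agg.record_order(s)
print(agg.avg_prep_time())  # 660.0
```

In a production feature store this state would live in an online store keyed by restaurant, but the shape of the computation is the same.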
16. Problem
● Often you want to train a model per city
● But it's hard to train and manage 400+ models for one project
Solution
● Let users define a partitioning scheme
● Automatically train a model per partition
● Manage and deploy them as a single logical model
22. [Diagram: tree of per-partition models, one model M at each node]
6. At prediction time, use best model for each node
24. Use Case: UberEATS ETD Prediction Performance
● Partitioned GBDT Regression Model
● Latency (measured from client)
○ p50: 7ms
○ p95: 15ms
○ p99: 20ms
25. Conclusion
● We present a scalable ML-as-a-service system
● We focus on the scalability challenges and solutions
○ Feature store key to enable aggregate features for real-time prediction
■ Same API to access feature store for both batch training and real-time prediction
○ Partitioned models greatly simplify model management and selection
■ Per-city model performance is often worse than the global model's
○ Scalable low latency real-time prediction service enables interactive user experiences
■ Load balancing across containers without global state
■ Fast one button deployment
■ Hot swap model upgrade