12. Problems
• Hard to figure out good features
• Hard to build the pipelines to generate features
• Can’t compute some features in real time
Solution: DSL and Feature Store
● Database of curated and crowd-sourced features
● Make it easy to use and transform these features in ML projects
● Make it easy to discover new useful features
● Batch and realtime serving
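The key property above is one lookup path for both batch and real-time serving. A minimal sketch of what such a feature-store client could look like, with illustrative names (this is not Uber's actual API; the two backing stores are modeled as plain dicts):

```python
# Hypothetical feature-store client: one publish path writes to both the
# offline (batch) store and the online (real-time) store, and one get()
# call serves both training and serving. All names are illustrative.

class FeatureStore:
    def __init__(self):
        # stand-ins for an offline warehouse and an online key-value store
        self._offline = {}
        self._online = {}

    def publish(self, entity_id, features):
        """Write curated features to both the batch and real-time stores."""
        for name, value in features.items():
            self._offline[(entity_id, name)] = value
            self._online[(entity_id, name)] = value

    def get(self, entity_id, feature_names, online=True):
        """Same API for batch training (online=False) and serving (online=True)."""
        store = self._online if online else self._offline
        return {n: store.get((entity_id, n)) for n in feature_names}

fs = FeatureStore()
fs.publish("restaurant:42", {"avg_prep_time": 11.5, "avg_demand_lunch": 87})
print(fs.get("restaurant:42", ["avg_prep_time"]))  # {'avg_prep_time': 11.5}
```

The point of the shared `get()` is training/serving consistency: a model sees features computed by the same code at training time and prediction time.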
13. Data Pipeline For Predictions
[Diagram] Data Lake → Spark or SQL → Basis Features → Feature DSL → Transformed Features → ML Model → Predictions
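The pipeline stages above can be sketched as code: a Spark/SQL job over the data lake yields basis features, which a small feature DSL then transforms before the model sees them. The DSL syntax here is a toy stand-in (a map of named transforms), not the platform's actual DSL:

```python
# Toy sketch of the prediction pipeline stages. A real system would pull
# basis_features from a Spark or SQL job over the data lake; here it is a
# hard-coded row. The "DSL" is an assumed, illustrative representation.

basis_features = {            # stand-in for one row of a Spark/SQL result
    "order_total_cents": 2350,
    "n_items": 3,
}

# Feature DSL: named transforms over basis features
dsl = {
    "order_total_dollars": lambda f: f["order_total_cents"] / 100.0,
    "avg_item_cost": lambda f: f["order_total_cents"] / f["n_items"] / 100.0,
}

# Apply every DSL transform to produce the features the model consumes
transformed = {name: fn(basis_features) for name, fn in dsl.items()}
print(transformed["order_total_dollars"])  # 23.5
```

Expressing transforms in a shared DSL (rather than ad-hoc code per project) is what lets the same feature definitions run in both the batch and real-time paths.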
14. Data Pipeline For Predictions w/ Feature Palette
[Diagram] Data Lake → Spark or SQL → Basis Features → Feature DSL → Transformed Features → ML Model → Predictions, with a Feature Store storing and serving the features
15. Use Case: UberEATS ETD Model Details
[Diagram] UberEATS App → Model: GBT Regression (fed by the feature store) → ETD
● restaurant features
○ location, avg prep time, avg delivery time, avg demand during lunch, ...
● contextual features
○ time of day, day of week, ...
● order features
○ #items, total cost, ...
● near-real-time features
○ info about the past N orders, ...
● ...
● Feature store provides aggregate features for real-time prediction
○ These features are time-consuming to compute in real time
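One way the near-real-time aggregates above can stay cheap at prediction time is to maintain them incrementally as orders arrive, instead of recomputing from raw events per request. A minimal sketch with illustrative names (a rolling average prep time over the past N orders):

```python
# Sketch of a near-real-time aggregate feature: average prep time over
# the past N orders, updated incrementally so the serving path only does
# a cheap lookup. Class and method names are illustrative assumptions.
from collections import deque

class RollingPrepTime:
    def __init__(self, n=100):
        self.window = deque(maxlen=n)  # keeps only the past N orders

    def record_order(self, prep_seconds):
        """Called from the event stream each time an order completes."""
        self.window.append(prep_seconds)

    def avg_prep_time(self):
        """Cheap read at prediction time; None if no orders seen yet."""
        return sum(self.window) / len(self.window) if self.window else None

agg = RollingPrepTime(n=3)
for s in (600, 720, 660):
    agg.record_order(s)
print(agg.avg_prep_time())  # 660.0
```

In a production feature store this state would live in an online store keyed by restaurant, but the shape of the computation is the same.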
16. Problem
● Often you want to train a model per city
● But it's hard to train and manage 400+ models for one project
Solution
● Let users define a partitioning scheme
● Automatically train a model per partition
● Manage and deploy them as a single logical model
22. [Diagram: tree of per-partition models, one model M at each node]
6. At prediction time, use best model for each node
24. Use Case: UberEATS ETD Prediction Performance
● Partitioned GBDT Regression Model
● Latency (measured from client)
○ p50: 7ms
○ p95: 15ms
○ p99: 20ms
25. Conclusion
● We present a scalable ML-as-a-service system
● We focus on the scalability challenges and solutions
○ Feature store key to enable aggregate features for real-time prediction
■ Same API to access feature store for both batch training and real-time prediction
○ Partitioned models greatly simplify model management and selection
■ Per-city model performance is often worse than the global model's
○ Scalable low latency real-time prediction service enables interactive user experiences
■ Load balancing across containers without global state
■ Fast one button deployment
■ Hot swap model upgrade