ML Infra for Netflix Recommendations - AI NEXTCon talk

Faisal Siddiqi (@faisalzs)
12 Apr 2018
Machine Learning Infra for Recommendations
Goal
Create personalized recommendations for discovery of engaging video content that maximizes member joy
Outline for today
● Domain Context
● ML Infra stack for Personalization
● Deeper dives into 2 major ML Infra components
Personalize everything!
● Row selection & order
● Titles ranked by relevance
● Artwork!
Personalization Context
Data from millions of users → Training pipelines → Models → Precompute system → Rankings → Online caches → A/B test allocation
Joy == Rainbow
The Personalization Rainbow: personalization systems & infrastructure
Stages: Training Data Preparation → Training → Feature Engineering → Model Quality → Intent To Treat (Serving) → Treatment & Action → Inference & Logging
Planes: Control Plane, Online, Device, Function, Offline
The Personalization Rainbow: systems mapped to stages
● Training Data Preparation: Label Generation, Fact Store
● Training, Feature Engineering, Model Quality, Inference & Logging: Boson, Algo Commons
● Serving & precompute: Caching, Dynamic param management, A/B Testing Platform, Online & Precompute Framework, Personalization Aggregation, Fact Logging, Device Logging, Online Services API
● Cross-cutting: Notebooks, Orchestration, Hyperparameter Optimization
● Planes: Control Plane, Online, Device, Function, Offline
We’ll zoom into Boson & AlgoCommons today
The Context for AlgoCommons & Boson
● Machine Learning via ‘loosely-coupled, highly-aligned’ Scala/Java Libraries
● Historical context
○ Siloed machine learning infrastructure
○ Few opportunities for sharing
■ Incompatibility
■ Dependency concerns
■ Improvements in one pipeline not shared across others
Design Principles
● Composability
○ Ability to put pieces together in novel ways
○ Enable construction of generic tools
● Portability
○ Easily share code online/offline and between applications
○ Models, Feature encoders, Common data manipulation
● Avoiding Training-Serving Skew
○ Serving/Online systems are Java-based, which drives the choice of offline software
○ Share code & data between offline/online worlds
AlgoCommons & Boson across the stages (Training, Feature Engineering, Model Quality, Inference & Logging)
● Boson: Delorean Time Travel, Feature Generation, Feature Transformers, Label Joins, Feature Schema, Stratification & Sampling, Data Fetchers & utilities, Training API, Model Tuning, Spot Checks (human-in-the-loop), Visualization, Feature Importance, Validation Runs, Training Metrics
● AlgoCommons: Abstractions, Feature Sharing, Component Sets, Data Maps, Feature Encoders, Specification, Common Model Format (JSON), Metrics Framework, Predictions, Inferencing Metrics, Scoring, Model Loading, Inferencing
● Batch Training over distributed Spark or Dockerized containers
AlgoCommons
Overview
● Common abstractions and building blocks for ML
● Integrated in Java microservices for online or pre-computed Inferencing
● Library > framework (user-focus)
● Program to interfaces (composability)
● Aggressive modularization to avoid Jar Hell (portability)
● Data Access Abstraction (portability, testability)
Common abstractions and Building Blocks
● Data
○ Data Keys
○ Data Maps
● Modeling
○ Component Sets
○ Feature Encoders, Predictor, Scorer
○ Model Format
● Metrics
Data Access - Abstractions
● DataKey<T>: identifies a data value by name/type, e.g. "ViewingHistory"
● Data Value: preferably an immutable data structure
● DataMap: map from DataKey<T> to T, plus metadata
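The abstractions above can be sketched as a type-safe heterogeneous map. This is an illustrative sketch only; the class and method names are assumptions, not the actual AlgoCommons API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Minimal sketch of DataKey<T>/DataMap (illustrative names, not the real API).
public class DataMapSketch {

    // Identifies a data value by name and type, e.g. "ViewingHistory".
    public static final class DataKey<T> {
        final String name;
        final Class<T> type;
        public DataKey(String name, Class<T> type) {
            this.name = name;
            this.type = type;
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof DataKey)) return false;
            DataKey<?> k = (DataKey<?>) o;
            return name.equals(k.name) && type.equals(k.type);
        }
        @Override public int hashCode() { return Objects.hash(name, type); }
    }

    // Map from DataKey<T> to T; the key's Class token makes get() type-safe.
    public static final class DataMap {
        private final Map<DataKey<?>, Object> values = new HashMap<>();
        public <T> DataMap put(DataKey<T> key, T value) {
            values.put(key, value);
            return this;
        }
        public <T> T get(DataKey<T> key) {
            return key.type.cast(values.get(key));
        }
    }

    public static String demo() {
        DataKey<String> viewingHistory = new DataKey<>("ViewingHistory", String.class);
        DataMap map = new DataMap().put(viewingHistory, "stranger-things,dark");
        return map.get(viewingHistory);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The Class token stored in each key is what lets one map hold values of different types without unchecked casts at the call site.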
Data Access - Lifecycle
1. Data Retrieval: the Application asks the Component Factory "What DataKeys do you need?"; the factory replies "I need X, Y, and Z"; the application makes a DataMap with X, Y, and Z
2. Component Instantiation / Data Prep: the application calls f.create(dataMap); the factory returns new Component(X, Y, Z)
3. Component Application (repeat as needed): comp.do(someInput)
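The three lifecycle phases can be sketched as a factory that declares its required keys up front. Everything here is a hypothetical stand-in (a plain Map plays the role of the DataMap); it shows the shape of the contract, not the real library.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Sketch of the create-then-apply lifecycle (illustrative, not the real API).
public class LifecycleSketch {

    interface Component { double apply(double input); }

    interface ComponentFactory {
        List<String> requiredKeys();                   // "What DataKeys do you need?"
        Component create(Map<String, Object> dataMap); // build from prepared data
    }

    // Example factory: needs a "weight" value, produces a scaling component.
    static final ComponentFactory SCALER = new ComponentFactory() {
        public List<String> requiredKeys() { return Arrays.asList("weight"); }
        public Component create(Map<String, Object> dataMap) {
            final double w = (Double) dataMap.get("weight");
            return input -> input * w;
        }
    };

    public static double demo() {
        // 1. Data retrieval: fetch exactly what the factory asked for.
        Map<String, Object> dataMap = Map.of(SCALER.requiredKeys().get(0), (Object) 2.5);
        // 2. Component instantiation / data prep.
        Component comp = SCALER.create(dataMap);
        // 3. Component application (repeat as needed).
        return comp.apply(4.0);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Separating "what data do you need" from "build the component" is what lets the same factory run online (keys fetched from services) and offline (keys read from logged facts).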
DataTransform
● DataMap => K/V
● Given zero or more key/values, produce a new key/value
● Consumable by other data transforms, feature encoders, and components
Feature Encoder
● DataMap ⇒ (T ⇒ FeatureSet)
● FeatureEncoder<T> create(DataMap)
○ Given a DataMap, initialize a new encoder doing any required data prep
● void encode(T, FeatureSet)
○ Given an item (say, a Video), encode features for it into the feature set
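The two-phase contract above (create once with data prep, then encode per item) can be sketched as follows. Types and names are simplified stand-ins for illustration, not the actual AlgoCommons interfaces.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hedged sketch of the Feature Encoder contract: create(...) does the data
// prep once; encode(item, featureSet) is then called for each item.
public class EncoderSketch {

    // Simplified FeatureSet: feature name -> value.
    static final class FeatureSet extends LinkedHashMap<String, Double> {}

    interface FeatureEncoder<T> { void encode(T item, FeatureSet out); }

    // Factory phase: given prepared data (here, a popularity table), return
    // an encoder closed over it.
    static FeatureEncoder<String> create(Map<String, Double> popularityTable) {
        return (videoId, out) ->
            out.put("popularity", popularityTable.getOrDefault(videoId, 0.0));
    }

    public static double demo() {
        FeatureEncoder<String> enc = create(Map.of("some-video", 0.9));
        FeatureSet fs = new FeatureSet();
        enc.encode("some-video", fs);   // per-item phase
        return fs.get("popularity");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```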
Feature Transform
● Expression “language” for transforming features to produce new features
○ aka Feature Interactions
● Many operators available
○ log, outer/inner product, arithmetic, logic
● Expressions can be arbitrarily “stacked”
● Expressions are automatically de-duplicated
Predictor
● Compute a score for a feature vector
● DataMap ⇒ (Vector ⇒ Double)
○ Predictor create(DataMap)
■ Given a data map, construct a new predictor
○ double predict(Vector)
■ Given a feature vector, compute a prediction/score
● Supports many Predictors:
○ LR, RegressionTree, TensorFlow, XGBoost,
WeightedAdditiveEnsemble, FeatureWeighted, MultivariatePredictors,
BanditPredictor, Sequence-to-sequence,...
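The Predictor contract above reduces to a single scoring function over a feature vector. The linear model below is only an illustration of that shape (the real library supports LR, trees, TensorFlow, XGBoost, and more); the names are assumptions.

```java
// Minimal sketch of the Predictor contract: create builds the predictor,
// predict(vector) scores one feature vector. Illustrative, not the real API.
public class PredictorSketch {

    interface Predictor { double predict(double[] features); }

    // A linear-regression-style predictor: bias plus dot product.
    static Predictor linear(double[] weights, double bias) {
        return features -> {
            double score = bias;
            for (int i = 0; i < weights.length; i++) {
                score += weights[i] * features[i];
            }
            return score;
        };
    }

    public static double demo() {
        Predictor p = linear(new double[] {0.5, 2.0}, 0.1);
        return p.predict(new double[] {1.0, 0.25}); // 0.1 + 0.5 + 0.5
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because every model reduces to this one interface, ensembles (e.g. weighted additive) can themselves be Predictors that delegate to inner Predictors.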
Scorer
● Compute a score for business objects
● DataMap ⇒ (T ⇒ Double)
● Scorer<T> create(DataMap)
○ Given a data map, construct a new Scorer<T>.
● double score(T)
○ Given an item, compute a score
Extensible Model Definition
● Component abstraction
● JSON model serialization
● Various “views” of the Model
○ Feature gen
○ Prediction
○ Scoring
{
"@id" : "my-model",
"@schema" : "SimpleFeatureScoringModel",
"dataTransforms" : [ ... data transforms ...],
"featureEncoders" : [ ... feature defs ...],
"featureTransform" : { ... feature interactions ... },
"predictor" : { ... ML model (weights, etc.) ... }
}
Views of the Feature Scoring Model
App Data flows through Data Transforms into Feature Encoders, a Feature Transform, and finally the Predictor; the model exposes matching views of this pipeline: DataTransformView, FeatureGeneratorView, PredictorView, and the overall ScoringModelView
Metrics
● Building blocks
○ Accumulators
○ Estimators
● Ranking
○ Precision, Recall
○ Recall@Rank, NormalizedMeanReciprocalRank
● Regression
○ Error Accumulators
○ RMSE
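The accumulator building block above can be illustrated with RMSE: errors stream in one at a time, and the estimate is computed at the end. This is a sketch of the idea only; the class and method names are not the real AlgoCommons metrics API.

```java
// Sketch of an error accumulator for RMSE (illustrative names).
public class RmseAccumulator {
    private double sumSquaredError = 0.0;
    private long count = 0;

    // Accumulate one (prediction, actual) pair.
    public void accumulate(double predicted, double actual) {
        double err = predicted - actual;
        sumSquaredError += err * err;
        count++;
    }

    // Estimate the metric from everything accumulated so far.
    public double estimate() {
        return count == 0 ? 0.0 : Math.sqrt(sumSquaredError / count);
    }

    public static double demo() {
        RmseAccumulator rmse = new RmseAccumulator();
        rmse.accumulate(1.0, 0.0);  // error 1
        rmse.accumulate(4.0, 1.0);  // error 3
        return rmse.estimate();     // sqrt((1 + 9) / 2) = sqrt(5)
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Ranking metrics (Recall@Rank, normalized MRR) follow the same accumulate/estimate split, which is what makes the framework usable both offline and for live inferencing metrics.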
Boson
Motivation
Provide high productivity for exploration, but make it easy to move between experimentation and production
Overview
● A high-level Scala API for ML exploration
● Focuses on offline training for both
○ Ad-hoc exploration
○ Production training
● Think "subset of scikit-learn" for the Scala/JVM ecosystem
● Spark's DataFrame is a core data abstraction
Data Utilities
● Utilities for data transfer between heterogeneous systems
● Leverage Spark for data munging, but need a bridge to Docker trainers
○ Standalone S3 downloader and Parquet reader
○ S3 + s3fs-fuse
○ HDFS + hdfs-fuse
● On-the-wire formats
○ Parquet
○ Protobuf
Feature Schema
● Context: the setting for evaluating a set of items (member profiles, country, etc.)
● Items: the elements to be trained on and scored (videos, rows, etc.)
Stratification
dataframe.stratify(samplingRules =
  $("column_foo") == 'US' maxPercent 8.0,
  $("column_bar") > 10 && $("column_qux") > 1 minPercent 0.5,
  ...
)
● A generalized API on Spark DataFrames
● Native SparkSQL expressions
● Emphasis on type-safety
● Many stratification attributes: Country, Devices, Searches, ...
Feature Transformers
The feature generation pipeline is a sequence of Transformers. A Transformer takes a dataframe and, based on contexts, performs computations on it and returns a new dataframe.
Dataset Type Tagger → Country Tenure Stratified Sampler → Negative Generator → ...
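The chain above can be sketched as transformers applied in sequence, each producing a new frame. A List of rows stands in for the DataFrame here, and the stage names are illustrative assumptions, not Boson's actual transformers.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

// Sketch of a transformer pipeline: each stage takes a frame, returns a new one.
public class TransformerPipeline {

    interface Transformer extends UnaryOperator<List<String>> {}

    // Fold the frame through every stage in order.
    static List<String> run(List<String> frame, List<Transformer> stages) {
        for (Transformer t : stages) frame = t.apply(frame);
        return frame;
    }

    public static int demo() {
        // Hypothetical stages: a tagger, then a sampler that keeps two rows.
        Transformer tagger = rows -> rows.stream()
                .map(r -> r + ":tagged")
                .collect(Collectors.toList());
        Transformer sampler = rows -> rows.subList(0, Math.min(2, rows.size()));
        return run(Arrays.asList("a", "b", "c"), List.of(tagger, sampler)).size();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because each stage returns a new frame rather than mutating its input, stages can be reordered, dropped, or reused across pipelines independently.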
Feature Generation - Putting it together
[Diagram: the Fact Store supplies Required Data and Label Data as structured data in a DataFrame; Catalyst Expressions and AlgoCommons Feature Encoders turn Required Feature DataMaps and POJO Maps of Data into Features, producing Structured Labeled Features and a Feature Model that feed Model Training]
Training
● Need flexibility and access to trainers in all languages/environments
● A simple unified Training API for
○ Synchronous & asynchronous training
○ A single Docker container or distributed (Spark)
● Inputs: the training set as a Spark Dataset, plus model params
● Returns: a Model abstraction wrapping an AlgoCommons PredictorConfig
● Supports many popular learning tools and trainers
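The unified API above can be sketched as one trainer interface with an async wrapper layered on top, regardless of where the trainer actually runs. All names here are hypothetical, and the toy trainer just computes a mean; it stands in for real trainers like XGBoost on Docker or Spark.

```java
import java.util.concurrent.CompletableFuture;

// Sketch of a unified sync/async training API (illustrative names only).
public class TrainerSketch {

    static final class Model {
        final double weight;           // stands in for a PredictorConfig wrapper
        Model(double weight) { this.weight = weight; }
    }

    interface Trainer {
        Model train(double[][] trainingSet, double learningRate);
    }

    // The async variant is built on the same synchronous contract.
    static CompletableFuture<Model> trainAsync(Trainer t, double[][] data, double lr) {
        return CompletableFuture.supplyAsync(() -> t.train(data, lr));
    }

    public static double demo() {
        // Toy trainer: "learns" the mean of column 0.
        Trainer meanTrainer = (data, lr) -> {
            double sum = 0;
            for (double[] row : data) sum += row[0];
            return new Model(sum / data.length);
        };
        double[][] set = { {1.0}, {3.0} };
        try {
            return trainAsync(meanTrainer, set, 0.1).get().weight;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```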
Metrics
● Leverages AlgoCommons Metrics framework
● Context Level Metrics
○ Supports ranking metrics: nMRR, Recall, nDCG, etc.
○ Supports AlgoCommons models or custom scoring functions
○ Users can slice and dice the metrics
○ Users can aggregate them using SQL
■ Performant implementation using Spark SQL catalyst expressions
● Item Level Metrics
○ E.g. row popularity
Visualization
● Integrates with an open-sourced Scala library for matplotlib-like visualizations
Lessons learnt
● Machine learning is an iterative and data-sensitive process
○ Make exploration easy, and productionizing robust
○ Make it easy to switch between the two
● Design components with a general, flexible interface
○ Specialize interfaces when you need to
● Testing can be hard, but worthwhile
○ Unit, Integration, Data Checks, Continuous Integration, @Scale Testing
○ Metric-driven system validations
The Personalization Rainbow (recap): personalization systems & infrastructure
Joy
Thank you!
(and yes, we’re hiring)
Questions