Faisal Siddiqi (@faisalzs)
12 Apr 2018
Machine Learning Infra for Recommendations
Create Personalized recommendations for discoveries
of engaging video content that maximize member joy
Goal
Outline for today
● Domain Context
● ML Infra stack for Personalization
● Deeper dives into 2 major ML Infra components
Personalize everything!
● Row selection & order
● Titles ranked by relevance
● Artwork!
Personalization Context
[Diagram labels: Data from millions of users; Training pipelines; Models; Precompute System; Rankings; Online caches; AB Test Allocation]
Joy == Rainbow
The Personalization Rainbow
[Diagram: Personalization systems & infrastructure. Stage labels: Training Data Preparation; Training; Feature Engineering; Model Quality; Intent To Treat (Serving); Treatment & Action; Inference & Logging. Other labels: Online; Device; Function; Offline; Control Plane]
The Personalization Rainbow
[Diagram: Personalization systems & infrastructure. Stage labels: Label Generation; Fact Store; Training Data Preparation; Training; Feature Engineering; Model Quality; Intent To Treat (Serving); Treatment & Action; Inference & Logging. Infrastructure labels: Hyperparameter Optimization; Notebooks; Caching; Dynamic param management; A/B Testing Platform; Online & Precompute Framework; Personalization Aggregation; Fact Logging; Device Logging; Online Services API; Boson; Algo Commons; Orchestration. Other labels: Control Plane; Online; Device; Function; Offline]
[Diagram labels: Training; Feature Engineering; Model Quality; Inference & Logging; Boson; Algo Commons]
We’ll zoom into Boson & AlgoCommons today
The Context for AlgoCommons & Boson
● Machine Learning via ‘loosely-coupled, highly-aligned’ Scala/Java Libraries
● Historical context
○ Siloed machine learning infrastructure
○ Few opportunities for sharing
■ Incompatibility
■ Dependency concerns
■ Improvements in one pipeline not shared across others
Design Principles
● Composability
○ Ability to put pieces together in novel ways
○ Enable construction of generic tools
● Portability
○ Easily share code online/offline and between applications
○ Models, Feature encoders, Common data manipulation
● Avoiding Training-Serving Skew
○ Serving/Online systems are Java-based, which drives the choice of offline software
○ Share code & data between offline/online worlds
[Diagram: components of Boson and AlgoCommons. Stage labels: Training; Feature Engineering; Model Quality; Inference & Logging. Component labels: Delorean Time Travel; Feature Generation; Feature Transformers; Label Joins; Feature Schema; Stratification & Sampling; Data Fetchers & utilities; Training API; Model Tuning; Spot Checks (human-in-the-loop); Visualization; Feature Importance; Validation Runs; Training Metrics; Abstractions; Feature Sharing; Component Sets; Data Maps; Feature Encoders; Specification; Common Model Format (JSON); Metrics Framework; Predictions; Inferencing Metrics; Scoring; Model Loading; Inferencing. Other labels: Boson; AlgoCommons; AlgoCommons & Boson; Batch Training over Distributed Spark or Dockerized Containers]
AlgoCommons
Overview
● Common abstractions and building blocks for ML
● Integrated in Java microservices for online or pre-computed inferencing
● Library > framework (user-focus)
● Program to interfaces (composability)
● Aggressive modularization to avoid Jar Hell (portability)
● Data Access Abstraction (portability, testability)
Common abstractions and Building Blocks
● Data
○ Data Keys
○ Data Maps
● Modeling
○ Component Sets
○ Feature Encoders, Predictor, Scorer
○ Model Format
● Metrics
AlgoCommons
Data Access - Abstractions
DataKey<T>
○ Identifies a data value by name/type, e.g. "ViewingHistory"
Data Value
○ Preferably immutable data structure
DataMap
○ Map from DataKey<T> to T, plus metadata
AlgoCommons
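The DataKey / DataMap idea above can be sketched in a few lines of Java. This is a minimal illustration, not the AlgoCommons API: the class shapes, the `ViewingHistory` value type, and the `Country` key are assumptions made for the example, and the metadata part of DataMap is left out.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical typed key: identifies a data value by name and type. */
final class DataKey<T> {
    final String name;
    final Class<T> type;

    DataKey(String name, Class<T> type) {
        this.name = name;
        this.type = type;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof DataKey)) return false;
        DataKey<?> other = (DataKey<?>) o;
        return name.equals(other.name) && type.equals(other.type);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, type);
    }
}

/** Immutable data value, as the slide recommends. */
final class ViewingHistory {
    final List<String> videoIds;

    ViewingHistory(List<String> videoIds) {
        this.videoIds = List.copyOf(videoIds); // unmodifiable defensive copy
    }
}

/** Hypothetical map from DataKey<T> to T; the metadata part is omitted. */
final class DataMap {
    private final Map<DataKey<?>, Object> values = new HashMap<>();

    <T> DataMap put(DataKey<T> key, T value) {
        values.put(key, value);
        return this;
    }

    <T> T get(DataKey<T> key) {
        Object value = values.get(key);
        if (value == null) {
            throw new IllegalArgumentException("Missing data for key: " + key.name);
        }
        return key.type.cast(value); // checked cast back to T
    }
}

class DataMapDemo {
    static final DataKey<ViewingHistory> VIEWING_HISTORY =
            new DataKey<>("ViewingHistory", ViewingHistory.class);
    static final DataKey<String> COUNTRY = new DataKey<>("Country", String.class);

    public static void main(String[] args) {
        DataMap data = new DataMap()
                .put(VIEWING_HISTORY, new ViewingHistory(List.of("v1", "v2")))
                .put(COUNTRY, "US");
        ViewingHistory history = data.get(VIEWING_HISTORY); // no cast at the call site
        System.out.println(history.videoIds + " / " + data.get(COUNTRY));
    }
}
```

Carrying the type in the key is what lets `get` return a `T` without casts in application code, which fits the portability and testability goals listed earlier.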
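Data Access - Lifecycle
Participants: Application, Component Factory, Component
1. Data Retrieval
○ Application asks the factory: What DataKeys do you need?
○ Factory answers: I need X, Y, and Z
○ Application makes a DataMap w/ X, Y, and Z
2. Component Instantiation / Data Prep
○ Application calls f.create(dataMap)
○ Factory calls new Component(X, Y, Z)
○ Factory returns comp
3. Component Application (repeat as needed)
○ Application calls comp.do(someInput)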
AlgoCommons
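The lifecycle above (ask the factory for its keys, build a DataMap, create the component, apply it) might look roughly like this in Java. Everything here is a hypothetical sketch: `ComponentFactory`, `requiredKeys`, `putRaw`, and the toy `Component` are illustrative names, and `apply` is used because `do` is a reserved word in Java.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Compact stand-ins for DataKey/DataMap (hypothetical shapes, identity-based keys).
final class DataKey<T> {
    final String name;
    final Class<T> type;
    DataKey(String name, Class<T> type) { this.name = name; this.type = type; }
}

final class DataMap {
    private final Map<DataKey<?>, Object> values = new HashMap<>();
    DataMap putRaw(DataKey<?> key, Object value) {
        values.put(key, key.type.cast(value)); // fail fast on a wrongly typed value
        return this;
    }
    <T> T get(DataKey<T> key) { return key.type.cast(values.get(key)); }
}

/** Factory side of the lifecycle: declares its data needs, then builds the component. */
interface ComponentFactory<C> {
    List<DataKey<?>> requiredKeys(); // "What DataKeys do you need?"
    C create(DataMap data);          // component instantiation / data prep
}

final class Keys {
    static final DataKey<Double> X = new DataKey<>("X", Double.class);
    static final DataKey<Double> Y = new DataKey<>("Y", Double.class);
    static final DataKey<Double> Z = new DataKey<>("Z", Double.class);
}

/** Toy component: evaluates x*t^2 + y*t + z for an input t. */
final class Component {
    private final double x, y, z;
    Component(double x, double y, double z) { this.x = x; this.y = y; this.z = z; }
    double apply(double input) { return x * input * input + y * input + z; }
}

final class Factory implements ComponentFactory<Component> {
    @Override public List<DataKey<?>> requiredKeys() { return List.of(Keys.X, Keys.Y, Keys.Z); }
    @Override public Component create(DataMap data) {
        return new Component(data.get(Keys.X), data.get(Keys.Y), data.get(Keys.Z));
    }
}

class LifecycleDemo {
    /** Data retrieval: the application builds a DataMap holding only the requested keys. */
    static DataMap retrieve(List<DataKey<?>> keys, Map<String, Object> source) {
        DataMap data = new DataMap();
        for (DataKey<?> key : keys) {
            data.putRaw(key, source.get(key.name));
        }
        return data;
    }

    static double run(double input) {
        // Hypothetical data source; "Unused" is never fetched because no factory asks for it.
        Map<String, Object> source = Map.of("X", 1.0, "Y", 2.0, "Z", 3.0, "Unused", 99.0);
        Factory f = new Factory();
        DataMap data = retrieve(f.requiredKeys(), source); // make DataMap w/ X, Y, and Z
        Component comp = f.create(data);                   // new Component(X, Y, Z)
        return comp.apply(input);                          // repeat as needed
    }

    public static void main(String[] args) {
        System.out.println(run(2.0)); // 1*4 + 2*2 + 3 = 11.0
    }
}
```

Because the factory states its needs up front, the same component can be fed from an online cache or an offline dataset without code changes.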
DataTransform
● DataMap => K/V
● Given zero or more key/values, produce a new key/value
● Consumable by other data transforms, feature encoders, and components
AlgoCommons
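A DataTransform as described above reads existing key/values and produces one new key/value that later stages can consume. The sketch below is an assumption about how that could be shaped in Java; the `DataTransform` class, its `applyTo` method, and the example keys are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Compact stand-ins for DataKey/DataMap (hypothetical shapes, identity-based keys).
final class DataKey<T> {
    final String name;
    final Class<T> type;
    DataKey(String name, Class<T> type) { this.name = name; this.type = type; }
}

final class DataMap {
    private final Map<DataKey<?>, Object> values = new HashMap<>();
    <T> DataMap put(DataKey<T> key, T value) { values.put(key, value); return this; }
    <T> T get(DataKey<T> key) { return key.type.cast(values.get(key)); }
}

/** Hypothetical DataTransform: reads key/values from a DataMap and yields one new key/value. */
final class DataTransform<T> {
    final DataKey<T> outputKey;
    private final Function<DataMap, T> compute;

    DataTransform(DataKey<T> outputKey, Function<DataMap, T> compute) {
        this.outputKey = outputKey;
        this.compute = compute;
    }

    /** Computes the value and stores it so later transforms and encoders can consume it. */
    void applyTo(DataMap data) {
        data.put(outputKey, compute.apply(data));
    }
}

class DataTransformDemo {
    static final DataKey<Integer> WATCHED = new DataKey<>("WatchedCount", Integer.class);
    static final DataKey<Integer> CATALOG = new DataKey<>("CatalogSize", Integer.class);
    static final DataKey<Double> COVERAGE = new DataKey<>("Coverage", Double.class);
    static final DataKey<Boolean> HEAVY = new DataKey<>("HeavyViewer", Boolean.class);

    static DataMap run() {
        DataMap data = new DataMap().put(WATCHED, 30).put(CATALOG, 120);
        DataTransform<Double> coverage =
                new DataTransform<>(COVERAGE, d -> d.get(WATCHED) / (double) d.get(CATALOG));
        // The second transform consumes the output of the first.
        DataTransform<Boolean> heavy =
                new DataTransform<>(HEAVY, d -> d.get(COVERAGE) > 0.2);
        List<DataTransform<?>> transforms = List.of(coverage, heavy);
        for (DataTransform<?> t : transforms) {
            t.applyTo(data);
        }
        return data;
    }

    public static void main(String[] args) {
        DataMap out = run();
        System.out.println(out.get(COVERAGE) + " " + out.get(HEAVY));
    }
}
```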
Feature Encoder
● DataMap ⇒ (T ⇒ FeatureSet)
● FeatureEncoder<T> create(DataMap)
○ Given a DataMap, initialize a new encoder doing any required data prep
● void encode(T, FeatureSet)
○ Given an item (say, a Video), encode features for it into the feature set
AlgoCommons
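The two-step shape DataMap ⇒ (T ⇒ FeatureSet) can be illustrated as follows: `create` does the data prep once, and the returned encoder is applied per item. The `FeatureSet` class, the `Video` and `ViewingHistory` types, and the `genre_watch_fraction` feature are all made up for this sketch.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Compact stand-ins for DataKey/DataMap (hypothetical shapes, identity-based keys).
final class DataKey<T> {
    final String name;
    final Class<T> type;
    DataKey(String name, Class<T> type) { this.name = name; this.type = type; }
}

final class DataMap {
    private final Map<DataKey<?>, Object> values = new HashMap<>();
    <T> DataMap put(DataKey<T> key, T value) { values.put(key, value); return this; }
    <T> T get(DataKey<T> key) { return key.type.cast(values.get(key)); }
}

final class ViewingHistory {
    final List<String> genres; // genres of previously watched videos
    ViewingHistory(List<String> genres) { this.genres = List.copyOf(genres); }
}

final class Video {
    final String id;
    final String genre;
    Video(String id, String genre) { this.id = id; this.genre = genre; }
}

/** Hypothetical FeatureSet: named numeric features. */
final class FeatureSet {
    private final Map<String, Double> values = new LinkedHashMap<>();
    void set(String name, double value) { values.put(name, value); }
    double get(String name) { return values.getOrDefault(name, 0.0); }
}

interface FeatureEncoder<T> {
    void encode(T item, FeatureSet out);
}

final class GenreAffinityEncoder {
    static final DataKey<ViewingHistory> VIEWING_HISTORY =
            new DataKey<>("ViewingHistory", ViewingHistory.class);

    /** DataMap => (Video => FeatureSet): the per-genre counts are prepared once, here. */
    static FeatureEncoder<Video> create(DataMap data) {
        List<String> genres = data.get(VIEWING_HISTORY).genres;
        Map<String, Integer> counts = new HashMap<>();
        for (String g : genres) {
            counts.merge(g, 1, Integer::sum);
        }
        int total = genres.size();
        return (video, out) -> out.set("genre_watch_fraction",
                total == 0 ? 0.0 : counts.getOrDefault(video.genre, 0) / (double) total);
    }
}

class FeatureEncoderDemo {
    static double fractionFor(String genre) {
        DataMap data = new DataMap().put(GenreAffinityEncoder.VIEWING_HISTORY,
                new ViewingHistory(List.of("drama", "drama", "comedy", "drama")));
        FeatureEncoder<Video> encoder = GenreAffinityEncoder.create(data);
        FeatureSet features = new FeatureSet();
        encoder.encode(new Video("v42", genre), features);
        return features.get("genre_watch_fraction");
    }

    public static void main(String[] args) {
        System.out.println(fractionFor("drama")); // 3 of 4 watched videos are dramas: 0.75
    }
}
```

Splitting prep from encoding keeps the per-item call cheap, which matters when the same encoder runs over many candidate videos.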
Feature Transform
● Expression “language” for transforming features to produce new features
○ aka Feature Interactions
● Many operators available
○ log, outer/inner product, arithmetic, logic
● Expressions can be arbitrarily “stacked”
● Expressions are automatically DeDuped
AlgoCommons
Predictor
● Compute a score for a feature vector
● DataMap ⇒ (Vector ⇒ Double)
○ Predictor create(DataMap)
■ Given a data map, construct a new predictor
○ double predict(Vector)
■ Given a feature vector, compute a prediction/score
● Supports many Predictors:
○ LR, RegressionTree, TensorFlow, XGBoost,
WeightedAdditiveEnsemble, FeatureWeighted, MultivariatePredictors,
BanditPredictor, Sequence-to-sequence,...
AlgoCommons
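As a rough illustration of DataMap ⇒ (Vector ⇒ Double), here is a logistic regression predictor built from weights held in a DataMap. The key names and the use of `double[]` in place of a Vector type are assumptions for the sketch, not the library's actual types.

```java
import java.util.HashMap;
import java.util.Map;

// Compact stand-ins for DataKey/DataMap (hypothetical shapes, identity-based keys).
final class DataKey<T> {
    final String name;
    final Class<T> type;
    DataKey(String name, Class<T> type) { this.name = name; this.type = type; }
}

final class DataMap {
    private final Map<DataKey<?>, Object> values = new HashMap<>();
    <T> DataMap put(DataKey<T> key, T value) { values.put(key, value); return this; }
    <T> T get(DataKey<T> key) { return key.type.cast(values.get(key)); }
}

/** Vector => Double; a plain double[] stands in for a feature vector type. */
interface Predictor {
    double predict(double[] features);
}

final class LogisticRegression {
    static final DataKey<double[]> WEIGHTS = new DataKey<>("lr.weights", double[].class);
    static final DataKey<Double> BIAS = new DataKey<>("lr.bias", Double.class);

    /** Given a data map, construct a new predictor. */
    static Predictor create(DataMap data) {
        double[] w = data.get(WEIGHTS).clone(); // copy so later mutation cannot affect the model
        double b = data.get(BIAS);
        return x -> {
            if (x.length != w.length) {
                throw new IllegalArgumentException("Expected " + w.length + " features");
            }
            double z = b;
            for (int i = 0; i < w.length; i++) {
                z += w[i] * x[i];
            }
            return 1.0 / (1.0 + Math.exp(-z)); // sigmoid of the linear score
        };
    }
}

class PredictorDemo {
    public static void main(String[] args) {
        DataMap data = new DataMap()
                .put(LogisticRegression.WEIGHTS, new double[] {1.0, -1.0})
                .put(LogisticRegression.BIAS, 0.0);
        Predictor predictor = LogisticRegression.create(data);
        System.out.println(predictor.predict(new double[] {2.0, 2.0})); // linear score 0 gives 0.5
    }
}
```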
Scorer
● Compute a score for business objects
● DataMap ⇒ (T ⇒ Double)
● Scorer<T> create(DataMap)
○ Given a data map, construct a new Scorer<T>.
● double score(T)
○ Given an item, compute a score
AlgoCommons
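A Scorer works on business objects rather than raw vectors, so one plausible reading is that it wraps feature encoding plus prediction. The sketch below assumes that composition and uses it to rank videos; the `Video` fields, the weight key, and the linear scoring are illustrative only.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Compact stand-ins for DataKey/DataMap (hypothetical shapes, identity-based keys).
final class DataKey<T> {
    final String name;
    final Class<T> type;
    DataKey(String name, Class<T> type) { this.name = name; this.type = type; }
}

final class DataMap {
    private final Map<DataKey<?>, Object> values = new HashMap<>();
    <T> DataMap put(DataKey<T> key, T value) { values.put(key, value); return this; }
    <T> T get(DataKey<T> key) { return key.type.cast(values.get(key)); }
}

final class Video {
    final String id;
    final double popularity;
    final double freshness;
    Video(String id, double popularity, double freshness) {
        this.id = id;
        this.popularity = popularity;
        this.freshness = freshness;
    }
}

/** T => Double for a business object T. */
interface Scorer<T> {
    double score(T item);
}

final class LinearVideoScorer {
    static final DataKey<double[]> WEIGHTS = new DataKey<>("scorer.weights", double[].class);

    /** DataMap => (Video => Double): encode features, then apply a linear predictor. */
    static Scorer<Video> create(DataMap data) {
        double[] w = data.get(WEIGHTS).clone();
        return video -> {
            double[] features = {video.popularity, video.freshness}; // feature encoding step
            return w[0] * features[0] + w[1] * features[1];          // prediction step
        };
    }
}

class ScorerDemo {
    /** Returns video ids ordered by descending score. */
    static List<String> rank(List<Video> videos, Scorer<Video> scorer) {
        return videos.stream()
                .sorted(Comparator.comparingDouble((Video v) -> scorer.score(v)).reversed())
                .map(v -> v.id)
                .collect(Collectors.toList());
    }

    static List<String> run() {
        DataMap data = new DataMap().put(LinearVideoScorer.WEIGHTS, new double[] {1.0, 2.0});
        Scorer<Video> scorer = LinearVideoScorer.create(data);
        List<Video> videos = List.of(
                new Video("a", 0.5, 0.1),
                new Video("b", 0.2, 0.5),
                new Video("c", 0.3, 0.0));
        return rank(videos, scorer);
    }

    public static void main(String[] args) {
        System.out.println(run()); // scores: a=0.7, b=1.2, c=0.3
    }
}
```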
Extensible Model Definition
● Component abstraction
● JSON model serialization
● Various “views” of the Model
○ Feature gen
○ Prediction
○ Scoring
{
"@id" : "my-model",
"@schema" : "SimpleFeatureScoringModel",
"dataTransforms" : [ ... data transforms ...],
"featureEncoders" : [ ... feature defs ...],
"featureTransform" : { ... feature interactions ... },
"predictor" : { ... ML model (weights, etc.) ... }
}
AlgoCommons
Views of the Feature Scoring Model
[Diagram labels: App Data; Data Transform (x2); Feature Encoder (x3); Feature Transform; Predictor. Views: ScoringModelView; DataTransformView; FeatureGeneratorView; PredictorView]
AlgoCommons
Metrics
● Building blocks
○ Accumulators
○ Estimators
● Ranking
○ Precision, Recall
○ Recall@Rank, NormalizedMeanReciprocalRank
● Regression
○ Error Accumulators
○ RMSE
AlgoCommons
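The accumulator building block and two of the ranking metrics listed above could be sketched like this. The class and method names are hypothetical, and because the slides do not define the normalization used in NormalizedMeanReciprocalRank, only the plain reciprocal rank is shown.

```java
import java.util.List;
import java.util.Set;

/** Accumulator building block: collects squared errors and reports RMSE. */
final class RmseAccumulator {
    private double sumSquaredError;
    private long count;

    void add(double predicted, double actual) {
        double error = predicted - actual;
        sumSquaredError += error * error;
        count++;
    }

    /** Combines partial results, e.g. from different data partitions. */
    void merge(RmseAccumulator other) {
        sumSquaredError += other.sumSquaredError;
        count += other.count;
    }

    double value() {
        return count == 0 ? Double.NaN : Math.sqrt(sumSquaredError / count);
    }
}

final class RankingMetrics {
    /** Fraction of relevant items found in the top k; assumes ranked has no duplicates. */
    static double recallAtRank(List<String> ranked, Set<String> relevant, int k) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        long hits = ranked.stream().limit(k).filter(relevant::contains).count();
        return hits / (double) relevant.size();
    }

    /** 1 / position of the first relevant item, or 0 if no relevant item is ranked. */
    static double reciprocalRank(List<String> ranked, Set<String> relevant) {
        for (int i = 0; i < ranked.size(); i++) {
            if (relevant.contains(ranked.get(i))) {
                return 1.0 / (i + 1);
            }
        }
        return 0.0;
    }
}

class MetricsDemo {
    public static void main(String[] args) {
        RmseAccumulator left = new RmseAccumulator();
        left.add(1.0, 3.0);
        RmseAccumulator right = new RmseAccumulator();
        right.add(5.0, 3.0);
        left.merge(right);
        System.out.println("RMSE = " + left.value());

        List<String> ranked = List.of("a", "b", "c", "d");
        Set<String> relevant = Set.of("b", "d");
        System.out.println("Recall@2 = " + RankingMetrics.recallAtRank(ranked, relevant, 2));
        System.out.println("RR = " + RankingMetrics.reciprocalRank(ranked, relevant));
    }
}
```

A mergeable accumulator is a natural fit when the same metric has to run both inside a single service and across Spark partitions.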
Motivation
Provide the productivity of this
But make it easy to go between
prod & experimentation
Overview
● A high level Scala API for ML exploration
● Focuses on Offline Training for both
○ Ad-hoc exploration
○ Production Training
● Think “Subset of SKLearn” for Scala/JVM ecosystem
● Spark’s DataFrame is a core data abstraction
Data Utilities
● Utilities for data transfer between heterogeneous systems
● Leverage Spark for data munging, but need bridge to Docker Trainers
○ Use standalone s3 downloader and parquet reader
○ S3 + s3fs-fuse
○ HDFS + hdfs-fuse
● On the wire format
○ Parquet
○ Protobuf
Feature Schema
● Context
The setting for evaluating a set of items (member profiles, country, etc.)
● Items
The elements to be trained on and scored (videos, rows, etc.)
Stratification
dataframe.stratify(samplingRules =
  $("column_foo") == "US" maxPercent 8.0,
  $("column_bar") > 10 && $("column_qux") > 1 minPercent 0.5,
  …
)
A generalized API on Spark Dataframes
Native SparkSQL expressions
Emphasis on type-safety
Many stratification attributes: Country, Devices, Searches,...
Feature Transformers
The feature generation pipeline is a sequence of Transformers
A Transformer takes a dataframe and, based on contexts, performs computations on it and returns a new dataframe.
Dataset Type Tagger
→ Country Tenure Stratified Sampler
→ Negative Generator
→ ….
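The "sequence of Transformers" idea can be shown without Spark by using a small stand-in for a dataframe. In this sketch the `Frame` type, the `Transformer` interface, and both example steps are hypothetical; the real steps named above (tagger, sampler, negative generator) are not described in enough detail to reproduce.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Stand-in for a Spark DataFrame so the sketch runs without Spark: a list of rows. */
final class Frame {
    final List<Map<String, Object>> rows;
    Frame(List<Map<String, Object>> rows) { this.rows = List.copyOf(rows); }
}

/** A Transformer takes a frame plus context and returns a new frame. */
interface Transformer {
    Frame transform(Frame input, Map<String, String> context);
}

final class Pipeline {
    private final List<Transformer> steps;
    Pipeline(List<Transformer> steps) { this.steps = List.copyOf(steps); }

    Frame run(Frame input, Map<String, String> context) {
        Frame current = input;
        for (Transformer step : steps) {
            current = step.transform(current, context); // each step feeds the next
        }
        return current;
    }
}

class TransformerDemo {
    // Hypothetical step: tags every row with the dataset type found in the context.
    static final Transformer DATASET_TYPE_TAGGER = (frame, ctx) -> {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : frame.rows) {
            Map<String, Object> copy = new HashMap<>(row); // never mutate the input frame
            copy.put("dataset_type", ctx.get("datasetType"));
            out.add(copy);
        }
        return new Frame(out);
    };

    // Hypothetical step: keeps only rows for the country named in the context.
    static final Transformer COUNTRY_FILTER = (frame, ctx) -> {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : frame.rows) {
            if (ctx.get("country").equals(row.get("country"))) {
                out.add(row);
            }
        }
        return new Frame(out);
    };

    static Map<String, Object> row(String video, String country) {
        Map<String, Object> r = new HashMap<>();
        r.put("video", video);
        r.put("country", country);
        return r;
    }

    static Frame run() {
        Frame input = new Frame(List.of(row("v1", "US"), row("v2", "FR")));
        Map<String, String> context = Map.of("datasetType", "train", "country", "US");
        Pipeline pipeline = new Pipeline(List.of(DATASET_TYPE_TAGGER, COUNTRY_FILTER));
        return pipeline.run(input, context);
    }

    public static void main(String[] args) {
        System.out.println(run().rows);
    }
}
```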
Feature Generation - Putting it together
[Diagram with numbered steps. Labels: Fact Store; Structured Data in DataFrame; Label Data; Required Data; Required Feature; Maps of Data (DataMaps); Feature Encoders; POJO Features; Catalyst Expressions; Feature Model; Structured Labeled Features; Model Training; AlgoCommons]
Training
● Need flexibility and access to trainers in all languages/environments
● A simple unified Training API for
○ Synchronous & Asynchronous
○ Single Docker or Distributed (Spark)
● Inputs: Training set as a Spark Dataset, model params
● Returns: a Model abstraction wrapper of AlgoCommons PredictorConfig
● Can support many popular Trainers:
Learning Tools
Metrics
● Leverages AlgoCommons Metrics framework
● Context Level Metrics
○ Supports ranking metrics: nMRR, Recall, nDCG, etc.
○ Supports algo-commons models or custom scoring functions
○ Users can slice and dice the metrics
○ Users can aggregate them using SQL
■ Performant implementation using Spark SQL catalyst expressions
● Item Level Metrics
○ E.g. row popularity
Visualization
Integrates with
- a Scala library for
matplotlib like visualizations
Open-sourced
Lessons learnt
● Machine learning is an iterative and data sensitive process
○ Make exploration easy, and productionizing robust
○ Make it easy to switch between the two
● Design components with a general flexible interface
○ Specialize interfaces when you need to
● Testing can be hard, but worthwhile
○ Unit, Integration, Data Checks, Continuous Integration, @ScaleTesting
○ Metric driven system validations
Joy
Thank you!
(and yes, we’re hiring)
Questions
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 

ML Infra for Netflix Recommendations - AI NEXTCon talk

  • 1. Machine Learning Infra for Recommendations. Faisal Siddiqi (@faisalzs), 12 Apr 2018
  • 2. ?
  • 3. ?
  • 4. Goal: Create personalized recommendations for discoveries of engaging video content that maximize member joy
  • 6. Outline for today
      ● Domain Context
      ● ML Infra stack for Personalization
      ● Deeper dives into 2 major ML Infra components
  • 8. Personalization Context (diagram): Data from millions of users, Training pipelines, Models, Precompute System, Rankings, Online caches, AB Test Allocation
  • 10. The Personalization Rainbow (diagram): Training Data Preparation, Training, Feature Engineering, Model Quality, Intent To Treat (Serving), Treatment & Action, Inference & Logging. Other labels: Online, Device, Function, Offline, Personalization systems & infrastructure, Control Plane
  • 11. The Personalization Rainbow (diagram): Label Generation, Fact Store, Training Data Preparation, Training, Feature Engineering, Model Quality, Intent To Treat (Serving), Treatment & Action, Hyperparameter Optimization, Notebooks, Caching, Dynamic param management, Inference & Logging, A/B Testing Platform, Online & Precompute Framework, Personalization Aggregation, Fact Logging, Device Logging, Online Services API, Boson, AlgoCommons, Orchestration. Other labels: Control Plane, Online, Device, Function, Offline, Personalization systems & infrastructure
  • 13. The Context for AlgoCommons & Boson
      ● Machine Learning via ‘loosely-coupled, highly-aligned’ Scala/Java Libraries
      ● Historical context
        ○ Siloed machine learning infrastructure
        ○ Few opportunities for sharing
          ■ Incompatibility
          ■ Dependency concerns
          ■ Improvements in one pipeline not shared across others
  • 14. Design Principles
      ● Composability
        ○ Ability to put pieces together in novel ways
        ○ Enable construction of generic tools
      ● Portability
        ○ Easily share code online/offline and between applications
        ○ Models, Feature encoders, Common data manipulation
      ● Avoiding Training-Serving Skew
        ○ Serving/Online systems are Java based, drives choice of offline software
        ○ Share code & data between offline/online worlds
  • 15. AlgoCommons & Boson (architecture diagram; labels in slide order): Training Feature Engineering Model Quality Inference & Logging Delorean Time Travel Feature Generation Feature Transformers Label Joins Feature Schema Stratification & Sampling Data Fetchers & utilities Training API Model Tuning Boson AlgoCommons Spot Checks (human-in-the-loop) Visualization Feature Importance Validation Runs Training Metrics Abstractions Feature Sharing Component Sets Data Maps Feature Encoders Specification Common Model Format (JSON) Metrics Framework Predictions Inferencing Metrics Scoring Model Loading Inferencing. Batch Training over Distributed Spark or Dockerized Containers
  • 17. AlgoCommons: Overview
      ● Common abstractions and building blocks for ML
      ● Integrated in Java microservices for online or pre-computed Inferencing
      ● Library > framework (user-focus)
      ● Program to interfaces (composability)
      ● Aggressive modularization to avoid Jar Hell (portability)
      ● Data Access Abstraction (portability, testability)
  • 18. AlgoCommons: Common abstractions and Building Blocks
      ● Data
        ○ Data Keys
        ○ Data Maps
      ● Modeling
        ○ Component Sets
        ○ Feature Encoders, Predictor, Scorer
        ○ Model Format
      ● Metrics
  • 19. AlgoCommons: Data Access - Abstractions
      ● DataKey<T>
        ○ Identifies a data value by name/type, e.g. “ViewingHistory”
      ● Data Value
        ○ Preferably immutable data structure
      ● DataMap
        ○ Map from DataKey<T> to T, plus metadata
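The DataKey/DataMap abstraction on this slide can be sketched as a typed heterogeneous map. This is only a minimal illustration under stated assumptions, not the AlgoCommons API: the class bodies, the `of`, `with`, and `get` methods, and the key names are hypothetical, and the "plus metadata" part of the slide is left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical typed key: identifies a data value by name and type. */
final class DataKey<T> {
    final String name;
    final Class<T> type;

    private DataKey(String name, Class<T> type) {
        this.name = name;
        this.type = type;
    }

    static <T> DataKey<T> of(String name, Class<T> type) {
        return new DataKey<>(name, type);
    }

    // Two keys are equal when both name and type match.
    @Override
    public boolean equals(Object o) {
        return o instanceof DataKey
            && ((DataKey<?>) o).name.equals(name)
            && ((DataKey<?>) o).type.equals(type);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, type);
    }
}

/** Hypothetical map from DataKey<T> to T; callers never cast by hand. */
final class DataMap {
    private final Map<DataKey<?>, Object> values = new HashMap<>();

    <T> DataMap with(DataKey<T> key, T value) {
        values.put(key, value);
        return this;
    }

    <T> T get(DataKey<T> key) {
        Object v = values.get(key);
        if (v == null) {
            throw new IllegalArgumentException("missing key: " + key.name);
        }
        return key.type.cast(v); // checked cast driven by the key's type
    }
}
```

A caller would build the map once, for example `new DataMap().with(DataKey.of("Country", String.class), "US")`, and components would then read values back through the same keys. Carrying the type in the key is one way to get the name/type identification the slide describes.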
  • 20. AlgoCommons: Data Access - Lifecycle (sequence diagram between Application, Component Factory, and Component)
      ● Data Retrieval: “What DataKeys do you need?” / “I need X, Y, and Z” / Make DataMap w/ X, Y, and Z
      ● Component Instantiation / Data Prep: f.create(dataMap) / new Component(X, Y, Z) / Return comp
      ● Component Application (repeat as needed): comp.do(someInput)
  • 21. AlgoCommons: DataTransform
      ● DataMap => K/V
      ● Given zero or more key/values, produce a new key/value
      ● Consumable by other data transforms, feature encoders, and components
  • 22. AlgoCommons: Feature Encoder
      ● DataMap ⇒ (T ⇒ FeatureSet)
      ● FeatureEncoder<T> create(DataMap)
        ○ Given a DataMap, initialize a new encoder doing any required data prep
      ● void encode(T, FeatureSet)
        ○ Given an item (say, a Video), encode features for it into the feature set
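The two-stage shape DataMap ⇒ (T ⇒ FeatureSet) can be illustrated as follows. This is a sketch under assumptions rather than the library's code: `FeatureSet` is reduced to a name-to-value map, the DataMap is simplified to a plain `Map<String, Object>`, the split into a separate factory interface is a choice made here, and `WatchedBeforeEncoderFactory` and the `"watched_before"` feature are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical feature container: feature name -> value. */
final class FeatureSet {
    final Map<String, Double> values = new LinkedHashMap<>();

    void put(String name, double value) {
        values.put(name, value);
    }
}

/** Encodes features for one item into a FeatureSet (signature as on the slide). */
interface FeatureEncoder<T> {
    void encode(T item, FeatureSet out);
}

/** First stage: DataMap => encoder. The DataMap is simplified to a Map here. */
interface FeatureEncoderFactory<T> {
    FeatureEncoder<T> create(Map<String, Object> dataMap);
}

/** Example: flags whether a video id is already in the member's viewing history. */
final class WatchedBeforeEncoderFactory implements FeatureEncoderFactory<String> {
    @Override
    public FeatureEncoder<String> create(Map<String, Object> dataMap) {
        // Data prep runs once per DataMap, not once per encoded item.
        @SuppressWarnings("unchecked")
        Set<String> history = (Set<String>) dataMap.get("ViewingHistory");
        return (videoId, out) ->
            out.put("watched_before", history.contains(videoId) ? 1.0 : 0.0);
    }
}
```

Separating `create` from `encode` means the lookup structure is prepared a single time and then reused for every candidate item, which matters when many items are encoded against the same context.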
  • 23. AlgoCommons: Feature Transform
      ● Expression “language” for transforming features to produce new features
        ○ aka Feature Interactions
      ● Many operators available
        ○ log, outer/inner product, arithmetic, logic
      ● Expressions can be arbitrarily “stacked”
      ● Expressions are automatically DeDuped
  • 24. AlgoCommons: Predictor
      ● Compute a score for a feature vector
      ● DataMap ⇒ (Vector ⇒ Double)
        ○ Predictor create(DataMap)
          ■ Given a data map, construct a new predictor
        ○ double predict(Vector)
          ■ Given a feature vector, compute a prediction/score
      ● Supports many Predictors:
        ○ LR, RegressionTree, TensorFlow, XGBoost, WeightedAdditiveEnsemble, FeatureWeighted, MultivariatePredictors, BanditPredictor, Sequence-to-sequence,...
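As one concrete instance of DataMap ⇒ (Vector ⇒ Double), here is a minimal logistic-regression predictor. It is a sketch, not the AlgoCommons implementation: `double[]` stands in for the slide's `Vector`, the DataMap is simplified to a `Map<String, Object>`, and the `"weights"` and `"bias"` keys are hypothetical.

```java
import java.util.Map;

/** Scores a dense feature vector (double[] stands in for the slide's Vector). */
interface Predictor {
    double predict(double[] features);
}

/** Hypothetical logistic-regression predictor built from values held in a data map. */
final class LogisticRegressionPredictor implements Predictor {
    private final double[] weights;
    private final double bias;

    private LogisticRegressionPredictor(double[] weights, double bias) {
        this.weights = weights.clone(); // defensive copy keeps the predictor immutable
        this.bias = bias;
    }

    /** DataMap => Predictor; the "weights" and "bias" keys are illustrative only. */
    static Predictor create(Map<String, Object> dataMap) {
        return new LogisticRegressionPredictor(
            (double[]) dataMap.get("weights"), (Double) dataMap.get("bias"));
    }

    @Override
    public double predict(double[] features) {
        if (features.length != weights.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double z = bias;
        for (int i = 0; i < weights.length; i++) {
            z += weights[i] * features[i];
        }
        return 1.0 / (1.0 + Math.exp(-z)); // sigmoid of the linear score
    }
}
```

Other predictor types listed on the slide would sit behind the same `predict` signature, so callers would not need to know which model family produced the score.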
  • 25. AlgoCommons: Scorer
      ● Compute a score for business objects
      ● DataMap ⇒ (T ⇒ Double)
      ● Scorer<T> create(DataMap)
        ○ Given a data map, construct a new Scorer<T>.
      ● double score(T)
        ○ Given an item, compute a score
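One plausible way to obtain a T ⇒ Double scorer is to chain an item-to-vector encoding step with a vector-to-score prediction step. The sketch below assumes that composition; the slides do not show how a Scorer is actually assembled, and `Scorers.compose` plus the use of `java.util.function` types in place of encoder and predictor interfaces are choices made only for this illustration.

```java
import java.util.function.Function;
import java.util.function.ToDoubleFunction;

/** Scores a business object directly (signature as on the slide). */
interface Scorer<T> {
    double score(T item);
}

final class Scorers {
    private Scorers() {}

    /**
     * Builds a Scorer by chaining an item-to-vector encoder with a vector predictor.
     * Both arguments are plain functional interfaces standing in for an encoder
     * and a predictor; this helper is hypothetical.
     */
    static <T> Scorer<T> compose(Function<T, double[]> encoder,
                                 ToDoubleFunction<double[]> predictor) {
        return item -> predictor.applyAsDouble(encoder.apply(item));
    }
}
```

For example, `Scorers.<String>compose(title -> new double[]{title.length()}, v -> v[0] * 2.0)` yields a scorer over strings, and the calling code never touches the intermediate vector.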
  • 26. AlgoCommons: Extensible Model Definition
      ● Component abstraction
      ● JSON model serialization
      ● Various “views” of the Model
        ○ Feature gen
        ○ Prediction
        ○ Scoring
      {
        "@id" : "my-model",
        "@schema" : "SimpleFeatureScoringModel",
        "dataTransforms" : [ ... data transforms ...],
        "featureEncoders" : [ ... feature defs ...],
        "featureTransform" : { ... feature interactions ... },
        "predictor" : { ... ML model (weights, etc.) ... }
      }
  • 28. AlgoCommons: Metrics
      ● Building blocks
        ○ Accumulators
        ○ Estimators
      ● Ranking
        ○ Precision, Recall
        ○ Recall@Rank, NormalizedMeanReciprocalRank
      ● Regression
        ○ Error Accumulators
        ○ RMSE
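The accumulator/estimator split named on this slide can be sketched with RMSE, alongside a simple Recall@k helper. Class and method names here are hypothetical and the code uses the standard textbook definitions; it is not taken from the AlgoCommons metrics framework.

```java
import java.util.List;
import java.util.Set;

/** Streaming accumulator for root-mean-squared error; partial results can be merged. */
final class RmseAccumulator {
    private double sumSquaredError = 0.0;
    private long count = 0;

    void add(double predicted, double actual) {
        double err = predicted - actual;
        sumSquaredError += err * err;
        count++;
    }

    /** Combine partial state, e.g. from two data partitions. */
    void merge(RmseAccumulator other) {
        sumSquaredError += other.sumSquaredError;
        count += other.count;
    }

    /** Estimator step: turn the accumulated state into the metric value. */
    double estimate() {
        return count == 0 ? Double.NaN : Math.sqrt(sumSquaredError / count);
    }
}

final class RankingMetrics {
    private RankingMetrics() {}

    /** Recall@k: fraction of the relevant items that appear in the top k of the ranking. */
    static double recallAtK(List<String> ranked, Set<String> relevant, int k) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        long hits = ranked.stream().limit(k).filter(relevant::contains).count();
        return (double) hits / relevant.size();
    }
}
```

Keeping only sums and counts in the accumulator is what makes `merge` cheap, which suits metrics computed over data split across many workers.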
  • 29.
  • 30. Motivation: Provide the productivity of this [image], but make it easy to go between prod & experimentation
  • 31. Overview
      ● A high level Scala API for ML exploration
      ● Focuses on Offline Training for both
        ○ Ad-hoc exploration
        ○ Production Training
      ● Think “Subset of SKLearn” for Scala/JVM ecosystem
      ● Spark’s DataFrame is a core data abstraction
  • 32. Data Utilities
      ● Utilities for data transfer between heterogeneous systems
      ● Leverage Spark for data munging, but need bridge to Docker Trainers
        ○ Use standalone s3 downloader and parquet reader
        ○ S3 + s3fs-fuse
        ○ HDFS + hdfs-fuse
      ● On the wire format
        ○ Parquet
        ○ Protobuf
  • 33. Feature Schema
      ● Context: the setting for evaluating a set of items (member profiles, country, etc.)
      ● Items: the elements to be trained on and scored (videos, rows, etc.)
  • 34. Stratification
      dataframe.stratify(
        samplingRules =
          $(“column_foo”) == ‘US’ maxPercent 8.0,
          $(“column_bar”) > 10 && $(“column_qux”) > 1 minPercent 0.5,
          …
      )
      A generalized API on Spark Dataframes
      Native SparkSQL expressions
      Emphasis on type-safety
      Many stratification attributes: Country, Devices, Searches,...
  • 35. Feature Transformers
      The feature generation pipeline is a sequence of Transformers.
      A Transformer takes a DataFrame and, based on contexts, performs computations on it and returns a new DataFrame.
      Dataset Type Tagger → Country Tenure Stratified Sampler → Negative Generator → …
  • 36. Feature Generation - Putting it together (diagram; labels in slide order): Model Training Structured Labeled Features Feature Model Structured Data in DataFrame Feature Encoders Required Feature Maps of Data POJO Features Required Data Label Data Catalyst Expressions AlgoCommons Fact Store Structured Labeled Features Required Feature DataMaps Features Required Data
  • 37. Training
      ● Need flexibility and access to trainers in all languages/environments
      ● A simple unified Training API for
        ○ Synchronous & Asynchronous
        ○ Single Docker or Distributed (Spark)
      ● Inputs: Trainingset as a Spark Dataset, model params
      ● Returns: a Model abstraction wrapper of AlgoCommons PredictorConfig
      ● Can support many popular Trainers: Learning Tools
  • 38. Metrics
      ● Leverages AlgoCommons Metrics framework
      ● Context Level Metrics
        ○ Supports ranking metrics: nMRR, Recall, nDCG, etc.
        ○ Supports algo-commons models or custom scoring functions
        ○ Users can slice and dice the metrics
        ○ Users can aggregate them using SQL
          ■ Performant implementation using Spark SQL catalyst expressions
      ● Item Level Metrics
        ○ E.g. row popularity
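For readers unfamiliar with the ranking metrics named here, the sketch below computes reciprocal rank and nDCG@k for a single context (one ranked list) using the standard definitions with binary relevance. It is plain Java rather than Spark SQL, the class and method names are hypothetical, and the normalization used for nMRR in the talk is not specified, so only the unnormalized reciprocal rank is shown. The ranked list is assumed to contain no duplicate items.

```java
import java.util.List;
import java.util.Set;

final class ContextRankingMetrics {
    private ContextRankingMetrics() {}

    /** Reciprocal rank: 1 / position of the first relevant item, or 0 if there is none. */
    static double reciprocalRank(List<String> ranked, Set<String> relevant) {
        for (int i = 0; i < ranked.size(); i++) {
            if (relevant.contains(ranked.get(i))) {
                return 1.0 / (i + 1);
            }
        }
        return 0.0;
    }

    /** nDCG@k with binary relevance: DCG of the ranking divided by the ideal DCG. */
    static double ndcgAtK(List<String> ranked, Set<String> relevant, int k) {
        double dcg = 0.0;
        int limit = Math.min(k, ranked.size());
        for (int i = 0; i < limit; i++) {
            if (relevant.contains(ranked.get(i))) {
                dcg += 1.0 / log2(i + 2); // position i (0-based) is discounted by log2(i + 2)
            }
        }
        // Ideal ranking places every relevant item first.
        double ideal = 0.0;
        int idealHits = Math.min(k, relevant.size());
        for (int i = 0; i < idealHits; i++) {
            ideal += 1.0 / log2(i + 2);
        }
        return ideal == 0.0 ? 0.0 : dcg / ideal;
    }

    private static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }
}
```

Per-context values like these would then be averaged or sliced across contexts, which is the aggregation step the slide assigns to SQL.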
  • 39. Visualization: Integrates with [library logo], a Scala library for matplotlib-like visualizations. Open-sourced
  • 40. Lessons learnt
      ● Machine learning is an iterative and data sensitive process
        ○ Make exploration easy, and productionizing robust
        ○ Make it easy to switch between the two
      ● Design components with a general flexible interface
        ○ Specialize interfaces when you need to
      ● Testing can be hard, but worthwhile
        ○ Unit, Integration, Data Checks, Continuous Integration, @ScaleTesting
        ○ Metric driven system validations
  • 41. The Personalization Rainbow (diagram, repeated from slide 11): Label Generation, Fact Store, Training Data Preparation, Training, Feature Engineering, Model Quality, Intent To Treat (Serving), Treatment & Action, Hyperparameter Optimization, Notebooks, Caching, Dynamic param management, Inference & Logging, A/B Testing Platform, Online & Precompute Framework, Personalization Aggregation, Fact Logging, Device Logging, Online Services API, Boson, AlgoCommons, Orchestration. Other labels: Control Plane, Online, Device, Function, Offline, Personalization systems & infrastructure
  • 42. Joy
  • 43. Thank you! (and yes, we’re hiring) Questions