Reinforcement Learning in Production at Zynga
Patrick Halina, Architect, ML Engineering
Curren Pangler, Principal Engineer, ML Engineering
Agenda
Reinforcement Learning (RL) Intro
Intro to RL and personalization
Zynga’s RL Tech Stack
The off-the-shelf technologies Zynga uses to run RL in production for millions of users per day
Designing RL Applications
Creating RL applications is hard! Here’s what we’ve learned
Mobile game developer
Over 60M monthly active users
Game Design is Hard
Lots of design decisions
▪ How hard should each level be?
▪ What game mode do we recommend?
How do we personalize games?
We want to choose behaviors for each user to maximize long-term engagement
Personalization Problem Formulation
Given a user’s State
▪ User features
What Action do we pick
▪ E.g. Difficulty level
That maximizes a long term Reward
▪ E.g. Engagement, retention
Personalization Method 1: Rules Based Segments
PMs define segments via rules
Assign ‘personalized action’ to each segment
A/B test segment vs. control
Challenges
Lots of trial and error manual work
Player patterns change
Limited ability to personalize
▪ Small set of outputs
▪ Small set of datapoints to make decisions on
Personalization Method 2: Prediction Models
Train a model to predict the long-term reward for each personalized Action
Challenges
Requires lots of models
Requires lots of labelled data
▪ Need to randomly assign users to each Action, then wait long enough to measure long term results
Limited to simple outputs. E.g. how do you pick the best continuous value?
Personalization Wishlist
Automatically tune details of personalization
Continuously explore and improve over time
Personalize complex outputs
▪ E.g. Continuous values, multiple dimensions
Solution: Reinforcement Learning (RL)
AI for making sequences of decisions
Agent picks Action based on current State to maximize Reward
Automatically learn from past experiences
Balance exploration with choosing best known Action
If RL can beat the world’s best Go player, can we use it to make our games better?
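The trade-off between exploring and choosing the best known Action can be sketched with an epsilon-greedy rule. This is a minimal illustration, not the agent used in the talk: the class name, the state labels, and the action names are hypothetical, and because it tracks only the average immediate reward it is closer to a contextual bandit than to a Deep RL algorithm such as DQN.

```python
import random
from collections import defaultdict


class EpsilonGreedyAgent:
    """Tabular agent: explores with probability epsilon, else exploits."""

    def __init__(self, actions, epsilon=0.1, seed=None):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # Running mean reward and visit count per (state, action) pair.
        self.value = defaultdict(float)
        self.count = defaultdict(int)

    def pick_action(self, state):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)  # explore
        # Exploit: best known action for this state (ties go to the first).
        return max(self.actions, key=lambda a: self.value[(state, a)])

    def learn(self, state, action, reward):
        key = (state, action)
        self.count[key] += 1
        # Incremental mean update of the observed reward.
        self.value[key] += (reward - self.value[key]) / self.count[key]
```

Setting epsilon to 0 gives a purely greedy agent and setting it to 1 gives a purely random one; values in between trade short-term reward for information about less-tried Actions.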
Application: Words With Friends (WWF) Daily Message Timing
What time should we send each user their daily message?
We used an RL Agent to personalize the send time based on each user’s hourly activity
Results: Significant increase in CTR vs. hand tuned system
Delivered to millions of users per day
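One way to picture the State, Action, and Reward for this application is sketched below. The talk does not give the actual feature encoding, so everything here is an assumption: the State is taken to be a 24-slot distribution of past session start hours, the Action an integer hour from 0 to 23, and the Reward a click indicator.

```python
from datetime import datetime

HOURS_PER_DAY = 24


def hourly_activity_state(session_starts):
    """Turn session start times into a 24-slot activity distribution."""
    counts = [0] * HOURS_PER_DAY
    for ts in session_starts:
        counts[ts.hour] += 1
    total = sum(counts)
    if total == 0:
        # No history yet: fall back to a uniform distribution.
        return [1.0 / HOURS_PER_DAY] * HOURS_PER_DAY
    return [c / total for c in counts]


def click_reward(clicked):
    """Reward is 1.0 when the user clicked the message, else 0.0."""
    return 1.0 if clicked else 0.0


# Example: a user who plays around 08:00 and 21:00.
state = hourly_activity_state([
    datetime(2020, 6, 1, 8, 5),
    datetime(2020, 6, 2, 21, 0),
])
```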
Zynga’s RL Tech Stack
RL Model Training
[Diagram: the Agent sends an Action to the Environment, which returns a State. Each Experience is logged to the Experience Replay Buffer, which is used to train the Agent.]
Single Experience/Trajectory
▪ S0: Previous Observation
▪ A0: Action
▪ R: Reward
▪ S1: Current Observation
▪ A1: Next Optimal Action
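The experience fields on the slide (S0, A0, R, S1) and the replay buffer can be sketched as follows. This is a generic illustration, not the TF-Agents or RL-Bakery implementation; the class names and the uniform sampling strategy are assumptions, and A1 is left out on the assumption that the agent derives it at training time.

```python
import random
from collections import deque
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Experience:
    """One logged transition, named after the fields on the slide."""
    s0: Sequence[float]  # previous observation
    a0: int              # action taken
    r: float             # reward observed
    s1: Sequence[float]  # current observation


class ReplayBuffer:
    """Fixed-capacity buffer; the oldest experiences are dropped first."""

    def __init__(self, capacity, seed=None):
        self._items = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, experience):
        self._items.append(experience)

    def sample(self, batch_size):
        # Uniform sampling without replacement, capped at the buffer size.
        k = min(batch_size, len(self._items))
        return self._rng.sample(list(self._items), k)

    def __len__(self):
        return len(self._items)
```

Sampling from the buffer rather than training on experiences in arrival order lets the same logged data be reused across several training steps.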
Academic RL Applications
RL Agents learn by interacting with environment
▪ Can’t just train Agent with static set of labelled data like supervised learning models
Well known RL applications are trained offline with simulator
E.g. Training Agent to play Atari
Agent is applied after lots of offline learning
[Diagram: RL Agent v1 → Action → Learn → RL Agent v2 → Action → Learn]
Production RL Applications for Personalization
Hard to simulate humans, so we learn by interacting with real users
The Agent interacts with real users from v1 onward
Need to learn from batches in parallel
Harder to manage data and workflows!
[Diagram: RL Agent v1 → several Action/Learn batches in parallel → RL Agent v2 → several Action/Learn batches in parallel]
RL Model Training
Training Pipeline Wish List:
Off-the-shelf
Scalable
Cutting-edge algorithms
Reliable & robust
Easily extendable
RL Model Training
TF-Agents
Open-source RL library that implements cutting-edge Deep RL algorithms (DQN, PPO, TD3, etc.)
Advantages:
▪ Modular design
▪ Well-written
▪ Accuracy
▪ New algorithms
Production RL Challenges
How to:
Convert messy, real-time logged data into RL trajectories?
Persist, restore, and re-use past agents & trajectories?
Create trajectories at production scale?
And… how do we make this repeatable and data-scientist friendly?
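The first challenge, turning logged rows into RL trajectories, can be sketched in plain Python. The slides indicate that the production pipeline works on Spark DataFrames; this stand-in uses lists of dicts to stay self-contained. The row keys (user_id, timestep, observation, action, reward) are hypothetical, and the reward on a row is assumed to be the outcome of that row's action.

```python
from collections import defaultdict


def build_transitions(log_rows):
    """Pair consecutive logged steps per user into (s0, a0, r, s1) tuples."""
    by_user = defaultdict(list)
    for row in log_rows:
        by_user[row["user_id"]].append(row)

    transitions = []
    for rows in by_user.values():
        # Logs can arrive out of order, so sort each user's rows first.
        rows.sort(key=lambda r: r["timestep"])
        for prev, curr in zip(rows, rows[1:]):
            # Skip gaps: a missing timestep means the next state is unknown.
            if curr["timestep"] != prev["timestep"] + 1:
                continue
            transitions.append(
                (prev["observation"], prev["action"], prev["reward"],
                 curr["observation"])
            )
    return transitions
```

Dropping transitions that span a gap is one possible policy for messy data; an application could instead impute the missing step or close the episode at the gap.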
RL-Bakery
Our open-source library to help build batch RL applications in production, at scale
github.com/zynga/rl-bakery
RL-Bakery
Wrapper around RL algorithm libraries that simplifies developing real-world RL apps like personalization
[Diagram: three-layer stack, top to bottom: RL-Application, RL-Bakery, RL Library]
RL-Application:
▪ Application-specific
▪ Written in a Databricks notebook
▪ Data-scientist friendly
▪ Provides model configuration & hyperparameters
▪ Fetches observations, actions, rewards as Spark DataFrames
RL-Bakery:
▪ Orchestrate steps of training pipeline
▪ Restore models and old time steps
▪ Create new training trajectories
▪ Persist model and trajectories between runs
▪ Deploy models to serving system
▪ Add functionality unavailable in TF-Agents (e.g. prioritized replay buffer)
RL Library:
▪ Open-source RL libraries implement algorithms like PPO, DQN, etc.
▪ RL-Bakery currently supports only TF-Agents
▪ Core RL algorithms implemented using TensorFlow
Real Time Model Serving
[Diagram. Components: Zynga Personalize, Feature Hydration, Zynga Feature Store, AWS SageMaker (Pre-processing, Model Inference, Post-processing), State & Action Logging, S3. Arrows labeled: observation, action.]
Model Serving and Model Training
[Diagram. Serving side: Real-Time Features, Real-Time Serving, AWS SageMaker, Observations, Action Recommendation. Training side: S3 Experience Logs, RL Bakery Application, Training, RL Agent.]
Designing RL Applications
Choose the Right Application
Is the problem best modelled as a sequence of decisions?
▪ Does the Action taken in one timestep affect future Actions?
▪ Otherwise, use simpler solutions like predictive models or contextual Multi-Armed Bandits
Is the Reward learnable?
▪ Does the Action impact the Reward?
▪ Hard to learn sparse rewards
RL shouldn’t be applied to every situation
Choose States
Anecdotally, RL Agents are sensitive to too many inputs
Choose simple state spaces
Compress state space size with unsupervised learning techniques like auto-encoding
Designing Actions
Start simple: small set of discrete Actions
▪ Allows you to use simpler Deep RL algorithms
Continuous action spaces require algorithms from the Policy Gradient family
Large set of discrete Actions -> classic Recommendation Systems
▪ This goes beyond the traditional RL setup
▪ Some cutting edge Recommendation Systems use RL
Choosing RL Algorithms
Active area of research, new algos are constantly being developed
Algorithms are hard to implement, subtle details affect results
Off the shelf implementations available from Open Source libs
Hyperparameter Tuning
Lots of RL application design choices
Plus Deep Learning hyperparameters
▪ Learning rate
▪ Neural network architecture
Slight hyperparameter changes have big effects
How can we choose best options before going live?
How to Pretrain
Can you do better than random for initial launch?
Train Agent to mimic some existing behavior
▪ Use historic data to reward Agent for picking previous Actions
▪ Agent then slowly learns when to deviate
Simulate simple scenarios with hand made mechanics
▪ Capture relationships between features in State and Action
▪ Simple scenarios have clear “optimal” strategy so you can measure success
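Both pretraining ideas above can be sketched in a few lines. This is an illustrative toy, not the simulator used in the talk: the imitation reward pays the agent for matching the action the previous system took, and the hypothetical PeakHourSimulator models a user who clicks more often when messaged at one fixed hour, so the optimal strategy is known in advance. The click probabilities are made-up defaults.

```python
import random


def imitation_reward(agent_action, historic_action):
    """Reward the agent for reproducing the action the old system took."""
    return 1.0 if agent_action == historic_action else 0.0


class PeakHourSimulator:
    """Toy user: clicks more often when messaged at their peak hour."""

    def __init__(self, peak_hour, p_peak=0.9, p_other=0.1, seed=None):
        self.peak_hour = peak_hour
        self.p_peak = p_peak
        self.p_other = p_other
        self._rng = random.Random(seed)

    def step(self, send_hour):
        """Return 1.0 if the simulated user clicks, else 0.0."""
        p_click = self.p_peak if send_hour == self.peak_hour else self.p_other
        return 1.0 if self._rng.random() < p_click else 0.0


def average_reward(simulator, policy, episodes=1000):
    """Score a policy (a zero-argument callable returning an hour)."""
    return sum(simulator.step(policy()) for _ in range(episodes)) / episodes
```

Because the best policy here is simply "always send at the peak hour", a pretrained agent's average reward can be compared directly against that known optimum.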
Hyperparameter Tuning Automation
Automate deep learning hyperparameter tuning with MLflow
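The shape of an automated tuning loop can be sketched with a plain random search. This sketch uses only the standard library and keeps results in a list; in a setup like the one the slide describes, each trial's parameters and score would be recorded with a tracking tool such as MLflow instead. The search space and toy_objective are hypothetical stand-ins for "train the agent in a simulator and return its mean reward".

```python
import random


def random_search(objective, space, trials=20, seed=0):
    """Sample configurations at random and keep the best-scoring one.

    `space` maps each hyperparameter name to a list of candidate values;
    `objective` takes a config dict and returns a score (higher is better).
    """
    rng = random.Random(seed)
    history = []
    for _ in range(trials):
        config = {name: rng.choice(values) for name, values in space.items()}
        history.append((objective(config), config))
    best_score, best_config = max(history, key=lambda item: item[0])
    return best_config, best_score, history


def toy_objective(config):
    # Stand-in score that peaks at learning_rate=1e-3 and hidden_units=64.
    return (-abs(config["learning_rate"] - 1e-3)
            - 0.001 * abs(config["hidden_units"] - 64))


space = {"learning_rate": [1e-4, 1e-3, 1e-2], "hidden_units": [32, 64, 128]}
best_config, best_score, history = random_search(toy_objective, space, trials=200)
```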
Key Takeaways
RL is the perfect methodology for personalization problems
RL is ready for production with off the shelf technology
RL applications are challenging to develop; best practices are still being discovered
Thank You!
Patrick Halina, Mehdi Ben Ayed, Curren Pangler

More Related Content

What's hot

Dropbox Talk at Netflix ML Platform Meetup Spe 2019
Dropbox Talk at Netflix ML Platform Meetup Spe 2019Dropbox Talk at Netflix ML Platform Meetup Spe 2019
Dropbox Talk at Netflix ML Platform Meetup Spe 2019Faisal Siddiqi
 
Generics, Reflection, and Efficient Collections
Generics, Reflection, and Efficient CollectionsGenerics, Reflection, and Efficient Collections
Generics, Reflection, and Efficient CollectionsEleanor McHugh
 
게임 분산 서버 구조
게임 분산 서버 구조게임 분산 서버 구조
게임 분산 서버 구조Hyunjik Bae
 
Room 1 - 5 - Thủy Đặng - Load balancing k8s services on baremetal with Cilium...
Room 1 - 5 - Thủy Đặng - Load balancing k8s services on baremetal with Cilium...Room 1 - 5 - Thủy Đặng - Load balancing k8s services on baremetal with Cilium...
Room 1 - 5 - Thủy Đặng - Load balancing k8s services on baremetal with Cilium...Vietnam Open Infrastructure User Group
 
A brief overview of Reinforcement Learning applied to games
A brief overview of Reinforcement Learning applied to gamesA brief overview of Reinforcement Learning applied to games
A brief overview of Reinforcement Learning applied to gamesThomas da Silva Paula
 
오승준, 사회적 기술이 프로그래머 인생을 바꿔주는 이유, NDC2011
오승준, 사회적 기술이 프로그래머 인생을 바꿔주는 이유, NDC2011오승준, 사회적 기술이 프로그래머 인생을 바꿔주는 이유, NDC2011
오승준, 사회적 기술이 프로그래머 인생을 바꿔주는 이유, NDC2011devCAT Studio, NEXON
 
게임 개발 파이프라인과 시스템 기획(공개용)
게임 개발 파이프라인과 시스템 기획(공개용)게임 개발 파이프라인과 시스템 기획(공개용)
게임 개발 파이프라인과 시스템 기획(공개용)ChangHyun Won
 
프로그래머에게 사랑받는 게임 기획서 작성법
프로그래머에게 사랑받는 게임 기획서 작성법프로그래머에게 사랑받는 게임 기획서 작성법
프로그래머에게 사랑받는 게임 기획서 작성법Lee Sangkyoon (Kay)
 
The Art of Game Design 도서 요약 - Part 1 (원론편) : 디자이너는 경험을 만들어 낸다
The Art of Game Design 도서 요약 - Part 1 (원론편) : 디자이너는 경험을 만들어 낸다The Art of Game Design 도서 요약 - Part 1 (원론편) : 디자이너는 경험을 만들어 낸다
The Art of Game Design 도서 요약 - Part 1 (원론편) : 디자이너는 경험을 만들어 낸다Harns (Nak-Hyoung) Kim
 
Introduction to Presto at Treasure Data
Introduction to Presto at Treasure DataIntroduction to Presto at Treasure Data
Introduction to Presto at Treasure DataTaro L. Saito
 
그럴듯한 랜덤 생성 컨텐츠 만들기
그럴듯한 랜덤 생성 컨텐츠 만들기그럴듯한 랜덤 생성 컨텐츠 만들기
그럴듯한 랜덤 생성 컨텐츠 만들기Yongha Kim
 
이원, 온라인 게임 프로젝트 개발 결산 - 마비노기 개발 완수 보고서, NDC2011
이원, 온라인 게임 프로젝트 개발 결산 - 마비노기 개발 완수 보고서, NDC2011이원, 온라인 게임 프로젝트 개발 결산 - 마비노기 개발 완수 보고서, NDC2011
이원, 온라인 게임 프로젝트 개발 결산 - 마비노기 개발 완수 보고서, NDC2011devCAT Studio, NEXON
 
Live ops in mobile gaming - how to do it right?
 Live ops in mobile gaming - how to do it right? Live ops in mobile gaming - how to do it right?
Live ops in mobile gaming - how to do it right?GameCamp
 
Introducing PlayFab -- Effective LiveOps
Introducing PlayFab -- Effective LiveOpsIntroducing PlayFab -- Effective LiveOps
Introducing PlayFab -- Effective LiveOpsJames Gwertzman
 

What's hot (20)

Dropbox Talk at Netflix ML Platform Meetup Spe 2019
Dropbox Talk at Netflix ML Platform Meetup Spe 2019Dropbox Talk at Netflix ML Platform Meetup Spe 2019
Dropbox Talk at Netflix ML Platform Meetup Spe 2019
 
Generics, Reflection, and Efficient Collections
Generics, Reflection, and Efficient CollectionsGenerics, Reflection, and Efficient Collections
Generics, Reflection, and Efficient Collections
 
게임 분산 서버 구조
게임 분산 서버 구조게임 분산 서버 구조
게임 분산 서버 구조
 
Room 1 - 5 - Thủy Đặng - Load balancing k8s services on baremetal with Cilium...
Room 1 - 5 - Thủy Đặng - Load balancing k8s services on baremetal with Cilium...Room 1 - 5 - Thủy Đặng - Load balancing k8s services on baremetal with Cilium...
Room 1 - 5 - Thủy Đặng - Load balancing k8s services on baremetal with Cilium...
 
A brief overview of Reinforcement Learning applied to games
A brief overview of Reinforcement Learning applied to gamesA brief overview of Reinforcement Learning applied to games
A brief overview of Reinforcement Learning applied to games
 
오승준, 사회적 기술이 프로그래머 인생을 바꿔주는 이유, NDC2011
오승준, 사회적 기술이 프로그래머 인생을 바꿔주는 이유, NDC2011오승준, 사회적 기술이 프로그래머 인생을 바꿔주는 이유, NDC2011
오승준, 사회적 기술이 프로그래머 인생을 바꿔주는 이유, NDC2011
 
Game dev process
Game dev processGame dev process
Game dev process
 
게임 개발 파이프라인과 시스템 기획(공개용)
게임 개발 파이프라인과 시스템 기획(공개용)게임 개발 파이프라인과 시스템 기획(공개용)
게임 개발 파이프라인과 시스템 기획(공개용)
 
프로그래머에게 사랑받는 게임 기획서 작성법
프로그래머에게 사랑받는 게임 기획서 작성법프로그래머에게 사랑받는 게임 기획서 작성법
프로그래머에게 사랑받는 게임 기획서 작성법
 
Raheem Shehzad CV
Raheem Shehzad CVRaheem Shehzad CV
Raheem Shehzad CV
 
The Art of Game Design 도서 요약 - Part 1 (원론편) : 디자이너는 경험을 만들어 낸다
The Art of Game Design 도서 요약 - Part 1 (원론편) : 디자이너는 경험을 만들어 낸다The Art of Game Design 도서 요약 - Part 1 (원론편) : 디자이너는 경험을 만들어 낸다
The Art of Game Design 도서 요약 - Part 1 (원론편) : 디자이너는 경험을 만들어 낸다
 
Introduction to Presto at Treasure Data
Introduction to Presto at Treasure DataIntroduction to Presto at Treasure Data
Introduction to Presto at Treasure Data
 
[PandoraCube] 게임 디자인 원리
[PandoraCube] 게임 디자인 원리[PandoraCube] 게임 디자인 원리
[PandoraCube] 게임 디자인 원리
 
Supervised models
Supervised modelsSupervised models
Supervised models
 
그럴듯한 랜덤 생성 컨텐츠 만들기
그럴듯한 랜덤 생성 컨텐츠 만들기그럴듯한 랜덤 생성 컨텐츠 만들기
그럴듯한 랜덤 생성 컨텐츠 만들기
 
이원, 온라인 게임 프로젝트 개발 결산 - 마비노기 개발 완수 보고서, NDC2011
이원, 온라인 게임 프로젝트 개발 결산 - 마비노기 개발 완수 보고서, NDC2011이원, 온라인 게임 프로젝트 개발 결산 - 마비노기 개발 완수 보고서, NDC2011
이원, 온라인 게임 프로젝트 개발 결산 - 마비노기 개발 완수 보고서, NDC2011
 
Live ops in mobile gaming - how to do it right?
 Live ops in mobile gaming - how to do it right? Live ops in mobile gaming - how to do it right?
Live ops in mobile gaming - how to do it right?
 
게임 디렉팅 튜토리얼
게임 디렉팅 튜토리얼게임 디렉팅 튜토리얼
게임 디렉팅 튜토리얼
 
Introducing PlayFab -- Effective LiveOps
Introducing PlayFab -- Effective LiveOpsIntroducing PlayFab -- Effective LiveOps
Introducing PlayFab -- Effective LiveOps
 
Flink Streaming
Flink StreamingFlink Streaming
Flink Streaming
 

Similar to Productionizing Deep Reinforcement Learning with Spark and MLflow

Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix ScaleQcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix ScaleXavier Amatriain
 
Real world machine learning with Java for Fumankaitori.com
Real world machine learning with Java for Fumankaitori.comReal world machine learning with Java for Fumankaitori.com
Real world machine learning with Java for Fumankaitori.comMathieu Dumoulin
 
Horizon: Deep Reinforcement Learning at Scale
Horizon: Deep Reinforcement Learning at ScaleHorizon: Deep Reinforcement Learning at Scale
Horizon: Deep Reinforcement Learning at ScaleDatabricks
 
User Behavior Hashing for Audience Expansion
User Behavior Hashing for Audience ExpansionUser Behavior Hashing for Audience Expansion
User Behavior Hashing for Audience ExpansionDatabricks
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024SkyPlanner
 
MLOps Using MLflow
MLOps Using MLflowMLOps Using MLflow
MLOps Using MLflowDatabricks
 
Scott Clark, CEO, SigOpt, at The AI Conference 2017
Scott Clark, CEO, SigOpt, at The AI Conference 2017Scott Clark, CEO, SigOpt, at The AI Conference 2017
Scott Clark, CEO, SigOpt, at The AI Conference 2017MLconf
 
Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)Amazon Web Services
 
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...PAPIs.io
 
Simulation To Reality: Reinforcement Learning For Autonomous Driving
Simulation To Reality: Reinforcement Learning For Autonomous DrivingSimulation To Reality: Reinforcement Learning For Autonomous Driving
Simulation To Reality: Reinforcement Learning For Autonomous DrivingDonal Byrne
 
Pay pal paypal continuous performance as a self-service with fully-automated...
Pay pal  paypal continuous performance as a self-service with fully-automated...Pay pal  paypal continuous performance as a self-service with fully-automated...
Pay pal paypal continuous performance as a self-service with fully-automated...Dynatrace
 
Biomedical Signal and Image Analytics using MATLAB
Biomedical Signal and Image Analytics using MATLABBiomedical Signal and Image Analytics using MATLAB
Biomedical Signal and Image Analytics using MATLABCodeOps Technologies LLP
 
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...Lucidworks
 
Webinar: Question Answering and Virtual Assistants with Deep Learning
Webinar: Question Answering and Virtual Assistants with Deep LearningWebinar: Question Answering and Virtual Assistants with Deep Learning
Webinar: Question Answering and Virtual Assistants with Deep LearningLucidworks
 
Machine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabsMachine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabszekeLabs Technologies
 
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Sri Ambati
 
What is Drools, Guvnor and Planner? 2012 02-17 Brno Dev Conference
What is Drools, Guvnor and Planner? 2012 02-17 Brno Dev ConferenceWhat is Drools, Guvnor and Planner? 2012 02-17 Brno Dev Conference
What is Drools, Guvnor and Planner? 2012 02-17 Brno Dev ConferenceGeoffrey De Smet
 

Similar to Productionizing Deep Reinforcement Learning with Spark and MLflow (20)

Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix ScaleQcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
 
Real world machine learning with Java for Fumankaitori.com
Real world machine learning with Java for Fumankaitori.comReal world machine learning with Java for Fumankaitori.com
Real world machine learning with Java for Fumankaitori.com
 
Horizon: Deep Reinforcement Learning at Scale
Horizon: Deep Reinforcement Learning at ScaleHorizon: Deep Reinforcement Learning at Scale
Horizon: Deep Reinforcement Learning at Scale
 
User Behavior Hashing for Audience Expansion
User Behavior Hashing for Audience ExpansionUser Behavior Hashing for Audience Expansion
User Behavior Hashing for Audience Expansion
 
Matlab worshop
Matlab worshopMatlab worshop
Matlab worshop
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024
 
MLOps Using MLflow
MLOps Using MLflowMLOps Using MLflow
MLOps Using MLflow
 
Scott Clark, CEO, SigOpt, at The AI Conference 2017
Scott Clark, CEO, SigOpt, at The AI Conference 2017Scott Clark, CEO, SigOpt, at The AI Conference 2017
Scott Clark, CEO, SigOpt, at The AI Conference 2017
 
Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)
 
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
Building machine learning service in your business — Eric Chen (Uber) @PAPIs ...
 
Simulation To Reality: Reinforcement Learning For Autonomous Driving
Simulation To Reality: Reinforcement Learning For Autonomous DrivingSimulation To Reality: Reinforcement Learning For Autonomous Driving
Simulation To Reality: Reinforcement Learning For Autonomous Driving
 
JavaScript and Artificial Intelligence by Aatman & Sagar - AhmedabadJS
JavaScript and Artificial Intelligence by Aatman & Sagar - AhmedabadJSJavaScript and Artificial Intelligence by Aatman & Sagar - AhmedabadJS
JavaScript and Artificial Intelligence by Aatman & Sagar - AhmedabadJS
 
Informatica_Level1_Flyer
Informatica_Level1_FlyerInformatica_Level1_Flyer
Informatica_Level1_Flyer
 
Pay pal paypal continuous performance as a self-service with fully-automated...
Pay pal  paypal continuous performance as a self-service with fully-automated...Pay pal  paypal continuous performance as a self-service with fully-automated...
Pay pal paypal continuous performance as a self-service with fully-automated...
 
Biomedical Signal and Image Analytics using MATLAB
Biomedical Signal and Image Analytics using MATLABBiomedical Signal and Image Analytics using MATLAB
Biomedical Signal and Image Analytics using MATLAB
 
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl...
 
Webinar: Question Answering and Virtual Assistants with Deep Learning
Webinar: Question Answering and Virtual Assistants with Deep LearningWebinar: Question Answering and Virtual Assistants with Deep Learning
Webinar: Question Answering and Virtual Assistants with Deep Learning
 
Machine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabsMachine learning at scale - Webinar By zekeLabs
Machine learning at scale - Webinar By zekeLabs
 
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
 
What is Drools, Guvnor and Planner? 2012 02-17 Brno Dev Conference
What is Drools, Guvnor and Planner? 2012 02-17 Brno Dev ConferenceWhat is Drools, Guvnor and Planner? 2012 02-17 Brno Dev Conference
What is Drools, Guvnor and Planner? 2012 02-17 Brno Dev Conference
 

More from Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDatabricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringDatabricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixDatabricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationDatabricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchDatabricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesDatabricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsDatabricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkDatabricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkDatabricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesDatabricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkDatabricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 

More from Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Productionizing Deep Reinforcement Learning with Spark and MLflow

  • 1.
  • 2. Reinforcement Learning in Production at Zynga Patrick Halina, Architect, ML Engineering Curren Pangler, Principal Engineer, ML Engineering
  • 3. Agenda Reinforcement Learning (RL) Intro Intro to RL and personalization Zynga’s RL Tech Stack The off the shelf technologies Zynga uses to run RL in production for millions of users per day Designing RL Applications Creating RL applications is hard! Here’s what we’ve learned
  • 4. Mobile game developer Over 60M monthly active users
  • 5. Game Design is Hard Lots of design decisions ▪ How hard to make level? ▪ What game mode do we recommend? How do we personalize games? We want to choose behaviors for user to maximize long term engagement
  • 6. Personalization Problem Formulation Given a user’s State ▪ User features What Action do we pick ▪ E.g. Difficulty level That maximizes a long term Reward ▪ E.g. Engagement, retention
  • 7. Personalization Method 1: Rules Based Segments PMs define segments via rules Assign ‘personalized action’ to each segment A/B test segment vs. control
  • 8. Challenges Lots of trial and error manual work Player patterns change Limited ability to personalize ▪ Small set of outputs ▪ Small set of datapoints to make decisions on
  • 9. Personalization Method 2: Prediction Models Train model to predict long term reward for each personalized Action
  • 10. Challenges Requires lots of models Requires lots of labelled data ▪ Need to randomly assign users to each Action, then wait long enough to measure long term results Limited to simple outputs. E.g. How to pick best continuous value?
  • 11. Personalization Wishlist Automatically tune details of personalization Continuously explore and improve over time Personalize complex outputs ▪ E.g. Continuous values, multiple dimensions
  • 12. Solution: Reinforcement Learning (RL) AI for making sequences of decisions Agent picks Action based on current State to maximize Reward Automatically learn from past experiences Balance exploration with choosing best known Action
  • 13. If RL can beat the world’s best Go player, can we use it to make our games better?
  • 14. Application: WWF Daily Message Timing What time should we send user their daily message? We used RL Agent to personalize based on hourly activity Results: Significant increase in CTR vs. hand tuned system Delivered to millions of users per day
  • 16. RL Model Training (diagram labels: Agent, Action, Environment, State)
  • 17. RL Model Training (diagram labels: Agent, Action, Environment, State, Experience, Experience Replay Buffer, logging, training)
  • 18. RL Model Training (diagram labels: Agent, Action, Environment, State, Experience, Experience Replay Buffer, logging, training) Single Experience/Trajectory ▪ S0: Previous Observation ▪ A0: Action ▪ R: Reward ▪ S1: Current Observation ▪ A1: Next Optimal Action
  • 19. Academic RL Applications RL Agents learn by interacting with environment ▪ Can’t just train Agent with static set of labelled data like supervised learning models Well known RL applications are trained offline with simulator E.g. Training Agent to play Atari Agent is applied after lots of offline learning (diagram labels: RL Agent v1, Action, Learn, RL Agent v2, Action, Learn)
  • 20. Production RL Applications for Personalization Hard to simulate humans, so we learn by interacting with real users Agent interacts with humans from v1 Need to learn from batches in parallel Harder to manage data and workflows! (diagram labels: RL Agent v1, Action, Learn, RL Agent v2, with several parallel Action and Learn steps)
  • 21. RL Model Training Training Pipeline Wish List: Off-the-shelf Scalable Cutting-edge algorithms Reliable & robust Easily extendable
  • 22. RL Model Training TF-Agents Open-source RL Library that implements cutting-edge Deep RL algorithms (DQN, PPO, TD3 etc.) Advantages: ▪ Modular design ▪ Well-written ▪ Accuracy ▪ New algorithms
  • 23. Production RL Challenges How to: Convert messy, real-time logged data into RL trajectories? Persist, restore, and re-use past agents & trajectories Create trajectories at production scale? And… how do we make this repeatable and data-scientist friendly?
  • 24. RL-Bakery Our open-source library to help build batch RL applications in production, at scale github.com/zynga/rl-bakery
  • 25. RL-Bakery Wrapper around RL algorithm libraries that simplifies developing real world RL apps like personalization RL-Application RL-Bakery RL Library
  • 26. RL-Bakery RL-Application: ▪ Application-specific ▪ Written in a Databricks notebook ▪ Data-scientist friendly ▪ Provides model configuration & hyperparameters ▪ Fetches observations, actions, rewards as Spark DataFrames RL-Application RL-Bakery RL Library
  • 27. RL-Bakery RL-Bakery: ▪ Orchestrate steps of training pipeline ▪ Restore models and old time steps ▪ Create new training trajectories ▪ Persist model and trajectories between runs ▪ Deploy models to serving system ▪ Add functionality unavailable in TF-Agents (e.g. prioritized replay buffer) RL-Application RL-Bakery RL Library
  • 28. RL-Bakery RL Library: ▪ Open source RL libs implement algos like PPO, DQN etc. ▪ Currently only support TF-Agents ▪ Core RL Algorithms implemented using TensorFlow RL-Application RL-Bakery RL Library
  • 29. Real Time Model Serving (diagram labels: Zynga Personalize, Pre-processing, Model Inference, Post-processing, Feature Hydration, Zynga Feature Store, AWS SageMaker, State & Action Logging, S3, observation, action)
  • 30. Model Serving and Model Training (diagram labels: Real-Time Features, Real-Time Serving, AWS SageMaker, Observations, Action, Recommendation, S3, Experience Logs, RL Bakery Application, Training, RL Agent)
  • 32. Choose the Right Application Is the problem best modelled as a sequence of decisions? ▪ Does the Action taken in one timestep affect future Actions? ▪ Otherwise, use simpler solutions like predictive models or contextual Multi-Armed Bandits Is the Reward learnable? ▪ Does the Action impact the Reward? ▪ Sparse rewards are hard to learn RL shouldn’t be applied to every situation
  • 33. Choose States Anecdotally, RL Agents are sensitive to too many inputs Choose simple state spaces Compress state space size with unsupervised learning techniques like Auto-Encoding
  • 34. Designing Actions Start simple: small set of discrete Actions ▪ Allows you to use simpler Deep RL algorithms Continuous action spaces require algos from Policy Gradient family Large set of discrete Actions -> classic Recommendation Systems ▪ This goes beyond traditional RL set up ▪ Some cutting edge Recommendation Systems use RL
  • 35. Choosing RL Algorithms Active area of research, new algos are constantly being developed Algorithms are hard to implement, subtle details affect results Off the shelf implementations available from Open Source libs
  • 36. Hyperparameter Tuning Lots of RL application design choices ▪ Allows you to use simpler Deep RL algorithms Plus Deep Learning hyper parameters ▪ Learning rate ▪ Neural network architecture Slight hyper parameter changes have big effects How can we choose best options before going live?
  • 37. How to Pretrain Can you do better than random for initial launch? Train Agent to mimic some existing behavior ▪ Use historic data to reward Agent for picking previous Actions ▪ Agent then slowly learns when to deviate Simulate simple scenarios with hand made mechanics ▪ Capture relationships between features in State and Action ▪ Simple scenarios have clear “optimal” strategy so you can measure success
  • 38. Hyperparameter Tuning Automation Automate deep learning hyperparameter tuning with MLFlow
  • 39. Key Takeaways RL is the perfect methodology for personalization problems RL is ready for production with off the shelf technology RL applications are challenging to develop, best practices are currently being discovered
  • 40. Thank You! Patrick Halina, Mehdi Ben Ayed, Curren Pangler
  • 41. Feedback Your feedback is important to us. Don’t forget to rate and review the sessions.
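
Slides 18 and 23 describe turning logged data into experiences of the form (S0, A0, R, S1). The following is a minimal sketch of that pairing step in plain Python, not RL-Bakery's actual implementation. The log record layout (`user_id`, `timestep`, `observation`, `action`, `reward`) and the names `Experience` and `build_experiences` are assumptions made for illustration. The A1 field from slide 18 is left out, on the assumption that the agent computes it at training time.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Experience:
    """One training example (S0, A0, R, S1). Hypothetical container."""
    prev_observation: Tuple[float, ...]  # S0
    action: int                          # A0
    reward: float                        # R, observed after taking A0
    observation: Tuple[float, ...]       # S1


def build_experiences(log_rows: List[dict]) -> List[Experience]:
    """Pair each user's consecutive timesteps into experiences.

    Assumes each row has keys user_id, timestep, observation, action
    and reward, where reward is the outcome of the action on that row.
    """
    rows_by_user: Dict[str, List[dict]] = defaultdict(list)
    for row in log_rows:
        rows_by_user[row["user_id"]].append(row)

    experiences = []
    for rows in rows_by_user.values():
        # Logged data can arrive out of order, so sort per user first.
        rows.sort(key=lambda r: r["timestep"])
        for prev, curr in zip(rows, rows[1:]):
            # Skip gaps: with a missing timestep, curr is not the
            # true successor state of prev.
            if curr["timestep"] != prev["timestep"] + 1:
                continue
            experiences.append(Experience(
                prev_observation=tuple(prev["observation"]),
                action=prev["action"],
                reward=prev["reward"],
                observation=tuple(curr["observation"]),
            ))
    return experiences


logs = [
    {"user_id": "u1", "timestep": 1, "observation": [0.5], "action": 0, "reward": 0.0},
    {"user_id": "u1", "timestep": 0, "observation": [0.2], "action": 1, "reward": 1.0},
    {"user_id": "u2", "timestep": 0, "observation": [0.9], "action": 2, "reward": 0.0},
    {"user_id": "u1", "timestep": 3, "observation": [0.7], "action": 1, "reward": 0.0},
]
experiences = build_experiences(logs)
```

In this sample only user `u1`'s timesteps 0 and 1 form an experience: `u2` has a single row, and `u1`'s jump from timestep 1 to 3 is dropped as a gap. At production scale the same grouping and pairing would more likely be expressed over Spark DataFrames, as slide 26 suggests.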
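
Slide 12 mentions balancing exploration with choosing the best known Action, and slide 34 recommends starting with a small set of discrete Actions. The talk does not say which exploration strategy Zynga uses. Epsilon-greedy is one common choice, sketched here purely as an illustration; the function name `epsilon_greedy` and the example value estimates are hypothetical.

```python
import random
from typing import Optional, Sequence


def epsilon_greedy(q_values: Sequence[float], epsilon: float,
                   rng: Optional[random.Random] = None) -> int:
    """Pick an action index from per-action value estimates.

    With probability epsilon pick a uniformly random action (explore),
    otherwise pick the action with the highest estimate (exploit).
    """
    if len(q_values) == 0:
        raise ValueError("q_values must not be empty")
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must be in [0, 1]")
    rng = rng or random.Random()
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Hypothetical value estimates for three discrete actions.
estimates = [0.1, 0.9, 0.3]
greedy_choice = epsilon_greedy(estimates, epsilon=0.0)
exploring_choice = epsilon_greedy(estimates, epsilon=0.1, rng=random.Random(0))
```

With `epsilon=0.0` the function always returns the highest-valued action (index 1 here). A small positive epsilon keeps a fraction of traffic on other actions, so the agent continues to collect data about them.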