Copyright © 2018 Samsung SDS, Inc. All rights reserved
@mohamadtweets
Reinforcement Learning in the Wild & Lessons Learned
Mohamad Charafeddine
@mohamadtweets
Director of Tech Planning, AI Team
Samsung SDS Research America
April 12th, 2018
“Theory is the first term in the Taylor series of practice” – Tom Cover, Stanford
Professor of Information Theory, in his 1990 Shannon Lecture
Practice = Theory + higher-order terms (“the wild”)
[Chart labels: Theory, Practice; axes: Time, Complexity]
Optimize “engagement” through AI personalization
• What does “engagement” mean?
• From whose point of view? User? Company? Society?
• For what time horizon? Days? Weeks? Years?
• ..
• Unintended 2nd order effect: amplification of
echo-chambers
• Unintended 3rd order effect: ?
• Should the AI objective function be open-source and
auditable?
• Should the AI objective function imitate/learn from
humans?
Takeaways from this talk
• Introduce Reinforcement Learning and its breadth of potential applications
• Showcase some RL examples
• Provide a framework to better evaluate RL application areas in terms of risk and design challenges from a PM point of view
I. Reinforcement Learning Intro
II. 3 RL Use Cases
III. Lessons Learned
Reinforcement Learning
[Diagram: the RL Agent sends Actions to the Environment; the Environment returns Rewards and the perceived State of the environment as Inputs to the RL Agent.]
In the context of an Atari game, the long-term objective is the score
[Diagram: the RL Agent sends Actions to the Environment; its Inputs are the Rewards (the score) and the State of the environment (frames on the screen).]
https://deepmind.com/research/publications/playing-atari-deep-reinforcement-learning/
But focusing only on the score in the reward function can sometimes backfire!
[Diagram: the RL Agent sends Actions to the Environment; its Inputs are the State of the environment and the Rewards (the score).]
https://blog.openai.com/faulty-reward-functions/
The boat can spin in loops collecting goodies but never finish the race: an unintended higher-order effect.
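The failure mode above can be illustrated with a toy calculation. This is only a sketch, not the boat-racing environment from the linked post: the point values, episode length, finish bonus, and the two hand-written policies are all made-up assumptions, chosen to show how a score-only reward can rank looping above finishing.

```python
# Toy illustration of a mis-specified reward (all numbers are hypothetical).
EPISODE_STEPS = 100      # assumed episode length
GOODIE_POINTS = 10       # assumed score for hitting a respawning goodie
FINISH_POINTS = 100      # assumed one-time score for crossing the finish line
STEPS_PER_LOOP = 5       # assumed steps needed to circle back to a goodie


def score_only_return(policy: str) -> int:
    """Total game score collected by a hand-written policy over one episode."""
    if policy == "loop":
        # Circle for the whole episode, one goodie per loop; never finish.
        return (EPISODE_STEPS // STEPS_PER_LOOP) * GOODIE_POINTS
    if policy == "finish":
        # Drive straight to the finish line and collect the finish score.
        return FINISH_POINTS
    raise ValueError(f"unknown policy: {policy}")


def finished(policy: str) -> bool:
    """Whether the policy ever completes the race."""
    return policy == "finish"


def shaped_return(policy: str, finish_bonus: int = 1000) -> int:
    """Score plus an assumed bonus that is paid only for completing the race."""
    return score_only_return(policy) + (finish_bonus if finished(policy) else 0)


for name in ("loop", "finish"):
    print(name, score_only_return(name), shaped_return(name))
```

Under these assumed numbers, the score-only return prefers the looping policy, while the return with a completion bonus prefers finishing. The bonus is just one possible correction; choosing it well is itself a reward-design problem.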
Many potential use cases of RL, where the environment is: Object (physics, etc.),
Human (preferences..), Biology (& chemical,..), Market (multiple agents), Code,..
Environment
Robotics
Industrial
Manufacturing
Social
Content
Wellness
Healthcare
Pharmaceutical
Agriculture
Advertising
Marketing
Enterprise
Finance
Games
Security
E-commerce
Networking
Another AI system
..
Deep Learning has succeeded in breaking into practice:
Marketing, Sales, Customer Support, Security, Recruiting, Education, Investment, Legal, Logistics, Healthcare, Wellness, Automotive, Manufacturing, Agriculture, Personal Assistants, Speech/Image/Video recognition, Advertising, ..
Reinforcement Learning has succeeded in fewer areas:
Games, Robotics, Chatbots, Manufacturing, Wellness, Automotive*, Marketing*, Customer Support*, Agriculture*, Advertising*, ..
[Figure, two panels, each with a Perception stage (Data, Features, Perception Models, Insights) and a Decision stage (Recommendation, Decision, Value).
Most AI applications today: Deep Learning for Perception, human-designed rules for Decision.
Future AI applications: Deep Learning for Perception, Reinforcement Learning for Decision; tech + ethical challenges.]
Slowly, human decision rules will be replaced with AI decisions.
Such a move, inherent to RL applications, opens the door to ethics and decision-governance questions.
To apply RL, we need to understand the characteristics (metadata) of the
problems that are most appropriate to apply it to.
Where to start to bring RL to practice? Risk profile?
And how to build a framework to qualify application areas?
Traffic Lights IoT Optimization
• 2 sensors on each lane that
measure # of cars that pass
& speed of cars
• 2 intersections with 4 lights
each
• Goal is to optimize for flow
rate (# cars/sec)
Controlled Environment: Simulator
Traffic Lights IoT Optimization
[Diagram: the DRL Agent (LSTM, online learning) chooses among 8 discrete actions on the Environment (flow of cars in the intersections). Its Inputs are the State, a stream of data from 16 sensor outputs (car counter + speed), and a direct Reward.]
End-to-end: no feature engineering, just feeding the raw stream of data.
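To make the setup concrete, here is a minimal sketch of how the 16 sensor outputs, the 8 discrete actions, and a flow-rate reward might be represented. The talk does not give these details, so the split into 8 lanes with one counter and one speed reading each, the measurement interval, and every name below are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List

N_LANES = 8      # assumption: 8 lanes x 2 readings = 16 sensor outputs
N_ACTIONS = 8    # from the slide: 8 discrete actions


@dataclass
class LaneReading:
    cars_passed: int     # car counter over the last interval
    mean_speed: float    # average speed over the last interval


def observation(readings: List[LaneReading]) -> List[float]:
    """Flatten raw sensor readings into a 16-value input, with no feature engineering."""
    if len(readings) != N_LANES:
        raise ValueError("expected one reading per lane")
    obs: List[float] = []
    for r in readings:
        obs.extend([float(r.cars_passed), r.mean_speed])
    return obs


def flow_rate_reward(readings: List[LaneReading], interval_s: float) -> float:
    """Reward = cars per second counted across all lanes during the interval."""
    return sum(r.cars_passed for r in readings) / interval_s


def is_valid_action(action: int) -> bool:
    """An action is an index into the 8 discrete choices."""
    return 0 <= action < N_ACTIONS


readings = [LaneReading(cars_passed=3, mean_speed=8.5) for _ in range(N_LANES)]
obs = observation(readings)
reward = flow_rate_reward(readings, interval_s=10.0)
print(len(obs), reward)
```

What each of the 8 actions does to the lights is left unspecified here, because the slides do not say.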
Results: ~ 30% reduction in total Travel Time
[Charts: Uneven Flow and High Flow scenarios; labeled values: 32% faster, 39%, 30%]
Theory to Practice journey:
• What happens if I want to apply this to a city?
• How to handle 10s of thousands of actions?
• How to characterize convergence, robustness?
• How fast can it adapt to changes?
• …
• Cost of sensors? Use of Traffic on Google Maps? Etc.
• Can Autonomous Vehicles play a role as extra control knobs?
RL for traffic management using Autonomous Vehicles on the road with human drivers (2 lanes)
Simulation: Wu, et al. IEEE T-RO, 2018
Ion Stoica - RL Systems @ RISELab at UC Berkeley
ScaledML Conference by Matroid, March 2018, Stanford Univ.
https://www.youtube.com/watch?v=-KC3tO4BDuQ
Applying DRL to Storage Servers to optimize operational efficiency
State: a function of workload (reads or writes) and temperature
Actions: fan speeds
Environment: storage server with 24 SSD drives
Reward: a function of temperatures and fan speeds
RL Agent: Advantage Actor-Critic (A2C)
S. Srinivasa, G. Kathalagiri, J. Varanasi, L. Quintela, M. Charafeddine, C. Lee,
“On Optimizing Operational Efficiency in Storage Systems Via Deep Reinforcement Learning”, submitted to ECML PKDD
[Figure label: desired operation region]
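The slide only says that the reward is a function of temperatures and fan speeds. As a rough sketch of what such a function could look like, the code below penalizes drives that exceed a temperature limit and, more mildly, penalizes fan effort. The functional form, the 50 °C limit, the weight, and the number of fans are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def reward(temps_c: Sequence[float], fan_speeds: Sequence[float],
           temp_limit_c: float = 50.0, fan_weight: float = 0.1) -> float:
    """Hypothetical reward: penalize overheating and, more mildly, fan effort.

    temps_c     -- drive temperatures in degrees Celsius
    fan_speeds  -- fan duty cycles normalized to [0, 1]
    """
    # Penalty grows with how far each drive exceeds the assumed limit.
    overheat = sum(max(0.0, t - temp_limit_c) for t in temps_c)
    # Mean fan duty stands in for energy use and wear.
    fan_cost = sum(fan_speeds) / len(fan_speeds)
    return -(overheat + fan_weight * fan_cost)


cool = reward([40.0] * 24, [0.3] * 4)        # 24 drives, 4 assumed fans
hot = reward([55.0] * 24, [0.3] * 4)
cool_loud = reward([40.0] * 24, [0.9] * 4)
print(cool, hot, cool_loud)
```

With this shape, the highest reward goes to states that stay under the limit with the least fan effort, which is one way to express a "desired operation region".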
Learning… over a few days
[Diagram: the RL Agent sets Fan Speeds; it observes the State (different workloads, temperature) and receives a Reward.]
• Learning directly on the real environment
(no simulator)
• Model-free: does not require any knowledge
of the SSD server behavior dynamics
• Exposed to different stochastic workloads
Performance for Idle Vs Heavy I/O workloads on the operational contours
[Plots: status quo controller vs. Deep Reinforcement Learning, each with the desired operational region marked]
Performance for different workloads
At the beginning of training, the algorithm is exploring and learning.
Once it has finished learning the right policy, operational behavior is within the desired region.
[Plot labels: desired operational region; resulting distribution from different workloads]
Ad Spend Optimization for Lead Demand Generation
RL in Digital Marketing
Demand gen challenges
• Financial Services: daily lead-qualification volume constraints
• Hospitality: changing inventory (hotels, car rentals)
• Food Apps: limited supply & time-sensitivity challenges
• Retail: limited # of inventories, discounts
• Marketing bidding vs. competitors
[Chart: Maximum Gross Profit lies between under-producing demand (Demand < Supply: opportunity loss) and over-producing demand (Demand > Supply: over-spending).]
Setup: Marketing Demand Gen Optimization
Every hour, decide the hourly Cost Per Click (CPC) for 8 Search Engine Marketing (SEM) accounts to optimize Gross Profit:
sum over 24 hrs of (Hourly Lead Gen Referral Revenue - Hourly Cust. Acquisition Cost from SEM)
State: Previous CPC per SEM Account, Hour of Day, Day of Week, ..
Reward: Gross Profit
Environment: Marketplace webpage
RL Agent: TRPO, Importance Sampling
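The gross-profit objective stated on this slide can be written out directly. The sketch below only computes the daily sum from two hourly series. The function name and the sample figures are illustrative assumptions, and it says nothing about how the agent chooses CPCs.

```python
from typing import Sequence

HOURS_PER_DAY = 24


def daily_gross_profit(hourly_revenue: Sequence[float],
                       hourly_sem_cost: Sequence[float]) -> float:
    """Sum over 24 hours of (lead-gen referral revenue - SEM acquisition cost)."""
    if len(hourly_revenue) != HOURS_PER_DAY or len(hourly_sem_cost) != HOURS_PER_DAY:
        raise ValueError("expected 24 hourly values for each series")
    return sum(r - c for r, c in zip(hourly_revenue, hourly_sem_cost))


# Made-up figures: constant revenue and spend for every hour of the day.
revenue = [100.0] * HOURS_PER_DAY
cost = [60.0] * HOURS_PER_DAY
profit = daily_gross_profit(revenue, cost)
print(profit)
```

Because revenue from a lead may arrive after the click was paid for, in practice the hourly revenue series may have to be attributed back to the hour of spend. That attribution is not covered by this sketch.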
Results for SEM Demand Gen: 12-20% gross profit uplift
A. Beloi, M. Charafeddine, G. Kathalagiri, A. Mishra, L. Quintela, S.
Srinivasa, patent filed: “Spending Allocation in Multi-Channel
Digital Marketing” (U.S. Application No.: 20180047039)
[Chart labels: Gross Profit; Cumulative Demand Spend; Joint Decisions; Gross Margin]
Environment
EASIER | HARDER
Fully observable | Partially observable
Low dimensionality to represent | High dimensionality to represent
Time-invariant (if I conduct the experiment now or next week, it’s the same) | Time-variant
Well-behaved | Stochastic w/ fat tails
We borrow the concept of coherence time from digital communications and define an “Environment Coherence Time tc” to characterize how quickly the “channel” or “environment” is changing.
Reward
EASIER | HARDER
Objective | Subjective (mostly dealing w/ humans); prone to PM/Data Scientist bias; has an ethical dimension
Monolithic | Composite (e.g., Robotics: get closer → move arm → orient → pick → move → stack)
Direct | Indirect w/ a lag (e.g., Marketing)
Simple to describe | Complex, need AI (Inverse RL) to learn it (how to prune bad actors?)
Most challenging for Product Management
Actions
EASIER | HARDER
Discrete: small number (~< 20) | Discrete: large # (100s, 1K, ..); need hierarchical actions
Continuous: small number (Self-Driving Cars: gas, brake, steering) | Continuous: large number (Ad Spend CPC per keyword)
Static | Dynamic with time (new ads added, removed, ..)
Exploration Cost of Learning
EASIER | HARDER
A high-fidelity simulator exists (in a game, Simulator = Environment) | Low-fidelity simulator or none
Can run many parallel experiments | Only 1 experiment at a time (marketplace that’s hard to simulate)
$0, no impact | $$$ or humans involved (Ad Spend, Healthcare, Social media, ..)
Fast learning episodes | Long-cycle learning episodes (wellness, marketing re-targeting)
RL in a Lab | RL in the Wild
RL for a game: Simulator & Environment are 100% the same | There is a Simulator & Environment gap
~0 exploration cost | $-$$$ exploration cost
Environment is time-invariant | Environment can be time-variant
Direct, instant feedback | More complex: can be indirect or w/ lag
Unconstrained convergence time | Convergence time << environment coherence time
Big Data | Big Data & Small Data
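The condition "Convergence Time << Env Coherence Time" can be turned into a quick feasibility check. This is a minimal sketch: it assumes both times are estimated in the same unit, and it reads "much less than" as a safety margin of 10x, which is an arbitrary choice made here and not a figure from the talk.

```python
def can_track_environment(convergence_time: float, coherence_time: float,
                          margin: float = 10.0) -> bool:
    """True if convergence_time is at least `margin` times smaller than coherence_time."""
    if convergence_time <= 0 or coherence_time <= 0:
        raise ValueError("times must be positive")
    return convergence_time * margin <= coherence_time


# Hypothetical examples, in days: a policy that converges in 3 days can follow
# an environment that stays stable for 90 days; one that needs 30 days cannot.
print(can_track_environment(3, 90))
print(can_track_environment(30, 90))
```

If the check fails, the agent is always learning about an environment that has already changed, which is one reason slow-feedback domains are harder.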
[Chart: application areas placed on two axes, Controlled Environment to Wild Environment (Simulator ≠ Reality) and Low Exploration Risk to High Exploration Risk. Areas shown: Healthcare, Wellness, AgTech, Trading, Manufacturing, Marketing.]
Advice to AI entrepreneurs planning their journey into the wild
Pick your vertical wisely. It decides the macro terms that you will face.

More Related Content

Similar to Mohamad C

[NEW LAUNCH!] [REPEAT 1] AWS DeepRacer Workshops –a new, fun way to learn rei...
[NEW LAUNCH!] [REPEAT 1] AWS DeepRacer Workshops –a new, fun way to learn rei...[NEW LAUNCH!] [REPEAT 1] AWS DeepRacer Workshops –a new, fun way to learn rei...
[NEW LAUNCH!] [REPEAT 1] AWS DeepRacer Workshops –a new, fun way to learn rei...Amazon Web Services
 
Machine Learning at the Edge (AIM302) - AWS re:Invent 2018
Machine Learning at the Edge (AIM302) - AWS re:Invent 2018Machine Learning at the Edge (AIM302) - AWS re:Invent 2018
Machine Learning at the Edge (AIM302) - AWS re:Invent 2018Amazon Web Services
 
How can you maximize the benefits from RPA and AI in your automation initiatives
How can you maximize the benefits from RPA and AI in your automation initiativesHow can you maximize the benefits from RPA and AI in your automation initiatives
How can you maximize the benefits from RPA and AI in your automation initiativesIndernain Singh
 
How To Convince Your Boss a Rewrite is Necessary
How To Convince Your Boss a Rewrite is NecessaryHow To Convince Your Boss a Rewrite is Necessary
How To Convince Your Boss a Rewrite is NecessaryPete Jeffryes
 
Designing for a Data-Driven Economy (AIS307) - AWS re:Invent 2018
Designing for a Data-Driven Economy (AIS307) - AWS re:Invent 2018Designing for a Data-Driven Economy (AIS307) - AWS re:Invent 2018
Designing for a Data-Driven Economy (AIS307) - AWS re:Invent 2018Amazon Web Services
 
AIoT: AI Meets IoT (IOT204) - AWS re:Invent 2018
AIoT: AI Meets IoT (IOT204) - AWS re:Invent 2018AIoT: AI Meets IoT (IOT204) - AWS re:Invent 2018
AIoT: AI Meets IoT (IOT204) - AWS re:Invent 2018Amazon Web Services
 
AWS re:Invent 2018 - AIM302 - Machine Learning at the Edge
AWS re:Invent 2018 - AIM302  - Machine Learning at the Edge AWS re:Invent 2018 - AIM302  - Machine Learning at the Edge
AWS re:Invent 2018 - AIM302 - Machine Learning at the Edge Julien SIMON
 
Designing a Successful Governed Citizen Data Science Strategy
Designing a Successful Governed Citizen Data Science StrategyDesigning a Successful Governed Citizen Data Science Strategy
Designing a Successful Governed Citizen Data Science StrategyDATAVERSITY
 
Leadership Session: The Future of Enterprise IT (ENT220-L) - AWS re:Invent 2018
Leadership Session:  The Future of Enterprise IT (ENT220-L) - AWS re:Invent 2018Leadership Session:  The Future of Enterprise IT (ENT220-L) - AWS re:Invent 2018
Leadership Session: The Future of Enterprise IT (ENT220-L) - AWS re:Invent 2018Amazon Web Services
 
Introducing Amazon SageMaker - AWS Online Tech Talks
Introducing Amazon SageMaker - AWS Online Tech TalksIntroducing Amazon SageMaker - AWS Online Tech Talks
Introducing Amazon SageMaker - AWS Online Tech TalksAmazon Web Services
 
Martin Huddleston: No Service Management, No Security
Martin Huddleston: No Service Management, No SecurityMartin Huddleston: No Service Management, No Security
Martin Huddleston: No Service Management, No SecurityitSMF UK
 
Cheryl Wiebe - Advanced Analytics in the Industrial World
Cheryl Wiebe - Advanced Analytics in the Industrial WorldCheryl Wiebe - Advanced Analytics in the Industrial World
Cheryl Wiebe - Advanced Analytics in the Industrial WorldRehgan Avon
 
Big Data LDN 2018: DATA OPERATIONS PROBLEMS CREATED BY DEEP LEARNING, AND HOW...
Big Data LDN 2018: DATA OPERATIONS PROBLEMS CREATED BY DEEP LEARNING, AND HOW...Big Data LDN 2018: DATA OPERATIONS PROBLEMS CREATED BY DEEP LEARNING, AND HOW...
Big Data LDN 2018: DATA OPERATIONS PROBLEMS CREATED BY DEEP LEARNING, AND HOW...Matt Stubbs
 
Unlocking business value in cement through Digital transformation| Ramco
Unlocking business value in cement through  Digital transformation| RamcoUnlocking business value in cement through  Digital transformation| Ramco
Unlocking business value in cement through Digital transformation| Ramcoramcosystemcom
 
Smarter Event-Driven Edge with Amazon SageMaker & Project Flogo (AIM204-S) - ...
Smarter Event-Driven Edge with Amazon SageMaker & Project Flogo (AIM204-S) - ...Smarter Event-Driven Edge with Amazon SageMaker & Project Flogo (AIM204-S) - ...
Smarter Event-Driven Edge with Amazon SageMaker & Project Flogo (AIM204-S) - ...Amazon Web Services
 
TM Forum AI Program Overview
TM Forum AI Program OverviewTM Forum AI Program Overview
TM Forum AI Program OverviewTMForum
 
MongoDB World 2018: From Disruption to Transformation: Document Databases, Do...
MongoDB World 2018: From Disruption to Transformation: Document Databases, Do...MongoDB World 2018: From Disruption to Transformation: Document Databases, Do...
MongoDB World 2018: From Disruption to Transformation: Document Databases, Do...MongoDB
 
Better Business from Exploring Ideas - Modern Data Architectures on AWS
Better Business from Exploring Ideas - Modern Data Architectures on AWSBetter Business from Exploring Ideas - Modern Data Architectures on AWS
Better Business from Exploring Ideas - Modern Data Architectures on AWSAmazon Web Services
 
WinOps Conf 2015 - John Rakowski - Militarise It for #DevOps success
WinOps Conf 2015 - John Rakowski - Militarise It for #DevOps successWinOps Conf 2015 - John Rakowski - Militarise It for #DevOps success
WinOps Conf 2015 - John Rakowski - Militarise It for #DevOps successWinOps Conf
 

Similar to Mohamad C (20)

[NEW LAUNCH!] [REPEAT 1] AWS DeepRacer Workshops –a new, fun way to learn rei...
[NEW LAUNCH!] [REPEAT 1] AWS DeepRacer Workshops –a new, fun way to learn rei...[NEW LAUNCH!] [REPEAT 1] AWS DeepRacer Workshops –a new, fun way to learn rei...
[NEW LAUNCH!] [REPEAT 1] AWS DeepRacer Workshops –a new, fun way to learn rei...
 
Machine Learning at the Edge (AIM302) - AWS re:Invent 2018
Machine Learning at the Edge (AIM302) - AWS re:Invent 2018Machine Learning at the Edge (AIM302) - AWS re:Invent 2018
Machine Learning at the Edge (AIM302) - AWS re:Invent 2018
 
How can you maximize the benefits from RPA and AI in your automation initiatives
How can you maximize the benefits from RPA and AI in your automation initiativesHow can you maximize the benefits from RPA and AI in your automation initiatives
How can you maximize the benefits from RPA and AI in your automation initiatives
 
How To Convince Your Boss a Rewrite is Necessary
How To Convince Your Boss a Rewrite is NecessaryHow To Convince Your Boss a Rewrite is Necessary
How To Convince Your Boss a Rewrite is Necessary
 
Designing for a Data-Driven Economy (AIS307) - AWS re:Invent 2018
Designing for a Data-Driven Economy (AIS307) - AWS re:Invent 2018Designing for a Data-Driven Economy (AIS307) - AWS re:Invent 2018
Designing for a Data-Driven Economy (AIS307) - AWS re:Invent 2018
 
AIoT: AI Meets IoT (IOT204) - AWS re:Invent 2018
AIoT: AI Meets IoT (IOT204) - AWS re:Invent 2018AIoT: AI Meets IoT (IOT204) - AWS re:Invent 2018
AIoT: AI Meets IoT (IOT204) - AWS re:Invent 2018
 
AWS re:Invent 2018 - AIM302 - Machine Learning at the Edge
AWS re:Invent 2018 - AIM302  - Machine Learning at the Edge AWS re:Invent 2018 - AIM302  - Machine Learning at the Edge
AWS re:Invent 2018 - AIM302 - Machine Learning at the Edge
 
Designing a Successful Governed Citizen Data Science Strategy
Designing a Successful Governed Citizen Data Science StrategyDesigning a Successful Governed Citizen Data Science Strategy
Designing a Successful Governed Citizen Data Science Strategy
 
Leadership Session: The Future of Enterprise IT (ENT220-L) - AWS re:Invent 2018
Leadership Session:  The Future of Enterprise IT (ENT220-L) - AWS re:Invent 2018Leadership Session:  The Future of Enterprise IT (ENT220-L) - AWS re:Invent 2018
Leadership Session: The Future of Enterprise IT (ENT220-L) - AWS re:Invent 2018
 
Introducing Amazon SageMaker - AWS Online Tech Talks
Introducing Amazon SageMaker - AWS Online Tech TalksIntroducing Amazon SageMaker - AWS Online Tech Talks
Introducing Amazon SageMaker - AWS Online Tech Talks
 
Martin Huddleston: No Service Management, No Security
Martin Huddleston: No Service Management, No SecurityMartin Huddleston: No Service Management, No Security
Martin Huddleston: No Service Management, No Security
 
Cheryl Wiebe - Advanced Analytics in the Industrial World
Cheryl Wiebe - Advanced Analytics in the Industrial WorldCheryl Wiebe - Advanced Analytics in the Industrial World
Cheryl Wiebe - Advanced Analytics in the Industrial World
 
Big Data LDN 2018: DATA OPERATIONS PROBLEMS CREATED BY DEEP LEARNING, AND HOW...
Big Data LDN 2018: DATA OPERATIONS PROBLEMS CREATED BY DEEP LEARNING, AND HOW...Big Data LDN 2018: DATA OPERATIONS PROBLEMS CREATED BY DEEP LEARNING, AND HOW...
Big Data LDN 2018: DATA OPERATIONS PROBLEMS CREATED BY DEEP LEARNING, AND HOW...
 
Unlocking business value in cement through Digital transformation| Ramco
Unlocking business value in cement through  Digital transformation| RamcoUnlocking business value in cement through  Digital transformation| Ramco
Unlocking business value in cement through Digital transformation| Ramco
 
Smarter Event-Driven Edge with Amazon SageMaker & Project Flogo (AIM204-S) - ...
Smarter Event-Driven Edge with Amazon SageMaker & Project Flogo (AIM204-S) - ...Smarter Event-Driven Edge with Amazon SageMaker & Project Flogo (AIM204-S) - ...
Smarter Event-Driven Edge with Amazon SageMaker & Project Flogo (AIM204-S) - ...
 
TM Forum AI Program Overview
TM Forum AI Program OverviewTM Forum AI Program Overview
TM Forum AI Program Overview
 
MongoDB World 2018: From Disruption to Transformation: Document Databases, Do...
MongoDB World 2018: From Disruption to Transformation: Document Databases, Do...MongoDB World 2018: From Disruption to Transformation: Document Databases, Do...
MongoDB World 2018: From Disruption to Transformation: Document Databases, Do...
 
Introduction to Sagemaker
Introduction to SagemakerIntroduction to Sagemaker
Introduction to Sagemaker
 
Better Business from Exploring Ideas - Modern Data Architectures on AWS
Better Business from Exploring Ideas - Modern Data Architectures on AWSBetter Business from Exploring Ideas - Modern Data Architectures on AWS
Better Business from Exploring Ideas - Modern Data Architectures on AWS
 
WinOps Conf 2015 - John Rakowski - Militarise It for #DevOps success
WinOps Conf 2015 - John Rakowski - Militarise It for #DevOps successWinOps Conf 2015 - John Rakowski - Militarise It for #DevOps success
WinOps Conf 2015 - John Rakowski - Militarise It for #DevOps success
 

More from Hilary Ip

Living in Color: Carving Out Safe Spaces For Community by Danielle Cadet (Man...
Living in Color: Carving Out Safe Spaces For Community by Danielle Cadet (Man...Living in Color: Carving Out Safe Spaces For Community by Danielle Cadet (Man...
Living in Color: Carving Out Safe Spaces For Community by Danielle Cadet (Man...Hilary Ip
 
Testing New Revenue Streams by Stefanie Rapp (SVP, Revenue Strategy, Bleacher...
Testing New Revenue Streams by Stefanie Rapp (SVP, Revenue Strategy, Bleacher...Testing New Revenue Streams by Stefanie Rapp (SVP, Revenue Strategy, Bleacher...
Testing New Revenue Streams by Stefanie Rapp (SVP, Revenue Strategy, Bleacher...Hilary Ip
 
Building A New Ecosystem: The Role of Partnerships at an OTT Service by Justi...
Building A New Ecosystem: The Role of Partnerships at an OTT Service by Justi...Building A New Ecosystem: The Role of Partnerships at an OTT Service by Justi...
Building A New Ecosystem: The Role of Partnerships at an OTT Service by Justi...Hilary Ip
 
How PBS Creates YouTube Series that Educate, Entertain & Inspire by Adam Dyle...
How PBS Creates YouTube Series that Educate, Entertain & Inspire by Adam Dyle...How PBS Creates YouTube Series that Educate, Entertain & Inspire by Adam Dyle...
How PBS Creates YouTube Series that Educate, Entertain & Inspire by Adam Dyle...Hilary Ip
 
Telling Better Stories Across the Open Web by Adam Greenberg (Sr. Global Prod...
Telling Better Stories Across the Open Web by Adam Greenberg (Sr. Global Prod...Telling Better Stories Across the Open Web by Adam Greenberg (Sr. Global Prod...
Telling Better Stories Across the Open Web by Adam Greenberg (Sr. Global Prod...Hilary Ip
 
Data Storytelling in the Digital Age by Stephanie Salmon (SVP, Data & Informa...
Data Storytelling in the Digital Age by Stephanie Salmon (SVP, Data & Informa...Data Storytelling in the Digital Age by Stephanie Salmon (SVP, Data & Informa...
Data Storytelling in the Digital Age by Stephanie Salmon (SVP, Data & Informa...Hilary Ip
 
Seven Steps to Building Out Newsletters by Michael Liss (VP, Product, New Yor...
Seven Steps to Building Out Newsletters by Michael Liss (VP, Product, New Yor...Seven Steps to Building Out Newsletters by Michael Liss (VP, Product, New Yor...
Seven Steps to Building Out Newsletters by Michael Liss (VP, Product, New Yor...Hilary Ip
 
Estelle Ayer
Estelle AyerEstelle Ayer
Estelle AyerHilary Ip
 
Adrian Gregory
Adrian GregoryAdrian Gregory
Adrian GregoryHilary Ip
 
Mark Wilson
Mark Wilson Mark Wilson
Mark Wilson Hilary Ip
 
Nathan Jacob
Nathan JacobNathan Jacob
Nathan JacobHilary Ip
 
Fireside chat slide
Fireside chat slide Fireside chat slide
Fireside chat slide Hilary Ip
 

More from Hilary Ip (20)

Living in Color: Carving Out Safe Spaces For Community by Danielle Cadet (Man...
Living in Color: Carving Out Safe Spaces For Community by Danielle Cadet (Man...Living in Color: Carving Out Safe Spaces For Community by Danielle Cadet (Man...
Living in Color: Carving Out Safe Spaces For Community by Danielle Cadet (Man...
 
Testing New Revenue Streams by Stefanie Rapp (SVP, Revenue Strategy, Bleacher...
Testing New Revenue Streams by Stefanie Rapp (SVP, Revenue Strategy, Bleacher...Testing New Revenue Streams by Stefanie Rapp (SVP, Revenue Strategy, Bleacher...
Testing New Revenue Streams by Stefanie Rapp (SVP, Revenue Strategy, Bleacher...
 
Building A New Ecosystem: The Role of Partnerships at an OTT Service by Justi...
Building A New Ecosystem: The Role of Partnerships at an OTT Service by Justi...Building A New Ecosystem: The Role of Partnerships at an OTT Service by Justi...
Building A New Ecosystem: The Role of Partnerships at an OTT Service by Justi...
 
How PBS Creates YouTube Series that Educate, Entertain & Inspire by Adam Dyle...
How PBS Creates YouTube Series that Educate, Entertain & Inspire by Adam Dyle...How PBS Creates YouTube Series that Educate, Entertain & Inspire by Adam Dyle...
How PBS Creates YouTube Series that Educate, Entertain & Inspire by Adam Dyle...
 
Telling Better Stories Across the Open Web by Adam Greenberg (Sr. Global Prod...
Telling Better Stories Across the Open Web by Adam Greenberg (Sr. Global Prod...Telling Better Stories Across the Open Web by Adam Greenberg (Sr. Global Prod...
Telling Better Stories Across the Open Web by Adam Greenberg (Sr. Global Prod...
 
Data Storytelling in the Digital Age by Stephanie Salmon (SVP, Data & Informa...
Data Storytelling in the Digital Age by Stephanie Salmon (SVP, Data & Informa...Data Storytelling in the Digital Age by Stephanie Salmon (SVP, Data & Informa...
Data Storytelling in the Digital Age by Stephanie Salmon (SVP, Data & Informa...
 
Seven Steps to Building Out Newsletters by Michael Liss (VP, Product, New Yor...
Seven Steps to Building Out Newsletters by Michael Liss (VP, Product, New Yor...Seven Steps to Building Out Newsletters by Michael Liss (VP, Product, New Yor...
Seven Steps to Building Out Newsletters by Michael Liss (VP, Product, New Yor...
 
John M
John MJohn M
John M
 
Maike S
Maike SMaike S
Maike S
 
Joe C
Joe CJoe C
Joe C
 
Philip R
Philip RPhilip R
Philip R
 
Michael W
Michael WMichael W
Michael W
 
Nick C
Nick CNick C
Nick C
 
Tyler M
Tyler MTyler M
Tyler M
 
Estelle Ayer
Estelle AyerEstelle Ayer
Estelle Ayer
 
Adrian Gregory
Adrian GregoryAdrian Gregory
Adrian Gregory
 
Mark Wilson
Mark Wilson Mark Wilson
Mark Wilson
 
Nathan Jacob
Nathan JacobNathan Jacob
Nathan Jacob
 
Fireside chat slide
Fireside chat slide Fireside chat slide
Fireside chat slide
 
Kate Tovey
Kate ToveyKate Tovey
Kate Tovey
 

Recently uploaded

BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort ServiceBDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort ServiceDelhi Call girls
 
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024eCommerce Institute
 
Presentation on Engagement in Book Clubs
Presentation on Engagement in Book ClubsPresentation on Engagement in Book Clubs
Presentation on Engagement in Book Clubssamaasim06
 
Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510Vipesco
 
Navi Mumbai Call Girls Service Pooja 9892124323 Real Russian Girls Looking Mo...
Navi Mumbai Call Girls Service Pooja 9892124323 Real Russian Girls Looking Mo...Navi Mumbai Call Girls Service Pooja 9892124323 Real Russian Girls Looking Mo...
Navi Mumbai Call Girls Service Pooja 9892124323 Real Russian Girls Looking Mo...Pooja Nehwal
 
SaaStr Workshop Wednesday w: Jason Lemkin, SaaStr
SaaStr Workshop Wednesday w: Jason Lemkin, SaaStrSaaStr Workshop Wednesday w: Jason Lemkin, SaaStr
SaaStr Workshop Wednesday w: Jason Lemkin, SaaStrsaastr
 
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Mohamad C

  • 1. Reinforcement Learning in the Wild & Lessons Learned. Mohamad Charafeddine (@mohamadtweets), Director of Tech Planning, AI Team, Samsung SDS Research America. April 12th, 2018. Copyright © 2018 Samsung SDS, Inc. All rights reserved.
  • 2. “Theory is the first term in the Taylor series of practice” – Tom Cover, Stanford Professor of Information Theory, in his 1990 Shannon Lecture. Practice = Theory + higher order terms (“the wild”).
  • 3. Theory vs. Practice (chart axes: Time, Complexity). Optimize “engagement” through AI personalization: • What does “engagement” mean? • From whose point of view? User? Company? Society? • For what time horizon? Days? Weeks? Years? • .. • Unintended 2nd order effect: amplification of echo chambers • Unintended 3rd order effect: ? • Should the AI objective function be open-source and auditable? • Should the AI objective function imitate/learn from humans?
  • 4. Takeaways from this talk: • Introduce Reinforcement Learning and its breadth of potential applications • Showcase some RL examples • Provide a framework to better evaluate RL application areas in terms of risk and design challenges from a PM point of view
  • 5. I. Reinforcement Learning Intro II. 3 RL Use Cases III. Lessons Learned
  • 6. Reinforcement Learning. Diagram: the RL Agent sends Actions to the Environment and receives as Inputs the Rewards and the Perceived State of the environment.
  • 7. In the context of an Atari game, the long-term objective is the score. Diagram: the RL Agent sends Actions to the Environment and receives as Inputs the Rewards (score) and the State of the environment (frames on the screen). https://deepmind.com/research/publications/playing-atari-deep-reinforcement-learning/
  • 8. .. But just focusing on the score in the reward function can sometimes backfire! The boat can spin in loops collecting goodies but never finish the race: an unintended higher-order effect. Diagram: the RL Agent sends Actions to the Environment and receives as Inputs the Rewards (score) and the State of the environment. https://blog.openai.com/faulty-reward-functions/
  • 9. .. But just focusing on the score in the reward function can sometimes backfire! The boat can spin in loops collecting goodies but never finish the race: an unintended higher-order effect. Rewards: score. https://blog.openai.com/faulty-reward-functions/
  • 11. Many potential use cases of RL, where the environment is: Object (physics, etc.), Human (preferences, ..), Biology (& chemical, ..), Market (multiple agents), Code, .. Environment labels: Robotics, Industrial, Manufacturing, Social, Content, Wellness, Healthcare, Pharmaceutical, Agriculture, Advertising, Marketing, Enterprise, Finance, Games, Security, E-commerce, Networking, Another AI system, ..
  • 12. Deep Learning has succeeded in breaking into practice: Marketing, Sales, Customer Support, Security, Recruiting, Education, Investment, Legal, Logistics, Healthcare, Wellness, Automotive, Manufacturing, Agriculture, Personal Assistants, Speech/Image/Video recognition, Advertising, .. Reinforcement Learning has succeeded in fewer areas: Games, Robotics, Chatbots, Manufacturing, Wellness, Automotive*, Marketing*, Customer Support*, Agriculture*, Advertising*, ..
  • 13. Two AI stacks are compared, each with the layers Data, Features, Models, Perception, Insights, Recommendation, Decision, Value, grouped into a Perception part and a Decision part. Most AI applications today: Deep Learning for Perception, Human-designed Rules for Decision. Future AI applications (tech + ethical challenges): Deep Learning for Perception, Reinforcement Learning for Decision. Slowly, human decision rules will be replaced with AI decisions. Such a move, inherent to RL applications, opens the door to ethics and decision-governance questions.
  • 14. Where to start to bring RL to practice? Risk profile? And how to build a framework to qualify application areas? To apply RL, we need to understand the characteristics (metadata) of the problems it is most appropriate for.
  • 15. I. Reinforcement Learning Intro II. 3 RL Use Cases III. Lessons Learned
  • 16. Traffic Lights IoT Optimization. Controlled Environment: Simulator. • 2 sensors on each lane that measure the # of cars that pass & the speed of cars • 2 intersections with 4 lights each • Goal is to optimize for flow rate (# cars/sec)
  • 17. Traffic Lights IoT Optimization. Environment: flow of cars in intersections. Inputs (State): stream of data from 16 sensor outputs (car counter + speed). DRL Agent: LSTM, Online Learning. Actions: 8 discrete actions. Reward: direct. End-to-End: no feature engineering, just feeding the raw stream of data.
  • 18. Results: ~30% reduction in total Travel Time. Chart labels: Uneven Flow, High Flow; 32% faster, 39%, 30%.
  • 19. Theory to Practice journey: • What happens if I want to apply this to a city? • How to handle 10s of thousands of actions? • How to characterize convergence and robustness? • How fast can it adapt to changes? • … • Cost of sensors? Use of traffic data on Google Maps? Etc. • Can Autonomous Vehicles play a role as extra control knobs?
  • 20. RL for traffic management using Autonomous Vehicles, on the road with human drivers (2 lanes). Simulation: Wu, et al., IEEE T-RO, 2018. Ion Stoica - RL Systems @ RISELab at UC Berkeley, ScaledML Conference by Matroid, March 2018, Stanford Univ. https://www.youtube.com/watch?v=-KC3tO4BDuQ
  • 21. I. Reinforcement Learning Intro II. 3 RL Use Cases III. Lessons Learned
  • 22. Applying DRL to Storage Servers to optimize operational efficiency. Environment: 24 SSD Drives. State: function of workload (Reads or Writes) and temperature. Actions: Fan Speeds. Reward: a function of Temperatures & Fan Speeds (plot label: Desired operation region). RL Agent: Advantage Actor-Critic (A2C). S. Srinivasa, G. Kathalagiri, J. Varanasi, L. Quintela, M. Charafeddine, C. Lee, “On Optimizing Operational Efficiency in Storage Systems Via Deep Reinforcement Learning”, submitted to ECML PKDD.
  • 23. Learning… over a few days. RL Agent action: Fan Speeds. State: different workloads, Temperature. Reward. • Learning directly on the real environment (no simulator) • Model-free: does not require any knowledge of the SSD server behavior dynamics • Exposed to different stochastic workloads
  • 24. Performance for Idle vs. Heavy I/O workloads on the operational contours: Status Quo controller vs. using Deep Reinforcement Learning. Plot label: Desired Operational Region.
  • 25. Performance for different workloads. At the beginning of training, the algorithm is exploring and learning. Once it has finished learning the right policy, operational behavior is within the desired region. Plot labels: Desired Operational Region; resulting distribution from different workloads.
  • 26. I. Reinforcement Learning Intro II. 3 RL Use Cases III. Lessons Learned
  • 27. RL in Digital Marketing: Ads Spend Optimization for leads demand generation
  • 28. Demand gen challenges. Financial Services: daily leads qualification volume constraints. Hospitality: changing inventory (Hotels, Car Rentals). Food Apps: limited supply & time sensitivity challenges. Retail: limited # of inventories, discounts. Marketing bidding vs. competitors. Under-producing demand (Demand < Supply): opportunity loss. Over-producing demand (Demand > Supply): overspending. Chart label: Maximum Gross Profit.
  • 29. Setup: Marketing Demand Gen Optimization. Every hour, decide the hourly Cost Per Click (CPC) for 8 Search Engine Marketing (SEM) accounts to optimize Gross Profit: sum over 24 hrs of (Hourly Lead Gen Referral Revenue - Hourly Cust. Acquisition Cost from SEM). State: Previous CPC per SEM Account, Hour of Day, Day of Week, .. Reward: Gross Profit. RL Agent: TRPO, Importance Sampling. Environment: Marketplace webpage.
  • 30. Results for SEM Demand Gen: 12-20% gross profit uplift. Plot labels: Gross Profit, Cumulative Demand Spend, Joint Decisions, Gross Margin. A. Beloi, M. Charafeddine, G. Kathalagiri, A. Mishra, L. Quintela, S. Srinivasa, patent filed: “Spending Allocation in Multi-Channel Digital Marketing” (U.S. Application No.: 20180047039).
  • 31. I. Reinforcement Learning Intro II. 3 RL Use Cases III. Lessons Learned
  • 32. Environment, EASIER vs. HARDER: Fully observable vs. Partially observable; Low dimensionality to represent vs. High dimensionality to represent; Time-invariant (if I conduct the experiment now or next week, it’s the same) vs. Time-variant; Well-behaved vs. Stochastic w/ Fat Tails. We bring in the concept of “Environment Coherence Time tc”, borrowed from digital communications, to characterize how the “channel” or “environment” is changing.
  • 33. Reward, EASIER vs. HARDER: Objective vs. Subjective (mostly dealing w/ humans; prone to PM/Data Scientist bias; has an ethical dimension); Monolithic vs. Composite (e.g., Robotics: get closer → move arm → orient → pick → move → stack); Direct vs. Indirect w/ a lag (e.g., Marketing); Simple to describe vs. Complex, need AI (Inverse RL) to learn it (how to prune bad actors?). Reward is the most challenging dimension for Product Management.
  • 34. Actions, EASIER vs. HARDER: Discrete, small number (~< 20) vs. Discrete, large # (100s, 1K, ..), which needs Hierarchical Actions; Continuous, small number (Self-Driving Cars: Gas, Brake, Steering) vs. Continuous, large number (Ad Spend CPC per Keyword); Static vs. Dynamic with time (new Ads added, removed, ..).
  • 35. Exploration / Cost of Learning, EASIER vs. HARDER: A high-fidelity simulator exists (in a game, Simulator = Environment) vs. Low-fidelity simulator or none; Can run many parallel experiments vs. Only 1 experiment at a time (marketplace that’s hard to simulate); $0, no impact vs. $$$ or Humans involved (Ad Spend, Healthcare, Social media, ..); Fast learning episodes vs. Long-cycle learning episodes (wellness, marketing re-targeting).
  • 36. RL in a Lab vs. RL in the Wild: RL for a Game, where Simulator & Environment are 100% the same vs. there is a Simulator & Environment gap; ~0 Exploration Cost vs. $-$$$ Exploration Cost; Environment is Time-Invariant vs. Environment can be Time-Variant; Direct, instant feedback vs. More complex feedback that can be indirect or w/ lag; Unconstrained Convergence Time vs. Convergence Time << Env Coherence Time; Big Data vs. Big Data & Small Data.
  • 37. Controlled Environment vs. Wild Environment (Simulator ≠ Reality); Low Exploration Risk vs. High Exploration Risk. Verticals plotted: Healthcare, Wellness, AgTech, Trading, Manufacturing, Marketing.
  • 38. Advice to AI entrepreneurs planning their journey into the wild: pick your vertical wisely. It decides the macro terms that you will face.
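
The gross-profit objective stated on slide 29 (the sum over 24 hours of hourly lead-gen referral revenue minus hourly customer acquisition cost from SEM) can be sketched as a small reward function. This is only an illustration of that formula, not the authors' implementation: the function name `daily_gross_profit` and the sample numbers are assumptions made up for this example.

```python
def daily_gross_profit(hourly_revenue, hourly_acquisition_cost):
    """Sum over 24 hours of (referral revenue - SEM acquisition cost).

    Both arguments are sequences of 24 hourly amounts in the same currency.
    The name and signature are hypothetical, chosen for this sketch.
    """
    if len(hourly_revenue) != 24 or len(hourly_acquisition_cost) != 24:
        raise ValueError("expected 24 hourly values for each series")
    return sum(r - c for r, c in zip(hourly_revenue, hourly_acquisition_cost))


# Made-up numbers, for illustration only.
revenue = [100.0] * 24
cost = [60.0] * 24
print(daily_gross_profit(revenue, cost))
```

Because the reward is only known once the day's revenue and spend are in, an agent setting hourly CPCs receives its feedback with a lag, which is the "indirect w/ a lag" case listed under Reward on slide 33.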

Editor's Notes

  1. Reinforcement Learning is an area of machine learning in which an agent learns, from interaction with an environment, what actions to take in order to optimize a long-term objective (the expectation of cumulative rewards).
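
The definition in this note can be made concrete with a minimal sketch of the agent-environment loop. Everything here is an assumption chosen for illustration and none of it comes from the talk: the stateless two-armed bandit environment, its payout probabilities, the epsilon-greedy agent, and the discount factor `gamma`.

```python
import random


def discounted_return(rewards, gamma=0.99):
    """Cumulative reward for one episode: sum of gamma**t * r_t."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


class TwoArmedBandit:
    """Hypothetical toy environment: arm 1 pays off more often than arm 0."""

    def __init__(self, rng):
        self.rng = rng
        self.payout_prob = [0.2, 0.8]  # assumed values, for illustration

    def step(self, action):
        # Reward is 1.0 with the chosen arm's payout probability, else 0.0.
        return 1.0 if self.rng.random() < self.payout_prob[action] else 0.0


def run_episode(steps=1000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: act, observe the reward, update value estimates."""
    rng = random.Random(seed)
    env = TwoArmedBandit(rng)
    values = [0.0, 0.0]  # running mean reward per action
    counts = [0, 0]
    rewards = []
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(2)  # explore
        else:
            action = 0 if values[0] >= values[1] else 1  # exploit
        reward = env.step(action)
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]
        rewards.append(reward)
    return rewards, values


rewards, values = run_episode()
print(discounted_return(rewards, gamma=1.0), values)
```

A bandit has no state, so this sketch leaves out the "state of the environment" input shown on slide 6; it only shows the act, observe reward, update cycle and how the cumulative reward is computed.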