Deep Reinforcement Learning in
Machine Learning
June 18-20
Caesars Palace, Las Vegas, NV
Although the term artificial intelligence has been around since
the 1950s, there was a major shift towards machine learning
from the late 1990s through the early 2000s. Reinforcement
learning began to gain popularity in the 2000s and has become
one of the most promising algorithmic techniques on the
landscape of artificial intelligence in recent years. The 1980s
saw knowledge-based systems that tried to equip machines
with common sense and knowledge. There seemed to be no end
to the number of rules that had to be programmed to power
such systems. This not only significantly increased the cost of
building them, it also slowed down the effort to re-create
common sense in machines. The trend shifted towards
machine learning to avoid hand-encoding millions of rules and
embedding them into the machine. Machine learning instead
learns the rules automatically from a pile of data. Industry
shifted its focus onto machine learning and abandoned
knowledge-based systems. Through the 2000s, AI researchers
implemented a number of machine learning approaches,
including Bayesian networks, bio-inspired evolutionary
algorithms, Markov methods, and support vector machines.
Neural networks shot to fame in 2012 with the introduction
of deep learning techniques built on many-layered neural
networks.
The third and final shift in the artificial
intelligence research community has been towards
reinforcement learning. Moving away from feeding
machines labeled data through supervised learning,
the research community has ignited the world by
training neural networks through states, actions,
rewards, policies, and value functions. DeepMind’s
AlphaGo, introduced in 2015, was trained by a
combination of supervised learning from human
expert games and reinforcement learning from
self-play. It went on to defeat a world champion
at the ancient game of Go. AlphaGo uses value
networks to evaluate board positions and policy
networks to select moves. Without any lookahead
search, these networks play at the level of
state-of-the-art Monte Carlo tree search programs
that simulate thousands of random games of
self-play. DeepMind also developed a search
algorithm that combines Monte Carlo simulation
with the value and policy networks. With it,
AlphaGo achieved a 99.8% winning rate against
other Go programs and defeated the European Go
champion 5-0, as well as other human professional
players.
D Silver et al. Nature 529, 484–489 (2016) doi:10.1038/nature16961
DeepMind’s AlphaGo neural network training pipeline and reinforcement learning architecture
The Future of Reinforcement Learning
MIT Technology Review downloaded 16,625 research
papers that are publicly available on arXiv under the
computer science and artificial intelligence section
through November 2018. Using natural language
processing on the abstracts, it tracked words such as
constraint, theory, rule, logic, program, learning,
network, data, task, and performance to chart the
shifts in the field, including the recent
reinforcement learning boom. The trends show the
rise of traditional neural networks in the 1950s and
1960s, symbolic approaches in the 1970s,
knowledge-based and rule-based systems in the 1980s,
Bayesian networks in the 1990s, support vector
machines in the 2000s, and the return of neural
networks in the 2010s with the heavy adoption of
deep neural networks.
DeepTraffic - Reinforcement Learning
DeepTraffic is a reinforcement learning simulation from the MIT
DeepTraffic competition, in which self-driving cars drive on a
multi-lane freeway using a model-free, off-policy reinforcement
learning process. The competition received 24K entries and has
inspired a number of data scientists and machine learning
enthusiasts to evaluate Deep Q-Network (DQN) variants and
hyperparameter configurations. Training involved episodic
iterations totalling 96.6 years of RL simulations and 572.2 million
crowdsourced and optimized DQN hyperparameters to train the
agents successfully. DeepTraffic is implemented completely in
JavaScript. Deep reinforcement learning has also shown a
promising future in model-based control with the MuJoCo physics
engine, and significant advances in the Arcade Learning
Environment and the Atari game environments used by DeepMind.
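The model-free, off-policy learning described for DeepTraffic can be illustrated with the tabular Q-learning update that a DQN approximates with a neural network. The following is a minimal Python sketch under stated assumptions, not DeepTraffic's actual JavaScript implementation: the state names, the action list, the function names, and the values of alpha, gamma, and epsilon are all hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical driving actions; DeepTraffic's real action set may differ.
ACTIONS = ["keep_lane", "left", "right", "accelerate", "brake"]


def q_learning_update(q, state, action, reward, next_state,
                      actions=ACTIONS, alpha=0.1, gamma=0.9):
    """Apply one off-policy temporal-difference update to q in place.

    The target uses the greedy value of next_state, regardless of which
    action the behaviour policy actually takes next (hence "off-policy").
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def epsilon_greedy(q, state, actions=ACTIONS, epsilon=0.1, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Q-table that defaults to 0.0 for unseen (state, action) pairs.
q_table = defaultdict(float)
q_table[("clear_road", "accelerate")] = 1.0

# One illustrative transition: changing lanes from behind a car earns reward 1.
updated = q_learning_update(q_table, "behind_car", "left", 1.0, "clear_road",
                            alpha=0.5, gamma=0.9)
```

A DQN replaces the table with a network that maps a state to one Q-value per action, but the update target has the same form: reward plus the discounted best next-state value.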
Markov Decision Processes
A number of reinforcement learning methods can be
applied in the field of robotics, such as policy optimization,
model-free reinforcement learning, policy gradients with
trust region policy optimization, proximal policy
optimization, bootstrapping, Monte Carlo methods,
actor-critic methods, on-policy learning (SARSA), off-policy
learning (Q-learning), Deep Q-Networks, Markov decision
processes, and dynamic programming. The majority of the
function approximations are built on the mathematical
foundation of Markov decision processes, with optimal
state-value and Q-value functions defined over states and
state-action pairs. In Atari games, the state representation
illustrated here would also include past frames. An
infinite-horizon discounted Markov decision process is
specified by the tuple (S, A, P, R, γ, d0), where
S: finite state space
A: finite action space
P: S×A→∆(S), transition function
R: S×A→∆([0,Rmax]), reward function
γ∈[0,1), discount factor
d0∈∆(S), initial state distribution
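To make the tuple concrete, here is a minimal Python sketch that encodes a tiny MDP and solves it with value iteration, one of the dynamic programming methods. The two-state MDP, the names `value_iteration`, `P`, and `R`, and the tolerance are illustrative assumptions. For simplicity, `R` stores the expected reward of each state-action pair rather than a distribution over [0, Rmax], and the initial state distribution d0 is left out because it is not needed to compute optimal values.

```python
def value_iteration(states, actions, P, R, gamma, tol=1e-8):
    """Return an approximation of the optimal state-value function V*.

    P[(s, a)] maps each next state to its probability, i.e. an element of
    Delta(S). R[(s, a)] is the expected reward (an assumption: the tuple
    above defines R as a distribution over [0, Rmax]).
    """
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Bellman optimality backup: best one-step lookahead over actions.
            best = max(
                R[(s, a)] + gamma * sum(p * V[s2] for s2, p in P[(s, a)].items())
                for a in actions
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


# Hypothetical two-state MDP: staying in "goal" pays 1, everything else pays 0.
states = ["start", "goal"]
actions = ["stay", "move"]
P = {
    ("start", "stay"): {"start": 1.0},
    ("start", "move"): {"goal": 1.0},
    ("goal", "stay"): {"goal": 1.0},
    ("goal", "move"): {"start": 1.0},
}
R = {
    ("start", "stay"): 0.0,
    ("start", "move"): 0.0,
    ("goal", "stay"): 1.0,
    ("goal", "move"): 0.0,
}
V = value_iteration(states, actions, P, R, gamma=0.9)
```

With γ = 0.9 the values converge to about 10 for `goal`, since staying there forever is worth 1/(1−γ), and about 9 for `start`, which is one discounted step away.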
Atari Game Zoo
Deep reinforcement learning agents have not only made
significant progress in the field of robotics, but in many
instances have surpassed human performance on
benchmarks such as Atari 2600 games and Dota 2.
Uber has also applied reinforcement learning
algorithms to improve Uber Eats recommendations and
self-driving cars. Uber has built the Atari Zoo on the
Arcade Learning Environment (ALE), which provides Atari
2600 console games such as Seaquest, Montezuma’s
Revenge, and Pitfall. The objective of the Atari Zoo,
however, is not to compare high-scoring solutions and
hyperparameter configurations across multiple
algorithms, but to help understand what the trained
agents have learned. For example, the evolutionary
algorithms from OpenAI Gym have shown different types
of learned representations than the gradient-based
methods.
The machine intelligence of algorithms is now distributed in cloud-computing environments and will help
organizations discover valuable insights and perform a range of operations through APIs. Organizations
are mass-producing algorithms because doing so achieves economies of scale in a distributed environment. Artificial
intelligence is the new inferno thawing the AI winter (which lasted from the 1990s through the 2010s), with machine
intelligence platforms that use machine learning to move rapidly from sandbox prototypes to production deployments.
Figure: Wang, H., & Raj, B. (2017). On the Origin of Deep Learning
Intel optimized deep learning and machine learning frameworks
Figure: Intel Deep Learning and Machine Learning Frameworks (Alberto,
2016).
Figure: Nvidia deep learning frameworks with DGX (Nvidia, 2016).
Figure: IBM PowerAI: AI Platform (Soutter, 2016).
Figure: Deep Water: Open source deep learning framework (H2O.AI, 2017).
MXNet Deep Learning Framework
Figure: MXNet for deep learning (DMLC, 2017).
References
Fridman, L. (2019). Tutorials, assignments, and competitions for MIT Deep Learning related
courses. Retrieved from https://github.com/lexfridman/mit-deep-learning
Fridman, L., Terwilliger, J., & Jenik, B. (2018). DeepTraffic: Crowdsourced Hyperparameter Tuning
of Deep Reinforcement Learning Systems for Multi-Agent Dense Traffic Navigation. Retrieved from
https://arxiv.org/abs/1801.02805
Hao, K. (2019, January 25). We analyzed 16,625 papers to figure out where AI is headed next. MIT
Technology Review. Retrieved from
https://www.technologyreview.com/s/612768/we-analyzed-16625-papers-to-figure-out-where-ai-is-headed-next/
Jiang, N. (2019). On Value Functions and the Agent-Environment Boundary. Retrieved from
https://arxiv.org/pdf/1905.13341.pdf
Petroski, F., Madhavan, V., Liu, R., Wang, R., Li, Y., Clune, J., & Lehman, J. (2019). AI Creating a Zoo
of Atari-Playing Agents to Catalyze the Understanding of Deep Reinforcement Learning. Retrieved
from https://eng.uber.com/atari-zoo-deep-reinforcement-learning/
Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Driessche, G. V., ... Grewe, D. (2016,
January 28). Mastering the game of Go with deep neural networks and tree search. Nature, 529,
484-489. Retrieved from https://www.nature.com/articles/nature16961

More Related Content

Similar to Deep Reinforcement Leaning In Machine Learning

Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Turi, Inc.
 
Ai open powermeetupmarch25th
Ai open powermeetupmarch25thAi open powermeetupmarch25th
Ai open powermeetupmarch25thIBM
 
Ai open powermeetupmarch25th
Ai open powermeetupmarch25thAi open powermeetupmarch25th
Ai open powermeetupmarch25thIBM
 
Intel 20180608 v2
Intel 20180608 v2Intel 20180608 v2
Intel 20180608 v2ISSIP
 
Vertex perspectives artificial intelligence
Vertex perspectives   artificial intelligenceVertex perspectives   artificial intelligence
Vertex perspectives artificial intelligenceYanai Oron
 
Vertex Perspectives | Artificial Intelligence
Vertex Perspectives | Artificial IntelligenceVertex Perspectives | Artificial Intelligence
Vertex Perspectives | Artificial IntelligenceVertex Holdings
 
Top 10 Most Demand IT Certifications Course in 2020 - MildainTrainings
Top 10 Most Demand IT Certifications Course in 2020 - MildainTrainingsTop 10 Most Demand IT Certifications Course in 2020 - MildainTrainings
Top 10 Most Demand IT Certifications Course in 2020 - MildainTrainingsMildain Solutions
 
楽天技術研究所の次世代AI 技術への挑戦
楽天技術研究所の次世代AI 技術への挑戦楽天技術研究所の次世代AI 技術への挑戦
楽天技術研究所の次世代AI 技術への挑戦Rakuten Group, Inc.
 
Machine Learning in Cyber Security Domain
Machine Learning in Cyber Security Domain Machine Learning in Cyber Security Domain
Machine Learning in Cyber Security Domain BGA Cyber Security
 
Ai open powermeetupmarch25th
Ai open powermeetupmarch25thAi open powermeetupmarch25th
Ai open powermeetupmarch25thIBM
 
Trends in Training and Simulation Technologies
Trends in Training and Simulation TechnologiesTrends in Training and Simulation Technologies
Trends in Training and Simulation TechnologiesAndy Fawkes
 
GPSBUS201-GPS Demystifying Artificial Intelligence
GPSBUS201-GPS Demystifying Artificial IntelligenceGPSBUS201-GPS Demystifying Artificial Intelligence
GPSBUS201-GPS Demystifying Artificial IntelligenceAmazon Web Services
 
Benefiting from Semantic AI along the data life cycle
Benefiting from Semantic AI along the data life cycleBenefiting from Semantic AI along the data life cycle
Benefiting from Semantic AI along the data life cycleMartin Kaltenböck
 
[第45回 Machine Learning 15minutes! Broadcast] Azure AI - Build 2020 Updates
[第45回 Machine Learning 15minutes! Broadcast] Azure AI - Build 2020 Updates[第45回 Machine Learning 15minutes! Broadcast] Azure AI - Build 2020 Updates
[第45回 Machine Learning 15minutes! Broadcast] Azure AI - Build 2020 UpdatesNaoki (Neo) SATO
 
B4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningB4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningHoa Le
 
Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)Amazon Web Services
 
Build, Train, and Deploy ML Models at Scale
Build, Train, and Deploy ML Models at ScaleBuild, Train, and Deploy ML Models at Scale
Build, Train, and Deploy ML Models at ScaleAmazon Web Services
 
[Srijan Wednesday Webinars] Artificial Intelligence & the Future of Business
[Srijan Wednesday Webinars] Artificial Intelligence & the Future of Business[Srijan Wednesday Webinars] Artificial Intelligence & the Future of Business
[Srijan Wednesday Webinars] Artificial Intelligence & the Future of BusinessSrijan Technologies
 

Similar to Deep Reinforcement Leaning In Machine Learning (20)

Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015
 
Ai open powermeetupmarch25th
Ai open powermeetupmarch25thAi open powermeetupmarch25th
Ai open powermeetupmarch25th
 
Ai open powermeetupmarch25th
Ai open powermeetupmarch25thAi open powermeetupmarch25th
Ai open powermeetupmarch25th
 
NVIDIA GTC21 AI Conference Highlights
NVIDIA GTC21 AI Conference Highlights NVIDIA GTC21 AI Conference Highlights
NVIDIA GTC21 AI Conference Highlights
 
Intel 20180608 v2
Intel 20180608 v2Intel 20180608 v2
Intel 20180608 v2
 
Vertex perspectives artificial intelligence
Vertex perspectives   artificial intelligenceVertex perspectives   artificial intelligence
Vertex perspectives artificial intelligence
 
Vertex Perspectives | Artificial Intelligence
Vertex Perspectives | Artificial IntelligenceVertex Perspectives | Artificial Intelligence
Vertex Perspectives | Artificial Intelligence
 
Top 10 Most Demand IT Certifications Course in 2020 - MildainTrainings
Top 10 Most Demand IT Certifications Course in 2020 - MildainTrainingsTop 10 Most Demand IT Certifications Course in 2020 - MildainTrainings
Top 10 Most Demand IT Certifications Course in 2020 - MildainTrainings
 
楽天技術研究所の次世代AI 技術への挑戦
楽天技術研究所の次世代AI 技術への挑戦楽天技術研究所の次世代AI 技術への挑戦
楽天技術研究所の次世代AI 技術への挑戦
 
Machine Learning in Cyber Security Domain
Machine Learning in Cyber Security Domain Machine Learning in Cyber Security Domain
Machine Learning in Cyber Security Domain
 
Ai open powermeetupmarch25th
Ai open powermeetupmarch25thAi open powermeetupmarch25th
Ai open powermeetupmarch25th
 
Trends in Training and Simulation Technologies
Trends in Training and Simulation TechnologiesTrends in Training and Simulation Technologies
Trends in Training and Simulation Technologies
 
GPSBUS201-GPS Demystifying Artificial Intelligence
GPSBUS201-GPS Demystifying Artificial IntelligenceGPSBUS201-GPS Demystifying Artificial Intelligence
GPSBUS201-GPS Demystifying Artificial Intelligence
 
Benefiting from Semantic AI along the data life cycle
Benefiting from Semantic AI along the data life cycleBenefiting from Semantic AI along the data life cycle
Benefiting from Semantic AI along the data life cycle
 
[第45回 Machine Learning 15minutes! Broadcast] Azure AI - Build 2020 Updates
[第45回 Machine Learning 15minutes! Broadcast] Azure AI - Build 2020 Updates[第45回 Machine Learning 15minutes! Broadcast] Azure AI - Build 2020 Updates
[第45回 Machine Learning 15minutes! Broadcast] Azure AI - Build 2020 Updates
 
B4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningB4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearning
 
Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)Amazon SageMaker 內建機器學習演算法 (Level 400)
Amazon SageMaker 內建機器學習演算法 (Level 400)
 
Build, Train, and Deploy ML Models at Scale
Build, Train, and Deploy ML Models at ScaleBuild, Train, and Deploy ML Models at Scale
Build, Train, and Deploy ML Models at Scale
 
[Srijan Wednesday Webinars] Artificial Intelligence & the Future of Business
[Srijan Wednesday Webinars] Artificial Intelligence & the Future of Business[Srijan Wednesday Webinars] Artificial Intelligence & the Future of Business
[Srijan Wednesday Webinars] Artificial Intelligence & the Future of Business
 
Null
NullNull
Null
 

More from InterCon

Getting Started With IoT – Guidebook: Presented by Anu Taksali, CEO of Dhanuk...
Getting Started With IoT – Guidebook: Presented by Anu Taksali, CEO of Dhanuk...Getting Started With IoT – Guidebook: Presented by Anu Taksali, CEO of Dhanuk...
Getting Started With IoT – Guidebook: Presented by Anu Taksali, CEO of Dhanuk...InterCon
 
Cross Border Blockchain Equity/Capital Market Services And Compliance: Presen...
Cross Border Blockchain Equity/Capital Market Services And Compliance: Presen...Cross Border Blockchain Equity/Capital Market Services And Compliance: Presen...
Cross Border Blockchain Equity/Capital Market Services And Compliance: Presen...InterCon
 
Transitioning Your Business Model - From Services To Subscriptions: Presented...
Transitioning Your Business Model - From Services To Subscriptions: Presented...Transitioning Your Business Model - From Services To Subscriptions: Presented...
Transitioning Your Business Model - From Services To Subscriptions: Presented...InterCon
 
IoT Now And In The Future: Presented by Niroshan Madampitige, Head of Deliver...
IoT Now And In The Future: Presented by Niroshan Madampitige, Head of Deliver...IoT Now And In The Future: Presented by Niroshan Madampitige, Head of Deliver...
IoT Now And In The Future: Presented by Niroshan Madampitige, Head of Deliver...InterCon
 
Data is the New Oil: Presented By Naveen Narayanan, Global Client Partner of ...
Data is the New Oil: Presented By Naveen Narayanan, Global Client Partner of ...Data is the New Oil: Presented By Naveen Narayanan, Global Client Partner of ...
Data is the New Oil: Presented By Naveen Narayanan, Global Client Partner of ...InterCon
 
Can Blockchain Disrupt Or Even Destroy The Cloud? : Presented by Suhas Patil,...
Can Blockchain Disrupt Or Even Destroy The Cloud? : Presented by Suhas Patil,...Can Blockchain Disrupt Or Even Destroy The Cloud? : Presented by Suhas Patil,...
Can Blockchain Disrupt Or Even Destroy The Cloud? : Presented by Suhas Patil,...InterCon
 
E-Commerce Automation: Presented by Siddhartha Choudhary, Co-Founder & CEO of...
E-Commerce Automation: Presented by Siddhartha Choudhary, Co-Founder & CEO of...E-Commerce Automation: Presented by Siddhartha Choudhary, Co-Founder & CEO of...
E-Commerce Automation: Presented by Siddhartha Choudhary, Co-Founder & CEO of...InterCon
 
5G Technology - The Future of Internet
5G Technology - The Future of Internet5G Technology - The Future of Internet
5G Technology - The Future of InternetInterCon
 
Software Security For DevOps And Continuous Deployment In The Cloud
Software Security For DevOps And Continuous Deployment In The CloudSoftware Security For DevOps And Continuous Deployment In The Cloud
Software Security For DevOps And Continuous Deployment In The CloudInterCon
 
Transitioning Your Business Model - From Services To SaaS
Transitioning Your Business Model - From Services To SaaSTransitioning Your Business Model - From Services To SaaS
Transitioning Your Business Model - From Services To SaaSInterCon
 
ML Will Redesign, Not Replace, Jobs
ML Will Redesign, Not Replace, JobsML Will Redesign, Not Replace, Jobs
ML Will Redesign, Not Replace, JobsInterCon
 
Blockchain Applications Transforming Society
Blockchain Applications Transforming SocietyBlockchain Applications Transforming Society
Blockchain Applications Transforming SocietyInterCon
 
How Are AI And ML Transforming Decision Making?
How Are AI And ML Transforming Decision Making?How Are AI And ML Transforming Decision Making?
How Are AI And ML Transforming Decision Making?InterCon
 
Boosting App Installs
Boosting App InstallsBoosting App Installs
Boosting App InstallsInterCon
 
Blockchain, Smart Contracts & IoT
Blockchain, Smart Contracts & IoTBlockchain, Smart Contracts & IoT
Blockchain, Smart Contracts & IoTInterCon
 
Phishing Attacks and Trends in Cloud Computing
Phishing Attacks and Trends in Cloud ComputingPhishing Attacks and Trends in Cloud Computing
Phishing Attacks and Trends in Cloud ComputingInterCon
 
IoT - Understanding The Shift To Edge Computing
IoT - Understanding The Shift To Edge ComputingIoT - Understanding The Shift To Edge Computing
IoT - Understanding The Shift To Edge ComputingInterCon
 
IoT Data - Like No Data We have Ever Seen
IoT Data - Like No Data We have Ever SeenIoT Data - Like No Data We have Ever Seen
IoT Data - Like No Data We have Ever SeenInterCon
 

More from InterCon (18)

Getting Started With IoT – Guidebook: Presented by Anu Taksali, CEO of Dhanuk...
Getting Started With IoT – Guidebook: Presented by Anu Taksali, CEO of Dhanuk...Getting Started With IoT – Guidebook: Presented by Anu Taksali, CEO of Dhanuk...
Getting Started With IoT – Guidebook: Presented by Anu Taksali, CEO of Dhanuk...
 
Cross Border Blockchain Equity/Capital Market Services And Compliance: Presen...
Cross Border Blockchain Equity/Capital Market Services And Compliance: Presen...Cross Border Blockchain Equity/Capital Market Services And Compliance: Presen...
Cross Border Blockchain Equity/Capital Market Services And Compliance: Presen...
 
Transitioning Your Business Model - From Services To Subscriptions: Presented...
Transitioning Your Business Model - From Services To Subscriptions: Presented...Transitioning Your Business Model - From Services To Subscriptions: Presented...
Transitioning Your Business Model - From Services To Subscriptions: Presented...
 
IoT Now And In The Future: Presented by Niroshan Madampitige, Head of Deliver...
IoT Now And In The Future: Presented by Niroshan Madampitige, Head of Deliver...IoT Now And In The Future: Presented by Niroshan Madampitige, Head of Deliver...
IoT Now And In The Future: Presented by Niroshan Madampitige, Head of Deliver...
 
Data is the New Oil: Presented By Naveen Narayanan, Global Client Partner of ...
Data is the New Oil: Presented By Naveen Narayanan, Global Client Partner of ...Data is the New Oil: Presented By Naveen Narayanan, Global Client Partner of ...
Data is the New Oil: Presented By Naveen Narayanan, Global Client Partner of ...
 
Can Blockchain Disrupt Or Even Destroy The Cloud? : Presented by Suhas Patil,...
Can Blockchain Disrupt Or Even Destroy The Cloud? : Presented by Suhas Patil,...Can Blockchain Disrupt Or Even Destroy The Cloud? : Presented by Suhas Patil,...
Can Blockchain Disrupt Or Even Destroy The Cloud? : Presented by Suhas Patil,...
 
E-Commerce Automation: Presented by Siddhartha Choudhary, Co-Founder & CEO of...
E-Commerce Automation: Presented by Siddhartha Choudhary, Co-Founder & CEO of...E-Commerce Automation: Presented by Siddhartha Choudhary, Co-Founder & CEO of...
E-Commerce Automation: Presented by Siddhartha Choudhary, Co-Founder & CEO of...
 
5G Technology - The Future of Internet
5G Technology - The Future of Internet5G Technology - The Future of Internet
5G Technology - The Future of Internet
 
Software Security For DevOps And Continuous Deployment In The Cloud
Software Security For DevOps And Continuous Deployment In The CloudSoftware Security For DevOps And Continuous Deployment In The Cloud
Software Security For DevOps And Continuous Deployment In The Cloud
 
Transitioning Your Business Model - From Services To SaaS
Transitioning Your Business Model - From Services To SaaSTransitioning Your Business Model - From Services To SaaS
Transitioning Your Business Model - From Services To SaaS
 
ML Will Redesign, Not Replace, Jobs
ML Will Redesign, Not Replace, JobsML Will Redesign, Not Replace, Jobs
ML Will Redesign, Not Replace, Jobs
 
Blockchain Applications Transforming Society
Blockchain Applications Transforming SocietyBlockchain Applications Transforming Society
Blockchain Applications Transforming Society
 
How Are AI And ML Transforming Decision Making?
How Are AI And ML Transforming Decision Making?How Are AI And ML Transforming Decision Making?
How Are AI And ML Transforming Decision Making?
 
Boosting App Installs
Boosting App InstallsBoosting App Installs
Boosting App Installs
 
Blockchain, Smart Contracts & IoT
Blockchain, Smart Contracts & IoTBlockchain, Smart Contracts & IoT
Blockchain, Smart Contracts & IoT
 
Phishing Attacks and Trends in Cloud Computing
Phishing Attacks and Trends in Cloud ComputingPhishing Attacks and Trends in Cloud Computing
Phishing Attacks and Trends in Cloud Computing
 
IoT - Understanding The Shift To Edge Computing
IoT - Understanding The Shift To Edge ComputingIoT - Understanding The Shift To Edge Computing
IoT - Understanding The Shift To Edge Computing
 
IoT Data - Like No Data We have Ever Seen
IoT Data - Like No Data We have Ever SeenIoT Data - Like No Data We have Ever Seen
IoT Data - Like No Data We have Ever Seen
 

Recently uploaded

一比一原版田纳西大学毕业证如何办理
一比一原版田纳西大学毕业证如何办理一比一原版田纳西大学毕业证如何办理
一比一原版田纳西大学毕业证如何办理F
 
Local Call Girls in Seoni 9332606886 HOT & SEXY Models beautiful and charmin...
Local Call Girls in Seoni  9332606886 HOT & SEXY Models beautiful and charmin...Local Call Girls in Seoni  9332606886 HOT & SEXY Models beautiful and charmin...
Local Call Girls in Seoni 9332606886 HOT & SEXY Models beautiful and charmin...kumargunjan9515
 
Best SEO Services Company in Dallas | Best SEO Agency Dallas
Best SEO Services Company in Dallas | Best SEO Agency DallasBest SEO Services Company in Dallas | Best SEO Agency Dallas
Best SEO Services Company in Dallas | Best SEO Agency DallasDigicorns Technologies
 
2nd Solid Symposium: Solid Pods vs Personal Knowledge Graphs
2nd Solid Symposium: Solid Pods vs Personal Knowledge Graphs2nd Solid Symposium: Solid Pods vs Personal Knowledge Graphs
2nd Solid Symposium: Solid Pods vs Personal Knowledge GraphsEleniIlkou
 
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdfMatthew Sinclair
 
Russian Escort Abu Dhabi 0503464457 Abu DHabi Escorts
Russian Escort Abu Dhabi 0503464457 Abu DHabi EscortsRussian Escort Abu Dhabi 0503464457 Abu DHabi Escorts
Russian Escort Abu Dhabi 0503464457 Abu DHabi EscortsMonica Sydney
 
Story Board.pptxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
Story Board.pptxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrStory Board.pptxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
Story Board.pptxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrHenryBriggs2
 
哪里办理美国迈阿密大学毕业证(本硕)umiami在读证明存档可查
哪里办理美国迈阿密大学毕业证(本硕)umiami在读证明存档可查哪里办理美国迈阿密大学毕业证(本硕)umiami在读证明存档可查
哪里办理美国迈阿密大学毕业证(本硕)umiami在读证明存档可查ydyuyu
 
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...APNIC
 
20240509 QFM015 Engineering Leadership Reading List April 2024.pdf
20240509 QFM015 Engineering Leadership Reading List April 2024.pdf20240509 QFM015 Engineering Leadership Reading List April 2024.pdf
20240509 QFM015 Engineering Leadership Reading List April 2024.pdfMatthew Sinclair
 
在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查ydyuyu
 
一比一原版奥兹学院毕业证如何办理
一比一原版奥兹学院毕业证如何办理一比一原版奥兹学院毕业证如何办理
一比一原版奥兹学院毕业证如何办理F
 
一比一原版(Offer)康考迪亚大学毕业证学位证靠谱定制
一比一原版(Offer)康考迪亚大学毕业证学位证靠谱定制一比一原版(Offer)康考迪亚大学毕业证学位证靠谱定制
一比一原版(Offer)康考迪亚大学毕业证学位证靠谱定制pxcywzqs
 
APNIC Updates presented by Paul Wilson at ARIN 53
APNIC Updates presented by Paul Wilson at ARIN 53APNIC Updates presented by Paul Wilson at ARIN 53
APNIC Updates presented by Paul Wilson at ARIN 53APNIC
 
Trump Diapers Over Dems t shirts Sweatshirt
Trump Diapers Over Dems t shirts SweatshirtTrump Diapers Over Dems t shirts Sweatshirt
Trump Diapers Over Dems t shirts Sweatshirtrahman018755
 
Mira Road Housewife Call Girls 07506202331, Nalasopara Call Girls
Mira Road Housewife Call Girls 07506202331, Nalasopara Call GirlsMira Road Housewife Call Girls 07506202331, Nalasopara Call Girls
Mira Road Housewife Call Girls 07506202331, Nalasopara Call GirlsPriya Reddy
 
一比一原版(Curtin毕业证书)科廷大学毕业证原件一模一样
一比一原版(Curtin毕业证书)科廷大学毕业证原件一模一样一比一原版(Curtin毕业证书)科廷大学毕业证原件一模一样
一比一原版(Curtin毕业证书)科廷大学毕业证原件一模一样ayvbos
 
原版制作美国爱荷华大学毕业证(iowa毕业证书)学位证网上存档可查
原版制作美国爱荷华大学毕业证(iowa毕业证书)学位证网上存档可查原版制作美国爱荷华大学毕业证(iowa毕业证书)学位证网上存档可查
原版制作美国爱荷华大学毕业证(iowa毕业证书)学位证网上存档可查ydyuyu
 
20240507 QFM013 Machine Intelligence Reading List April 2024.pdf
20240507 QFM013 Machine Intelligence Reading List April 2024.pdf20240507 QFM013 Machine Intelligence Reading List April 2024.pdf
20240507 QFM013 Machine Intelligence Reading List April 2024.pdfMatthew Sinclair
 
一比一原版(Flinders毕业证书)弗林德斯大学毕业证原件一模一样
一比一原版(Flinders毕业证书)弗林德斯大学毕业证原件一模一样一比一原版(Flinders毕业证书)弗林德斯大学毕业证原件一模一样
一比一原版(Flinders毕业证书)弗林德斯大学毕业证原件一模一样ayvbos
 

Recently uploaded (20)

一比一原版田纳西大学毕业证如何办理
一比一原版田纳西大学毕业证如何办理一比一原版田纳西大学毕业证如何办理
一比一原版田纳西大学毕业证如何办理
 
Local Call Girls in Seoni 9332606886 HOT & SEXY Models beautiful and charmin...
Local Call Girls in Seoni  9332606886 HOT & SEXY Models beautiful and charmin...Local Call Girls in Seoni  9332606886 HOT & SEXY Models beautiful and charmin...
Local Call Girls in Seoni 9332606886 HOT & SEXY Models beautiful and charmin...
 
Best SEO Services Company in Dallas | Best SEO Agency Dallas
Best SEO Services Company in Dallas | Best SEO Agency DallasBest SEO Services Company in Dallas | Best SEO Agency Dallas
Best SEO Services Company in Dallas | Best SEO Agency Dallas
 
2nd Solid Symposium: Solid Pods vs Personal Knowledge Graphs
2nd Solid Symposium: Solid Pods vs Personal Knowledge Graphs2nd Solid Symposium: Solid Pods vs Personal Knowledge Graphs
2nd Solid Symposium: Solid Pods vs Personal Knowledge Graphs
 
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf

Deep Reinforcement Learning in Machine Learning

  • 1. Deep Reinforcement Learning in Machine Learning. June 18-20, Caesars Palace, Las Vegas, NV
  • 2. Although the term artificial intelligence has been around since the 1950s, there was a major shift towards machine learning from the late 1990s through early 2002. Reinforcement learning began to gain popularity in the 2000s and has become the most promising algorithmic technique on the artificial intelligence landscape in recent years. The 1980s saw knowledge-based systems attempt to equip machines with common sense and knowledge. There seemed to be no end to the number of rules that had to be programmed to power these systems. This not only significantly increased the cost of building systems on knowledge-based rules, it also slowed efforts to re-create common sense in machines. The trend shifted towards machine learning to avoid hand-encoding millions of rules and embedding them in the machine. Machine learning instead learns the rules automatically from a pile of data. Industry shifted its focus to machine learning and abandoned knowledge-based systems. Through the 2000s, AI researchers implemented a number of machine learning algorithms, including Bayesian networks, bio-inspired approaches such as evolutionary algorithms, Markov methods, and support vector machines. Neural networks shot to fame in 2012 with the introduction of deep learning techniques built on multi-layer neural networks.
  • 3. The third and most recent shift in the artificial intelligence research community has been towards reinforcement learning. Moving away from feeding machines labeled data through supervised learning, the research community has ignited the world by driving neural networks with rewards, actions, states, policies, and value functions. DeepMind's AlphaGo arrived in 2015. Its networks were trained with a combination of supervised learning from human expert games and reinforcement learning from self-play, and it went on to defeat a world champion at the ancient game of Go. AlphaGo uses value networks to evaluate board positions and policy networks to select each move. Earlier Go programs relied on Monte Carlo tree search, simulating thousands of random games of self-play. DeepMind developed a search algorithm that combines Monte Carlo simulation with the value and policy networks; it achieved a 99.8% winning rate against other Go programs and defeated the European Go champion 5-0, as well as other human professional players.
  • 4. Figure: DeepMind's AlphaGo neural network training pipeline and reinforcement learning architecture (D. Silver et al., Nature 529, 484–489 (2016), doi:10.1038/nature16961).
  • 5. The Future of Reinforcement Learning: MIT Technology Review downloaded 16,625 research papers that were publicly available under the computer science and artificial intelligence section of arXiv through November 2018. Applying natural language processing to the abstracts, it tracked the words constraint, theory, rule, logic, program, learning, network, data, task, and performance to identify the recent reinforcement learning boom. The trends show the rise of traditional neural networks in the 1950s and 1960s, symbolic approaches in the 1970s, knowledge-based and rule-based systems in the 1980s, support vector machines in the 1990s, and the return of neural networks in the 2010s with the heavy adoption of deep neural networks.
  • 6. DeepTraffic - Reinforcement Learning: DeepTraffic is a reinforcement learning simulation in which self-driving cars drive on a multi-lane freeway under a model-free, off-policy reinforcement learning process. The MIT DeepTraffic competition received 24K entries and inspired data scientists and machine learning enthusiasts to evaluate Deep Q-Learning network variants and hyperparameter configurations, with episodic training amounting to 96.6 years of RL simulation and 572.2 million crowdsourced and optimized DQN hyperparameters used to train the agents successfully. DeepTraffic is implemented completely in JavaScript. Deep reinforcement learning has also shown a promising future in model-based control with the MuJoCo physics engine, and significant advances in the Arcade gaming environment and the Atari gaming environments used by DeepMind.
  • 7. Markov Decision Processes: A number of reinforcement learning algorithms can be applied in robotics, such as policy optimization, model-free reinforcement learning, policy gradients with trust region policy optimization, proximal policy optimization, bootstrapping, Monte Carlo methods, actor-critic methods, on-policy methods (SARSA), off-policy methods (Q-Learning), Deep Q-Networks, Markov decision processes, and dynamic programming. Most function approximation methods rest on the mathematical foundation of Markov decision processes, with optimal state-value and Q-value functions defined over states and state-action pairs. In Atari games, the state representation also includes past frames, as the illustration here depicts. An infinite-horizon discounted Markov decision process is specified by the tuple (S, A, P, R, γ, d0), where S is the finite state space; A is the finite action space; P: S×A→∆(S) is the transition function; R: S×A→∆([0, Rmax]) is the reward function; γ∈[0,1) is the discount factor; and d0∈∆(S) is the initial state distribution.
  • 8. Atari Game Zoo: Deep reinforcement learning agents have not only made significant progress in robotics, but in many instances have surpassed human performance on benchmarks such as Atari 2600 games and Dota 2. Uber has also applied reinforcement learning algorithms to improve Uber Eats recommendations and self-driving cars. Uber built the Atari Game Zoo on the Arcade Learning Environment (ALE), which emulates the Atari 2600 console, for games such as Seaquest, Montezuma's Revenge, and Pitfall. The objective of the Atari Zoo, however, is not to compare high-scoring solutions and hyperparameter configurations across algorithms, but to help understand how the trained agents differ. For example, the evolutionary algorithms from OpenAI Gym have shown different types of learned representations than the gradient-based methods.
  • 9. The machine intelligence of algorithms is now distributed across cloud-computing environments and will help organizations discover valuable insights and perform a range of operations through APIs. Organizations are mass-manufacturing algorithms because doing so achieves economies of scale in a distributed environment. Artificial intelligence is the new inferno melting the AI winter (which lasted from the 1990s through the 2010s), with machine intelligence platforms that use machine learning to rapidly prototype in sandboxes and deploy to production. Figure: Wang, H., & Raj, B. (2017). On the Origin of Deep Learning.
  • 10. Intel optimized deep learning and machine learning frameworks. Figure: Intel Deep Learning and Machine Learning Frameworks (Alberto, 2016).
  • 11. Figure: Nvidia deep learning frameworks with DGX (Nvidia, 2016).
  • 12. Figure: IBM PowerAI: AI Platform (Soutter, 2016).
  • 13. Figure: Deep Water: Open source deep learning framework (H2O.AI, 2017).
  • 14. MXNet Deep Learning Framework. Figure: MXNet for deep learning (DMLC, 2017).
  • 15. References: Fridman, L. (2019). Tutorials, assignments, and competitions for MIT Deep Learning related courses. Retrieved from https://github.com/lexfridman/mit-deep-learning Fridman, L., Terwilliger, J., & Jenik, B. (2018). DeepTraffic: Crowdsourced Hyperparameter Tuning of Deep Reinforcement Learning Systems for Multi-Agent Dense Traffic Navigation. Retrieved from https://arxiv.org/abs/1801.02805 Hao, K. (2019, January 25). We analyzed 16,625 papers to figure out where AI is headed next. MIT Technology Review. Retrieved from https://www.technologyreview.com/s/612768/we-analyzed-16625-papers-to-figure-out-where-ai-is-headed-next/ Jiang, N. (2019). On Value Functions and the Agent-Environment Boundary. Retrieved from https://arxiv.org/pdf/1905.13341.pdf Petroski, F., Madhavan, V., Liu, R., Wang, R., Li, Y., Clune, J., & Lehman, J. (2019). AI Creating a Zoo of Atari-Playing Agents to Catalyze the Understanding of Deep Reinforcement Learning. Retrieved from https://eng.uber.com/atari-zoo-deep-reinforcement-learning/ Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Driessche, G. V., ... Grewe, D. (2016, January 28). Mastering the game of Go with deep neural networks and tree search. Nature, 529, 484-489. Retrieved from https://www.nature.com/articles/nature16961
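
As a rough illustration of the Markov decision process tuple (S, A, P, R, γ, d0) from slide 7 and of the off-policy Q-Learning it lists, here is a minimal Python sketch. The three-state chain, the transition probabilities, the reward values, and the helper names (`transition`, `reward`, `q_learning`, `greedy_policy`) are all assumptions made up for this example; none of them come from the slides or the cited papers. The behaviour policy is uniformly random while the update bootstraps from the greedy action, which is what makes the method off-policy.

```python
import random

# Illustrative MDP (S, A, P, R, gamma, d0); every number here is an assumption.
N_STATES, N_ACTIONS = 3, 2      # S = {0, 1, 2}, A = {0: stay, 1: move right}
GAMMA = 0.9                     # discount factor, gamma in [0, 1)
D0 = [1.0, 0.0, 0.0]            # initial state distribution d0 over S


def transition(s, a):
    """Return P(. | s, a) as a list of probabilities over next states."""
    probs = [0.0] * N_STATES
    if a == 1 and s < N_STATES - 1:
        probs[s + 1] = 0.9      # moving right succeeds with probability 0.9
        probs[s] = 0.1          # otherwise the agent stays where it is
    else:
        probs[s] = 1.0          # "stay", or already in the last state
    return probs


def reward(s, a):
    """R(s, a) in [0, Rmax] with Rmax = 1: reward is paid only in the last state."""
    return 1.0 if s == N_STATES - 1 else 0.0


def q_learning(episodes=2000, horizon=20, alpha=0.1, seed=0):
    """Tabular Q-learning with a uniformly random behaviour policy."""
    rng = random.Random(seed)
    states = list(range(N_STATES))
    q = [[0.0] * N_ACTIONS for _ in states]
    for _ in range(episodes):
        s = rng.choices(states, weights=D0)[0]      # draw the start state from d0
        for _ in range(horizon):                    # truncate the infinite horizon
            a = rng.randrange(N_ACTIONS)            # explore uniformly at random
            s_next = rng.choices(states, weights=transition(s, a))[0]
            # Off-policy target: bootstrap from the best action in the next state.
            target = reward(s, a) + GAMMA * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


def greedy_policy(q):
    """Pick the action with the highest estimated Q-value in each state."""
    return [max(range(N_ACTIONS), key=lambda a: q_s[a]) for q_s in q]


if __name__ == "__main__":
    q_table = q_learning()
    print("Q-values:", q_table)
    print("Greedy policy:", greedy_policy(q_table))
```

In this toy chain the last state pays a reward of 1 on every step, so its optimal value is 1 / (1 − γ), and the learned greedy policy should prefer moving right in the earlier states. A Deep Q-Network replaces the table `q` with a neural network, but the update target has the same form.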