Reinforcement Learning, Application and Q-Learning

Bangladesh University of Professionals (BUP)
MCSE-1101: Advanced Artificial Intelligence
Presentation Title
Machine Learning: Reinforcement Learning
Presented By
Md. Abdullah al Mamun
A.B.M. Nazibullah

Outline
1. What is Artificial Intelligence?
2. What is Machine Learning?
3. Relationship among AI, ML and DL
4. Human Brain Learning Process
5. Learning Vs Recognition
6. Supervised Learning
7. Unsupervised Learning
8. Reinforcement Learning
9. Definition of Reinforcement Learning
10. Reinforcement Learning Application: AWS DeepRacer
11. Markov Decision Process
12. Understanding Q-Learning Algorithm
13. Q-Learning Algorithm Example

What exactly is
Artificial Intelligence?
Artificial Intelligence is a model, procedure, or tool that
is capable of self-learning, can dynamically detect
patterns and objects, and can make decisions from its
own knowledge, much like the human brain.
“So, according to this definition, is it proven that AI is
really a threat to human existence?”

What is Machine Learning?
Machine Learning is a subset of Artificial Intelligence (AI) that gives machines the
ability to learn automatically and improve from experience without being explicitly
programmed.

The Relationship among
AI, ML and DL
Machine Learning is a sub-category of AI.
Deep Learning is a sub-category of ML.
This means they are both forms of AI.

Human Brain Learning Process
[Figure: Input Image -> Feature Extraction -> Learning -> Recognition, illustrated with a human brain neuron]

Learning Vs Recognition
Learning
Learning is a search through the space of possible hypotheses
for one that will perform well, even on new examples beyond
the training set. To measure the accuracy of a hypothesis, we
give it a test set of examples that are distinct from the
training set.
Recognition
The learning process is performed on the training dataset, and
the engine (the trained model) is updated. When an input sample
is then passed through the engine, it returns an output whose
quality depends on how accurately the engine has learned.

Supervised Learning
Supervised learning uses labeled datasets to train
algorithms to classify data or predict outcomes
accurately. As input data is fed into the model, it adjusts
its weights to reduce the error between its predictions
and the labels, until the model has been fitted
appropriately.
The model first learns from the given training data. The
training data contains different patterns, which the model
will learn.
Application:
• Classifying spam into a separate folder from your inbox
• Image and object recognition
• Predictive analytics
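The weight adjustment described above can be sketched with a perceptron, chosen here only because it is one of the simplest models that learns from labeled examples; the slides do not name a specific model. The dataset values, the function names and the two-feature spam encoding are all hypothetical.

```python
# Toy labeled dataset (hypothetical values): two features per message,
# label +1 = spam, -1 = not spam.
train = [((1.0, 1.0), -1), ((1.2, 0.8), -1), ((4.0, 4.2), 1), ((4.5, 3.9), 1)]
test = [((0.9, 1.1), -1), ((4.2, 4.0), 1)]  # held out, never used for training


def predict(weights, bias, x):
    """Classify x by the sign of the weighted sum of its features."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else -1


def train_perceptron(data, epochs=50):
    """Adjust the weights whenever a labeled example is misclassified."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, label in data:
            if predict(weights, bias, x) != label:
                # Move the decision boundary toward the correct side.
                weights = [w + label * xi for w, xi in zip(weights, x)]
                bias += label
    return weights, bias


weights, bias = train_perceptron(train)
accuracy = sum(predict(weights, bias, x) == y for x, y in test) / len(test)
print("test accuracy:", accuracy)
```

The accuracy is measured on examples the model never saw during training, which is the test-set idea from the Learning Vs Recognition slide.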

Unsupervised Learning
Unsupervised learning uses no labeled training data; instead, the
algorithm is simply handed a dataset and uses the
variables within the data to identify and separate out
natural clusters.
Application:
• Finding customer segments
• Feature selection
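As a rough illustration of finding natural clusters without labels, here is a minimal one-dimensional k-means sketch. k-means is only one possible clustering method and is not named in the slides; the spending figures and the evenly spread initial centroids are assumptions made for this example.

```python
def kmeans_1d(values, k=2, iterations=10):
    """Group numbers into k clusters by repeatedly moving centroids to cluster means."""
    lo, hi = min(values), max(values)
    # Deterministic start for the sketch: spread centroids evenly from min to max.
    centroids = [lo + i * (hi - lo) / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels, centroids


# Hypothetical monthly spend per customer; no labels are provided.
spend = [10.0, 12.0, 11.0, 50.0, 52.0, 49.0]
labels, centroids = kmeans_1d(spend)
print(labels, centroids)
```

The two groups that come out correspond to low-spend and high-spend customer segments, found only from the structure of the data.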

Reinforcement Learning
Reinforcement Learning (RL) is a type of machine learning
technique that enables an agent to learn in an interactive
environment by trial and error, using feedback from its
own actions and experiences.
Application:
• A robot deciding its path
• Choosing the next move in a chess game
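Trial-and-error learning from feedback can be sketched with a two-action problem in which the agent does not know which action pays off more often. The epsilon-greedy rule, the success probabilities and the function name `run_bandit` are assumptions made for this illustration, not part of the slides.

```python
import random


def run_bandit(success_prob, steps=2000, epsilon=0.1, seed=0):
    """Learn the average reward of each action purely from experience."""
    rng = random.Random(seed)
    n_actions = len(success_prob)
    values = [0.0] * n_actions  # running estimate of each action's average reward
    counts = [0] * n_actions    # how often each action has been tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore: try a random action
        else:
            action = max(range(n_actions), key=lambda a: values[a])  # exploit
        # Feedback from the environment: reward 1 with the action's hidden probability.
        reward = 1.0 if rng.random() < success_prob[action] else 0.0
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]  # incremental mean
    return values, counts


values, counts = run_bandit([0.2, 0.8])
print(values, counts)
```

The agent is never told which action is better; it ends up preferring the second action only because its own experience shows a higher average reward.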

Definition of Reinforcement Learning

A Taxonomy of RL Algorithms
Model-based RL uses experience to construct an internal model of the transitions and immediate outcomes in the
environment.
Model-free RL, on the other hand, uses experience to learn directly one or both of two simpler quantities (state/action values
or policies) which can achieve the same optimal behavior but without estimation or use of a world model.

AWS DeepRacer
AWS DeepRacer gives you an interesting and fun way to get
started with reinforcement learning (RL). RL is an advanced
machine learning (ML) technique that takes a very different
approach to training models than other machine learning
methods. Its superpower is that it learns very complex
behaviors without requiring any labeled training data, and
can make short-term decisions while optimizing for a
longer-term goal.
https://aws.amazon.com/deepracer/

AWS DeepRacer - Training
https://www.youtube.com/watch?v=-PeGCyBTzVc

Markov Decision Process
The mathematical framework for mapping out a solution in reinforcement learning is called a Markov Decision Process (MDP).
The following parameters are used to attain a solution:
• Set of actions (A)
• Set of states (S)
• Reward (R)
• Policy (π)
• Value (V)
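To make the five quantities concrete, the sketch below defines a tiny three-state MDP and computes the value V of a fixed policy π. The states, the deterministic transitions, the reward of 10 and the discount factor `gamma` are all assumptions invented for this illustration.

```python
states = ["s0", "s1", "goal"]  # S: set of states
actions = ["stay", "move"]     # A: set of actions
gamma = 0.9                    # assumed discount factor for future rewards


def step(state, action):
    """Deterministic transition; returns (next_state, reward R)."""
    if state == "goal" or action == "stay":
        return state, 0.0
    if state == "s0":
        return "s1", 0.0
    return "goal", 10.0  # moving from s1 reaches the goal and earns the reward


# Policy (pi): which action to take in each state.
policy = {s: "move" for s in states}


def evaluate(policy, sweeps=50):
    """Value (V): discounted reward collected by following the policy from each state."""
    V = {s: 0.0 for s in states}
    for _ in range(sweeps):
        new_V = {}
        for s in states:
            next_state, reward = step(s, policy[s])
            new_V[s] = reward + gamma * V[next_state]
        V = new_V
    return V


V = evaluate(policy)
print(V)
```

The state one step from the goal gets the full reward, while the state two steps away gets that reward discounted once by `gamma`.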

Understanding Q-Learning
Place an agent in any one of the rooms (0, 1, 2, 3, 4); the goal is to reach the outside of the building (room 5).
• There are 5 rooms in a building, connected by doors.
• The rooms are numbered 0 through 4.
• The outside of the building can be thought of as one big room (5).
• The doors of rooms 1 and 4 lead into the building from room 5 (outside).

Understanding Q-Learning (Graph Representation)
Let's represent the rooms on a graph, with each room as a node and each door as a link.
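One way to encode such a graph for Q-learning is a reward matrix R, where R[s][a] is the reward for moving from room s to room a. The slides' figure is not reproduced in the text, so the full door list below is an assumption taken from the commonly used version of this rooms example; only the links 1-5, 4-5, 2-3 and 3-4 are implied by the slides themselves. The reward values (-1 for no door, 0 for a door, 100 for reaching room 5) are also assumptions.

```python
GOAL = 5  # room 5 is the outside of the building

# Assumed door layout; each pair is a door that can be used in both directions.
doors = [(0, 4), (1, 3), (1, 5), (2, 3), (3, 4), (4, 5)]


def build_reward_matrix(doors, n_rooms=6, goal=GOAL, goal_reward=100):
    """R[s][a]: -1 if there is no door, 0 for a door, goal_reward for entering the goal."""
    R = [[-1] * n_rooms for _ in range(n_rooms)]
    for a, b in doors:
        R[a][b] = goal_reward if b == goal else 0
        R[b][a] = goal_reward if a == goal else 0
    R[goal][goal] = goal_reward  # staying outside keeps the agent at the goal
    return R


R = build_reward_matrix(doors)
for row in R:
    print(row)
```

Each row is a node of the graph and each non-negative entry is a link, so the matrix carries the same information as the drawing.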

Q-Learning Example: Selected Path 1 -> 5
If we iterate the loop and select the path from room 1 to room 5,
the matrix Q gets updated.
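A single update for the move 1 -> 5 can be sketched as follows, using the simplified rule Q(s, a) = R(s, a) + γ · max Q(s', a'), that is, a learning rate of 1. The reward matrix, the discount factor of 0.8 and the all-zero starting Q are assumptions borrowed from the commonly used version of this rooms example, since the slide's matrices are not reproduced in the text.

```python
GAMMA = 0.8  # assumed discount factor

# Assumed reward matrix: -1 = no door, 0 = door, 100 = door leading to room 5.
R = [
    [-1, -1, -1, -1, 0, -1],
    [-1, -1, -1, 0, -1, 100],
    [-1, -1, -1, 0, -1, -1],
    [-1, 0, 0, -1, 0, -1],
    [0, -1, -1, 0, -1, 100],
    [-1, 0, -1, -1, 0, 100],
]

Q = [[0.0] * 6 for _ in range(6)]  # the agent starts with no knowledge


def q_update(Q, state, action):
    """Apply Q(s, a) = R(s, a) + GAMMA * max over a' of Q(s', a')."""
    next_state = action  # taking action `a` means walking into room `a`
    reachable = [a for a in range(len(R)) if R[next_state][a] >= 0]
    Q[state][action] = R[state][action] + GAMMA * max(Q[next_state][a] for a in reachable)
    return Q[state][action]


q_update(Q, state=1, action=5)
print(Q[1])
```

Because every entry of Q is still zero, the new value of Q(1, 5) is just the immediate reward for stepping outside.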

Q-Learning Example: Selected Path 2 -> 3 -> 4 -> 5
If we iterate the loop and select the path from room 2 to room 5,
the matrix Q gets updated.
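Replaying the path 2 -> 3 -> 4 -> 5 several times shows how the goal reward spreads backward through Q. This is a sketch under assumptions: the reward matrix, the discount factor of 0.8, the learning rate of 1 and the all-zero starting Q come from the commonly used version of this rooms example rather than from the slides, and the helper names are hypothetical.

```python
GAMMA = 0.8  # assumed discount factor
GOAL = 5

# Assumed reward matrix: -1 = no door, 0 = door, 100 = door leading to room 5.
R = [
    [-1, -1, -1, -1, 0, -1],
    [-1, -1, -1, 0, -1, 100],
    [-1, -1, -1, 0, -1, -1],
    [-1, 0, 0, -1, 0, -1],
    [0, -1, -1, 0, -1, 100],
    [-1, 0, -1, -1, 0, 100],
]


def q_update(Q, state, action):
    """Apply Q(s, a) = R(s, a) + GAMMA * max over a' of Q(s', a')."""
    next_state = action
    reachable = [a for a in range(len(R)) if R[next_state][a] >= 0]
    Q[state][action] = R[state][action] + GAMMA * max(Q[next_state][a] for a in reachable)


def greedy_path(Q, start, max_steps=10):
    """Follow the highest-valued door from each room until the goal is reached."""
    path, state = [start], start
    while state != GOAL and len(path) <= max_steps:
        reachable = [a for a in range(len(R)) if R[state][a] >= 0]
        state = max(reachable, key=lambda a: Q[state][a])
        path.append(state)
    return path


Q = [[0.0] * 6 for _ in range(6)]
episode = [2, 3, 4, 5]
for sweep in range(3):
    # Update Q for each consecutive move along the selected path.
    for state, action in zip(episode, episode[1:]):
        q_update(Q, state, action)

print(Q[2][3], Q[3][4], Q[4][5])
print(greedy_path(Q, start=2))
```

After the first pass only Q(4, 5) is non-zero; each further pass pushes the value one room further back, so Q(3, 4) and then Q(2, 3) pick up discounted copies of the goal reward, and the greedy read-out from room 2 recovers the selected path.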

Thank You
Thank you very much for
the opportunity to take
part in this knowledge
sharing session!

Q & A
“The important thing is not to stop questioning.”
- Albert Einstein
