Urooj group ai e sec

Q-Learning
Artificial Intelligence

Group Members
1. Syeda Urooj Fatima 31
2. Faiza Liaqat 39
3. Umma Rabbab 25
4. ShahBano 02
5. Maryam Athar 16

Reinforcement Learning
• This concept is usually introduced by contrast with
supervised learning. In supervised learning the correct
output for each input is known in advance; in reinforcement
learning it is not, and the agent improves its behaviour
using the outcomes (rewards) of its past actions.
• Behaviour based.
• Critic information: the feedback does not tell the agent
what will happen in the future; it evaluates the current
state with respect to past experience.
• E.g.
S1 0.3 S2 0.7

Reinforcement Learning
• Elements:
1. Agent
2. Environment
3. Reward
4. State
5. Action
E.g. Autonomous Car
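
The five elements listed above can be sketched as a small interaction loop. This is only an illustrative sketch: the `LineWorld` environment, the `random_agent` function and the reward of 1.0 at the goal are hypothetical names and values chosen for the example, not part of the original slides.

```python
import random


class LineWorld:
    """Hypothetical toy environment: positions 0..4, goal at position 4."""

    def __init__(self):
        self.state = 0

    def reset(self):
        # Start every episode at position 0 and return the initial state.
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (move left) or +1 (move right); position is clamped to 0..4.
        self.state = max(0, min(4, self.state + action))
        done = self.state == 4
        reward = 1.0 if done else 0.0  # assumed reward: 1.0 only on reaching the goal
        return self.state, reward, done


def random_agent(state, rng):
    # The agent chooses an action given the state; this simple one ignores the state.
    return rng.choice([-1, 1])


def run_episode(env, agent, rng, max_steps=1000):
    # Agent and environment alternate: state -> action -> (next state, reward).
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = agent(state, rng)
        state, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward, state
```

Calling `run_episode(LineWorld(), random_agent, random.Random(0))` returns the total reward collected and the final state. In the autonomous car example, the car would play the role of the agent and the road the role of the environment.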

Q-Learning
• Episode/Trial: a sequence of actions from the start state to a terminal state.
• Policy: a behaviour map that gives the action to take in each state (state-action pairs).
• Represented as π (pi).
• Finite Horizon
• Infinite Horizon
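
As a rough illustration of the episode and policy definitions above, a policy can be stored as a plain mapping from states to actions and followed until a terminal state is reached. The four-state chain, the state names and the `rollout` helper below are assumptions made for the sketch.

```python
# Hypothetical 4-state chain: S0 -> S1 -> S2 -> S3, where S3 is terminal.
TERMINAL = "S3"
TRANSITIONS = {
    ("S0", "right"): "S1", ("S0", "left"): "S0",
    ("S1", "right"): "S2", ("S1", "left"): "S0",
    ("S2", "right"): "S3", ("S2", "left"): "S1",
}

# Policy pi: one chosen action for each non-terminal state.
pi = {"S0": "right", "S1": "right", "S2": "right"}


def rollout(policy, start="S0", max_steps=20):
    # Follow the policy from the start state; the list of (state, action)
    # pairs visited before reaching the terminal state is one episode.
    state = start
    episode = []
    for _ in range(max_steps):
        if state == TERMINAL:
            break
        action = policy[state]
        episode.append((state, action))
        state = TRANSITIONS[(state, action)]
    return episode, state


episode, final_state = rollout(pi)
print(episode)      # [('S0', 'right'), ('S1', 'right'), ('S2', 'right')]
print(final_state)  # S3
```

The `max_steps` cap is there only so that a policy that never reaches the terminal state still stops.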

Q-Learning
• Finite Horizon:
The agent tries to maximize reward within an episode of fixed, limited length.
• Infinite Horizon:
The agent tries to maximize reward, but with no specified time limit.
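
One common way to make the two horizons concrete is through the quantity being maximized. The sketch below assumes the finite-horizon objective is the plain sum of rewards over a fixed number of steps, and that the infinite-horizon objective uses a discount factor `gamma` (between 0 and 1) so that the sum stays bounded. The discount factor and both function names are assumptions; the slides do not mention them.

```python
def finite_horizon_return(rewards, horizon):
    # Sum of the rewards received in the first `horizon` steps only.
    return sum(rewards[:horizon])


def discounted_return(rewards, gamma):
    # r0 + gamma*r1 + gamma^2*r2 + ..., accumulated from the last reward
    # backwards. With a finite reward list this approximates the
    # infinite-horizon sum by truncation.
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


print(finite_horizon_return([1, 1, 1, 1], horizon=3))  # 3
print(discounted_return([1, 1, 1], gamma=0.5))         # 1.75
```

With a constant reward of 1 at every step and `gamma = 0.5`, the discounted sum approaches 1 / (1 - 0.5) = 2 as more steps are added, which is why discounting keeps an unlimited horizon well defined.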

References
https://www.youtube.com/watch?v=3yJTInvfQvw
https://www.youtube.com/watch?v=qhRNvCVVJaA
https://en.wikipedia.org/wiki/Q-learning
https://blog.floydhub.com/an-introduction-to-q-learning-reinforcement-learning/
