Lesson12: Reinforcement Learning for Critterbot Science 8

  1. Reinforcement Learning. Science 8 Unit B: Cells and Systems (Nature of Science Emphasis)
  2. Introduction > What does it mean to have a behaviour reinforced? > Let’s look at a famous example first...
  3. Introduction Ivan Pavlov (1849-1936) > Born in Russia in 1849, Ivan Pavlov abandoned a religious career for which he had been preparing, and instead went into science. > His work had a great impact on the field of physiology (the study of the mechanical, physical, and biochemical functions of living organisms) by studying the mechanisms underlying the digestive system in mammals. Source: Nobelprize.org
  4. Introduction > Pavlov was awarded the Nobel Prize in Physiology or Medicine in 1904. He then turned to studying reflexes, in particular with dogs. > His discoveries led to the science of behaviour. Source: Nobelprize.org
  5. Introduction > Pavlov became interested in studying reflexes when he noticed that dogs sometimes drooled even without food being shown to them. > Although no food was in sight, their saliva still dribbled. It turned out that the dogs were reacting to lab coats. Source: Nobelprize.org
  6. Introduction > Every time the dogs were served food, the person who served the food was wearing a lab coat. > The lab coats became a “stimulus”. Source: Nobelprize.org
  7. Introduction > A stimulus is anything capable of evoking a response in an organism. > Examples of stimuli include sights, sounds, heat, cold, smells, or other sensations. > Therefore, the dogs reacted as if food was on its way whenever they saw a lab coat. Source: Nobelprize.org
  8. Introduction > In a series of experiments, Pavlov then tried to figure out why this was happening. > For example, he struck a bell when the dogs were fed. If the bell was sounded close to meal time, the dogs learnt to associate the sound of the bell with food. > After a while, the stimulus of the bell alone caused them to drool. Source: Nobelprize.org
  9. More on Pavlov's Dog > You can read more about Pavlov’s dog and see if you can train a dog to drool on command online at the Nobel Prize website.
  10. Reinforcement Learning > Dogs are often trained through a method of reinforcement. > For example, if a dog hears the word “sit”, sits, and receives a treat, he or she will learn that sitting earns a treat. > In fact, almost all animals can learn through reinforcement.
  11. Reinforcement Learning Definition: – Reinforcement occurs when an event following a response causes an increase in the probability of that response occurring in the future. > So when a dog sits after hearing “sit” (response) and receives a treat (event), the dog will be more likely to sit in the future in hopes of receiving another treat.
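The definition above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the starting probability of 0.2, the step size of 0.1, and the update rule itself are made up for illustration and do not come from the lesson or from Pavlov's work. It shows the one idea in the definition, namely that each rewarded response makes that response more likely next time.

```python
def reinforce(probability, rewarded, step=0.1):
    """Return the new probability of a response after one trial.

    Hypothetical update rule: a rewarded response moves the probability
    a fraction `step` of the way toward 1.0. An unrewarded response
    leaves it unchanged.
    """
    if rewarded:
        probability += step * (1.0 - probability)
    return probability


p_sit = 0.2            # assumed starting chance that the dog sits on "sit"
history = [p_sit]
for trial in range(10):
    # In this sketch the dog gets a treat on every trial.
    p_sit = reinforce(p_sit, rewarded=True)
    history.append(p_sit)

print([round(p, 2) for p in history])
```

Each number printed is larger than the one before it, and the probability never goes above 1.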
  12. Reinforcement Learning > If animals (including humans) can learn by reinforcement, can a machine also learn through reinforcement? > Computing Scientists at the Centre for Machine Learning believe so, and they are building a robot that learns through reinforcement.
  13. Reinforcement Learning > The robot is called “Critterbot”. > The robot responds to stimuli in the environment. > For lessons on Critterbot see Critterbot for Physics 30 and Critterbot for Science 8.
  14. How can a Machine be Reinforced? > In Machine Learning (which is a type of artificial intelligence) the “learner” is a computer that learns by trying to obtain a maximum reward. > So what does a computer or robot want as a reward? – Just a number. [Slide graphic: a scattering of the reward values -1, 0 and 1]
  15. How can a Machine be Reinforced? > A positive reward will result in a “1” > A neutral reward will result in a “0” > A negative reward will result in a “-1”
  16. How can a Machine be Reinforced? > What separates Reinforcement Learning from other forms of artificial intelligence is that the learner is never told what actions to take. > The learner uses a trial-and-error search approach: if it receives a positive reward for an action, it will continue that action. > But if it receives a negative reward, it will learn to avoid that action.
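The trial-and-error idea on the last three slides can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the three action names, the reward attached to each, the number of trials, and the "explore 10% of the time" rule are hypothetical and are not a description of how Critterbot is actually programmed. The learner is never told which action is best; it only sees a reward of -1, 0 or 1 after each try.

```python
import random

# Hypothetical actions and the reward each one earns (not from the lesson).
REWARDS = {"bump into wall": -1, "stand still": 0, "move to charger": 1}


def learn(rewards, trials=200, explore=0.1, seed=0):
    """Learn by trial and error which action earns the most reward."""
    rng = random.Random(seed)
    actions = list(rewards)
    totals = {a: 0.0 for a in actions}   # sum of rewards seen per action
    counts = {a: 0 for a in actions}     # how often each action was tried

    def average(action):
        return totals[action] / counts[action] if counts[action] else 0.0

    for t in range(trials):
        if t < len(actions):
            action = actions[t]                   # try every action once
        elif rng.random() < explore:
            action = rng.choice(actions)          # occasionally try at random
        else:
            action = max(actions, key=average)    # otherwise repeat the best so far
        reward = rewards[action]                  # the environment answers with a number
        totals[action] += reward
        counts[action] += 1

    return {a: average(a) for a in actions}, counts


averages, counts = learn(REWARDS)
best = max(averages, key=averages.get)
print("Learned favourite action:", best)
print("Times each action was tried:", counts)
```

After trying each action, the learner mostly repeats the one that earned a "1" and mostly avoids the one that earned a "-1", which is the behaviour the slides describe.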
  17. Questions 1. How is a robot that uses Machine Learning different from a robot that is programmed for specific tasks? – Answer: In Machine Learning, the robot is not told what actions to take. It learns by trial and error.
  18. Questions 2. A robot in a car factory is designed to build cars at a fast rate. Would Machine Learning be a good application for a car building machine? Why or why not? – Answer: No, probably not. Robots that build cars follow specific designs to ensure they build exactly as they are told.
  19. Questions 3. Are dogs the only animals that respond to a stimulus by salivating? For example, what happens to you when you are just about to put a pickle in your mouth? Or mustard? Or a sour candy? – Answer: Humans also respond to visual stimuli and will salivate at the sight of some stimuli.
  20. Questions 4. Critterbot was designed to respond to stimuli (plural for stimulus). Imagine that you had to design a robot that will automatically shovel snow from your driveway every winter. – The robot cannot have any human assistance; it has to be autonomous (work on its own). – First, come up with a ‘cool’ name for your robot. – Use drawings and written descriptions to write up a one page explanation of how your robot would work. continued...
  21. Question 4 continued. – What types of sensors would it need to have to work without your assistance? Remember, it is only going to shovel your driveway, and not wander down the street shovelling every driveway. – Animals require energy and use special systems to convert food into energy. For example, the digestive system takes in food and digests it to extract energy and nutrients. – How will your robot get its energy? Remember, it has to work in winter conditions, most often when it is snowing.
  22. Centre for Mathematics Science and Technology Education (CMASTE), 382 Education South, University of Alberta, Edmonton AB T6G 2G5, www.CMASTE.ca. To download: select Outreach, Alberta Ingenuity Resources and Centre for Machine Learning. Filename: AICML6BrainTumourAnalysis; Centre for Machine Learning, Department of Computing Science, University of Alberta, 2-21 Athabasca Hall, Edmonton AB T6G 2E8, (780) 492-4828, www.machinelearningcentre.ca; Alberta Ingenuity, 2410 Manulife Place, 10180-101 Street, Edmonton AB T5J 3S4, (780) 423-5735, www.albertaingenuity.ca