Operant Conditioning
Hadley D’Souza
A Daily phenomenon
• One of the two most common ways of learning
new things
• Learning = acquiring new behaviours
• Operant conditioning is also known as instrumental
conditioning: it works through rewards
(reinforcements) and punishments.
• Key figures: Skinner and Thorndike.
Origin
• Burrhus Frederic Skinner (B.F. Skinner) is
regarded as the father of operant conditioning.
• However, his work was based on Edward
Thorndike’s ‘Law of Effect’.
• Thorndike believed that the pleasure
(reinforcement) or discomfort (punishment)
that follows a behaviour can either strengthen or
weaken that behaviour.
Skinner Box
Reinforcement: Positive
• Skinner placed a hungry rat in a box.
• The rat would keep moving around, searching for
food.
• By chance, it would press the lever, and food
would appear in the container.
• Consequently, the rat learnt to press the lever
directly to obtain food.
• Here the rat learns (acquires) a particular
behaviour (pressing the lever) in order to
obtain the reinforcement (food pellets).
• This is positive reinforcement: the behaviour is
strengthened.
Reinforcement: Negative
• Skinner would subject the rat to a mild electric
shock.
• The rat would run around the box.
• Eventually it would press a lever by chance.
• This lever stops the current.
• The rat learns to go to the lever immediately to
stop the current.
• Here the rat learns to engage in a certain behaviour
to stop the painful stimulus.
• This is negative reinforcement.
• > Behaviour that is learnt by receiving
something pleasant: positive reinforcement.
• > Behaviour that is learnt by removing
something unpleasant: negative
reinforcement.
Two more phenomena
• 1: Escape learning: the rats learnt to
head straight for the lever to escape the
current.
• 2: Avoidance learning: Skinner started turning
on a light before giving the current. The rats
learnt to associate the light with the current and
went to the lever at the mere sight of the light,
avoiding the current completely.
Punishment: Positive
• The rat has learnt to press the lever through reinforcement.
• But now, if the rat is given a shock for pressing
the lever, it will learn to stop doing so: the
behaviour is weakened.
• Giving something unpleasant to weaken a
behaviour is positive punishment.
Punishment: Negative
• Suppose the rat is given food regularly, but not
if it presses the lever: it stops pressing the lever.
• If something pleasant is taken away, the
behaviour is weakened.
• Giving an unpleasant stimulus: positive
punishment.
• Taking away a pleasant stimulus: negative
punishment.
Reinforcement vs. Punishment
• Anything that strengthens a behaviour is reinforcement.
• > Positive reinforcement: a reward is given to
strengthen behaviour.
• > Negative reinforcement: a painful stimulus is
removed to strengthen behaviour.
• Anything that weakens a behaviour is punishment.
• > Positive punishment: an unpleasant stimulus is given
to weaken behaviour.
• > Negative punishment: something pleasant is
taken away to weaken behaviour.
• PR (positive reinforcement): pleasant stimulus given [more pocket
money]
• NR (negative reinforcement): unpleasant stimulus removed [seminar
topic reduced]
• Both will strengthen behaviour
• PP (positive punishment): unpleasant stimulus given [extra seminar]
• NP (negative punishment): pleasant stimulus removed [pocket money
taken away]
• Both will weaken behaviour
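The four cases reduce to two questions: is the stimulus pleasant or unpleasant, and is it given or removed? A minimal Python sketch of that lookup follows. The function name `classify` and the string labels are illustrative choices, not part of the original slides.

```python
def classify(stimulus, change):
    """Name the procedure and its effect on behaviour.

    stimulus: "pleasant" or "unpleasant"
    change:   "given" or "removed"
    Returns a (procedure, effect) tuple.
    """
    table = {
        ("pleasant", "given"): ("positive reinforcement", "strengthens"),
        ("unpleasant", "removed"): ("negative reinforcement", "strengthens"),
        ("unpleasant", "given"): ("positive punishment", "weakens"),
        ("pleasant", "removed"): ("negative punishment", "weakens"),
    }
    return table[(stimulus, change)]


# The four examples from the summary above.
examples = [
    ("more pocket money", "pleasant", "given"),
    ("seminar topic reduced", "unpleasant", "removed"),
    ("extra seminar", "unpleasant", "given"),
    ("pocket money taken away", "pleasant", "removed"),
]
for label, stimulus, change in examples:
    procedure, effect = classify(stimulus, change)
    print(f"{label}: {procedure} ({effect} behaviour)")
```

In this terminology "positive" and "negative" describe whether a stimulus is added or taken away, not whether the outcome is good or bad.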
Other concepts:
• Shaping: developing a behaviour step by step by
reinforcing successive approximations, from simpler to
more complex tasks. [e.g. training a dog]
• Extinction: when reinforcement is withheld, the behaviour
has no reason to continue and gradually
fades.
• Generalization: after learning to engage in a
particular behaviour in a particular situation, one may
engage in the same behaviour in other similar
situations. [e.g. a tortured prisoner may show a fear of
all people]
• Discrimination: the opposite of generalization.
Here one learns that not all situations
yield the same reward or punishment as the one
in which the behaviour was learnt. [e.g. the tortured prisoner
will learn to differentiate between his torturers
and other human beings]
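Extinction can be illustrated with a toy model. The sketch below assumes a simple linear update rule, chosen here only for illustration and not taken from Skinner's work: a reinforced response moves the response strength a fixed fraction of the way toward 1, and an unreinforced response moves it the same fraction toward 0. The names `update` and `simulate`, the starting strength and the rate are all hypothetical.

```python
def update(strength, reinforced, rate=0.2):
    """Nudge response strength toward 1 if reinforced, toward 0 otherwise."""
    target = 1.0 if reinforced else 0.0
    return strength + rate * (target - strength)


def simulate(trials, strength=0.05, rate=0.2):
    """Return the response strength after each trial.

    trials: iterable of booleans, True when the response was reinforced.
    """
    history = []
    for reinforced in trials:
        strength = update(strength, reinforced, rate)
        history.append(strength)
    return history


acquisition = [True] * 10   # lever presses followed by food
extinction = [False] * 10   # lever presses no longer followed by food
history = simulate(acquisition + extinction)
peak, final = history[9], history[-1]
print(f"after acquisition: {peak:.2f}, after extinction: {final:.2f}")
```

Under these assumptions the strength climbs during the reinforced trials and decays once reinforcement stops, which is the pattern the extinction bullet describes.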
Schedules of Reinforcement
• Continuous reinforcement schedule: every occurrence of the
behaviour is reinforced.
• Partial or intermittent schedules: only some occurrences are reinforced.
A. Ratio schedules: reinforcement depends on the number of responses
made. E.g. getting an ‘A’ in all exams is rewarded with a new video
game.
• > Fixed ratio: a reinforcement always follows after a
set number of behaviours.
• > Variable ratio: the number of times the behaviour
must be performed to obtain the reinforcement varies
from one reinforcement to the next.
B. Interval schedules: reinforcement does not
depend on the number of times the behaviour has
occurred, but on the time that has passed.
E.g. a monthly salary.
• > Fixed interval: the time gap between
reinforcements is constant.
• > Variable interval: the time gap between
reinforcements varies.
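The four partial schedules can be sketched as small Python functions that decide, for each response, whether it is reinforced. This is an illustrative sketch under stated assumptions: the interval schedules reinforce the first response made after the interval has elapsed (the usual laboratory definition), the variable schedules draw their next requirement uniformly around a mean, and all function names and parameters are hypothetical.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0
    def on_response(_time):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False
    return on_response


def variable_ratio(mean_n, rng):
    """Reinforce after a varying number of responses averaging mean_n."""
    count, target = 0, rng.randint(1, 2 * mean_n - 1)
    def on_response(_time):
        nonlocal count, target
        count += 1
        if count >= target:
            count, target = 0, rng.randint(1, 2 * mean_n - 1)
            return True
        return False
    return on_response


def fixed_interval(seconds):
    """Reinforce the first response once `seconds` have passed since the last reinforcement."""
    last = 0.0
    def on_response(time):
        nonlocal last
        if time - last >= seconds:
            last = time
            return True
        return False
    return on_response


def variable_interval(mean_seconds, rng):
    """Like fixed_interval, but the required wait varies around mean_seconds."""
    last, wait = 0.0, rng.uniform(0, 2 * mean_seconds)
    def on_response(time):
        nonlocal last, wait
        if time - last >= wait:
            last, wait = time, rng.uniform(0, 2 * mean_seconds)
            return True
        return False
    return on_response


def count_reinforcements(schedule, response_times):
    """Feed each response time to the schedule and count the reinforced ones."""
    return sum(schedule(t) for t in response_times)


rng = random.Random(0)                      # seeded so runs are repeatable
times = [float(t) for t in range(1, 61)]    # one response per second for a minute
print("fixed ratio 5:", count_reinforcements(fixed_ratio(5), times))
print("variable ratio 5:", count_reinforcements(variable_ratio(5, rng), times))
print("fixed interval 10s:", count_reinforcements(fixed_interval(10), times))
print("variable interval 10s:", count_reinforcements(variable_interval(10, rng), times))
```

On the ratio schedules, responding faster earns reinforcements sooner. On the interval schedules, extra responses before the interval has elapsed earn nothing.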
Implications and limitations
• Like animals, humans learn in more or less
the same way.
• Based on observable behaviour and the influence of the
environment.
• Did not consider insight, cognition, or
genetics.
• One argument: experiments on animals cannot
be generalized to humans, as their physiology and
mental capacities are very different.
THANK YOU!
