ORGANIZATIONAL BEHAVIOUR
DIFFERENCE BETWEEN NEGATIVE
REINFORCEMENT
AND PUNISHMENT BY REMOVAL
Report Submitted to the
Institute of Real Estate and Finance ( MBA+PGP)
BY
Vikrant Wagh
Rajlaxmi Pardeshi
Parag Jagtap
Sachin Rai
Dilip Mehta
UNDER THE GUIDANCE OF
Miss Pooja Kumar
ACKNOWLEDGEMENT
We would like to express our special thanks and gratitude to our director,
Mr. Abhay Kumar, for his support and guidance throughout and for giving
us the opportunity to work on this report, which helped us to
understand the difference between negative reinforcement and punishment by
removal in depth and taught us many new things. We would also
like to thank our mentor, Miss Pooja
Kumar, for giving us the opportunity to work on this report. This
opportunity deepened our knowledge of the difference
between negative reinforcement and punishment by removal.
TABLE OF CONTENTS
1. INTRODUCTION
2. REINFORCEMENT
3. TYPES OF REINFORCEMENT
1. POSITIVE REINFORCEMENT
2. NEGATIVE REINFORCEMENT
3. PUNISHMENT
4. EXTINCTION
4. PUNISHMENT
1. POSITIVE PUNISHMENT
2. NEGATIVE PUNISHMENT
5. REINFORCEMENT SCHEDULES
1. FIXED RATIO
2. FIXED INTERVAL
6. LEARNING FROM REINFORCEMENT
7. EXAMPLES
8. CHALLENGES WITH REINFORCEMENT LEARNING
9. DIFFERENCE BETWEEN POSITIVE AND NEGATIVE
REINFORCEMENT
10. POSITIVE REINFORCEMENT VS PUNISHMENT
11. REFERENCES
INTRODUCTION
REINFORCEMENT:
The term reinforce means to strengthen, and it is used in psychology to refer to any stimulus that
strengthens the probability of a specific response.
For example, if you want your dog to sit on command, you may give him a treat every time he
sits for you. The dog will eventually come to understand that sitting when told to will result in
a treat. This treat is reinforcing because dogs like treats.
This is a simple description of a reinforcer (the treat), which increases the response (sitting).
We all apply reinforcers every day, most of the time without even realizing we are doing it.
You may tell your child “good job” after he or she cleans their room; perhaps you tell your
partner how good he or she looks when they dress up; or maybe you got a raise at work after
doing a great job on a project. All of these things increase the probability that the same response
will be repeated.
There are four types of reinforcement:
• Positive reinforcement
• Negative reinforcement
• Punishment
• Extinction
Operant Conditioning
                      Add Something            Remove Something
Increase a Behavior   Positive Reinforcement   Negative Reinforcement
Decrease a Behavior   Positive Punishment      Negative Punishment
Positive Reinforcement:
Think of it as adding something in order to increase a response. For example, adding a treat
will increase the response of sitting; adding praise will increase the chances of your child
cleaning his or her room. The most common types of positive reinforcement are praise and
rewards, and most of us have experienced this as both the giver and receiver.
Negative Reinforcement:
Think of negative reinforcement as taking something unpleasant away in order to increase a response.
Stopping the nagging once your son picks up his room, or switching off a loud alarm once the
right button is pressed, are examples of this. Basically, you remove or avoid something aversive
in order to increase a certain response or behavior.
Punishment (Positive Punishment):
What most people refer to as punishment is typically positive punishment. This is when something
aversive is added in order to decrease a behavior. The most common example of this is
disciplining a child for misbehaving. We do this because the child begins to
associate being punished with the negative behavior. The punishment is not liked and therefore,
to avoid it, he or she will stop behaving in that manner.
Negative Punishment:
When you remove something in order to decrease a behavior, this is called negative
punishment. You are taking something away so that a response or unwanted behavior is
decreased. Putting a child in a time-out until they can decrease their aggressive behavior, for
instance, is an example of a negative punishment. You’re removing interactions with others in
order to decrease the unwanted behavior.
Reinforcement Schedules:
Now that we understand the four types of reinforcement, we need to understand how and
when these are applied (Ferster & Skinner, 1957). For example, do we apply the positive
reinforcement every time a child does something positive? Do we punish a child every time he
does something negative? To answer these questions, you need to understand the schedules of
reinforcement.
Applying one of the four types of reinforcement every time the behavior occurs is called a
continuous schedule. It is continuous because the application occurs after every project,
behavior, etc. This is the best approach when using punishment. Inconsistencies in the
punishment of children often result in confusion and resentment. A problem with this schedule
is that we are not always present when a behavior occurs or may not be able to apply the
punishment.
There are two types of fixed schedules:
• Fixed Ratio: A fixed ratio schedule refers to applying the reinforcement after a specific
number of behaviors. Spanking a child if you have to ask him three times to clean his
room is an example. The problem is that the child will begin to realize that he can get
away with two requests before he has to act. Therefore, the behavior does not tend to
change until right before the preset number.
• Fixed Interval: Applying the reinforcer after a specific amount of time is referred to as
a fixed interval schedule. An example might be getting a raise every year and not in
between. A major problem with this schedule is that people tend to improve their
performance right before the time period expires so as to “look good” when the review
comes around.
When reinforcement is applied on an irregular basis, the schedules are called variable
schedules.
• Variable Ratio: This refers to applying a reinforcer after a variable number of
responses. Variable ratio schedules have been found to work best under many
circumstances and knowing an example will explain why. Imagine walking into a
casino and heading for the slot machines. After the third coin you put in, you get two
back. Two more and you get three back. Another five coins and you receive two more
back. How difficult is it to stop playing?
• Variable Interval: Reinforcing someone after a variable amount of time is the
final schedule. If you have a boss who checks your work periodically, you understand
the power of this schedule. Because you don’t know when the next ‘check-up’ might
come, you have to be working hard at all times in order to be ready.
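The four schedules above can be sketched in a few lines of Python. This is a minimal illustration rather than a model taken from Ferster & Skinner: the function names, the time units and the uniform random draws used for the variable schedules are all assumptions made for the example. Each function returns a callable that is given the time of a response and answers whether that response is reinforced.

```python
import random

def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0
    def reinforce(now):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False
    return reinforce

def variable_ratio(mean_n, rng):
    """Reinforce after a random number of responses that averages mean_n."""
    count = 0
    target = rng.randint(1, 2 * mean_n - 1)   # assumed distribution
    def reinforce(now):
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(1, 2 * mean_n - 1)
            return True
        return False
    return reinforce

def fixed_interval(period):
    """Reinforce the first response made at least `period` after the last reinforcement."""
    last = 0.0
    def reinforce(now):
        nonlocal last
        if now - last >= period:
            last = now
            return True
        return False
    return reinforce

def variable_interval(mean_period, rng):
    """Like fixed_interval, but the waiting time is redrawn after each reinforcement."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_period)    # assumed distribution
    def reinforce(now):
        nonlocal last, wait
        if now - last >= wait:
            last = now
            wait = rng.uniform(0, 2 * mean_period)
            return True
        return False
    return reinforce

if __name__ == "__main__":
    rng = random.Random(0)
    schedules = {
        "fixed ratio (every 4th response)": fixed_ratio(4),
        "variable ratio (about every 4th)": variable_ratio(4, rng),
        "fixed interval (every 5 time units)": fixed_interval(5),
        "variable interval (about every 5)": variable_interval(5, rng),
    }
    # One response per time unit, from t = 1 to t = 20.
    for name, schedule in schedules.items():
        print(name, [t for t in range(1, 21) if schedule(t)])
```

With the fixed schedules the reinforced responses fall at predictable points, which is why behavior tends to change only just before them. With the variable schedules the next reinforcement cannot be predicted, as in the slot machine and the boss examples.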
What is Reinforcement Learning:
Reinforcement learning is the training of machine learning models to make a sequence of
decisions. The agent learns to achieve a goal in an uncertain, potentially complex environment.
In reinforcement learning, an artificial intelligence faces a game-like situation. The computer
employs trial and error to come up with a solution to the problem. To get the machine to do
what the programmer wants, the artificial intelligence gets either rewards or penalties for the
actions it performs. Its goal is to maximize the total reward. Although the designer sets the
reward policy (that is, the rules of the game), the designer gives the model no hints or suggestions for
how to solve the game. It is up to the model to figure out how to perform the task to maximize
the reward, starting from totally random trials and finishing with sophisticated tactics and
superhuman skills. By leveraging the power of search and many trials, reinforcement learning
is currently one of the most effective ways to hint at a machine's creativity. In contrast to human beings,
artificial intelligence can gather experience from thousands of parallel gameplays if a
reinforcement learning algorithm is run on a sufficiently powerful computer infrastructure.
• For example, applications of reinforcement learning were in the past limited by weak
computer infrastructure. However, as Gerald Tesauro’s backgammon AI superplayer,
developed in the 1990s, shows, progress did happen. That early progress is now rapidly
accelerating, with powerful new computational technologies opening the way to
completely new and inspiring applications. Training the models that control autonomous
cars is an excellent example of a potential application of reinforcement learning. In an
ideal situation, the computer should get no instructions on driving the car. The
programmer would avoid hard-wiring anything connected with the task and allow the
machine to learn from its own errors. In a perfect situation, the only hard wired element
would be the reward function.
• For example: In usual circumstances we would require an autonomous vehicle to put
safety first, minimize ride time, reduce pollution, offer passengers comfort and obey
the rules of law. With an autonomous race car, on the other hand, we would emphasize
speed much more than the driver’s comfort. The programmer cannot predict everything
that could happen on the road. Instead of building lengthy “if-then” instructions, the
programmer prepares the reinforcement learning agent to be capable of learning from
the system of rewards and penalties. The agent (another name for reinforcement
learning algorithms performing the task) gets rewards for reaching specific goals.
• Another example: deepsense.ai took part in the “Learning to run” project, which aimed
to train a virtual runner from scratch. The runner is an advanced and precise
musculoskeletal model designed by the Stanford Neuromuscular Biomechanics
Laboratory. Teaching the agent how to run is a first step in building a new generation
of prosthetic legs, ones that automatically recognize people’s walking patterns and
tweak themselves to make moving easier and more effective. While it is possible and
has been done in Stanford’s labs, hard-wiring all the commands and predicting all
possible patterns of walking requires a lot of work from highly skilled programmers.
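The idea of learning only from rewards and penalties, described in this section, can be sketched with tabular Q-learning. Q-learning is one common reinforcement learning algorithm chosen here for illustration; the report itself does not name an algorithm. The corridor environment, the reward values and the hyperparameters below are all hypothetical. The agent starts in cell 0, receives a small penalty for every step and a reward for reaching the last cell, and is never told which direction to walk.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # step left or step right

def step(state, action):
    """Environment: move, stay inside the corridor, return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    if next_state == N_STATES - 1:
        return next_state, 1.0, True       # reward for reaching the goal
    return next_state, -0.01, False        # small penalty for every other step

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by trial and error; only `step` encodes the rules."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)                        # explore
            else:
                a = 0 if q[state][0] > q[state][1] else 1   # exploit
            next_state, reward, done = step(state, ACTIONS[a])
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q_table = train()
policy = ["left" if q[0] > q[1] else "right" for q in q_table[:-1]]
print(policy)
```

The only hand-written knowledge is the reward function inside `step`. With these settings the learned policy should be to move right in every cell, although nothing in `train` says so explicitly.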
Challenges with reinforcement learning:
The main challenge in reinforcement learning lies in preparing the simulation environment,
which is highly dependent on the task to be performed. When the model has to go
superhuman in Chess, Go or Atari games, preparing the simulation environment is
relatively simple. When it comes to building a model capable of driving an autonomous
car, building a realistic simulator is crucial before letting the car ride on the street. The
model has to figure out how to brake or avoid a collision in a safe environment, where
sacrificing even a thousand cars comes at a minimal cost. Transferring the model out of the
training environment and into the real world is where things get tricky. Scaling and
tweaking the neural network controlling the agent is another challenge. There is no way to
communicate with the network other than through the system of rewards and penalties.
This in particular may lead to catastrophic forgetting, where acquiring new knowledge
causes some of the old to be erased from the network.
Yet another challenge is reaching a local optimum, that is, the agent performs the task,
but not in the optimal or required way. A “jumper” jumping like a kangaroo instead of
doing the thing that was expected of it, walking, is a great example. Finally, there are
agents that will optimize the reward without performing the task they were designed for.
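The local optimum problem can be illustrated with a deliberately tiny example. The sketch below is an assumption-laden toy, not taken from the report: a two-armed bandit with fixed, hypothetical payouts, and an agent that keeps a running average of the reward seen from each arm. A purely greedy agent settles on the first arm that pays anything, while an agent that occasionally explores finds the better arm.

```python
import random

ARM_PAYOUTS = (1.0, 5.0)   # hypothetical fixed rewards; arm 1 is the better choice

def run_bandit(epsilon, steps=500, seed=0):
    """Return how often each arm was pulled by an epsilon-greedy agent."""
    rng = random.Random(seed)
    estimates = [0.0, 0.0]   # running average reward per arm
    pulls = [0, 0]
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(2)                               # explore
        else:
            arm = 0 if estimates[0] >= estimates[1] else 1       # exploit
        reward = ARM_PAYOUTS[arm]
        pulls[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
    return pulls

print("greedy:   ", run_bandit(epsilon=0.0))
print("exploring:", run_bandit(epsilon=0.1))
```

The greedy agent never leaves arm 0: it does collect reward, so it "performs the task", but not in the best available way. Exploration is one standard remedy, at the cost of some deliberately suboptimal pulls.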
What distinguishes reinforcement learning from deep learning and machine
learning:
In fact, there is no clear divide between machine learning, deep learning and
reinforcement learning. It is like a parallelogram-rectangle-square relation, where machine
learning is the broadest category and deep reinforcement learning the narrowest one.
In the same way, reinforcement learning is a specialized application of machine and deep
learning techniques, designed to solve problems in a particular way.
Although the ideas seem to differ, there is no sharp divide between these subtypes. Moreover,
they merge within projects, as the models are designed not to stick to a “pure type” but to
perform the task in the most effective way possible. So “what precisely distinguishes machine
learning, deep learning and reinforcement learning” is actually a tricky question to answer.
• Machine learning is a form of AI in which computers are given the ability to progressively
improve their performance on a specific task with data, without being directly programmed
(Arthur Lee Samuel’s definition). Samuel coined the term “machine learning”, of which there
are two main types, supervised and unsupervised machine learning.
• Supervised machine learning happens when a programmer can provide a label for every
training input into the machine learning system.
• Example: By analyzing the historical data taken from coal mines, deepsense.ai prepared
an automated system for predicting dangerous seismic events up to 8 hours before they
occur. The records of seismic events were taken from 24 coal mines that had collected data
for several months. The model was able to recognize the likelihood of an explosion by
analyzing the readings from the previous 24 hours.
Unsupervised learning takes place when the model is provided only with the input data, but no
explicit labels. It has to dig through the data and find the hidden structure or relationships
within. The designer might not know what the structure is or what the machine learning model
is going to find.
• An example we employed was for churn prediction. We analyzed customer data and
designed an algorithm to group similar customers. However, we didn’t choose the groups
ourselves. Later on, we could identify high-risk groups (those with a high churn rate) and
our client knew which customers they should approach first.
• Another example of unsupervised learning is anomaly detection, where the algorithm has
to spot the element that doesn’t fit in with the group. It may be a flawed product, potentially
fraudulent transaction or any other event associated with breaking the norm.
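The difference between the two settings can be sketched on one small data set. Everything below is illustrative: the numbers, the labels and the two simple methods (a nearest-centroid classifier for the supervised case and k-means with two groups for the unsupervised case) are assumptions chosen for brevity, not the methods used in the projects mentioned above.

```python
def fit_centroids(values, labels):
    """Supervised: use the provided labels to average the values of each class."""
    sums, counts = {}, {}
    for value, label in zip(values, labels):
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, value):
    """Assign a new value to the class with the closest centroid."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

def two_means(values, iterations=10):
    """Unsupervised: split unlabeled values into two groups (k-means, k = 2).

    Assumes the values are not all identical.
    """
    low, high = min(values), max(values)
    for _ in range(iterations):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return group_low, group_high

# Hypothetical data: monthly logins of six customers.
logins = [1, 2, 3, 20, 22, 25]

# Supervised: every training input comes with a label.
labels = ["churned", "churned", "churned", "stayed", "stayed", "stayed"]
centroids = fit_centroids(logins, labels)
print(predict(centroids, 4))

# Unsupervised: the same inputs without labels; the groups are discovered.
print(two_means(logins))
```

In the supervised half the categories are chosen by the designer through the labels. In the unsupervised half the algorithm only reports that two groups exist, and a person has to interpret afterwards what they mean.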
Deep learning uses neural networks built from several layers, designed to perform more
sophisticated tasks. The construction of deep learning models was inspired by the design of the
human brain, but simplified. Deep learning models consist of a few neural network layers
which are in principle responsible for gradually learning more abstract features about particular
data. Although deep learning solutions are able to provide marvelous results, in terms of scale
they are no match for the human brain. Each layer uses the outcome of a previous one as an
input and the whole network is trained as a single whole. The core concept of creating an
artificial neural network is not new, but only recently has modern hardware provided enough
computational power to effectively train such networks by exposing a sufficient number of
examples. Extended adoption has brought about frameworks like TensorFlow, Keras and
PyTorch, all of which have made building machine learning models much more convenient.
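The statement that each layer uses the outcome of the previous one as its input can be made concrete with a very small forward pass. This is a plain-Python sketch with hypothetical, hand-picked weights. It shows only how data flows through the layers, not how the weights are trained, and it does not use any of the frameworks named above.

```python
def dense(inputs, weights, biases):
    """One fully connected layer: each output is a weighted sum of all inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Common nonlinearity: negative values are replaced by zero."""
    return [max(0.0, v) for v in values]

def forward(inputs, layers):
    """Feed the output of each layer into the next one."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:          # no nonlinearity after the last layer
            activations = relu(activations)
    return activations

# Hypothetical weights: a hidden layer with two units, then one output unit.
LAYERS = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.5]),
]

print(forward([2.0, 1.0], LAYERS))
```

Training would adjust the numbers in `LAYERS` for all layers together, which is what "trained as a single whole" refers to.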
• Example: deepsense.ai designed a deep learning-based model for the National Oceanic
and Atmospheric Administration (NOAA). It was designed to recognize right whales from
aerial photos taken by researchers. From a technical point of view, recognizing a
particular specimen of whale from aerial photos is pure deep learning.
Reinforcement learning, as stated above, employs a system of rewards and penalties to compel
the computer to solve a problem by itself. Human involvement is limited to changing the
environment and tweaking the system of rewards and penalties. As the computer maximizes
the reward, it is prone to seeking unexpected ways of doing it. Human involvement is focused
on preventing it from exploiting the system and motivating the machine to perform the task in
the way expected. Reinforcement learning is useful when there is no “proper way” to perform
a task, yet there are rules the model has to follow to perform its duties correctly.
• Example: By tweaking and seeking the optimal policy for deep reinforcement learning, we
built an agent that in just 20 minutes reached a superhuman level in playing Atari games.
Similar algorithms in principle can be used to build AI for an autonomous car or a prosthetic
leg. In fact, one of the best ways to evaluate the reinforcement learning approach is to give
the model an Atari video game to play, such as Arkanoid or Space Invaders. According to
Google Brain’s Marc G. Bellemare, who introduced Atari video games as a reinforcement
learning benchmark, “although challenging, these environments remain simple enough that
we can hope to achieve measurable progress as we attempt to solve them”.
The key distinguishing factor of reinforcement learning is how the agent is trained. Instead of
inspecting the data provided, the model interacts with the environment, seeking ways to
maximize the reward. In the case of deep reinforcement learning, a neural network is in charge of
storing the experiences and thus improves the way the task is performed.
The Difference between Positive/Negative Reinforcement and
Positive/Negative Punishment
In Applied Behaviour Analysis, there are two types of reinforcement and punishment: positive
and negative. It can be difficult to distinguish between the four of these. Therefore, the purpose
of this section is to explain the differences in order to help parents and professionals develop
appropriate interventions to improve behaviour.
Reinforcement
• Reinforcement is used to help increase the probability that a specific behaviour will
occur in the future by delivering or removing a stimulus immediately after a behaviour.
• Another way to put it is that reinforcement, if done correctly, results in a behaviour
occurring more frequently in the future.
Positive Reinforcement
Positive reinforcement works by presenting a motivating/reinforcing stimulus to the person
after the desired behaviour is exhibited, making the behaviour more likely to happen in the
future.
The following are some examples of positive reinforcement:
• A mother gives her son praise (reinforcing stimulus) for doing homework (behaviour).
• The little boy receives $5.00 (reinforcing stimulus) for every A he earns on his report card
(behaviour).
• A father gives his daughter candy (reinforcing stimulus) for cleaning up toys (behaviour)
Negative Reinforcement
Negative reinforcement occurs when a certain stimulus (usually an aversive stimulus)
is removed after a particular behaviour is exhibited. The likelihood of the particular behaviour
occurring again in the future is increased because of removing/avoiding the negative
consequence.
Negative reinforcement should not be thought of as a punishment procedure. With negative
reinforcement, you are increasing a behaviour, whereas with punishment, you are decreasing a
behaviour.
The following are some examples of negative reinforcement:
• Bob does the dishes (behaviour) in order to stop his mother’s nagging (aversive stimulus).
• Natalie can get up from the dinner table (aversive stimulus) when she eats 2 bites of her
broccoli (behaviour).
• Joe presses a button (behaviour) that turns off a loud alarm (aversive stimulus)
When thinking about reinforcement, always remember that the end result is to try to increase
the behaviour, whereas punishment procedures are used to decrease behaviour. For positive
reinforcement, think of it as adding something positive in order to increase a response. For
negative reinforcement, think of it as taking something negative away in order to increase a
response.
Punishment
• When people hear that punishment procedures are being used, they typically think of an
aversive or harmful consequence. This is not always the case as you can see below.
• Punishment is a process by which a consequence immediately follows a behaviour which
decreases the future frequency of that behaviour. Like reinforcement, a stimulus can be
added (positive punishment) or removed (negative punishment).
• There are two types of punishment: positive and negative, and it can be difficult to tell the
difference between the two. Below are some examples to help clear up the confusion.
What is Positive Punishment?
Positive punishment works by presenting an aversive consequence after an undesired
behaviour is exhibited, making the behaviour less likely to happen in the future. The following
are some examples of positive punishment:
• A child picks his nose during class (behaviour) and the teacher reprimands him
(aversive stimulus) in front of his classmates.
• A child touches a hot stove (behaviour) and feels pain (aversive stimulus).
• A person eats spoiled food (behaviour) and gets a bad taste in his/her mouth (aversive
stimulus).
What is Negative Punishment?
Negative punishment happens when a certain reinforcing stimulus is removed after a particular
undesired behaviour is exhibited, resulting in the behaviour happening less often in the future.
The following are some examples of negative punishment:
• A child kicks a peer (behaviour), and is removed from his/her favourite activity (reinforcing
stimulus removed).
• A child yells out in class (behaviour) and loses a token for good behaviour on his/her token
board (reinforcing stimulus removed) that could later have been cashed in for a prize.
• A child fights with her brother (behaviour) and has her favourite toy taken away
(reinforcing stimulus removed).
With punishment, always remember that the end result is to try to decrease the undesired
behaviour. Positive punishment involves adding an aversive consequence after an undesired
behaviour is emitted to decrease future responses. Negative punishment includes taking away
a certain reinforcing item after the undesired behaviour happens in order to decrease future
responses.
It should be noted that research shows that positive consequences are more powerful than
negative consequences for improving behaviour. Therefore, it is always suggested that
interventions based on positive consequences be tried before negative consequences.
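The four categories described in this section reduce to two questions: was a stimulus added or removed, and did the behaviour become more or less frequent afterwards? The lookup below is a small sketch of that rule. The function name and the string values are invented for the example, and the test cases are taken from the examples given earlier in this section.

```python
# Keys: (what happened to the stimulus, what happened to the behaviour).
CATEGORIES = {
    ("added", "increases"): "positive reinforcement",
    ("removed", "increases"): "negative reinforcement",
    ("added", "decreases"): "positive punishment",
    ("removed", "decreases"): "negative punishment",
}

def classify(stimulus_change, behaviour_effect):
    """Name the procedure from the stimulus change and its effect on behaviour."""
    return CATEGORIES[(stimulus_change, behaviour_effect)]

# Praise is given and homework becomes more frequent.
print(classify("added", "increases"))
# A loud alarm stops when the button is pressed, so button pressing increases.
print(classify("removed", "increases"))
# A reprimand is given and nose picking becomes less frequent.
print(classify("added", "decreases"))
# A token is taken away and yelling out in class becomes less frequent.
print(classify("removed", "decreases"))
```

Note that "positive" and "negative" describe only whether something is added or removed, not whether the consequence is pleasant. Whether the procedure counts as reinforcement or punishment is decided by the effect on the behaviour.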
Positive and Negative Reinforcement and Punishment
            Reinforcement                       Punishment
Positive    Something is added to increase      Something is added to decrease
            the likelihood of a behavior        the likelihood of a behavior
Negative    Something is removed to increase    Something is removed to decrease
            the likelihood of a behavior        the likelihood of a behavior
Negative Reinforcement VS Punishment
Negative Reinforcement                          Punishment

Reinforcement follows a behavior and            Punishment follows a behavior and
results in an increase in future responding.    results in a decrease in future responding.

Reinforcement strengthens, maintains            Punishment decreases and reduces
and increases behavior.                         behavior.

Negative refers to the removal of a             Whether a stimulus is added or removed
stimulus. Negative reinforcement occurs         following the behavior, if the result is a
when a stimulus is removed following a          decrease in the behavior, punishment has
behavior and results in an increase in the      occurred.
behavior.

In negative reinforcement, something is         In punishment, something is added or
removed following a behavior, which             removed following a behavior, which
increases the likelihood that the behavior      reduces the likelihood that the behavior
will occur in the future.                       will occur again.
References:
https://www.parentingforbrain.com/difference-between-positive-negative-reinforcement-and-punishment/
https://difference.guru/difference-between-negative-reinforcement-and-punishment/
https://keydifferences.com/difference-between-reinforcement-and-punishment.html

More Related Content

Similar to Organizational behaviour

Re Boot Team²20071219
Re Boot Team²20071219Re Boot Team²20071219
Re Boot Team²20071219Yves Hanoulle
 
Bringing out the best in people summary with example
Bringing out the best in people summary with exampleBringing out the best in people summary with example
Bringing out the best in people summary with exampleIlya Sizov
 
learning_theories_reinforcement.pptx
learning_theories_reinforcement.pptxlearning_theories_reinforcement.pptx
learning_theories_reinforcement.pptxpriyanka pandey
 
Operant Conditioning in Day to Day Life
Operant Conditioning in Day to Day LifeOperant Conditioning in Day to Day Life
Operant Conditioning in Day to Day LifeKumari K. Karandawala
 
238 PART 1 Individual Behavior6.6 REINFORCEMENT AND CONSEQ.docx
238 PART 1 Individual Behavior6.6 REINFORCEMENT AND CONSEQ.docx238 PART 1 Individual Behavior6.6 REINFORCEMENT AND CONSEQ.docx
238 PART 1 Individual Behavior6.6 REINFORCEMENT AND CONSEQ.docxnovabroom
 
Reward Power - The Fastest Way To Persuade
Reward Power - The Fastest Way To PersuadeReward Power - The Fastest Way To Persuade
Reward Power - The Fastest Way To Persuadeaboardvisitor9294
 
Prosci's Top 10 Tactics for Managing Resistance.pdf
Prosci's Top 10 Tactics for Managing Resistance.pdfProsci's Top 10 Tactics for Managing Resistance.pdf
Prosci's Top 10 Tactics for Managing Resistance.pdfMARG Business Transformation
 
238 PART 1 Individual Behavior6.6 REINFORCEMENT AND CONSEQ.docx
238 PART 1 Individual Behavior6.6 REINFORCEMENT AND CONSEQ.docx238 PART 1 Individual Behavior6.6 REINFORCEMENT AND CONSEQ.docx
238 PART 1 Individual Behavior6.6 REINFORCEMENT AND CONSEQ.docxdomenicacullison
 
Types of machine learning
Types of machine learningTypes of machine learning
Types of machine learningHimaniAloona
 
Get better at getting better
Get better at getting betterGet better at getting better
Get better at getting betterKranthi Rainbow
 
Get better at_getting_better - share
Get better at_getting_better - shareGet better at_getting_better - share
Get better at_getting_better - shareKranthi Rainbow
 
MOTIVATION KOMAL YADAV
MOTIVATION KOMAL YADAVMOTIVATION KOMAL YADAV
MOTIVATION KOMAL YADAVpklk
 
KOMAL YADAV MOTIVATION MEANING AND THEORIES
KOMAL YADAV MOTIVATION MEANING AND THEORIESKOMAL YADAV MOTIVATION MEANING AND THEORIES
KOMAL YADAV MOTIVATION MEANING AND THEORIESpklk
 
MICROLEARNING FOR TRANSFORMATION, NOT INFORMATION TRANSFER
MICROLEARNING FOR TRANSFORMATION, NOT INFORMATION TRANSFERMICROLEARNING FOR TRANSFORMATION, NOT INFORMATION TRANSFER
MICROLEARNING FOR TRANSFORMATION, NOT INFORMATION TRANSFERHuman Capital Media
 
Feedback Models Handout-3 from Dream Team Webinar
Feedback Models Handout-3 from Dream Team WebinarFeedback Models Handout-3 from Dream Team Webinar
Feedback Models Handout-3 from Dream Team Webinarguestbc85d0
 
Experimentation mindset
Experimentation mindsetExperimentation mindset
Experimentation mindsetDoc Norton
 
The Core Protocols Zen
The Core Protocols ZenThe Core Protocols Zen
The Core Protocols ZenYves Hanoulle
 
Reinforcement theory of motivation -proposed by bf skinner
Reinforcement theory of motivation -proposed by bf skinnerReinforcement theory of motivation -proposed by bf skinner
Reinforcement theory of motivation -proposed by bf skinnerEVERSENDAI ENGINEERING (L.L.C.)
 
Intro to Reinforcement Learning
Intro to Reinforcement LearningIntro to Reinforcement Learning
Intro to Reinforcement LearningUtkarsh Garg
 

Similar to Organizational behaviour (20)

Re Boot Team²20071219
Re Boot Team²20071219Re Boot Team²20071219
Re Boot Team²20071219
 
Bringing out the best in people summary with example
Bringing out the best in people summary with exampleBringing out the best in people summary with example
Bringing out the best in people summary with example
 
learning_theories_reinforcement.pptx
learning_theories_reinforcement.pptxlearning_theories_reinforcement.pptx
learning_theories_reinforcement.pptx
 
Operant Conditioning in Day to Day Life
Operant Conditioning in Day to Day LifeOperant Conditioning in Day to Day Life
Operant Conditioning in Day to Day Life
 
238 PART 1 Individual Behavior6.6 REINFORCEMENT AND CONSEQ.docx
238 PART 1 Individual Behavior6.6 REINFORCEMENT AND CONSEQ.docx238 PART 1 Individual Behavior6.6 REINFORCEMENT AND CONSEQ.docx
238 PART 1 Individual Behavior6.6 REINFORCEMENT AND CONSEQ.docx
 
Reward Power - The Fastest Way To Persuade
Reward Power - The Fastest Way To PersuadeReward Power - The Fastest Way To Persuade
Reward Power - The Fastest Way To Persuade
 
Prosci's Top 10 Tactics for Managing Resistance.pdf
Prosci's Top 10 Tactics for Managing Resistance.pdfProsci's Top 10 Tactics for Managing Resistance.pdf
Prosci's Top 10 Tactics for Managing Resistance.pdf
 
238 PART 1 Individual Behavior6.6 REINFORCEMENT AND CONSEQ.docx
238 PART 1 Individual Behavior6.6 REINFORCEMENT AND CONSEQ.docx238 PART 1 Individual Behavior6.6 REINFORCEMENT AND CONSEQ.docx
238 PART 1 Individual Behavior6.6 REINFORCEMENT AND CONSEQ.docx
 
Types of machine learning
Types of machine learningTypes of machine learning
Types of machine learning
 
Get better at getting better
Get better at getting betterGet better at getting better
Get better at getting better
 
Get better at_getting_better - share
Get better at_getting_better - shareGet better at_getting_better - share
Get better at_getting_better - share
 
MOTIVATION KOMAL YADAV
MOTIVATION KOMAL YADAVMOTIVATION KOMAL YADAV
MOTIVATION KOMAL YADAV
 
Learning
LearningLearning
Learning
 
KOMAL YADAV MOTIVATION MEANING AND THEORIES
KOMAL YADAV MOTIVATION MEANING AND THEORIESKOMAL YADAV MOTIVATION MEANING AND THEORIES
KOMAL YADAV MOTIVATION MEANING AND THEORIES
 
MICROLEARNING FOR TRANSFORMATION, NOT INFORMATION TRANSFER
MICROLEARNING FOR TRANSFORMATION, NOT INFORMATION TRANSFERMICROLEARNING FOR TRANSFORMATION, NOT INFORMATION TRANSFER
MICROLEARNING FOR TRANSFORMATION, NOT INFORMATION TRANSFER
 
Feedback Models Handout-3 from Dream Team Webinar
Feedback Models Handout-3 from Dream Team WebinarFeedback Models Handout-3 from Dream Team Webinar
Feedback Models Handout-3 from Dream Team Webinar
 
Experimentation mindset
Experimentation mindsetExperimentation mindset
Experimentation mindset
 
The Core Protocols Zen
The Core Protocols ZenThe Core Protocols Zen
The Core Protocols Zen
 
Reinforcement theory of motivation -proposed by bf skinner
Reinforcement theory of motivation -proposed by bf skinnerReinforcement theory of motivation -proposed by bf skinner
Reinforcement theory of motivation -proposed by bf skinner
 
Intro to Reinforcement Learning
Intro to Reinforcement LearningIntro to Reinforcement Learning
Intro to Reinforcement Learning
 

Organizational behaviour

  • 1. ORGANIZATIONAL BEHAVIOUR: DIFFERENCE BETWEEN NEGATIVE REINFORCEMENT AND PUNISHMENT BY REMOVAL. Report Submitted to the Institute of Real Estate and Finance (MBA+PGP) BY Vikrant Wagh, Rajlaxmi Pardeshi, Parag Jagtap, Sachin Rai, Dilip Mehta UNDER THE GUIDANCE OF Miss Pooja Kumar
  • 2. ACKNOWLEDGEMENT We would like to express our special thanks and gratitude to our director, Mr. Abhay Kumar, for his support and guidance throughout and for giving us the opportunity to work on this report, which helped us understand the difference between negative reinforcement and punishment by removal in depth. We would also like to thank our mentor, Miss Pooja Kumar, for the opportunity to work on this report, which deepened our knowledge of the subject.
  • 3. TABLE OF CONTENTS 1. INTRODUCTION 2. REINFORCEMENT 3. TYPES OF REINFORCEMENT 1. POSITIVE REINFORCEMENT 2. NEGATIVE REINFORCEMENT 3. PUNISHMENT 4. EXTINCTION 4. PUNISHMENT 1. POSITIVE PUNISHMENT 2. NEGATIVE PUNISHMENT 5. REINFORCEMENT SCHEDULES 1. FIXED RATIO 2. FIXED INTERVAL 6. LEARNING FROM REINFORCEMENT 7. EXAMPLES 8. CHALLENGES WITH REINFORCEMENT LEARNING 9. DIFFERENCE BETWEEN POSITIVE AND NEGATIVE REINFORCEMENT 10. POSITIVE REINFORCEMENT VS PUNISHMENT 11. REFERENCES
  • 4. INTRODUCTION REINFORCEMENT: The term reinforce means to strengthen, and it is used in psychology to refer to any stimulus that strengthens the probability of a specific response. For example, if you want your dog to sit on command, you may give him a treat every time he sits for you. The dog will eventually come to understand that sitting when told to will result in a treat. This treat is reinforcing because dogs like treats. This is a simple description of a reinforcer (the treat), which increases the response (sitting). We all apply reinforcers every day, most of the time without even realizing we are doing it. You may tell your child “good job” after he or she cleans their room; perhaps you tell your partner how good he or she looks when they dress up; or maybe you got a raise at work after doing a great job on a project. All of these things increase the probability that the same response will be repeated. There are four types of reinforcement: positive reinforcement, negative reinforcement, punishment, and extinction. Operant Conditioning: to increase a behavior, add something (positive reinforcement) or remove something (negative reinforcement); to decrease a behavior, add something (positive punishment) or remove something (negative punishment). Positive Reinforcement: Think of it as adding something in order to increase a response. For example, adding a treat will increase the response of sitting; adding praise will increase the chances of your child cleaning his or her room. The most common types of positive reinforcement are praise and rewards, and most of us have experienced this as both the giver and receiver. Negative Reinforcement: Think of negative reinforcement as taking something away in order to increase a response. Withholding a toy until your son picks up his room, or withholding payment until a job is complete, are commonly cited examples: the restriction is lifted once the behavior occurs. Basically, an unwanted condition is removed or ended when the desired response occurs, which makes that response more likely.
  • 5. Punishment (Positive Punishment): What most people refer to as punishment is typically positive punishment. This is when something aversive is added in order to decrease a behavior. The most common example of this is disciplining a child for misbehaving. We do this because the child begins to associate being punished with the negative behavior. The punishment is not liked, and therefore, to avoid it, he or she will stop behaving in that manner. Negative Punishment: When you remove something in order to decrease a behavior, this is called negative punishment. You are taking something away so that a response or unwanted behavior is decreased. Putting a child in a time-out until they can decrease their aggressive behavior, for instance, is an example of negative punishment. You are removing interactions with others in order to decrease the unwanted behavior. Reinforcement Schedules: Now that we understand the four types of reinforcement, we need to understand how and when these are applied (Ferster & Skinner, 1957). For example, do we apply positive reinforcement every time a child does something positive? Do we punish a child every time he does something negative? To answer these questions, you need to understand the schedules of reinforcement. Applying one of the four types of reinforcement every time the behavior occurs is called a continuous schedule. It is continuous because the application occurs after every project, behavior, etc. This is the best approach when using punishment. Inconsistencies in the punishment of children often result in confusion and resentment. A problem with this schedule is that we are not always present when a behavior occurs or may not be able to apply the punishment. There are two types of fixed schedules: • Fixed Ratio: A fixed ratio schedule refers to applying the reinforcement after a specific number of behaviors. Spanking a child if you have to ask him three times to clean his room is an example. 
The problem is that the child will begin to realize that he can get away with two requests before he has to act. Therefore, the behavior does not tend to change until right before the preset number. • Fixed Interval: Applying the reinforcer after a specific amount of time is referred to as a fixed interval schedule. An example might be getting a raise every year and not in between. A major problem with this schedule is that people tend to improve their performance right before the time period expires so as to “look good” when the review comes around.
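The two fixed schedules described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `fixed_ratio` and `fixed_interval`, and the idea of representing responses as counts or timestamps, are assumptions made for the example and do not come from Ferster & Skinner.

```python
def fixed_ratio(num_responses, ratio):
    """Return the 1-based response numbers that get reinforced
    when every `ratio`-th response is reinforced (hypothetical helper)."""
    return [i for i in range(1, num_responses + 1) if i % ratio == 0]


def fixed_interval(response_times, interval):
    """Return the times of reinforced responses: the first response made
    once at least `interval` time units have passed since the last
    reinforcement (hypothetical helper; time starts at 0)."""
    reinforced = []
    last_reinforced = 0.0
    for t in response_times:
        if t - last_reinforced >= interval:
            reinforced.append(t)
            last_reinforced = t
    return reinforced


# Every third response is reinforced.
print(fixed_ratio(10, 3))
# Responses made before the interval has elapsed earn nothing.
print(fixed_interval([1, 2, 5, 6, 7, 11, 12], 5))
```

Under the fixed interval sketch, responses at times 6 and 7 earn nothing because the interval has not yet elapsed, which mirrors the observation that effort tends to cluster just before the review period ends.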
  • 6. When reinforcement is applied on an irregular basis, the schedules are called variable schedules. • Variable Ratio: This refers to applying a reinforcer after a variable number of responses. Variable ratio schedules have been found to work best under many circumstances, and an example explains why. Imagine walking into a casino and heading for the slot machines. After the third coin you put in, you get two back. Two more and you get three back. Another five coins and you receive two more back. How difficult is it to stop playing? • Variable Interval: Reinforcing someone after a variable amount of time is the final schedule. If you have a boss who checks your work periodically, you understand the power of this schedule. Because you don’t know when the next ‘check-up’ might come, you have to be working hard at all times in order to be ready. What is Reinforcement Learning: Reinforcement learning is the training of machine learning models to make a sequence of decisions. The agent learns to achieve a goal in an uncertain, potentially complex environment. In reinforcement learning, an artificial intelligence faces a game-like situation. The computer employs trial and error to come up with a solution to the problem. To get the machine to do what the programmer wants, the artificial intelligence gets either rewards or penalties for the actions it performs. Its goal is to maximize the total reward. Although the designer sets the reward policy (that is, the rules of the game), he gives the model no hints or suggestions for how to solve the game. It is up to the model to figure out how to perform the task to maximize the reward, starting from totally random trials and finishing with sophisticated tactics and superhuman skills. By leveraging the power of search and many trials, reinforcement learning is currently the most effective way to hint at machine creativity. In contrast to human beings,
  • 7. artificial intelligence can gather experience from thousands of parallel gameplays if a reinforcement learning algorithm is run on a sufficiently powerful computer infrastructure. • Applications of reinforcement learning were in the past limited by weak computer infrastructure. However, as Gerald Tesauro’s backgammon AI superplayer, developed in the 1990s, shows, progress did happen. That early progress is now rapidly accelerating, with powerful new computational technologies opening the way to completely new and inspiring applications. Training the models that control autonomous cars is an excellent example of a potential application of reinforcement learning. In an ideal situation, the computer should get no instructions on driving the car. The programmer would avoid hard-wiring anything connected with the task and allow the machine to learn from its own errors. In a perfect situation, the only hard-wired element would be the reward function. • For example: In usual circumstances we would require an autonomous vehicle to put safety first, minimize ride time, reduce pollution, offer passengers comfort and obey the law. With an autonomous race car, on the other hand, we would emphasize speed much more than the driver’s comfort. The programmer cannot predict everything that could happen on the road. Instead of building lengthy “if-then” instructions, the programmer prepares the reinforcement learning agent to be capable of learning from the system of rewards and penalties. The agent (another name for the reinforcement learning algorithm performing the task) gets rewards for reaching specific goals. • Another example: deepsense.ai took part in the “Learning to run” project, which aimed to train a virtual runner from scratch. The runner is an advanced and precise musculoskeletal model designed by the Stanford Neuromuscular Biomechanics Laboratory. 
Teaching the agent to run is a first step in building a new generation of prosthetic legs, ones that automatically recognize people’s walking patterns and tweak themselves to make moving easier and more effective. While it is possible and has been done in Stanford’s labs, hard-wiring all the commands and predicting all possible patterns of walking requires a lot of work from highly skilled programmers. Challenges with reinforcement learning: The main challenge in reinforcement learning lies in preparing the simulation environment, which is highly dependent on the task to be performed. When the model has to go superhuman in Chess, Go or Atari games, preparing the simulation environment is relatively simple. When it comes to building a model capable of driving an autonomous car, building a realistic simulator is crucial before letting the car ride on the street. The model has to figure out how to brake or avoid a collision in a safe environment, where sacrificing even a thousand cars comes at a minimal cost. Transferring the model out of the training environment and into the real world is where things get tricky. Scaling and tweaking the neural network controlling the agent is another challenge. There is no way to communicate with the network other than through the system of rewards and penalties. This in particular may lead to catastrophic forgetting, where acquiring new knowledge causes some of the old to be erased from the network.
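The reward-and-penalty loop described above can be illustrated with a toy example. The sketch below uses tabular Q-learning, which is one common reinforcement learning technique chosen here for illustration; the report does not name a specific algorithm. The corridor environment, the reward values, and every name in the code (`step`, `train`, `greedy_policy`) are assumptions made up for this example.

```python
import random

N_STATES = 5          # corridor positions 0..4; position 4 is the goal
ACTIONS = (-1, +1)    # index 0 = step left, index 1 = step right


def step(state, action):
    """Hypothetical environment: the only hard-wired part is the reward."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    reward = 1.0 if done else -0.01   # reward at the goal, small penalty otherwise
    return nxt, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by trial and error (tabular Q-learning)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)                       # explore at random
            else:
                a = 0 if q[state][0] > q[state][1] else 1  # exploit current estimate
            nxt, reward, done = step(state, ACTIONS[a])
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q


def greedy_policy(q):
    """Best known action for each non-goal position."""
    return ["left" if row[0] > row[1] else "right" for row in q[:-1]]


print(greedy_policy(train()))
```

The agent is never told that moving right is correct. It only receives the reward signal, and the preference for moving right emerges from repeated trials, which is the point the text makes about the reward function being the only hard-wired element.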
  • 8. Yet another challenge is reaching a local optimum; that is, the agent performs the task, but not in the optimal or required way. A “jumper” jumping like a kangaroo instead of doing the thing that was expected of it, walking, is a great example, and is also one that can be found in our recent blog post. Finally, there are agents that will optimize the reward without performing the task they were designed for. What distinguishes reinforcement learning from deep learning and machine learning: In fact, there should be no clear divide between machine learning, deep learning and reinforcement learning. It is like a parallelogram-rectangle-square relation, where machine learning is the broadest category and deep reinforcement learning the narrowest one. In the same way, reinforcement learning is a specialized application of machine and deep learning techniques, designed to solve problems in a particular way. Although the ideas seem to differ, there is no sharp divide between these subtypes. Moreover, they merge within projects, as the models are designed not to stick to a “pure type” but to perform the task in the most effective way possible. So “what precisely distinguishes machine learning, deep learning and reinforcement learning” is actually a tricky question to answer. • Machine learning is a form of AI in which computers are given the ability to progressively improve the performance of a specific task with data, without being directly programmed (Arthur Lee Samuel’s definition). He coined the term “machine learning”, of which there are two types, supervised and unsupervised machine learning. • Supervised machine learning happens when a programmer can provide a label for every training input into the machine learning system. • Example: By analyzing the historical data taken from coal mines, deepsense.ai prepared an automated system for predicting dangerous seismic events up to 8 hours before they occur. 
The records of seismic events were taken from 24 coal mines that had collected data for several months. The model was able to recognize the likelihood of an explosion by analyzing the readings from the previous 24 hours. Unsupervised learning takes place when the model is provided only with the input data, but no explicit labels. It has to dig through the data and find the hidden structure or relationships within. The designer might not know what the structure is or what the machine learning model is going to find. • An example we employed was for churn prediction. We analyzed customer data and designed an algorithm to group similar customers. However, we didn’t choose the groups ourselves. Later on, we could identify high-risk groups (those with a high churn rate) and our client knew which customers they should approach first. • Another example of unsupervised learning is anomaly detection, where the algorithm has to spot the element that doesn’t fit in with the group. It may be a flawed product, a potentially fraudulent transaction or any other event associated with breaking the norm.
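As a rough illustration of the anomaly detection idea above, the sketch below flags values that sit far from the rest of a group using a simple z-score rule. This is a basic statistical stand-in picked for brevity, not the method used in the projects mentioned; the function name `find_anomalies`, the threshold of 2.0, and the sample data are all assumptions for the example.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=2.0):
    """Return the values lying more than `threshold` sample standard
    deviations from the mean. No labels are supplied: the rule only
    looks at how each value compares with the group (hypothetical helper)."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []          # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical transaction amounts; one of them breaks the norm.
amounts = [10, 11, 9, 10, 12, 10, 11, 50]
print(find_anomalies(amounts))
```

No one tells the function which transaction is suspicious. It only compares each value with the group, which is what makes the approach unsupervised.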
  • 9. Deep learning consists of several layers of neural networks, designed to perform more sophisticated tasks. The construction of deep learning models was inspired by the design of the human brain, but simplified. Deep learning models consist of a few neural network layers which are in principle responsible for gradually learning more abstract features about particular data. Although deep learning solutions are able to provide marvelous results, in terms of scale they are no match for the human brain. Each layer uses the outcome of the previous one as an input, and the whole network is trained as a single whole. The core concept of creating an artificial neural network is not new, but only recently has modern hardware provided enough computational power to effectively train such networks by exposing them to a sufficient number of examples. Extended adoption has brought about frameworks like TensorFlow, Keras and PyTorch, all of which have made building machine learning models much more convenient. • Example: deepsense.ai designed a deep learning-based model for the National Oceanic and Atmospheric Administration (NOAA). It was designed to recognize right whales from aerial photos taken by researchers. From a technical point of view, recognizing a particular specimen of whale from aerial photos is pure deep learning. Reinforcement learning, as stated above, employs a system of rewards and penalties to compel the computer to solve a problem by itself. Human involvement is limited to changing the environment and tweaking the system of rewards and penalties. As the computer maximizes the reward, it is prone to seeking unexpected ways of doing it. Human involvement is focused on preventing it from exploiting the system and motivating the machine to perform the task in the way expected. 
Reinforcement learning is useful when there is no “proper way” to perform a task, yet there are rules the model has to follow to perform its duties correctly. • Example: By tweaking and seeking the optimal policy for deep reinforcement learning, we built an agent that in just 20 minutes reached a superhuman level in playing Atari games. Similar algorithms in principle can be used to build AI for an autonomous car or a prosthetic leg. In fact, one of the best ways to evaluate the reinforcement learning approach is to give the model an Atari video game to play, such as Arkanoid or Space Invaders. According to Google Brain’s Marc G. Bellemare, who introduced Atari video games as a reinforcement learning benchmark, “although challenging, these environments remain simple enough that we can hope to achieve measurable progress as we attempt to solve them”. The key distinguishing factor of reinforcement learning is how the agent is trained. Instead of inspecting the data provided, the model interacts with the environment, seeking ways to maximize the reward. In the case of deep reinforcement learning, a neural network is in charge of storing the experiences and thus improves the way the task is performed. The Difference between Positive/Negative Reinforcement and Positive/Negative Punishment In Applied Behaviour Analysis, there are two types of reinforcement and punishment: positive and negative. It can be difficult to distinguish between the four of these. Therefore, the purpose
  • 10. of this blog is to explain the differences in order to help parents and professionals develop appropriate interventions to improve behaviour. Reinforcement • Reinforcement is used to help increase the probability that a specific behaviour will occur in the future by delivering or removing a stimulus immediately after a behaviour. • Another way to put it is that reinforcement, if done correctly, results in a behaviour occurring more frequently in the future. Positive Reinforcement Positive reinforcement works by presenting a motivating/reinforcing stimulus to the person after the desired behaviour is exhibited, making the behaviour more likely to happen in the future. The following are some examples of positive reinforcement: • A mother gives her son praise (reinforcing stimulus) for doing homework (behaviour). • The little boy receives $5.00 (reinforcing stimulus) for every A he earns on his report card (behaviour).
  • 11. • A father gives his daughter candy (reinforcing stimulus) for cleaning up toys (behaviour) Negative Reinforcement Negative reinforcement occurs when a certain stimulus (usually an aversive stimulus) is removed after a particular behaviour is exhibited. The likelihood of the particular behaviour occurring again in the future is increased because of removing/avoiding the negative consequence. Negative reinforcement should not be thought of as a punishment procedure. With negative reinforcement, you are increasing a behaviour, whereas with punishment, you are decreasing a behaviour. The following are some examples of negative reinforcement: • Bob does the dishes (behaviour) in order to stop his mother’s nagging (aversive stimulus). • Natalie can get up from the dinner table (aversive stimulus) when she eats 2 bites of her broccoli (behaviour). • Joe presses a button (behaviour) that turns off a loud alarm (aversive stimulus) When thinking about reinforcement, always remember that the end result is to try to increase the behaviour, whereas punishment procedures are used to decrease behaviour. For positive reinforcement, think of it as adding something positive in order to increase a response. For negative reinforcement, think of it as taking something negative away in order to increase a response. Punishment • When people hear that punishment procedures are being used, they typically think of an aversive or harmful consequence. This is not always the case as you can see below.
  • 12. • Punishment is a process by which a consequence immediately follows a behaviour and decreases the future frequency of that behaviour. Like reinforcement, a stimulus can be added (positive punishment) or removed (negative punishment). • There are two types of punishment: positive and negative, and it can be difficult to tell the difference between the two. Below are some examples to help clear up the confusion. What is Positive Punishment? Positive punishment works by presenting an aversive consequence after an undesired behaviour is exhibited, making the behaviour less likely to happen in the future. The following are some examples of positive punishment: • A child picks his nose during class (behaviour) and the teacher reprimands him (aversive stimulus) in front of his classmates. • A child touches a hot stove (behaviour) and feels pain (aversive stimulus). • A person eats spoiled food (behaviour) and gets a bad taste in his/her mouth (aversive stimulus). What is Negative Punishment? Negative punishment happens when a certain reinforcing stimulus is removed after a particular undesired behaviour is exhibited, resulting in the behaviour happening less often in the future. The following are some examples of negative punishment: • A child kicks a peer (behaviour) and is removed from his/her favourite activity (reinforcing stimulus removed). • A child yells out in class (behaviour) and loses a token for good behaviour on his/her token board (reinforcing stimulus removed) that could later have been cashed in for a prize. • A child fights with her brother (behaviour) and has her favourite toy taken away (reinforcing stimulus removed).
  • 13. With punishment, always remember that the end result is to try to decrease the undesired behaviour. Positive punishment involves adding an aversive consequence after an undesired behaviour is emitted to decrease future responses. Negative punishment involves taking away a certain reinforcing item after the undesired behaviour happens in order to decrease future responses. It should be noted that research shows that positive consequences are more powerful than negative consequences for improving behaviour. Therefore, it is always suggested that these interventions be tried prior to negative consequences. Positive and Negative Reinforcement and Punishment: Positive reinforcement: something is added to increase the likelihood of a behavior. Positive punishment: something is added to decrease the likelihood of a behavior.
  • 14. Negative reinforcement: something is removed to increase the likelihood of a behavior. Negative punishment: something is removed to decrease the likelihood of a behavior. Negative Reinforcement VS Punishment: Reinforcement follows a behavior and results in an increase in future responding; punishment follows a behavior and results in a decrease in future responding. Reinforcement strengthens, maintains and increases behavior; punishment decreases and reduces behavior. Negative refers to the removal of a stimulus. Negative reinforcement occurs when a stimulus is removed following a behavior and the behavior increases as a result. Whether a stimulus is added or removed following the behavior, if the result is a decrease in the behavior, punishment has occurred. In short, negative reinforcement means something is removed following a behavior, making it more likely that the behavior will occur in the future; punishment means something is added or removed following a behavior, making it less likely that the behavior will occur again.
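The comparison above reduces to two questions: was a stimulus added or removed, and did the behavior increase or decrease afterwards? A minimal Python sketch of that decision rule follows. The function name `classify_consequence` and the string labels are assumptions introduced for illustration only.

```python
def classify_consequence(stimulus_change, behaviour_change):
    """Name the operant conditioning procedure from two observations.

    stimulus_change:  "added" or "removed"       (what happened after the behaviour)
    behaviour_change: "increases" or "decreases" (what the behaviour does in future)
    """
    if stimulus_change not in ("added", "removed"):
        raise ValueError("stimulus_change must be 'added' or 'removed'")
    if behaviour_change not in ("increases", "decreases"):
        raise ValueError("behaviour_change must be 'increases' or 'decreases'")

    # "Positive"/"negative" describe the stimulus; the outcome picks the procedure.
    sign = "positive" if stimulus_change == "added" else "negative"
    kind = "reinforcement" if behaviour_change == "increases" else "punishment"
    return f"{sign} {kind}"


# Joe presses a button and the loud alarm stops; he presses it more often.
print(classify_consequence("removed", "increases"))
# A child fights with her brother and loses her favourite toy; fights become rarer.
print(classify_consequence("removed", "decreases"))
```

Both calls involve removing a stimulus. Only the effect on future behavior separates negative reinforcement from punishment by removal.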