Reward

Speaker: Jimmy Lu
Topics: Reward
Date: 2010.09.17

WECO Lab, CSIE, FJU

Published in: Technology

Transcript

  • 1. Reward
    Speaker: Jimmy Lu
    Advisor: Hsing Mei
    Web Computing Laboratory (WECO Lab)
    Computer Science and Information Engineering Department
    Fu Jen Catholic University
  • 2. Conditioning
  • 3. Classical Conditioning
    Also called Pavlovian or respondent conditioning.
    A form of associative learning.
    The typical procedure for inducing classical conditioning pairs a neutral stimulus with a stimulus of some significance.
    Key terms: conditioned stimulus (CS), conditioned response (CR), unconditioned stimulus (US), unconditioned response (UR).
  • 4. Typical Procedure
  • 5. Operant Conditioning
    Also called instrumental conditioning.
    Operant conditioning deals with the modification of voluntary behavior, or operant behavior.
    Operant behavior "operates" on the environment and is maintained by its consequences, while classical conditioning deals with the conditioning of reflexive (reflex) behaviors, which are elicited by antecedent conditions. Behaviors conditioned via a classical conditioning procedure are not maintained by consequences.
  • 6. Core Tools
    Reinforcement is a consequence that causes a behavior to occur with greater frequency.
    Punishment is a consequence that causes a behavior to occur with less frequency.
    Extinction is the lack of any consequence following a behavior. When a behavior is inconsequential, producing neither favorable nor unfavorable consequences, it will occur with less frequency. When a previously reinforced behavior is no longer reinforced with either positive or negative reinforcement, it leads to a decline in the response.
  • 7. Four Contexts
  • 8. Multiple Reward Signals in the Brain
  • 9. Abstract
    This article focuses on recent neurophysiological studies in primates that have revealed that neurons in a limited number of brain structures carry specific signals about past and future rewards. This research provides the first step towards an understanding of how rewards influence behaviour before they are received and how the brain might use reward information to control learning and goal-directed behaviour.
  • 10. Reward Processing and the Brain
  • 11. Reward Detection and Perception
    In various behavioural situations, including classical and instrumental conditioning, most dopamine neurons show short, phasic activation in a rather homogeneous fashion after the presentation of liquid and solid rewards, and visual or auditory stimuli that predict reward.
    By contrast, only a few dopamine neurons show phasic activations to punishers.
  • 12. Reward Prediction Errors
    A closer examination of the properties of the phasic dopamine response suggests that it might encode a reward prediction error rather than reward per se.
    In view of the crucial role that prediction errors are thought to play during learning, a phasic dopamine response that reports a reward prediction error might constitute an ideal teaching signal for approach learning.
    Error-driven learning mechanisms.
  • 13. Reward Prediction Errors
  • 14. Experiments
  • 15. Experiments
  • 16. Experiments
  • 17. Experiments
  • 18. Conclusions
    A limited number of brain structures process reward information in several different ways.
    Neurons detect reward prediction errors and produce a global reinforcement signal that might underlie the learning of appropriate behaviours.
    Other neurons detect and discriminate between different rewards and might be involved in assessing the nature and identity of individual rewards, and might thus underlie the perception of rewards.
  • 19. Conclusions
    Neurons respond to learned stimuli that predict rewards and show sustained activities during periods in which expectations of rewards are evoked.
    They even estimate future rewards and adapt their activity according to ongoing experience.
  • 20. Reference
    [1] Classical conditioning, Wikipedia, http://en.wikipedia.org/wiki/Classical_conditioning
    [2] Operant conditioning, Wikipedia, http://en.wikipedia.org/wiki/Operant_conditioning
    [3] Wolfram Schultz, “Multiple reward signals in the brain”, Nature Reviews Neuroscience 1, 199-207 (December 2000)
    [4] Wolfram Schultz, Peter Dayan, P. Read Montague, “A Neural Substrate of Prediction and Reward”, Science 275, 1593-1599 (March 1997)
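
The reward prediction error on slide 12 can be illustrated with a minimal sketch. It is not taken from the slides: it assumes a simple delta-rule learner (in the spirit of the "error-driven learning mechanisms" the slide mentions), and the function name, the learning rate of 0.2, and the trial counts are all hypothetical choices made for illustration. The error is the received reward minus the predicted reward, and the prediction moves a fraction of the way towards each outcome.

```python
def simulate_prediction_errors(rewards, learning_rate=0.2):
    """Run a delta-rule learner over a sequence of per-trial rewards.

    Returns (errors, predictions), where predictions[i] is the reward
    expected before trial i and errors[i] = rewards[i] - predictions[i].
    The learning rate is an illustrative assumption, not a measured value.
    """
    prediction = 0.0  # no reward is expected before any training
    errors, predictions = [], []
    for reward in rewards:
        error = reward - prediction           # reward prediction error
        predictions.append(prediction)
        errors.append(error)
        prediction += learning_rate * error   # error-driven update
    return errors, predictions


# Hypothetical schedule: 30 rewarded trials, then one trial with the reward omitted.
rewards = [1.0] * 30 + [0.0]
errors, predictions = simulate_prediction_errors(rewards)

print(f"first rewarded trial error: {errors[0]:+.3f}")
print(f"last rewarded trial error:  {errors[29]:+.3f}")
print(f"omitted reward error:       {errors[30]:+.3f}")
```

Under these assumptions the error is large and positive for an unexpected reward, shrinks towards zero once the reward is fully predicted, and turns negative when a predicted reward is omitted. This is the qualitative pattern that the slides attribute to the phasic dopamine response; the sketch makes no claim about the actual neural mechanism.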
