Practical tutorial: part 1
CHAIN Hokudai Winter School 2021
8 & 9 January 2021
Wataru Toyokawa
Step-by-step walk-through of the model
A two-armed risky bandit task
March, J. G. (1996). Learning to Be Risk Averse. Psychological Review, 103(2), 309-319.
Denrell, J. (2007). Adaptive learning and risk taking. Psychological Review, 114(1), 177.
Hertwig, R., Barron, G., Weber, E. U., & Erev, I. (2004). Decisions from experience and the effect of rare events in risky choice. Psychological Science, 15(8), 534-539.
[Figure, panels a-d. Labels: risky but higher payoff option; safe but lower payoff option; trials (decision horizon).]
Consider a decision-maker facing a repeated choice between a safe (i.e. certain)
alternative 𝑠 and a risky (i.e. uncertain) alternative 𝑟 over 𝑇 trials.
The decision-maker’s goal is to maximise the total payoff obtained over the trials.
Because the time horizon is limited, there is a trade-off between exploration and
exploitation.
Although this task setting might seem artificial, it captures the basic principle
underlying the exploration-exploitation dilemma and decision-making from experience,
which applies to many real-life situations, ranging from choosing better
restaurants, investing in profitable stocks, and finding better mates, to developing new
technologies and innovations.
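Reinforcement learning model
(i.e. the baseline asocial learning model)
[Diagram: starting at t = 1, Q-values are turned into choice probabilities, a choice is made and a payoff is obtained; the Q-values are then updated at t = 2, t = 3 and t = 4. Payoff values shown: 125, 0, 10, 10.]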
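Rescorla-Wagner Rule for Value Updating
$$Q_{t+1} = (1 - \alpha) \times Q_t + \alpha \times \pi_t$$
where π_t is the payoff obtained at trial t.
Example: Q-value at t = 2 = (1 - α) × Q-value at t = 1 + α × 125 (the payoff at t = 1)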
A decision-maker updates the value of choosing each of the two alternatives at time
t, following the Rescorla-Wagner rule.
α is the learning rate (i.e. step-size parameter), which controls the step size of belief
updating. The larger α is, the more weight is given to recent experience (i.e. myopic
learning). The Q-value of the unchosen option is unchanged.
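As a minimal sketch of the update described above, the Python function below applies one Rescorla-Wagner step to the chosen option and leaves the other option untouched. The function name, the initial Q-values of 0 and α = 0.5 are illustrative assumptions, not values given in the tutorial; the payoff of 125 is the one shown on the slide.

```python
def rescorla_wagner_update(q_values, chosen, payoff, alpha):
    """Return a new list of Q-values after one Rescorla-Wagner step.

    Only the chosen option is updated; the unchosen option keeps its value.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    updated = list(q_values)
    # New value = (1 - alpha) * old value + alpha * payoff
    updated[chosen] = (1.0 - alpha) * q_values[chosen] + alpha * payoff
    return updated


# Assumed setup: index 0 = safe option, index 1 = risky option,
# both Q-values start at 0 (an assumption made for illustration).
q = [0.0, 0.0]
q = rescorla_wagner_update(q, chosen=1, payoff=125.0, alpha=0.5)
print(q)  # [0.0, 62.5]
```

With α = 1 the new Q-value equals the latest payoff, and with α = 0 it never changes, which is one way to read the "weight given to recent experience".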
The ‘Softmax’ Choice Rule
Q (decision values) → ‘softmax’ transformation → choice probability
$$P_i = \frac{e^{\beta Q_i}}{\sum_k e^{\beta Q_k}}$$
The Q-values are then translated into choice probabilities through a softmax (or multinomial-logistic)
function. β is the inverse temperature, regulating how sensitive the choice probability is
to the Q-values. As β decreases towards 0, the choice probability approaches
random choice (i.e. highly explorative). Conversely, a large β makes choices almost
deterministic in favour of the option with the highest Q-value (i.e. highly exploitative).
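The softmax rule can be sketched in a few lines of Python. The function name and the example Q-values are assumptions made for illustration. Subtracting the largest scaled value before exponentiating is an implementation detail added here for numerical stability; it does not change the resulting probabilities.

```python
import math


def softmax_choice_probs(q_values, beta):
    """Turn Q-values into choice probabilities with inverse temperature beta."""
    scaled = [beta * q for q in q_values]
    top = max(scaled)  # shift so the largest exponent is 0 (avoids overflow)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


q = [10.0, 62.5]  # hypothetical Q-values for the safe and risky options

# beta = 0: Q-values are ignored, so choice is uniformly random.
print(softmax_choice_probs(q, beta=0.0))  # [0.5, 0.5]

# Larger beta: probability mass concentrates on the higher Q-value.
print(softmax_choice_probs(q, beta=0.1))
print(softmax_choice_probs(q, beta=5.0))
```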
A collective learning situation
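[Figure: example task screen. A payoff history for the Safe and Risky options over time (values 10, 5, 10, 10, with “?” for payoffs not observed), and a choice screen for the Safe and Risky options reading “Round: 2/70. Make a next choice!”, with the social cues “4 people chose this” and “2 people chose this”.]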
Now consider a collective learning situation in which multiple individuals play the
task simultaneously and obtain social information during play.
A frequency-based social cue shows how many people chose each option in
the preceding round. The others’ payoff information is kept private.
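The frequency-based cue amounts to counting choices from the preceding round. A small sketch follows, assuming choices are coded as integers (0 = safe, 1 = risky) and that the list holds only the other group members' choices; the function name and the coding scheme are hypothetical.

```python
from collections import Counter


def social_frequencies(previous_choices, n_options=2):
    """Count how many individuals chose each option in the preceding round."""
    counts = Counter(previous_choices)
    # Options nobody chose get a count of 0.
    return [counts.get(option, 0) for option in range(n_options)]


# Six other players: four chose option 0, two chose option 1.
print(social_frequencies([0, 0, 0, 0, 1, 1]))  # [4, 2]
```

Only the choice counts are passed on, which matches the statement that payoffs stay private.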
Social learning model
Toyokawa et al. 2017; 2019; Aplin et al. 2017; McElreath et al. 2005; 2008; Deffner et al. 2020
Choice_probability = (1 - σ) × Asocial_choice + σ × Social_influence
$$P_i = (1 - \sigma) \frac{\exp(\beta Q_i)}{\sum_k \exp(\beta Q_k)} + \sigma \frac{F_i^{\theta}}{\sum_k F_k^{\theta}}$$
σ: reliance on social information (1 - σ: reliance on private payoff-based learning)
θ: conformity exponent
F_i: number of other individuals who chose option i (F_1 for option 1, F_2 for option 2)
Asocial_choice: softmax choice based on reward-based reinforcement learning
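The weighted combination of asocial and social choice can be sketched as below. All function and variable names are illustrative assumptions. This sketch also assumes θ ≥ 0 and falls back to a uniform social term when no choices were observed; neither detail is specified on the slide.

```python
import math


def asocial_probs(q_values, beta):
    """Softmax over Q-values with inverse temperature beta."""
    scaled = [beta * q for q in q_values]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def social_probs(frequencies, theta):
    """Frequency-dependent copying: F_i**theta / sum_k F_k**theta."""
    if theta < 0:
        raise ValueError("this sketch assumes theta >= 0")
    weights = [float(f) ** theta for f in frequencies]
    total = sum(weights)
    if total == 0.0:
        # Nobody observed: treat all options as equally likely (assumption).
        return [1.0 / len(frequencies)] * len(frequencies)
    return [w / total for w in weights]


def choice_probs(q_values, frequencies, beta, sigma, theta):
    """Mix asocial and social choice with social-learning weight sigma."""
    if not 0.0 <= sigma <= 1.0:
        raise ValueError("sigma must lie in [0, 1]")
    asocial = asocial_probs(q_values, beta)
    social = social_probs(frequencies, theta)
    return [(1.0 - sigma) * a + sigma * s for a, s in zip(asocial, social)]


# Equal Q-values, four others chose option 1 and two chose option 2.
# Asocial term: [0.5, 0.5]; social term with theta = 1: [2/3, 1/3].
print(choice_probs([10.0, 10.0], [4, 2], beta=1.0, sigma=0.5, theta=1.0))
# approximately [0.583, 0.417]
```

Setting σ = 0 recovers the purely asocial softmax model, while θ > 1 gives the majority option more than its proportional share (for example, θ = 2 turns the 4-to-2 split into a social term of 0.8 versus 0.2).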

CHAIN WINTER SCHOOL 2021 - modelling tutorial 1
