Parameter Tuning Method for
Multi-agent Simulation using
Reinforcement Learning
Masanori HIRANO, Kiyoshi IZUMI
School of Engineering, The University of Tokyo
research@mhirano.jp
https://mhirano.jp/
©M.HIRANO & Izumi Lab.
What’s Multi-agent Simulation?
• MAS: Multi-agent Simulation
• Simulates social phenomena by aggregating the behavior of individual agents
• Reveals the emergent phenomena of complex systems caused by agents' complex interactions
• Useful as a tool to understand social phenomena
• Artificial Market Simulation
• MAS for financial markets
• Agents' interactions are necessary to replicate "stylized facts" (well-known phenomena in financial markets) [Lux+, 1999]
• Many models are available
Lux, T., & Marchesi, M. (1999). Scaling and criticality in a stochastic multi-agent model of a financial market. Nature,
397(6719), 498–500. https://doi.org/10.1038/17290
10/30/22
BESC2022
MAS
• Usually,
1. A human builds a simulation model
2. The model parameters are tuned to replicate the actual phenomena
3. Evaluation & Analysis
• Parameter tuning for MAS is difficult because:
• MAS usually has many parameters
• The response to parameter changes is not always continuous
• MAS is used mainly for complex systems, which exhibit chaotic phenomena
[Figure: workflow of Modeling → Parameter Tuning → Evaluation & Analysis]
Approach: Deep Reinforcement Learning
• Social simulation and Bayesian optimization seem incompatible.
• Bayesian optimization (Optuna): continuous estimation
• Social simulation using MAS: each trial shows different movements, and chaotic phenomena or phase transitions occur frequently
• Recently developed deep reinforcement learning is a possible solution.
• Deep reinforcement learning can handle high-dimensional parameter spaces.
• → We try to use reinforcement learning for MAS parameter tuning.
Model Outline
DDPG (Deep Deterministic Policy Gradients)
• Actor-critic based
• Continuous action space
• https://arxiv.org/abs/1509.02971
[Figure: DDPG network architecture. Actor: State → NN → Action. Critic: State and Action → Concatenate → NN]
Architectures used in DDPG
• General architectures widely used in reinforcement learning
• Replay buffer
• Not prioritized
• Soft-target
• Exploration noise
• Ornstein–Uhlenbeck process
• In practice, Gaussian noise is sufficient, but we employed OU noise following the original paper
$dr_t = -\theta (r_t - \mu)\,dt + \sigma\,dW_t$
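The Ornstein–Uhlenbeck exploration noise can be simulated with a simple Euler-Maruyama discretization. The following is a minimal sketch only: the function name `ou_noise` and the default values of theta, mu, sigma, and dt are illustrative assumptions, not the settings used in the experiments.

```python
import math
import random


def ou_noise(n_steps, theta=0.15, mu=0.0, sigma=0.2, dt=1.0, r0=0.0, seed=None):
    """Sample an Ornstein-Uhlenbeck path with the Euler-Maruyama scheme.

    Discretizes dr_t = -theta * (r_t - mu) * dt + sigma * dW_t, where the
    Wiener increment dW_t is drawn as sqrt(dt) * N(0, 1).
    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    r = r0
    path = []
    for _ in range(n_steps):
        dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        r = r - theta * (r - mu) * dt + sigma * dw
        path.append(r)
    return path


if __name__ == "__main__":
    # Consecutive samples are correlated, unlike independent Gaussian noise.
    print([round(x, 3) for x in ou_noise(5, seed=0)])
```

With sigma set to 0 the path simply decays toward mu, which is a quick way to check the drift term in isolation.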
Simulation surrogate by the critic
[Figure: the critic works as a surrogate for the simulation]
• Actor-critic-based RL is important because the critic can serve as a surrogate for the simulation
Proposed method to realize our idea
• DDPG4MASPT (DDPG for MAS Parameter Tuning)
1. Customized DDPG for our task
2. Action Converter (AC)
3. Redundant Full Neural Network Actor (FNNA)
4. Seed Fixer (SF)
• → According to the results, all of them are required to
realize our idea!
Customized DDPG
• For the $i$-th iteration, the parameter set for the trial is:
$P_i = A()$
• Then, the surrogate error is minimized through the critic:
$\min_{C} \mathrm{MSE}(o_i, C(P_i))$
• According to the surrogate, the actor is updated as:
$\max_{A} C(A())$
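The loop described by these three steps can be illustrated with a deliberately small toy. This is a sketch under strong assumptions, not the authors' implementation: the actor is reduced to a single scalar `theta` (since the actor takes no input), the neural-network critic is swapped for a quadratic model fitted by least squares, and `simulate` is a hypothetical stand-in for a simulation run that returns a reward peaking at P = 2.

```python
import random


def simulate(p):
    """Hypothetical stand-in for one simulation run; returns the reward o_i."""
    return -(p - 2.0) ** 2


def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= f * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = sum(a[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (a[r][3] - s) / a[r][r]
    return x


def fit_critic(buffer):
    """Least-squares fit of C(P) = c0 + c1*P + c2*P^2, i.e. the MSE minimizer."""
    feats = [[1.0, p, p * p] for p, _ in buffer]
    m = [[sum(f[i] * f[j] for f in feats) for j in range(3)] for i in range(3)]
    v = [sum(f[i] * o for f, (_, o) in zip(feats, buffer)) for i in range(3)]
    return solve3(m, v)


def tune(n_iter=200, lr=0.1, noise_std=0.3, seed=0):
    rng = random.Random(seed)
    theta = 0.0   # the actor A() has no input, so here it is a single value
    buffer = []   # replay buffer of (P_i, o_i) pairs
    for _ in range(n_iter):
        p = theta + rng.gauss(0.0, noise_std)   # P_i = A() plus exploration noise
        buffer.append((p, simulate(p)))
        if len(buffer) < 3:
            continue                            # a quadratic needs three points
        c0, c1, c2 = fit_critic(buffer)         # min over C of MSE(o_i, C(P_i))
        theta += lr * (c1 + 2.0 * c2 * theta)   # gradient ascent on C(A())
    return theta


if __name__ == "__main__":
    print(tune())  # approaches the reward peak of the toy simulator
```

The point of the toy is the division of labor: the simulator is only queried to fill the buffer, while the actor is improved purely through gradients of the surrogate.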
Action Converter (AC)
• Converts the action space
• DDPG usually supports continuous action spaces
• For a non-negativity restriction, a mapping $(-\infty, \infty) \to (0, \infty)$ seems good for learning.
• The action probability squashing of Soft Actor-Critic (SAC) is similar to, but different from, this mapping.
• In our experiments, we employed the mapping $f(x) = \ln(1 + \exp(x))$
• (Other mappings could be used but were not tested)
• Because $\frac{df(x)}{dx} = \frac{e^x}{1 + e^x} = 1 - \frac{1}{1 + e^x}$, the effect of this mapping is not significant in the region $x \gg 1$
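The mapping $f(x) = \ln(1 + \exp(x))$ and its derivative are easy to check numerically. The sketch below uses a numerically stable rewriting of the same function; the function names `softplus` and `softplus_grad` are illustrative choices, not names from the slides.

```python
import math


def softplus(x):
    """f(x) = ln(1 + exp(x)), written in a form that does not overflow."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def softplus_grad(x):
    """df/dx = exp(x) / (1 + exp(x)) = 1 - 1 / (1 + exp(x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


if __name__ == "__main__":
    # The output is always positive; for large x it is close to x itself
    # and the gradient is close to 1, so the mapping barely alters the action.
    for x in (-5.0, 0.0, 5.0, 20.0):
        print(x, softplus(x), softplus_grad(x))
```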
Redundant Full Neural Network Actor
• Use redundant neural networks for Actor
[Figure: two actor structures that output the parameters P: "Minimum requirements" and "Redundant Full Neural Network"]
Seed Fixer
• In social simulations, the effect of the random seed is usually much larger than the effect of the parameters.
• When the variance among simulation trials is large, it is hard to get precise gradients.
• Fix the seed and test
[Figure: blueprint comparing runs with different seeds and runs with slightly different parameters (when no phase transition is included)]
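Why fixing the seed helps can be seen in a toy example. The sketch below is an assumption-laden illustration, not the artificial market model: `simulate` is a hypothetical random walk whose drift is the tunable parameter, so the true sensitivity of the output to the parameter equals the number of steps.

```python
import random


def simulate(param, seed, n_steps=1000):
    """Hypothetical toy simulation: a random walk whose drift is `param`."""
    rng = random.Random(seed)
    position = 0.0
    for _ in range(n_steps):
        position += param + rng.gauss(0.0, 1.0)
    return position


def finite_difference(param, eps, seed_a, seed_b):
    """Estimate d(output)/d(param) from two runs that use the given seeds."""
    return (simulate(param + eps, seed_b) - simulate(param, seed_a)) / eps


if __name__ == "__main__":
    # Same seed: the noise cancels, so the estimate is close to n_steps.
    print(finite_difference(0.1, 1e-3, seed_a=0, seed_b=0))
    # Different seeds: seed-to-seed variance swamps the parameter effect.
    print(finite_difference(0.1, 1e-3, seed_a=0, seed_b=1))
```

With a shared seed the two runs see identical noise, so their difference isolates the parameter change; with different seeds the same small change is buried under the variance between trials.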
Simulation model to evaluate our idea
• Artificial market simulation
• Stylized Trader Agents [Chiarella et al. 02]
• Agents calculate…
• Logarithmic return prediction for bid/ask price
$r = \frac{1}{w_F + w_C + w_N} \left( w_F \cdot F + w_C \cdot C + w_N \cdot N \right)$
• Fundamentals
$F = \frac{1}{\text{mean reversion time}} \ln \frac{\text{current market price}}{\text{current fundamental price}}$
• Chartist (trend)
$C$ = average logarithmic return in the past
• Noise 𝑁 ~ 𝑁 0, 𝜎:
• Adding a margin then determines the order price
• We tuned $w_F$ and $w_C$.
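The return prediction is a weighted average of the fundamental, chartist, and noise terms, which can be sketched as below. All names (`expected_return`, `mean_reversion_time`, `noise_scale`, and so on) are hypothetical, the price ratio is kept in the orientation shown on the slide, and treating `noise_scale` as the standard deviation of the noise is an assumption.

```python
import math
import random


def expected_return(w_f, w_c, w_n, market_price, fundamental_price,
                    past_log_returns, mean_reversion_time, noise_scale, rng):
    """Weighted average of the fundamental, chartist, and noise terms."""
    # Fundamental term, with the price ratio oriented as on the slide.
    f = math.log(market_price / fundamental_price) / mean_reversion_time
    # Chartist term: average of the past logarithmic returns.
    c = sum(past_log_returns) / len(past_log_returns)
    # Noise term: zero-mean Gaussian (noise_scale used as the std. dev.).
    n = rng.gauss(0.0, noise_scale)
    return (w_f * f + w_c * c + w_n * n) / (w_f + w_c + w_n)


if __name__ == "__main__":
    rng = random.Random(0)
    r = expected_return(w_f=1.0, w_c=1.0, w_n=1.0,
                        market_price=101.0, fundamental_price=100.0,
                        past_log_returns=[0.01, -0.02, 0.005],
                        mean_reversion_time=10.0, noise_scale=0.01, rng=rng)
    print(r)
```

Tuning the weights of the fundamental and chartist terms shifts the balance between mean reversion and trend following, which in turn changes the shape of the simulated return distribution.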
Experiments & Results
• The objective function is the MSE of skewness and kurtosis
• Target: Skewness = 0.0, Kurtosis = 6.0
• Only $w_F$ and $w_C$ were tuned
• Comparative models:
• Optuna (Bayesian estimation)
• Models missing one or two components
• Our proposed model showed the best performance
(but the difference was not statistically significant)
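An objective of this kind can be sketched as follows. The slides do not state which definitions were used, so this example assumes population moments and non-excess (Pearson) kurtosis; the function names `moments` and `objective` are illustrative.

```python
def moments(returns):
    """Return (skewness, kurtosis) from population central moments.

    Assumption: kurtosis is the non-excess (Pearson) value m4 / m2^2,
    for which a normal distribution gives 3.
    """
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((x - mean) ** 2 for x in returns) / n
    m3 = sum((x - mean) ** 3 for x in returns) / n
    m4 = sum((x - mean) ** 4 for x in returns) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def objective(returns, target_skew=0.0, target_kurt=6.0):
    """Mean squared error between the measured and target moments."""
    skew, kurt = moments(returns)
    return ((skew - target_skew) ** 2 + (kurt - target_kurt) ** 2) / 2.0


if __name__ == "__main__":
    sample = [-1.0, 1.0, -1.0, 1.0]   # symmetric, so skewness is 0
    print(moments(sample), objective(sample))
```

A lower value means the simulated return distribution is closer to the target fat-tailed shape.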
Discussion & Future work
• Our proposed method works well, and the three additional components (AC, FNNA, SF) were necessary.
• The critic works well as a surrogate for the simulation
• The task used in this study is very simple: tuning only 2 parameters.
• Bayesian estimation is limited in supporting high-dimensional tuning (it considers a sample as a point)
• Gradient-based methods, such as DDPG, can support high-dimensional tuning (a sample with a gradient can be considered as a surface)
• According to some previous works, DDPG can solve high-dimensional tasks. Therefore, it seems a good fit for high-dimensional parameter tuning tasks. → This should be tested.
• For future work, RL models with high exploration capability should be tested
10/30/22
BESC2022
19

More Related Content

Similar to 2022/10/30 BESC2022: Parameter Tuning Method for Multi-agent Simulation using Reinforcement Learning

2020/11/19 PRIMA2020: Simulation of Unintentional Collusion Caused by Auto Pr...
2020/11/19 PRIMA2020: Simulation of Unintentional Collusion Caused by Auto Pr...2020/11/19 PRIMA2020: Simulation of Unintentional Collusion Caused by Auto Pr...
2020/11/19 PRIMA2020: Simulation of Unintentional Collusion Caused by Auto Pr...Masanori HIRANO
 
deep learning from scratch chapter 4.neural network learing
deep learning from scratch chapter 4.neural network learingdeep learning from scratch chapter 4.neural network learing
deep learning from scratch chapter 4.neural network learingJaey Jeong
 
Probability Collectives
Probability CollectivesProbability Collectives
Probability Collectiveskulk0003
 
Introduction to algorithmic aspect of auction theory
Introduction to algorithmic aspect of auction theoryIntroduction to algorithmic aspect of auction theory
Introduction to algorithmic aspect of auction theoryAbner Chih Yi Huang
 
2022/05/05 CIFEr2022: Concept and Practice of Artificial Market Data Mining P...
2022/05/05 CIFEr2022: Concept and Practice of Artificial Market Data Mining P...2022/05/05 CIFEr2022: Concept and Practice of Artificial Market Data Mining P...
2022/05/05 CIFEr2022: Concept and Practice of Artificial Market Data Mining P...Masanori HIRANO
 
Machine learning and linear regression programming
Machine learning and linear regression programmingMachine learning and linear regression programming
Machine learning and linear regression programmingSoumya Mukherjee
 
Human action recognition with kinect using a joint motion descriptor
Human action recognition with kinect using a joint motion descriptorHuman action recognition with kinect using a joint motion descriptor
Human action recognition with kinect using a joint motion descriptorSoma Boubou
 
PPT - Adaptive Quantitative Trading : An Imitative Deep Reinforcement Learnin...
PPT - Adaptive Quantitative Trading : An Imitative Deep Reinforcement Learnin...PPT - Adaptive Quantitative Trading : An Imitative Deep Reinforcement Learnin...
PPT - Adaptive Quantitative Trading : An Imitative Deep Reinforcement Learnin...Jisang Yoon
 
Uniform and non-uniform pseudo random numbers generators for high dimensional...
Uniform and non-uniform pseudo random numbers generators for high dimensional...Uniform and non-uniform pseudo random numbers generators for high dimensional...
Uniform and non-uniform pseudo random numbers generators for high dimensional...LEBRUN Régis
 
Analysis of algorithms
Analysis of algorithmsAnalysis of algorithms
Analysis of algorithmsiqbalphy1
 
Conjugate Gradient method for Brain Magnetic Resonance Images Segmentation
Conjugate Gradient method for Brain Magnetic Resonance Images SegmentationConjugate Gradient method for Brain Magnetic Resonance Images Segmentation
Conjugate Gradient method for Brain Magnetic Resonance Images SegmentationEL-Hachemi Guerrout
 
Week 2 - ML models and Linear Regression.pptx
Week 2 - ML models and Linear Regression.pptxWeek 2 - ML models and Linear Regression.pptx
Week 2 - ML models and Linear Regression.pptxHafizAliHummad
 
seminar reprtv hdchjbjfkdbf dgusghdfs gsdgjsbk
seminar reprtv hdchjbjfkdbf dgusghdfs gsdgjsbkseminar reprtv hdchjbjfkdbf dgusghdfs gsdgjsbk
seminar reprtv hdchjbjfkdbf dgusghdfs gsdgjsbkRajeshKotian11
 
ADVANCED OPTIMIZATION TECHNIQUES META-HEURISTIC ALGORITHMS FOR ENGINEERING AP...
ADVANCED OPTIMIZATION TECHNIQUES META-HEURISTIC ALGORITHMS FOR ENGINEERING AP...ADVANCED OPTIMIZATION TECHNIQUES META-HEURISTIC ALGORITHMS FOR ENGINEERING AP...
ADVANCED OPTIMIZATION TECHNIQUES META-HEURISTIC ALGORITHMS FOR ENGINEERING AP...Ajay Kumar
 

Similar to 2022/10/30 BESC2022: Parameter Tuning Method for Multi-agent Simulation using Reinforcement Learning (20)

2020/11/19 PRIMA2020: Simulation of Unintentional Collusion Caused by Auto Pr...
2020/11/19 PRIMA2020: Simulation of Unintentional Collusion Caused by Auto Pr...2020/11/19 PRIMA2020: Simulation of Unintentional Collusion Caused by Auto Pr...
2020/11/19 PRIMA2020: Simulation of Unintentional Collusion Caused by Auto Pr...
 
1. intro. to or & lp
1. intro. to or & lp1. intro. to or & lp
1. intro. to or & lp
 
deep learning from scratch chapter 4.neural network learing
deep learning from scratch chapter 4.neural network learingdeep learning from scratch chapter 4.neural network learing
deep learning from scratch chapter 4.neural network learing
 
Probability Collectives
Probability CollectivesProbability Collectives
Probability Collectives
 
Aggregation operator for image reduction
Aggregation operator for image reductionAggregation operator for image reduction
Aggregation operator for image reduction
 
Introduction to algorithmic aspect of auction theory
Introduction to algorithmic aspect of auction theoryIntroduction to algorithmic aspect of auction theory
Introduction to algorithmic aspect of auction theory
 
Handson 2 (6/6)
Handson 2 (6/6)Handson 2 (6/6)
Handson 2 (6/6)
 
2022/05/05 CIFEr2022: Concept and Practice of Artificial Market Data Mining P...
2022/05/05 CIFEr2022: Concept and Practice of Artificial Market Data Mining P...2022/05/05 CIFEr2022: Concept and Practice of Artificial Market Data Mining P...
2022/05/05 CIFEr2022: Concept and Practice of Artificial Market Data Mining P...
 
Optimization Using Evolutionary Computing Techniques
Optimization Using Evolutionary Computing Techniques Optimization Using Evolutionary Computing Techniques
Optimization Using Evolutionary Computing Techniques
 
Hci and psychology
Hci and psychologyHci and psychology
Hci and psychology
 
Machine learning and linear regression programming
Machine learning and linear regression programmingMachine learning and linear regression programming
Machine learning and linear regression programming
 
Human action recognition with kinect using a joint motion descriptor
Human action recognition with kinect using a joint motion descriptorHuman action recognition with kinect using a joint motion descriptor
Human action recognition with kinect using a joint motion descriptor
 
PPT - Adaptive Quantitative Trading : An Imitative Deep Reinforcement Learnin...
PPT - Adaptive Quantitative Trading : An Imitative Deep Reinforcement Learnin...PPT - Adaptive Quantitative Trading : An Imitative Deep Reinforcement Learnin...
PPT - Adaptive Quantitative Trading : An Imitative Deep Reinforcement Learnin...
 
Uniform and non-uniform pseudo random numbers generators for high dimensional...
Uniform and non-uniform pseudo random numbers generators for high dimensional...Uniform and non-uniform pseudo random numbers generators for high dimensional...
Uniform and non-uniform pseudo random numbers generators for high dimensional...
 
Analysis of algorithms
Analysis of algorithmsAnalysis of algorithms
Analysis of algorithms
 
Conjugate Gradient method for Brain Magnetic Resonance Images Segmentation
Conjugate Gradient method for Brain Magnetic Resonance Images SegmentationConjugate Gradient method for Brain Magnetic Resonance Images Segmentation
Conjugate Gradient method for Brain Magnetic Resonance Images Segmentation
 
Week 2 - ML models and Linear Regression.pptx
Week 2 - ML models and Linear Regression.pptxWeek 2 - ML models and Linear Regression.pptx
Week 2 - ML models and Linear Regression.pptx
 
seminar reprtv hdchjbjfkdbf dgusghdfs gsdgjsbk
seminar reprtv hdchjbjfkdbf dgusghdfs gsdgjsbkseminar reprtv hdchjbjfkdbf dgusghdfs gsdgjsbk
seminar reprtv hdchjbjfkdbf dgusghdfs gsdgjsbk
 
ADVANCED OPTIMIZATION TECHNIQUES META-HEURISTIC ALGORITHMS FOR ENGINEERING AP...
ADVANCED OPTIMIZATION TECHNIQUES META-HEURISTIC ALGORITHMS FOR ENGINEERING AP...ADVANCED OPTIMIZATION TECHNIQUES META-HEURISTIC ALGORITHMS FOR ENGINEERING AP...
ADVANCED OPTIMIZATION TECHNIQUES META-HEURISTIC ALGORITHMS FOR ENGINEERING AP...
 
GDRR Opening Workshop - Deep Reinforcement Learning for Asset Based Modeling ...
GDRR Opening Workshop - Deep Reinforcement Learning for Asset Based Modeling ...GDRR Opening Workshop - Deep Reinforcement Learning for Asset Based Modeling ...
GDRR Opening Workshop - Deep Reinforcement Learning for Asset Based Modeling ...
 

More from Masanori HIRANO

2023/03/04 sigfin30: 原資産価格過程不要な敵対的Deep Hedging
2023/03/04 sigfin30: 原資産価格過程不要な敵対的Deep Hedging2023/03/04 sigfin30: 原資産価格過程不要な敵対的Deep Hedging
2023/03/04 sigfin30: 原資産価格過程不要な敵対的Deep HedgingMasanori HIRANO
 
2023/03/04 sigfin30 PR: Special Session on Applied Informatics in Finance and...
2023/03/04 sigfin30 PR: Special Session on Applied Informatics in Finance and...2023/03/04 sigfin30 PR: Special Session on Applied Informatics in Finance and...
2023/03/04 sigfin30 PR: Special Session on Applied Informatics in Finance and...Masanori HIRANO
 
2022/11/17 PRIMA2022: Analysis of Carbon Neutrality Scenarios of Industrial C...
2022/11/17 PRIMA2022: Analysis of Carbon Neutrality Scenarios of Industrial C...2022/11/17 PRIMA2022: Analysis of Carbon Neutrality Scenarios of Industrial C...
2022/11/17 PRIMA2022: Analysis of Carbon Neutrality Scenarios of Industrial C...Masanori HIRANO
 
2022/06/15 JSAI2022: Data-driven Agent Design for Artificial Market Simulation
2022/06/15 JSAI2022: Data-driven Agent Design for Artificial Market Simulation2022/06/15 JSAI2022: Data-driven Agent Design for Artificial Market Simulation
2022/06/15 JSAI2022: Data-driven Agent Design for Artificial Market SimulationMasanori HIRANO
 
2022/03/12 sigfin28: オプションによるオプションのヘッジを可能にする二重 Deep Hedging 機構
2022/03/12 sigfin28: オプションによるオプションのヘッジを可能にする二重 Deep Hedging 機構2022/03/12 sigfin28: オプションによるオプションのヘッジを可能にする二重 Deep Hedging 機構
2022/03/12 sigfin28: オプションによるオプションのヘッジを可能にする二重 Deep Hedging 機構Masanori HIRANO
 
2020/11/19 PRIMA2020: Implementation of Real Data for Financial Market Simula...
2020/11/19 PRIMA2020: Implementation of Real Data for Financial Market Simula...2020/11/19 PRIMA2020: Implementation of Real Data for Financial Market Simula...
2020/11/19 PRIMA2020: Implementation of Real Data for Financial Market Simula...Masanori HIRANO
 
2020/06/08 JSAI2020: STBM: Stochastic Trading Behavior Model for Financial Ma...
2020/06/08 JSAI2020: STBM: Stochastic Trading Behavior Model for Financial Ma...2020/06/08 JSAI2020: STBM: Stochastic Trading Behavior Model for Financial Ma...
2020/06/08 JSAI2020: STBM: Stochastic Trading Behavior Model for Financial Ma...Masanori HIRANO
 
2020/03/18 NLP2020: 金融文書のための別タスク学習による教師なし重要文判定
2020/03/18 NLP2020: 金融文書のための別タスク学習による教師なし重要文判定2020/03/18 NLP2020: 金融文書のための別タスク学習による教師なし重要文判定
2020/03/18 NLP2020: 金融文書のための別タスク学習による教師なし重要文判定Masanori HIRANO
 
2019/10/31 PRIMA2019: Comparison of Behaviors of Actual and Simulated HFT Tra...
2019/10/31 PRIMA2019: Comparison of Behaviors of Actual and Simulated HFT Tra...2019/10/31 PRIMA2019: Comparison of Behaviors of Actual and Simulated HFT Tra...
2019/10/31 PRIMA2019: Comparison of Behaviors of Actual and Simulated HFT Tra...Masanori HIRANO
 
2018/11/17 ICDMW 2018: Selection of Related Stocks using Financial Text Mining
2018/11/17 ICDMW 2018: Selection of Related Stocks using Financial Text Mining2018/11/17 ICDMW 2018: Selection of Related Stocks using Financial Text Mining
2018/11/17 ICDMW 2018: Selection of Related Stocks using Financial Text MiningMasanori HIRANO
 
2018/10/30 PRIMA Workshop 2018: Impact Assessments of the CAR Regulation usin...
2018/10/30 PRIMA Workshop 2018: Impact Assessments of the CAR Regulation usin...2018/10/30 PRIMA Workshop 2018: Impact Assessments of the CAR Regulation usin...
2018/10/30 PRIMA Workshop 2018: Impact Assessments of the CAR Regulation usin...Masanori HIRANO
 
2018/06/06 JSAI2018 Effects Analysis of CAR Regulations on Financial Markets ...
2018/06/06 JSAI2018 Effects Analysis of CAR Regulations on Financial Markets ...2018/06/06 JSAI2018 Effects Analysis of CAR Regulations on Financial Markets ...
2018/06/06 JSAI2018 Effects Analysis of CAR Regulations on Financial Markets ...Masanori HIRANO
 

More from Masanori HIRANO (12)

2023/03/04 sigfin30: 原資産価格過程不要な敵対的Deep Hedging
2023/03/04 sigfin30: 原資産価格過程不要な敵対的Deep Hedging2023/03/04 sigfin30: 原資産価格過程不要な敵対的Deep Hedging
2023/03/04 sigfin30: 原資産価格過程不要な敵対的Deep Hedging
 
2023/03/04 sigfin30 PR: Special Session on Applied Informatics in Finance and...
2023/03/04 sigfin30 PR: Special Session on Applied Informatics in Finance and...2023/03/04 sigfin30 PR: Special Session on Applied Informatics in Finance and...
2023/03/04 sigfin30 PR: Special Session on Applied Informatics in Finance and...
 
2022/11/17 PRIMA2022: Analysis of Carbon Neutrality Scenarios of Industrial C...
2022/11/17 PRIMA2022: Analysis of Carbon Neutrality Scenarios of Industrial C...2022/11/17 PRIMA2022: Analysis of Carbon Neutrality Scenarios of Industrial C...
2022/11/17 PRIMA2022: Analysis of Carbon Neutrality Scenarios of Industrial C...
 
2022/06/15 JSAI2022: Data-driven Agent Design for Artificial Market Simulation
2022/06/15 JSAI2022: Data-driven Agent Design for Artificial Market Simulation2022/06/15 JSAI2022: Data-driven Agent Design for Artificial Market Simulation
2022/06/15 JSAI2022: Data-driven Agent Design for Artificial Market Simulation
 
2022/03/12 sigfin28: オプションによるオプションのヘッジを可能にする二重 Deep Hedging 機構
2022/03/12 sigfin28: オプションによるオプションのヘッジを可能にする二重 Deep Hedging 機構2022/03/12 sigfin28: オプションによるオプションのヘッジを可能にする二重 Deep Hedging 機構
2022/03/12 sigfin28: オプションによるオプションのヘッジを可能にする二重 Deep Hedging 機構
 
2020/11/19 PRIMA2020: Implementation of Real Data for Financial Market Simula...
2020/11/19 PRIMA2020: Implementation of Real Data for Financial Market Simula...2020/11/19 PRIMA2020: Implementation of Real Data for Financial Market Simula...
2020/11/19 PRIMA2020: Implementation of Real Data for Financial Market Simula...
 
2020/06/08 JSAI2020: STBM: Stochastic Trading Behavior Model for Financial Ma...
2020/06/08 JSAI2020: STBM: Stochastic Trading Behavior Model for Financial Ma...2020/06/08 JSAI2020: STBM: Stochastic Trading Behavior Model for Financial Ma...
2020/06/08 JSAI2020: STBM: Stochastic Trading Behavior Model for Financial Ma...
 
2020/03/18 NLP2020: 金融文書のための別タスク学習による教師なし重要文判定
2020/03/18 NLP2020: 金融文書のための別タスク学習による教師なし重要文判定2020/03/18 NLP2020: 金融文書のための別タスク学習による教師なし重要文判定
2020/03/18 NLP2020: 金融文書のための別タスク学習による教師なし重要文判定
 
2019/10/31 PRIMA2019: Comparison of Behaviors of Actual and Simulated HFT Tra...
2019/10/31 PRIMA2019: Comparison of Behaviors of Actual and Simulated HFT Tra...2019/10/31 PRIMA2019: Comparison of Behaviors of Actual and Simulated HFT Tra...
2019/10/31 PRIMA2019: Comparison of Behaviors of Actual and Simulated HFT Tra...
 
2018/11/17 ICDMW 2018: Selection of Related Stocks using Financial Text Mining
2018/11/17 ICDMW 2018: Selection of Related Stocks using Financial Text Mining2018/11/17 ICDMW 2018: Selection of Related Stocks using Financial Text Mining
2018/11/17 ICDMW 2018: Selection of Related Stocks using Financial Text Mining
 
2018/10/30 PRIMA Workshop 2018: Impact Assessments of the CAR Regulation usin...
2018/10/30 PRIMA Workshop 2018: Impact Assessments of the CAR Regulation usin...2018/10/30 PRIMA Workshop 2018: Impact Assessments of the CAR Regulation usin...
2018/10/30 PRIMA Workshop 2018: Impact Assessments of the CAR Regulation usin...
 
2018/06/06 JSAI2018 Effects Analysis of CAR Regulations on Financial Markets ...
2018/06/06 JSAI2018 Effects Analysis of CAR Regulations on Financial Markets ...2018/06/06 JSAI2018 Effects Analysis of CAR Regulations on Financial Markets ...
2018/06/06 JSAI2018 Effects Analysis of CAR Regulations on Financial Markets ...
 

Recently uploaded

BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxfenichawla
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...Call Girls in Nagpur High Profile
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdfKamal Acharya
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college projectTonystark477637
 
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGMANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGSIVASHANKAR N
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)simmis5
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Glass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesGlass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesPrabhanshu Chaturvedi
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 

Recently uploaded (20)

BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGMANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
Glass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesGlass Ceramics: Processing and Properties
Glass Ceramics: Processing and Properties
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 

2022/10/30 BESC2022: Parameter Tuning Method for Multi-agent Simulation using Reinforcement Learning

  • 1. Parameter Tuning Method for Multi-agent Simulation using Reinforcement Learning Masanori HIRANO, Kiyoshi IZUMI School of Engineering, The University of Tokyo research@mhirano.jp https://mhirano.jp/
  • 2. ©M.HIRANO & Izumi Lab. What’s Multi-agent Simulation? • MAS: Multi-agent Simulation • Simulate social phenomena by piling up agents’ behavior • Can see the emergent phenomena of complex systems caused by agents’ complex interactions • Useful as a tool to understand social phenomena • Artificial Market Simulation • MAS for financial markets • Agents’ interactions are necessary to replicate ”Stylized facts” (well known phenomena in financial markets)[Lux+, 1999] • Many models are available Lux, T., & Marchesi, M. (1999). Scaling and criticality in a stochastic multi-agent model of a financial market. Nature, 397(6719), 498–500. https://doi.org/10.1038/17290 10/30/22 BESC2022 2
  • 3. MAS
      • The usual workflow (Modeling → Parameter Tuning → Evaluation & Analysis):
          1. A human builds a simulation model
          2. The model parameters are tuned to replicate the actual phenomena
          3. Evaluation & analysis
      • Parameter tuning for MAS is difficult because:
          • MAS usually has many parameters
          • The response to parameter changes is not always continuous
          • MAS is mainly used for complex systems, which show chaotic phenomena
  • 4. Approach: Deep Reinforcement Learning
      • Social simulation and Bayesian optimization seem incompatible.
          • Bayesian optimization (Optuna): continuous estimation
          • Social simulation using MAS: each trial shows different movements, and chaotic phenomena or phase transitions frequently occur
      • Recently developed deep reinforcement learning is a possible solution.
      • Deep reinforcement learning can handle high-dimensional parameter spaces.
      • → We try to use reinforcement learning for MAS parameter tuning.
  • 5. Model Outline
  • 6. DDPG (Deep Deterministic Policy Gradients)
      • Actor-critic based
      • Continuous action space
      • https://arxiv.org/abs/1509.02971
      • [Figure: Actor: State → NN → Action. Critic: State and Action → Concatenate → NN]
  • 7. Architectures used in DDPG
      • General architectures widely used in reinforcement learning
      • Replay buffer (not prioritized)
      • Soft target
      • Exploration noise: Ornstein–Uhlenbeck (OU) process, $dr_t = -\theta (r_t - \mu)\,dt + \sigma\,dW_t$
      • In practice, Gaussian noise is enough, but we employed OU noise following the original paper
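The OU exploration noise on slide 7 can be simulated with a simple Euler–Maruyama step. The following is a minimal sketch, not the authors' implementation: the function name `ou_noise` and the default values of `theta`, `sigma`, and `dt` are illustrative assumptions and are not taken from the slides.

```python
import math
import random


def ou_noise(n_steps, theta=0.15, mu=0.0, sigma=0.2, dt=1.0, r0=0.0, seed=0):
    """Euler-Maruyama discretisation of dr_t = -theta (r_t - mu) dt + sigma dW_t.

    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    r = r0
    path = []
    for _ in range(n_steps):
        # Brownian increment over a step of length dt: N(0, sqrt(dt)) std dev
        dw = rng.gauss(0.0, math.sqrt(dt))
        # Mean-reverting drift towards mu plus the random shock
        r += -theta * (r - mu) * dt + sigma * dw
        path.append(r)
    return path


noise = ou_noise(1000)
print(len(noise), min(noise), max(noise))
```

Unlike independent Gaussian samples, consecutive values of this process are correlated, because each step starts from the previous value and is only pulled back towards `mu` at rate `theta`.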
  • 8. Simulation surrogate by the critic
      • The critic works as a surrogate for the simulation
      • Actor-critic-based RL is important because it provides this surrogate
  • 9. Proposed method to realize our idea
      • DDPG4MASPT (DDPG for MAS Parameter Tuning)
          1. Customized DDPG for our task
          2. Action Converter (AC)
          3. Redundant Full Neural Network Actor (FNNA)
          4. Seed Fixer (SF)
      • → According to the results, all of them are required to realize our idea!
  • 10. Customized DDPG
      • For the $i$-th iteration, the parameter set for the trial is: $P_i = A()$
      • Then, the surrogate error is minimized through the critic: $\min_{C} \mathrm{MSE}(o_i, C(P_i))$
      • According to the surrogate, the actor is updated as: $\max_{A} C(A())$
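One way to read the two optimization steps on slide 10 is as alternating gradient updates. The sketch below is an interpretation rather than the authors' stated procedure: the parameter symbols $\theta_C$ and $\theta_A$ and the learning rates $\alpha$ and $\beta$ are assumed notation that does not appear on the slide.

```latex
\begin{align}
  % Critic step: fit the surrogate to the observed simulation outcome o_i
  \theta_C &\leftarrow \theta_C
    - \alpha \, \nabla_{\theta_C} \bigl( o_i - C_{\theta_C}(P_i) \bigr)^2 , \\
  % Actor step: move the proposed parameters uphill on the surrogate
  \theta_A &\leftarrow \theta_A
    + \beta \, \nabla_{\theta_A} C_{\theta_C}\bigl( A_{\theta_A}() \bigr)
    = \theta_A
    + \beta \left( \frac{\partial A_{\theta_A}()}{\partial \theta_A} \right)^{\!\top}
      \nabla_{P} C_{\theta_C}(P) \Big|_{P = A_{\theta_A}()} .
\end{align}
```

The second equality is the chain rule: the actor never differentiates through the simulation itself, only through the critic, which is why the critic's quality as a surrogate matters.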
  • 11. Action Converter (AC)
      • Converts the action space
      • DDPG usually supports continuous spaces
      • For a non-negativity restriction, a mapping $(-\infty, \infty) \to (0, \infty)$ seems good for learning.
      • The action probability squashing of Soft Actor-Critic (SAC) is similar to this but not the same.
      • In our experiments, we employed the mapping $f(x) = \ln(1 + \exp(x))$
      • (Other mappings could be used but were not tested)
      • Because $\frac{df(x)}{dx} = \frac{e^x}{1 + e^x} = 1 - \frac{1}{1 + e^x}$, the effect of this mapping is not significant in the region $x \gg 1$
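The mapping $f(x) = \ln(1 + \exp(x))$ on slide 11 and its derivative can be checked numerically. This is a minimal sketch; the function names `softplus` and `softplus_grad` and the overflow-safe rearrangement are choices made here, not details given on the slide.

```python
import math


def softplus(x):
    """f(x) = ln(1 + exp(x)), rearranged so exp() never overflows."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def softplus_grad(x):
    """df/dx = exp(x) / (1 + exp(x)) = 1 - 1 / (1 + exp(x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


for x in (-5.0, 0.0, 5.0, 20.0):
    print(x, softplus(x), softplus_grad(x))
```

For large positive `x` the output approaches `x` and the gradient approaches 1, so the converter leaves the actor's output almost unchanged there; for negative `x` the output stays strictly positive.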
  • 12. Redundant Full Neural Network Actor
      • Use redundant neural networks for the actor
      • [Figure: two actor designs that output the parameter set of each iteration: the minimum requirements versus the redundant full neural network]
  • 13. Seed Fixer
      • In social simulations, the effect of seeds is usually ≫ the effect of parameters.
      • When the variance among simulation trials is large, it is hard to get precise gradients.
      • → Fix the seed and test
      • [Figure: blueprint comparing the outcomes when the seed is different and when the parameters are slightly different (when no phase transition is included)]
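The motivation for fixing the seed can be illustrated with a toy example. The simulator below is hypothetical (a noisy random walk with additive noise, not the artificial market used in the talk), and `simulate`, `base`, and `eps` are names introduced only for this sketch.

```python
import random


def simulate(drift, seed, n_steps=500):
    """Hypothetical toy simulator: final value of a noisy random walk."""
    rng = random.Random(seed)
    x = 0.0
    for _ in range(n_steps):
        x += drift + rng.gauss(0.0, 1.0)
    return x


base, eps = 0.10, 0.01

# Same seed: the noise terms are identical in both runs and cancel,
# leaving only the effect of the parameter change.
fixed = simulate(base + eps, seed=42) - simulate(base, seed=42)

# Different seeds: the difference is dominated by the noise.
unfixed = simulate(base + eps, seed=43) - simulate(base, seed=42)

print(f"fixed seed:     {fixed:.3f}")  # close to eps * n_steps = 5.0
print(f"different seed: {unfixed:.3f}")
```

In this toy case the noise is additive, so it cancels exactly under a fixed seed. A real multi-agent simulation will not cancel so cleanly, which is consistent with the slide's caveat that this holds only when no phase transition is involved.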
  • 14. Simulation model to evaluate our idea
      • Artificial market simulation
      • Stylized Trader Agents [Chiarella et al. 02]
      • Agents calculate a logarithmic return prediction for the bid/ask price: $r = \frac{1}{w_F + w_C + w_N} \left( w_F \cdot F + w_C \cdot C + w_N \cdot N \right)$
      • Fundamentals: $F = \frac{1}{\text{mean reversion time}} \ln \frac{\text{current market price}}{\text{current fundamental price}}$
      • Chartist (trend): $C$ = average logarithmic return in the past
      • Noise: $N \sim \mathcal{N}(0, \sigma)$
      • + margin ⇒ decide price
      • We tuned $w_F$, $w_C$.
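The weighted combination of the three terms on slide 14 can be sketched as follows. This is an illustration under assumptions: the function name `expected_log_return`, all numeric values, and the noise scale are made up for the example, and the fundamental term is passed in as a precomputed number rather than derived from prices.

```python
import random
from statistics import fmean


def expected_log_return(w_f, w_c, w_n, fundamental, chartist, noise):
    """r = (w_F * F + w_C * C + w_N * N) / (w_F + w_C + w_N)."""
    total = w_f + w_c + w_n
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return (w_f * fundamental + w_c * chartist + w_n * noise) / total


rng = random.Random(0)
past_log_returns = [0.002, -0.001, 0.003]  # illustrative history
chartist = fmean(past_log_returns)         # C: average past log return
noise = rng.gauss(0.0, 0.01)               # N: Gaussian noise, assumed scale

r = expected_log_return(w_f=1.0, w_c=0.5, w_n=1.0,
                        fundamental=0.004, chartist=chartist, noise=noise)
print(r)
```

Because the weights are normalized by their sum, only their ratios matter, so tuning `w_f` and `w_c` changes how strongly the agent leans on fundamentals and trend relative to noise.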
  • 15. Experiments & Results
      • The objective function is the MSE of skewness and kurtosis
      • Target: skewness = 0.0, kurtosis = 6.0
      • Only $w_F$ and $w_C$ were tuned
      • Comparative models:
          • Optuna (Bayesian estimation)
          • Models missing one or two components
      • Our proposed model showed the best performance! (However, the difference was not statistically significant.)
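The objective on slide 15 can be sketched as below. Two points are assumptions, since the slide does not specify them: kurtosis is computed here as the plain (non-excess) fourth standardized moment using population moments, and the MSE is taken as the mean of the two squared errors. The function names are hypothetical.

```python
def skewness_and_kurtosis(returns):
    """Population skewness and non-excess kurtosis (an assumed convention)."""
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n  # variance
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def objective(returns, target_skew=0.0, target_kurt=6.0):
    """Mean squared error between simulated and target moments."""
    skew, kurt = skewness_and_kurtosis(returns)
    return ((skew - target_skew) ** 2 + (kurt - target_kurt) ** 2) / 2.0


sample = [-0.02, -0.01, 0.0, 0.0, 0.0, 0.01, 0.02]  # illustrative returns
print(skewness_and_kurtosis(sample), objective(sample))
```

If the target of 6.0 was meant as excess kurtosis, subtracting 3 from the computed kurtosis before comparing would be the only change needed.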
  • 16. Discussion & Future work
      • Our proposed method works well, and the three additional components (AC, FNNA, SF) were necessary.
      • The critic works well as a surrogate for the simulation
      • The task used in this study is very simple: only 2 parameters are tuned.
      • Bayesian estimation is limited in its support for high-dimensional tuning (Bayesian estimation considers a sample as a POINT)
      • Gradient-based methods, such as DDPG, can support high-dimensional tuning. (A sample that carries a gradient can be considered as a surface)
      • According to some previous works, DDPG can solve high-dimensional tasks. Therefore, it seems a good fit for high-dimensional parameter tuning tasks. → This should be tested.
      • For future work, RL models with high exploration capability should be tested