Application of Monte-Carlo Tree Search in a Fighting Game AI (GCCE 2016)
Shubu Yoshida, Makoto Ishihara, Taichi Miyazaki,
Yuto Nakagawa, Tomohiro Harada, and Ruck Thawonmas
Intelligent Computer Entertainment Laboratory
Ritsumeikan University
Outline
1. Background of this research
2. Monte-Carlo Tree Search
3. Monte-Carlo Tree Search for a Fighting Game
4. Experimental Environment
5. Experimental Method
6. Result
7. Competition result in 2016
8. Conclusion
Background (1/2)
A Fighting Game AI Competition is held every year [1]
High-ranking AIs = rule-based (until 2015)
Rule-based: always takes the same action in the same situation
Human players can easily predict the AI's action patterns and outsmart it
[1] http://www.ice.ci.ritsumei.ac.jp/~ftgaic/
Background (2/2)
 Apply Monte-Carlo Tree Search (MCTS) to a fighting game AI
 Decides the AI's next action by stochastic simulations
 Already successful in many games [2][3]
We evaluate the effectiveness of MCTS in a fighting game
[2] S. Gelly et al., "The Grand Challenge of Computer Go: Monte Carlo Tree Search and Extensions", Communications of the ACM, Vol. 55, No. 3, pp. 106-113, 2012.
[3] N. Ikehata and T. Ito, "Monte-Carlo Tree Search in Ms. Pac-Man", in Computational Intelligence and Games (CIG), 2011 IEEE Conference on, pp. 39-46, 2011.
Monte-Carlo Tree Search (1/5)
[Figure: selection → expansion → simulation → backpropagation, repeated until the set time has elapsed]
Monte-Carlo Tree Search (2/5)
[Figure: selection → expansion → simulation → backpropagation, repeated until the set time has elapsed]
Formula of UCB1

UCB1_i = X̄_i + C √(2 ln N_i^p / N_i)

・X̄_i : the average reward (the evaluation value; exploitation)
・C : the balance parameter
・N_i^p : the total number of times the parent node of node i has been visited
・N_i : the total number of times node i has been visited

The second term (exploration) preferentially selects a child node that has been visited less.
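As a quick illustration, the UCB1 value above can be computed as follows (a minimal sketch; the function name and argument order are mine, not from the slides):

```python
import math

def ucb1(avg_reward, c, parent_visits, visits):
    """UCB1 of node i: exploitation (average reward) plus exploration bonus."""
    if visits == 0:
        return float("inf")  # unvisited children are tried first
    return avg_reward + c * math.sqrt(2 * math.log(parent_visits) / visits)
```

With a large balance parameter such as C = 3 (the value used in the experiments later), a rarely visited child can outrank a higher-scoring but heavily visited sibling, which is exactly the exploration behavior described on this slide.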
Monte-Carlo Tree Search (3/5)
[Figure: selection → expansion → simulation → backpropagation, repeated until the set time has elapsed]
Monte-Carlo Tree Search (4/5)
[Figure: selection → expansion → simulation → backpropagation, repeated until the set time has elapsed]
Monte-Carlo Tree Search (5/5)
[Figure: selection → expansion → simulation → backpropagation, repeated until the set time has elapsed]
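Putting the four steps together, the loop can be sketched generically in Python. This is my own toy illustration under stated assumptions, not code from the paper: `step` and `rollout` are placeholder interfaces, and the all-at-once expansion and depth cut-off anticipate the fighting-game variant described later in these slides.

```python
import math
import random

class Node:
    def __init__(self, state, parent=None, action=None):
        self.state, self.parent, self.action = state, parent, action
        self.children = []
        self.visits = 0
        self.total_reward = 0.0

    def ucb1(self, c):
        # Unvisited children get an infinite bonus, so they are tried first.
        if self.visits == 0:
            return float("inf")
        return (self.total_reward / self.visits
                + c * math.sqrt(2 * math.log(self.parent.visits) / self.visits))

def mcts(root_state, actions, step, rollout, iterations=300, c=1.0,
         visit_threshold=1, max_depth=2):
    """Selection, expansion, simulation, backpropagation, repeated."""
    root = Node(root_state)
    for _ in range(iterations):
        # 1. Selection: descend by UCB1 until reaching a leaf node.
        node, depth = root, 0
        while node.children:
            node = max(node.children, key=lambda ch: ch.ucb1(c))
            depth += 1
        # 2. Expansion: create all child nodes at once.
        if node.visits >= visit_threshold and depth < max_depth:
            node.children = [Node(step(node.state, a), node, a) for a in actions]
            node = random.choice(node.children)
        # 3. Simulation: estimate the leaf's value by a rollout.
        reward = rollout(node.state)
        # 4. Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    # The final decision is the most-visited child of the root.
    return max(root.children, key=lambda ch: ch.visits).action
```

On a trivial two-action problem where one action always yields reward 1 and the other 0, this loop concentrates its visits on the rewarding action, which is the behavior the five diagrams above depict.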
MCTS for a Fighting Game (1/2)
UCB1_i = X̄_i + C √(2 ln N_i^p / N_i)

X̄_i = (1/N_i) Σ_{j=1}^{N_i} eval_j

eval_j = (afterHP_j^my - beforeHP_j^my) - (afterHP_j^opp - beforeHP_j^opp)
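In code, the reward of one simulation and the node average might look like this (a sketch of the two formulas above; the HP values and function names are illustrative, not FightingICE API calls):

```python
def eval_reward(before_my_hp, after_my_hp, before_opp_hp, after_opp_hp):
    """eval_j: our HP change minus the opponent's HP change over simulation j.
    Positive when the opponent lost more HP than we did."""
    return (after_my_hp - before_my_hp) - (after_opp_hp - before_opp_hp)

def average_reward(evals):
    """X̄_i: the mean of the eval_j values collected at node i."""
    return sum(evals) / len(evals)
```

For example, a simulation in which we take 10 damage while the opponent takes 30 yields a reward of +20, so nodes that trade HP favorably accumulate higher averages.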
MCTS for a Fighting Game (2/2)
[Figure: unlike MCTS for a normal game tree, the Expansion step creates all possible child nodes at once, and the Simulation step is cut off at a limited tree depth]
Experimental Environment
FightingICE
Used as the platform of the international fighting game AI competition
1 game: 3 rounds
- 1 round: 60 seconds

myScore = oppHP / (myHP + oppHP) × 1000

Response time: 16.67 ms
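The scoring rule can be written directly (a sketch of the formula above; `my_score` is my own name, not a FightingICE API call):

```python
def my_score(my_hp, opp_hp):
    """Round score per the slide's formula: oppHP / (myHP + oppHP) * 1000.
    500 means an even match; above 500 means the player's AI outperformed
    the opponent (per the slides)."""
    return opp_hp / (my_hp + opp_hp) * 1000
```

So an even round scores exactly 500, and the two players' scores always sum to 1000.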
Experimental Method
MCTSAI (an AI applying MCTS) vs. the top 5 AIs of the 2015 tournament
All 5 AIs: rule-based
100 games (50 games on each side)
TABLE I. THE PARAMETERS USED IN THE EXPERIMENTS

Notation  Meaning                              Value
C         Balance parameter                    3
N_max     Threshold of the number of visits    10
D_max     Threshold of the depth of the tree   2
T_sim     Length of one simulation             60 frames
Result (1/5)
[Bar chart: average score (0 to 800) against each opponent AI]
Fig. 1. The average scores against the top 5 AIs of the 2015 tournament: Machete, Ni1mir4ri, Jay_Bot, RatioBot, and AI128200
Result (2/5)
[Bar chart: average score (0 to 800) against each opponent AI]
Fig. 1. The average scores against the top 5 AIs of the 2015 tournament: Machete, Ni1mir4ri, Jay_Bot, RatioBot, and AI128200
Result (3/5)
[Video] P1: MCTSAI, P2: RatioBot
Result (4/5)
[Bar chart: average score (0 to 800) against each opponent AI]
Fig. 1. The average scores against the top 5 AIs of the 2015 tournament: Machete, Ni1mir4ri, Jay_Bot, RatioBot, and AI128200
Result (5/5)
[Video] P1: MCTSAI, P2: Machete
Competition result in 2016
Total rank:

AI                    Rank
BANZAI                11
DragonSurvivor        12
iaTest                7
IchibanChan           9
JayBot2016            5
KeepYourDistanceBot   10
MctsAi                3
MrAsh                 4
Poring                8
Ranezi                2
Snorkel               13
Thunder01             1
Tomatensimulator      6
Triump                14
Conclusion
Applied MCTS to a fighting game AI
Showed that MCTS is effective in a fighting game AI
Future work
In fighting games, random simulation of the opponent's behavior is not effective
Predict the opponent's behavior and use this information in the simulation
Thank you for listening

Editor's Notes

  1. Hello everyone. My name is Shubu Yoshida, from the Intelligent Computer Entertainment Laboratory at Ritsumeikan University. I'd like to talk about "Application of Monte-Carlo Tree Search in a Fighting Game AI".
  2. This is the outline of my presentation. I’d like to talk about these contents.
  3. A Fighting Game AI Competition is held every year. In this competition, the high-ranking AIs have mainly been well-tuned rule-based AIs, which always take the same action in the same situation. Because rule-based AIs take predetermined actions, human players can easily predict their action patterns and outsmart them. Moreover, if the game's action parameters are changed, a rule-based AI's strength changes as well.
  4. In order to solve this problem, we apply MCTS to a fighting game AI. MCTS decides the AI's next action by stochastic simulations. MCTS-based approaches have produced significantly promising results not only in board games like Go [2], but also in real-time games like Ms. Pac-Man [3]. Since fighting games resemble Ms. Pac-Man in being real-time, MCTS is expected to perform well in a fighting game too. In this paper, we evaluate the effectiveness of MCTS in a fighting game.
  5. We modified traditional MCTS for a fighting game. This figure is an overview of traditional MCTS, which I'll explain first before describing our version for fighting games. MCTS combines game tree search with the Monte Carlo method. Each node represents a state of the game, and each edge represents an action.
  6. First, each child node has a UCB1 value, and MCTS repeatedly selects the child with the highest UCB1 value until it reaches a leaf node.
  7. The UCB1 value is calculated by this formula. The first term is the evaluation value, while the second term makes MCTS preferentially select child nodes that have been visited less. Together, they make MCTS choose a child node that not only has a high evaluation value but has also been visited less, preventing an overly local search. In short, the first term is exploitation and the second term is exploration.
  8. Second, after arriving at a leaf node, if its number of visits exceeds a pre-defined threshold and the depth of the tree has not reached the upper limit, MCTS will create child nodes from it.
  9. Third, MCTS performs a random simulation from the selected leaf node until the end of the game. In this step, the opponent's actions are selected randomly, while our own actions follow the path taken through the tree. After these actions are executed, we obtain a reward and a new state.
  10. Finally, MCTS propagates the result of the simulation from the leaf node up through its parents, recalculating UCB1 values, until it reaches the root node. These four steps are repeated within the allowed time budget. Then the child of the root node with the highest number of visits is chosen.
  11. In fighting games, UCB1 is defined by this formula. The evaluation value of node i is the average of the opponent character's hit-point loss minus the player character's hit-point loss. This value is higher when our AI deals a lot of damage to the opponent without taking damage itself. Each HP parameter refers to an AI's hit points before or after the j-th simulation: the first term is our own HP difference over the simulation, and the second term is the opponent's.
  12. In the expansion part, traditional MCTS expands only one node at a time. In this paper, we expand nodes for all actions the AI can take at once: fighting games have many actions and, being real-time, a limited search time, so we want to explore every node at least once. In the simulation part, board-game MCTS simulates until the end of the game, but real-time games have limited thinking time, so we put restrictions on the tree depth. These are the main changes in MCTS for fighting games.
  13. In the experiment, we used FightingICE as the fighting game platform. FightingICE is a 2D fighting game developed by our laboratory for game AI research, and it is used as the platform of international fighting game AI competitions recognized by IEEE CIG. The player AI's score (myScore) is calculated by this formula; a score above 500 means our AI's performance is superior to the opponent AI's.
  14. Next, the experimental method. We let MCTSAI fight 100 games against the top 5 AIs of the 2015 tournament, switching sides halfway. All of these AIs are rule-based. We used the parameters shown in the table.
  15. The average score against each AI is shown in Fig. 1. The horizontal axis lists the names of the high-ranking AIs, from the 1st-ranked on the left to the 5th-ranked on the right. The vertical axis shows MCTSAI's average score against each of them.
  16. From this result, the proposed AI outperformed all opponent AIs, except for the 1st ranked AI Machete.
  17. This video is a fighting game scene where P1 is MCTSAI and P2 is RatioBot, the 4th-ranked AI in the 2015 tournament. As we can see from the video, MCTSAI is able to dodge RatioBot's attacks, so the Monte-Carlo tree search simulation is working well. MCTS is thus an effective method in this fighting game.
  18. But the proposed AI did not show a good performance against Machete.
  19. This video is a fighting game scene where P1 is MCTSAI and P2 is Machete. Machete is a well-tuned rule-based AI that repeatedly conducts short actions requiring only a few frames, which are not simulated well by MCTS's random simulation.
  20. This is the competition result in 2016. The horizontal axis lists the AI names, and the numbers give each AI's ranking. In this competition, our MctsAi came 3rd, so MCTS also showed good results in an actual tournament.
  21. In conclusion, we applied MCTS to a fighting game AI, and the results showed that it is effective. We also found that random simulation of the opponent's behavior is not effective in fighting games, so in the future we plan to add a mechanism such as opponent behavior prediction and use it in the simulation. Such a mechanism should simulate the opponent better.