These are slides about the 2020 Fighting Game Artificial Intelligence Competition virtually presented at the 2020 IEEE Conference on Games (CoG), August 24-27, 2020.
2020 Fighting Game AI Competition
1. 2020 Fighting Game AI Competition
Yoshina Takano advisor
Ryota Ishii advisor
Hideyasu Inoue lead programmer
Keita Fujimaki programmer, tester, etc.
Pujana Paliyawan vice director
Ruck Thawonmas director
Team FightingICE
Intelligent Computer Entertainment Laboratory
Ritsumeikan University
Japan
Game resources are from The Rumble Fish 2 with the courtesy of Dimps Corporation.
http://www.ice.ci.ritsumei.ac.jp/~ftgaic/
3. FightingICE
A fighting game AI platform viable for development by a small team, written in Java and also wrapped for Python
The first of its kind, held since 2013 and at CIG since 2014; developed from scratch without using game ROM data
Aims:
Towards general fighting game AIs
Strong against any unseen opponents (AIs or players), character types, and play modes
http://www.ice.ci.ritsumei.ac.jp/~ftgaic/
Game resources are from The Rumble Fish 2 with the courtesy of Dimps Corporation.
4. FightingICE's Main Features
Has a 16.67 ms response time (60 FPS) for the agent to choose its action out of 40 actions
Provides the latest game state with a delay of 15 frames, to simulate human response time
Equipped with:
a forward model
a method for accessing the screen information
an OpenAI Gym API
Why FightingICE?
DRL does not prevail yet!?
Generalization against different opponents of unknown characters across multiple game modes is challenging
60 FPS + the introduced delay are challenging factors for MCTS
5. Recent Research Using FightingICE
or about Fighting Games
One journal paper at Expert Systems with Applications
Genetic state-grouping algorithm for deep reinforcement learning (Dec 2020)
One journal paper (under submission?!) at arXiv
Enhanced Rolling Horizon Evolution Algorithm with Opponent Model Learning:
Results for the Fighting Game AI Competition (arXiv:2003.13949)
Three conference papers at CoG 2020
Mastering Fighting Game Using Deep Reinforcement Learning With Self-play
Observer Interface Focused on Trends of Character Movement and Stamina in
Fighting Games
Towards Social Facilitation in Audience Participation Games: Fighting Game AIs
whose Strength Depends on Audience Responses (ours)
6. Recent Use in Education
Student projects with an internal competition at the Department of Games
and Interactive Media, School of Information Technology and
Innovation, Bangkok University, Thailand (May 2020)
8. Contest Rules
Three tournaments, using three characters, for the Standard and Speedrunning leagues:
ZEN, GARNET, and LUD (GARNET's and LUD's character data are not revealed in advance: unknown characters)
Standard: the winner of a round is the AI whose HP is above zero when its opponent's HP reaches zero (all AIs' initial HP = 400)
Speedrunning: the winner for a given character type is the AI with the shortest average time to beat our sample MctsAi (all AIs' initial HP = 400)
9. Summary of AI Fighters
14 entries from China, Japan, Korea, Singapore, and Thailand
1 sample AI (open-loop MCTS) from our group for reference
Techniques in use by the 14 submitted AIs (6 deep, 4 MCTS, 3 rules, 2 EA; a couple of AIs combine techniques):
4 AIs: deep reinforcement learning (3 PPO and 1 SAC), including 1 AI trained with the OpenAI Gym API
4 AIs: MCTS + rules or opponent modeling
3 AIs: pre-defined rules, including 1 AI with fuzzy rules
1 AI: RHEA + opponent modeling (with Deeplearning4j)
1 AI: genetic state-grouping algorithm for deep reinforcement learning
1 AI: search for the best action among prioritized actions for each character
11. Results
• Winner AI: ERHEA_PI by Zhentao Tang*, Rongqin Liang, and Mengchen Zhao (*2019 runner-up), University of Chinese Academy of Sciences and Huawei Noah's Ark Lab, China
• Rolling Horizon Evolutionary Algorithm combined with an adaptive learning-based opponent model (Deeplearning4j), utilizing two simulation modules from ReiwaThunder (the 2019 winner) (cf. the arXiv paper on slide 5)
• Runner-up AI: Tera Thunder by Eita Aoki (winner for the last four consecutive years), Japan
• 1. Prioritize certain actions in advance. 2. Predict the three most likely opponent actions. 3. Select the best AI action against the opponent's three actions using his original simulator.
• 3rd Place AI: ButcherPudge by Wen Bai (newcomer), Nanyang Technological University, Singapore
• Reinforcement learning algorithm SAC (Soft Actor-Critic) trained against the 2019 top AIs with the OpenAI Gym interface and the PyTorch library.
13. Fighting Game AI
AI: Fuzzy_ZYQAI
Developed by: ZHANG YUQI
Graduated from: Zhengzhou University
14. ∆ This AI is mainly based on fuzzy logic control and a finite-state machine.
∆ It was developed with:
⊿ FightingICE Version 4.50 ⊿ JavaSE-1.8 ⊿ Eclipse
∆ The main logic is:
Input: distance and energy data →
Fuzzified into: the feeling value →
Output: the probability of the next action
∆ Uses only one character: ZEN
15. Normal fuzzy function: a Gaussian function with low, middle, and high membership sets.
For the specifics of a fighting game AI, I designed the fuzzy functions over:
⊿ DistanceX ⊿ DistanceY ⊿ Energy
∆ Based on the fuzzy functions, the AI gets a feeling value in (0~1).
∆ Using a fuzzy rule base, the AI gets the probability of the next action.
∆ To realize these probabilities, I use a random function and the distribution of cases in a switch statement (sketched below).
∆ For multiple inputs, I implement the outputs by nesting switch statements.
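A minimal Java sketch of this scheme follows. The Gaussian centers and widths, the feeling-value blend, and the action names are illustrative assumptions, not values taken from Fuzzy_ZYQAI itself; the point is the membership functions plus the switch-case trick for probabilistic action choice.

```java
import java.util.Random;

public class FuzzyActionChooser {
    private static final Random RNG = new Random();

    // Gaussian membership: how strongly x belongs to a fuzzy set
    // centered at c with width sigma.
    static double gaussian(double x, double c, double sigma) {
        double d = (x - c) / sigma;
        return Math.exp(-0.5 * d * d);
    }

    // Blend low/middle/high memberships of DistanceX into one
    // "feeling value" in (0, 1); centers and weights are illustrative.
    static double feelingValue(double distanceX) {
        double low = gaussian(distanceX, 50, 40);
        double mid = gaussian(distanceX, 200, 80);
        double high = gaussian(distanceX, 450, 120);
        double sum = low + mid + high;
        return (1.0 * low + 0.5 * mid + 0.1 * high) / sum;
    }

    // Probabilistic choice via a random draw over switch cases:
    // the more cases map to an action, the more likely it is chosen.
    static String chooseAction(double feeling) {
        int roll = RNG.nextInt(10);            // 10 equally likely cases
        if (feeling > 0.6) {                   // close range: mostly attack
            switch (roll) {
                case 0: case 1: case 2: case 3: case 4: case 5:
                    return "STAND_B";          // 6/10 probability
                case 6: case 7:
                    return "THROW_A";          // 2/10
                default:
                    return "CROUCH_B";         // 2/10
            }
        }
        switch (roll) {                        // far range: mostly approach
            case 0: case 1: case 2: case 3: case 4: case 5: case 6:
                return "FOR_JUMP";             // 7/10
            default:
                return "DASH";                 // 3/10
        }
    }
}
```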
16. ∆ Fuzzy control is rule-based control. It needs no accurate mathematical model, so the control mechanism and strategy are easy to accept and understand.
∆ It is also good at simulating the process and methods of manual control, enhancing the adaptive ability of the control system and giving it a certain level of intelligence.
∆ Tests: in tests, this AI performed better than the AI I wrote earlier (which was based only on a finite-state machine).
19. Outline
Pick the 3 actions the opponent is most likely to take, and let my AI assume it knows what the enemy will do.
My AI chooses actions that let it hit the opponent faster and let the opponent hit my AI slower.
Weight the evaluation according to the likelihood of each opponent action.
20. Weak Actions
Small attacks are very weak.
e.g. Zen's STAND_D_DB_BA: its AttackHitArea is only 100.
I stopped using actions with an AttackHitArea smaller than 500.
Attacks with a short active time are very weak.
I stopped using actions whose attack-active time rate is less than 0.13 (filter sketched below).
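A sketch of this filter in Java. The two thresholds come from the slide; hitAreaOf() and activeTimeRate() are hypothetical helpers standing in for lookups into the character's motion data, not FightingICE API calls.

```java
import java.util.List;
import java.util.stream.Collectors;

public class WeakActionFilter {
    static final int MIN_HIT_AREA = 500;        // drop tiny hitboxes
    static final double MIN_ACTIVE_RATE = 0.13; // drop briefly active attacks

    // Keep only actions that pass both strength thresholds.
    static List<String> filter(List<String> candidates) {
        return candidates.stream()
                .filter(a -> hitAreaOf(a) >= MIN_HIT_AREA)
                .filter(a -> activeTimeRate(a) >= MIN_ACTIVE_RATE)
                .collect(Collectors.toList());
    }

    // Hypothetical lookups into the character's motion data.
    static int hitAreaOf(String action) {
        return "STAND_D_DB_BA".equals(action) ? 100 : 600; // example values
    }

    static double activeTimeRate(String action) {
        return 0.2; // active frames / total frames, example value
    }
}
```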
21. Sample Name
Until 2019, the name of the Speedrunning-mode enemy was "MctsAi".
But in this year's midterm, it was "SampleMctsAi".
I wasn't sure which name it would be in this year's final, so if the enemy has either name, my AI switches to speed mode.
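In Java this check is a one-liner (enemyName is an illustrative variable holding the opponent AI's name):

```java
// Enter speed mode if the enemy is the sample MCTS AI under either name.
boolean speedMode = "MctsAi".equals(enemyName) || "SampleMctsAi".equals(enemyName);
```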
23. Introduction
AI Name : SpringAI
Self-play Reinforcement Game AI
Developers & Affiliation
Dae-Wook Kim (dooroomie@etri.re.kr) and Teammates
Electronics and Telecommunications Research Institute (ETRI)
Daejeon, Korea
AI Development Language
Python 3.5
24. AI Outline
Method : Reinforcement Learning (RL) + Self-play
RL Configuration
Proximal Policy Optimization (PPO)
Trained against an MCTS AI ('SampleMctsAi', provided by the competition)
2-stage learning
25. AI Details
Some limitations on GARNET and LUD
RL is not suitable for unspecified characters.
However, we assumed that a well-trained agent would also show good performance even if the specifications of the characters are changed.
See our paper submitted to CoG 2020 for more details:
Mastering Fighting Game Using Deep Reinforcement Learning With Self-play (paper number 207)
33. DESCRIPTION-1
EmcmAi is a reinforcement-learning bot that uses a policy network to fight against opponents.
The policy network is trained by proximal policy optimization, playing against bots that participated in the competitions of the last few years.
The reward function is defined as r_t = (hp_{t-1}(opp) − hp_t(opp)) − (hp_{t-1}(own) − hp_t(own)), encouraging damage to the opponent while avoiding its own hurt.
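As a one-line sketch (variable names illustrative), the per-step reward is simply the opponent's HP loss minus our own HP loss over the step:

```java
// Reward for the step from frame t-1 to frame t.
double reward = (oppHpPrev - oppHpNow) - (ownHpPrev - ownHpNow);
```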
34. DESCRIPTION-2
During training, we designed an Elo-based opponent-selection mechanism, so that bots that are too weak do not appear frequently as opponents.
It encourages the agent to train more against strong bots to improve its own competitiveness.
The occurrence of opponent bots in training is proportional to the priority p(opp) = (1 − winrate(opp))^2, where winrate(opp) for a given opponent is determined by its Elo relative to our agent.
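A minimal Java sketch of priority-proportional sampling under this scheme. It reads winrate(opp) as our agent's expected win rate against the opponent, derived from relative Elo with the standard logistic formula; that reading of the slide's notation is an assumption (it is the one that makes weak bots rare, as the text says).

```java
import java.util.Random;

public class OpponentSampler {
    private static final Random RNG = new Random();

    // Our agent's expected win rate against an opponent, from relative
    // Elo (standard logistic Elo expectation).
    static double winrate(double ourElo, double oppElo) {
        return 1.0 / (1.0 + Math.pow(10.0, (oppElo - ourElo) / 400.0));
    }

    // Priority p(opp) = (1 - winrate(opp))^2: opponents we rarely beat
    // get high priority; easy opponents are rarely sampled.
    static double priority(double ourElo, double oppElo) {
        double w = winrate(ourElo, oppElo);
        return (1.0 - w) * (1.0 - w);
    }

    // Sample an opponent index with probability proportional to priority.
    static int sample(double ourElo, double[] oppElos) {
        double[] p = new double[oppElos.length];
        double total = 0;
        for (int i = 0; i < oppElos.length; i++) {
            p[i] = priority(ourElo, oppElos[i]);
            total += p[i];
        }
        double r = RNG.nextDouble() * total;
        for (int i = 0; i < p.length; i++) {
            r -= p[i];
            if (r <= 0) return i;
        }
        return p.length - 1; // numerical fallback
    }
}
```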
35. DESCRIPTION-3
In contrast to conventional RL problems, FightingICE has a dynamic action set: the candidate actions are determined by the AIR/GROUND state and the character's energy.
So we designed a predictCurrentFramedata() function that predicts the current frame data from the stored action commands between the delayed frame and the real current frame.
In the policy network, the unavailable actions are masked by assigning infinitely small values to their output nodes.
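A sketch of that masking step over raw logits in plain Java; in the actual bot this would be applied to the policy network's output layer, and it assumes at least one action is available.

```java
// Mask unavailable actions before the softmax: their logits are pushed
// to negative infinity, so their post-softmax probability is exactly 0.
static double[] maskedSoftmax(double[] logits, boolean[] available) {
    double[] masked = new double[logits.length];
    double max = Double.NEGATIVE_INFINITY;
    for (int i = 0; i < logits.length; i++) {
        masked[i] = available[i] ? logits[i] : Double.NEGATIVE_INFINITY;
        if (masked[i] > max) max = masked[i];
    }
    double sum = 0;
    double[] probs = new double[logits.length];
    for (int i = 0; i < logits.length; i++) {
        probs[i] = Math.exp(masked[i] - max); // exp(-inf) == 0
        sum += probs[i];
    }
    for (int i = 0; i < probs.length; i++) probs[i] /= sum;
    return probs;
}
```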
36. Enhanced Rolling Fighting Bot -
ERHEA_PI
Zhentao Tang (Student)
Affiliation: University of Chinese Academy of Sciences
Rongqin Liang (Student)
Affiliation: University of Chinese Academy of Sciences
Mengchen Zhao (Young Professional)
Affiliation: Huawei Noah’s Ark Lab
37. Enhanced Rolling Fighting Bot
• The Rolling Fighting Bot is based on the Rolling Horizon Evolutionary Algorithm, combined with an adaptive learning-based opponent model. It uses the Thunder bot as a reference, with its valid action set as candidates.
• Base: RHEA_PI, which we made in 2019.
• New approaches:
* Enrich the observation of the opponent's state.
* Enhance the reward design for the opponent model.
* Enlarge the state-action pair datasets for opponent-model training.
* Use the more reliable simulator proposed by Thunder.
39. FightingICE Competition
AI's Name : Caselene
Developer's Name : Jaturawit Chaiwong
Email : jaturawit.chaiwong@gmail.com
BU-MIT LAB
Bangkok University, Thailand
40. AI's Outline
• Our AI uses MCTS, which allows us to switch the evaluation to fit the situation at hand.
• We use the crouch kick, which we consider makes the opponent hard to guard.
• Combo attacks are used to score more.
41. Crouch kick
The crouch kick is an old strategy from the 90's-00's, when fighting games were popular. A lot of players back then used this strategy and won many times.
42. Combo
As shown on the previous page, when the opponent is pushed to the edge of the screen, the combo is the right choice to deal the most damage, getting more points and taking advantage while the opponent is down.
44. FightingICE Competition
AI name : MrTwo
Developer's Name : Tannop Sangvanloy
Supervisor: Dr. Kingkarn Sookhanaphibarn
BU-MIT Lab
Bangkok University, Thailand
45. AI Outline
• Uses MCTS for selecting the best action.
• Actions that knock down the opponent are more favorable.
• Switches the multiplier for each character.
46. Evaluation
• Knocking down the opponent is the best way to deal the most damage, because afterwards we can perform actions that they can't dodge or fight back against.
47. Switching multiplier
• Each character has different actions, but by using a multiplier we can adjust the score for each character to select the best action (sketched below).
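A sketch of per-character score multipliers in Java. The character names are the competition's three; the multiplier values and the knockdown-bonus framing are illustrative assumptions, not MrTwo's actual tuning.

```java
import java.util.Map;

public class CharacterMultiplier {
    // Per-character evaluation multipliers (values are illustrative).
    static final Map<String, Double> KNOCKDOWN_BONUS = Map.of(
            "ZEN", 1.5, "GARNET", 1.2, "LUD", 1.8);

    // Scale an action's MCTS score when it would knock the opponent down.
    static double adjustedScore(String character, double rawScore,
                                boolean knocksDown) {
        double m = knocksDown ? KNOCKDOWN_BONUS.getOrDefault(character, 1.0)
                              : 1.0;
        return rawScore * m;
    }
}
```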
49. Algorithm: Reinforcement Learning with PPO
I used a model-free reinforcement learning algorithm, Proximal Policy Optimization (PPO), to train my AI.
The following is the clipped surrogate objective that OpenAI proposed in Proximal Policy Optimization Algorithms, 2017. I used it as the objective to train my AI and update the weights.
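The slide's equation image did not survive extraction; the objective, reproduced here from the cited paper, is:

```latex
L^{CLIP}(\theta) = \hat{\mathbb{E}}_t\left[ \min\left( r_t(\theta)\,\hat{A}_t,\ \operatorname{clip}\big(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\big)\,\hat{A}_t \right) \right],
\qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)}
```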
Reference:
Schulman J, Wolski F, Dhariwal P, et al. Proximal Policy Optimization Algorithms. arXiv:1707.06347, 2017.
51. Load Weights: 2018 Samples
2018_Sample_AIs.LoadTorchWeightAI gave me a complete example to use directly. After training and getting my weights, I just needed to save them as _.csv and put the file in my aiData; the network class then does the forward computation.
There were only two places I changed:
First, I changed the InputData to get the state I need.
Second, I changed the way the action is chosen: my outputs are random policies (a probability distribution over actions), so I use roulette selection to get the action (sketched below).
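Roulette (fitness-proportionate) selection over the policy's output probabilities is a few lines of Java; this is a generic sketch, not the submitted AI's exact code.

```java
import java.util.Random;

// Pick an action index with probability proportional to probs[i]
// (probs is the policy network's output, assumed to sum to 1).
static int rouletteSelect(double[] probs, Random rng) {
    double r = rng.nextDouble();
    double cumulative = 0.0;
    for (int i = 0; i < probs.length; i++) {
        cumulative += probs[i];
        if (r < cumulative) return i;
    }
    return probs.length - 1; // guard against floating-point round-off
}
```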
52. Acknowledgements:
I acknowledge the important role played by my teachers, Zhu Yuan Huan and Zhao Dong Bin, my senior brother-in-learning, Tang Zhen Tao, and my junior brother-in-learning, Liang Rong Qing. My teammates Yu Yang, Tong Ru, Chen Hao, and Yu Chang also helped a lot at the beginning.
54. Outline
• This AI was developed using three core AI technologies and some heuristics.
1) State Grouping Method for Monte Carlo Tree Search
Genetic State-Grouping Algorithm for Deep Reinforcement Learning
(Expert Systems with Applications, 2020)
2) Hybrid Method for Zen Character
Hybrid fighting game AI using a genetic algorithm and Monte Carlo tree search
(GECCO, 2018)
3) Opponent Modeling for LUD Character
Opponent modeling based on action table for MCTS-based fighting game AI
(CIG, 2017)
57. BAI WEN (MSAI Student)
School of Computer Science and Engineering
Nanyang Technological University
BUTCHER PUDGE: A Deep Reinforcement Learning Agent
58. Butcher_Pudge is implemented based on the deep reinforcement learning algorithm SAC (Soft Actor-Critic). It was mainly trained to fight progressively against the 2019 award winners' AIs (ReiwaThunder, RHEA_PI, Toothless, ...). This AI was trained directly on the delayed RAM data and did not utilize the simulator provided by the platform.
The Q-network and the policy network each contain 3 layers with 256 hidden units.
Reward tuning was also used during training to help it converge faster.
The final deployed model only needs numpy to run: no GPU needed, no deep learning library needed. For details on how to run the AI, please refer to the README.md.
1. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, Haarnoja et al., 2018
60. AIBot in
FightingICE
competition 2020
AI name: YIYAI
Team: YIYAI
Developer name:
Mr. Thavatchai Kruhpoong, thavatchai.kruh@bumail.net
Mr. Peera Chuenbunchom, peera.chue@bumail.net
Mr. Kittiphop Ritthijan, kittiphop.ritt@bumail.net
Mr. Jitti Sailektim, jitti.sail@bumail.net
Mr. Danuporn Poonsang, danuporn.poon@bumail.net
Supervisor: Dr. Kingkarn Sookhanaphibarn
Bangkok University
61. The AI switches between Aggressive and Defensive behavior across four distance ranges: Close, Close-medium, Medium, and Far (a code sketch follows the table). The slide's rule table, flattened here, includes:
• Random between Forward Jump and Forward Dash; emphasizes moving closer to the OPP because the distance is not too far.
• Energy >= 30: chance ¼ of a forward ranged attack; if not releasing an ultimate skill, random between Forward Jump and Forward Dash, because at this far distance normal skills cannot deal damage, only ultimate skills can.
• Energy >= 30: chance ¼ of an upper ranged attack, or random between Forward Jump and Forward Dash; emphasizes moving closer to the OPP for melee attacks. Releasing either skill up into the air can damage the OPP at this distance whether the OPP is on the ground or in the air.
• Energy >= 30: chance ¼ of an upper ranged attack, or random between Kick, Crouch Kick, and Back Jump; emphasizes keeping far from the OPP. Releasing any of the three skills damages the OPP at far distance better than the alternatives.
• Random between Crouch Kick, Forward Kick, Kick, and Throw, because they are fast and deal much damage.
• Random between Crouch Kick, Kick, Throw, Forward Jump, Back Step, and Forward Dash, because they are fast while keeping distance, with attacks at the OPP's back.
• Energy >= 300, Distance X <= 300, Distance Y <= 50: use the ultimate attack.
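A minimal Java sketch of this kind of distance/energy rule dispatch. The Energy >= 300 ultimate rule and the chance-¼ ranged attack come from the slide; the range boundaries, the aggressive/defensive split, and the action names are illustrative assumptions.

```java
import java.util.Random;

public class RuleDispatch {
    private static final Random RNG = new Random();

    static String decide(int energy, int distX, int distY, boolean aggressive) {
        // Highest-priority rule from the slide: the ultimate attack.
        if (energy >= 300 && distX <= 300 && distY <= 50) {
            return "ULTIMATE_ATTACK";
        }
        if (distX < 100) { // close range (boundary illustrative)
            return aggressive
                    ? pick("CROUCH_B", "STAND_FB", "STAND_B", "THROW_A")
                    : pick("CROUCH_B", "STAND_B", "THROW_A",
                           "FOR_JUMP", "BACK_STEP", "DASH");
        }
        if (energy >= 30 && RNG.nextInt(4) == 0) { // chance 1/4
            return "RANGED_ATTACK"; // forward or upper, per range
        }
        return pick("FOR_JUMP", "DASH"); // close the distance
    }

    static String pick(String... actions) {
        return actions[RNG.nextInt(actions.length)];
    }
}
```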
63. AIBot in
FightingICE
competition 2020
AI name: Jitwisut
Team: Jitwisut
Developer name:
Mr. Jitwisut Hongthong, jitwisut.hong@bumail.net
Supervisor: Dr. Kingkarn Sookhanaphibarn
Bangkok University
64. AI Outline
● Our AI modifies the evaluation function of MCTS to allow it to make combos on the OPP.
● The OPP's STUN is detected. STUN was emphasized because it was found useful in many previous works, so we also use it in the COMBO score.
● STUN = the OPP is down or stuck at a corner (check sketched below).
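A sketch of the stun check exactly as the slide defines it; the state string and the corner margin are illustrative assumptions.

```java
// STUN per the slide: the opponent is down, or stuck at a corner.
static boolean isStunned(String oppState, int oppX, int stageWidth) {
    boolean down = "DOWN".equals(oppState);
    int cornerMargin = 60; // illustrative distance from the wall
    boolean cornered = oppX <= cornerMargin || oppX >= stageWidth - cornerMargin;
    return down || cornered;
}
```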
66. AIBot in
FightingICE
competition 2020
AI name: MonkeyLink_TriplePM.jar
Team: MonkeyLink_TriplePM
Developer name:
Mr. Supakit U-sabai, supakit.usab@bumail.net
Mr. Thanawat Sappawattanakun, tanawat.sapp@bumail.net
Mr. Sakchai Suthat, sakchai.suth@bumail.net
Mr. Chaiyaboon Pladisai, chaiyaboon.plad@bumail.net
Supervisor: Dr. Kingkarn Sookhanaphibarn
Bangkok University
67. AI Outline
Our AI uses a rule-based state machine (sketched after this list).
Steps:
● Check the distance between our position and the opponent's.
● Set the current state.
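A minimal Java sketch of such a distance-driven state machine; the states, boundaries, and mapped actions are illustrative assumptions, not the submitted AI's rules.

```java
public class RuleStateMachine {
    enum FightState { CLOSE_COMBAT, MID_APPROACH, LONG_RANGE }

    // Step 1: derive the state from the distance to the opponent.
    static FightState currentState(int distX) {
        if (distX < 120) return FightState.CLOSE_COMBAT;
        if (distX < 350) return FightState.MID_APPROACH;
        return FightState.LONG_RANGE;
    }

    // Step 2: each state maps to a rule for the next action.
    static String actionFor(FightState s) {
        switch (s) {
            case CLOSE_COMBAT: return "STAND_B"; // attack
            case MID_APPROACH: return "DASH";    // close the gap
            default:           return "FOR_JUMP"; // approach from the air
        }
    }
}
```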
70. Outline
· Modified from the MctsAi sample
· Optimized simulation step:
◦ The opponent's action is based on data instead of random
◦ Use KNN to predict the opponent's next action from the data (see the sketch after this list)
◦ Data structure in the data set:
◦ oppActionData; myActionData; distanceXData; distanceYData; oppStateData; myStateData; mHP; oHP
◦ score = (myHP(t) − myHP(t+3s)) − (oppHP(t) − oppHP(t+3s)); a higher score means a better action
◦ Calculate the similarity between the frame data and the data set, find the most similar state, and take its best action
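A minimal Java sketch of the KNN lookup over such records; the feature weighting, the categorical-mismatch penalty, and k = 1 are illustrative assumptions (it also assumes a non-empty data set).

```java
import java.util.List;

public class KnnOpponentModel {
    // One stored observation, matching the slide's record layout.
    static class Record {
        int oppAction, myAction, distX, distY, oppState, myState;
        double score; // (myHP(t) - myHP(t+3s)) - (oppHP(t) - oppHP(t+3s))
    }

    // Squared distance between the live situation and a stored record;
    // the mismatch penalty (10,000) is an illustrative weight.
    static double distance(Record r, int distX, int distY,
                           int oppState, int myState) {
        double d = Math.pow(r.distX - distX, 2) + Math.pow(r.distY - distY, 2);
        if (r.oppState != oppState) d += 10_000;
        if (r.myState != myState) d += 10_000;
        return d;
    }

    // k = 1 nearest neighbour: predict the opponent's next action as the
    // action it took in the most similar stored situation.
    static int predictOppAction(List<Record> data, int distX, int distY,
                                int oppState, int myState) {
        Record best = null;
        double bestD = Double.MAX_VALUE;
        for (Record r : data) {
            double d = distance(r, distX, distY, oppState, myState);
            if (d < bestD) { bestD = d; best = r; }
        }
        return best.oppAction;
    }
}
```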
71. Future Work
· Increase the quality of the data set
· Improve the training method
· Use a data-driven method to optimize MCTS selection
· Use an FSM for special situations
73. Thank you and see you at
CoG 2021 in Copenhagen
http://www.ice.ci.ritsumei.ac.jp/~ftgaic/
Editor's Notes
Hi there, I am Ruck Thawonmas, director of team FightingICE that organizes this fighting game AI competition.
Here are the contents of this video.
FightingICE is a fighting game AI platform in Java and also wrapped for Python. The main aim of this platform is to advance research and development of general fighting game AIs.
FightingICE’s main features are shown here. These features make it challenging for not only deep reinforcement learning but also Monte-Carlo tree search and the likes, typically used in development of game AI these days.
This is for your reference. Four of them here are from other groups.
This is also for your reference, which shows that this platform is a good candidate for use in education.
Next is about the contest.
If you submit your AI to this competition, it must fight in the six different environments shown here. Among these three characters, the character data, such as the damage amount of each action, of GARNET and LUD are not known. In the past three competitions, only the LUD data were unknown; before that, all were known.
This year we received 14 entries from the five countries shown here. Deep learning, MCTS, rules, and evolutionary algorithms are used in 6, 4, 3, and 2 AIs, respectively. Note that a couple of AIs use a combination of these techniques.
Results!
Orange, blue, and green show the 1st, 2nd, and 3rd place for each fighting environment.
Without further ado, the winner goes to ......
ERHEA_PI by Zhentao Tang and his teammates. Congratulations! This AI uses …
For more details, see the arXiv paper on slide 5. The runner-up is ……. Its description is shown here. 3rd place is …, who is a newcomer to this competition. It is the first AI developed with the OpenAI Gym interface for our competition.
Congratulations again to these winners. Please note that the source code of all submitted entries and their description slides will be made available on our website soon.
The source code of all the submitted entries is available on our website. We look forward to your participation in our next competition.