These are slides about the 2020 Fighting Game Artificial Intelligence Competition virtually presented at the 2020 IEEE Conference on Games (CoG), August 24-27, 2020.
2020 Fighting Game AI Competition
1. 2020 Fighting Game AI Competition
Yoshina Takano advisor
Ryota Ishii advisor
Hideyasu Inoue lead programmer
Keita Fujimaki programmer, tester, etc.
Pujana Paliyawan vice director
Ruck Thawonmas director
Team FightingICE
Intelligent Computer Entertainment Laboratory
Ritsumeikan University
Japan
Game resources are from The Rumble Fish 2 with the courtesy of Dimps Corporation.
http://www.ice.ci.ritsumei.ac.jp/~ftgaic/
3. FightingICE
A fighting game AI platform viable for development by a small team, written in Java and also wrapped for Python
The first of its kind, held since 2013 and at CIG since 2014; developed from scratch without using game ROM data
Aims:
Towards general fighting game AIs
Strong against any unseen opponents (AIs or players), character types, and play modes
http://www.ice.ci.ritsumei.ac.jp/~ftgaic/
Game resources are from The Rumble Fish 2 with the courtesy of Dimps Corporation.
4. FightingICE's Main Features
Has a 16.67 ms response time (60 FPS) for the agent to choose its action out of 40 actions
Provides the latest game state with a delay of 15 frames, to simulate human response time
Equipped with:
a forward model
a method for accessing the screen information
an OpenAI Gym API
Why FightingICE?
DRL does not prevail yet!?
Generalization against different opponents of unknown characters across multiple game modes is challenging
60 FPS + the introduced delay are challenging factors for MCTS
5. Recent Research Using FightingICE
or about Fighting Games
One journal paper at Expert Systems with Applications
Genetic state-grouping algorithm for deep reinforcement learning (Dec 2020)
One journal paper (under submission?!) at arXiv
Enhanced Rolling Horizon Evolution Algorithm with Opponent Model Learning:
Results for the Fighting Game AI Competition (arXiv:2003.13949)
Three conference papers at CoG 2020
Mastering Fighting Game Using Deep Reinforcement Learning With Self-play
Observer Interface Focused on Trends of Character Movement and Stamina in
Fighting Games
Towards Social Facilitation in Audience Participation Games: Fighting Game AIs
whose Strength Depends on Audience Responses (ours)
6. Recent Use in Education
Student projects with an internal competition at the Department of Games
and Interactive Media, School of Information Technology and
Innovation, Bangkok University, Thailand (May 2020)
8. Contest Rules
Three tournaments, using three characters, for the Standard and Speedrunning leagues:
ZEN, GARNET, and LUD (GARNET's and LUD's character data are not revealed in advance: unknown characters)
Standard: the winner of a round is the AI whose HP is above zero when its opponent's HP reaches zero (all AIs' initial HP = 400)
Speedrunning: the winner for a given character type is the AI with the shortest average time to beat our sample MctsAi (all AIs' initial HP = 400)
9. Summary of AI Fighters
14 entries from China, Japan, Korea, Singapore, and Thailand
1 sample AI (open-loop MCTS) from our group for reference
Techniques in use by the 14 submitted AIs (6 deep, 4 MCTS, 3 rules, 2 EA; a couple of AIs combine techniques):
4 AIs: deep reinforcement learning (3 PPO and 1 SAC), including 1 AI trained with the OpenAI Gym API
4 AIs: MCTS + rules or opponent modeling
3 AIs: pre-defined rules, including 1 AI with fuzzy rules
1 AI: RHEA + opponent modeling (with Deeplearning4j)
1 AI: genetic state-grouping algorithm for deep reinforcement learning
1 AI: search for the best action among prioritized actions for each character
11. Results
• Winner AI: ERHEA_PI by Zhentao Tang*, Rongqin Liang, and Mengchen Zhao (*2019 runner-up), University of Chinese Academy of Sciences and Huawei Noah's Ark Lab, China
• Rolling Horizon Evolutionary Algorithm combined with an adaptive learning-based opponent model (Deeplearning4j), utilizing two simulation modules from ReiwaThunder (the 2019 winner) (cf. the arXiv paper on slide 5)
• Runner-up AI: Tera Thunder by Eita Aoki (winner for the last four consecutive years), Japan
• 1. Prioritize certain actions in advance. 2. Predict the three most likely opponent actions. 3. Select the best AI action against the opponent's three actions using his original simulator.
• 3rd Place AI: ButcherPudge by Wen Bai (newcomer), Nanyang Technological University, Singapore
• Reinforcement learning algorithm SAC (Soft Actor-Critic) trained against the 2019 top AIs with the OpenAI Gym interface and the PyTorch library.
13. Fighting Game AI
AI: Fuzzy_ZYQAI
Developed by: ZHANG YUQI
Graduated from: Zhengzhou University
14. ∆ This AI is mainly based on fuzzy logic control and a finite-state machine.
∆ It was developed with:
⊿ FightingICE Version 4.50 ⊿ JavaSE-1.8 ⊿ Eclipse
∆ The main logic is:
Input: distance and energy data →
Fuzzified into: the feeling value →
Output: the probability of the next action
∆ Uses only one character: ZEN
15. Normal fuzzy function: a Gaussian function with low, middle, and high membership sets.
For the specifics of a fighting game AI, I designed the fuzzy functions over:
⊿ DistanceX ⊿ DistanceY ⊿ Energy
∆ Based on the fuzzy functions, the AI gets a feeling value in (0~1).
∆ Using a fuzzy rule base, the AI gets the probability of the next action.
∆ To realize these probabilities, I use a random function and the distribution of cases in a switch statement (sketched below).
∆ For multiple inputs, I implement the outputs by nesting switch statements.
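A minimal Java sketch of this scheme follows. The Gaussian centers and widths, the feeling-value blend, and the action names are illustrative assumptions, not values taken from Fuzzy_ZYQAI itself; the point is the membership functions plus the switch-case trick for probabilistic action choice.

```java
import java.util.Random;

public class FuzzyActionChooser {
    private static final Random RNG = new Random();

    // Gaussian membership: how strongly x belongs to a fuzzy set
    // centered at c with width sigma.
    static double gaussian(double x, double c, double sigma) {
        double d = (x - c) / sigma;
        return Math.exp(-0.5 * d * d);
    }

    // Blend low/middle/high memberships of DistanceX into one
    // "feeling value" in (0, 1); centers and weights are illustrative.
    static double feelingValue(double distanceX) {
        double low = gaussian(distanceX, 50, 40);
        double mid = gaussian(distanceX, 200, 80);
        double high = gaussian(distanceX, 450, 120);
        double sum = low + mid + high;
        return (1.0 * low + 0.5 * mid + 0.1 * high) / sum;
    }

    // Probabilistic choice via a random draw over switch cases:
    // the more cases map to an action, the more likely it is chosen.
    static String chooseAction(double feeling) {
        int roll = RNG.nextInt(10);            // 10 equally likely cases
        if (feeling > 0.6) {                   // close range: mostly attack
            switch (roll) {
                case 0: case 1: case 2: case 3: case 4: case 5:
                    return "STAND_B";          // 6/10 probability
                case 6: case 7:
                    return "THROW_A";          // 2/10
                default:
                    return "CROUCH_B";         // 2/10
            }
        }
        switch (roll) {                        // far range: mostly approach
            case 0: case 1: case 2: case 3: case 4: case 5: case 6:
                return "FOR_JUMP";             // 7/10
            default:
                return "DASH";                 // 3/10
        }
    }
}
```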
16. ∆ Fuzzy control is rule-based control. It needs no accurate mathematical model, so the control mechanism and strategy are easy to accept and understand.
∆ It is also good at simulating the process and methods of manual control, enhancing the adaptive ability of the control system and giving it a certain level of intelligence.
∆ Tests: in tests, this AI performed better than the AI I wrote earlier (which was based only on a finite-state machine).
19. Outline
Pick the 3 actions the opponent is most likely to take, and let my AI assume it knows what the enemy will do.
My AI chooses actions that let it hit the opponent faster and let the opponent hit my AI slower.
Weight the evaluation according to the likelihood of each opponent action.
20. Weak Actions
Small attacks are very weak.
e.g. Zen's STAND_D_DB_BA: its AttackHitArea is only 100.
I stopped using actions with an AttackHitArea smaller than 500.
Attacks with a short active time are very weak.
I stopped using actions whose attack-active time rate is less than 0.13 (filter sketched below).
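A sketch of this filter in Java. The two thresholds come from the slide; hitAreaOf() and activeTimeRate() are hypothetical helpers standing in for lookups into the character's motion data, not FightingICE API calls.

```java
import java.util.List;
import java.util.stream.Collectors;

public class WeakActionFilter {
    static final int MIN_HIT_AREA = 500;        // drop tiny hitboxes
    static final double MIN_ACTIVE_RATE = 0.13; // drop briefly active attacks

    // Keep only actions that pass both strength thresholds.
    static List<String> filter(List<String> candidates) {
        return candidates.stream()
                .filter(a -> hitAreaOf(a) >= MIN_HIT_AREA)
                .filter(a -> activeTimeRate(a) >= MIN_ACTIVE_RATE)
                .collect(Collectors.toList());
    }

    // Hypothetical lookups into the character's motion data.
    static int hitAreaOf(String action) {
        return "STAND_D_DB_BA".equals(action) ? 100 : 600; // example values
    }

    static double activeTimeRate(String action) {
        return 0.2; // active frames / total frames, example value
    }
}
```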
21. Sample Name
Until 2019, the name of the Speedrunning-mode enemy was "MctsAi".
But in this year's midterm, it was "SampleMctsAi".
I wasn't sure which name it would be in this year's final, so if the enemy has either name, my AI switches to speed mode.
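In Java this check is a one-liner (enemyName is an illustrative variable holding the opponent AI's name):

```java
// Enter speed mode if the enemy is the sample MCTS AI under either name.
boolean speedMode = "MctsAi".equals(enemyName) || "SampleMctsAi".equals(enemyName);
```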
23. Introduction
AI Name : SpringAI
Self-play Reinforcement Game AI
Developers & Affiliation
Dae-Wook Kim (dooroomie@etri.re.kr) and Teammates
Electronics and Telecommunications Research Institute (ETRI)
Daejeon, Korea
AI Development Language
Python 3.5
24. AI Outline
Method : Reinforcement Learning (RL) + Self-play
RL Configuration
Proximal Policy Optimization (PPO)
Trained against an MCTS AI ('SampleMctsAi', provided by the competition)
2-stage learning
25. AI Details
Some limitations on GARNET and LUD
RL is not suitable for unspecified characters.
However, we assumed that a well-trained agent would also show good performance even if the specifications of the characters are changed.
See our paper submitted to CoG 2020 for more details:
Mastering Fighting Game Using Deep Reinforcement Learning With Self-play (paper number 207)
33. DESCRIPTION-1
EmcmAi is a reinforcement-learning bot that uses a policy network to fight against opponents.
The policy network is trained by proximal policy optimization, playing against bots that participated in the competitions of the last few years.
The reward function is defined as r_t = (hp_{t-1}(opp) − hp_t(opp)) − (hp_{t-1}(own) − hp_t(own)), encouraging damage to the opponent while avoiding its own hurt.
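As a one-line sketch (variable names illustrative), the per-step reward is simply the opponent's HP loss minus our own HP loss over the step:

```java
// Reward for the step from frame t-1 to frame t.
double reward = (oppHpPrev - oppHpNow) - (ownHpPrev - ownHpNow);
```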
34. DESCRIPTION-2
During training, we designed an Elo-based opponent-selection mechanism, so that bots that are too weak do not appear frequently as opponents.
It encourages the agent to train more against strong bots to improve its own competitiveness.
The occurrence of opponent bots in training is proportional to the priority p(opp) = (1 − winrate(opp))^2, where winrate(opp) for a given opponent is determined by its Elo relative to our agent.
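A minimal Java sketch of priority-proportional sampling under this scheme. It reads winrate(opp) as our agent's expected win rate against the opponent, derived from relative Elo with the standard logistic formula; that reading of the slide's notation is an assumption (it is the one that makes weak bots rare, as the text says).

```java
import java.util.Random;

public class OpponentSampler {
    private static final Random RNG = new Random();

    // Our agent's expected win rate against an opponent, from relative
    // Elo (standard logistic Elo expectation).
    static double winrate(double ourElo, double oppElo) {
        return 1.0 / (1.0 + Math.pow(10.0, (oppElo - ourElo) / 400.0));
    }

    // Priority p(opp) = (1 - winrate(opp))^2: opponents we rarely beat
    // get high priority; easy opponents are rarely sampled.
    static double priority(double ourElo, double oppElo) {
        double w = winrate(ourElo, oppElo);
        return (1.0 - w) * (1.0 - w);
    }

    // Sample an opponent index with probability proportional to priority.
    static int sample(double ourElo, double[] oppElos) {
        double[] p = new double[oppElos.length];
        double total = 0;
        for (int i = 0; i < oppElos.length; i++) {
            p[i] = priority(ourElo, oppElos[i]);
            total += p[i];
        }
        double r = RNG.nextDouble() * total;
        for (int i = 0; i < p.length; i++) {
            r -= p[i];
            if (r <= 0) return i;
        }
        return p.length - 1; // numerical fallback
    }
}
```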
35. DESCRIPTION-3
In contrast to conventional RL problems, FightingICE has a dynamic action set: the candidate actions are determined by the AIR/GROUND state and the character's energy.
So we designed a predictCurrentFramedata() function that predicts the current frame data from the stored action commands between the delayed frame and the real current frame.
In the policy network, the unavailable actions are masked by assigning infinitely small values to their output nodes.
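A sketch of that masking step over raw logits in plain Java; in the actual bot this would be applied to the policy network's output layer, and it assumes at least one action is available.

```java
// Mask unavailable actions before the softmax: their logits are pushed
// to negative infinity, so their post-softmax probability is exactly 0.
static double[] maskedSoftmax(double[] logits, boolean[] available) {
    double[] masked = new double[logits.length];
    double max = Double.NEGATIVE_INFINITY;
    for (int i = 0; i < logits.length; i++) {
        masked[i] = available[i] ? logits[i] : Double.NEGATIVE_INFINITY;
        if (masked[i] > max) max = masked[i];
    }
    double sum = 0;
    double[] probs = new double[logits.length];
    for (int i = 0; i < logits.length; i++) {
        probs[i] = Math.exp(masked[i] - max); // exp(-inf) == 0
        sum += probs[i];
    }
    for (int i = 0; i < probs.length; i++) probs[i] /= sum;
    return probs;
}
```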
36. Enhanced Rolling Fighting Bot -
ERHEA_PI
Zhentao Tang (Student)
Affiliation: University of Chinese Academy of Sciences
Rongqin Liang (Student)
Affiliation: University of Chinese Academy of Sciences
Mengchen Zhao (Young Professional)
Affiliation: Huawei Noah’s Ark Lab
37. Enhanced Rolling Fighting Bot
• The Rolling Fighting Bot is based on the Rolling Horizon Evolutionary Algorithm, combined with an adaptive learning-based opponent model. It uses the Thunder bot as a reference, with its valid action set as candidates.
• Base: RHEA_PI, which we made in 2019.
• New approaches:
* Enrich the observation of the opponent's state.
* Enhance the reward design for the opponent model.
* Enlarge the state-action pair datasets for opponent-model training.
* Use the more reliable simulator proposed by Thunder.
39. FightingICE Competition
AI's Name : Caselene
Developer's Name : Jaturawit Chaiwong
Email : jaturawit.chaiwong@gmail.com
BU-MIT LAB
Bangkok University, Thailand
40. AI's Outline
• Our AI uses MCTS, which allows us to switch the evaluation to fit the situation at hand.
• We use the crouch kick, which we consider makes the opponent hard to guard.
• Combo attacks are used to score more.
41. Crouch kick
The crouch kick is an old strategy from the 90's-00's, when fighting games were popular. A lot of players back then used this strategy and won many times.
42. Combo
As shown on the previous page, when the opponent is pushed to the edge of the screen, the combo is the right choice to deal the most damage, getting more points and taking advantage while the opponent is down.
44. FightingICE Competition
AI name : MrTwo
Developer's Name : Tannop Sangvanloy
Supervisor: Dr. Kingkarn Sookhanaphibarn
BU-MIT Lab
Bangkok University, Thailand
45. AI Outline
• Uses MCTS for selecting the best action.
• Actions that knock down the opponent are more favorable.
• Switches the multiplier for each character.
46. Evaluation
• Knocking down the opponent is the best way to deal the most damage, because afterwards we can perform actions that they can't dodge or fight back against.
47. Switching multiplier
• Each character has different actions, but by using a multiplier we can adjust the score for each character to select the best action (sketched below).
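A sketch of per-character score multipliers in Java. The character names are the competition's three; the multiplier values and the knockdown-bonus framing are illustrative assumptions, not MrTwo's actual tuning.

```java
import java.util.Map;

public class CharacterMultiplier {
    // Per-character evaluation multipliers (values are illustrative).
    static final Map<String, Double> KNOCKDOWN_BONUS = Map.of(
            "ZEN", 1.5, "GARNET", 1.2, "LUD", 1.8);

    // Scale an action's MCTS score when it would knock the opponent down.
    static double adjustedScore(String character, double rawScore,
                                boolean knocksDown) {
        double m = knocksDown ? KNOCKDOWN_BONUS.getOrDefault(character, 1.0)
                              : 1.0;
        return rawScore * m;
    }
}
```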
49. Algorithm: Reinforcement Learning with PPO
I used a model-free reinforcement learning algorithm, Proximal Policy Optimization (PPO), to train my AI.
The following is the clipped surrogate objective that OpenAI proposed in Proximal Policy Optimization Algorithms, 2017. I used it as the objective to train my AI and update the weights.
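The slide's equation image did not survive extraction; the objective, reproduced here from the cited paper, is:

```latex
L^{CLIP}(\theta) = \hat{\mathbb{E}}_t\left[ \min\left( r_t(\theta)\,\hat{A}_t,\ \operatorname{clip}\big(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\big)\,\hat{A}_t \right) \right],
\qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)}
```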
Reference:
Schulman J, Wolski F, Dhariwal P, et al. Proximal Policy Optimization Algorithms. arXiv:1707.06347, 2017.
51. Load Weights: 2018 Samples
2018_Sample_AIs.LoadTorchWeightAI gave me a complete example to use directly. After training and getting my weights, I just needed to save them as _.csv and put the file in my aiData; the network class then does the forward computation.
There were only two places I changed:
First, I changed the InputData to get the state I need.
Second, I changed the way the action is chosen: my outputs are random policies (a probability distribution over actions), so I use roulette selection to get the action (sketched below).
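Roulette (fitness-proportionate) selection over the policy's output probabilities is a few lines of Java; this is a generic sketch, not the submitted AI's exact code.

```java
import java.util.Random;

// Pick an action index with probability proportional to probs[i]
// (probs is the policy network's output, assumed to sum to 1).
static int rouletteSelect(double[] probs, Random rng) {
    double r = rng.nextDouble();
    double cumulative = 0.0;
    for (int i = 0; i < probs.length; i++) {
        cumulative += probs[i];
        if (r < cumulative) return i;
    }
    return probs.length - 1; // guard against floating-point round-off
}
```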
52. Acknowledgements:
I acknowledge the important role played by my teachers, Zhu Yuan Huan and Zhao Dong Bin, my senior brother-in-learning, Tang Zhen Tao, and my junior brother-in-learning, Liang Rong Qing. My teammates Yu Yang, Tong Ru, Chen Hao, and Yu Chang also helped a lot at the beginning.
54. Outline
• This AI was developed using three core AI technologies and some heuristics.
1) State Grouping Method for Monte Carlo Tree Search
Genetic State-Grouping Algorithm for Deep Reinforcement Learning
(Expert Systems with Applications, 2020)
2) Hybrid Method for Zen Character
Hybrid fighting game AI using a genetic algorithm and Monte Carlo tree search
(GECCO, 2018)
3) Opponent Modeling for LUD Character
Opponent modeling based on action table for MCTS-based fighting game AI
(CIG, 2017)
57. BAI WEN (MSAI Student)
School of Computer Science and Engineering
Nanyang Technological University
BUTCHER PUDGE: A Deep Reinforcement Learning Agent
58. Butcher_Pudge is implemented based on the deep reinforcement learning algorithm SAC (Soft Actor-Critic). It was mainly trained to fight progressively against the 2019 award winners' AIs (ReiwaThunder, RHEA_PI, Toothless, ...). This AI was trained directly on the delayed RAM data and did not utilize the simulator provided by the platform.
The Q-network and the policy network each contain 3 layers with 256 hidden units.
Reward tuning was also used during training to help it converge faster.
The final deployed model only needs numpy to run: no GPU needed, no deep learning library needed. For details on how to run the AI, please refer to the README.md.
1. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, Haarnoja et al., 2018
60. AIBot in
FightingICE
competition 2020
AI name: YIYAI
Team: YIYAI
Developer name:
Mr. Thavatchai Kruhpoong, thavatchai.kruh@bumail.net
Mr. Peera Chuenbunchom, peera.chue@bumail.net
Mr. Kittiphop Ritthijan, kittiphop.ritt@bumail.net
Mr. Jitti Sailektim, jitti.sail@bumail.net
Mr. Danuporn Poonsang, danuporn.poon@bumail.net
Supervisor: Dr. Kingkarn Sookhanaphibarn
Bangkok University
61. The AI switches between Aggressive and Defensive behavior across four distance ranges: Close, Close-medium, Medium, and Far (a code sketch follows the table). The slide's rule table, flattened here, includes:
• Random between Forward Jump and Forward Dash; emphasizes moving closer to the OPP because the distance is not too far.
• Energy >= 30: chance ¼ of a forward ranged attack; if not releasing an ultimate skill, random between Forward Jump and Forward Dash, because at this far distance normal skills cannot deal damage, only ultimate skills can.
• Energy >= 30: chance ¼ of an upper ranged attack, or random between Forward Jump and Forward Dash; emphasizes moving closer to the OPP for melee attacks. Releasing either skill up into the air can damage the OPP at this distance whether the OPP is on the ground or in the air.
• Energy >= 30: chance ¼ of an upper ranged attack, or random between Kick, Crouch Kick, and Back Jump; emphasizes keeping far from the OPP. Releasing any of the three skills damages the OPP at far distance better than the alternatives.
• Random between Crouch Kick, Forward Kick, Kick, and Throw, because they are fast and deal much damage.
• Random between Crouch Kick, Kick, Throw, Forward Jump, Back Step, and Forward Dash, because they are fast while keeping distance, with attacks at the OPP's back.
• Energy >= 300, Distance X <= 300, Distance Y <= 50: use the ultimate attack.
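A minimal Java sketch of this kind of distance/energy rule dispatch. The Energy >= 300 ultimate rule and the chance-¼ ranged attack come from the slide; the range boundaries, the aggressive/defensive split, and the action names are illustrative assumptions.

```java
import java.util.Random;

public class RuleDispatch {
    private static final Random RNG = new Random();

    static String decide(int energy, int distX, int distY, boolean aggressive) {
        // Highest-priority rule from the slide: the ultimate attack.
        if (energy >= 300 && distX <= 300 && distY <= 50) {
            return "ULTIMATE_ATTACK";
        }
        if (distX < 100) { // close range (boundary illustrative)
            return aggressive
                    ? pick("CROUCH_B", "STAND_FB", "STAND_B", "THROW_A")
                    : pick("CROUCH_B", "STAND_B", "THROW_A",
                           "FOR_JUMP", "BACK_STEP", "DASH");
        }
        if (energy >= 30 && RNG.nextInt(4) == 0) { // chance 1/4
            return "RANGED_ATTACK"; // forward or upper, per range
        }
        return pick("FOR_JUMP", "DASH"); // close the distance
    }

    static String pick(String... actions) {
        return actions[RNG.nextInt(actions.length)];
    }
}
```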
63. AIBot in
FightingICE
competition 2020
AI name: Jitwisut
Team: Jitwisut
Developer name:
Mr. Jitwisut Hongthong, jitwisut.hong@bumail.net
Supervisor: Dr. Kingkarn Sookhanaphibarn
Bangkok University
64. AI Outline
● Our AI modifies the evaluation function of MCTS to allow it to make combos on the OPP.
● The OPP's STUN is detected. STUN was emphasized because it was found useful in many previous works, so we also use it in the COMBO score.
● STUN = the OPP is down or stuck at a corner (check sketched below).
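A sketch of the stun check exactly as the slide defines it; the state string and the corner margin are illustrative assumptions.

```java
// STUN per the slide: the opponent is down, or stuck at a corner.
static boolean isStunned(String oppState, int oppX, int stageWidth) {
    boolean down = "DOWN".equals(oppState);
    int cornerMargin = 60; // illustrative distance from the wall
    boolean cornered = oppX <= cornerMargin || oppX >= stageWidth - cornerMargin;
    return down || cornered;
}
```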
66. AIBot in
FightingICE
competition 2020
AI name: MonkeyLink_TriplePM.jar
Team: MonkeyLink_TriplePM
Developer name:
Mr. Supakit U-sabai, supakit.usab@bumail.net
Mr. Thanawat Sappawattanakun, tanawat.sapp@bumail.net
Mr. Sakchai Suthat, sakchai.suth@bumail.net
Mr. Chaiyaboon Pladisai, chaiyaboon.plad@bumail.net
Supervisor: Dr. Kingkarn Sookhanaphibarn
Bangkok University
67. AI Outline
Our AI uses a rule-based state machine (sketched after this list).
Steps:
● Check the distance between our position and the opponent's.
● Set the current state.
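A minimal Java sketch of such a distance-driven state machine; the states, boundaries, and mapped actions are illustrative assumptions, not the submitted AI's rules.

```java
public class RuleStateMachine {
    enum FightState { CLOSE_COMBAT, MID_APPROACH, LONG_RANGE }

    // Step 1: derive the state from the distance to the opponent.
    static FightState currentState(int distX) {
        if (distX < 120) return FightState.CLOSE_COMBAT;
        if (distX < 350) return FightState.MID_APPROACH;
        return FightState.LONG_RANGE;
    }

    // Step 2: each state maps to a rule for the next action.
    static String actionFor(FightState s) {
        switch (s) {
            case CLOSE_COMBAT: return "STAND_B"; // attack
            case MID_APPROACH: return "DASH";    // close the gap
            default:           return "FOR_JUMP"; // approach from the air
        }
    }
}
```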
70. Outline
· Modified from the MctsAi sample
· Optimized simulation step:
◦ The opponent's action is based on data instead of random
◦ Use KNN to predict the opponent's next action from the data (see the sketch after this list)
◦ Data structure in the data set:
◦ oppActionData; myActionData; distanceXData; distanceYData; oppStateData; myStateData; mHP; oHP
◦ score = (myHP(t) − myHP(t+3s)) − (oppHP(t) − oppHP(t+3s)); a higher score means a better action
◦ Calculate the similarity between the frame data and the data set, find the most similar state, and take its best action
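A minimal Java sketch of the KNN lookup over such records; the feature weighting, the categorical-mismatch penalty, and k = 1 are illustrative assumptions (it also assumes a non-empty data set).

```java
import java.util.List;

public class KnnOpponentModel {
    // One stored observation, matching the slide's record layout.
    static class Record {
        int oppAction, myAction, distX, distY, oppState, myState;
        double score; // (myHP(t) - myHP(t+3s)) - (oppHP(t) - oppHP(t+3s))
    }

    // Squared distance between the live situation and a stored record;
    // the mismatch penalty (10,000) is an illustrative weight.
    static double distance(Record r, int distX, int distY,
                           int oppState, int myState) {
        double d = Math.pow(r.distX - distX, 2) + Math.pow(r.distY - distY, 2);
        if (r.oppState != oppState) d += 10_000;
        if (r.myState != myState) d += 10_000;
        return d;
    }

    // k = 1 nearest neighbour: predict the opponent's next action as the
    // action it took in the most similar stored situation.
    static int predictOppAction(List<Record> data, int distX, int distY,
                                int oppState, int myState) {
        Record best = null;
        double bestD = Double.MAX_VALUE;
        for (Record r : data) {
            double d = distance(r, distX, distY, oppState, myState);
            if (d < bestD) { bestD = d; best = r; }
        }
        return best.oppAction;
    }
}
```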
71. Future Work
· Increase the quality of the data set
· Improve the training method
· Use a data-driven method to optimize MCTS selection
· Use an FSM for special situations
73. Thank you and see you at
CoG 2021 in Copenhagen
http://www.ice.ci.ritsumei.ac.jp/~ftgaic/
Editor's Notes
Hi there, I am Ruck Thawonmas, director of team FightingICE that organizes this fighting game AI competition.
Here are the contents of this video.
FightingICE is a fighting game AI platform in Java and also wrapped for Python. The main aim of this platform is to advance research and development of general fighting game AIs.
FightingICE’s main features are shown here. These features make it challenging for not only deep reinforcement learning but also Monte-Carlo tree search and the likes, typically used in development of game AI these days.
This is for your reference. Four of them here are from other groups.
This is also for your reference, which shows that this platform is a good candidate for use in education.
Next is about the contest.
If you submit your AI to this competition, it must fight in the six different environments shown here. Among these three characters, the character data, such as the damage amount of each action, of GARNET and LUD are not known. In the past three competitions, only the LUD data were unknown; before that, all were known.
This year we received 14 entries from the five countries shown here. Deep learning, MCTS, rules, and evolutionary algorithms are used in 6, 4, 3, and 2 AIs, respectively. Note that a couple of AIs use a combination of these techniques.
Results!
Orange, blue, and green show the 1st, 2nd, and 3rd place for each fighting environment.
Without further ado, the winner goes to ......
ERHEA_PI by Zhentao Tang and his teammates. Congratulations! This AI uses …
For more details, see the arXiv paper on slide 5. The runner-up is ……. Its description is shown here. 3rd place is …, who is a newcomer to this competition. It is the first AI developed with the OpenAI Gym interface for our competition.
Congratulations again to these winners. Please note that the source code of all submitted entries and their description slides will be made available on our website soon.
The source code of all the submitted entries is available on our website. We look forward to your participation in our next competition.