1. Unity ML-Agents is a platform for training intelligent agents to make good decisions in complex 3D simulated environments built with Unity.
2. It allows training agents using reinforcement learning algorithms like PPO through Python and then embedding the trained models back into Unity for inference.
3. The workflow involves creating a Unity environment with agents, brains, and rewards; training agents using the Python API; and embedding the trained model back into Unity for deployment. This enables training agents in photo-realistic 3D simulations.
3. About Unity
Leading global game industry platform
• 16 Billion Downloads in 2016 – up 31%
• 2.6 Billion Unique Devices
• 700 Million Gamers
• 38% of top 1000 free mobile games
18. Agents
• Agents are GameObjects within the Unity scene.
• They perceive the environment via observations, take actions, and optionally receive rewards.
• Each agent is linked to a brain, which makes decisions for the agent.
19. Brains
• Player – Actions are decided by user input through a keyboard or gamepad.
• Heuristic – Actions are decided by a C# script using state input.
• External – Actions are decided using TensorFlow via the Python interface.
• Internal – Actions are decided using a TensorFlow model embedded into the project.
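One way to picture the four brain modes is as a dispatch on where the action comes from. The sketch below is a plain-Python illustration, not the ML-Agents API: `BrainType`, `decide`, and the three decision functions are hypothetical names invented for this example.

```python
from enum import Enum


class BrainType(Enum):
    PLAYER = "player"        # action comes from keyboard or gamepad input
    HEURISTIC = "heuristic"  # action comes from a hand-written rule
    EXTERNAL = "external"    # action comes from a policy running in Python
    INTERNAL = "internal"    # action comes from a model embedded in the project


def decide(brain_type, state, sources):
    """Return an action for `state` from the source registered for this mode."""
    return sources[brain_type](state)


# Hypothetical decision sources, each mapping a state to a scalar action.
def player_input(state):
    return 0.0  # placeholder for a value read from an input device


def heuristic(state):
    return -1.0 if state[0] > 0 else 1.0  # push back toward zero


def policy(state):
    return -0.5 * state[0]  # placeholder for a trained model's output


sources = {
    BrainType.PLAYER: player_input,
    BrainType.HEURISTIC: heuristic,
    BrainType.EXTERNAL: policy,
    BrainType.INTERNAL: policy,
}
action = decide(BrainType.HEURISTIC, [0.3], sources)
```

In this sketch External and Internal share the same policy function, mirroring the idea that the same trained model is used in both modes and only its location differs.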
22. Train Agents (Python)
• Launch an environment from Python with
env = UnityEnvironment("my_environment")
• Interact with a gym-like interface:
env.reset()
env.step()
env.close()
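The reset/step/close pattern can be sketched as an episode loop. Because a real Unity build is needed to run `UnityEnvironment`, the example uses `ToyEnvironment`, a hypothetical stand-in written for this sketch. Its `(observation, reward, done)` return value is an assumption borrowed from gym-style interfaces and may not match what the ML-Agents Python API returns.

```python
class ToyEnvironment:
    """Hypothetical stand-in for a Unity environment: walk a point to 0."""

    def __init__(self, start=5, max_steps=20):
        self.start = start
        self.max_steps = max_steps
        self.position = start
        self.steps = 0
        self.closed = False

    def reset(self):
        # Begin a new episode and return the first observation.
        self.position = self.start
        self.steps = 0
        return self.position

    def step(self, action):
        # action is -1 or +1; small penalty per step, +1 on reaching 0.
        self.position += action
        self.steps += 1
        done = self.position == 0 or self.steps >= self.max_steps
        reward = 1.0 if self.position == 0 else -0.01
        return self.position, reward, done

    def close(self):
        self.closed = True


def run_episode(env, policy):
    """Run one episode and return the summed reward."""
    observation = env.reset()
    total_reward, done = 0.0, False
    while not done:
        observation, reward, done = env.step(policy(observation))
        total_reward += reward
    return total_reward


env = ToyEnvironment()
try:
    episode_return = run_episode(env, lambda obs: -1 if obs > 0 else 1)
finally:
    env.close()  # always release the environment, even after an error
```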
23. Train Agents (Python)
• Proximal Policy Optimization (PPO) algorithm included by default.
• Works with Continuous and Discrete Control, plus image and/or vector inputs.
• Supports multiple agents per brain within a scene (and multiple brains coming in v0.3!)
• Monitor progress with TensorBoard.
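For readers unfamiliar with PPO, its core is the clipped surrogate objective, which limits how far one update can move the policy from the one that collected the data. The following is a minimal plain-Python sketch of that objective, not the implementation shipped with ML-Agents; the function name and the `clip_eps=0.2` default are assumptions chosen for illustration.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch of samples."""
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the two estimates.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)


# The ratio is 2.0 here, so with a positive advantage the gain is capped at 1.2.
capped = ppo_clipped_objective([math.log(2.0)], [0.0], [1.0])
```

Training maximizes this quantity; with a negative advantage the unclipped term is kept whenever it is lower, so harmful moves are not hidden by the clip.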
25. Embed Agents (Unity)
• Once a model is trained, it can be exported into the Unity project.
• Simply drop the .bytes file into the Unity project, and use it in the corresponding Brain with “Internal” mode.
• Support for Mac, Windows, Linux, iOS, and Android.
26. Agents
• Learning environments are populated with agents, which are the actors within the scene.
• Each actor contains functions that define its state variables, observation cameras, action effects, and reward function.
• Agents are linked to game objects within a scene.
27. Brains
• Each brain corresponds to a policy for deciding actions within a learning environment.
• Many agents can be linked to a single brain.
• Brains define the state and action space for their linked agents.
• Multiple brains can exist within a single environment.
• Four different types of brain.
28. Academy
• Only one per scene/environment.
• Responsibilities:
• Configures engine for Training or Inference
• Coordinates all Agents and Brains
• Contains a set of custom “reset parameters” which can be used for Curriculum Learning.
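As a rough illustration of how reset parameters could drive Curriculum Learning, the sketch below advances through lessons as the mean reward improves and looks up the parameter values for the current lesson. Everything here is hypothetical: the function names, the `wall_height` parameter, and the thresholds are invented for the example and are not the ML-Agents curriculum format.

```python
def next_lesson(lesson, mean_reward, thresholds):
    """Advance one lesson once mean reward reaches the current threshold."""
    if lesson < len(thresholds) and mean_reward >= thresholds[lesson]:
        return lesson + 1
    return lesson


def reset_parameters_for(lesson, schedule):
    """Pick each reset parameter's value for this lesson (last value repeats)."""
    return {
        name: values[min(lesson, len(values) - 1)]
        for name, values in schedule.items()
    }


# Hypothetical curriculum: raise the wall as the agent gets better.
thresholds = [0.5, 0.8]
schedule = {"wall_height": [1.0, 2.0, 4.0]}

lesson = 0
lesson = next_lesson(lesson, mean_reward=0.6, thresholds=thresholds)
params = reset_parameters_for(lesson, schedule)
```

The resulting dictionary is what would be handed to the environment on its next reset, so the task gets harder only after the agent has mastered the easier version.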
38. Multi-Stage Soccer Training
• Defense – Train one brain with a negative reward for the ball entering their goal.
• Offense – Train one brain with a positive reward for the ball entering the opponent's goal.
• Combined – Train both brains together to play against an opponent team.
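The three stages can be sketched as a reward function that only pays out to the brains being trained in the current stage. This is an illustrative assumption rather than the project's actual reward code: `stage_rewards`, the stage names as strings, and the reward magnitudes of 1.0 are all hypothetical.

```python
def stage_rewards(stage, scored, conceded):
    """Per-brain rewards for one step in the given training stage.

    scored:   the ball entered the opponent's goal on this step
    conceded: the ball entered the team's own goal on this step
    """
    rewards = {}
    if stage in ("defense", "combined"):
        # Defensive brain is penalized when its own goal is breached.
        rewards["defense"] = -1.0 if conceded else 0.0
    if stage in ("offense", "combined"):
        # Offensive brain is rewarded when it scores.
        rewards["offense"] = 1.0 if scored else 0.0
    return rewards


defense_only = stage_rewards("defense", scored=True, conceded=True)
both = stage_rewards("combined", scored=True, conceded=False)
```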