2. Deep Reinforcement Learning (DRL) Fundamentals
Deep Reinforcement Learning is an effective way to train robots to adapt to the real world, as it mitigates
the sample inefficiency and the high cost of collecting training data.
It provides a potentially infinite source of data, because the agent generates its own experience as it
explores the environment and exploits the knowledge gained from that exploration.
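The explore/exploit trade-off mentioned above can be illustrated with an epsilon-greedy action rule. This is a minimal sketch, not something the slides prescribe: the rule itself, the function name `epsilon_greedy`, and the value estimates in `q_values` are all assumptions chosen for illustration.

```python
import random

def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the best-known one."""
    if rng.random() < epsilon:
        # Explore: try any action, regardless of its current estimate.
        return rng.randrange(len(q_values))
    # Exploit: choose the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])

rng = random.Random(0)
q_values = [0.1, 0.5, 0.2]  # hypothetical value estimates for three actions

# With epsilon = 0 the agent only exploits and always picks action 1.
greedy_action = epsilon_greedy(q_values, epsilon=0.0, rng=rng)

# With epsilon = 0.3 the agent mostly exploits but still samples other actions.
actions = [epsilon_greedy(q_values, epsilon=0.3, rng=rng) for _ in range(1000)]
```

A larger `epsilon` yields more exploration and therefore more varied data, at the price of acting on the current best estimate less often.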
3. Sim-to-Real Transfer
• Transferring the policies a robot learns during the training phase in
simulation to the real-world environment.
• A marked degradation in performance is observed when transitioning
from the simulated environment to the real world.
• Learning via exploration in DRL is cost-effective, but the differences
between simulation and real-world scenarios make the learned policies
harder to transfer.
5. Methods for Sim-to-Real Transfer
• Zero-Shot Transfer
An extreme case of domain adaptation in which the agent is exposed to unseen test samples that were not
available during the training phase. The agent is expected to predict classes using a meta-representation of
the classes.
• System Identification
Represent the physical system with a mathematical model and calibrate the simulator precisely.
• Domain Randomization
Randomize the simulated environment so that the training data distribution generalizes to the one found in
the real world. Two forms: visual randomization and dynamics randomization.
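The domain randomization idea above can be sketched as sampling fresh simulator parameters for every training episode. This is an illustrative sketch only: the parameter names (`friction`, `mass_kg`, `light_intensity`), their ranges, and the helper `sample_sim_params` are hypothetical and not tied to any particular simulator's API.

```python
import random
from dataclasses import dataclass

@dataclass
class SimParams:
    friction: float         # dynamics randomization
    mass_kg: float          # dynamics randomization
    light_intensity: float  # visual randomization

# Hypothetical ranges; in practice they are chosen to cover plausible real-world values.
RANGES = {
    "friction": (0.5, 1.5),
    "mass_kg": (0.8, 1.2),
    "light_intensity": (0.3, 1.0),
}

def sample_sim_params(rng):
    """Draw one randomized simulator configuration, uniformly within each range."""
    return SimParams(**{name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()})

rng = random.Random(42)
# One new configuration per episode, so the policy never trains on a single fixed world.
episodes = [sample_sim_params(rng) for _ in range(5)]
```

Each element of `episodes` would be applied to the simulator before the corresponding rollout. Which parameters to randomize, and over what ranges, remains a design decision for the specific task.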
6. Methods for Sim-to-Real Transfer
• Domain Adaptation Methods
To transfer knowledge from a source domain to a target domain that has limited data, unify the source and
target feature spaces.
• Learning with Disturbances
Introduce perturbations in the simulation so that the learned policy is less sensitive to mismatches between
the simulation and the real-world environment.
• Simulation Environments
Use carefully calibrated simulation environments to add realism, e.g. Gazebo, Unity3D, PyBullet, or
MuJoCo.
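The "learning with disturbances" idea above can be sketched as injecting noise into the actions (or observations) used during simulated training. This is a minimal sketch under stated assumptions: zero-mean Gaussian noise, the helper name `perturb`, and the example values are illustrative choices, not a method the slides specify.

```python
import random

def perturb(values, noise_std, rng):
    """Return a copy of values with zero-mean Gaussian noise added to each entry."""
    return [v + rng.gauss(0.0, noise_std) for v in values]

rng = random.Random(0)
action = [0.2, -0.1]  # hypothetical commanded action from the policy

# The disturbed action is what the simulator would actually execute during training.
noisy_action = perturb(action, noise_std=0.05, rng=rng)

# With zero noise the action passes through unchanged.
clean_action = perturb(action, noise_std=0.0, rng=rng)
```

The same helper could be applied to observations before they reach the policy. The noise scale `noise_std` is a tuning choice that would normally reflect how large the expected sim-to-real mismatch is.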
8. Challenges
• Domain Randomization: It is hard to determine which randomizations to apply
and how they affect what is learned in simulation.
• Domain Adaptation: The feature spaces of the source and target domains may not
be easily unified.
9. Conclusion
• Simulation environments need more realism for sim-to-real transfer of
knowledge to succeed.
• Domain randomization and domain adaptation are the most commonly used
methods.
• Policy distillation can be used for multi-task learning, and meta-learning for
handling a variety of tasks.
• The field offers opportunities for future research on knowledge
transfer.