DL Reading Group LT: A Reading of Embodied Question Answering and World Models

Published on

2018/03/30 DL Reading Group LT
A reading of “Embodied Question Answering” and “World Models”

Published in: Engineering

  1. “Embodied Question Answering” and “World Models”, 2018.03.30, Tatsuya Matsushima (@__tmats__)
  2. [Japanese version of the next slide] Abhishek Das, Samyak Datta, Georgia Gkioxari, Stefan Lee, Devi Parikh, Dhruv Batra (Facebook Research), “Embodied Question Answering” (arXiv, 2017), https://arxiv.org/abs/1711.11543. Keywords: 3D QA, Embodied Question Answering (EmbodiedQA), RL, navigation, QA, active perception, grounding. GitHub: https://github.com/facebookresearch/house3d
  3. Abhishek Das, Samyak Datta, Georgia Gkioxari, Stefan Lee, Devi Parikh, Dhruv Batra (Facebook Research), “Embodied Question Answering” (arXiv, 2017), https://arxiv.org/abs/1711.11543
     Overview
     - The paper proposes the Embodied Question Answering (EmbodiedQA) task.
     - The simulator is available on GitHub: https://github.com/facebookresearch/house3d
     Key Points of the Proposed Method
     - Differences from existing QA tasks: 1) the state is presented as a first-person view; 2) the agent must act in the environment in order to answer correctly.
     - In the experiments, the authors use hierarchical RL consisting of a planner and a controller.
     - The navigation and QA modules are trained separately and then joined.
     Main Insights
     - Design concept of the task: “Long term objective is to make intelligent agents that can perceive, communicate and act”
     - The task requires active perception.
     - The task requires inference with “common sense”, e.g. when asked about a car, the agent should try going to the garage.
     - The task requires grounding symbols in the real world.
  4. [Japanese version of the next slide] David Ha, Jürgen Schmidhuber, “World Models” (arXiv, 2018), https://arxiv.org/abs/1803.10122. Keywords: VAE, RNN, hallucinated dream, latent representation z, RNN hidden state h, RL credit assignment, CarRacing-v0.
  5. David Ha, Jürgen Schmidhuber, “World Models” (arXiv, 2018), https://arxiv.org/abs/1803.10122
     Overview
     - The paper proposes learning the dynamics of the environment and the control of the agent separately in an RL setting.
     - The environment dynamics are modeled with a VAE and an RNN with a Gaussian mixture output.
     - This allows the controller to be simpler (fewer parameters).
     - Because the agent learns a model of the environment, it can learn a policy without interacting with the real environment (inside a “hallucinated dream”) and then transfer that policy to the actual environment.
     Key Points of the Proposed Method
     - The agent is split into a “World Model” built around an RNN and a controller with a small number of parameters, which keeps the controller simple.
     - Dimensionality reduction with a VAE.
     - Prediction of the latent representation z with a Gaussian mixture RNN.
     - A simple linear controller.
     Difference from Previous Work
     - Large RNNs have high capacity, but in an RL setting the credit assignment problem makes them hard to train, so existing methods tended to use smaller RNNs.
     - In the proposed method the agent is divided into an environment model and a controller, so a large RNN can be used for the environment model.
     Main Insights
     - First method to achieve the required score on the CarRacing-v0 task.
     - A task can be solved by training only inside the learned environment model.
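
The division of labor described on the World Models slide (a VAE that compresses each frame to z, an RNN that carries a hidden state h, and a small linear controller) can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the sizes (32-dimensional z, 256-dimensional h, 3 action dimensions), the choice to feed the controller the concatenation of z and h, and the functions `encode` and `rnn_step`, which are hypothetical stand-ins for trained networks rather than the authors' code.

```python
import math
import random

# Assumed sizes for illustration; the slides do not state them.
Z_DIM, H_DIM, A_DIM = 32, 256, 3


def encode(frame, rng):
    """Hypothetical stand-in for the VAE encoder: frame -> latent vector z."""
    return [rng.gauss(0.0, 1.0) for _ in range(Z_DIM)]


def rnn_step(h, z, action):
    """Hypothetical stand-in for the RNN: fold (z, action) into hidden state h."""
    inputs = z + action  # list concatenation
    return [
        math.tanh(0.9 * h_i + 0.1 * inputs[i % len(inputs)])
        for i, h_i in enumerate(h)
    ]


def controller(W, b, z, h):
    """Linear controller: action = W [z; h] + b."""
    x = z + h  # concatenate latent vector and hidden state
    return [sum(w * x_j for w, x_j in zip(row, x)) + b_i for row, b_i in zip(W, b)]


def num_controller_params():
    """Weights plus biases of the linear controller."""
    return (Z_DIM + H_DIM) * A_DIM + A_DIM


def rollout(steps=5, seed=0):
    """Run the encode -> act -> update-memory loop with random parameters."""
    rng = random.Random(seed)
    W = [[rng.gauss(0.0, 0.1) for _ in range(Z_DIM + H_DIM)] for _ in range(A_DIM)]
    b = [0.0] * A_DIM
    h = [0.0] * H_DIM
    actions = []
    for _ in range(steps):
        z = encode(None, rng)       # a real agent would pass the current frame
        a = controller(W, b, z, h)  # only this part is trained as the policy
        h = rnn_step(h, z, a)       # the world model updates its memory
        actions.append(a)
    return actions
```

Under these assumed sizes the controller has (32 + 256) × 3 + 3 = 867 parameters, which illustrates why splitting off a large world model lets the policy itself stay very small.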
