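31. Summary and Future Outlook
• Introducing 3D maps into classical visual localization
• InLoc [Taira et al., 2018]: uses the 3D map progressively across three steps
• Achieves robust camera position and orientation estimation through techniques such as virtual view synthesis from the 3D map
• Using 3D maps when training deep learning models
• End-to-end black-box formulation: [Kendall et al., 2015]
• Improved accuracy by exploiting additional information [Brahmbhatt et al., 2018]
• Use as an initial pose estimate?
• Single-step CNN model design: achieves high-accuracy estimation under a compact problem formulation
• Combination with classical pose estimation methods [Brachmann et al., 2017]
• Joint estimation and consistency evaluation of local 3D maps and pose [Ummenhofer et al., 2017]
• Generalization to unseen scenes, scaling to large scenes, improved robustness, etc.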
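The end-to-end pose regression approach listed above [Kendall et al., 2015] trains a CNN to output a camera position and an orientation quaternion directly from an image. As a rough illustration only, the following sketch computes a loss in that style: position error plus a weighted orientation error. The function name `posenet_style_loss` and the default weight `beta` are hypothetical choices for this example, not values taken from the slides.

```python
import math


def posenet_style_loss(t_pred, q_pred, t_true, q_true, beta=500.0):
    """Position error plus beta-weighted orientation error.

    t_pred, t_true: 3D camera positions (sequences of 3 floats).
    q_pred, q_true: orientation quaternions (sequences of 4 floats);
                    q_true is assumed to be unit length already.
    beta: hypothetical weight balancing the two terms (an assumption).
    """
    # Normalize the predicted quaternion so it represents a valid rotation.
    norm = math.sqrt(sum(c * c for c in q_pred))
    q_unit = [c / norm for c in q_pred]

    # Euclidean distance between predicted and true positions.
    pos_err = math.dist(t_pred, t_true)
    # Euclidean distance between unit quaternions. The sign ambiguity
    # (q and -q describe the same rotation) is not handled in this sketch.
    ori_err = math.dist(q_unit, q_true)

    return pos_err + beta * ori_err
```

A call such as `posenet_style_loss([0, 0, 0], [2, 0, 0, 0], [3, 4, 0], [1, 0, 0, 0], beta=10.0)` returns only the position term, because the predicted quaternion normalizes to the ground-truth one.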
32. References
[1] Taira, Hajime, et al. "InLoc: Indoor visual localization with dense matching and view synthesis." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
[2] 田平創, 荻野凌, 岩田健太郎, Torsten Sattler, Josef Sivic, Tomas Pajdla, 鳥居秋彦, 奥富正敏. "Creating an evaluation dataset toward practical large-scale visual localization" (in Japanese). 24th Symposium on Sensing via Image Information, 2018.
[3] 田平創, Torsten Sattler, Josef Sivic, Tomas Pajdla, 鳥居秋彦, 奥富正敏. "Self-localization using 3D maps in large-scale indoor environments" (in Japanese). 25th Symposium on Sensing via Image Information, 2019.
[4] Kendall, Alex, Matthew Grimes, and Roberto Cipolla. "PoseNet: A convolutional network for real-time 6-DOF camera relocalization." Proceedings of the IEEE International Conference on Computer Vision. 2015.
[5] Kendall, Alex, and Roberto Cipolla. "Geometric loss functions for camera pose regression with deep learning." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.
[6] Brachmann, Eric, et al. "DSAC - Differentiable RANSAC for camera localization." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.
[7] Brachmann, Eric, and Carsten Rother. "Learning less is more - 6D camera localization via 3D surface regression." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
[8] Brahmbhatt, Samarth, et al. "Geometry-aware learning of maps for camera localization." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
[9] Ummenhofer, Benjamin, et al. "DeMoN: Depth and motion network for learning monocular stereo." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.