The detailed results are described at GitHub (in English):
https://github.com/jkatsuta/exp-18-1q
(exp1 to exp6 under maddpg/experiments/my_notes/)
Seminar materials for Rikkyo University (part 1).
Part 2 of the materials:
https://www.slideshare.net/JunichiroKatsuta/ss-108099542
Blog (with video):
https://recruit.gmo.jp/engineer/jisedai/blog/multi-agent-reinforcement-learning/
This document discusses self-supervised representation learning (SRL) for reinforcement learning tasks. SRL learns state representations by using prediction tasks as an auxiliary objective. The key ideas are: (1) SRL learns an encoder that maps observations to states using a prediction task like modeling future states or actions; (2) The learned state representations improve generalization and exploration in reinforcement learning algorithms; (3) Several SRL methods are discussed, including world models, inverse models, and causal infoGANs.
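Deep Reinforcement Learning from Scratch (NLP2018 lecture material) / Introduction of Deep Reinforcement Learning, Preferred Networks
An introduction to deep reinforcement learning, presented at a domestic (Japanese) NLP conference.
Lecture material from the 24th Annual Meeting of the Association for Natural Language Processing (NLP2018).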
http://www.anlp.jp/nlp2018/#tutorial
This document provides an overview of POMDP (Partially Observable Markov Decision Process) and its applications. It first defines the key concepts of POMDP such as states, actions, observations, and belief states. It then uses the classic Tiger problem as an example to illustrate these concepts. The document discusses different approaches to solve POMDP problems, including model-based methods that learn the environment model from data and model-free reinforcement learning methods. Finally, it provides examples of applying POMDP to games like ViZDoom and robot navigation problems.
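As a rough illustration of the belief state in the Tiger problem, the following minimal sketch applies Bayes' rule after each "listen" action. The listening accuracy of 0.85 and the function name `update_belief` are assumptions made for this example; the description above does not give them.

```python
def update_belief(belief_left, heard_left, accuracy=0.85):
    """Bayes update of P(tiger is behind the left door) after one 'listen' action.

    accuracy is the assumed probability that the growl is heard on the correct side.
    """
    # Likelihood of the observation under each hidden state.
    like_if_left = accuracy if heard_left else 1.0 - accuracy
    like_if_right = 1.0 - accuracy if heard_left else accuracy
    numerator = like_if_left * belief_left
    evidence = numerator + like_if_right * (1.0 - belief_left)
    return numerator / evidence


# Start from a uniform belief, then hear the tiger on the left twice and on the right once.
belief = 0.5
history = [belief]
for heard_left in (True, True, False):
    belief = update_belief(belief, heard_left)
    history.append(belief)

print(history)
```

Two matching observations sharpen the belief, and one contradicting observation cancels one of them, which is why the agent never observes the state directly but can still act on an increasingly confident belief.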
The document summarizes recent research related to "theory of mind" in multi-agent reinforcement learning. It discusses three papers that propose methods for agents to infer the intentions of other agents by applying concepts from theory of mind:
1. The papers propose that in multi-agent reinforcement learning, being able to understand the intentions of other agents could help with cooperation and increase success rates.
2. The methods aim to estimate the intentions of other agents by modeling their beliefs and private information, using ideas from theory of mind in cognitive science. This involves inferring information about other agents that is not directly observable.
3. Bayesian inference is often used to reason about the beliefs, goals and private information of other agents based
This document summarizes recent research on applying self-attention mechanisms from Transformers to domains other than language, such as computer vision. It discusses models that use self-attention for images, including ViT, DeiT, and T2T, which split an image into patches and apply a Transformer to the resulting patch sequence. It also covers more general attention modules such as the Perceiver, which aims to be domain-agnostic. Finally, it discusses work on transferring pretrained language Transformers to other modalities with frozen weights, showing they can function as universal computation engines.
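The patch-based input used by ViT-style models can be sketched as follows. This is only the patch-splitting step, written in plain Python for a single-channel image; the function name `split_into_patches` is hypothetical, and real models additionally project each flattened patch to an embedding and add position information before self-attention.

```python
def split_into_patches(image, patch_size):
    """Split a 2D image (list of rows) into flattened, non-overlapping patches.

    Patches are returned in row-major order, one flat list of pixels per patch.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be multiples of patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# A 4x4 toy image whose pixel values are 0..15, split into four 2x2 patches.
image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = split_into_patches(image, 2)
print(patches)
```

Each flattened patch then plays the role that a token plays in a language Transformer.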
[DL Reading Group] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion, Deep Learning JP
This document discusses a paper on visuomotor policy learning via action diffusion. The paper represents a robot's visuomotor policy as a conditional denoising diffusion process over actions: starting from random noise, the policy iteratively denoises an action sequence conditioned on visual observations. This formulation lets the policy capture multimodal action distributions. The policy is learned from demonstrations (behavior cloning) rather than through reinforcement learning, and is evaluated on complex manipulation tasks.
The detailed results are described at GitHub (in English):
https://github.com/jkatsuta/exp-18-1q
(exp7 to exp11 under maddpg/experiments/my_notes/)
Seminar materials for Rikkyo University (part 2).
Part 1 of the materials:
https://www.slideshare.net/JunichiroKatsuta/ss-108099238
Blog (with video): https://recruit.gmo.jp/engineer/jisedai/blog/multi-agent-reinforcement-learning2/
These slides compare distributed GPU processing across several deep learning frameworks, and were presented at TensorFlow User Group #4.
The meetup was held in Tokyo on 2017/04/19.
https://tfug-tokyo.connpass.com/event/54396/
[54th Programming Symposium, presentation material 7-2]
Many newer software development methods, such as agile and iterative development, require closer communication among developers than traditional ones. Today, however, many IT engineers are not good at communicating with others, and some refuse to communicate at all, so communication skills in the software industry have not necessarily improved. We are therefore designing and trialing a software engineer education curriculum for newcomers that emphasizes communication skills, specifically consensus-building skills. We adopt a case-centered method in which both the software development process and the importance of consensus can be understood through hands-on experience. For evaluation, we applied the method to six fourth-year computer science undergraduates, split into two groups of three, who carried out development based on a case. The results confirmed that the consensus-building workshop in our curriculum was effective in achieving closer communication during development.