Particle Filter on Episode
for Learning Decision Making Rule
Ryuichi Ueda Chiba Inst. of Technology
Kotaro Mizuta RIKEN BSI
Hiroshi Yamakawa DOWANGO
Hiroyuki Okada Tamagawa Univ.
navigation problems in the real world
• Not only robots, but also animals solve them.
• Mammals have specialized cells for spatial recognition in
their brains.
– especially around the hippocampus
– e.g. place cells
• Each place cell responds differently at each
place in the environment.
• -> existence of maps in the brain
July 6th, 2016 IAS-14 Shanghai 2
Figure: place cells [O'Keefe71] (http://en.wikipedia.org/wiki/Place_cell)
map vs. memory
• Mammals have maps in their brains.
• Maps of environments are of concern also in robotics.
– SLAM has been one of the most important topics.
– studies introducing the function of the hippocampus
• RatSLAM [Milford08]
• How about memory?
– Memory is also handled in the hippocampus.
– A sequence of memories is reduced to a map (or a state space model).
– A robot can keep its memory for a long time if it has TB-level
storage. (a difference between mammals and robots)
the purpose
• our intuition
– If memory is the source of maps, a robot should be able to decide
its actions not from a map but directly from memory.
– Knowledge of how memory is handled in the hippocampus and
its surroundings should help this attempt.
• to implement a learning algorithm that directly utilizes
memory
– particle filter on episode (PFoE)
– validation with an actual robot
related works
• Episode-based reinforcement learning [Unemi 1999]
– Its basic idea is identical to that of PFoE.
– PFoE simplifies implementation
and enables real-time calculation.
• RatSLAM [Milford08]
– an algorithm for robotics utilizing the knowledge around the
hippocampus
outline of PFoE
• While repeating a task in order to learn it, a robot stores events.
– an event = the set of sensor readings, actions, and rewards (given by
someone) obtained at one discrete time step
– the episode: the sequence of all stored events
• The degree of recall of each event is represented as a
probability.
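
A minimal sketch of how events and the episode could be stored, assuming discretized sensor readings and string action labels. The names `Event`, `Episode`, and their fields are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Event:
    """One discrete time step: what was sensed, done, and rewarded."""
    sensors: Tuple[int, ...]  # assumption: discretized sensor readings
    action: str               # assumption: actions are plain labels
    reward: float             # reward given at this time step


@dataclass
class Episode:
    """The sequence of all events recorded so far, in time order."""
    events: List[Event] = field(default_factory=list)

    def append(self, event: Event) -> int:
        """Store a new event and return its index on the time axis."""
        self.events.append(event)
        return len(self.events) - 1


episode = Episode()
episode.append(Event(sensors=(1, 0, 0, 1), action="forward", reward=0.0))
last = episode.append(Event(sensors=(0, 1, 1, 0), action="right", reward=1.0))
```

The index returned by `append` is the event's position on the time axis, which is the quantity a belief over the episode would refer to.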
Figure: the episode on a time axis from the past to the present time, with a state s and an action a at each step, rewards of 1 and -1, and the belief drawn over the episode
decision with the belief and the episode
• An action is chosen by calculation of expectation values.
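
One way such an expectation could be computed is sketched below. It assumes a one-step look-ahead: for each recalled time step t, the action and reward recorded at t + 1 are treated as what followed. The slides do not give the formula, so the function `choose_action`, the look-ahead rule, and the value 0.0 for untried actions are all assumptions of this sketch.

```python
from typing import Dict, List, Tuple


def choose_action(belief: Dict[int, float],
                  actions: List[str],
                  rewards: List[float],
                  candidates: Tuple[str, ...]) -> str:
    """Pick the candidate with the highest belief-weighted expected reward.

    belief maps a past time index t to the probability that the present
    resembles time t; actions[t + 1] and rewards[t + 1] are what came next.
    """
    totals = {a: 0.0 for a in candidates}
    mass = {a: 0.0 for a in candidates}
    for t, p in belief.items():
        nxt = t + 1
        if nxt >= len(actions):
            continue  # nothing has been recorded after this event yet
        a = actions[nxt]
        if a in totals:
            totals[a] += p * rewards[nxt]
            mass[a] += p
    # Assumption: an action never tried from the recalled events is worth 0.0.
    expected = {a: (totals[a] / mass[a] if mass[a] > 0 else 0.0)
                for a in candidates}
    return max(candidates, key=lambda a: expected[a])


actions = ["start", "right", "start", "left", "start"]
rewards = [0.0, 1.0, 0.0, -1.0, 0.0]
belief = {0: 0.7, 2: 0.3}  # the present mostly resembles time step 0
best = choose_action(belief, actions, rewards, ("right", "left"))
```

In this toy episode turning right was followed by +1 and turning left by -1, so `best` is `"right"`. If the belief were concentrated on time step 2 instead, the sketch would still avoid `"left"`, because its expected reward is negative.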
Figure: the same time-axis diagram of states, actions, rewards (1 and -1), and the belief, with the action at the present time marked "?"
– When the robot recalls the events before the +1 reward, it may obtain
+1 again if it chooses the same action as at that time.
– When the robot recalls the events before the -1 reward, it should
change its action to avoid the -1 reward.
representation with particles
• The belief is represented with particles.
– The cost is O(N) for N particles, even if the episode has infinite length.
• variables of a particle
– its position on the time axis
– its weight
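
A particle with these two variables can be sketched as follows. The uniform initial placement and the helper names `init_particles` and `belief_at` are assumptions for illustration, not taken from the slides.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Particle:
    time_index: int  # position on the time axis (index into the episode)
    weight: float    # share of the belief carried by this particle


def init_particles(n: int, episode_length: int,
                   rng: random.Random) -> List[Particle]:
    """Scatter n equally weighted particles uniformly over recorded events."""
    return [Particle(rng.randrange(episode_length), 1.0 / n) for _ in range(n)]


def belief_at(particles: List[Particle], t: int) -> float:
    """Probability mass that the belief assigns to time step t."""
    return sum(p.weight for p in particles if p.time_index == t)


rng = random.Random(0)
particles = init_particles(1000, episode_length=8, rng=rng)
total = sum(belief_at(particles, t) for t in range(8))
```

Storage and update cost depend only on the number of particles, not on the length of the episode, which is the point of the O(N) remark.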
Figure: the belief drawn as particles placed along the time axis up to the present time
operation of PFoE – motion update
• When the current time goes to the next time step,
particles simply shift to their next time steps.
– The episode is extended by an additional event.
– Positions of particles are shifted.
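
The shift can be sketched in a few lines. Particles are written here as `(time index, weight)` tuples, which is an assumed representation. This sketch also drops any particle that would land on the newest event, since that event is the present rather than a memory; how the authors treat that case is not stated in the slides.

```python
from typing import List, Tuple

Particle = Tuple[int, float]  # (time index, weight), an assumed representation


def motion_update(particles: List[Particle],
                  episode_length: int) -> List[Particle]:
    """Shift every particle one step forward on the time axis.

    episode_length is the length after the new event has been appended.
    """
    present = episode_length - 1
    shifted = [(t + 1, w) for t, w in particles]
    # Assumption: a particle cannot sit on the present event itself.
    return [(t, w) for t, w in shifted if t < present]


particles = [(0, 0.5), (2, 0.25), (3, 0.25)]
moved = motion_update(particles, episode_length=5)
```

With five events recorded, the present is index 4, so the particle that started at index 3 is dropped and the other two move to indices 1 and 3.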
Figure: the belief on the time axis before an action and after the action; the new event is added to the episode and the particles shift by one step
operation of PFoE – sensor update
• The event at each particle's position is compared
with the latest event.
– Each weight is reduced according to the difference.
• Particles are resampled and normalized after the weights are reduced.
• When the sum of weights before normalization is under
a threshold, all particles are replaced (a reset).
– how to reset?
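
The sensor update described above can be sketched as follows. The likelihood used here (halving the weight for every differing field) is a hypothetical stand-in, because the slides do not specify the model; events are reduced to tuples of integers for brevity. The function only reports that a reset is needed and leaves the reset itself to the caller.

```python
import random
from typing import List, Sequence, Tuple

Particle = Tuple[int, float]  # (time index, weight)
Event = Tuple[int, ...]       # assumption: discretized event fields


def likelihood(past: Event, latest: Event) -> float:
    """Assumed likelihood: halve the weight for every field that differs."""
    mismatches = sum(1 for a, b in zip(past, latest) if a != b)
    return 0.5 ** mismatches


def sensor_update(particles: List[Particle], episode: Sequence[Event],
                  threshold: float,
                  rng: random.Random) -> Tuple[List[Particle], bool]:
    """Reweight, then resample; report a reset if the total weight is low."""
    latest = episode[-1]
    weighted = [(t, w * likelihood(episode[t], latest)) for t, w in particles]
    total = sum(w for _, w in weighted)
    if total < threshold:
        return weighted, True  # the caller should replace all particles
    n = len(weighted)
    # Resampling in proportion to weight also normalizes the belief.
    drawn = rng.choices([t for t, _ in weighted],
                        weights=[w for _, w in weighted], k=n)
    return [(t, 1.0 / n) for t in drawn], False


episode = [(1, 0), (0, 1), (1, 0), (0, 1), (1, 0)]
particles = [(0, 0.25), (1, 0.25), (2, 0.25), (3, 0.25)]
updated, needs_reset = sensor_update(particles, episode,
                                     threshold=0.2, rng=random.Random(0))
```

Here the weights sum to 0.625 before normalization, so a threshold of 0.2 does not trigger a reset, while a threshold of 0.7 would.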
Figure: the events e under the particles on the time axis are compared with the latest event; the difference is taken over the sensor readings, the reward, or the action
operation of PFoE – retrospective resets
• inspired by the retrospective activity of place cells
– When a rat recalls past events, place cells become active as if
the rat were moving virtually.
• algorithm
– 1. place particles randomly
– 2. replay the motion update and the sensor update for
M steps with the past M events from the current time
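
The two steps can be sketched as below. Several details are assumptions of this sketch rather than the authors' method: the likelihood halves the weight for each differing field, the replay keeps weights and normalizes once at the end instead of resampling at every step, and particles are placed so that they stay in the past after M shifts.

```python
import random
from typing import List, Sequence, Tuple

Particle = Tuple[int, float]  # (time index, weight)
Event = Tuple[int, ...]       # assumption: discretized event fields


def likelihood(past: Event, latest: Event) -> float:
    """Assumed likelihood: halve the weight for every field that differs."""
    mismatches = sum(1 for a, b in zip(past, latest) if a != b)
    return 0.5 ** mismatches


def retrospective_reset(episode: Sequence[Event], n: int, m: int,
                        rng: random.Random) -> List[Particle]:
    """Place particles at random, then replay the last m events."""
    present = len(episode) - 1
    start = present - m  # time step from which the replay begins
    if start < 1:
        raise ValueError("the episode is too short for an m-step replay")
    # 1. place particles randomly (early enough to stay behind the present)
    particles = [(rng.randrange(start), 1.0 / n) for _ in range(n)]
    # 2. replay the motion update and the sensor update for m steps
    for k in range(1, m + 1):
        reference = episode[start + k]
        particles = [(t + 1, w * likelihood(episode[t + 1], reference))
                     for t, w in particles]
    total = sum(w for _, w in particles)
    return [(t, w / total) for t, w in particles]


episode = [(t % 4,) for t in range(21)]  # a toy episode with period 4
particles = retrospective_reset(episode, n=1000, m=3, rng=random.Random(0))
aligned = sum(w for t, w in particles if episode[t] == episode[-1])
```

In this periodic toy episode, particles whose recent history matches the last M events keep their weight while the others are halved at every replayed step, so most of the belief ends up on time steps in the same phase as the present.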
Figure: randomly placed particles are moved and compared against the events e from M steps before the current time up to the current time
experiments
• the robot: a micromouse
that has 4 range sensors
• the environment: a T-maze that has a reward
in one of its arms
• The robot chooses either to turn right
or to turn left
at the T-junction.
• State transition is simplified to cycles of 4 events.
– The robot records an event when
• it is placed on the initial position
• it reaches the T-junction
• it turns right or left
• it goes to an end of the arm
Figure: the micromouse in the T-maze, showing the directions of its sensors and the reward marker
tasks of experiments
• a periodical task
– The reward is placed on the right and on the left alternately.
– cycles of 8 events
• a discrimination task
– The reward is placed on the side
where the robot is initially placed.
– Right or left is chosen randomly.
• not periodical
• 1000 particles
• 50 trials in an episode x 5 sets
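
The two reward schedules can be written down as a short sketch. The starting side of the alternation and the function names are assumptions. One trial is taken to be one cycle of 4 events (initial position, T-junction, turn, end of the arm), so an alternating reward repeats every 2 trials, that is, every 8 events.

```python
import random
from typing import List

EVENTS_PER_TRIAL = 4  # initial position, T-junction, turn, end of the arm


def periodical_sides(trials: int) -> List[str]:
    """Reward side per trial, alternating (starting on the right by assumption)."""
    return ["right" if i % 2 == 0 else "left" for i in range(trials)]


def discrimination_sides(trials: int, rng: random.Random) -> List[str]:
    """Side where the robot is first placed, chosen at random each trial.

    In the discrimination task the reward is on this same side.
    """
    return [rng.choice(["right", "left"]) for _ in range(trials)]


periodical = periodical_sides(50)
discrimination = discrimination_sides(50, random.Random(0))
cycle_length = 2 * EVENTS_PER_TRIAL  # the reward pattern repeats every 2 trials
```

The discrimination schedule has no fixed period, so the robot can only succeed by relating the reward side to where it was placed at the start of the trial.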
periodical task with/without the retro. reset
• Retrospective resets reallocate particles effectively.
Figure: results of the periodical task with random resets and with retrospective resets
discrimination task
• comparison of thresholds for retro. resets
• A higher threshold gives signs of learning.
– Particles are replaced frequently and go over the cyclic state
transition.
– But it is not perfect.
Figure: results of the discrimination task with thresholds 0.2 (infrequent resets) and 0.5 (frequent resets)
conclusion
• Particle Filter on Episode (PFoE)
– estimates the relation between the present and the past,
– is capable of real-time learning, and
– does not require an environmental model except for the Bayes
model on the sensor update.
• experimental results
– It works on the actual robot.
– The simple periodical task can be learned within 20 trials.
– The discrimination task can be partially learned (75% success).
• It seems that the idea of the retrospective resetting should
be extended for non-periodical tasks. (future work)
periodical task again
with different threshold
• to check ill effects of the high threshold for
retrospective resettings in the periodical task
• result: no ill effects can be seen
Figure: results of the periodical task with thresholds 0.2 and 0.5

Particle Filter on Episode

  • 1. Particle Filter on Episode for Learning Decision Making Rule
      Ryuichi Ueda (Chiba Inst. of Technology), Kotaro Mizuta (RIKEN BSI), Hiroshi Yamakawa (DOWANGO), Hiroyuki Okada (Tamagawa Univ.)
  • 2. navigation problems in the real world
      • Not only robots but also animals solve them.
      • Mammals have specialized cells for spatial recognition in their brains.
          – especially around the hippocampus
          – e.g. place cells
              • They react differently at each place in an environment.
              • -> existence of maps in the brain
      [Figure: place cells [O'Keefe71] (http://en.wikipedia.org/wiki/Place_cell)]
      July 6th, 2016 IAS-14 Shanghai
  • 3. map vs. memory
      • Mammals have maps in their brains.
      • Maps of environments are also of concern in robotics.
          – SLAM has been one of the most important topics.
          – studies introducing the function of the hippocampus
              • RatSLAM [Milford08]
      • How about memory?
          – Memory is also handled in the hippocampus.
          – Sequences of memories are reduced to maps (or state space models).
          – Robots can record their memory for a long time if they have TB-level storage. (difference between mammals and robots)
  • 4. the purpose
      • our intuition
          – If memory is the source of maps, robots will be able to decide their actions not from a map but directly from memory.
          – Some knowledge about the handling of memory in the hippocampus and its surroundings will help this attempt.
      • to implement a learning algorithm that directly utilizes memory
          – particle filter on episode (PFoE)
          – validation with an actual robot
  • 5. related works
      • Episode-based reinforcement learning [Unemi 1999]
          – Its basic idea is identical to that of PFoE.
          – PFoE simplifies implementation and enables real-time calculation.
      • RatSLAM [Milford08]
          – an algorithm for robotics utilizing knowledge about the hippocampus and its surroundings
  • 6. outline of PFoE
      • In repetitions of a task for learning, a robot stores events.
          – an event = a set of sensor readings, actions, and rewards (given by someone) obtained at a discrete time step
          – the episode: the sequence of the events
      • The degree of recall of each event is represented as a probability.
      [Figure: the episode on a time axis from past to present, showing states (s), actions (a), rewards (1, -1), and the belief over the events]
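
The terms on this slide can be sketched as plain data structures. This is only an illustration: the `Event` field names, the example sensor values, and the uniform initial belief are assumptions made for the sketch, not details given in the slides.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Event:
    """One event: what was sensed, done, and rewarded at a discrete time step."""
    sensors: Tuple[int, ...]  # sensor readings (hypothetical values below)
    action: str               # action taken at this time step
    reward: float             # reward given at this time step


# The episode is the time-ordered sequence of events.
episode: List[Event] = [
    Event((1, 0, 0, 1), "forward", 0.0),
    Event((0, 1, 1, 0), "right", 0.0),
    Event((0, 0, 0, 0), "forward", 1.0),
]

# belief[i] is the degree of recall of episode[i], expressed as a probability.
# A uniform distribution is assumed here as a starting point.
belief = [1.0 / len(episode)] * len(episode)
```

Since the episode only grows by appending events, a list indexed by time step is sufficient for this sketch.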
  • 7. decision with the belief and the episode
      • An action is chosen by calculating expectation values.
      [Figure: the episode on a time axis with states (s), actions (a), rewards (1, -1, and an unknown reward "?" at the present time), and the belief. Annotations: When the robot recalls the events that led to the +1 reward, it may obtain a +1 reward if it chooses the same action as at those times. When the robot recalls the events that led to the -1 reward, it should change its action to avoid the -1 reward.]
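
One way to read "calculation of expectation values" is sketched below. It assumes that each recalled event contributes its recall probability times the reward that followed the action recorded there; the slides do not state the exact formula, so the function `choose_action` and its scoring rule are hypothetical.

```python
from typing import Dict, List, Tuple


def choose_action(recalled: List[Tuple[float, str, float]],
                  actions: List[str]) -> str:
    """Pick the action with the highest expected reward.

    recalled holds one tuple per recalled event:
    (recall probability, action taken then, reward that followed).
    """
    expected: Dict[str, float] = {a: 0.0 for a in actions}
    for prob, action, reward in recalled:
        if action in expected:
            expected[action] += prob * reward  # probability-weighted reward
    return max(actions, key=lambda a: expected[a])


# Hypothetical belief: turning right was mostly followed by +1,
# turning left mostly by -1.
recalled = [(0.6, "right", 1.0), (0.3, "left", -1.0), (0.1, "left", 1.0)]
best = choose_action(recalled, ["left", "right"])
```

With these numbers the score of "right" is 0.6 and the score of "left" is -0.2, so the sketch repeats the action that was followed by the +1 reward and avoids the one followed by -1.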
  • 8. representation with particles
      • The belief is represented with particles.
          – O(N) even if the episode has infinite length
      • variables of a particle
          – its position on the time axis
          – its weight
      [Figure: particles placed on the time axis up to the present time, representing the belief]
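
A minimal sketch of this representation follows. The names `Particle`, `init_particles`, and `belief_at`, as well as the uniform random initialization, are assumptions for illustration.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Particle:
    pos: int       # position on the time axis (index of an event in the episode)
    weight: float  # weight of this particle


def init_particles(n: int, episode_len: int, rng: random.Random) -> List[Particle]:
    """Place n particles uniformly at random on the time axis (assumed initialization)."""
    return [Particle(rng.randrange(episode_len), 1.0 / n) for _ in range(n)]


def belief_at(particles: List[Particle], t: int) -> float:
    """Approximate the belief at time step t by summing the weights of particles there."""
    return sum(p.weight for p in particles if p.pos == t)


rng = random.Random(0)
particles = init_particles(1000, 50, rng)
total = sum(belief_at(particles, t) for t in range(50))
```

Every operation in this sketch loops over the particles rather than over the episode, so the cost depends on the number of particles and not on how long the episode has grown.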
  • 9. operation of PFoE – motion update
      • When the current time goes to the next time step, particles simply shift to their next time steps.
          – The episode is extended by an additional event.
          – Positions of particles are shifted.
      [Figure: the belief on the time axis before an action and after the action, with the new event added to the episode]
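
The motion update can be sketched in a few lines. Events are plain strings here and particles are reduced to their positions; both are simplifications assumed for the example.

```python
from typing import List


def motion_update(positions: List[int], episode: List[str], new_event: str) -> List[int]:
    """Extend the episode by one event and shift every particle one step forward."""
    episode.append(new_event)           # the episode is extended by an additional event
    return [p + 1 for p in positions]   # positions of particles are shifted


episode = ["start", "junction", "turn_right", "arm_end"]  # hypothetical event labels
positions = [0, 2, 3]
positions = motion_update(positions, episode, "start")
```

Because the episode grows by one event at the same moment the particles shift, a shifted position always stays inside the episode.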
  • 10. operation of PFoE – sensor update
      • The event related to each particle is compared to the latest one.
          – Weights are reduced according to the difference.
              • Particles are resampled and weights are normalized after the reduction of weights.
      • When the sum of weights before normalization is under a threshold, all particles are replaced (a reset).
          – how to reset?
      [Figure: events (e) on the time axis are compared with the latest event; the difference is in sensor readings, the reward, or the action]
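
The sensor update might look like the sketch below. Several parts are assumptions: events are plain strings compared by equality, a mismatch multiplies the weight by a fixed `penalty` factor, and the reset simply scatters particles at random (the slide leaves the reset method open at this point).

```python
import random
from typing import List, Tuple


def sensor_update(positions: List[int], weights: List[float], episode: List[str],
                  threshold: float, rng: random.Random,
                  penalty: float = 0.1) -> Tuple[List[int], List[float], bool]:
    """Return (positions, weights, was_reset) after comparing with the latest event."""
    latest = episode[-1]
    # Reduce the weight of particles whose event differs from the latest one.
    reduced = [w * (1.0 if episode[p] == latest else penalty)
               for p, w in zip(positions, weights)]
    n = len(positions)
    uniform = [1.0 / n] * n
    if sum(reduced) < threshold:
        # Reset: replace all particles (here: random positions on past events).
        return [rng.randrange(len(episode) - 1) for _ in range(n)], uniform, True
    # Resample in proportion to the reduced weights, then normalize.
    return rng.choices(positions, weights=reduced, k=n), uniform, False


episode = ["A", "B", "A", "B", "A"]          # hypothetical event labels
positions, weights = [0, 1, 2, 3], [0.25] * 4
rng = random.Random(1)
kept = sensor_update(positions, weights, episode, 0.2, rng)
reset = sensor_update(positions, weights, episode, 0.9, rng)
```

In this example the sum of reduced weights is 0.55, so a threshold of 0.2 leads to resampling while a threshold of 0.9 triggers a reset.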
  • 11. operation of PFoE – retrospective resets
      • inspired by the retrospective activity of place cells
          – When a rat recalls past events, place cells become active as if the rat virtually moves.
      • algorithm
          – 1. place particles randomly
          – 2. replay the motion update and the sensor update for M steps with the past M events from the current time
      [Figure: particles on the time axis are moved and compared with events (e) from M steps before the current time up to the current time]
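
The two steps of the algorithm can be sketched as follows. As assumptions for the example, events are strings compared by equality, a mismatch multiplies the weight by a fixed `penalty`, and resampling during the replay is omitted for brevity.

```python
import random
from typing import List, Tuple


def retrospective_reset(episode: List[str], n: int, m: int, rng: random.Random,
                        penalty: float = 0.1) -> Tuple[List[int], List[float]]:
    """Place particles randomly, then replay the last m events over them."""
    t = len(episode) - 1   # index of the current (latest) event
    start = t - m          # time step from which the replay starts
    if start < 1:
        raise ValueError("episode too short for an m-step replay")
    positions = [rng.randrange(start) for _ in range(n)]  # 1. random placement
    weights = [1.0 / n] * n
    for step in range(1, m + 1):                          # 2. replay m steps
        reference = episode[start + step]
        positions = [p + 1 for p in positions]            # motion update
        weights = [w * (1.0 if episode[p] == reference else penalty)
                   for p, w in zip(positions, weights)]   # sensor update
    total = sum(weights)
    return positions, [w / total for w in weights]


episode = list("ABCD" * 5)  # hypothetical episode with cycles of 4 events
positions, weights = retrospective_reset(episode, 200, 4, random.Random(0))
best = positions[max(range(len(weights)), key=weights.__getitem__)]
```

In this periodic example, only particles that stay in phase with the last four events keep their weight, so the heaviest particle ends on a past event that matches the current one.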
  • 12. experiments
      • the robot: a micromouse that has 4 range sensors
      • the environment: a T-maze that has a reward in one of its arms
      • The robot chooses a turn-right action or a turn-left action at the T-junction.
      • State transition is simplified to cycles of 4 events.
          – The robot records an event when
              • it is placed on the initial position
              • it reaches the T-junction
              • it turns right or left
              • it goes to an end of the arm
      [Figure: the robot in the T-maze, showing the direction of sensors and a marker of reward]
  • 13. tasks of experiments
      • a periodical task
          – The reward is placed on the right and on the left alternately.
          – cycles of 8 events
      • a discrimination task
          – The reward is placed on the side where the robot is placed at first.
          – Right or left is chosen randomly.
              • not periodical
      • 1000 particles
      • 50 trials in an episode x 5 sets
  • 14. periodical task with/without the retro. reset
      • Retrospective resets reallocate particles effectively.
      [Figure: two panels, "with random reset" and "with the reset"]
  • 15. discrimination task
      • comparison of thresholds for retro. resets
      • A higher threshold gives signs of learning.
          – Particles are replaced frequently and go over the cyclic state transition.
          – But it is not perfect.
      [Figure: two panels, threshold "0.2 (not frequent)" and threshold "0.5 (frequent)"]
  • 16. conclusion
      • Particle Filter on Episode (PFoE)
          – estimates the relation between the current and past events,
          – is capable of real-time learning, and
          – does not require an environmental model except for the Bayes model in the sensor update.
      • experimental results
          – It works on the actual robot.
          – The simple periodical task can be learned within 20 trials.
          – The discrimination task can be partially learned (75% success).
      • It seems that the idea of retrospective resets should be extended for non-periodical tasks. (future work)
  • 17. periodical task again with a different threshold
      • to check for ill effects of the high threshold for retrospective resets in the periodical task
      • result: no ill effects can be seen
      [Figure: two panels, threshold "0.2" and threshold "0.5"]