
Dr.Kawewong Ph.D Thesis

Transcript of "Dr.Kawewong Ph.D Thesis"

  1. PIRF-Nav: An Online Incremental Appearance-based Localization and Mapping in Dynamic Environments
     Aram Kawewong, Hasegawa Laboratory, Department of Computational Intelligence and Systems Science, Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
  2. Introduction to SLAM
     Simultaneous Localization and Mapping (SLAM) is a navigation capability needed by every kind of mobile robot. In an unfamiliar environment, the robot must perform two tasks simultaneously:
     • Mapping the new place if it has never been visited before
     • Localizing itself to a mapped place if the place has been visited before
  3. Appearance-based Localization and Mapping (FAB-MAP)
  4. Why Visual SLAM? What Are the Challenges?
     Why don't we just use GPS?
     • GPS is not always reliable in a crowded city centre.
     • GPS can only give the coordinates of the agent, not the corresponding scene. It cannot help the robot answer questions such as "Look at this picture and tell me where it is" or "Have you visited this place before? Can you describe the nearby places?"
     No false positives (false negatives are tolerated):
     • If the robot is not confident, it should answer "this is a new place". If it answers "this place is the same place as the place …", it must be 100% correct.
     • In other words, 100% precision: every reported match must be correct.
  5. Appearance-based Localization and Mapping vs. Place Recognition

     |              | Place Recognition (Computer Vision) | Localization and Mapping (Robotics) |
     |--------------|-------------------------------------|-------------------------------------|
     | Input images | All testing images are known to come from somewhere in the map | Every input image is a testing image; it may come from somewhere in the map or from a previously unseen place |
     | Environment  | Closed environment | Open environment |
     | Precision    | A precision of 1 is not the main concern if the recall rate is reasonably high | A precision of 1 is the first priority; a single false positive may lead to a serious navigation error |
  6. Common Objectives of Appearance-based SLAM
     • 100% precision with a very high recall rate
     • Runs incrementally, in an online manner
     • Life-long operation: low computation time and low memory consumption
     • Suitable for navigation in large-scale environments
     • Handles two main problems: dynamic changes, and perceptual aliasing (different places that look similar)
     Note: coordinate-based localization is not required here.
  7. Related Work on Visual SLAM
     1. FAB-MAP (Cummins & Newman, IJRR'08)
        • Although FAB-MAP is a state-of-the-art method, its recall rate at 100% precision is still not high.
        • An offline dictionary generation process is necessary.
     2. Fast Incremental Bag-of-Words (Angeli et al., T-RO'08)
        • The system runs incrementally; no offline dictionary generation is needed.
        • Accuracy is reported to be less than or equal to that of FAB-MAP.
        • It consumes much more memory than FAB-MAP.
  8. What Do We Want? PIRF-Nav's Advantages

     |                                                          | FAB-MAP (IJRR'08) | Inc. BoW (T-RO'08) | PIRF-Nav (prop.) |
     |----------------------------------------------------------|-------------------|--------------------|------------------|
     | Runs incrementally without offline dictionary generation | No | Yes | Yes |
     | Memory consumption                                       | Low | High | Moderate |
     | Runs in real time                                        | Yes | Yes | Yes |
     | Robustness against dynamic changes*                      | Moderate (~40% on City Centre) | Low (~20% on City Centre) | High (~85% on City Centre) |

     * The recall rate is measured at 100% precision.
  9. Basic Idea & Concept of PIRF-Nav
     • Using PIRFs, we can detect good landmarks for each individual place.
     • The extracted PIRFs should be informative enough to represent the place, so the system does not need a pre-generated visual vocabulary.
     • The number of PIRFs is small enough for real-time use.
     • Because PIRFs are robust against dynamic changes in scenes, the PIRF-based visual SLAM (called PIRF-Nav) becomes an efficient online incremental visual SLAM.
  10. Basic Idea of PIRFs (proposed)
      • Outdoor scenes generally include distant objects whose appearance is robust against changes in camera position.
      • Averaging the "slow-moving" local features that capture such objects yields fewer, more robust features.
  11. PIRF Extraction Algorithm
      [Figure: an image sequence and its sequence of matching index vectors, processed with a sliding window of size w = 3]
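The extraction idea on the two slides above (follow features across a window of consecutive images, keep those that persist, and average them) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the function name `extract_pirfs`, the use of `None` for "no match", and the convention that a window of size `w` spans `w` consecutive images are all hypothetical choices.

```python
from typing import List, Optional

Descriptor = List[float]


def extract_pirfs(
    descriptors: List[List[Descriptor]],
    matches: List[List[Optional[int]]],
    w: int,
) -> List[Descriptor]:
    """Average the descriptors that persist over one window of w images.

    descriptors[t][i] is descriptor i of image t (at least w images).
    matches[t][i] is the index in image t+1 matched to feature i of
    image t, or None when there is no match (assumed encoding).
    """
    pirfs = []
    for i, first in enumerate(descriptors[0]):
        chain = [first]
        idx: Optional[int] = i
        # Follow the feature through the matching index vectors.
        for t in range(w - 1):
            idx = matches[t][idx]
            if idx is None:
                break  # the feature disappeared inside the window
            chain.append(descriptors[t + 1][idx])
        if len(chain) == w:
            # The feature survived the whole window: average its descriptors.
            dim = len(first)
            pirfs.append([sum(d[j] for d in chain) / w for j in range(dim)])
    return pirfs
```

Only one window is processed here; a full pipeline would slide the window along the image sequence and collect the averaged features per place.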
  12. Briefly on PIRF's Performance
      • Exp. 1, scenes from Suzukakedai: training 580, testing 489 (images of 640x428)
      • Exp. 1, scenes from O-okayama: training 450, testing 493 (images of 640x428)
  13. PIRF's Performance: Recognition Rate on Suzukakedai and O-okayama
      [Bar chart: recognition rate (0% to 100%) for the Suzukakedai and O-Okayama datasets. Bar labels: 93.46%, 77.48%, 45.75%, 36.71%, 30.22%, 31.08%, 27.59%, 24.54%, 22.29%, 18.23%]
  14. Even With These Strong Changes, PIRF Still Works Well!
      [Figures: highly dynamic changes in scenes; illumination changes in scenes]
  15. PIRF (City Centre Dataset)
      [Figures: original descriptors (SIFT); Position-invariant Robust Features (PIRF) (proposed)]
  16. PIRF-Nav Processing Diagram (prop.)
      Overall processing diagram:
      • Step 1: Perform simple feature matching. The score is calculated with the popular term frequency-inverse document frequency (tf-idf) weighting.
      • Steps 2-3: Adapt the score by considering the neighbors, then normalize it.
      • Step 4: Perform a second integration over the score space for re-localization.
  17. Notation Definition
      • At time $t$, the map of the environment is a collection of $n_t$ discrete and disjoint locations $\mathcal{L} = \{L_1, \dots, L_{n_t}\}$.
      • Each location $L_i$, which has been created from a past image $I_i$, has an associated model $M_i$.
      • The model $M_i$ is a set of PIRFs.
  18. STEP 1: Simple Feature Matching
      • The current model $M_t$ is compared with every mapped model in $\mathcal{M} = \{M_0, \dots, M_{n_t}\}$ using standard feature matching with distance threshold $\theta_2$.
      • Each comparison outputs a similarity score $s$.
      • $M_0$ is the model of location $L_0$, a virtual location for the event "no loop closure occurred at time $t$".
      • Based on the obtained scores, the system proceeds to the next step if $\arg\max(s) \neq 0$.
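As a rough illustration of the matching in Step 1, the sketch below counts how many descriptors of an input model have a nearest neighbour in a mapped model within a distance threshold, and checks whether the best-scoring location is the virtual location 0. The helper names `count_matches` and `best_location`, the Euclidean metric, and the plain nearest-neighbour test are assumptions made for this example; the slides only say "standard feature matching".

```python
import math
from typing import List, Sequence


def count_matches(
    query: Sequence[Sequence[float]],
    model: Sequence[Sequence[float]],
    max_dist: float,
) -> int:
    """Number of query descriptors whose nearest model descriptor
    lies within max_dist (Euclidean distance, an assumed metric)."""
    matched = 0
    for q in query:
        best = min((math.dist(q, m) for m in model), default=math.inf)
        if best <= max_dist:
            matched += 1
    return matched


def best_location(scores: List[float]) -> int:
    """Index of the highest score; index 0 stands for the virtual
    'no loop closure' location, so the caller stops when this is 0."""
    return max(range(len(scores)), key=lambda i: scores[i])
```

A caller would compute one score per mapped model and continue to Step 2 only when `best_location(scores) != 0`.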
  19. STEP 1: Simple Feature Matching (Continued)
      The similarity score $s$ is calculated using term frequency-inverse document frequency (tf-idf) weighting (Sivic & Zisserman, ICCV'03):
      $$\text{tf-idf} = \frac{n_{wi}}{n_i} \log \frac{N}{n_w}$$
      • $n_{wi}$ is the number of occurrences of visual word $w$ in $M_i$
      • $n_i$ is the total number of visual words in $M_i$
      • $n_w$ is the number of models containing word $w$
      • $N$ is the total number of existing models
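For concreteness, the standard tf-idf weight (term frequency times log inverse document frequency) can be evaluated as in the small sketch below. The function and argument names are chosen for this example only, and the natural logarithm is an assumption since the slide does not state a base.

```python
import math


def tf_idf(n_wi: int, n_i: int, n_w: int, n_models: int) -> float:
    """tf-idf weight of word w in model i.

    n_wi: occurrences of word w in model i
    n_i: total number of words in model i
    n_w: number of models containing word w
    n_models: total number of models (N on the slide)
    """
    term_frequency = n_wi / n_i
    inverse_document_frequency = math.log(n_models / n_w)
    return term_frequency * inverse_document_frequency
```

A word that appears in every model gets weight 0, which is why common, uninformative features contribute nothing to the score.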
  20. STEP 1: Simple Feature Matching (Continued)
      To be used with PIRFs, the function is converted to
      $$s_i = \sum_{k=1}^{m_i} \log \frac{N}{n_{w_k}}$$
      • $n_{w_k}$ is the number of models $M_j$, $0 \le j \le n_t$, $j \ne i$, containing PIRFs that match the $k$-th PIRF of the input model
      • $m_i$ is the number of matched PIRFs between the input model and the query model
      The system proceeds to STEP 2 if and only if the maximum score does not belong to $M_0$ and is greater than $\theta_1$.
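A possible reading of the PIRF score on this slide is a sum of log inverse-document-frequency terms, one per matched PIRF. The sketch below follows that reading; it is an interpretation, not the confirmed formula. The name `pirf_score` is hypothetical, and clamping a count of zero to one (to avoid dividing by zero) is an assumption added for the example.

```python
import math
from typing import Iterable


def pirf_score(model_counts: Iterable[int], n_models: int) -> float:
    """Sum of log(N / n_wk) over the matched PIRFs of one mapped model.

    model_counts[k] is the number of models containing a PIRF that
    matches the k-th matched PIRF of the input model.
    """
    score = 0.0
    for n_wk in model_counts:
        # Assumption: treat a count of 0 as 1 so the ratio stays finite.
        score += math.log(n_models / max(n_wk, 1))
    return score
```

Under this reading, a matched PIRF seen in few models raises the score more than one seen in many models, and a model with no matched PIRFs scores 0.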
  21. STEP 2: Considering Neighbors
      • Accepting or rejecting a loop-closure detection based on the score of a single image is sensitive to noise.
      • This can be handled by also considering the similarity scores of the neighboring image models:
      $$\beta_i = \sum_{k=i-\omega}^{i+\omega} s_k \cdot p_T(i, k)$$
      • The term $p_T(i, k)$ is the transition probability generated from a Gaussian on the distance in time between $i$ and $k$.
      • $\omega$ stands for the number of neighbors examined.
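The neighbor-weighted score of Step 2 can be sketched as a Gaussian-weighted sum over a window of neighboring scores. In the sketch below, the Gaussian width `sigma`, the normalization of the weights to sum to 1, and the treatment of out-of-range neighbors as contributing nothing are all assumptions; the slide only states that the weights come from a Gaussian on the distance in time.

```python
import math
from typing import List


def gaussian_weights(omega: int, sigma: float) -> List[float]:
    """Weights for offsets -omega..omega, normalized to sum to 1."""
    raw = [math.exp(-(d * d) / (2.0 * sigma * sigma))
           for d in range(-omega, omega + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def smooth_scores(scores: List[float], omega: int, sigma: float = 1.0) -> List[float]:
    """Weighted sum of each score with its omega neighbors on each side."""
    weights = gaussian_weights(omega, sigma)
    smoothed = []
    for i in range(len(scores)):
        acc = 0.0
        for offset, weight in zip(range(-omega, omega + 1), weights):
            k = i + offset
            if 0 <= k < len(scores):  # neighbors outside the map add nothing
                acc += scores[k] * weight
        smoothed.append(acc)
    return smoothed
```

An isolated high score is spread over its neighbors and lowered, while a run of consistently high scores stays high, which is the noise-suppression effect the slide describes.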
  22. STEP 3: Normalizing the Score
      • Normalization uses the standard deviation $\sigma$ and the mean value $\mu$ over all scores.
      • ln indicates the number of neighbours taken into consideration.
      • The beta-scores are converted into normalized scores according to
      $$C_i = \begin{cases} \dfrac{\beta_i - \sigma}{\mu}, & \text{if } \beta_i \ge T \\ 1, & \text{otherwise} \end{cases} \qquad \text{where } T = \mu + \sigma$$
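The normalization rule of Step 3 (scores at or above mean plus standard deviation are rescaled, all others are set to 1) can be sketched as below. The function name, the use of the population standard deviation, and the fallback when the mean is zero are assumptions made for this example.

```python
import statistics
from typing import List


def normalize_scores(betas: List[float]) -> List[float]:
    """Map each beta-score to (beta - sigma) / mu when it reaches
    the threshold mu + sigma, and to 1 otherwise."""
    mu = statistics.fmean(betas)
    sigma = statistics.pstdev(betas)  # population std. dev. (assumed)
    if mu == 0:
        return [1.0] * len(betas)  # assumed fallback: nothing stands out
    threshold = mu + sigma
    return [(b - sigma) / mu if b >= threshold else 1.0 for b in betas]
```

For example, with beta-scores `[1, 1, 1, 1, 6]` the mean is 2 and the standard deviation is 2, so only the last score passes the threshold of 4 and becomes (6 - 2) / 2 = 2, while the rest are set to 1.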
  23. STEP 4: Re-localization
      • The obtained location $L_j$ would be accepted as a loop closure if [acceptance condition not recoverable: "− 2"].
      • Ideally, the neighboring model scores of location $L_j$ should decrease symmetrically from the peak model score.
      • However, scenes in dynamic environments usually contain moving objects that frequently cause occlusion, so the scores around an assigned location may not be symmetrical.
  24. STEP 4: Re-localization (Sample Problems)
      • The location assigned by Step 3 does not have a symmetrical score profile.
      • Performing one more summation can shift the location to the correct one.
  25. STEP 4: Re-localization
      Therefore, we perform a second summation over the neighbouring model scores to achieve a more accurate localization:
      $$C'_j = \sum_{k=j-\omega}^{j+\omega} C_k \cdot p_T(j, k)$$
      The normalized scores obtained for all possible models determine the most likely loop-closure location $L_{j^*}$, where $j^* = \arg\max_j C'_j$.
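The effect of the second summation in Step 4 can be illustrated with a small sketch: summing each normalized score with its Gaussian-weighted neighbors and taking the argmax can move the detected location away from an isolated peak toward the centre of a run of high scores. The function name `relocalize`, the Gaussian width, and the weight normalization are assumptions, and the virtual location 0 is not treated specially here.

```python
import math
from typing import List, Tuple


def relocalize(norm_scores: List[float], omega: int,
               sigma: float = 1.0) -> Tuple[int, List[float]]:
    """Second summation over neighboring scores, followed by argmax."""
    raw = [math.exp(-(d * d) / (2.0 * sigma * sigma))
           for d in range(-omega, omega + 1)]
    total = sum(raw)
    summed = []
    for j in range(len(norm_scores)):
        acc = 0.0
        for offset, r in zip(range(-omega, omega + 1), raw):
            k = j + offset
            if 0 <= k < len(norm_scores):
                acc += norm_scores[k] * (r / total)
        summed.append(acc)
    best = max(range(len(summed)), key=lambda j: summed[j])
    return best, summed
```

With scores `[1, 1, 3, 2.9, 2.9, 1, 1]` the raw maximum is at index 2, but its neighbors are asymmetric (1 on one side, 2.9 on the other); after the second summation with `omega=1` the maximum moves to index 3, which sits in the middle of the high-score run.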
  26. Results & Experiments: Datasets
      Three datasets have been used:
      • City Centre (2474 images of size 640 x 480). The dataset was taken to address the problem of dynamic changes of scenes in a city centre.
      • New College (2146 images of size 640 x 480). The dataset was taken to address the problem of perceptual aliasing. In this dataset, the robot visits the same places many times, and many different places look very similar.
      • Suzukakedai (1079 images of size 1920 x 1080). The dataset was taken with a video camera fitted with an omnidirectional lens. It addresses the problem of highly dynamic changes, where a different event (i.e. an open-campus event) was being held.
  27. Results & Experiments: Datasets (City Centre)
  28. Results & Experiments: Datasets (New College)
  29. Results & Experiments: Datasets (Suzukakedai)
  30. Results & Experiments: Baseline
      • Among the many visual SLAM methods, FAB-MAP (Cummins & Newman, IJRR'08) and the fast incremental BoW method of Angeli et al. (T-RO'08) are considered state of the art.
      • Both are based on the bag-of-words scheme.
      • Each offers different advantages:
        • FAB-MAP: high accuracy, with offline dictionary generation
        • Angeli et al.: accuracy lower than or equal to FAB-MAP, but with online incremental dictionary generation
      • PIRF-Nav must offer higher accuracy than FAB-MAP while being an online incremental method like that of Angeli et al.
  31. Evaluation on the Appearance-based Loop-closure Detection Problem
      [Flowchart: input image -> loop closing? -> either add a new place to the map, or find the loop-closure place and output the loop-closure location]
      • Binary classification (new place / old place):
        Precision A = correct loop closures / all loop closures
        Recall A = correct loop closures / all labeled loop closures
      • Image retrieval problem (retrieve the most likely place for the loop closure):
        Precision B = correctly retrieved images / all retrieved images
        Recall B = correctly retrieved images / all labeled images
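The two precision/recall pairs defined on this slide can be computed from per-image predictions and labels, as in the sketch below. The data layout (one entry per input image, `None` meaning "new place") and the convention of returning 0.0 when a denominator is zero are assumptions made for this example.

```python
from typing import Dict, List, Optional


def precision_recall(
    predicted: List[Optional[int]],
    labeled: List[Optional[int]],
) -> Dict[str, float]:
    """Precision/recall A (binary detection) and B (image retrieval).

    predicted[i]: location reported for image i, or None for 'new place'.
    labeled[i]: ground-truth loop-closure location, or None.
    """
    reported = [(p, g) for p, g in zip(predicted, labeled) if p is not None]
    n_labeled = sum(1 for g in labeled if g is not None)
    # A: was the yes/no loop-closure decision right?
    detected_ok = sum(1 for _, g in reported if g is not None)
    # B: was the retrieved place the labeled one?
    retrieved_ok = sum(1 for p, g in reported if p == g)

    def ratio(num: int, den: int) -> float:
        return num / den if den else 0.0  # assumed convention for 0/0

    return {
        "precision_a": ratio(detected_ok, len(reported)),
        "recall_a": ratio(detected_ok, n_labeled),
        "precision_b": ratio(retrieved_ok, len(reported)),
        "recall_b": ratio(retrieved_ok, n_labeled),
    }
```

The two pairs differ whenever the system correctly says "loop closure" but retrieves the wrong place: that case counts as correct for A and as wrong for B.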
  32. Evaluation on the Appearance-based Loop-closure Detection Problem
      Strictly, performance should be evaluated with two graphs:
      • the Precision A - Recall A curve
      • the Precision B - Recall B curve
      However, for compactness, most work in visual SLAM reports only the Precision B - Recall B curve, because:
      • the binary classification is currently not very problematic;
      • the important challenge lies in the performance of image retrieval.
  33. Evaluation on the Appearance-based Loop-closure Detection Problem (City Centre)
      • Precision A - Recall A: deciding only "YES/NO, a loop closure was detected" is currently a trivial problem.
      • Precision B - Recall B: given that the precision of the "YES/NO" detection is 100%, it is much more interesting to see how accurately the system retrieves the corresponding image.
  34. Result 1: City Centre
      [Figures: vehicle trajectory and loop-closure detections for PIRF-Nav (100% precision) (proposed) and FAB-MAP (100% precision)]
  35. Result 1: City Centre (Precision-Recall Curve)
  36. Result 1: City Centre (Computation Time)
      * Note that all PIRF-Nav programs were written in MATLAB, while FAB-MAP was written in C.
  37. Result 2: New College
      [Figures: vehicle trajectory and loop-closure detections for PIRF-Nav (100% precision) (proposed) and FAB-MAP (100% precision)]
  38. Result 2: New College (Precision-Recall Curve)
  39. Result 3: Suzukakedai
      [Figures: vehicle trajectory and loop-closure detections for PIRF-Nav (100% precision)]
  40. Result 3: Suzukakedai (Precision-Recall Curve)
  41. Result 4: Combined Datasets (Precision-Recall Curve)
      Note: We did not test FAB-MAP in this experiment because FAB-MAP failed completely on the Suzukakedai dataset. The results on City Centre and New College also clearly imply that FAB-MAP would not gain better accuracy in this experiment.
  42. Sample Matched Images (Dynamic Changes in a Major Part of the Scene)
  43. Sample Matched Images (Different Viewpoints)
  44. Conclusions
      • PIRF-Nav outperforms FAB-MAP in terms of accuracy, with a recall rate of more than 80% at 100% precision on all datasets provided by the authors.
      • PIRF-Nav runs online and incrementally in very different environments.
      • Although PIRF-Nav takes longer to compute than FAB-MAP at the same image scale, it compensates for this drawback by processing smaller images, since its accuracy remains considerably higher than that of FAB-MAP.
  45. Thank You for Your Kind Attention
      "Doubt is the father of invention" (quoted by Galileo)
  46. Publications: Journals
      1. A. Kawewong and O. Hasegawa, Classifying 3D Real-World Texture Images by Combining Maximum Response 8, 4th Order of Auto Correlation and Colortons, Jour. of Advanced Comp. Intelligence and Intelligent Informatics, vol. 11, no. 5, 2007.
      2. A. Kawewong, Y. Honda, M. Tsuboyama, and O. Hasegawa, Reasoning on the Self-Organizing Incremental Associative Memory for Online Robot Path Planning, IEICE Trans. Inf. Sys., vol. E93-D, no. 3, 2009. (impact factor 0.369)
      3. Y. Honda, A. Kawewong, M. Tsuboyama, and O. Hasegawa, Acquisition of Place Cells by a Semi-supervised Neural Network and Autonomous Mobile Robot Control (in Japanese), IEICE Trans. D, 2009. (accepted)
      4. A. Kawewong, N. Tongprasit, S. Tangruamsub, and O. Hasegawa, Online and Incremental Appearance-based SLAM in Highly Dynamic Environments, Int'l Jour. Robotics Research (IJRR). (to appear in 2010, impact factor 2.882, rank #1 in robotics)
      5. A. Kawewong, S. Tangruamsub, and O. Hasegawa, Position-Invariant Robust Features for Long-term Recognition of Dynamic Outdoor Scenes, IEICE Trans. Inf. Sys. (conditionally accepted)
  47. Publications: Conferences
      1. A. Kawewong and O. Hasegawa, 3D Texture Classification by Using Pre-testing Stage and Reliability Table, IEEE Proc. International Conference on Image Processing (ICIP), 2005.
      2. A. Kawewong and O. Hasegawa, Combining Rotationally Variant and Invariant Features Based on Between-Class Error for 3D Texture Classification, IEEE Int'l Conf. on Computer Vision (ICCV) Workshop, 2005.
      3. A. Kawewong, Y. Honda, M. Tsuboyama, and O. Hasegawa, A Common-Neural-Pattern Based Reasoning for Mobile Robot Cognitive Mapping, in Proc. Int'l Conf. Neural Information Processing (ICONIP), 2008.
      4. A. Kawewong, Y. Honda, M. Tsuboyama, and O. Hasegawa, Common-Patterns Based Mapping for Robot Navigation, in Proc. IEEE Int'l Conf. Robotics and Biomimetics (ROBIO), 2008.
      5. S. Tangruamsub, M. Tsuboyama, A. Kawewong, and O. Hasegawa, Mobile Robot Vision-Based Navigation Using Self-Organizing and Incremental Neural Networks, in Proc. Int'l Joint Conf. Neural Networks (IJCNN), 2009.
  48. Publications: Conferences (continued)
      6. A. Kawewong, S. Tangruamsub, and O. Hasegawa, Wide-baseline Visible Features for Highly Dynamic Scene Recognition, in Proc. Int'l Conf. Computer Analysis of Images and Patterns (CAIP), 2009.
      7. N. Tongprasit, A. Kawewong, and O. Hasegawa, Data Partitioning Technique for Online and Incremental Visual SLAM, in Proc. Int'l Conf. on Neural Information Processing (ICONIP), 2009. (oral, student travel award)