Real-time 3D Object Pose Estimation and Tracking for Natural Landmark Based Visual Servo. Seung-Min Baek and Sukhan Lee (Sungkyunkwan University, Intelligent System Research Center); Changhyun Choi (Georgia Tech, College of Computing)
Contents: Introduction, Motivation, Related Works, Proposed Approach, System Overview, Problem Definition, Initial Pose Estimation, Local Pose Estimation, Experimental Results, Summary & Conclusion, Future Work. IEEE/RSJ IROS 2008, Sept 25
Introduction: In visual servo control, object recognition and pose estimation are key tasks.
Introduction: Many systems still use artificial landmarks, which are unnatural in human environments.
Introduction: We need natural landmarks, the visual features that objects inherently have.
Introduction: Modern recognition methods: SIFT takes about 200~300 ms on a modern PC; structured light takes several seconds.
Motivation: How can these state-of-the-art recognition methods be applied to visual servo control? How can the time lag be overcome? How can the real-time issue be solved?
Related Works: Monocular model-based tracking [L. Vacchetti et al., PAMI 04]. Uses keyframe information as prior knowledge and a sparse bundle adjustment technique. Limitation: the input image should be close enough to the prior knowledge.
Related Works: Active contour tracking [G. Panin and A. Knoll, JMM 04]. A local curve fitting algorithm, initialized by SIFT keypoint matching. Limitation: potential danger when the background has the same color as the tracked object.
Our Idea: Use prior knowledge (object models): 2D images and 3D points obtained from a structured light system. Use scale invariant feature matching for accurate initialization. Use the KLT (Kanade-Lucas-Tomasi) tracker for fast local tracking.
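
The split between slow, accurate initialization and fast local tracking can be pictured as a two-state loop. The sketch below is only an illustration of that idea: the names `Mode` and `schedule`, the threshold `min_inliers=4`, and the simplification that initialization always succeeds in one frame are assumptions, not details from the slides. The switch back to initialization on too few inliers follows the re-initialization rule the deck gives under Outlier Handling.

```python
from enum import Enum


class Mode(Enum):
    INITIAL = "initial"  # slow, accurate estimation from scale invariant features
    LOCAL = "local"      # fast estimation from KLT-tracked points


def schedule(inlier_counts, min_inliers=4):
    """Return which estimator handles each frame.

    inlier_counts[i] is the number of tracked points that survive outlier
    rejection on frame i. It is ignored on frames handled by the initial
    estimator. min_inliers is a hypothetical threshold.
    """
    mode = Mode.INITIAL
    used = []
    for count in inlier_counts:
        used.append(mode)
        if mode is Mode.INITIAL:
            mode = Mode.LOCAL      # assumed: initialization succeeded
        elif count < min_inliers:
            mode = Mode.INITIAL    # tracking lost: re-initialize on next frame
    return used
```

For example, `schedule([0, 10, 9, 2, 0, 8])` initializes on the first frame, tracks locally for three frames, re-initializes once after the inlier count drops to 2, and then resumes local tracking.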
System Overview
Two Modes: Mono mode uses a mono camera and gives better computational performance. Stereo mode uses a stereo camera and gives a more accurate pose result.
Problem Definition – Mono Mode: Given 2D-3D correspondences and a calibrated mono camera, find the pose of the object with respect to the camera.
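
One common way to write the mono-mode problem down is the pinhole projection constraint below. The notation is an assumption made for illustration and does not come from the slides: X_i is a model point in the object frame, (u_i, v_i) its matched image point, K the intrinsic matrix from calibration, and s_i the unknown depth scale.

```latex
s_i \begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix}
  = K \left( R\, X_i + t \right), \qquad i = 1, \dots, n
```

The unknowns are the rotation R and translation t of the object with respect to the camera.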
Problem Definition – Stereo Mode: Given 3D-3D correspondences and a calibrated stereo camera, find the pose of the object with respect to the camera.
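
The stereo-mode problem is usually posed as a least-squares alignment of two 3D point sets. The symbols are again assumptions for illustration: X_i is a model point in the object frame and Y_i is the corresponding point measured in the camera frame.

```latex
(R^{*}, t^{*}) = \arg\min_{R,\, t} \sum_{i=1}^{n}
  \left\lVert R\, X_i + t - Y_i \right\rVert^{2}
```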
Initial Pose Estimation
Initial Pose Estimation: (1) Extract SIFT keypoints. (2) Match them with the model knowledge. (3) Estimate the initial pose. (4) Get the convex hull of the set of matched SIFT keypoints. (5) Generate KLT tracking points within the convex hull. (6) Calculate the 3D coordinates of the KLT points.
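
Steps (4) and (5) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the hull uses Andrew's monotone chain algorithm, and the candidate points are passed in by the caller (in a real system they would come from a corner detector). The function names are hypothetical.

```python
def cross(o, a, b):
    """Z component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside_hull(hull, p):
    """True if p lies inside or on a counter-clockwise convex polygon."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


def keep_inside(hull, candidates):
    """Keep only the candidate tracking points that fall within the hull."""
    return [p for p in candidates if inside_hull(hull, p)]
```

Restricting tracking points to the hull of the matched keypoints keeps them on the object, where 3D coordinates can later be assigned.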
Initial Pose Estimation: Mono mode uses the POSIT algorithm (2D-3D). Stereo mode uses the closed-form solution using unit quaternions (3D-3D). Each yields R, t.
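
For the stereo case, a closed-form unit-quaternion solution can be sketched as follows. This follows Horn's well-known absolute orientation construction. The function name and the use of power iteration to find the top eigenvector are choices made here to keep the example dependency-free, not details from the slides. It assumes at least three non-collinear correspondences.

```python
import math


def absolute_orientation(model_pts, camera_pts, iterations=500):
    """Least-squares R, t such that camera_pt ~= R * model_pt + t."""
    n = len(model_pts)
    pc = [sum(p[k] for p in model_pts) / n for k in range(3)]
    qc = [sum(q[k] for q in camera_pts) / n for k in range(3)]

    # Cross-covariance: S[a][b] = sum of centered model_a * centered camera_b.
    S = [[sum((p[a] - pc[a]) * (q[b] - qc[b])
              for p, q in zip(model_pts, camera_pts))
          for b in range(3)] for a in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = S

    # Symmetric 4x4 matrix whose top eigenvector is the quaternion (w, x, y, z).
    N = [[sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
         [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
         [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
         [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz]]

    # Power iteration on N + shift*I. The Gershgorin shift makes every
    # eigenvalue non-negative, so the iteration targets the largest
    # eigenvalue of N rather than the largest in magnitude.
    shift = max(sum(abs(v) for v in row) for row in N)
    quat = [1.0, 1.0, 1.0, 1.0]
    for _ in range(iterations):
        nxt = [sum(N[i][j] * quat[j] for j in range(4)) + shift * quat[i]
               for i in range(4)]
        norm = math.sqrt(sum(v * v for v in nxt))
        quat = [v / norm for v in nxt]
    w, x, y, z = quat

    R = [[1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
         [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
         [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)]]
    t = [qc[i] - sum(R[i][k] * pc[k] for k in range(3)) for i in range(3)]
    return R, t
```

A production version would more likely call a library eigen-solver; the power iteration here only stands in for that step.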
Initial Pose Estimation: The 3D coordinates of each KLT point are required for subsequent local pose estimation. Stereo mode: straightforward in a calibrated stereo rig; triangulate the 3D points. Mono mode: use an approximation based on knowledge of the model; get the 3D coordinates by using the three nearest neighboring SIFT points.
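
For the stereo case, triangulation is simplest when the image pair is rectified. The sketch below assumes exactly that, plus a pinhole model with a shared focal length. These assumptions, the function name, and the parameter names are illustrative and not taken from the slides.

```python
def triangulate_rectified(u_left, v_left, u_right, focal_px, baseline, cx, cy):
    """3D point in the left camera frame from one rectified stereo match.

    focal_px is the focal length in pixels, baseline the distance between
    the two cameras, and (cx, cy) the principal point.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("non-positive disparity: point cannot be triangulated")
    z = focal_px * baseline / disparity   # depth from disparity
    x = (u_left - cx) * z / focal_px      # back-project the pixel at that depth
    y = (v_left - cy) * z / focal_px
    return x, y, z
```

With a 500-pixel focal length and a 0.1 m baseline, a 25-pixel disparity corresponds to a depth of about 2 m.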
Initial Pose Estimation: (figure legend) + : SIFT points, • : KLT points
Initial Pose Estimation: Treat the surface as locally flat.
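
One way to realize the locally flat approximation is barycentric interpolation over the three nearest SIFT points. The slides do not state the exact interpolation used, so the scheme and the function name below are assumptions. The interpolation is done in image coordinates and ignores perspective effects within the triangle.

```python
def approximate_3d(klt_pt, sift_2d, sift_3d):
    """Approximate the 3D position of a KLT point from nearby SIFT points.

    sift_2d[i] is the image position of SIFT point i and sift_3d[i] is its
    known 3D coordinate. The KLT point is assumed to lie on the plane through
    its three nearest SIFT neighbours.
    """
    px, py = klt_pt
    order = sorted(range(len(sift_2d)),
                   key=lambda n: (sift_2d[n][0] - px) ** 2
                                 + (sift_2d[n][1] - py) ** 2)
    i, j, k = order[:3]
    (x1, y1), (x2, y2), (x3, y3) = sift_2d[i], sift_2d[j], sift_2d[k]

    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    if abs(det) < 1e-12:
        raise ValueError("three nearest SIFT points are collinear")

    # Barycentric weights of the KLT point in the 2D triangle.
    b = ((px - x1) * (y3 - y1) - (x3 - x1) * (py - y1)) / det
    c = ((x2 - x1) * (py - y1) - (px - x1) * (y2 - y1)) / det
    a = 1.0 - b - c

    # Apply the same weights to the 3D coordinates of the triangle corners.
    return tuple(a * sift_3d[i][d] + b * sift_3d[j][d] + c * sift_3d[k][d]
                 for d in range(3))
```

A KLT point that falls outside the triangle gets a negative weight, which extrapolates along the same plane.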
Local Pose Estimation
Local Pose Estimation: Estimate the pose from the KLT tracking points and their 3D points. The pose estimation algorithms are the same as for initial pose estimation: mono mode uses the POSIT algorithm (2D-3D); stereo mode uses the closed-form solution using unit quaternions (3D-3D). Each yields R, t.
Removing Outliers
Outlier Handling: KLT tracking points drift easily, and drifting points result in an inaccurate pose. Use RANSAC to remove outliers. Re-initialize when the number of inliers is insufficient.
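
The RANSAC loop itself is generic, so it can be illustrated with a much simpler model than a full pose. In the sketch below the model is a pure 2D shift of the tracked points between two frames. That stand-in, the function names, and the thresholds are assumptions for brevity; the system in the slides applies RANSAC to pose estimation.

```python
import math
import random


def ransac(data, fit, error, sample_size, threshold, iterations=100, seed=0):
    """Return (model, inlier_indices) for the hypothesis with the most inliers."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = fit(rng.sample(data, sample_size))
        inliers = [i for i, d in enumerate(data) if error(model, d) < threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    if best_inliers:
        # Refit on the whole consensus set for a more stable estimate.
        best_model = fit([data[i] for i in best_inliers])
    return best_model, best_inliers


def fit_shift(pairs):
    """Stand-in model: mean 2D displacement from previous to current position."""
    n = len(pairs)
    dx = sum(cur[0] - prev[0] for prev, cur in pairs) / n
    dy = sum(cur[1] - prev[1] for prev, cur in pairs) / n
    return dx, dy


def shift_error(model, pair):
    """Distance between the observed and the predicted current position."""
    (px, py), (cx, cy) = pair
    return math.hypot(cx - px - model[0], cy - py - model[1])
```

After the call, comparing `len(inliers)` with a minimum count (a hypothetical threshold chosen by the user) decides whether to keep tracking or to re-initialize.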
Tracking Results
Experiment: Mono Mode, Stereo Mode
Tracking Results - translation
Tracking Results - rotation
RMS Error: RMS errors over the whole image sequence
Computational Time: Computational times of pose estimation
Computational Time: Computational times of each module
Summary & Conclusion: A method for tracking the 3D roto-translation of rigid objects using scale invariant feature based matching and the KLT (Kanade-Lucas-Tomasi) tracker. Mono mode guarantees higher frame rate performance; stereo mode shows better pose results.
Future Work: To decrease the computational burden, use GPU-based implementations of the KLT tracker and SIFT (GPU KLT, SiftGPU). Unify with contour based tracking.
Thank you. Any questions? Any suggestions? Any comments?
