Robot Vision for the Visually Impaired
Presentation Transcript

  • Robot Vision for the Visually Impaired
      Vivek Pradeep, Gerard Medioni, James Weiland
      Presented by Phongsathorn Eakamongul, Department of Computer Science, Asian Institute of Technology
      December 7, 2010
      Phongsathorn (AIT) Robot Vision for the Visually Impaired Short Occasion 1 / 18
  • Outline
      1. Abstract
      2. System Description
      3. Result
  • Abstract
      • Head-mounted design: provides wide-field information, in contrast to the shoulder- or waist-mounted designs in the literature, which require body rotations.
      • Stereo-vision navigational assistance device.
      • Visual odometry: dense 3D with 2D elevation grids.
      • Metric-topological SLAM.
      • Builds a vicinity map.
      • 3D traversability analysis steers subjects away from obstacles in the path.
      • Microvibration motors provide cues for taking evasive action. Tactile cues are used instead of audio because audio imposes a greater cognitive load on the subject, and blind users rely on hearing to perform a wide variety of other tasks.
      • The experiment runs at 10 Hz.
  • Introduction
      • People with visual impairment need a long cane or a guide dog.
      • In the US, 109,000 people use long canes and 7,000 use dog guides; only 1,500 graduate from a dog-guide user program.
      • Electronic travel aids (ETAs) leverage ultrasonic, laser, or vision sensors.
  • A wearable array of microvibration motors provides tactile cues and guides the user along a safe path.
  • Online SLAM + obstacle detection
  • Stereo Vision Odometry
      • Camera motion can be computed from correspondences matched across $(P_L^{t-1}, P_R^{t-1}, P_L^{t})$ or $(P_L^{t-1}, P_R^{t-1}, P_R^{t})$, using the three-point algorithm in a RANSAC setting.
      • For robustness, feature matching and reprojection errors are measured across four views.
      • Sparse bundle adjustment.
      • Feature covariances can be propagated to obtain the motion uncertainty used in the SLAM filter.
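The RANSAC setting mentioned on this slide can be sketched as a hypothesize-and-verify loop. This is a minimal illustration only: a toy 2D rigid-motion solver (two point pairs) is swapped in for the three-point pose solver, and the function names, tolerance, and iteration count are assumptions that do not come from the slides.

```python
import math
import random

def fit_rigid_2d(src_pair, dst_pair):
    """Rotation angle and translation mapping two source points onto two targets."""
    (ax, ay), (bx, by) = src_pair
    (cx, cy), (dx, dy) = dst_pair
    theta = math.atan2(dy - cy, dx - cx) - math.atan2(by - ay, bx - ax)
    c, s = math.cos(theta), math.sin(theta)
    return theta, cx - (c * ax - s * ay), cy - (s * ax + c * ay)

def apply_rigid_2d(model, point):
    """Rotate the point by theta, then translate it by (tx, ty)."""
    theta, tx, ty = model
    c, s = math.cos(theta), math.sin(theta)
    return c * point[0] - s * point[1] + tx, s * point[0] + c * point[1] + ty

def ransac_rigid_2d(src, dst, iters=200, tol=0.05, seed=0):
    """Keep the minimal-sample hypothesis that explains the most correspondences."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iters):
        # Hypothesize: fit a motion to a random minimal sample (2 pairs here).
        i, j = rng.sample(range(len(src)), 2)
        model = fit_rigid_2d((src[i], src[j]), (dst[i], dst[j]))
        # Verify: count correspondences whose transfer error is below tol.
        inliers = []
        for k, (p, q) in enumerate(zip(src, dst)):
            px, py = apply_rigid_2d(model, p)
            if math.hypot(px - q[0], py - q[1]) < tol:
                inliers.append(k)
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

In a real stereo pipeline the minimal solver would estimate a 3D pose and the error would be a reprojection error in pixels; the loop structure stays the same.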
  • SLAM
      • Rao-Blackwellised particle filter (RBPF) in a FastSLAM framework, which uses KLT and SIFT tracking.
      • Two maps are constructed:
          • SLAM map: a collection of sparse landmarks that is propagated every frame to yield consistent camera pose estimates; used for SLAM purposes only.
          • Traversability map: a dense 3D cloud from triangulation.
  • Metric-Topological SLAM
      • Environments with several thousand landmarks.
      • Two levels of environment representation.
      • Local, metric level (submap): estimates state information
          • six-dimensional camera trajectory $s_t$
          • sparse map $m_t$
          • feature observations (KLT/SIFT) $z^t$
          • camera motion estimates $u^t$
      • RBPF: $p(s_t, m_t \mid z^t, u^t) \approx p(s_t \mid z^t, u^t) \prod_i p(m_t(i) \mid s_t, z^t, u^t)$
          • $m_t(i)$: the $i$th landmark in the map, represented by $N(\mu_i, \sigma_i)$
          • Each time a feature is observed, the corresponding landmark is updated using an EKF.
          • The RBPF lets us update only the observed landmark instead of the whole map.
      • Global level: the topological map is represented as a collection of submaps in an annotated graph $G = (\{{}^{i}M\}_{i \in \Omega_t}, \{{}^{b}_{a}\Lambda\}_{a,b \in \Omega_t})$
          • ${}^{i}M$: annotated submaps
          • $\Omega_t$: set of computed submaps
          • ${}^{b}_{a}\Lambda$: coordinate transformations between adjacent maps
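The per-landmark update described on this slide can be illustrated with a deliberately simplified sketch. Each landmark is reduced to a scalar Gaussian (mean, variance) and the measurement is assumed to observe the landmark directly, so the EKF update collapses to a scalar Kalman update. The function names and the noise value are hypothetical.

```python
def update_landmark(mean, var, z, meas_var):
    """Scalar Kalman update of one landmark from a direct observation z."""
    gain = var / (var + meas_var)
    new_mean = mean + gain * (z - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var

def update_map(landmarks, observations, meas_var=0.04):
    """Update only the observed landmarks; all others are copied unchanged.

    landmarks:    dict landmark_id -> (mean, var)
    observations: dict landmark_id -> measured value
    """
    updated = dict(landmarks)
    for landmark_id, z in observations.items():
        if landmark_id in updated:
            updated[landmark_id] = update_landmark(*updated[landmark_id], z, meas_var)
    return updated
```

Because the landmarks are conditionally independent given the camera trajectory, the cost of one update depends on the number of observed landmarks rather than on the size of the map.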
  • Traversability Map
      • Sphere of radius 5
      • Multi-surface elevation map: the point cloud is quantized into a 2D grid.
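Quantizing a point cloud into a 2D grid can be sketched as follows. The sketch assumes that z is the vertical axis and that cells are square; the cell size and the function name are illustrative choices, not values from the slides.

```python
import math
from collections import defaultdict

def quantize_to_grid(points, cell_size=0.2):
    """Bin (x, y, z) points into 2D cells; each cell collects the heights it saw."""
    grid = defaultdict(list)
    for x, y, z in points:
        # The cell index is the floor of the horizontal coordinate over the cell size.
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        grid[cell].append(z)
    return dict(grid)
```

Keeping every height per cell, rather than a single maximum, is what allows several surfaces (for example a floor and an overhanging object) to be represented in the same cell.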
  • Motion Prediction and Cue Generation
      • If the magnitude of the translation with respect to the previous position exceeds a certain threshold, the direction of motion and the reference position are updated.
      • Little translation: no update.
      • Cue generation: the most continuous traversable path (green in the picture).
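A minimal sketch of the two steps on this slide, under stated assumptions: positions are 2D, the threshold value is arbitrary, and "most continuous traversable path" is interpreted here as the longest contiguous run of traversable direction sectors. Both function names are hypothetical.

```python
import math

def update_heading(reference, heading, position, threshold=0.25):
    """Update the reference position and heading only after enough translation."""
    dx = position[0] - reference[0]
    dy = position[1] - reference[1]
    if math.hypot(dx, dy) > threshold:
        return position, math.atan2(dy, dx)
    return reference, heading  # little translation: no update

def longest_traversable_run(sectors):
    """Return the half-open index range [start, end) of the longest run of True."""
    best = (0, 0)
    start = None
    # A trailing False closes a run that reaches the end of the list.
    for i, ok in enumerate(list(sectors) + [False]):
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start > best[1] - best[0]:
                best = (start, i)
            start = None
    return best
```

Ignoring small translations keeps the estimated direction of motion from jittering while the user stands still and scans the scene with the head-mounted camera.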
  • Result
      • Green: traversable
      • Red: not traversable
  • Error of the camera's frame-to-frame heading (yaw), compared with readings from a commercial inertial measurement unit (IMU)
      • Camera motion: slow (< 5 degrees/s), medium (5-20 degrees/s), fast (20-30 degrees/s)
  • SLAM result
  • Traversability Map
      • One frame exp.
      • A patch with thickness > 30 cm is labeled as vertical.
      • 5 horizontal patches are labeled as traversable.
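The thickness rule can be sketched as below. Only the 30 cm threshold comes from the slide; grouping a cell's heights into patches by a maximum vertical gap is an assumption made for illustration, as are the function names. The slide's rule for marking cells as traversable is not reproduced here.

```python
def split_into_patches(heights, gap=0.1):
    """Group a cell's heights into (bottom, top) patches; a large gap starts a new patch."""
    patches = []
    for h in sorted(heights):
        if patches and h - patches[-1][1] <= gap:
            patches[-1][1] = h  # extend the current patch upward
        else:
            patches.append([h, h])
    return [(bottom, top) for bottom, top in patches]

def label_patch(patch, vertical_thickness=0.30):
    """Label a patch 'vertical' if it is thicker than 30 cm, else 'horizontal'."""
    bottom, top = patch
    return "vertical" if top - bottom > vertical_thickness else "horizontal"
```

For example, a cell containing wall points spread over 40 cm of height yields one thick patch labeled vertical, while a cell containing only floor points yields a thin patch labeled horizontal.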
  • Experiment
      • Manually generated cues: wireless remote control
      • Autonomously generated cues, like group 4