Scalable Fiducial Tag Localization on a 3D Prior Map
Via Graph-Theoretic Global Tag-Map Registration
Kenji Koide, Shuji Oishi, Masashi Yokozuka, and Atsuhiko Banno
National Institute of Advanced Industrial Science and Technology (AIST), Japan
Background
• Map-based visual localization has been attracting much attention
• It is, however, sometimes necessary to rely on visual fiducial tags
(aka visual markers) for initialization and fail-safe
[Oishi, 2020]
Motivation
• Deploying many tags on a 3D prior map is sometimes difficult and tedious
• Tag positions are often measured by hand, which takes considerable effort and gives inaccurate results
• We aim to develop an accurate and automatic method to determine tag poses
in the environment
Proposed Method
1. VIO-based Tag-Relative-Pose Estimation
We use an agile camera to observe tags in the environment and
estimate the relative poses between tags via landmark SLAM
2. Global Tag-Map Registration
We then roughly align tags and a prior map by establishing tag-plane
correspondences via graph-theoretic correspondence estimation
3. Estimation Refinement via Direct Camera-Map Alignment
Tag and camera poses are refined by directly aligning agile camera images with
the prior map, and all variables are re-optimized under all constraints
VIO-based Tag-Relative-Pose Estimation
• We use an agile camera and observe each tag in the environment at least once
• The tag poses in the VIO frame are estimated via landmark SLAM
VIO
(VINS-Mono)
Tag detections
(Apriltags)
Pose graph optimization
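As a minimal illustration of how one tag detection relates to the VIO frame, the sketch below chains a camera pose from VIO with a camera-relative tag pose from the detector. The 4x4 homogeneous-matrix convention, the names `T_vio_cam` and `T_cam_tag`, and the numeric values are assumptions made for this example. The slides describe a joint pose graph optimization over all such measurements, which this one-shot composition does not reproduce.

```python
import numpy as np

def make_pose(R, t):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def rot_z(theta):
    """Rotation by theta radians about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def tag_pose_in_vio(T_vio_cam, T_cam_tag):
    """Chain camera-in-VIO and tag-in-camera to obtain tag-in-VIO."""
    return T_vio_cam @ T_cam_tag

# Hypothetical values: camera at (1, 0, 0) rotated 90 degrees about z,
# tag detected 2 m along the camera x axis.
T_vio_cam = make_pose(rot_z(np.pi / 2), [1.0, 0.0, 0.0])
T_cam_tag = make_pose(np.eye(3), [2.0, 0.0, 0.0])
T_vio_tag = tag_pose_in_vio(T_vio_cam, T_cam_tag)
```

In a pose graph, each such detection would typically become one relative-pose constraint between a camera node and a tag node rather than a fixed value.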
Global Tag-Map Registration
• We want to align the estimated tag poses with a prior 3D map without an initial guess
• The modality difference makes it difficult to apply image matching…
Prior 3D map (sparse point cloud) Estimated tag poses (visually detected)
Align w/o initial guess
Geometry-based Tag-Plane Matching
• We assume that most tags are placed on a plane in the environment
• We establish tag-plane correspondences to determine the tag-map transformation
Detecting planes in the environment
1. Region growing segmentation
2. RANSAC plane detection
3. Fit oriented BBoxes to plane points
Plane = (center, normal, lengths)
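Steps 2 and 3 above can be sketched in a few lines of NumPy. This is a simplified illustration, not the authors' implementation: it skips the region growing step, fits a single plane, and uses PCA axes for the oriented bounding box. The function names, thresholds, and the synthetic point cloud are all assumptions made for the example.

```python
import numpy as np

def ransac_plane(points, n_iters=200, inlier_thresh=0.02, seed=0):
    """Return (normal, d, inlier_mask) for the plane n.x + d = 0 with the most inliers."""
    rng = np.random.default_rng(seed)
    best_model, best_mask = None, None
    for _ in range(n_iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-9:  # skip degenerate (collinear) samples
            continue
        n = n / norm
        d = -n @ p0
        mask = np.abs(points @ n + d) < inlier_thresh
        if best_mask is None or mask.sum() > best_mask.sum():
            best_model, best_mask = (n, d), mask
    if best_model is None:
        raise ValueError("no valid plane sample found")
    return best_model[0], best_model[1], best_mask

def plane_descriptor(points):
    """Summarize plane points as (center, normal, lengths) of a PCA-oriented box."""
    mean = points.mean(axis=0)
    # Rows of vt are principal directions, sorted by decreasing variance
    _, _, vt = np.linalg.svd(points - mean, full_matrices=False)
    axes, normal = vt[:2], vt[2]
    local = (points - mean) @ axes.T          # in-plane coordinates
    lo, hi = local.min(axis=0), local.max(axis=0)
    center = mean + ((lo + hi) / 2.0) @ axes  # midpoint of the box
    return center, normal, hi - lo

# Synthetic cloud: a 4 m x 2 m floor at z = 0 plus clutter above it
rng = np.random.default_rng(1)
floor = np.column_stack([rng.uniform(0, 4, 500), rng.uniform(0, 2, 500), np.zeros(500)])
clutter = np.column_stack([rng.uniform(-1, 5, 50), rng.uniform(-1, 5, 50), rng.uniform(0.5, 3, 50)])
cloud = np.vstack([floor, clutter])

normal, d, mask = ransac_plane(cloud)
center, box_normal, lengths = plane_descriptor(cloud[mask])
```

For a full map, the same two functions would be applied per segment after a segmentation step, producing one `(center, normal, lengths)` tuple per detected plane.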
Max-Clique-based Correspondence Estimation
• Tag-Plane Correspondence Consistency Graph
Vertex: tag-plane correspondence hypothesis
ℎ𝑖𝑗: tag i corresponds to plane j
ℎ𝑘𝑙: tag k corresponds to plane l
Edge: consistency between correspondence hypotheses
ℎ𝑖𝑗 and ℎ𝑘𝑙 are connected if ℎ𝑖𝑗 does not contradict ℎ𝑘𝑙 (i.e., they are consistent)
• The largest subset of mutually consistent hypotheses (i.e., the maximum clique)
gives the best explanation for the tag placement in the given map
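To make the maximum-clique idea concrete, here is a small pure-Python sketch that uses Bron-Kerbosch with pivoting on a toy consistency graph. It is not the efficient solver the slides cite, and its worst-case cost is exponential. The hypothesis labels (tag index, plane name) and the edge list are hypothetical.

```python
def maximum_clique(adjacency):
    """Return a largest clique of an undirected graph given as {vertex: set(neighbors)}."""
    best = []

    def expand(clique, candidates, excluded):
        nonlocal best
        if not candidates and not excluded:
            if len(clique) > len(best):
                best = list(clique)
            return
        if len(clique) + len(candidates) <= len(best):
            return  # this branch cannot beat the current best
        # Pivot on the vertex with the most neighbors among the candidates
        pivot = max(candidates | excluded, key=lambda v: len(adjacency[v] & candidates))
        for v in list(candidates - adjacency[pivot]):
            expand(clique + [v], candidates & adjacency[v], excluded & adjacency[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand([], set(adjacency), set())
    return best

# Toy graph: each vertex is a hypothesis (tag index, plane name).
# Three hypotheses agree with each other; two others agree only pairwise.
edges = [
    ((0, "A"), (1, "B")), ((0, "A"), (2, "C")), ((1, "B"), (2, "C")),
    ((0, "B"), (1, "A")),
]
adjacency = {}
for u, v in edges:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

clique = maximum_clique(adjacency)
```

Hypotheses that assign the same tag to two different planes are never connected in this toy graph, so a clique cannot contain both.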
Tag-Plane Correspondence Consistency
• Consistency between tag-plane correspondence hypotheses is determined
by a geometric consistency check
• We align tag i with plane j and then measure the distance between tag k and plane l
• If the normal and translation errors between tag k and plane l are smaller than
the thresholds, the two hypotheses are mutually consistent
(Figure: planes j and l, annotated with the normal error and translation error)
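The threshold test can be sketched as follows, assuming the alignment implied by the first hypothesis is already available as a rotation `R` and translation `t` (how that alignment is computed is not shown here). Defining the translation error as a point-to-plane distance, ignoring the sign of the plane normal, and the threshold values are all assumptions made for this example.

```python
import numpy as np

def hypothesis_errors(R, t, tag_position, tag_normal, plane_center, plane_normal):
    """Normal and translation errors of a tag against a plane after applying (R, t)."""
    n = R @ tag_normal
    p = R @ tag_position + t
    # abs(): treat the plane normal as sign-ambiguous (assumption)
    cos_angle = np.clip(abs(n @ plane_normal), 0.0, 1.0)
    normal_error = np.arccos(cos_angle)
    translation_error = abs(plane_normal @ (p - plane_center))  # point-to-plane distance
    return normal_error, translation_error

def is_consistent(R, t, tag, plane, max_angle=np.deg2rad(10.0), max_dist=0.1):
    """True if both errors fall below their (hypothetical) thresholds."""
    normal_error, translation_error = hypothesis_errors(R, t, *tag, *plane)
    return bool(normal_error < max_angle and translation_error < max_dist)

# Hypothetical data: the alignment from the first hypothesis is the identity here
R, t = np.eye(3), np.zeros(3)
tag_k = (np.array([1.0, 0.02, 0.5]), np.array([0.0, 1.0, 0.0]))        # (position, normal)
wall = (np.array([0.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0]))          # (center, normal)
side_wall = (np.array([3.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0]))

ok = is_consistent(R, t, tag_k, wall)        # 2 cm off the wall, normals agree
bad = is_consistent(R, t, tag_k, side_wall)  # normals differ by 90 degrees
```

Each consistent pair found this way would add one edge to the consistency graph.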
Example Result
(Figure legend: planes, tags)
• The consistency graph contains 429,735 hypothesis pairs
• The maximum clique consists of 56 tag-plane correspondences and was found in 92 msec
• While the consistency graph contains many edges,
the max-clique can be found very efficiently [Rossi, 2015]
• Given the tag-plane correspondences, we estimate the tag-map transformation
by minimizing normal-to-normal ICP distance [Rusinkiewicz, 2019]
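For intuition about how correspondences determine the tag-map transformation, the sketch below uses a closed-form stand-in rather than the normal-to-normal ICP minimization named above: rotation from a Kabsch alignment of the normals, then translation from a least-squares point-to-plane fit. It assumes tag and plane normals have consistent signs and that at least three matched planes have linearly independent normals. All names and data are hypothetical.

```python
import numpy as np

def estimate_tag_map_transform(tag_positions, tag_normals, plane_centers, plane_normals):
    """Estimate (R, t) that maps tag-frame positions and normals onto matched planes."""
    # Rotation: Kabsch alignment of tag normals to plane normals
    H = tag_normals.T @ plane_normals
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    # Translation: solve n_l . (R p + t - c_l) = 0 for all pairs in least squares
    rhs = np.einsum("ij,ij->i", plane_normals, plane_centers - tag_positions @ R.T)
    t, *_ = np.linalg.lstsq(plane_normals, rhs, rcond=None)
    return R, t

def rot_z(theta):
    """Rotation by theta radians about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Synthetic check with a known ground-truth transform
R_true, t_true = rot_z(np.deg2rad(30.0)), np.array([1.0, -2.0, 0.5])
plane_normals = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0],
                          [1.0 / np.sqrt(2), 1.0 / np.sqrt(2), 0.0]])
plane_centers = np.array([[2.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]])
# Tags lie on their planes, offset from the plane centers within each plane
offsets = np.array([[0.0, 0.3, 0.2], [0.5, 0.0, -0.1], [0.2, 0.4, 0.0], [0.1, -0.1, 0.3]])
tags_in_map = plane_centers + offsets
tag_positions = (tags_in_map - t_true) @ R_true   # rows are R_true^T (p - t_true)
tag_normals = plane_normals @ R_true              # rows are R_true^T n

R_est, t_est = estimate_tag_map_transform(tag_positions, tag_normals, plane_centers, plane_normals)
```

An iterative minimization would typically start from an estimate like this and refine it.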
Estimation Refinement
• We refine the tag poses by directly aligning agile camera images with the map
VIO
Tag detections
Pose graph
Direct alignment
• We use the normalized information distance (NID), a mutual information-based
cross-modal metric, to maximize the co-occurrence of pixel and map intensity values
• Tag and camera poses are re-optimized under all the constraints
(Figure: agile camera image; map rendered with the optimized camera pose)
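The NID itself can be computed from a joint intensity histogram. The sketch below shows one common formulation, NID = (H(A,B) - I(A;B)) / H(A,B), on two equally sized intensity arrays. The bin count and the 0-255 intensity range are assumptions, and rendering the map and optimizing over the camera pose are outside its scope.

```python
import numpy as np

def normalized_information_distance(a, b, bins=16):
    """NID(A, B) = (H(A, B) - I(A; B)) / H(A, B), estimated from a joint histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins,
                                 range=[[0, 256], [0, 256]])
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1)  # marginal of a
    p_b = p_ab.sum(axis=0)  # marginal of b

    def entropy(p):
        p = p[p > 0]
        return float(-(p * np.log(p)).sum())

    h_ab = entropy(p_ab.ravel())
    if h_ab == 0.0:
        return 0.0  # both inputs are constant; treat them as identical
    mutual_info = entropy(p_a) + entropy(p_b) - h_ab
    return (h_ab - mutual_info) / h_ab

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64))
unrelated = rng.integers(0, 256, size=(64, 64))

nid_same = normalized_information_distance(image, image)
nid_inverted = normalized_information_distance(image, 255 - image)
nid_unrelated = normalized_information_distance(image, unrelated)
```

The inverted image scores the same as the original because NID depends only on how intensities co-occur, not on their values, which is what makes it usable across modalities.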
Evaluation in Simulation
• The method is evaluated on the Replica dataset [Savva, 2019]
Global tag-map registration
: 0.039m / 1.021°
Tag localization accuracy
: 98% success rate
Baseline (FPFH+RANSAC/Teaser) : 26% and 70%
Robustness to outlier tags
Evaluation in Real Environment
• 117 tags were placed in the environment
• Tag poses were estimated in 22 minutes (16 min for VIO recording, 6 min for post-processing)
• Average tag pose error: 0.019m and 2.382°
Final estimation result
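The slides do not say how the pose error is defined. One common choice, shown here as an assumption, is the norm of the translation difference together with the angle of the relative rotation. The poses in the example are hypothetical.

```python
import numpy as np

def pose_error(T_est, T_gt):
    """Return (translation error in meters, rotation error in degrees) for 4x4 poses."""
    dT = np.linalg.inv(T_gt) @ T_est
    trans_err = float(np.linalg.norm(dT[:3, 3]))
    # Rotation angle from the trace of the relative rotation matrix
    cos_angle = np.clip((np.trace(dT[:3, :3]) - 1.0) / 2.0, -1.0, 1.0)
    return trans_err, float(np.degrees(np.arccos(cos_angle)))

# Hypothetical estimate: 2 degrees off about z and a few centimeters away
theta = np.deg2rad(2.0)
T_gt = np.eye(4)
T_est = np.eye(4)
T_est[:3, :3] = [[np.cos(theta), -np.sin(theta), 0.0],
                 [np.sin(theta), np.cos(theta), 0.0],
                 [0.0, 0.0, 1.0]]
T_est[:3, 3] = [0.01, 0.02, 0.0]

trans_err, rot_err = pose_error(T_est, T_gt)
```

Averaging these two values over all tags gives summary figures of the same kind as those reported above.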
Thank you for your attention!!
Conclusion
• An accurate and scalable method for fiducial tag localization on a 3D prior
environmental map is proposed
• VIO-based tag relative pose estimation via landmark SLAM
• Global tag-map registration based on tag-plane correspondence estimation
via maximum clique finding
• Estimation refinement via NID-based direct camera-map alignment
• The proposed method could localize over 100 tags in 22 minutes
• The average tag localization error was about 2 cm

Scalable Fiducial Tag Localization on a 3D Prior Map via Graph-Theoretic Global Tag-Map Registration [IROS2022]
