AR TECHNOLOGY
COMP 4010 Lecture Three
Mark Billinghurst
August 11th 2022
mark.billinghurst@unisa.edu.au
REVIEW
How do We Perceive Reality?
• We understand the world through
our senses:
• Sight, Hearing, Touch, Taste, Smell
(and others..)
• Two basic processes:
• Sensation – Gathering information
• Perception – Interpreting information
Simple Sensing/Perception Model
Reality vs. Virtual Reality
• In a VR system there are input and output devices
between human perception and action
Presence ..
“The subjective experience of being in one place or
environment even when physically situated in another”
Witmer, B. G., & Singer, M. J. (1998). Measuring presence in virtual environments: A presence
questionnaire. Presence: Teleoperators and virtual environments, 7(3), 225-240.
Slater, M., Banakou, D., Beacco, A., Gallego, J., Macia-Varela, F., & Oliva, R. (2022). A Separate Reality: An
Update on Place Illusion and Plausibility in Virtual Reality. Frontiers in Virtual Reality, 81.
Four Illusions of Presence (Slater 2022)
• Place Illusion: being in the place
• Plausibility Illusion: events are real
• Body Ownership: seeing your body in VR
• Copresence/Social Presence: other people are in VR
Senses
• How an organism obtains information for perception:
• Sensation part of Somatic Division of Peripheral Nervous System
• Integration and perception requires the Central Nervous System
• Five major senses (but there are more..):
• Sight (Ophthalmoception)
• Hearing (Audioception)
• Taste (Gustaoception)
• Smell (Olfacoception)
• Touch (Tactioception)
The Human Visual System
• Purpose is to convert visual input to signals in the brain
Comparison between Eyes and HMD
Sound Localization
• Humans have two ears
• localize sound in space
• Sound can be localized
using 3 coordinates
• Azimuth, elevation,
distance
Haptic Sensation
• Somatosensory System
• complex system of nerve cells that responds to changes to
the surface or internal state of the body
• Skin is the largest organ
• 1.3-1.7 square m in adults
• Tactile: Surface properties
• Receptors not evenly spread
• Most densely populated area is the tongue
• Kinesthetic: Muscles, Tendons, etc.
• Also known as proprioception
Proprioception/Kinaesthesia
• Proprioception (joint position sense)
• Awareness of movement and positions of body parts
• Due to nerve endings and Pacinian and Ruffini corpuscles at joints
• Enables us to touch nose with eyes closed
• Joints closer to body more accurately sensed
• Users know hand position accurate to 8cm without looking at them
• Kinaesthesia (joint movement sense)
• Sensing muscle contraction or stretching
• Cutaneous mechanoreceptors measuring skin stretching
• Helps with force sensation
Augmented Reality Technology
• Combines Real and Virtual Images
• Needs: Display technology
• Interactive in real-time
• Needs: Input and interaction technology
• Registered in 3D
• Needs: Viewpoint tracking technology
AR Display Technologies
• Classification (Bimber/Raskar 2005)
• Head attached
• Head mounted display/projector
• Body attached
• Handheld display/projector
• Spatial
• Spatially aligned projector/monitor
Types of Head Mounted Displays
• Occluded
• See-through
• Multiplexed
Optical see-through Head-Mounted Display
• Virtual images from monitors are merged with the real-world view through optical combiners
Video see-through HMD
• Video cameras capture the real world; a combiner merges the rendered graphics with the video, and the result is shown on monitors in front of the eyes
Handheld AR
• Camera + display = handheld AR
• Mobile phone/Tablet display
Spatial Augmented Reality
• Project onto irregular surfaces
• Geometric Registration
• Projector blending, High dynamic range
• Book: Bimber, Raskar “Spatial Augmented Reality”
2: AR TRACKING
AR Requires Tracking and Registration
• Registration
• Positioning virtual object wrt real world
• Fixing virtual object on real object when view is fixed
• Calibration
• Offline measurements
• Measure camera relative to head mounted display
• Tracking
• Continually locating the user’s viewpoint when view moving
• Position (x,y,z), Orientation (r,p,y)
REGISTRATION AND CALIBRATION
Coordinate Systems
• Local object coordinates → global world coordinates → eye coordinates
• Model transformation (object → world)
• Track for moving objects, if there are static objects as well
• View transformation (world → eye)
• Track for moving objects, if there are no static objects
• Track for moving observer
• Perspective transformation (eye → screen)
• Calibrate offline
• For both camera and display
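To make the transformation chain concrete, here is a minimal numpy sketch (all matrix values below are illustrative assumptions, not from a real tracker or display calibration):

```python
import numpy as np

def translation(tx, ty, tz):
    """4x4 homogeneous translation matrix."""
    T = np.eye(4)
    T[:3, 3] = [tx, ty, tz]
    return T

def perspective(fov_y_deg, aspect, near, far):
    """OpenGL-style perspective projection (calibrated offline in AR)."""
    f = 1.0 / np.tan(np.radians(fov_y_deg) / 2)
    P = np.zeros((4, 4))
    P[0, 0] = f / aspect
    P[1, 1] = f
    P[2, 2] = (far + near) / (near - far)
    P[2, 3] = 2 * far * near / (near - far)
    P[3, 2] = -1.0
    return P

M = translation(0.0, 0.0, -0.5)       # model: local object -> world (tracked if object moves)
V = translation(0.0, -1.6, 0.0)       # view: world -> eye (tracked for a moving observer)
P = perspective(60.0, 4 / 3, 0.1, 100.0)  # perspective: eye -> clip (calibrated offline)

p_local = np.array([0.0, 0.0, 0.0, 1.0])  # point at the object origin
p_clip = P @ V @ M @ p_local              # the full transformation chain
print(p_clip[:3] / p_clip[3])             # normalized device coordinates
```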
Spatial Registration
The Registration Problem
• Virtual and Real content must stay properly aligned
• If not:
• Breaks the illusion that the two coexist
• Prevents acceptance of many serious applications
(Figure: registration drift between t = 0 s and t = 0.5 s)
Sources of Registration Errors
•Static errors
• Optical distortions (in HMD)
• Mechanical misalignments
• Tracker errors
• Incorrect viewing parameters
•Dynamic errors
• System delays (largest source of error)
• 1 ms delay = 1/3 mm registration error
Reducing Static Errors
•Distortion compensation
• For lens or display distortions
•Manual adjustments
• Have user manually align AR and VR content
•View-based or direct measurements
• Have user measure eye position
•Camera calibration (video AR)
• Measuring camera properties
View Based Calibration (Azuma 94)
(Figure: uncalibrated vs. calibrated view, showing the benefit of calibration)
Dynamic errors
• Application loop: Tracking (20 Hz = 50 ms) → Calculate Viewpoint / Simulation (500 Hz = 2 ms) → Render Scene (30 Hz = 33 ms) → Draw to Display (60 Hz = 17 ms), with pose (x,y,z / r,p,y) flowing through
• Total delay = 50 + 2 + 33 + 17 = 102 ms
• 1 ms delay = 1/3 mm → ≈34 mm registration error
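A quick check of this arithmetic (a minimal Python sketch using only the slide's numbers):

```python
# Reproducing the slide's latency arithmetic (values from the slide).
stage_ms = {
    "tracking (20 Hz)": 50,
    "calculate viewpoint / simulation (500 Hz)": 2,
    "render scene (30 Hz)": 33,
    "draw to display (60 Hz)": 17,
}
total_ms = sum(stage_ms.values())   # 102 ms end-to-end delay

# Rule of thumb: 1 ms of delay ~ 1/3 mm of registration error
error_mm = total_ms / 3
print(f"{total_ms} ms delay -> ~{error_mm:.0f} mm registration error")
# 102 ms delay -> ~34 mm registration error
```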
Reducing dynamic errors (1)
•Reduce system lag
•Faster components/system modules
•Reduce apparent lag
•Image deflection
•Image warping
Reducing System Lag
• Application loop: Tracking → Calculate Viewpoint / Simulation → Render Scene → Draw to Display
• Speed up every stage: faster tracker, faster CPU, faster GPU, faster display
Reducing Apparent Lag
• Render into an oversized virtual display (e.g. 1280 × 960) using the last known tracker position
• Just before drawing, select the physical display window (640 × 480) from the virtual display using the latest tracker position
• The tracking update (x,y,z / r,p,y) runs in parallel with the normal application loop (Tracking → Calculate Viewpoint → Simulation → Render Scene → Draw to Display)
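A minimal sketch of this image-deflection idea, assuming a plain numpy framebuffer and a pixel offset already derived from the latest orientation reading (both hypothetical):

```python
import numpy as np

def deflect(framebuffer, offset_px, out_w=640, out_h=480):
    """Image deflection: the scene was rendered into an oversized virtual
    display using the last known pose; just before scan-out we crop the
    physical-display window at a position corrected by the latest tracker
    reading, hiding most of the rendering latency. offset_px is assumed
    to come from the head motion since the frame was rendered."""
    h, w = framebuffer.shape[:2]
    x = int(np.clip(w // 2 - out_w // 2 + offset_px[0], 0, w - out_w))
    y = int(np.clip(h // 2 - out_h // 2 + offset_px[1], 0, h - out_h))
    return framebuffer[y:y + out_h, x:x + out_w]

virtual_display = np.zeros((960, 1280, 3), dtype=np.uint8)  # 1280x960 render
frame = deflect(virtual_display, offset_px=(12, -4))        # 640x480 window
print(frame.shape)  # (480, 640, 3)
```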
Reducing dynamic errors (2)
• Match video + graphics input streams (video AR)
• Delay video of real world to match system lag
• User doesn’t notice
• Predictive Tracking
• Inertial sensors helpful
Azuma / Bishop 1994
Predictive Tracking
• Plot position against time: use the past trajectory to extrapolate into the future
• Can predict up to 80 ms in the future (Holloway)
Predictive Tracking (Azuma 94)
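As one illustration, a constant-velocity extrapolator (a minimal sketch; real predictors such as Azuma's also use inertial sensors and handle orientation):

```python
import numpy as np

def predict(p_prev, p_now, dt_ms, lookahead_ms):
    """Constant-velocity extrapolation: estimate a future position from
    the last two tracker samples."""
    velocity = (p_now - p_prev) / dt_ms       # units per millisecond
    return p_now + velocity * lookahead_ms    # extrapolate ahead

p_prev = np.array([0.00, 1.60, 0.00])   # head position 20 ms ago (m)
p_now = np.array([0.01, 1.60, 0.00])    # current head position (m)
print(predict(p_prev, p_now, dt_ms=20.0, lookahead_ms=50.0))
# -> [0.035 1.6 0.0]: position expected 50 ms from now
```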
TRACKING
Frames of Reference
• World-stabilized
• E.g., billboard or signpost
• Body-stabilized
• E.g., virtual tool-belt
• Screen-stabilized
• Heads-up display
Tracking Requirements
• Augmented Reality Information Display
• World Stabilized
• Body Stabilized
• Head Stabilized
• Increasing tracking requirements: head stabilized → body stabilized → world stabilized
Tracking Technologies
§ Active
• Mechanical, Magnetic, Ultrasonic
• GPS, Wifi, cell location
§ Passive
• Inertial sensors (compass, accelerometer, gyro)
• Computer Vision
• Marker based, Natural feature tracking
§ Hybrid Tracking
• Combined sensors (e.g. Vision + Inertial)
Tracking Types
• Mechanical Tracker
• Magnetic Tracker
• Inertial Tracker
• Ultrasonic Tracker
• Optical Tracker, divided into Marker-Based Tracking and Markerless Tracking
• Markerless tracking further divides into Specialized, Edge-Based, Template-Based, and Interest Point Tracking
Mechanical Tracker
• Idea: mechanical arms with joint sensors
• ++: high accuracy, haptic feedback
• −−: cumbersome, expensive
Microscribe
Magnetic Tracker
• Idea: a coil generates current when moved in a magnetic field; measuring the current gives position and orientation relative to the magnetic source
• ++: 6DOF, robust
• −−: wired, sensitive to metal, noisy, expensive
Flock of Birds (Ascension)
Inertial Tracker
• Idea: measure linear acceleration and angular rotation rates (accelerometer/gyroscope)
• ++: no transmitter, cheap, small, high frequency, wireless
• −−: drifts over time, hysteresis effect, only 3DOF
IS300 (Intersense), Wii Remote
Ultrasonic Tracker
• Idea: time of flight or phase coherence of sound waves
• ++: small, cheap
• −−: 3DOF, line of sight, low resolution, affected by environmental conditions (pressure, temperature)
Logitech IS600
Global Positioning System (GPS)
• Created by US in 1978
• Currently 29 satellites
• Satellites send position + time
• GPS Receiver positioning
• 4 satellites need to be visible
• Differential time of arrival
• Triangulation
• Accuracy
• 5-30m+, blocked by weather, buildings etc.
Mobile Sensors
• Inertial compass
• Earth’s magnetic field
• Measures absolute orientation
• Accelerometers
• Measure acceleration along an axis
• Used for tilt, relative rotation
• Can drift over time
OPTICAL TRACKING
Why Optical Tracking for AR?
• Many AR devices have cameras
• Mobile phone/tablet, Video see-through display
• Provides precise alignment between video and AR overlay
• Using features in video to generate pixel perfect alignment
• The real world has many visual features that can be tracked
• Computer Vision well established discipline
• Over 40 years of research to draw on
• Old non-real-time algorithms can now run in real time on today’s devices
Common AR Optical Tracking Types
• Marker Tracking
• Tracking known artificial markers/images
• e.g. ARToolKit square markers
• Markerless Tracking
• Tracking from known features in real world
• e.g. Vuforia image tracking
• Unprepared Tracking
• Tracking in unknown environment
• e.g. SLAM tracking
Visual Tracking Approaches
• Marker based tracking with artificial features
• Make a model before tracking
• Model based tracking with natural features
• Acquire a model before tracking
• Simultaneous localization and mapping
• Build a model while tracking it
Marker tracking
• Available for more than 10 years
• Several open source solutions exist
• ARToolKit, ARTag, ATK+, etc
• Fairly simple to implement
• Standard computer vision methods
• A rectangle provides 4 corner points
• Enough for pose estimation!
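To see why four corners suffice, here is a minimal pose-estimation sketch using OpenCV's solvePnP (the corner pixel coordinates and camera intrinsics below are made-up values; in a real system they come from the marker detector and offline calibration):

```python
import numpy as np
import cv2

# Four corners of a square marker in the marker's own frame (metres),
# in the order required by SOLVEPNP_IPPE_SQUARE.
marker_size = 0.08
s = marker_size / 2
object_pts = np.array([[-s,  s, 0], [ s,  s, 0],
                       [ s, -s, 0], [-s, -s, 0]], dtype=np.float32)

# Corner pixels reported by the marker detector (hypothetical values)
image_pts = np.array([[310, 210], [410, 215],
                      [405, 315], [305, 310]], dtype=np.float32)

# Intrinsics from offline camera calibration (hypothetical values)
K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], dtype=np.float32)
dist = np.zeros(5)  # assume lens distortion already compensated

# Four 3D-2D correspondences are enough to recover the 6-DOF pose
ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, dist,
                              flags=cv2.SOLVEPNP_IPPE_SQUARE)
print(ok, tvec.ravel())  # marker position in camera coordinates
```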
Demo: ARToolKit
Marker Based Tracking: ARToolKit
http://www.artoolkit.org
Tracking Challenges in ARToolKit
• False positives and inter-marker confusion (image by M. Fiala)
• Image noise (e.g. poor lens, block coding / compression, neon tube)
• Unfocused camera, motion blur
• Dark/unevenly lit scene, vignetting
• Jittering (Photoshop illustration)
• Occlusion (image by M. Fiala)
Other Marker Tracking Libraries
Marker Target Identification
• More targets or features → more easily confused
• Must be as unique as possible
• Square markers
• 2D barcodes with error correction
• E.g., 6×6 = 36 bits (2 orientation, 6–12 payload, rest for error correction)
• Marker tapestries
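To make the error-correction idea concrete, here is a toy decoder (the 6×6 bit layout below is hypothetical, not ARToolKit's or ARTag's actual format):

```python
import numpy as np

def decode_marker(bits):
    """Toy square-marker decoder: two fixed orientation bits resolve
    which of the 4 rotations we are seeing, and a parity check rejects
    corrupted reads, reducing false positives."""
    for rot in range(4):                         # marker may appear rotated
        grid = np.rot90(bits, rot)
        if grid[0, 0] == 1 and grid[0, 5] == 0:  # orientation bits match?
            payload = grid[1:5, 1:5].flatten()   # 16 payload bits
            if grid[5, :].sum() % 2 == payload.sum() % 2:  # crude parity
                return int("".join(map(str, payload)), 2)  # marker ID
    return None  # no orientation/parity match -> reject this candidate

sampled = np.zeros((6, 6), dtype=int)  # bits sampled from the warped image
print(decode_marker(sampled))          # None: fails the orientation check
```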
But: you can’t cover the world with ARToolKit markers!
Markerless Tracking
• No more markers! → Markerless Tracking
• Same taxonomy as before, now following the markerless branch of optical tracking: Specialized, Edge-Based, Template-Based, and Interest Point Tracking
Natural Feature Tracking
• Use Natural Cues of Real Elements
• Edges
• Surface Texture
• Interest Points
• Model or Model-Free
• No visual pollution
(Examples: contours, feature points, surfaces)
Texture Tracking
Natural Features
• Detect salient interest points in image
• Must be easily found
• Location in image should remain stable
when viewpoint changes
• Requires textured surfaces
• Alternative: can use edge features (less discriminative)
• Match interest points to tracking model database
• Database filled with results of 3D reconstruction
• Matching entire (sub-)images is too costly
• Typically interest points are compiled into “descriptors”
(Images: Gerhard Reitmayr, Martin Hirzer)
Tracking by Keypoint Detection
• This is what most trackers do…
• Targets are detected every frame
• Popular because tracking and detection are solved simultaneously
• Pipeline: Camera Image → Keypoint detection → Descriptor creation and matching (recognition) → Outlier removal → Pose estimation and refinement → Pose
What is a Keypoint?
• Invariant visual feature
• Different detectors possible
• For high performance use the FAST corner detector
• Apply FAST to all pixels of your image
• Obtain a set of keypoints for your image
• Describe the keypoints
Rosten, E., & Drummond, T. (2006, May). Machine learning for high-speed corner detection.
In European conference on computer vision (pp. 430-443). Springer Berlin Heidelberg.
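A minimal sketch of FAST detection using OpenCV's built-in detector (the input filename is a placeholder):

```python
import cv2

img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)  # placeholder filename

# FAST compares each pixel with a ring of 16 surrounding pixels; a corner
# is declared when a long contiguous arc of the ring is all brighter (or
# all darker) than the centre by more than the threshold.
fast = cv2.FastFeatureDetector_create(threshold=25, nonmaxSuppression=True)
keypoints = fast.detect(img, None)
print(f"{len(keypoints)} FAST keypoints found")

out = cv2.drawKeypoints(img, keypoints, None, color=(0, 255, 0))
cv2.imwrite("fast_corners.png", out)
```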
FAST Corner Keypoint Detection
Example:FAST Corner Detection
https://www.youtube.com/watch?v=vEkHoYpMD3Y
Descriptors
• Describe the Keypoint features
• Can use SIFT
• Estimate the dominant keypoint
orientation using gradients
• Compensate for detected
orientation
• Describe the keypoints in terms of the gradients surrounding them
Wagner D., Reitmayr G., Mulloni A., Drummond T., Schmalstieg D.,
Real-Time Detection and Tracking for Augmented Reality on Mobile Phones.
IEEE Transactions on Visualization and Computer Graphics, May/June, 2010
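A minimal sketch of this descriptor stage with OpenCV SIFT, plus Lowe's ratio test for matching against the database (filenames are placeholders):

```python
import cv2

target = cv2.imread("tracking_target.png", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("camera_frame.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp_t, desc_t = sift.detectAndCompute(target, None)  # offline database side
kp_f, desc_f = sift.detectAndCompute(frame, None)   # live camera frame

# Match each database descriptor to its two nearest neighbours in the
# frame; Lowe's ratio test discards ambiguous matches before outlier
# removal and pose estimation.
matcher = cv2.BFMatcher(cv2.NORM_L2)
matches = matcher.knnMatch(desc_t, desc_f, k=2)
good = [m for m, n in matches if m.distance < 0.7 * n.distance]
print(f"{len(good)} good matches -> outlier removal -> pose estimation")
```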
Detection and Tracking
• Start → Detection: recognize target type, detect target, initialize camera pose
• Tracking target detected → Incremental tracking: fast, robust to blur and lighting changes, robust to tilt
• While incremental tracking is ok, stay in tracking; tracking target lost → back to Detection
Tracking and detection are complementary approaches. After successful detection, the target is tracked incrementally. If the target is lost, detection is activated again.
Demo: Vuforia Texture Tracking
https://www.youtube.com/watch?v=1Qf5Qew5zSU
Edge Based Tracking
• Example: RAPiD [Drummond et al. 02]
• Initialization, Control Points, Pose Prediction (Global Method)
Demo: Edge Based Tracking
Line Based Tracking
• Visual Servoing [Comport et al. 2004]
Marker vs. Natural Feature Tracking
• Marker tracking
• Usually requires no database to be stored
• Markers can be an eye-catcher
• Tracking is less demanding
• The environment must be instrumented
• Markers usually work only when fully in view
• Natural feature tracking
• A database of keypoints must be stored/downloaded
• Natural feature targets might catch the attention less
• Natural feature targets are potentially everywhere
• Natural feature targets work also if partially in view
Visual Tracking Approaches
• Marker based tracking with artificial features
• Make a model before tracking
• Model based tracking with natural features
• Acquire a model before tracking
• Simultaneous localization and mapping
• Build a model while tracking it
Model Based Tracking
• Tracking from 3D object shape
• Example: OpenTL - www.opentl.org
• General purpose library for model based visual tracking
Demo: OpenTL Model Tracking
Demo: OpenTL Face Tracking
Vuforia Model Tracker
• Uses pre-captured 3D model for tracking
• On-screen guide to line up model
Model Tracking Demo
https://www.youtube.com/watch?v=6W7_ZssUTDQ
Visual Tracking Approaches
• Marker based tracking with artificial features
• Make a model before tracking
• Model based tracking with natural features
• Acquire a model before tracking
• Simultaneous localization and mapping
• Build a model while tracking it
Tracking from an Unknown Environment
• What to do when you don’t know any features?
• Very important problem in mobile robotics - Where am I?
• SLAM
• Simultaneously Localize And Map the environment
• Goal: to recover both camera pose and map structure
while initially knowing neither.
• Mapping:
• Building a map of the environment which the robot is in
• Localisation:
• Navigating this environment using the map while keeping
track of the robot’s relative position and orientation
Parallel Tracking and Mapping
• Tracking thread: estimates the camera pose for every frame; sends new keyframes to the mapping thread
• Mapping thread: extends and improves the map at a slow update rate; sends map updates back to tracking
Parallel tracking and mapping uses two concurrent threads, one for tracking and one for mapping, which run at different speeds.
Parallel Tracking and Mapping
• Video stream → Tracking (FAST, outputs the tracked local pose); new frames go to Mapping (SLOW), which returns map updates
• “Simultaneous localization and mapping (SLAM) in small workspaces”, Klein/Drummond, U. Cambridge
Visual SLAM
• Early SLAM systems (1986 - )
• Computer visions and sensors (e.g. IMU, laser, etc.)
• One of the most important algorithms in Robotics
• Visual SLAM
• Using cameras only, such as stereo view
• MonoSLAM (single camera) developed in 2007 (Davison)
Example: Kudan MonoSLAM
How SLAM Works
• Three main steps
1. Tracking a set of points through successive camera frames
2. Using these tracks to triangulate their 3D position
3. Simultaneously use the estimated point locations to calculate
the camera pose which could have observed them
• By observing a sufficient number of points, both structure and motion can be solved for (camera path and scene structure).
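Step 2 (triangulating tracked points) can be sketched with OpenCV; the intrinsics and the 10 cm camera motion below are hypothetical values:

```python
import numpy as np
import cv2

K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], dtype=np.float64)

# Projection matrices for two camera poses (hypothetical: the second
# camera has translated 10 cm to the right of the first).
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])

# The same scene point tracked in both frames (pixel coordinates)
pt1 = np.array([[320.0], [240.0]])   # image of the point in frame 1
pt2 = np.array([[240.0], [240.0]])   # shifted left after the camera moved

X_h = cv2.triangulatePoints(P1, P2, pt1, pt2)  # homogeneous 4x1 result
X = (X_h[:3] / X_h[3]).ravel()
print(X)  # ~[0, 0, 1]: the point sits 1 m in front of the first camera
```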
Evolution of SLAM Systems
• MonoSLAM (Davison, 2007)
• Real time SLAM from single camera
• PTAM (Klein, 2009)
• First SLAM implementation on mobile phone
• FAB-MAP (Cummins, 2008)
• Probabilistic Localization and Mapping
• DTAM (Newcombe, 2011)
• 3D surface reconstruction from every pixel in image
• KinectFusion (Izadi, 2011)
• Realtime dense surface mapping and tracking using RGB-D
Demo: MonoSLAM
LSD-SLAM (Engel 2014)
• A novel, direct monocular SLAM technique
• Uses image intensities both for tracking and mapping.
• The camera is tracked using direct image alignment, while geometry is estimated as semi-dense depth maps
• Supports very large-scale tracking
• Runs in real time on CPU and smartphone
Demo: LSD-SLAM
Applications of SLAM Systems
• Many possible applications
• Augmented Reality camera tracking
• Mobile robot localisation
• Real world navigation aid
• 3D scene reconstruction
• 3D Object reconstruction
• Etc..
• Assumptions
• Camera moves through an unchanging scene
• So not suitable for person tracking, gesture recognition
• Both involve non-rigidly deforming objects and a non-static map
Hybrid Tracking
Combining several tracking modalities together
Combining Sensors and Vision
• Sensors
• Produce noisy output (= jittering augmentations)
• Are not sufficiently accurate (= wrongly placed augmentations)
• Give us first information on where we are in the world, and what we are looking at
• Vision
• Is more accurate (= stable and correct augmentations)
• Requires choosing the correct keypoint database to track from
• Requires registering our local coordinate frame (online-generated model) to the global one (world)
Outdoor AR Tracking System
You, Neumann, Azuma outdoor AR system (1999)
Types of Sensor Fusion
• Complementary
• Combining sensors with different degrees of freedom
• Sensors must be synchronized (or requires inter-/extrapolation)
• E.g., combine position-only and orientation-only sensor
• E.g., orthogonal 1D sensors in gyro or magnetometer are complementary
• Competitive
• Different sensor types measure the same degree of freedom
• Redundant sensor fusion
• Use worse sensor only if better sensor is unavailable
• E.g., GPS + pedometer
• Statistical sensor fusion
Example: Outdoor Hybrid Tracking
• Combines
• computer vision
• inertial gyroscope sensors
• Both correct for each other
• Inertial gyro
• Provides frame-to-frame prediction of camera orientation, fast sensing
• Drifts over time
• Computer vision
• Natural feature tracking, corrects for gyro drift
• Slower, less accurate
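One common way to realise this kind of mutual correction is a complementary filter; the sketch below is a generic 1-D illustration, not the filter from the cited system:

```python
def fuse_yaw(fused, gyro_rate, vision_yaw, dt, k=0.02):
    """One complementary-filter step: integrate the fast-but-drifting
    gyro, then nudge the estimate toward the slower, drift-free vision
    yaw estimate."""
    predicted = fused + gyro_rate * dt           # inertial prediction (fast)
    return (1 - k) * predicted + k * vision_yaw  # visual correction (slow)

true_rate, gyro_bias, dt = 0.5, 0.05, 0.01  # deg/s, deg/s bias, seconds
fused = 0.0
for i in range(100):                        # simulate 1 second of tracking
    vision_yaw = true_rate * i * dt         # drift-free vision estimate
    fused = fuse_yaw(fused, true_rate + gyro_bias, vision_yaw, dt)

# Pure gyro integration would end at 0.550 deg (drift); fusion stays near 0.5
print(f"fused yaw after 1 s: {fused:.3f} deg (true: 0.500)")
```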
Robust Outdoor Tracking
• Hybrid Tracking
• Computer Vision, GPS, inertial
• Going Out
• Reitmayr & Drummond (Univ. Cambridge)
Reitmayr, G., & Drummond, T. W. (2006). Going out: robust model-based tracking for outdoor augmented reality. In Mixed and Augmented Reality, 2006. ISMAR 2006. IEEE/ACM International Symposium on (pp. 109-118). IEEE.
Handheld Display
Demo: Going Out Hybrid Tracking
ARKit – Visual Inertial Odometry
• Uses both computer vision + inertial sensing
• Tracking position twice
• Computer Vision – feature tracking, 2D plane tracking
• Inertial sensing – using the phone IMU
• Output combined via Kalman filter
• Determine which output is most accurate
• Pass pose to ARKit SDK
• Each system complements the other
• Computer vision – needs visual features
• IMU - drifts over time, doesn’t need features
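Apple has not published ARKit's exact filter; as a generic illustration, a 1-D Kalman step that predicts with IMU data and corrects with a vision measurement might look like this:

```python
def vio_step(x, P, imu_dx, vision_z, q=0.01, r=0.05):
    """Minimal 1-D Kalman step illustrating VIO-style fusion.
    x, P: position estimate and its variance
    imu_dx: displacement integrated from the IMU (fast, drifts)
    vision_z: position from feature tracking (slower, drift-free)"""
    x, P = x + imu_dx, P + q        # predict: dead-reckon with the IMU
    K = P / (P + r)                 # Kalman gain: trust in the measurement
    x = x + K * (vision_z - x)      # update: vision corrects IMU drift
    P = (1 - K) * P
    return x, P

x, P = 0.0, 1.0                     # initial position and uncertainty
x, P = vio_step(x, P, imu_dx=0.02, vision_z=0.025)
print(round(x, 4), round(P, 4))     # estimate moves toward the vision fix
```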
ARKit – Visual Inertial Odometry
• Slow camera
• Fast IMU
• If camera drops out IMU takes over
• Camera corrects IMU errors
ARKit Demo
• https://www.youtube.com/watch?v=dMEWp45WAUg
Conclusions
• Tracking and Registration are key problems
• Registration error
• Measures against static error
• Measures against dynamic error
• AR typically requires multiple tracking technologies
• Computer vision most popular
• Research Areas:
• SLAM systems, Deformable models, Mobile outdoor tracking
www.empathiccomputing.org
@marknb00
mark.billinghurst@unisa.edu.au