A talk from the Tools & Products Track at AWE USA 2017, the largest conference for AR+VR, held in Santa Clara, California, May 31 to June 2, 2017.
Hiren Bhinde (Qualcomm): On-device Motion Tracking for Immersive VR
As the industry strives toward immersive augmented and virtual reality experiences, we are guided by the extreme requirements associated with intuitive interactions and visual and sound quality, in order to achieve the ultimate untethered user experience. Precise, low-latency motion tracking of head movements is crucial for intuitive interactions with the virtual world, and visual-inertial odometry (VIO) is the ideal complementary subsystem to achieve this goal. VIO enables six degrees of freedom in VR/AR experiences, reduces latency, and cuts the cord. In this session, developers will learn about the evolution of motion tracking and dive into six degrees of freedom and its impact on VR/AR content development and user experiences.
http://AugmentedWorldExpo.com
1. On-device motion tracking for immersive mobile VR
Hiren Bhinde
XR Product Manager
Qualcomm Technologies, Inc.
June 2, 2017
Qualcomm Snapdragon is a product of Qualcomm Technologies, Inc.
2.
VR will provide the ultimate level of immersion
Creating physical presence in real or imagined worlds: intuitive interactions required
Interactions: so intuitive that they become second nature
Sounds: so accurate that they are true to life
Visuals: so vibrant that they are eventually indistinguishable from the real world
3.
3-DoF vs. 6-DoF
3 degrees of freedom (3-DoF)
• “In which direction am I looking?”
• Detect rotational head movement
• Look around the virtual world from a fixed point
6 degrees of freedom (6-DoF)
• “Where am I and in which direction am I looking?”
• Detect rotational movement and translational movement
• Move in the virtual world like you move in the real world
4.
6-DoF allows developers to bring the user into their story
3 degrees of freedom (3-DoF)
• Can only watch
6 degrees of freedom (6-DoF)
• Full immersion
• Can become part of the story
• Can now interact and change the story
5.
6-DoF motion tracking evolution
• 2014: Outside-in. External sensors, markers, cameras, or lasers to be set up throughout the room.
• 2017: Monocular fisheye camera. Inside-out solution requiring no room setup.
• 2017: Stereo fisheye camera. More accurate motion tracking for immersive interactions (today’s discussion).
• Future: Room-scale VR. Object detection, boundary augmentation, and safety.
• Future: World VR. Limitless movement and robust safety.
8.
Mobile 6-DoF: “Inside-out” tracking
Visual inertial odometry (VIO) for rapid and accurate 6-DoF pose
Inputs:
• Mono or stereo camera data, captured from the tracking camera image sensor at ~30 fps
• Accelerometer & gyroscope data, sampled from external sensors at 800/1000 Hz
Snapdragon “VIO” subsystem: camera feature processing and inertial data processing, with Qualcomm® Hexagon™ DSP algorithms for:
• Camera and inertial sensor data fusion
• Continuous localization
• Accurate, high-rate “pose” generation & prediction
Output: 6-DoF position & orientation (aka “6-DoF pose”), so the new frame is accurately displayed
Qualcomm Hexagon is a product of Qualcomm Technologies, Inc.
9.
Minimizing motion to photon (MTP) latency is crucial
~15-16 ms MTP latency on the Qualcomm® Snapdragon™ 835 Mobile Platform shows our mobile VR leadership
[Figure: side-by-side comparison of low latency vs. noticeable latency]
Qualcomm Snapdragon is a product of Qualcomm Technologies, Inc.
10.
Low latency mobile 6-DoF inside-out tracking
Many workloads must run efficiently for an immersive VR experience:
1. Motion detection (“motion”): sensor sampling, sensor fusion
2. Visual processing: view generation, render/decode
3. Display (“photon”, the new pixels’ light emitted from the screen): image correction, quality enhancement and display
Total time (motion to photon latency) for all steps above must be less than 20 milliseconds
11.
Stereo VIO for rapid, robust, and accurate 6-DoF pose
Sensor fusion of stereo camera features and high-rate IMU data
Benefits of stereo (wide-angle lenses) over monocular 6-DoF:
• Instant, accurate scene depth
• Faster initialization
• Better performance with quick and rotational motions
• Improved tolerance to camera occlusion
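The “instant accurate scene depth” benefit follows from classic stereo triangulation: a feature matched between the left and right images yields metric depth from a single stereo frame, with no camera motion required. A minimal sketch using the standard pinhole-stereo formula (the numbers below are illustrative, not from the talk):

```python
def stereo_depth_m(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Pinhole stereo triangulation: depth = f * B / d. A feature's
    horizontal shift (disparity) between the left and right images gives
    metric depth from a single stereo frame, with no camera motion,
    which is why stereo initializes faster than monocular VIO."""
    if disparity_px <= 0:
        raise ValueError("feature must be matched with positive disparity")
    return focal_px * baseline_m / disparity_px

# Illustrative numbers: 450 px focal length, 6.4 cm baseline, 12 px
# disparity -> 450 * 0.064 / 12 = 2.4 m to the feature.
print(stereo_depth_m(450.0, 0.064, 12.0))
```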
12.
Some key things to consider
Developing 6-DoF content
Scale and space
• Issue: Untethered mobile allows for limitless movement, but available space could change between runs of the app
• DO consider limits to the amount of full-body movement
• DO provide movement alternatives
Tracking
• Issue: Device cameras can become occluded or the environment can be too dark
• DO fade to black or a fixed image in the event of lost head tracking to avoid nauseating jumps and judder
• DON’T fall back to tracking only orientation (3-DoF). Jumps in position or seeing the virtual world respond differently to movement can be uncomfortable for users
Storytelling
• Issue: Users may not be looking or positioned in the desired location to advance the story
• DO use audio and visual cues to guide user focus
• DON’T take over control of the virtual camera in order to force focus on a story element
13.
How to get started with 6-DoF for mobile VR
Snapdragon VR SDK: access to advanced VR features to optimize applications and simplify development
Snapdragon 835 VR HMD: accelerating the development of standalone head-mounted displays
URL: developer.qualcomm.com
Tools & resources to help developers build, integrate, and optimize
VR will provide the ultimate level of immersion. For virtual reality to be truly immersive and live up to its name, it needs to create physical presence in real or imagined worlds. VR does this by stimulating our human senses with realistic feedback (inputs and responses).
Humans are very perceptive at noticing things that look wrong, sound wrong, feel out of place, or do not behave in a natural way – and this takes us out of the moment.
When the sound is right and matches the visual, you are truly immersed in the scene. In contrast, when the sound is wrong – think of a movie with sound dubbing where audio mismatches the lip movements – we immediately know and focus on it. For 3D surround sound, we have a similar experience of feeling the direction of the sound and whether it is appropriate for the visual being shown. Virtual reality is the ultimate test for this. Our brain will tell us when the visuals and sounds are right or wrong, ultimately determining the level of immersion we feel.
To create truly immersive experiences, VR places high emphasis on the three pillars of immersion to stimulate our senses, providing:
Visuals so vibrant and so realistic that they are eventually indistinguishable from the real world
Sounds so accurate that they are true to life
Interactions so intuitive that they become second nature and you forget there is even an interface.
Intuitive interactions keep us immersed in the moment. For a VR headset, your primary control is actually your head movement. What’s more natural than just looking around? You might also use gestures or a control pad to provide other inputs. As we will get to later, one of the biggest challenges for VR is the response time from your head moving to the display actually being updated, which is known as “motion to photon”.
It is the level of immersion that significantly sets VR apart from anything else people have experienced before – and it is why we are so excited about VR’s potential. VR will take our experiences to the next level, not by an incremental amount but by a huge amount.
Precise (and of course low latency) motion tracking of head movements is crucial for accurate and intuitive interactions with the virtual world. Precise is key for both large and small movements. If you turn your head to explore the scene, you want very accurate control. Similarly, if your head is stable and not moving, you want the visuals to stay completely still, otherwise you might feel like you are on a boat in an unstable virtual world.
Motion is often characterized by how many degrees of freedom are possible in movement: either 3 degrees of freedom or 6 degrees of freedom.
3-DoF answers “In which direction am I looking?”. This means that you can detect rotational movement around the X, Y, and Z axes – the orientation. For your head movements, that means being able to rotate, nod, and bend your neck/head, while keeping the rest of your body still.
3-DoF is very beneficial in VR. You will be able to look around the virtual world from fixed points – think of a camera on a tripod. For many 360° spherical videos, this will provide very immersive content, such as viewing sporting events or nature.
[Click]
6-DoF answers “Where am I and in which direction am I looking?”. It means that you can detect rotational movement and translational movement – the orientation and the position. This means that your body can now move from fixed viewpoints in the virtual world – it can move in the X, Y, and Z directions.
6-DoF is very beneficial in VR. Gaming is the obvious use case where you can move freely in the virtual world and look around corners. However, even simple things, like looking at objects at your desk or shifting your head side-to-side can be compelling with 6-DoF.
Tower defense games with positional tracking are compelling -- leaning in works well to zoom.
Looking at side view mirrors in a car racing game.
6-DoF is more realistic because it captures our real movement and does not cause a sensory conflict between our vision and vestibular system.
360° spherical video that supports 6-DoF will also be very compelling. The challenge is creating that content, which is an area of active research.
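To make the distinction concrete in code, here is a minimal sketch of the data each tracking mode reports; the types and the camera object are hypothetical, not from any Snapdragon API. 3-DoF carries orientation only, while 6-DoF adds a position vector that lets the viewpoint move.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Pose3DoF:
    """Orientation only: which direction the user is looking.
    Unit quaternion as (w, x, y, z)."""
    orientation: Tuple[float, float, float, float] = (1.0, 0.0, 0.0, 0.0)

@dataclass
class Pose6DoF(Pose3DoF):
    """Orientation plus position: where the user is AND where they look."""
    position: Tuple[float, float, float] = (0.0, 0.0, 0.0)  # meters

def apply_head_pose(camera, pose):
    """Drive the virtual camera from a tracked pose (camera is hypothetical)."""
    camera.set_rotation(pose.orientation)
    if isinstance(pose, Pose6DoF):
        # Translation is reflected in the view: lean, crouch, step, walk.
        camera.set_translation(pose.position)
    # With plain 3-DoF the viewpoint stays pinned, like a camera on a
    # tripod: the user can look around but cannot move through the scene.
```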
Visual-inertial odometry, or VIO, provides precise on-device motion tracking. By using the on-device camera and inertial sensors, a 6-DoF pose can be generated. This allows you to have those immersive experiences we just talked about, plus the ability to use the VR headset anywhere – and it is much less expensive than other solutions.
So how does VIO work?
VIO estimates relative position and orientation of a moving device in an unknown environment using a camera and motion sensors.
VIO takes advantage of the complementary strengths of the camera and inertial sensors. For example, a single camera can estimate relative position, but it cannot provide absolute scale—the actual distances between objects, or the size of objects in meters or feet. Inertial sensors provide absolute scale and take measurement samples at a higher rate, thereby improving robustness for fast device motion. However, inertial sensors, particularly low-cost MEMS varieties, are prone to substantial drifts in position estimates when compared with cameras. So VIO blends together the best of both worlds to accurately estimate device pose.
Looking at the diagram, you can see that the camera, accelerometers, and gyroscopes are the inputs to VIO. Accelerometers and gyroscopes are running at much higher frequency than the camera.
For the Snapdragon VIO subsystem:
The camera feature processing block detects features and tracks them.
The inertial data processing block runs at the high frequencies of the inertial sensors and fuses the data.
The VIO algorithms run on the Hexagon DSP for efficiency.
The magic really happens when fusing the camera and inertial sensor data.
Fusing this data allows for continuous localization of the head position.
Finally, we need the head position at a high frequency, so we generate both an accurate, high-rate pose and a predicted pose.
This 6-DoF pose can then be used to quickly update the user’s view in the VR world with the precise visuals (and sound).
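As a rough illustration of that division of labor, here is a toy loosely-coupled fusion sketch. It assumes simplified inputs (world-frame acceleration, with orientation already resolved by the gyroscope); a production subsystem like Snapdragon’s would use a tightly coupled filter rather than this simple blend.

```python
import numpy as np

# What a resting accelerometer reads along +z (specific force), in m/s^2.
GRAVITY = np.array([0.0, 0.0, 9.81])

class ToyVIO:
    """Toy loosely-coupled fusion: propagate with IMU samples at ~800-1000 Hz,
    correct drift with camera position estimates at ~30 fps."""

    def __init__(self):
        self.position = np.zeros(3)  # meters, in the tracking frame
        self.velocity = np.zeros(3)  # m/s

    def on_imu(self, accel_world, dt):
        # High-rate propagation: remove gravity from the specific-force
        # reading and double-integrate. Low-cost MEMS sensors drift
        # quickly when integrated this way.
        self.velocity += (accel_world - GRAVITY) * dt
        self.position += self.velocity * dt

    def on_camera(self, visual_position, blend=0.1):
        # Low-rate correction: nudge the drifting inertial estimate toward
        # the camera's estimate, which is slower but does not accumulate
        # drift the way integrated IMU data does.
        self.position = (1.0 - blend) * self.position + blend * visual_position

    def predict(self, horizon_s):
        # Predicted pose: extrapolate to where the head will be at display
        # time, so the renderer can target that instant and hide latency.
        return self.position + self.velocity * horizon_s
```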
Motion-to-photon latency is the amount of time between the movement of the VR user’s head and the time that new content is correctly displayed.
Minimizing motion to photon latency is absolutely crucial for VR
Your vestibular system (which provides a sense of balance and spatial orientation) signals to your brain that your head has moved; if the visual stimulus from your eyes to the brain does not match, it can not only make the experience less immersive but also make some users sick.
[Click]
On the left, we can see what LOW motion to photon latency looks like. As you move your head, everything is updated with minimal lag.
[Click]
On the right, the head moves, but the display update to the VR headset is delayed. This is high motion to photon latency, and it produces a physiological sensory conflict that is uncomfortable for most people.
The best way to both minimize this discomfort and create immersion is obviously to reduce the latency. In Snapdragon 835, we’ve decreased it by about 20% through efficiency improvements to sensor processing and interaction with the single-buffered rendering subsystem, which involves a clever way of concurrently rendering and time-warping one eye buffer while sending the other finished eye buffer to the display.
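The talk does not spell out the time-warp math, but the rotation-only correction at its heart can be sketched as follows; the quaternion convention and function names are illustrative, not the Snapdragon implementation.

```python
import numpy as np

def quat_multiply(a, b):
    """Hamilton product of quaternions given as (w, x, y, z)."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def timewarp_delta(render_q, latest_q):
    """Rotation between the head pose a frame was rendered with and the
    freshest predicted pose. A shader would apply this small delta as a
    per-pixel reprojection of the finished eye buffer just before scan-out,
    so the displayed image matches the newest head orientation."""
    conj = np.array([render_q[0], -render_q[1], -render_q[2], -render_q[3]])
    return quat_multiply(latest_q, conj)
```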
An end-to-end approach is required to minimize motion to photon latency (in fact, an end-to-end approach is required for many VR requirements!). The time from the head movement to the screen being updated, known as “motion to photon”, needs to be very fast. If not, the user will see the display lag, making the experience less immersive and possibly causing discomfort.
Note that the motion to audio update is also important.
There are many processing steps required before updating the display, which is why an end-to-end optimization for latency is required to provide a responsive interface. Let’s consider what happens when your head moves. At a high level, there is motion detection, visual processing, and finally displaying the updated image.
Motion detection detects the head movement.
This is accomplished by sampling the sensor inputs, such as the gyroscope, accelerometer, and camera, fusing them together, and determining where you are looking – the head pose.
Visual processing generates the correct image based on where you are looking.
View generation takes the head pose and creates the commands required to render the correct image. The actual image is then rendered for graphics or decoded for pictures/video.
Display processing enhances the rendered or decoded image and scans it out to the display.
This includes image correction for the VR headset itself, such as lens distortion, but also quality enhancement for the image and the actual display panel.
For VR, the lower the latency the better, but studies have shown that under 20 ms is an important number to hit.
To put the challenge in perspective, a display running at 60 Hz is updated every 17 ms and a display running at 90 Hz is updated every 11 ms. The display update is just one part of the motion to photon path.
To achieve this goal requires us to change how we think about rendering and taking some new approaches. We need to run as many tasks in parallel as possible and also reduce individual latencies.
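As a back-of-the-envelope illustration of that budget, with hypothetical per-stage timings (the real numbers depend on the app and the platform):

```python
# Hypothetical per-stage timings in milliseconds; real numbers vary by app.
BUDGET_MS = 20.0  # the under-20 ms target cited above

stages_ms = {
    "motion detection (sensor sampling + fusion)": 2.0,
    "visual processing (view generation + render/decode)": 9.0,
    "display (correction + enhancement + scan-out)": 5.5,
}

total = sum(stages_ms.values())
print(f"motion-to-photon: {total:.1f} ms "
      f"({'within budget' if total <= BUDGET_MS else 'over budget'})")

# For context: a 60 Hz panel refreshes every 1000/60 = 16.7 ms and a 90 Hz
# panel every 1000/90 = 11.1 ms, so the refresh interval alone eats much of
# the 20 ms budget and the stages must overlap rather than run serially.
```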
When developing 6-DoF content, there are a few things to consider.
First, scale and space
Issue: Untethered mobile allows for limitless movement, but available space could change between app runs
While developing Reef, we learned a few things. The space the user is in is not well defined. They could be on a couch, in a small room, or in a giant room. When developing, you need to consider that the user may be in a 2x2 space, an 8x8 space, etc.
DO consider limits to amount of full body movement
DO provide movement alternatives such as teleportation if real world space is too small to represent virtual world
Second, tracking
Issue: Device cameras can become occluded or environment can be too dark
DO fade to black or fixed image in event of lost head tracking to avoid nauseating jumps and judder
DON’T fall back to tracking only orientation (3-DoF); jumps in position, or seeing the virtual world respond differently to movement, can be uncomfortable for users.
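A minimal sketch of how an app might follow this tracking guidance; the tracking-state callback and fade API here are hypothetical, not part of any particular SDK.

```python
from enum import Enum, auto

class TrackingState(Enum):
    FULL_6DOF = auto()
    LOST = auto()  # cameras occluded, or the environment is too dark

def on_tracking_changed(app, state):
    """Hypothetical callback following the DO/DON'T guidance above."""
    if state is TrackingState.LOST:
        # DO: fade to black (or a fixed image) rather than letting the view
        # jump or judder while the pose stream is unreliable.
        app.fade_to_black(duration_s=0.15)
    else:
        # Resume only when a fresh, valid 6-DoF pose is available again.
        # DON'T silently drop to orientation-only (3-DoF) tracking: position
        # jumps and mismatched motion response are uncomfortable for users.
        app.fade_back_in(duration_s=0.3)
```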
Third, storytelling
Issue: Users may not be looking or positioned in desired location to advance story
You don’t control where the user is looking or moving. You can’t take control of their head. You need the ability to adapt at any point – the permutations of different scenarios make things much more complicated.
DO use audio and visual cues to guide user focus
DON’T take over control of virtual camera in order to force focus on story element
Other issues:
Safety
Develop based on the environment: consider that a user may not be able to move around, and be aware of the couch, chair, etc. so that the user does not get hurt. It is the responsibility of the developer to take care of the user, especially as VR becomes more immersive. Mobile VR will be used in a variety of locations and is immersive (lightweight, no cable).
Mobile challenges:
Power and thermal constraints of a mobile HMD are significant
Could mention our tools for helping (Symphony, Profiler, VR SDK, Adreno GPU)
Snapdragon VR Development Kit
Snapdragon 835 is purpose-built silicon for superior mobile VR
Snapdragon VR SDK provides easy developer access to Snapdragon-accelerated VR libraries that simplify application development; it is compatible with existing toolchains
Snapdragon VR Reference Design is an HMD reference design for Snapdragon 835, for OEMs and 3rd party content development
Snapdragon 835 hardware and software is “Daydream-ready”