The final lecture in the 2021 COMP 4010 class on AR/VR. This lecture summarizes further research directions and trends in AR and VR. It was taught by Mark Billinghurst on November 2nd, 2021 at the University of South Australia.
3. Key Technologies for MR Systems
• Display
• Stimulate the visual, auditory and touch senses
• Tracking
• Changing viewpoint, registered content
• Interaction
• Supporting user input
4. Evolution in Displays
• Past
• Bulky head-mounted displays
• Current
• Handheld, lightweight head-mounted displays
• Future
• Projected AR
• Wide FOV see-through displays
• Retinal displays
• Contact lenses
5. Wide FOV See-Through Displays
• Waveguide techniques
• Wider FOV
• Thin see through
• Socially acceptable
• Pinlight Displays
• LCD panel + point light sources
• 110 degree FOV
• UNC/Nvidia
(Image: Lumus DK40 waveguide display)
Maimone, A., Lanman, D., Rathinavel, K., Keller, K., Luebke, D., & Fuchs, H. (2014). Pinlight displays:
wide field of view augmented reality eyeglasses using defocused point light sources. In ACM SIGGRAPH
2014 Emerging Technologies (p. 20). ACM.
7. Nvidia Prototype
• 1 cm thick lightfield display
• 146×78 pixel resolution, 29×16 degree field of view
Lanman, D., & Luebke, D. (2013). Near-eye light field displays. ACM Transactions on Graphics (TOG), 32(6), 1-10.
14. Peripheral Displays
• Use second display in HMD to add peripheral view
Nakano, Kizashi, et al. "Head-Mounted Display with Increased Downward Field of View Improves Presence and
Sense of Self-Location." IEEE Transactions on Visualization and Computer Graphics (2021).
18. Text Input in AR/VR
• How can people input text in AR/VR as fast as typing in the real world?
• Challenges
• Providing feedback, clearly seeing keys, ergonomics, etc.
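Entry rates in the following slides are quoted in WPM (words per minute). In text-entry research a "word" is conventionally five characters, spaces included; a minimal sketch of the metric (the example numbers are illustrative only):

```python
def words_per_minute(transcribed: str, seconds: float) -> float:
    """Standard text-entry WPM: one 'word' = 5 characters, spaces included."""
    return (len(transcribed) / 5.0) / (seconds / 60.0)

# Illustrative example: 140 characters entered in 90 seconds is ~18.7 WPM,
# roughly the rate the VISAR keyboard reached with word completion.
print(round(words_per_minute("x" * 140, 90.0), 1))  # 18.7
```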
19. Possible Approaches
• Physical input device
• Keyboard, chording keyboard, touch screen
• Virtual input device
• Gesture, pointing, gaze, etc.
• Smart keyboard input
• Word completion, swipe input, etc.
• Speech input
• Voice recognition
20. Typing in Augmented Reality
• VISAR AR keyboard on HoloLens
• Using air-tap gestures over a virtual keyboard, mean text entry was ~9 WPM
• Adding word completion and training improved this to ~18 WPM
Dudley, J. J., Vertanen, K., & Kristensson, P. O. (2018). Fast and precise touch-based text entry for head-mounted
augmented reality with variable occlusion. ACM Transactions on Computer-Human Interaction (TOCHI), 25(6), 1-40.
21. Typing on Midair Virtual Keyboards
• Using hand tracking (Leap Motion)
• Tracking index fingers
• Tested different configurations
• Unimanual, Bimanual, Split
• Support auto-correction
• Achieved 15-16 WPM with novices
• Split keyboard was slowest with the most errors
Adhikary, J., & Vertanen, K. (2021, August). Typing on Midair Virtual Keyboards: Exploring Visual Designs
and Interaction Styles. In IFIP Conference on Human-Computer Interaction (pp. 132-151). Springer, Cham.
22. Example: Word-Gesture Text Entry in VR
• User enters a word gesture until the desired word appears, then selects it
• Comparison between two methods
• 6DOF controller, touchscreen
• Controller significantly faster than touchscreen
• 16.4 WPM vs. 9.6 WPM
Chen, S., Wang, J., Guerra, S., Mittal, N., & Prakkamakul, S. (2019, May). Exploring word-gesture text entry techniques
in virtual reality. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (pp. 1-6).
24. Typing on a Real Keyboard
• Explore the effect of hand representation on typing on a real keyboard in VR
• Compared different types of hand representation and transparency
• Compared performance to typing in the real world
• Used a fast motion capture system to track hand motion
Knierim, Pascal, Valentin Schwind, Anna Maria Feit, Florian Nieuwenhuizen, and Niels Henze. "Physical
keyboards in virtual reality: Analysis of typing performance and effects of avatar hands." In Proceedings of the
2018 CHI Conference on Human Factors in Computing Systems, pp. 1-9. 2018.
26. Results
• Typing on a real keyboard provides faster text input (35 WPM up to 65 WPM)
• Seeing virtual hands has a significant effect for novices, but not for experienced typists
• Realistic hand rendering generated the highest presence with the lowest workload
27. Head-Mounted Keyboard (HMK)
• Split keyboard on side of HMD
• Users achieved 34.7 WPM after three days, 81 percent of their regular entry speed
Hutama, W., Harashima, H., Ishikawa, H., & Manabe, H. (2021, October). HMK: Head-Mounted-Keyboard for Text Input in Virtual or
Augmented Reality. In The Adjunct Publication of the 34th Annual ACM Symposium on User Interface Software and Technology (pp. 115-117).
29. Review Article
• Physical keyboard input for VR
• Up to 70 WPM performance
• Virtual keyboard input for VR
• Head pointing, gaze, gesture, controllers, etc.
• Up to 25 WPM
• Key research areas
• Cheaper and better tracking devices
• Best way to show live video of hands/keyboard
• Effect of different keyboard properties
• Alternative text entry for 3D environments
Dube, T. J., & Arif, A. S. (2019, July). Text entry in virtual reality: A comprehensive review of the literature.
In International Conference on Human-Computer Interaction (pp. 419-437). Springer, Cham.
30. Natural Gesture
• Freehand gesture input
• Depth sensors for gesture capture
• Move beyond simple pointing
• Rich two handed gestures
• E.g. Microsoft Research Hand Tracker
• 3D hand tracking, 30 fps, single sensor
• Commercial Systems
• HoloLens 2, Oculus, Intel, Magic Leap, etc.
Sharp, T., Keskin, C., Robertson, D., Taylor, J., Shotton, J., Kim, D., Rhemann, C., Leichter, I., ... & Izadi, S. (2015, April).
Accurate, Robust, and Flexible Real-time Hand Tracking. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI).
31. State of the Art: UltraLeap Gemini
https://www.youtube.com/watch?v=Llvh4GBpnVA
32. 3D Hand Tracking from a Single Camera
• Use machine learning to combine multiple pieces of information together
• segmentation, dense matchings, 2D keypoint positions, intra-hand relative depth, and inter-hand distance
• Use a generative model to estimate pose and shape parameters of a 3D hand model
Wang, J., Mueller, F., Bernard, F., Sorli, S., Sotnychenko, O., Qian, N., ... & Theobalt, C. (2020). Rgb2hands: real-time
tracking of 3d hand interactions from monocular rgb video. ACM Transactions on Graphics (TOG), 39(6), 1-16.
33. Method
• Use CNNs to extract features from segmented images and combine them into a 3D hand model
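The RGB2Hands pipeline itself is not available as a library, but the single-RGB-camera idea is easy to experiment with using off-the-shelf tooling. A minimal sketch with Google's MediaPipe Hands (a different, lighter-weight tracker than the paper's method):

```python
import cv2
import mediapipe as mp

# Single-RGB-camera hand tracking with MediaPipe Hands (an off-the-shelf
# tracker, not the RGB2Hands pipeline described in the slides above).
hands = mp.solutions.hands.Hands(max_num_hands=2,
                                 min_detection_confidence=0.5)
cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB input; OpenCV captures BGR.
    result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if result.multi_hand_landmarks:
        for hand in result.multi_hand_landmarks:
            # 21 landmarks per hand; z is depth relative to the wrist.
            wrist = hand.landmark[0]
            print(f"wrist: ({wrist.x:.2f}, {wrist.y:.2f}, {wrist.z:.2f})")
cap.release()
hands.close()
```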
35. Multi-Scale Gesture
• Combine different gesture types
• In-air gestures – natural but imprecise
• Micro-gesture – fine scale gestures
• Gross motion + fine tuning interaction
Ens, B., Quigley, A., Yeo, H. S., Irani, P., Piumsomboon, T., & Billinghurst, M. (2018). Counterpoint:
Exploring Mixed-Scale Gesture Interaction for AR Applications. In Extended Abstracts of the 2018 CHI
Conference on Human Factors in Computing Systems (p. LBW120). ACM.
37. Multimodal Input
• Gesture and speech input are complementary
• Speech
• modal commands, quantities
• Gesture
• selection, motion, qualities
• Support combined commands
• “Put that there” + pointing
• Previous work found multimodal interfaces intuitive for 2D/3D graphics interaction (a toy fusion sketch follows the reference below)
Lee, M., Billinghurst, M., Baek, W., Green, R., & Woo, W. (2013). A usability study of
multimodal input in an augmented reality environment. Virtual Reality, 17(4), 293-305.
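To make the "put that there" pattern concrete, here is a toy fusion sketch: pair each deictic word in the speech stream with the pointing event nearest in time. All names and the 1.5 s window are illustrative assumptions, not taken from the paper above.

```python
import time
from dataclasses import dataclass

@dataclass
class Event:
    kind: str    # "speech" or "point"
    value: str   # recognized word, or ID of the pointed-at object
    t: float     # timestamp in seconds

def fuse(events, window=1.5):
    """Pair each deictic word ('that'/'there') with the pointing event
    nearest in time; return a resolved command, or None."""
    points = [e for e in events if e.kind == "point"]
    targets = []
    for e in events:
        if e.kind == "speech" and e.value in ("that", "there"):
            near = min(points, key=lambda p: abs(p.t - e.t), default=None)
            if near and abs(near.t - e.t) <= window:
                targets.append(near.value)
    return f"move {targets[0]} to {targets[1]}" if len(targets) == 2 else None

# "Put that <points at cube> there <points at table>"
t0 = time.time()
log = [Event("speech", "put", t0), Event("speech", "that", t0 + 0.4),
       Event("point", "cube", t0 + 0.5), Event("speech", "there", t0 + 1.2),
       Event("point", "table", t0 + 1.3)]
print(fuse(log))  # -> move cube to table
```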
44. Multimodal CAD Interface
Billinghurst, M., Piumsomboon, T., & Bai, H. (2014). Hands in Space: Gesture Interaction with
Augmented-Reality Interfaces. IEEE computer graphics and applications, (1), 77-80.
45. Eye Tracking Input
• HMDs with integrated eye-tracking
• Hololens2, MagicLeap One
• Research questions
• How can eye gaze be used for interaction?
• What interaction metaphors are natural?
46. Eye Gaze Interaction Methods
• Gaze for interaction
• Implicit vs. explicit input
• Exploring different gaze interaction
• Duo reticles – use eye saccade input
• Hardware
• HTC Vive + Pupil Labs integrated eye-tracking
Piumsomboon, T., Lee, G., Lindeman, R. W., & Billinghurst, M. (2017, March). Exploring natural eye-gaze-based
interaction for immersive virtual reality. In 3D User Interfaces (3DUI), 2017 IEEE Symposium on (pp. 36-39). IEEE.
47. Duo-Reticles (DR)
• Two cursors: a Real-time Reticle (RR, originally the Eye-gaze Reticle) and an Inertial Reticle (IR)
• While RR and IR are aligned, the alignment time counts down
• When the countdown completes, the selection is made
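A minimal sketch of that alignment-countdown logic; the radius and dwell-time values are assumptions for illustration, not the paper's parameters.

```python
import math

ALIGN_RADIUS = 2.0   # degrees of visual angle (assumed threshold)
DWELL_TIME   = 0.8   # seconds of sustained alignment (assumed)

def duo_reticle_select(rr, ir, aligned_for, dt):
    """One frame of Duo-Reticle selection.
    rr: real-time (eye-gaze) reticle, ir: inertial reticle, both (x, y)
    in degrees. Returns (selected, updated alignment time)."""
    dist = math.hypot(rr[0] - ir[0], rr[1] - ir[1])
    if dist <= ALIGN_RADIUS:
        aligned_for += dt     # accumulate alignment time (shown as a
    else:                     # countdown in the paper's figures)
        aligned_for = 0.0     # any separation resets the timer
    return aligned_for >= DWELL_TIME, aligned_for
```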
57. Project Galea - https://galea.co/
• Collaboration between Valve and OpenBCI
• Integrate multiple sensors into HMD
• EOG, EMG, EDA, PPG sensors, 10 EEG channels and eye-tracking
• Launching in 2022
59. Intelligent Interfaces
• Move from explicit to implicit input
• Recognize user behaviour
• Provide adaptive feedback
• Move beyond check-lists of actions
• E.g. AR + Intelligent Tutoring
• Constraint based ITS + AR
• PC Assembly (Westerfield, 2015)
• 30% faster, 25% better retention
Westerfield, G., Mitrovic, A., & Billinghurst, M. (2015). Intelligent Augmented Reality Training for
Motherboard Assembly. International Journal of Artificial Intelligence in Education, 25(1), 157-172.
61. How Should AR Agents be Represented?
• AR Agent helps user find objects
• Three representations
• Smart speaker, upper body, full body
• Results
• Users gaze more at humanoid agents
• Users like full body AR agent most
Wang, I., Smith, J., & Ruiz, J. (2019, May). Exploring virtual agents for augmented reality.
In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (pp. 1-12).
62. Value of Embodied Agent?
• Explored the use of agent help in a shared task
• Desert survival task
• Agent provides suggestions
• Three conditions
• No agent, audio only, AR agent
• Results
• Using an agent produced better performance
• The embodied agent produced lower cognitive load than voice only
Kim, K., de Melo, C. M., Norouzi, N., Bruder, G., & Welch, G. F. (2020, March). Reducing task load with an embodied
intelligent virtual assistant for improved performance in collaborative decision making. In 2020 IEEE Conference on
Virtual Reality and 3D User Interfaces (VR) (pp. 529-538). IEEE.
64. Pervasive Agents
• Intelligent agents that are in the real world and can sense and influence their surroundings
• Combination of IVA, IoT, and AR
• IVA = intelligence and human-agent communication
• IoT = sensing and influencing the world
• AR = AR display
65. Agent Architecture
Barakonyi, I., & Schmalstieg, D. (2006, October). Ubiquitous animated agents for augmented reality. In 2006
IEEE/ACM International Symposium on Mixed and Augmented Reality (pp. 145-154). IEEE.
68. Evolution of Tracking
• Past
• Location based, marker based, magnetic/mechanical
• Present
• Image based, hybrid tracking
• Future
• Ubiquitous
• Model based
• Environmental
69. Model Based Tracking
• Track from known 3D model
• Use depth + colour information
• Match input to model template
• Use CAD model of targets
• Recent innovations
• Learning models online
• Tracking in cluttered scenes
• Tracking deformable objects
Hinterstoisser, S., Lepetit, V., Ilic, S., Holzer, S., Bradski, G., Konolige, K., & Navab, N. (2013). Model based training, detection
and pose estimation of texture-less 3D objects in heavily cluttered scenes. In Computer Vision–ACCV 2012 (pp. 548-562).
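At its core, model-based tracking estimates the camera (or object) pose from 2D-3D correspondences between detected image features and the known model. A minimal sketch with OpenCV's solvePnP; the correspondences and camera intrinsics below are placeholder values, not a real calibration.

```python
import numpy as np
import cv2

# Known 3D points on the target (e.g. from a CAD model) and their detected
# 2D image locations. All values below are placeholders for illustration.
model_pts = np.array([[0, 0, 0], [0.1, 0, 0], [0.1, 0.1, 0],
                      [0, 0.1, 0], [0, 0, 0.1], [0.1, 0.1, 0.1]],
                     dtype=np.float64)
image_pts = np.array([[320, 240], [400, 238], [402, 160],
                      [322, 158], [310, 230], [395, 150]],
                     dtype=np.float64)
K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1]], dtype=np.float64)

# Recover the object pose relative to the camera from the correspondences.
ok, rvec, tvec = cv2.solvePnP(model_pts, image_pts, K, None)
if ok:
    R, _ = cv2.Rodrigues(rvec)  # 3x3 rotation matrix
    print("object-to-camera translation:", tvec.ravel())
```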
71. Environmental Tracking
• Environment capture
• Use depth sensors to capture scene & track from model
• InfiniTAM (www.robots.ox.ac.uk/~victor/infinitam/)
• Real time scene capture, dense or sparse capture, open source
• iPad Pro LiDAR
• Scene scanning up to 5m
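Once a scene model has been captured, tracking can be cast as registering each new scan against the stored model. A loose illustration using point-to-point ICP in Open3D (file names are placeholders; systems like InfiniTAM use their own, more elaborate fusion and tracking):

```python
import numpy as np
import open3d as o3d

# Placeholder files: a previously captured scene model and a live depth scan.
scene = o3d.io.read_point_cloud("scene_model.ply")
scan = o3d.io.read_point_cloud("live_scan.ply")

# Register the live scan to the scene model; the resulting transform gives
# the device pose relative to the captured environment.
est = o3d.pipelines.registration.TransformationEstimationPointToPoint()
result = o3d.pipelines.registration.registration_icp(
    scan, scene,
    max_correspondence_distance=0.05,  # 5 cm search radius (assumed)
    init=np.eye(4),                    # e.g. the previous frame's pose
    estimation_method=est)
print("device pose:\n", result.transformation)
```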
75. Using Machine Learning to Improve Tracking
• Visual inertial navigation (ARKit, ARCore)
• relies on continuous visual tracking; fails with bad lighting, fast motion, repeated textures, etc.
• Can use the IMU to reduce dependence on visual information
• But the IMU drifts over time, so deep learning is used to robustly estimate relative displacement
• Integrate the neural network's IMU observations with a visual-inertial navigation system (a toy fusion sketch follows the reference below)
Chen, D., Wang, N., Xu, R., Xie, W., Bao, H., & Zhang, G. (2021). RNIN-VIO: Robust Neural Inertial
Navigation Aided Visual-Inertial Odometry in Challenging Scenes.
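The underlying idea can be expressed very simply: when visual tracking degrades, lean more heavily on the learned inertial displacement. A toy complementary blend under assumed inputs, not the RNIN-VIO estimator:

```python
import numpy as np

def fuse_position(p_visual, p_inertial, visual_confidence):
    """Blend a visual position estimate with a learned-IMU displacement
    estimate. visual_confidence in [0, 1]: drops toward 0 under bad
    lighting, fast motion, or repeated textures."""
    w = np.clip(visual_confidence, 0.0, 1.0)
    return w * np.asarray(p_visual) + (1 - w) * np.asarray(p_inertial)

# When the camera is confident, trust it; when it fails, the drift-prone
# but always-available inertial branch keeps the pose alive.
print(fuse_position([1.00, 0.0, 0.5], [1.05, 0.0, 0.5], 0.9))  # ~visual
print(fuse_position([9.99, 0.0, 0.5], [1.05, 0.0, 0.5], 0.0))  # inertial
```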
78. Wide Area Outdoor Tracking
• Process
• Combine panoramas into point cloud model (offline)
• Initialize camera tracking from point cloud
• Update pose by aligning camera image to point cloud
• Accurate to 25 cm and 0.5 degrees over a very wide area
Ventura, J., & Hollerer, T. (2012). Wide-area scene mapping for mobile visual tracking. In Mixed
and Augmented Reality (ISMAR), 2012 IEEE International Symposium on (pp. 3-12). IEEE.
79. Wide Area Outdoor Tracking
https://www.youtube.com/watch?v=8ZNN0NeXV6s
80. AR Cloud Based Tracking
• AR Cloud
• a machine-readable 1:1 scale model of the real world
• processing recognition/tracking data in the cloud
• Can create cloud from input from multiple devices
• Store key visual features in the cloud; stitch features from multiple devices
• Retrieve for tracking/interaction
• AR Cloud Companies
• 6D.ai, Vertical.ai, Ubiquity6, etc
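A toy sketch of the storage pattern: key visual features by a coarse location tile so that multiple devices contribute to, and localize from, one shared map. Everything here (tile size, function names) is an illustrative assumption, not any vendor's API:

```python
from collections import defaultdict

# Hypothetical AR-cloud feature store: visual features keyed by a coarse
# location tile so multiple devices share (and localize from) one map.
cloud_map = defaultdict(list)

def tile(lat, lon):
    # ~100 m tiles (rough); a real system would use proper geospatial indexing.
    return (int(lat * 1000), int(lon * 1000))

def upload_features(lat, lon, device_id, descriptors):
    """A device contributes its key visual features to the shared map."""
    cloud_map[tile(lat, lon)].append(
        {"device": device_id, "descriptors": descriptors})

def fetch_features(lat, lon):
    """A device retrieves the stitched nearby features for localization."""
    return cloud_map[tile(lat, lon)]

upload_features(-34.9285, 138.6007, "phone-A", ["f1", "f2"])
upload_features(-34.9285, 138.6007, "hmd-B", ["f3"])
print(len(fetch_features(-34.9285, 138.6007)))  # 2 contributions
```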
82. Large Scale Outdoor Applications
• Use server to combine room scale maps into larger play area
• Provide game elements to keep players in the game area
• Enable coordination between players
Rompapas, D. C., Sandor, C., Plopski, A., Saakes, D., Shin, J., Taketomi, T., & Kato, H. (2019). Towards
large scale high fidelity collaborative augmented reality. Computers & Graphics, 84, 24-41.
85. Research Needed in Many Areas
• Social Acceptance
• Overcome social problems with AR
• Cloud Services
• Cloud based storage/processing
• AR Authoring Tools
• Easy content creation for non-experts
• Collaborative Experiences
• AR teleconferencing
• Etc..
94. Changing Perspective
• View from the remote user's perspective
• Wearable Teleconferencing
• audio, video, pointing
• send task space video
• CamNet (1992)
• British Telecom
• Similar CMU study (1996)
• cut performance time in half
95. AR for Remote Collaboration
• Camera + Processing + AR Display + Connectivity
• First person Ego-Vision Collaboration
97. Shared Sphere – 360 Video Sharing
(Figure: a host user shares live 360 video with a remote guest user)
Lee, G. A., Teo, T., Kim, S., & Billinghurst, M. (2017). Mixed reality collaboration through sharing a
live panorama. In SIGGRAPH Asia 2017 Mobile Graphics & Interactive Applications (pp. 1-4).
107. Sharing: Separating Cues from Body
• What happens when you can’t see your colleague/agent?
Piumsomboon, T., Lee, G. A., Hart, J. D., Ens, B., Lindeman, R. W., Thomas, B. H., & Billinghurst, M. (2018, April). Mini-me: An adaptive
avatar for mixed reality remote collaboration. In Proceedings of the 2018 CHI conference on human factors in computing systems (pp. 1-13).
(Figure panels: collaborating; collaborator out of view)
108. Mini-Me Communication Cues in MR
• When a user loses sight of their collaborator, a Mini-Me avatar appears
• Miniature avatar in the real world
• Mini-Me points to shared objects and shows communication cues
• Redirected gaze, gestures
113. Empathic Computing
Can we develop systems that allow us to share what we are seeing, hearing and feeling with others?
Piumsomboon, T., Lee, Y., Lee, G. A., Dey, A., & Billinghurst, M. (2017). Empathic Mixed
Reality: Sharing What You Feel and Interacting with What You See. In Ubiquitous Virtual
Reality (ISUVR), 2017 International Symposium on (pp. 38-41). IEEE.
114. Empathy Glasses (CHI 2016)
• Combines eye tracking, a see-through display, and facial expression sensing
• Implicit cues – eye gaze, facial expression
(Hardware: Pupil Labs eye tracker + Epson BT-200 display + AffectiveWear facial expression sensor)
Masai, K., Sugimoto, M., Kunze, K., & Billinghurst, M. (2016, May). Empathy Glasses. In Proceedings of
the 34th Annual ACM Conference Extended Abstracts on Human Factors in Computing Systems. ACM.
115. Remote Collaboration
• Eye gaze pointer and remote pointing
• Face expression display
• Implicit cues for remote collaboration
126. Technology Trends
• Advanced displays
• Wide FOV, high resolution
• Real time space capture
• 3D scanning, stitching, segmentation
• Natural gesture interaction
• Hand tracking, pose recognition
• Robust eye-tracking
• Gaze points, focus depth
• Emotion sensing/sharing
• Physiological sensing, emotion mapping
127. These technology trends converge toward Empathic Tele-Existence:
• Advanced displays
• Real time space capture
• Natural gesture interaction
• Robust eye-tracking
• Emotion sensing/sharing
128. Empathic Tele-Existence
• Move from Observer to Participant
• Explicit to Implicit communication
• Experiential collaboration – doing together
129. Empathic Tele-Existence
• Know what someone is seeing, hearing, feeling
• Feel that you are in the same environment with them
• Seeing virtual people with you in your real world
133. Social Acceptance
• People don’t want to look silly
• Only 12% of 4,600 adults would be willing to wear AR glasses
• 20% of mobile AR browser users experience social issues
• Acceptance more due to Social than Technical issues
• Needs further study (ethnographic, field tests, longitudinal)
137. Ethical Issues
• Persuasive Technology
• Affecting emotions
• Behaviour modification
• Privacy Concerns
• Facial recognition
• Space capture
• Personal data
• Safety Concerns
• Sim sickness, Distraction
• Long term effects
Pase, S. (2012). Ethical considerations in augmented reality applications. In Proceedings of the International Conference on
e-Learning, e-Business, Enterprise Information Systems, and e-Government (EEE) (p. 1). The Steering Committee of The
World Congress in Computer Science, Computer Engineering and Applied Computing (WorldComp).
138. Identification in VR
• Collected set of features from body motion in VR
• Have people perform standard actions
• Able to recognize people with about 40% accuracy (cf. 5% by chance)
• Relative distances between head and hands produced best results
• Head motion alone 30% accurate
Pfeuffer, K., Geiger, M. J., Prange, S., Mecke, L., Buschek, D., & Alt, F. (2019, May). Behavioural biometrics in
vr: Identifying people from body motion and relations in virtual reality. In Proceedings of the 2019 CHI
Conference on Human Factors in Computing Systems (pp. 1-12).
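A sketch of the general recipe on synthetic stand-in data: build per-frame features from the relative head-hand distances (the best-performing features above) and train an off-the-shelf classifier. The data and accuracy here are synthetic, not the paper's results:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def motion_features(head, lhand, rhand):
    """Per-frame behavioural-biometric features: the relative
    head-hand distances that worked best in the study above."""
    return [np.linalg.norm(head - lhand),
            np.linalg.norm(head - rhand),
            np.linalg.norm(lhand - rhand)]

# Synthetic stand-in data: 3 'users', 200 frames each, with user-specific
# offsets playing the role of individual body proportions and habits.
X, y = [], []
for user in range(3):
    for _ in range(200):
        head = rng.normal([0, 1.7, 0], 0.02) + user * 0.03
        lhand = rng.normal([-0.3, 1.2, 0.3], 0.05)
        rhand = rng.normal([0.3, 1.2, 0.3], 0.05)
        X.append(motion_features(head, lhand, rhand))
        y.append(user)

# Train on even frames, evaluate identification on odd frames.
clf = RandomForestClassifier(n_estimators=50).fit(X[::2], y[::2])
print("held-out accuracy:", clf.score(X[1::2], y[1::2]))
```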
140. Trauma in VR
• In VR the goal is immersion, presence and a sense of self
• What are the consequences of having a traumatic, aggressive or emotional in-VR experience?
142. Scaling Up
• Supporting Large Groups of People
• Social VR spaces
• Large scale events
• Hybrid Interfaces
• AR/VR users with desktop/mobile
• Persistent virtual worlds
143. Research Issues
• Avatars
• How to easily create?
• How realistic should they be?
• How can you communicate social cues?
• Hybrid Interfaces
• How can you provide equity across different devices?
• Social Presence
• How can you objectively measure Social Presence?
• How can AR/VR cues be used to increase Social Presence?
144. The Metaverse
• Neal Stephenson's "Snow Crash"
• VR successor to the internet
• The Metaverse is the convergence of:
• 1) virtually enhanced physical reality
• 2) physically persistent virtual space
• Metaverse Roadmap
• http://metaverseroadmap.org/
147. Parisi’s Seven Rules of the Metaverse
• Rule #1: There is only one Metaverse.
• Rule #2: The Metaverse is for everyone.
• Rule #3: Nobody controls the Metaverse.
• Rule #4: The Metaverse is open.
• Rule #5: The Metaverse is hardware-independent.
• Rule #6: The Metaverse is a Network.
• Rule #7: The Metaverse is the Internet.
https://medium.com/meta-verses/the-seven-rules-of-the-metaverse-7d4e06fa864c
148. Possible Research Directions
• Creating open platforms
• Scaling up social interaction
• Recognizing and sharing emotion at scale
• Creating intimate interactions in crowded spaces
• Bringing the real into the virtual
• Novel forms of social interaction
• Intelligent avatars for facilitating social interactions
• Creating persistent hybrid communities
• Supporting equity and equality
• Facilitating successful emergent behaviours
• And more..
151. Conclusions
• AR/VR/MR is becoming commonly available
• Significant advances over 50+ years
• In order to achieve Sutherland's vision, research on the basics is needed
• Display, Tracking, Input
• New MR technologies will enable this to happen
• Display devices, Interaction, Tracking technologies
• There are still significant areas for research
• Social Acceptance, Collaboration, Ethics, Etc.
153. Trends in AR Research
Kim, K., Billinghurst, M., Bruder, G., Duh, H. B. L., & Welch, G. F. (2018). Revisiting trends in augmented
reality research: A review of the 2nd decade of ISMAR (2008–2017). IEEE Transactions on Visualization
and Computer Graphics, 24(11), 2947-2962.