Lecture 8 of the COMP 4010 course, taught by Mark Billinghurst at the University of South Australia on September 14th, 2021. This lecture provides an introduction to VR technology.
4. AR Design Considerations
• 1. Design for Humans
• Use Human Information Processing model
• 2. Design for Different User Groups
• Different users may have unique needs
• 3. Design for the Whole User
• Social, cultural, emotional, physical, cognitive
• 4. Use UI Best Practices
• Adapt known UI guidelines to AR/VR
• 5. Use of Interface Metaphors/Affordances
• Decide best metaphor for AR/VR application
5. 1. Design for Human Information Processing
• High-level staged model from Wickens and Carswell (1997)
• Relates perception, cognition, and physical ergonomics
[Diagram stages: Perception → Cognition → Ergonomics]
6. Design for Perception
• Need to understand perception to design AR
• Visual perception
• Many types of visual cues (stereo, oculomotor, etc.)
• Auditory system
• Binaural cues, vestibular cues
• Somatosensory
• Haptic, tactile, kinesthetic, proprioceptive cues
• Chemical Sensing System
• Taste and smell
8. Design for Cognition
• Design for Working and Long-term memory
• Working memory
• Short-term storage, limited capacity (~5-9 items)
• Long term memory
• Memory recall triggered by associative cues
• Situational Awareness
• Model of current state of user’s environment
• Used for wayfinding, object interaction, spatial awareness, etc.
• Provide cognitive cues to help with situational awareness
• Landmarks, procedural cues, map knowledge
• Support both ego-centric and exo-centric views
9. Design for Physical Ergonomics
• Design for the human motion range
• Consider human comfort and natural posture
• Design for hand input
• Coarse and fine scale motions, gripping and grasping
• Avoid “Gorilla arm syndrome” from holding arm pose
10. Gorilla Arm in AR
• Design interface to reduce mid-air gestures
15. • AR design is a mixture of physical affordances and virtual affordances
• Physical
• Tangible controllers and objects
• Virtual
• Virtual graphics and audio
16. Affordances in AR
• Design AR interface objects to show how they are used
• Use visual and physical cues to show possible affordances
• Perceived affordances should match actual affordances
• Physical and virtual affordances should match
[Images: Merge Cube, Tangible Molecules]
20. Design Patterns
“Each pattern describes a problem which occurs
over and over again in our environment, and then
describes the core of the solution to that problem in
such a way that you can use this solution a million
times over, without ever doing it the same way twice.”
– Christopher Alexander et al.
Use Design Patterns to Address Reoccurring Problems
Alexander, C., et al. (1977). A Pattern Language. Oxford University Press, New York.
21. Design Patterns for Handheld AR
• Set of design patterns for Handheld AR
• Title: a short phrase that is memorable
• Definition: what experiences the pre-pattern supports
• Description: how and why the pre-pattern works, and what aspects of game design it is based on
• Examples: illustrate the meaning of the pre-pattern
• Using the pre-patterns: reveal the challenges and context of applying the pre-patterns
Xu, Y., Barba, E., Radu, I., Gandy, M., Shemaka, R., Schrank, B., ... & Tseng, T.
(2011, October). Pre-patterns for designing embodied interactions in handheld
augmented reality games. In 2011 IEEE International Symposium on Mixed and
Augmented Reality-Arts, Media, and Humanities (pp. 19-28). IEEE.
22. Handheld AR Design Patterns
Title | Meaning | Embodied Skills (*A&S = awareness and skills)
• Device Metaphors | Using metaphor to suggest available player actions | Body A&S, Naïve physics
• Control Mapping | Intuitive mapping between physical and digital objects | Body A&S, Naïve physics
• Seamful Design | Making sense of and integrating the technological seams through game design | Body A&S
• World Consistency | Whether the laws and rules in the physical world hold in the digital world | Naïve physics, Environmental A&S
• Landmarks | Reinforcing the connection between digital and physical space through landmarks | Environmental A&S
• Personal Presence | The way a player is represented in the game decides how much they feel like living in the digital game world | Environmental A&S, Naïve physics
• Living Creatures | Game characters that are responsive to physical and social events, mimicking behaviours of living beings | Social A&S, Body A&S
• Body Constraints | Movement of one's body position constrains another player's action | Body A&S, Social A&S
• Hidden Information | Information that can be hidden and revealed can foster emergent social play | Social A&S, Body A&S
24. ARCore Elements App
• Mobile AR app demonstrating
interface guidelines
• Multiple Interface Guidelines
• User interface
• User environment
• Object manipulation
• Off-screen markers
• Etc.
• Test on Device
• https://play.google.com/store/apps/details?id=com.google.ar.unity.ddelements
28. The Trouble with AR Design Guidelines
1) Rapidly evolving best practices
Still a moving target, lots to learn about AR design
Slowly emerging design patterns, but often change with OS updates
Already major differences between device platforms
2) Challenges with scoping guidelines
Often too high level, like “keep the user safe and comfortable”
Or, too application/device/vendor-specific
3) Best guidelines come from learning by doing
Test your designs early and often, learn from your own “mistakes”
Mind differences between VR and AR, but less so between devices
30. From Reality to Virtual Reality
[Spectrum from the Real World to the Virtual World: Internet of Things → Augmented Reality → Virtual Reality]
31. Virtual Reality (VR)
• Users immersed in a computer-generated environment
• HMD, gloves, 3D graphics, body tracking
32. Goal of Virtual Reality
“.. to make it feel like you’re actually in a place that
you are not.”
Palmer Luckey
Co-founder, Oculus
33. Virtual Reality Definition
•Defining Characteristics
• Immersion
• User feels immersed in computer generated scene
• Interaction
• The virtual content can be interacted with
• Independence
• User can have an independent view and react to the environment
34. From Immersion to Presence
• Immersion: describes the extent to which technology is capable of
delivering a vivid illusion of reality to the senses of a human participant.
• Presence: a state of consciousness, the (psychological) sense of being
in the virtual environment.
• So Immersion, defined in technical terms, is capable of producing a
sensation of Presence
• Goal of VR: Create a high degree of Presence
• Make people believe they are really in Virtual Environment
Slater, M., & Wilbur, S. (1997). A framework for immersive virtual environments (FIVE): Speculations on the role
of presence in virtual environments. Presence: Teleoperators and virtual environments, 6(6), 603-616.
35. Presence ..
“The subjective experience of being in one place or
environment even when physically situated in another”
Witmer, B. G., & Singer, M. J. (1998). Measuring presence in virtual environments: A presence
questionnaire. Presence: Teleoperators and virtual environments, 7(3), 225-240.
36. Reality vs. Virtual Reality
• In a VR system there are input and output devices
between human perception and action
37. Using Technology to Stimulate Senses
• Simulate output
• E.g. simulate real scene
• Map output to devices
• Graphics to HMD
• Use devices to
stimulate the senses
• HMD stimulates eyes
[Diagram, Example: Visual Simulation — 3D Graphics → HMD → Vision System → Brain, spanning the Human-Machine Interface]
38. Key Technologies for VR Systems
• Display (Immersion)
• Stimulate senses
• visual, auditory, tactile sense, etc.
• Tracking (Independence)
• Changing viewpoint
• independent movement
• Input Devices (Interaction)
• Supporting user interaction
• User input
46. Simple Magnifier HMD Design
[Diagram: display (image source) at distance p from the eyepiece (one or more lenses); the eye views a magnified virtual image at distance q]
1/p + 1/q = 1/f, where
• p = object distance (distance from image source to eyepiece)
• q = image distance (distance of the virtual image from the lens)
• f = focal length of the lens
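As a quick check of the thin-lens relation above, here is a minimal Python sketch; the 40 mm focal length and 36 mm display distance are made-up values for illustration, not from a real HMD:

```python
# Minimal sketch: solve the thin-lens equation 1/p + 1/q = 1/f for the
# image distance q. A negative q means a virtual image on the same side
# of the lens as the display, which is the simple-magnifier case.

def image_distance(p_mm: float, f_mm: float) -> float:
    """All distances in millimetres."""
    return 1.0 / (1.0 / f_mm - 1.0 / p_mm)

# Hypothetical eyepiece: f = 40 mm, display 36 mm from the lens.
q = image_distance(p_mm=36.0, f_mm=40.0)
print(f"virtual image at {q:.0f} mm")  # -360 mm: virtual image 360 mm away
```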
52. Field of View
Monocular FOV is the angular
subtense of the displayed image as
measured from the pupil of one eye.
Total FOV is the total angular size of the displayed image visible to either eye.
Binocular (or stereoscopic) FOV refers to the
part of the displayed image visible to both eyes.
FOV may be measured horizontally,
vertically or diagonally.
64. HMD Design Trade-offs
• Resolution vs. field of view
• As FOV increases, angular resolution decreases for a fixed number of pixels (see the sketch after this list)
• Eye box vs. field of view
• Larger eye box limits field of view
• Size, Weight and Power vs. everything else
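To make the resolution trade-off concrete, a minimal sketch; the 1920-pixel panel width and FOV values are assumed for illustration, and ~60 pixels per degree is often cited as the resolving limit of the human eye:

```python
# Minimal sketch: for a fixed panel width, widening the field of view
# lowers angular resolution (pixels per degree, PPD).

def pixels_per_degree(h_pixels: int, fov_deg: float) -> float:
    return h_pixels / fov_deg

for fov in (60, 90, 110):
    print(f"FOV {fov:3d} deg -> {pixels_per_degree(1920, fov):.1f} PPD")
# FOV 60 deg -> 32.0 PPD; 90 -> 21.3; 110 -> 17.5 (vs ~60 PPD for the eye)
```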
65. Projection/Large Display Technologies
• Room Scale Projection
• CAVE, multi-wall environment
• Dome projection
• Hemisphere/spherical display
• Head/body inside
• Vehicle Simulator
• Simulated visual display in windows
66. Stereo Projection
• Active Stereo
• Active shutter glasses
• Time synced signal
• Brighter images
• More expensive
• Passive Stereo
• Polarized images
• Two projectors (one per eye)
• Cheap glasses (powerless)
• Lower resolution/dimmer
• Less expensive
67. CAVE
• Developed in 1992 at the Electronic Visualization Laboratory (EVL), University of Illinois Chicago
• Multi-walled stereo projection environment
• Head tracked active stereo
Cruz-Neira, C., Sandin, D. J., DeFanti, T. A., Kenyon, R. V., & Hart, J. C. (1992). The CAVE: audio
visual experience automatic virtual environment. Communications of the ACM, 35(6), 64-73.
72. Multi-User CAVEs
• Limitation of CAVEs
• Stereo projection from only one user’s viewpoint
• Solution
• Higher frequency projectors and time slicing
Kulik, A., Kunert, A., Beck, S., Reichel, R., Blach, R., Zink, A., & Froehlich, B. (2011). C1x6: a
stereoscopic six-user display for co-located collaboration in shared virtual environments. ACM
Transactions on Graphics (TOG), 30(6), 188.
75. Technology
• Large working volume: 10.5 m x 7.5 m x 4.0 m
• 360° surround projection (front projection)
• Five 4K @ 120 Hz 3D projectors, one 2K @ 120 Hz 3D projector
• Complex screen geometry: rounded corners, overhanging ceiling
• Software: TechViz, Unreal, Panda3D
77. Allosphere
• University of California, Santa Barbara
• One of a kind facility
• Immersive Spherical display
• 10 m diameter
• Inside a three-story anechoic cube
• Passive stereoscopic projection
• 26 projectors, 146 speakers
• Visual tracking system for input
• See http://www.allosphere.ucsb.edu/
Kuchera-Morin, J., Wright, M., Wakefield, G.,
Roberts, C., Adderton, D., Sajadi, B., ... & Majumder,
A. (2014). Immersive full-surround multi-user system
design. Computers & Graphics, 40, 10-21.
84. Audio Displays
Definition: Computer interfaces that provide synthetic sound
feedback to users interacting with the virtual world.
The sound can be monaural (both ears hear the same sound), or
binaural (each ear hears a different sound)
Burdea, Coiffet (2003)
85. Motivation
• Most of the focus in Virtual Reality is on the visuals
• GPUs continue to drive the field
• Users want more
• More realism, More complexity, More speed
• However, sound can significantly enhance realism
• Example: Mood music in horror games
• Sound can provide valuable user interface feedback
• Example: Alert in training simulation
86. 360 Video + Spatial Audio (wear headphones)
• https://www.youtube.com/watch?v=G8pABGosD38
87. Creating/Capturing Sounds
• Sounds can be captured from nature (sampled) or synthesized
computationally
• High-quality recorded sounds are
• Cheap to play
• Easy to create realism
• Expensive to store and load
• Difficult to manipulate for expressiveness
• Synthetic sounds are
• Cheap to store and load
• Easy to manipulate
• Expensive to compute before playing
• Difficult to create realism
88. Types of Audio Recordings
• Monaural: Recording with one microphone – no positioning
• Stereo Sound: Recording with two microphones placed several feet
apart. Perceived sound position as recorded by microphones.
• Binaural: Recording microphones embedded in a dummy head. Audio
filtered by head shape.
• 3D Sound: Using tiny microphones in the ears of a real person.
Generate HRTF based on ear shape and audio response.
89. Capturing 3D Audio for Playback
• Binaural recording
• 3D Sound recording, from microphones in simulated ears
• Hear some examples (use headphones)
• http://binauralenthusiast.com/examples/
91. Synthetic Sounds
• Complex sounds can be built from simple waveforms (e.g., sawtooth, sine)
and combined using operators
• Waveform parameters (frequency, amplitude) could be taken from motion
data, such as object velocity
• Can combine wave forms in various ways
• This is what classic synthesizers do
• Works well for many non-speech sounds
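A minimal additive-synthesis sketch of the idea above, using NumPy; the speed-to-pitch mapping and the waveform mix are invented for illustration, and writing the result to an audio device is omitted:

```python
import numpy as np

SAMPLE_RATE = 44_100  # samples per second

def sine(freq_hz: float, secs: float, amp: float = 1.0) -> np.ndarray:
    t = np.linspace(0.0, secs, int(SAMPLE_RATE * secs), endpoint=False)
    return amp * np.sin(2.0 * np.pi * freq_hz * t)

def sawtooth(freq_hz: float, secs: float, amp: float = 1.0) -> np.ndarray:
    t = np.linspace(0.0, secs, int(SAMPLE_RATE * secs), endpoint=False)
    return amp * (2.0 * ((t * freq_hz) % 1.0) - 1.0)

# Take a waveform parameter from motion data: faster object -> higher pitch.
object_speed = 3.0                      # hypothetical velocity (m/s)
base_freq = 200.0 + 50.0 * object_speed

# Combine simple waveforms: a sine fundamental plus a quieter sawtooth an
# octave up, much as a classic synthesizer would.
sound = sine(base_freq, 1.0, amp=0.6) + sawtooth(2 * base_freq, 1.0, amp=0.2)
```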
95. Spatialization vs. Localization
• Spatialization is the processing of sound signals to make them
emanate from a point in space
• This is a technical topic
• Localization is the ability of people to identify the source position
of a sound
• This is a human topic, some people are better at it than others.
96. Stereo Sound
• Seems to come from inside the user's head
• Follows head motion as user moves head
97. 3D Spatial Sound
• Seems to be external to the head
• Fixed in space when user moves head
• Has reflected sound properties
99. Spatialized Audio Effects
• Naïve approach
• Simple left/right shift for lateral position
• Amplitude adjustment for distance
• Easy to produce using consumer hardware/software (see the sketch after this list)
• Does not give us "true" realism in sound
• No up/down or front/back cues
• We can use multiple speakers for this
• Surround the user with speakers
• Send different sound signals to each one
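A minimal sketch of the naive approach above; constant-power panning is one common way to do the left/right shift, and the gain and distance models here are illustrative rather than taken from any particular engine:

```python
import numpy as np

def naive_spatialize(mono: np.ndarray, azimuth_deg: float,
                     distance_m: float) -> np.ndarray:
    """Pan a mono signal left/right and attenuate with distance.

    azimuth_deg: -90 (hard left) .. +90 (hard right); returns (N, 2) stereo.
    Gives no up/down or front/back cues, which is exactly the limitation
    noted above.
    """
    theta = (azimuth_deg + 90.0) / 180.0 * (np.pi / 2.0)  # constant-power pan
    left_gain, right_gain = np.cos(theta), np.sin(theta)
    atten = 1.0 / max(distance_m, 1.0)  # simple inverse-distance falloff
    return np.stack([mono * left_gain * atten,
                     mono * right_gain * atten], axis=1)
```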
100. Example: The BoomRoom
• Use surround speakers to create spatial audio effects
• Gesture based interaction
• https://www.youtube.com/watch?time_continue=54&v=6RQMOyQ3lyg
101. Audio Localization
• Main cues used by humans to localize sound:
1. Interaural time differences: Time difference for
sound wave to travel between ears
2. Interaural level differences: For high frequency
sounds (> 1.5 kHz), volume difference between
ears used to determine source direction
3. Spectral filtering done by outer ears: Ear shape
changes frequency heard
102. Interaural Time Difference
• Sound takes a fixed time to travel between the ears
• Can use time difference to determine sound location
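A minimal sketch of turning that time difference into numbers, using the classic spherical-head (Woodworth) approximation; the head radius is an assumed average, and real heads deviate from the model:

```python
import math

HEAD_RADIUS_M = 0.0875   # assumed average head radius (~8.75 cm)
SPEED_OF_SOUND = 343.0   # m/s in air at room temperature

def itd_seconds(azimuth_deg: float) -> float:
    """Woodworth approximation: ITD = (r/c) * (theta + sin(theta))."""
    theta = math.radians(azimuth_deg)  # 0 = straight ahead, 90 = to the side
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (theta + math.sin(theta))

print(f"{itd_seconds(90.0) * 1e6:.0f} microseconds")  # ~656 us at the side
```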
103. Spectral Filtering
Ear shape filters sound depending on direction it is coming
from. This change in frequency determines sound source
elevation.
104. Head-Related Transfer Functions (HRTFs)
• A set of functions that model how sound from a
source at a known location reaches the eardrum
105. More About HRTFs
• Functions take into account:
• Individual ear shape
• Slope of shoulders
• Head shape
• So, each person has their own HRTF!
• Need parameterizable HRTFs
• Some sound cards/APIs allow specifying an HRTF
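In practice an HRTF is applied by convolving the source signal with a per-ear impulse response. A minimal sketch; the two toy filters below are invented placeholders, whereas real head-related impulse responses are measured and typically hundreds of taps long:

```python
import numpy as np

def apply_hrtf(mono: np.ndarray, hrir_left: np.ndarray,
               hrir_right: np.ndarray) -> np.ndarray:
    """Convolve a mono signal with per-ear impulse responses -> (N, 2) stereo."""
    left = np.convolve(mono, hrir_left)[: len(mono)]    # trim so both
    right = np.convolve(mono, hrir_right)[: len(mono)]  # channels align
    return np.stack([left, right], axis=1)

# Toy filters: right ear delayed and quieter, as for a source on the left.
hrir_l = np.array([1.0, 0.3])
hrir_r = np.array([0.0, 0.0, 0.6, 0.2])
stereo = apply_hrtf(np.random.randn(44_100), hrir_l, hrir_r)
```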
107. Measuring HRTFs
• Put microphones in a mannequin's or a person's ears
• Play sounds from fixed positions
• Record the response
108. Environmental Effects
• Sound is also changed by objects in the
environment
• Can reverberate off of reflective objects
• Can be absorbed by objects
• Can be occluded by objects
• Doppler shift from moving sound sources (see the sketch after this list)
• Need to simulate environmental audio properties
• Takes significant processing power
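A minimal sketch of the Doppler shift mentioned in the list above, using the standard formula for a moving source and stationary listener; the frequencies and speed are illustrative values:

```python
SPEED_OF_SOUND = 343.0  # m/s

def doppler_shift(f_source_hz: float, v_toward_ms: float) -> float:
    """Heard frequency: f * c / (c - v); v > 0 means the source approaches."""
    return f_source_hz * SPEED_OF_SOUND / (SPEED_OF_SOUND - v_toward_ms)

print(f"{doppler_shift(440.0, 20.0):.1f} Hz")  # 440 Hz source at 20 m/s -> ~467 Hz
```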
109. Sound Reverberation
• Need to consider first and second order reflections
• Need to model material properties, objects in the room, etc.
111. The Tough Part
• All of this takes a lot of processing
• Need to keep track of
• Multiple (possibly moving) sound sources
• Path of sounds through a dynamic environment
• Position and orientation of listener(s)
• Most sound cards only support a limited number of
spatialized sound channels
• Increasingly complex geometry increases load on
audio system as well as visuals
• That's why we fake it ;-)
• GPUs might change this too!
112. GPU Based Audio Acceleration
• Using GPU for audio physics calculations
• AMD TrueAudio Next - https://gpuopen.com/true-audio-next/
https://www.youtube.com/watch?v=Z6nwYLHG8PU
113. Audio Software SDKs
• Modern CPUs are fast enough that spatial audio can be generated without dedicated hardware
• Several 3D audio SDKs exist
• OpenAL
• www.openal.org
• Open source, cross platform
• Renders multichannel three-dimensional positional audio
• Google VR SDK
• Android, iOS, Unity
• https://developers.google.com/vr/concepts/spatial-audio
• Unity
• Unity Audio Spatializer SDK
• Microsoft DirectX, MRTK, etc
114. Google VR Spatial Audio Demo
https://www.youtube.com/watch?v=I9zf4hCjRg0&feature=youtu.be
115. Demo: Spatial Audio In VR
• AltspaceVR spatial audio for speaker discrimination
• https://youtu.be/yKxhjqW2Vuc
116. Designing Spatial Audio
• There are several tools available for designing 3D audio
• E.g. Facebook Spatial Workstation
• Audio tools for cinematic VR and 360 video
• https://facebook360.fb.com/spatial-workstation/
• Spatial Audio Designer
• Mixing of surround sound and 3D audio
• http://www.newaudiotechnology.com/en/products/spatial-audio-designer/
118. Haptic Feedback
• Greatly improves realism
• Hands and wrists are most important
• High density of touch receptors
• Two kinds of feedback:
• Touch Feedback
• information on texture, temperature, etc.
• Does not resist user contact
• Force Feedback
• information on weight and inertia
• Actively resists contact motion
119. Active Haptics
• Actively resists motion
• Key properties
• Force resistance
• Frequency Response
• Degrees of Freedom
• Latency
120. Force Feedback Joysticks
• WingMan Force 3D
• Inexpensive ($60)
• Actuators that can move the
joystick given system
commands
• Max 3.3 N of force
• Force feedback driving wheel
126. Homebrew Glove
• LucidVR Budget Haptic Glove
• Simple hand tracking, force feedback
• $22 in parts
• https://hackaday.io/project/178243-lucidvr-budget-haptic-glove
127. Passive Haptics
• Not controlled by system
• Use real props (Styrofoam for walls)
• Pros
• Cheap
• Large scale
• Accurate
• Cons
• Not dynamic
• Limited use
131. Vibrotactile Cueing Devices
• Vibrotactile feedback has been incorporated into many devices
• Can we use this technology to provide scalable, wearable touch cues?
136. Immersion and Tracking
• Motivation: For immersion, when the user changes
position in reality the VR view also needs to change
• Requires tracking of the user’s pose (position/orientation) in
the real world and mapping to the Virtual World
137. Tracking in VR
• Need for Tracking
• User turns their head and the VR graphics scene changes
• User wants to walk through a virtual scene
• User reaches out and grabs a virtual object
• The user wants to use a real prop in VR
• All of these require technology to track the user or object
• Continuously provide information about position and orientation
[Images: head tracking, hand tracking]
139. Degrees of Freedom
• Degree of Freedom (DoF) = independent movement about an axis
• 3 DoF orientation = roll, pitch, yaw (rotation about the x, y, or z axis)
• 3 DoF translation = movement along the x, y, z axes
• Different requirements
• User turns their head in VR -> needs a 3 DoF orientation tracker
• Moving in VR -> needs a 6 DoF tracker: (roll, pitch, yaw) and (x, y, z)
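A minimal sketch of how these degrees of freedom are often represented in code; the field names and units follow a common convention rather than any specific tracking API:

```python
from dataclasses import dataclass

@dataclass
class Pose6DoF:
    # 3 DoF translation (metres)
    x: float
    y: float
    z: float
    # 3 DoF orientation (radians); an orientation-only tracker fills just these
    roll: float
    pitch: float
    yaw: float

# Example head pose: at the origin, eyes 1.7 m up, turned ~90 degrees.
head = Pose6DoF(x=0.0, y=1.7, z=0.0, roll=0.0, pitch=0.0, yaw=1.57)
```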
141. Key Tracking Performance Criteria
• Static Accuracy
• Dynamic Accuracy
• Latency
• Update Rate
• Tracking Jitter
• Signal to Noise Ratio
• Tracking Drift
142. Static vs. Dynamic Accuracy
• Static Accuracy
• Ability of tracker to determine
coordinates of a position in space
• Depends on sensor sensitivity, errors
(algorithm, operator), environment
• Dynamic Accuracy
• System accuracy as sensor moves
• Depends on static accuracy
• Resolution
• Minimum change sensor can detect
• Repeatability
• Same input giving same output
143. Tracker Latency, Update Rate
• Latency: Time between change
in object pose and time sensor
detects the change
• Large latency (> 10 ms) can cause
simulator sickness
• Larger latency (> 50 ms) can
reduce VR immersion
• Update Rate: Number of
measurements per second
• Typically > 30 Hz
144. Tracker Jitter, Signal to Noise Ratio
• Jitter: Change in tracker output
when tracked object is stationary
• Range of change is sensor noise
• Tracker with no jitter reports constant
value if tracked object stationary
• Makes tracker data change randomly about an average value
• Signal to Noise Ratio: Signal in
data relative to noise
• Found from calculating mean of
samples in known positions
145. Tracker Drift
• Drift: Steady increase in
tracker error over time
• Accumulative (additive) error
over time
• Relative to Dynamic sensitivity
over time
• Controlled by periodic recalibration (zeroing)
147. Example: Fake Space Boom
• BOOM (Binocular Omni-Orientation Monitor)
• Counterbalanced arm with a 100° FOV HMD mounted on it
• 6 DOF, 4 mm position accuracy, 300 Hz sampling, < 5 ms latency
148. Demo: Fake Space Telepresence
• Using Boom with HMD to control robot view
• https://www.youtube.com/watch?v=QpTQTu7A6SI
149. Magnetic Tracker (Active)
• Idea: measure the field between a magnetic transmitter and a receiver
• ++: 6 DOF, robust
• --: wired, sensitive to metal, noisy, expensive
• --: error increases with distance
Flock of Birds (Ascension)
150. Example: Razer Hydra
• Developed by Sixense
• Magnetic source + 2 wired controllers
• Short range (< 1 m), precision of 1 mm and 1°
• 62 Hz sampling rate, < 50 ms latency
• $600 USD
153. Inertial Tracker (Passive)
• Idea: measure linear acceleration and angular rate (accelerometer/gyroscope)
• ++: no transmitter, cheap, small, high frequency, wireless
• --: drift, hysteresis, only 3 DOF
IS300 (Intersense), Wii Remote
154. Types of Inertial Trackers
• Gyroscopes
• The rate of change in object orientation or angular velocity is measured.
• Accelerometers
• Measure acceleration.
• Can be used to determine object position, if the starting point is known.
• Inclinometer
• Measures inclination, the "level" position
• Like a carpenter's level, but gives an electrical signal
155. Example: MEMS Sensor
• Uses spring-supported load
• Reacts to gravity and inertia
• Changes its electrical parameters
• < 5 ms latency, 0.01° accuracy
• Up to 1000 Hz sampling
• Problems
• Rapidly accumulating errors.
• Error in position increases with the square of time.
• Cheap units can get position drift of 4 cm in 2 seconds.
• Expensive units have same error in 200 seconds.
• Not good for measuring location
• Need to periodically reset the output
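The square-of-time behaviour above follows from double-integrating a constant accelerometer bias: e(t) = ½·b·t². A minimal sketch that backs out the biases implied by the slide's numbers, assuming the drift comes from a constant bias:

```python
def position_error(bias_ms2: float, t_s: float) -> float:
    """Position error from double-integrating a constant bias: 0.5*b*t^2."""
    return 0.5 * bias_ms2 * t_s ** 2

# Bias implied by 4 cm of drift after 2 s (cheap) vs after 200 s (expensive):
cheap_bias = 2 * 0.04 / 2.0 ** 2      # 0.02 m/s^2
costly_bias = 2 * 0.04 / 200.0 ** 2   # 2e-6 m/s^2, ten thousand times smaller
print(position_error(cheap_bias, 2.0), position_error(costly_bias, 200.0))
```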
156. Demo: MEMS Sensor Working
https://www.youtube.com/watch?v=9eSnxebfuxg
157. MEMS Gyro Bias Drift
• Zero reading of MEMS Gyro drifts over time due to noise
158. Acoustic / Ultrasonic Tracker
• Idea: time-of-flight or phase-coherence of sound waves
• ++: small, cheap
• --: 3 DOF, line of sight, low resolution, affected by environment (pressure, temperature), low sampling rate
Examples: Logitech ultrasonic tracker, IS600
161. HiBall Tracking System (3rd Tech)
• Inside-out tracker
• $50K USD
• Scalable over a large area
• Fast update (2000 Hz), latency less than 1 ms
• Accurate
• Position 0.4 mm RMS
• Orientation 0.02° RMS
163. Example: Oculus Quest
• Inside out tracking
• Four cameras on the corners of the display
• Searches for visual features
• On setup, creates a map of the room
170. How Lighthouse Tracking Works
• Position tracking using IMU
• 500 Hz sampling
• But drifts over time
• Drift correction using optical tracking
• IR synchronization pulse (60 Hz)
• Laser sweep between pulses
• Photo-sensors recognize sync pulse, measure time to laser
• Know when sensor hit and which sensor hit
• Calculate position of sensor relative to base station
• Use 2 base stations to calculate pose
• Use IMU sensor data between pulses (500Hz)
• See http://xinreality.com/wiki/Lighthouse
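A minimal sketch of the timing-to-angle step described above; the 60 Hz sweep rate follows the slide, while real base stations interleave horizontal and vertical sweeps and exact timing details vary by hardware revision:

```python
SWEEP_HZ = 60.0  # assumed: one full laser rotation between sync pulses

def sweep_angle_deg(t_hit_s: float, t_sync_s: float) -> float:
    """Angle of a photo-sensor from the laser sweep's start position."""
    return 360.0 * (t_hit_s - t_sync_s) * SWEEP_HZ

# A sensor hit 2.5 ms after the sync pulse lies at 54 degrees. Angles from
# two sweep axes and two base stations constrain the full sensor pose.
print(sweep_angle_deg(t_hit_s=0.0025, t_sync_s=0.0))  # 54.0
```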
171. Lighthouse Tracking
• Base station scanning: https://www.youtube.com/watch?v=avBt_P0wg_Y
• Room tracking: https://www.youtube.com/watch?v=oqPaaMR4kY4
172. Tracking Coordinate Frames
• There can be several coordinate frames to consider
• Head pose with respect to real world
• Coordinate frame of the tracking system with respect to the HMD
• Position of hand in coordinate frame of hand tracker
173. Example: Finding your hand in VR
• Using Lighthouse and LeapMotion
• Multiple Coordinate Frames
• LeapMotion tracks hand in LeapMotion coordinate frame (HLM)
• LeapMotion is fixed in HMD coordinate frame (LMHMD)
• HMD is tracked in VR coordinate frame (HMDVR) (using Lighthouse)
• Where is your hand in VR coordinate frame?
• Combine transformations in each coordinate frame
• HVR = HLM x LMHMD x HMDVR
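A minimal sketch of this composition using 4x4 homogeneous matrices in NumPy; the rotations and offsets below are invented placeholders, and note that with the column-vector convention the matrices multiply in the reverse of the slide's left-to-right order:

```python
import numpy as np

def make_transform(rotation, translation):
    """Build a 4x4 homogeneous transform from a 3x3 R and a 3-vector t."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

# Placeholder transforms (identity rotations, invented offsets):
H_LM = make_transform(np.eye(3), [0.0, 0.0, 0.3])       # hand in LeapMotion frame
LM_HMD = make_transform(np.eye(3), [0.0, -0.05, 0.08])  # LeapMotion on the HMD
HMD_VR = make_transform(np.eye(3), [1.0, 1.6, 2.0])     # HMD in the VR frame

# Hand pose in the VR frame (column-vector convention, right-to-left):
H_VR = HMD_VR @ LM_HMD @ H_LM
print(H_VR[:3, 3])  # hand position in VR coordinates: [1.  1.55 2.38]
```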