Natural Interaction for Augmented Reality Applications


Keynote talk given by Mark Billinghurst from the HIT Lab NZ at the IVCNZ 2013 conference, November 28th 2013. The talk focuses on Natural Interaction with Augmented Reality applications using speech and gesture and demonstrates some of the projects in this area developed by the HIT Lab NZ.

  1. 1. Natural Interaction for Augmented Reality Applications Mark Billinghurst mark.billinghurst@hitlabnz.org The HIT Lab NZ, University of Canterbury November 28th 2013
  2. 2. 1977 – Star Wars
  3. 3. Augmented Reality Definition   Defining Characteristics   Combines Real and Virtual Images -  Both can be seen at the same time   Interactive in real-time -  The virtual content can be interacted with   Registered in 3D -  Virtual objects appear fixed in space Azuma, R. T. (1997). A survey of augmented reality. Presence, 6(4), 355-385.
  4. 4. Augmented Reality Today
  5. 5. AR Interface Components   Physical Elements (Input) → Interaction Metaphor → Virtual Elements (Output)   Key Question: How should a person interact with the Augmented Reality content?   Connecting physical and virtual with interaction
  6. 6. AR Interaction Metaphors   Information Browsing   View AR content   3D AR Interfaces   3D UI interaction techniques   Augmented Surfaces   Tangible UI techniques   Tangible AR   Tangible UI input + AR output
  7. 7. Tangible User Interfaces   Use physical objects to interact with digital content   Foreground   graspable user interface   Background   ambient interfaces Ishii, H., & Ullmer, B. (1997). Tangible bits: towards seamless interfaces between people, bits and atoms. In Proceedings of the ACM SIGCHI Conference on Human factors in computing systems (pp. 234-241). ACM.
  8. 8. TUI Benefits and Limitations   Pros   Physical objects make us smart   Objects aid collaboration   Objects increase understanding   Cons   Difficult to change object properties   Limited display capabilities – 2D view   Separation between object and display
  9. 9. Tangible AR Metaphor   AR overcomes limitations of TUIs   enhance display possibilities   merge task/display space   provide public and private views   TUI + AR = Tangible AR   Apply TUI methods to AR interface design
  10. 10. VOMAR Demo (Kato 2000)   AR Furniture Arranging   Elements + Interactions   Book: -  Turn over the page   Paddle: -  Push, shake, incline, hit, scoop Kato, H., Billinghurst, M., et al. 2000. Virtual Object Manipulation on a Table-Top AR Environment. In Proceedings of the International Symposium on Augmented Reality (ISAR 2000), Munich, Germany, 111--119.
  11. 11. Lessons Learned   Advantages   Intuitive interaction, ease of use   Full 6 DOF manipulation   Disadvantages   Marker based tracking -  occlusion, limited tracking range, etc   Needs external interface objects -  Paddle, book, etc
  12. 12. 2012 – Iron Man
  13. 13. To Make the Vision Real..   Hardware/software requirements   Contact lens displays   Free space hand/body tracking   Speech/gesture recognition   Etc..   Most importantly   Usability/User Experience
  14. 14. Natural Interaction   Automatically detecting real environment   Environmental awareness, Physically based interaction   Gesture interaction   Free-hand interaction   Multimodal input   Speech and gesture interaction   Intelligent interfaces   Implicit rather than Explicit interaction
  15. 15. Environmental Awareness
  16. 16. AR MicroMachines   AR experience with environment awareness and physically-based interaction   Based on MS Kinect RGB-D sensor   Augmented environment supports   occlusion, shadows   physically-based interaction between real and virtual objects Clark, A., & Piumsomboon, T. (2011). A realistic augmented reality racing game using a depth-sensing camera. In Proceedings of the 10th International Conference on Virtual Reality Continuum and Its Applications in Industry (pp. 499-502). ACM.
  17. 17. Operating Environment
  18. 18. Architecture   Our framework uses five libraries:   OpenNI   OpenCV   OPIRA   Bullet Physics   OpenSceneGraph
  19. 19. System Flow   The system flow consists of three sections:   Image Processing and Marker Tracking   Physics Simulation   Rendering
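A minimal sketch of that three-stage flow as a per-frame routine is given below; all type and function names are hypothetical placeholders for the OpenNI, OPIRA, Bullet and OpenSceneGraph calls the surrounding slides describe.

```cpp
// Sketch of the three-stage per-frame flow: (1) image processing and marker
// tracking, (2) physics simulation, (3) rendering. All names are placeholders.
struct Frame { /* RGB image + depth map from the sensor */ };
struct CameraPose { /* 6-DOF camera pose from marker tracking */ };

Frame      captureFrame();                                    // OpenNI/OpenCV capture
CameraPose trackMarkers(const Frame& f);                      // OPIRA-style registration
void       updateDepthMesh(const Frame& f);                   // rebuild real-world collision mesh
void       stepPhysics(float dt);                             // advance the Bullet simulation
void       renderScene(const CameraPose& p, const Frame& f);  // OSG rendering with occlusion/shadows

void runFrame(float dt) {
    Frame frame = captureFrame();          // 1. image processing and marker tracking
    CameraPose pose = trackMarkers(frame);
    updateDepthMesh(frame);                // 2. physics simulation
    stepPhysics(dt);
    renderScene(pose, frame);              // 3. rendering
}
```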
  20. 20. Physics Simulation   Create a virtual mesh over the real world   Updated at 10 fps – real objects can be moved   Used by the physics engine for collision detection (virtual/real)   Used by OpenSceneGraph for occlusion and shadows
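A hedged sketch of building such a collision mesh with the Bullet API, assuming the depth image has already been converted into a grid of 3D points (the DepthGrid type and buildEnvironmentBody function are illustrative, not the actual framework code):

```cpp
// Turning a depth-derived point grid into a static Bullet collision mesh so
// virtual objects can collide with the real scene. Illustrative sketch only.
#include <btBulletDynamicsCommon.h>
#include <vector>

struct DepthGrid {
    int width, height;
    std::vector<btVector3> points;   // one 3D point per depth pixel
    const btVector3& at(int x, int y) const { return points[y * width + x]; }
};

btRigidBody* buildEnvironmentBody(const DepthGrid& grid) {
    // Triangulate neighbouring depth samples into a mesh.
    btTriangleMesh* mesh = new btTriangleMesh();
    for (int y = 0; y + 1 < grid.height; ++y) {
        for (int x = 0; x + 1 < grid.width; ++x) {
            mesh->addTriangle(grid.at(x, y), grid.at(x + 1, y), grid.at(x, y + 1));
            mesh->addTriangle(grid.at(x + 1, y), grid.at(x + 1, y + 1), grid.at(x, y + 1));
        }
    }
    // Static (mass 0) body: the real world pushes virtual objects, not vice versa.
    btBvhTriangleMeshShape* shape = new btBvhTriangleMeshShape(mesh, true);
    btDefaultMotionState* motion = new btDefaultMotionState(btTransform::getIdentity());
    btRigidBody::btRigidBodyConstructionInfo info(0.0f, motion, shape);
    return new btRigidBody(info);
}
// At ~10 fps the application would remove the old body from the dynamics world,
// rebuild it from the latest depth frame, and add the new one back.
```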
  21. 21. Rendering: Occlusion and Shadows
  22. 22. Gesture Interaction
  23. 23. Natural Hand Interaction   Using bare hands to interact with AR content   MS Kinect depth sensing   Real time hand tracking   Physics based simulation model
  24. 24. Hand Interaction   Represent models as collections of spheres   Bullet physics engine for interaction with real world
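A hedged sketch of that sphere-proxy approach with the Bullet API, assuming hand joint positions arrive from the Kinect-based tracker each frame (illustrative, not the actual system code):

```cpp
// Representing a tracked hand as kinematic sphere proxies in Bullet, so
// bare-hand motion can push dynamic virtual objects around.
#include <btBulletDynamicsCommon.h>
#include <vector>

// One kinematic sphere per tracked hand joint/segment.
std::vector<btRigidBody*> createHandProxies(btDiscreteDynamicsWorld* world,
                                            int numJoints, float radius) {
    std::vector<btRigidBody*> proxies;
    for (int i = 0; i < numJoints; ++i) {
        btSphereShape* shape = new btSphereShape(radius);
        btDefaultMotionState* motion = new btDefaultMotionState(btTransform::getIdentity());
        btRigidBody::btRigidBodyConstructionInfo info(0.0f, motion, shape);
        btRigidBody* body = new btRigidBody(info);
        // Kinematic: driven by the tracker, but still pushes dynamic objects.
        body->setCollisionFlags(body->getCollisionFlags() |
                                btCollisionObject::CF_KINEMATIC_OBJECT);
        body->setActivationState(DISABLE_DEACTIVATION);
        world->addRigidBody(body);
        proxies.push_back(body);
    }
    return proxies;
}

// Each frame, move every proxy to the latest tracked joint position.
void updateHandProxies(const std::vector<btRigidBody*>& proxies,
                       const std::vector<btVector3>& jointPositions) {
    for (size_t i = 0; i < proxies.size() && i < jointPositions.size(); ++i) {
        btTransform t = btTransform::getIdentity();
        t.setOrigin(jointPositions[i]);
        proxies[i]->getMotionState()->setWorldTransform(t);
    }
}
```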
  25. 25. Scene Interaction   Render AR scene with OpenSceneGraph   Using depth map for occlusion   Shadows yet to be implemented
  26. 26. Architecture   Pipeline layers (bottom to top): 1. Hardware Interface; 2. Segmentation; 3. Classification/Tracking; 4. Modeling (hand recognition/modeling, rigid-body modeling); 5. Gesture (static, dynamic, and context-based gestures)
  27. 27. Architecture – 1. Hardware Interface   Supports PCL, OpenNI, OpenCV, and the Kinect SDK   Provides access to depth, RGB, and XYZRGB data   Usage: capturing colour images, depth images and concatenated point clouds from a single camera or multiple cameras   Examples: Kinect for Xbox 360, Kinect for Windows, Asus Xtion Pro Live
  28. 28. Architecture – 2. Segmentation   Segments images and point clouds based on colour, depth and space   Usage: segmenting images or point clouds using colour models, depth, or spatial properties such as location, shape and size   Examples: skin colour segmentation, depth threshold
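The two examples named on this slide could be implemented along these lines with OpenCV; the HSV bounds and the 1 m depth cut-off are illustrative values, not the framework's actual parameters:

```cpp
// Two simple segmentation strategies: skin-colour segmentation in HSV space
// and a depth threshold that keeps only near objects (e.g. a hand).
#include <opencv2/opencv.hpp>

// Keep pixels whose colour falls inside a rough skin-tone range.
cv::Mat segmentSkin(const cv::Mat& bgr) {
    cv::Mat hsv, mask;
    cv::cvtColor(bgr, hsv, cv::COLOR_BGR2HSV);
    cv::inRange(hsv, cv::Scalar(0, 40, 60), cv::Scalar(25, 180, 255), mask);
    return mask;   // 255 where the pixel looks like skin, 0 elsewhere
}

// Keep depth pixels that are valid and closer than maxDepthMm.
cv::Mat segmentByDepth(const cv::Mat& depth16u, int maxDepthMm = 1000) {
    cv::Mat valid = depth16u > 0;
    cv::Mat nearMask = depth16u < maxDepthMm;
    cv::Mat mask;
    cv::bitwise_and(valid, nearMask, mask);
    return mask;
}
```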
  29. 29. Architecture – 3. Classification/Tracking   Identifies and tracks objects between frames based on XYZRGB   Usage: identifying the current position/orientation of the tracked object in space   Example: a training set of hand poses, where colours represent unique regions of the hand, with raw output (without cleaning) classified on real hand input (a depth image)
  30. 30. Architecture – 4. Modeling   Hand Recognition/Modeling: skeleton based (for a low-resolution approximation) or model based (for a more accurate representation)   Object Modeling: identification and tracking of rigid-body objects   Physical Modeling (physical interaction): sphere proxy, model based, or mesh based   Usage: general spatial interaction in AR/VR environments
  31. 31. Architecture – 5. Gesture   Static (hand pose recognition)   Dynamic (meaningful movement recognition)   Context-based (gestures with context, e.g. pointing)   Usage: issuing commands, anticipating user intention, and high-level interaction
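Taken together, the five layers form a pipeline that could be sketched as a set of interfaces like the following (class and type names are illustrative, not the actual HIT Lab NZ framework API):

```cpp
// Sketch of the five-layer gesture pipeline as plain interfaces.
#include <string>
#include <vector>

struct PointCloudFrame { /* depth + RGB + XYZRGB points (layer 1 output) */ };
struct SegmentedRegion { /* pixels/points belonging to one candidate object */ };
struct TrackedHand     { /* labelled hand regions with position/orientation */ };
struct HandModel       { /* skeleton or full hand model plus rigid bodies */ };
struct Gesture { std::string name; /* e.g. "point" or "grab", with parameters */ };

class HardwareInterface {          // 1. Kinect / Xtion capture
public:
    virtual PointCloudFrame capture() = 0;
    virtual ~HardwareInterface() = default;
};
class Segmenter {                  // 2. colour/depth/space segmentation
public:
    virtual std::vector<SegmentedRegion> segment(const PointCloudFrame&) = 0;
    virtual ~Segmenter() = default;
};
class Tracker {                    // 3. classification and frame-to-frame tracking
public:
    virtual TrackedHand track(const std::vector<SegmentedRegion>&) = 0;
    virtual ~Tracker() = default;
};
class Modeler {                    // 4. hand / rigid-body / physical modelling
public:
    virtual HandModel model(const TrackedHand&) = 0;
    virtual ~Modeler() = default;
};
class GestureRecognizer {          // 5. static / dynamic / context-based gestures
public:
    virtual std::vector<Gesture> recognize(const HandModel&) = 0;
    virtual ~GestureRecognizer() = default;
};
```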
  32. 32. Skeleton Based Interaction   3 Gear Systems   Kinect/Primesense Sensor   Two hand tracking   http://www.threegear.com
  33. 33. Skeleton Interaction + AR   HMD AR View   Viewpoint tracking   Two hand input   Skeleton interaction, occlusion
  34. 34. Multimodal Input
  35. 35. Multimodal Interaction   Combined speech and gesture input   Gesture and Speech complementary   Speech -  modal commands, quantities   Gesture -  selection, motion, qualities   Previous work found multimodal interfaces intuitive for 2D/3D graphics interaction
  36. 36. Free Hand Multimodal Input Point Move Pick/Drop   Use free hand to interact with AR content   Recognize simple gestures Lee, M., Billinghurst, M., Baek, W., Green, R., & Woo, W. (2013). A usability study of multimodal input in an augmented reality environment. Virtual Reality, 17(4), 293-305.
  37. 37. Multimodal Architecture
  38. 38. Multimodal Fusion
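As a rough, hypothetical illustration of multimodal fusion, a simple time-window approach that pairs a speech command with the most recent gesture might look like this (not the actual system described in the talk):

```cpp
// Time-window multimodal fusion: a speech command ("make that red") is paired
// with the most recent pointing/selection gesture to resolve its target.
#include <deque>
#include <optional>
#include <string>

struct GestureEvent { std::string type; int objectId; double timestamp; };
struct SpeechEvent  { std::string command; double timestamp; };
struct FusedCommand { std::string command; int objectId; };

class MultimodalFusion {
public:
    void onGesture(const GestureEvent& g) { gestures_.push_back(g); }

    // Pair a speech command with the latest gesture inside the time window.
    std::optional<FusedCommand> onSpeech(const SpeechEvent& s) {
        for (auto it = gestures_.rbegin(); it != gestures_.rend(); ++it) {
            if (s.timestamp - it->timestamp < windowSeconds_) {
                return FusedCommand{s.command, it->objectId};
            }
        }
        return std::nullopt;   // no recent gesture: command is incomplete
    }

private:
    std::deque<GestureEvent> gestures_;
    double windowSeconds_ = 2.0;   // illustrative fusion window
};
```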
  39. 39. Hand Occlusion
  40. 40. Experimental Setup   Change object shape and colour
  41. 41. User Evaluation   Change object shape, colour and position   Conditions   Speech only, gesture only, multimodal   Measure   performance time, error, subjective survey
  42. 42. Results   Average performance time (MMI, speech fastest)   Gesture: 15.44s   Speech: 12.38s   Multimodal: 11.78s   No difference in user errors   User subjective survey   Q1: How natural was it to manipulate the object? -  MMI, speech significantly better   70% preferred MMI, 25% speech only, 5% gesture only
  43. 43. Intelligent Interfaces
  44. 44. Intelligent Interfaces   Most AR systems stupid   Don’t recognize user behaviour   Don’t provide feedback   Don’t adapt to user   Especially important for training   Scaffolded learning   Moving beyond check-lists of actions
  45. 45. Intelligent Interfaces   AR interface + intelligent tutoring system   ASPIRE constraint based system (from UC)   Constraints -  relevance cond., satisfaction cond., feedback Westerfield, G., Mitrovic, A., & Billinghurst, M. (2013). Intelligent Augmented Reality Training for Assembly Tasks. In Artificial Intelligence in Education (pp. 542-551). Springer Berlin Heidelberg.
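A hedged sketch of what a constraint with a relevance condition, a satisfaction condition and feedback might look like for an assembly task (illustrative only, not the actual ASPIRE representation):

```cpp
// A constraint-based tutoring check: each constraint has a relevance condition,
// a satisfaction condition, and a feedback message shown when it is violated.
#include <functional>
#include <string>
#include <vector>

struct AssemblyState { /* which parts are placed, where, and in what order */ };

struct Constraint {
    std::function<bool(const AssemblyState&)> isRelevant;   // when the constraint applies
    std::function<bool(const AssemblyState&)> isSatisfied;  // what the correct state looks like
    std::string feedback;                                   // corrective hint for the learner
};

// Evaluate the user's current assembly state and collect corrective feedback.
std::vector<std::string> evaluate(const std::vector<Constraint>& constraints,
                                  const AssemblyState& state) {
    std::vector<std::string> messages;
    for (const auto& c : constraints) {
        if (c.isRelevant(state) && !c.isSatisfied(state)) {
            messages.push_back(c.feedback);
        }
    }
    return messages;
}
```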
  46. 46. Domain Ontology
  47. 47. Intelligent Feedback   Actively monitors user behaviour   Implicit vs. explicit interaction   Provides corrective feedback
  48. 48. Evaluation Results   16 subjects, with and without ITS   Improved task completion   Improved learning
  49. 49. Intelligent Agents   AR characters   Virtual embodiment of system   Multimodal input/output   Examples   AR Lego, Welbo, etc   Mr Virtuoso -  AR character more real, more fun -  On-screen 3D and AR similar in usefulness Wagner, D., Billinghurst, M., & Schmalstieg, D. (2006). How real should virtual characters be?. In Proceedings of the 2006 ACM SIGCHI international conference on Advances in computer entertainment technology (p. 57). ACM.
  50. 50. Looking to the Future What’s Next?
  51. 51. Directions for Future Research   Mobile Gesture Interaction   Tablet, phone interfaces   Wearable Systems   Google Glass   Novel Displays   Contact lens
  52. 52. Mobile Gesture Interaction   Motivation   Richer interaction with handheld devices   Natural interaction with handheld AR   2D tracking   Finger tip tracking   3D tracking [Hurst and Wezel 2013]   Hand tracking [Henrysson et al. 2007] Henrysson, A., Marshall, J., & Billinghurst, M. (2007). Experiments in 3D interaction for mobile phone AR. In Proceedings of the 5th international conference on Computer graphics and interactive techniques in Australia and Southeast Asia (pp. 187-194). ACM.
  53. 53. Fingertip Based Interaction Running System System Setup Mobile Client + PC Server Bai, H., Gao, L., El-Sana, J., & Billinghurst, M. (2013). Markerless 3D gesture-based interaction for handheld augmented reality interfaces. In SIGGRAPH Asia 2013 Symposium on Mobile Graphics and Interactive Applications (p. 22). ACM.
  54. 54. System Architecture
  55. 55. 3D Prototype System   3 Gear + Vuforia   Hand tracking + phone tracking   Freehand interaction on phone   Skeleton model   3D interaction   20 fps performance
  56. 56. Google Glass
  57. 57. User Experience   Truly Wearable Computing   Less than 46 grams   Hands-free Information Access   Voice interaction, Ego-vision camera   Intuitive User Interface   Touch, Gesture, Speech, Head Motion   Access to all Google Services   Map, Search, Location, Messaging, Email, etc
  58. 58. Contact Lens Display   Babak Parviz   University of Washington   MEMS components   Transparent elements   Micro-sensors   Challenges   Miniaturization   Assembly   Eye-safe
  59. 59. Contact Lens Prototype
  60. 60. Conclusion
  61. 61. Conclusions   AR experiences need new interaction methods   Enabling technologies are advancing quickly   Displays, tracking, depth capture devices   Natural user interfaces possible   Free-hand gesture, speech, intelligent interfaces   Important research for the future   Mobile, wearable, displays
  62. 62. More Information •  Mark Billinghurst –  Email: mark.billinghurst@hitlabnz.org –  Twitter: @marknb00 •  Website –  http://www.hitlabnz.org/
