Natural Interfaces for
 Augmented Reality

      Mark Billinghurst
       HIT Lab NZ
  University of Canterbury
Augmented Reality Definition
 Defining Characteristics [Azuma 97]
   Combines Real and Virtual Images
     - Both can be seen at the same time
   Interactive in real-time
     - The virtual content can be interacted with
   Registered in 3D
     - Virtual objects appear fixed in space
AR Today
 Most widely used AR is mobile or web-based
 Mobile AR
   Outdoor AR (GPS + compass)
     - Layar (10 million+ users), Junaio, etc.
   Indoor AR (image-based tracking)
     - QCAR, String, etc.
 Web-based (Flash)
   FLARToolKit marker tracking
   Markerless tracking
AR Interaction
 You can see spatially registered AR... how can you interact with it?
AR Interaction Today
 Mostly simple interaction
 Mobile
   Outdoor (Junaio, Layar, Wikitude, etc)
     - Viewing information in place, touch virtual tags
   Indoor (Invizimals, Qualcomm demos)
      - Change viewpoint, screen-based (touch screen)
 Web based
   Change viewpoint, screen interaction (mouse)
History of AR Interaction
1. AR Information Viewing
 Information is registered to real-world context
    Hand-held AR displays
 Interaction
    Manipulation of a window into information space
    2D/3D virtual viewpoint control
 Applications
    Context-aware information displays
 Examples
    NaviCam, Chameleon, etc. (NaviCam, Rekimoto et al. 1997)
Current AR Information Browsers
 Mobile AR
   GPS + compass
 Many Applications
     Layar
     Wikitude
     Acrossair
     PressLite
     Yelp
     AR Car Finder
     …
2. 3D AR Interfaces
 Virtual objects displayed in 3D physical space and manipulated
    HMDs and 6DOF head-tracking
    6DOF hand trackers for input
 Interaction
    Viewpoint control
    Traditional 3D UI interaction: manipulation, selection, etc.
 Requires custom input devices
                                      (Kiyokawa, et al. 2000)
VLEGO - AR 3D Interaction
3. Augmented Surfaces and
Tangible Interfaces
 Basic principles
   Virtual objects are projected
    on a surface
   Physical objects are used as
    controls for virtual objects
   Support for collaboration
Augmented Surfaces
 Rekimoto, et al. 1998
   Front projection
   Marker-based tracking
   Multiple projection surfaces
Tangible User Interfaces (Ishii 97)
 Create digital shadows
  for physical objects
 Foreground
   graspable UI
 Background
   ambient interfaces
Tangible Interface: ARgroove
 Collaborative Instrument
 Exploring Physically Based Interaction
    Move and track physical record
     Map physical actions to MIDI output
       - Translation, rotation
       - Tilt, shake
 Limitation
    AR output shown on screen
    Separation between input and output
Lessons from Tangible Interfaces
 Benefits
    Physical objects make us smart (affordances, constraints)
    Objects aid collaboration (shared meaning)
    Objects increase understanding (cognitive artifacts)
 Limitations
    Difficult to change object properties
    Limited display capabilities (project onto surface)
    Separation between object and display
4. Tangible AR
 AR overcomes limitations of TUIs
   enhance display possibilities
   merge task/display space
   provide public and private views


 TUI + AR = Tangible AR
   Apply TUI methods to AR interface design
Example Tangible AR Applications
 Use of natural physical object manipulations to
  control virtual objects
 LevelHead (Oliver)
    Physical cubes become rooms
 VOMAR (Kato 2000)
    Furniture catalog book:
      - Turn over the page to see new models
    Paddle interaction:
      - Push, shake, incline, hit, scoop
VOMAR Interface
Evolution of AR Interaction
1. Information Viewing Interfaces
    simple (conceptually!), unobtrusive
2. 3D AR Interfaces
    expressive, creative, require attention
3. Tangible Interfaces
    embedded into conventional environments
4. Tangible AR
    combines TUI input + AR display
Limitations
 Typical limitations
     Simple/No interaction (viewpoint control)
     Require custom devices
     Single mode interaction
     2D input for 3D (screen based interaction)
     No understanding of real world
     Explicit vs. implicit interaction
     Unintelligent interfaces (no learning)
Natural Interaction
The Vision of AR
To Make the Vision Real...
 Hardware/software requirements
     Contact lens displays
     Free space hand/body tracking
     Environment recognition
     Speech/gesture recognition
     Etc..
Natural Interaction
 Automatically detecting real environment
   Environmental awareness
   Physically based interaction
 Gesture Input
   Free-hand interaction
 Multimodal Input
   Speech and gesture interaction
   Implicit rather than Explicit interaction
Environmental Awareness
AR MicroMachines
 AR experience with environment awareness
  and physically-based interaction
   Based on MS Kinect RGB-D sensor
 Augmented environment supports
   occlusion, shadows
   physically-based interaction between real and
    virtual objects
Operating Environment
Architecture
 Our framework uses five libraries:

     OpenNI
     OpenCV
     OPIRA
     Bullet Physics
     OpenSceneGraph
System Flow
 The system flow consists of three sections:
    Image Processing and Marker Tracking
    Physics Simulation
    Rendering
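
The loop below is a minimal C++ sketch of how those three stages might be sequenced each frame; the stage functions are hypothetical stand-ins for the actual OpenNI/OPIRA, Bullet, and OpenSceneGraph calls, not the framework's API.

```cpp
// Minimal per-frame skeleton of the three-stage flow. The stage functions
// are hypothetical stand-ins, not the actual framework API.
#include <chrono>

void captureAndTrack() { /* grab RGB-D frame, run OPIRA marker tracking */ }
void stepPhysics(float dt) { /* e.g. btDynamicsWorld::stepSimulation(dt) */ }
void render() { /* e.g. osgViewer::Viewer::frame() */ }

int main() {
    auto last = std::chrono::steady_clock::now();
    for (;;) {
        auto now = std::chrono::steady_clock::now();
        float dt = std::chrono::duration<float>(now - last).count();
        last = now;
        captureAndTrack();  // 1. image processing and marker tracking
        stepPhysics(dt);    // 2. physics simulation
        render();           // 3. rendering
    }
}
```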
Physics Simulation




 Create a virtual mesh over the real world
 Update at 10 fps – real objects can be moved
 Used by the physics engine for collision detection (virtual/real)
 Used by OpenSceneGraph for occlusion and shadows (see the sketch below)
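
As a rough illustration, a depth-image grid can be handed to Bullet as a static triangle mesh along these lines; the depthToWorld() helper and grid layout are assumptions, not the framework's actual code.

```cpp
// Sketch: build a Bullet collision mesh from a depth-image grid so virtual
// objects can collide with the real scene. depthToWorld() is a hypothetical
// helper mapping a pixel and its depth value to world coordinates.
#include <btBulletDynamicsCommon.h>

btRigidBody* buildTerrain(const float* depth, int w, int h,
                          btVector3 (*depthToWorld)(int, int, float)) {
    btTriangleMesh* mesh = new btTriangleMesh();
    for (int y = 0; y + 1 < h; ++y) {
        for (int x = 0; x + 1 < w; ++x) {
            btVector3 p00 = depthToWorld(x, y, depth[y * w + x]);
            btVector3 p10 = depthToWorld(x + 1, y, depth[y * w + x + 1]);
            btVector3 p01 = depthToWorld(x, y + 1, depth[(y + 1) * w + x]);
            btVector3 p11 = depthToWorld(x + 1, y + 1, depth[(y + 1) * w + x + 1]);
            mesh->addTriangle(p00, p10, p11);  // two triangles per grid cell
            mesh->addTriangle(p00, p11, p01);
        }
    }
    btCollisionShape* shape = new btBvhTriangleMeshShape(mesh, true);
    // Mass 0 makes the terrain static; it is rebuilt (~10 fps) as objects move.
    btRigidBody::btRigidBodyConstructionInfo info(0.0f, nullptr, shape);
    return new btRigidBody(info);
}
```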
Rendering
 Example results: occlusion (left), shadows (right)
Natural Gesture Interaction

 HIT Lab NZ AR Gesture Library
Motivation
 AR MicroMachines and PhobiAR
   • Treated the environment as static – no tracking
   • Tracked objects in 2D
 More realistic interaction requires 3D gesture tracking
Motivation
 Occlusion Issues
   AR MicroMachines only achieved realistic occlusion because the user’s viewpoint matched the Kinect’s
   Proper occlusion requires a more complete model of scene objects
HITLabNZ’s Gesture Library




Architecture
   o Supports PCL, OpenNI, OpenCV, and the Kinect SDK
   o Provides access to depth, RGB, and XYZRGB data
   o Usage: capturing color images, depth images, and concatenated point clouds from a single camera or multiple cameras
   o Supported devices include: Kinect for Xbox 360, Kinect for Windows, Asus Xtion Pro Live
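
For illustration, one way to grab synchronized depth and color from a Kinect-class sensor is OpenCV's OpenNI backend; this is a generic sketch, not the library's own capture code.

```cpp
// Sketch: capture synchronized depth and color from a Kinect-class sensor
// through OpenCV's OpenNI backend (one of several back ends the library
// supports; PCL or the Kinect SDK could be used instead).
#include <opencv2/opencv.hpp>

int main() {
    cv::VideoCapture capture(CV_CAP_OPENNI);  // Kinect / Xtion via OpenNI
    if (!capture.isOpened()) return 1;
    cv::Mat depth, bgr;
    for (;;) {
        if (!capture.grab()) break;
        capture.retrieve(depth, CV_CAP_OPENNI_DEPTH_MAP);  // CV_16UC1, in mm
        capture.retrieve(bgr, CV_CAP_OPENNI_BGR_IMAGE);    // CV_8UC3
        cv::imshow("depth", depth);
        cv::imshow("color", bgr);
        if (cv::waitKey(30) == 27) break;  // Esc to quit
    }
}
```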
HITLabNZ’s Gesture Library




Architecture
  o Segments images and point clouds based on color, depth, and space
  o Usage: segmenting images or point clouds using color models, depth, or spatial properties such as location, shape, and size
  o For example: skin color segmentation, depth threshold
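
The two example segmentations could look roughly like this in OpenCV; the HSV skin range is illustrative and would need tuning per camera and lighting.

```cpp
// Sketch: the two example segmentations from the slide, in OpenCV.
#include <opencv2/opencv.hpp>

// Skin-color segmentation: threshold in HSV space (range values are
// illustrative only).
cv::Mat segmentSkin(const cv::Mat& bgr) {
    cv::Mat hsv, mask;
    cv::cvtColor(bgr, hsv, CV_BGR2HSV);
    cv::inRange(hsv, cv::Scalar(0, 40, 60), cv::Scalar(20, 150, 255), mask);
    return mask;
}

// Depth threshold: keep only valid pixels closer than maxDepthMm.
cv::Mat segmentByDepth(const cv::Mat& depth16u, int maxDepthMm) {
    cv::Mat mask = (depth16u > 0) & (depth16u < maxDepthMm);
    return mask;
}
```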
HITLabNZ’s Gesture Library




Architecture
  o Identifies and tracks objects between frames based on XYZRGB
  o Usage: identifying the current position/orientation of the tracked object in space
  o For example: a training set of hand poses, where colors represent unique regions of the hand; raw output (without cleaning) classified on real hand input (depth image)
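
As a minimal illustration of frame-to-frame tracking, the sketch below just reports the 3D centroid of an already-segmented hand cloud with PCL; the real module does per-region classification as described above.

```cpp
// Sketch: report the 3D position of a segmented hand per frame by taking
// the centroid of its XYZRGB point cloud. A stand-in for the library's
// per-region hand classification, not its actual tracker.
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <pcl/common/centroid.h>

Eigen::Vector4f trackCentroid(const pcl::PointCloud<pcl::PointXYZRGB>& hand) {
    Eigen::Vector4f centroid;
    pcl::compute3DCentroid(hand, centroid);  // mean of all points (x, y, z, 1)
    return centroid;
}
```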
HITLabNZ’s Gesture Library




Architecture
   o Hand Recognition/Modeling
        Skeleton based (for low-resolution approximation)
        Model based (for more accurate representation)
   o Object Modeling (identification and tracking of rigid-body objects)
   o Physical Modeling (physical interaction)
        Sphere Proxy
        Model based
        Mesh based
   o Usage: general spatial interaction in AR/VR environments
Method
Represent models as collections of spheres moving with
   the models in the Bullet physics engine
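
A sketch of that idea with real Bullet types follows; the sphere centres and radius would come from the modeling stage, and the helper name is ours.

```cpp
// Sketch of the sphere-proxy idea: approximate a tracked real object as a
// compound of spheres so Bullet can resolve collisions with virtual objects.
#include <btBulletDynamicsCommon.h>
#include <vector>

btCollisionShape* makeSphereProxy(const std::vector<btVector3>& centers,
                                  btScalar radius) {
    btCompoundShape* proxy = new btCompoundShape();
    for (const btVector3& c : centers) {
        btTransform t;
        t.setIdentity();
        t.setOrigin(c);  // sphere centre from the modeling stage
        proxy->addChildShape(t, new btSphereShape(radius));
    }
    return proxy;  // attach to a kinematic body driven by the tracker
}
```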
Method
Render the AR scene with OpenSceneGraph, using the depth map for occlusion
 Shadows yet to be implemented
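
One standard OpenSceneGraph way to get this effect is an "invisible occluder" that writes depth but no colour; the sketch below shows that idea, though per the speaker notes the actual implementation used alpha-zero quads and a custom fragment shader.

```cpp
// Sketch: make the reconstructed real-world mesh an invisible occluder in
// OpenSceneGraph. It writes depth but no colour, so virtual objects behind
// real ones are hidden while the video background stays visible.
#include <osg/ColorMask>
#include <osg/Depth>
#include <osg/Geode>

void makeInvisibleOccluder(osg::Geode* terrain) {
    osg::StateSet* ss = terrain->getOrCreateStateSet();
    ss->setAttribute(new osg::ColorMask(false, false, false, false));  // no colour
    ss->setAttributeAndModes(new osg::Depth(osg::Depth::LESS, 0.0, 1.0, true));
    ss->setRenderBinDetails(-1, "RenderBin");  // draw before virtual content
}
```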
Results
HITLabNZ’s Gesture Library




Architecture
  o Static (hand pose recognition)
  o Dynamic (meaningful movement recognition)
  o Context-based gesture recognition (gestures with context, e.g. pointing)
  o Usage: issuing commands, anticipating user intention, and high-level interaction
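
To make the three levels concrete, a client might consume them through an event callback of roughly this shape; the types and names are hypothetical, not the library's published API.

```cpp
// Sketch: how the three recognition levels might surface to an application.
// Types and names are hypothetical, not the library's actual API.
#include <functional>
#include <string>

enum class GestureKind { StaticPose, DynamicMotion, Contextual };

struct GestureEvent {
    GestureKind kind;
    std::string label;   // e.g. "open_hand", "swipe_left", "point"
    float position[3];   // 3D location, used e.g. to resolve pointing
};

using GestureCallback = std::function<void(const GestureEvent&)>;

void onGesture(const GestureEvent& e) {
    if (e.kind == GestureKind::Contextual && e.label == "point") {
        // resolve what the user is pointing at from e.position + scene model
    }
}
```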
Multimodal Interaction
Multimodal Interaction
 Combined speech input
 Gesture and speech are complementary
   Speech
     - modal commands, quantities
   Gesture
     - selection, motion, qualities
 Previous work found multimodal interfaces
  intuitive for 2D/3D graphics interaction
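
A common fusion rule, sketched below with made-up types, is to pair a speech command with the gesture closest to it in time; this is illustrative only, not the systems' actual fusion logic.

```cpp
// Illustrative time-window fusion: speech supplies the command, the most
// recent gesture supplies the target. Types and threshold are assumptions.
#include <cmath>
#include <string>

struct SpeechEvent  { std::string command; double t; };  // e.g. "make a blue chair"
struct GestureEvent { double pos[3]; double t; };        // paddle/hand location

// Fuse a command with a gesture only if they occur close together in time.
bool fuse(const SpeechEvent& s, const GestureEvent& g, double windowSec = 2.0) {
    return std::fabs(s.t - g.t) <= windowSec;
}
```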
1. Marker-Based Multimodal Interface




  Add speech recognition to VOMAR
  Paddle + speech commands
Commands Recognized
 Create Command "Make a blue chair": to create a virtual
  object and place it on the paddle.
 Duplicate Command "Copy this": to duplicate a virtual object
  and place it on the paddle.
 Grab Command "Grab table": to select a virtual object and
  place it on the paddle.
 Place Command "Place here": to place the attached object in
  the workspace.
 Move Command "Move the couch": to attach a virtual object
  in the workspace to the paddle so that it follows the paddle
  movement.
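
For illustration, the mapping from recognized phrases to paddle commands could be as simple as keyword matching; the phrase set follows the slide, but this parser is a stand-in for a proper speech grammar.

```cpp
// Sketch: map recognized speech phrases to paddle commands. A simplified
// stand-in for a real speech-recognition grammar.
#include <map>
#include <string>

enum class Command { Create, Duplicate, Grab, Place, Move, Unknown };

Command parseCommand(const std::string& utterance) {
    static const std::map<std::string, Command> keywords = {
        {"make", Command::Create}, {"copy", Command::Duplicate},
        {"grab", Command::Grab},   {"place", Command::Place},
        {"move", Command::Move},
    };
    for (const auto& kv : keywords)
        if (utterance.find(kv.first) != std::string::npos) return kv.second;
    return Command::Unknown;
}
```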
System Architecture
Object Relationships




"Put chair behind the table”
Where is behind?
                               View specific regions
User Evaluation
 Performance time
    Speech + static paddle significantly faster




 Gesture-only condition less accurate for position/orientation
 Users preferred speech + paddle input
Subjective Surveys
2. Free-Hand Multimodal Input
 Use free hand to interact with AR content
 Recognize simple gestures
 No marker tracking




 Gestures: Point, Move, Pick/Drop
Multimodal Architecture
Multimodal Fusion
Hand Occlusion
User Evaluation



 Change object shape, colour and position
 Conditions
   Speech only, gesture only, multimodal
 Measure
   performance time, error, subjective survey
Experimental Setup




 Task: change object shape and colour
Results
 Average performance time (MMI, speech fastest)
   Gesture: 15.44s
   Speech: 12.38s
   Multimodal: 11.78s
 No difference in user errors
 User subjective survey
   Q1: How natural was it to manipulate the object?
     - MMI, speech significantly better
   70% preferred MMI, 25% speech only, 5% gesture only
Future Directions
Future Research
   Mobile real world capture
   Mobile gesture input
   Intelligent interfaces
   Virtual characters
Natural Gesture Interaction on Mobile




 Use mobile camera for hand tracking
   Fingertip detection
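
A typical fingertip-detection pipeline (not necessarily the one used in the mobile prototype) segments the hand, takes the largest contour, and treats its convex-hull points as fingertip candidates:

```cpp
// Sketch: fingertip candidates from a binary hand mask via the largest
// contour's convex hull. Illustrative, not the prototype's actual code.
#include <opencv2/opencv.hpp>
#include <vector>

std::vector<cv::Point> findFingertips(const cv::Mat& handMask) {
    std::vector<std::vector<cv::Point> > contours;
    cv::Mat work = handMask.clone();  // findContours modifies its input
    cv::findContours(work, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
    std::vector<cv::Point> tips;
    if (contours.empty()) return tips;
    // Largest contour = hand; its convex-hull peaks approximate fingertips.
    size_t best = 0;
    for (size_t i = 1; i < contours.size(); ++i)
        if (cv::contourArea(contours[i]) > cv::contourArea(contours[best]))
            best = i;
    cv::convexHull(contours[best], tips);
    return tips;
}
```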
Evaluation




 Gesture input more than twice as slow as touch
 No difference in naturalness
Intelligent Interfaces
 Most AR systems are stupid
   Don’t recognize user behaviour
   Don’t provide feedback
   Don’t adapt to the user
 Especially important for training
   Scaffolded learning
   Moving beyond checklists of actions
Intelligent Interfaces




 AR interface + intelligent tutoring system
   ASPIRE constraint-based system (from UC)
   Constraints
     - relevance condition, satisfaction condition, feedback
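
The constraint triple can be pictured as a small data structure; this follows ASPIRE's relevance/satisfaction/feedback scheme, with a hypothetical State type standing in for the student's current solution.

```cpp
// Sketch of a constraint triple in a constraint-based tutor, following
// ASPIRE's relevance/satisfaction/feedback structure. State is hypothetical.
#include <functional>
#include <string>

struct State { /* current assembly step, part placements, etc. */ };

struct Constraint {
    std::function<bool(const State&)> relevance;     // does this constraint apply?
    std::function<bool(const State&)> satisfaction;  // if so, is it satisfied?
    std::string feedback;  // corrective message shown on violation
};

// A constraint is violated when it is relevant but not satisfied.
bool violated(const Constraint& c, const State& s) {
    return c.relevance(s) && !c.satisfaction(s);
}
```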
Domain Ontology
Intelligent Feedback




 Actively monitors user behaviour
    Implicit vs. explicit interaction
 Provides corrective feedback
Evaluation Results
 16 subjects, with and without ITS
 Improved task completion




 Improved learning
Intelligent Agents
 AR characters
   Virtual embodiment of system
   Multimodal input/output
 Examples
   AR Lego, Welbo, etc.
   Mr Virtuoso
     - AR character more real, more fun
     - On-screen 3D and AR similar in usefulness
Conclusions
Conclusions
 AR traditionally involves tangible interaction
 New technologies support natural interaction
   Environment capture
   Natural gestures
   Multimodal interaction
 Opportunities for future research
   Mobile, intelligent systems, characters
More Information
• Mark Billinghurst
  – mark.billinghurst@hitlabnz.org
• Website
  – http://www.hitlabnz.org/

Editor's Notes

  1. - To create an interaction volume, the Kinect is positioned above the desired interaction space facing downwards. - A reference marker is placed in the interaction space to calculate the transform between the Kinect coordinate system and the coordinate system used by the AR viewing camera. - Users can also wear color markers on their fingers for pre-defined gesture interaction.
  2. - The OpenSceneGraph framework is used for rendering. The input video image is rendered as the background, with all the virtual objects rendered on top. - At the top level of the scene graph, the viewing transformation is applied such that all virtual objects are transformed so as to appear attached to the real world. - The trimesh is rendered as an array of quads, with an alpha value of zero. This allows realistic occlusion effects of the terrain and virtual objects, while not affecting the users’ view of the real environment. - A custom fragment shader was written to allow rendering of shadows to the invisible terrain.
  3. Appearance-based interaction has been used at the Lab before, both in AR MicroMachines and PhobiAR. Flaws in these applications have motivated my work on advanced tracking and modeling. AR MicroMachines did not allow for dynamic interaction – a car could be picked up, but because the motion of the hand was not known, friction could not be simulated between the car and the hand. PhobiAR introduced tracking for dynamic interaction, but it really only tracked objects in 2D. As soon as the hand is flipped, the tracking fails and the illusion of realistic interaction is broken. 3D tracking was required to make the interaction in both of these applications more realistic.
  4. Another issue with typical AR applications is the handling of occlusion. The Kinect allows a model of the environment to be developed, which can help in determining whether a real object is in front of a virtual one. Micromachines had good success by assuming a situation such as that shown on the right, with all objects in the scene in contact with the ground. This was a fair assumption when most of the objects were books etc. However, in PhobiAR the user’s hands were often above the ground, more like the scene on the left. The thing to notice is that these two scenes are indistinguishable from the Kinect’s point of view, but completely different from the observer’s point of view. The main problem is that we don’t know enough about the shape real-world objects to handle occlusion properly. My work aims to model real-world objects by combining views of the objects across multiple frames, allowing better occlusion.
  5. The gesture library will provide a C++ API for real-time recognition and tracking of hands and rigid-body objects in 3D environments. The library will support usage of single and multiple depth sensing cameras. Collision detection and physics simulation will be integrated for realistic physical interaction. Finally, learning algorithms will be implemented for recognizing hand gestures.
  6. The library will support usage of single and multiple depth sensing cameras. Aim for general consumer hardware.
  7. Interaction between real objects and the virtual balls was achieved by representing objects as collections of spheres. The location of the spheres was determined by the modeling stage while their motion was found during tracking. I used the Bullet physics engine for physics simulation.
  8. The AR scene was rendered using OpenSceneGraph. Because the Kinect’s viewpoint was also the user’s viewpoint, realistic occlusion was possible using the Kinect’s depth data. I did not have time to experiment with using the object models to improve occlusion from other viewpoints. Also, the addition of shadows could have significantly improved the realism of the application.