Hands and Speech in Space: Multimodal Input for Augmented Reality

A keynote talk given by Mark Billinghurst at the ICMI 2013 conference, December 12th 2013. The talk is about how to use speech and gesture interaction with Augmented Reality interfaces.

Transcript of "Hands and Speech in Space: Multimodal Input for Augmented Reality"

  1. 1. Hands and Speech in Space: Multimodal Interaction for AR Mark Billinghurst mark.billinghurst@hitlabnz.org The HIT Lab NZ, University of Canterbury December 12th 2013
  2. 2. 1977 – Star Wars
  3. 3. Augmented Reality Definition   Defining Characteristics   Combines Real and Virtual Images -  Both can be seen at the same time   Interactive in real-time -  The virtual content can be interacted with   Registered in 3D -  Virtual objects appear fixed in space Azuma, R. T. (1997). A survey of augmented reality. Presence, 6(4), 355-385.
  4. 4. Augmented Reality Today
  5. 5. AR Interface Components Physical Elements Input Interaction Metaphor Virtual Elements Output   Key Question: How should a person interact with the Augmented Reality content?   Connecting physical and virtual with interaction
  6. 6. AR Interaction Metaphors   Information Browsing   View AR content   3D AR Interfaces   3D UI interaction techniques   Augmented Surfaces   Tangible UI techniques   Tangible AR   Tangible UI input + AR output
  7. 7. VOMAR Demo (Kato 2000)   AR Furniture Arranging   Elements + Interactions   Book: -  Turn over the page   Paddle: -  Push, shake, incline, hit, scoop Kato, H., Billinghurst, M., et al. 2000. Virtual Object Manipulation on a Table-Top AR Environment. In Proceedings of the International Symposium on Augmented Reality (ISAR 2000), Munich, Germany, 111--119.
  8. 8. Opportunities for Multimodal Input   Multimodal interfaces are a natural fit for AR   Need for non-GUI interfaces   Natural interaction with real world   Natural support for body input   Previous work shown value of multimodal input and 3D graphics
  9. 9. Related Work   Related work in 3D graphics/VR   Interaction with 3D content [Chu 1997]   Navigating through virtual worlds [Krum 2002]   Interacting with virtual characters [Billinghurst 1998]   Little earlier work in AR   Require additional input devices   Few formal usability studies   E.g. Olwal et al. [2003] SenseShapes
  10. 10. Examples SenseShapes [2003] Kolsch [2006]
  11. 11. Marker Based Multimodal Interface   Add speech recognition to VOMAR   Paddle + speech commands Irawati, S., Green, S., Billinghurst, M., Duenser, A., & Ko, H. (2006, October). Move the couch where? Developing an augmented reality multimodal interface. In Mixed and Augmented Reality, 2006. ISMAR 2006. IEEE/ACM International Symposium on (pp. 183-186). IEEE.
  12. 12. Commands Recognized   Create Command "Make a blue chair": to create a virtual object and place it on the paddle.   Duplicate Command "Copy this": to duplicate a virtual object and place it on the paddle.   Grab Command "Grab table": to select a virtual object and place it on the paddle.   Place Command "Place here": to place the attached object in the workspace.   Move Command "Move the couch": to attach a virtual object in the workspace to the paddle so that it follows the paddle movement.
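
A minimal sketch of how such a spoken command set might be parsed into actions, assuming a simple keyword-spotting approach; the action names, colour and object vocabularies below are illustrative assumptions, not the published system's grammar:

    // Hypothetical sketch of keyword-based parsing for VOMAR-style speech
    // commands ("make a blue chair", "grab table", ...). Vocabulary and
    // structure are illustrative assumptions, not the published grammar.
    #include <iostream>
    #include <optional>
    #include <sstream>
    #include <string>
    #include <vector>

    struct Command {
        std::string action;   // create, duplicate, grab, place, move
        std::string colour;   // optional attribute
        std::string object;   // chair, table, couch, ...
    };

    static bool contains(const std::vector<std::string>& words, const std::string& w) {
        for (const auto& x : words) if (x == w) return true;
        return false;
    }

    std::optional<Command> parse(const std::string& phrase) {
        std::vector<std::string> words;
        std::istringstream iss(phrase);
        for (std::string w; iss >> w; ) words.push_back(w);

        Command cmd;
        if (contains(words, "make"))       cmd.action = "create";
        else if (contains(words, "copy"))  cmd.action = "duplicate";
        else if (contains(words, "grab"))  cmd.action = "grab";
        else if (contains(words, "place")) cmd.action = "place";
        else if (contains(words, "move"))  cmd.action = "move";
        else return std::nullopt;          // no known command keyword

        for (const std::string c : {"red", "green", "blue"})
            if (contains(words, c)) cmd.colour = c;
        for (const std::string o : {"chair", "table", "couch"})
            if (contains(words, o)) cmd.object = o;
        return cmd;
    }

    int main() {
        if (auto c = parse("make a blue chair"))
            std::cout << c->action << " " << c->colour << " " << c->object << "\n";
    }
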
  13. 13. System Architecture
  14. 14. Object Relationships "Put chair behind the table” Where is behind? View specific regions
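
Resolving "behind" is essentially a view-dependent geometric test. A minimal sketch of one way to do it, assuming the camera and table positions are known in a common world frame; the lateral threshold and the test itself are illustrative, not the system's actual region definitions:

    // Minimal sketch: resolving a view-dependent "behind the table" region.
    // Assumes camera and table positions in a shared world frame; thresholds
    // are illustrative, not the values used in the VOMAR system.
    #include <cmath>
    #include <iostream>

    struct Vec3 { double x, y, z; };

    static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
    static double dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
    static double len(Vec3 a) { return std::sqrt(dot(a, a)); }

    // "Behind the table" = further from the viewer than the table, along the
    // viewing direction, and not too far off to the side.
    bool isBehind(Vec3 camera, Vec3 table, Vec3 candidate,
                  double lateralLimit = 0.5 /* metres, assumed */) {
        Vec3 view = sub(table, camera);
        double viewLen = len(view);
        Vec3 dir = {view.x / viewLen, view.y / viewLen, view.z / viewLen};

        Vec3 rel = sub(candidate, table);
        double along = dot(rel, dir);   // positive means beyond the table
        Vec3 lateral = {rel.x - along*dir.x, rel.y - along*dir.y, rel.z - along*dir.z};
        return along > 0.0 && len(lateral) < lateralLimit;
    }

    int main() {
        Vec3 cam{0, 0, 0}, table{0, 0, 2}, spot{0.1, 0, 2.6};
        std::cout << (isBehind(cam, table, spot) ? "behind" : "not behind") << "\n";
    }
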
  15. 15. User Evaluation   Performance time   Speech + static paddle significantly faster   Gesture-only condition less accurate for position/orientation   Users preferred speech + paddle input
  16. 16. Subjective Surveys
  17. 17. 2010 – Iron Man 2
  18. 18. To Make the Vision Real...   Hardware/software requirements   Contact lens displays   Free space hand/body tracking   Speech/gesture recognition   Etc.   Most importantly   Usability/User Experience
  19. 19. Natural Interaction   Automatically detecting real environment   Environmental awareness, Physically based interaction   Gesture interaction   Free-hand interaction   Multimodal input   Speech and gesture interaction   Intelligent interfaces   Implicit rather than Explicit interaction
  20. 20. Environmental Awareness
  21. 21. AR MicroMachines   AR experience with environment awareness and physically-based interaction   Based on MS Kinect RGB-D sensor   Augmented environment supports   occlusion, shadows   physically-based interaction between real and virtual objects Clark, A., & Piumsomboon, T. (2011). A realistic augmented reality racing game using a depth-sensing camera. In Proceedings of the 10th International Conference on Virtual Reality Continuum and Its Applications in Industry (pp. 499-502). ACM.
  22. 22. Operating Environment
  23. 23. Architecture   Our framework uses five libraries:   OpenNI   OpenCV   OPIRA   Bullet Physics   OpenSceneGraph
  24. 24. System Flow   The system flow consists of three sections:   Image Processing and Marker Tracking   Physics Simulation   Rendering
  25. 25. Physics Simulation   Create virtual mesh over real world   Update at 10 fps – can move real objects   Used by the physics engine for collision detection (virtual/real)   Used by OpenSceneGraph for occlusion and shadows
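
A hedged sketch of how such a depth-derived mesh can be handed to Bullet as a static collision object, assuming a triangle list has already been reconstructed from the depth image elsewhere (standard Bullet calls; the surrounding framework is not shown):

    // Sketch: wrapping a depth-camera mesh as a static Bullet collision object
    // so virtual objects can collide with the real scene. Assumes 'vertices'
    // and 'indices' were reconstructed from the depth image elsewhere.
    #include <btBulletDynamicsCommon.h>
    #include <vector>

    btRigidBody* makeEnvironmentBody(const std::vector<btVector3>& vertices,
                                     const std::vector<int>& indices) {
        // Copy triangles into a Bullet triangle mesh (3 indices per triangle).
        btTriangleMesh* mesh = new btTriangleMesh();
        for (size_t i = 0; i + 2 < indices.size(); i += 3) {
            mesh->addTriangle(vertices[indices[i]],
                              vertices[indices[i + 1]],
                              vertices[indices[i + 2]]);
        }
        // Static (mass = 0) BVH shape: virtual objects collide with it, but it
        // never moves under simulation.
        btBvhTriangleMeshShape* shape = new btBvhTriangleMeshShape(mesh, true);
        btDefaultMotionState* motion =
            new btDefaultMotionState(btTransform::getIdentity());
        btRigidBody::btRigidBodyConstructionInfo info(0.0f, motion, shape);
        return new btRigidBody(info);
    }

Since the slide notes the mesh is refreshed at roughly 10 fps, the body would be removed from the dynamics world and rebuilt (or its shape swapped) on each update so moved real objects stay in sync.
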
  26. 26. Rendering Occlusion Shadows
  27. 27. Gesture Interaction
  28. 28. Natural Hand Interaction   Using bare hands to interact with AR content   MS Kinect depth sensing   Real time hand tracking   Physics based simulation model
  29. 29. Hand Interaction   Represent models as collections of spheres   Bullet physics engine for interaction with real world
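
A sketch of the sphere-proxy idea in Bullet: kinematic sphere bodies placed at tracked hand points so the physics engine can resolve contacts with virtual objects. The joint count and radii here are assumptions for illustration:

    // Sketch: a hand proxy made of spheres, moved kinematically from tracked
    // joint positions so Bullet can push virtual objects around. Radii and
    // joint count are illustrative assumptions.
    #include <btBulletDynamicsCommon.h>
    #include <vector>

    struct HandProxy {
        std::vector<btRigidBody*> spheres;
    };

    HandProxy createHandProxy(btDiscreteDynamicsWorld* world, int numJoints = 16) {
        HandProxy hand;
        for (int i = 0; i < numJoints; ++i) {
            btSphereShape* shape = new btSphereShape(0.01f);   // 1 cm sphere
            btDefaultMotionState* motion =
                new btDefaultMotionState(btTransform::getIdentity());
            btRigidBody::btRigidBodyConstructionInfo info(0.0f, motion, shape);
            btRigidBody* body = new btRigidBody(info);
            // Kinematic: driven by tracking data, not by the simulation.
            body->setCollisionFlags(body->getCollisionFlags() |
                                    btCollisionObject::CF_KINEMATIC_OBJECT);
            body->setActivationState(DISABLE_DEACTIVATION);
            world->addRigidBody(body);
            hand.spheres.push_back(body);
        }
        return hand;
    }

    // Called once per frame with the latest tracked joint positions.
    void updateHandProxy(HandProxy& hand, const std::vector<btVector3>& joints) {
        for (size_t i = 0; i < hand.spheres.size() && i < joints.size(); ++i) {
            btTransform t = btTransform::getIdentity();
            t.setOrigin(joints[i]);
            hand.spheres[i]->getMotionState()->setWorldTransform(t);
        }
    }
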
  30. 30. Scene Interaction   Render AR scene with OpenSceneGraph   Using depth map for occlusion   Shadows yet to be implemented
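
One common OpenSceneGraph pattern for this kind of occlusion is to draw the reconstructed real-world geometry into the depth buffer only, before the virtual content, so real surfaces hide virtual objects behind them. A sketch under that assumption (the node and function names are illustrative):

    // Sketch: using the real-world mesh as a depth-only "occluder" in
    // OpenSceneGraph, so virtual objects behind real surfaces are hidden.
    // 'environmentMesh' is assumed to come from the depth reconstruction step.
    #include <osg/ColorMask>
    #include <osg/Depth>
    #include <osg/Group>
    #include <osg/Node>

    osg::ref_ptr<osg::Group> makeOccluder(osg::ref_ptr<osg::Node> environmentMesh) {
        osg::ref_ptr<osg::Group> occluder = new osg::Group;
        occluder->addChild(environmentMesh.get());

        osg::StateSet* ss = occluder->getOrCreateStateSet();
        // Write depth but no colour: the camera image stays visible, yet the
        // depth buffer now contains the real geometry.
        ss->setAttributeAndModes(new osg::ColorMask(false, false, false, false));
        ss->setAttributeAndModes(new osg::Depth(osg::Depth::LESS, 0.0, 1.0, true));
        // Render before the virtual content so its depth values are in place.
        ss->setRenderBinDetails(-1, "RenderBin");
        return occluder;
    }
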
  31. 31. Architecture 5. Gesture •  Static Gestures •  Dynamic Gestures •  Context based Gestures 4. Modeling •  Hand recognition/modeling •  Rigid-body modeling 3. Classification/Tracking 2. Segmentation 1. Hardware Interface
  32. 32. Architecture 5. Gesture •  Static Gestures •  Dynamic Gestures •  Context based Gestures o  Supports PCL, OpenNI, OpenCV, and Kinect SDK. o  Provides access to depth, RGB, XYZRGB. o  Usage: Capturing color image, depth image and concatenated point clouds from a single or multiple cameras o  For example: 4. Modeling •  Hand recognition/ modeling •  Rigid-body modeling 3. Classification/Tracking 2. Segmentation 1. Hardware Interface Kinect for Xbox 360 Kinect for Windows Asus Xtion Pro Live
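
For the hardware interface layer, a minimal capture loop using OpenCV's OpenNI backend can stand in for the depth/RGB/point-cloud access described here; this is an illustrative substitute, not the framework's own interface:

    // Sketch: grabbing colour, depth and a point cloud from a depth camera via
    // OpenCV's OpenNI backend (Kinect / Xtion). Stands in for the framework's
    // hardware interface layer; an assumption, not its actual API.
    #include <opencv2/opencv.hpp>

    int main() {
        cv::VideoCapture capture(cv::CAP_OPENNI2);
        if (!capture.isOpened()) return 1;

        cv::Mat bgr, depth, cloud;
        while (capture.grab()) {
            capture.retrieve(bgr,   cv::CAP_OPENNI_BGR_IMAGE);        // 8UC3 colour
            capture.retrieve(depth, cv::CAP_OPENNI_DEPTH_MAP);        // 16UC1, mm
            capture.retrieve(cloud, cv::CAP_OPENNI_POINT_CLOUD_MAP);  // 32FC3, XYZ in m
            cv::imshow("colour", bgr);
            if (cv::waitKey(1) == 27) break;   // Esc to quit
        }
        return 0;
    }
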
  33. 33. Architecture 5. Gesture •  Static Gestures •  Dynamic Gestures •  Context based Gestures o  Segment images and point clouds based on color, depth and space. o  Usage: Segmenting images or point clouds using color models, depth, or spatial properties such as location, shape and size. o  For example: 4. Modeling •  Hand recognition/ modeling •  Rigid-body modeling Skin color segmentation 3. Classification/Tracking 2. Segmentation 1. Hardware Interface Depth threshold
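
The two segmentation strategies named on this slide can be sketched with OpenCV as below; the chroma and depth thresholds are illustrative assumptions, not the framework's tuned values:

    // Sketch: skin-colour and depth-threshold segmentation with OpenCV.
    // Threshold values are illustrative assumptions.
    #include <opencv2/opencv.hpp>

    // Skin-colour segmentation: threshold the Cr/Cb chroma channels.
    cv::Mat segmentSkin(const cv::Mat& bgr) {
        cv::Mat ycrcb, mask;
        cv::cvtColor(bgr, ycrcb, cv::COLOR_BGR2YCrCb);
        cv::inRange(ycrcb, cv::Scalar(0, 133, 77), cv::Scalar(255, 173, 127), mask);
        cv::medianBlur(mask, mask, 5);      // remove speckle noise
        return mask;                        // 255 where skin-like
    }

    // Depth threshold: keep only pixels closer than a cut-off (e.g. a hand
    // held in front of the table). depth16 is a 16-bit depth map in mm.
    cv::Mat segmentByDepth(const cv::Mat& depth16, int maxDepthMm = 800) {
        cv::Mat mask;
        cv::inRange(depth16, cv::Scalar(1), cv::Scalar(maxDepthMm), mask);  // 0 = no reading
        return mask;
    }
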
  34. 34. Architecture 5. Gesture •  Static Gestures •  Dynamic Gestures •  Context based Gestures o  Identify and track objects between frames based on XYZRGB. o  Usage: Identifying current position/orientation of the tracked object in space. o  For example: 4. Modeling •  Hand recognition/ modeling •  Rigid-body modeling 3. Classification/Tracking 2. Segmentation 1. Hardware Interface Training set of hand poses, colors represent unique regions of the hand. Raw output (without cleaning) classified on real hand input (depth image).
  35. 35. Architecture 5. Gesture •  Static Gestures •  Dynamic Gestures •  Context based Gestures 4. Modeling •  Hand recognition/ modeling •  Rigid-body modeling 3. Classification/Tracking 2. Segmentation 1. Hardware Interface o  Hand Recognition/Modeling   Skeleton based (for low resolution approximation)   Model based (for more accurate representation) o  Object Modeling (identification and tracking of rigid-body objects) o  Physical Modeling (physical interaction)   Sphere Proxy   Model based   Mesh based o  Usage: For general spatial interaction in AR/VR environment
  36. 36. Architecture 5. Gesture •  Static Gestures •  Dynamic Gestures •  Context based Gestures 4. Modeling •  Hand recognition/ modeling •  Rigid-body modeling 3. Classification/Tracking 2. Segmentation 1. Hardware Interface o  Static (hand pose recognition) o  Dynamic (meaningful movement recognition) o  Context-based gesture recognition (gestures with context, e.g. pointing) o  Usage: Issuing commands/anticipating user intention and high level interaction.
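
As a baseline illustration of the static-gesture layer (not the gesture library's own recognizer), a hand pose such as open hand, fist or point can be estimated from the segmented hand contour via convexity defects:

    // Sketch: a simple static hand-pose recognizer (open hand / fist / point)
    // from a binary hand mask, using contour + convexity-defect analysis.
    // A baseline illustration only; thresholds are assumptions.
    #include <opencv2/opencv.hpp>
    #include <string>
    #include <vector>

    std::string classifyStaticPose(const cv::Mat& handMask) {
        std::vector<std::vector<cv::Point>> contours;
        cv::findContours(handMask.clone(), contours, cv::RETR_EXTERNAL,
                         cv::CHAIN_APPROX_SIMPLE);
        if (contours.empty()) return "none";

        // Take the largest contour as the hand.
        size_t hand = 0;
        for (size_t i = 1; i < contours.size(); ++i)
            if (cv::contourArea(contours[i]) > cv::contourArea(contours[hand]))
                hand = i;

        std::vector<int> hullIdx;
        cv::convexHull(contours[hand], hullIdx, false, false);
        std::vector<cv::Vec4i> defects;
        if (hullIdx.size() > 3)
            cv::convexityDefects(contours[hand], hullIdx, defects);

        // Deep defects roughly correspond to gaps between extended fingers.
        int deepDefects = 0;
        for (const cv::Vec4i& d : defects)
            if (d[3] / 256.0 > 20.0)        // defect depth in pixels (assumed)
                ++deepDefects;

        // Rough heuristic mapping from finger gaps to pose labels.
        if (deepDefects >= 4) return "open_hand";
        if (deepDefects == 0) return "fist";
        return "point";
    }
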
  37. 37. Skeleton Based Interaction   3 Gear Systems   Kinect/Primesense Sensor   Two hand tracking   http://www.threegear.com
  38. 38. Skeleton Interaction + AR   HMD AR View   Viewpoint tracking   Two hand input   Skeleton interaction, occlusion
  39. 39. What Gestures do People Want to Use?   Limitations of Previous work in AR   Limited range of gestures   Gestures designed for optimal recognition   Gestures studied as add-on to speech   Solution – elicit desired gestures from users   E.g. gestures for surface computing [Wobbrock]   Previous work on unistroke gestures, mobile gestures
  40. 40. User Defined Gesture Study   Use AR view   HMD + AR tracking   Present AR animations   40 tasks in six categories -  Editing, transforms, menu, etc   Ask users to produce gestures causing animations   Record gesture (video, depth) Piumsomboon, T., Clark, A., Billinghurst, M., & Cockburn, A. (2013, April). User-defined gestures for augmented reality. In CHI'13 Extended Abstracts on Human Factors in Computing Systems (pp. 955-960). ACM.
  41. 41. Data Recorded   20 participants   Gestures recorded (video, depth data)   800 gestures from 40 tasks   Subjective rankings   Likert ranking of goodness, ease of use   Think aloud transcripts
  42. 42. Typical Gestures
  43. 43. Results - Gestures   Gestures grouped according to similarity – 320 groups   44 consensus (62% all gestures)   276 low similarity (discarded)   11 hand poses seen   Degree of consensus (A) using guessability score [Wobbrock]
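
For reference, the degree-of-consensus measure cited here follows Wobbrock et al.'s agreement/guessability formulation, computed per task from the groups of identical gestures (the slide itself does not give the formula; shown in LaTeX):

    % Agreement for a task t, where P_t is the set of proposed gestures for t
    % and each P_i is a subset of identical gestures within P_t:
    A_t = \sum_{P_i \subseteq P_t} \left( \frac{|P_i|}{|P_t|} \right)^{2}
    % Overall agreement is the mean over the task set T:
    A = \frac{1}{|T|} \sum_{t \in T} A_t
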
  44. 44. Results – Agreement Scores   Red line – proportion of two-handed gestures
  45. 45. Usability Results (Likert scale 1-7, 7 = Very Good): Ease of Performance – Consensus 6.02, Discarded 5.50; Good Match – Consensus 6.17, Discarded 5.83   Significant difference between consensus and discarded gesture sets (p < 0.0001)   Gestures in consensus set better than discarded gestures in perceived performance and goodness
  46. 46. Lessons Learned   AR animation can elicit desired gestures   For some tasks there is a high degree of similarity in user-defined gestures   Especially command gestures (e.g. Open, Select)   Less agreement in manipulation gestures   Move (40%), rotate (30%), grouping (10%)   Small portion of two-handed gestures (22%)   Scaling, group selection
  47. 47. Multimodal Input
  48. 48. Multimodal Interaction   Combined speech input   Gesture and Speech complementary   Speech -  modal commands, quantities   Gesture -  selection, motion, qualities   Previous work found multimodal interfaces intuitive for 2D/3D graphics interaction
  49. 49. Wizard of Oz Study   What speech and gesture input would people like to use?   Wizard   Perform speech recognition   Command interpretation   Domain   3D object interaction/modelling Lee, M., & Billinghurst, M. (2008, October). A Wizard of Oz study for an AR multimodal interface. In Proceedings of the 10th international conference on Multimodal interfaces (pp. 249-256). ACM.
  50. 50. System Architecture
  51. 51. Hand Segmentation
  52. 52. System Set Up
  53. 53. Experiment   12 participants   Two display conditions (HMD vs. Desktop)   Three tasks   Task 1: Change object color/shape   Task 2: 3D positioning of objects   Task 3: Scene assembly
  54. 54. Key Results   Most commands multimodal   Multimodal (63%), Gesture (34%), Speech (4%)   Most spoken phrases short   74% phrases average 1.25 words long   Sentences (26%) average 3 words   Main gestures deictic (65%), metaphoric (35%)   In multimodal commands gesture issued first   94% time gesture begun before speech   Multimodal window 8s – speech 4.5s after gesture
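
The timing figures above (gesture first, speech typically about 4.5 s later, within an 8 s multimodal window) suggest a simple time-window fusion rule. A sketch of that idea only; the event types and structure are assumptions, not the study system's implementation:

    // Sketch: time-window multimodal fusion. A gesture opens a window; if a
    // speech command arrives inside it, the two are fused into one command,
    // otherwise the gesture is handled alone. The 8 s window follows the
    // slide; everything else is an illustrative assumption.
    #include <chrono>
    #include <optional>
    #include <string>

    using Clock = std::chrono::steady_clock;

    struct GestureEvent { std::string type;   Clock::time_point time; };  // e.g. "point"
    struct SpeechEvent  { std::string phrase; Clock::time_point time; };  // e.g. "blue"
    struct FusedCommand { std::string gesture; std::string phrase; };

    class MultimodalFusion {
    public:
        explicit MultimodalFusion(std::chrono::seconds window = std::chrono::seconds(8))
            : window_(window) {}

        void onGesture(const GestureEvent& g) { pending_ = g; }

        // Returns a fused command if speech falls inside the gesture's window.
        std::optional<FusedCommand> onSpeech(const SpeechEvent& s) {
            if (pending_ && s.time - pending_->time <= window_) {
                FusedCommand cmd{pending_->type, s.phrase};
                pending_.reset();
                return cmd;
            }
            return std::nullopt;   // speech-only command, handled elsewhere
        }

    private:
        std::chrono::seconds window_;
        std::optional<GestureEvent> pending_;
    };
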
  55. 55. Free Hand Multimodal Input Point Move Pick/Drop   Use free hand to interact with AR content   Recognize simple gestures   Open hand, closed hand, pointing Lee, M., Billinghurst, M., Baek, W., Green, R., & Woo, W. (2013). A usability study of multimodal input in an augmented reality environment. Virtual Reality, 17(4), 293-305.
  56. 56. Speech Input   MS Speech + MS SAPI (> 90% accuracy)   Single word speech commands
  57. 57. Multimodal Architecture
  58. 58. Multimodal Fusion
  59. 59. Hand Occlusion
  60. 60. Experimental Setup Change object shape and colour
  61. 61. User Evaluation   25 subjects, 10 task trials x 3, 3 conditions   Change object shape, colour and position   Conditions   Speech only, gesture only, multimodal   Measures   performance time, errors (system/user), subjective survey
  62. 62. Results - Performance   Average performance time   Gesture: 15.44s   Speech: 12.38s   Multimodal: 11.78s   Significant difference across conditions (p < 0.01)   Difference between gesture and speech/MMI
  63. 63. Errors   User errors – errors per task   Gesture (0.50), Speech (0.41), MMI (0.42)   No significant difference   System errors   Speech accuracy – 94%, Gesture accuracy – 85%   MMI accuracy – 90%
  64. 64. Subjective Results (Likert 1-7): Naturalness – Gesture 4.60, Speech 5.60, MMI 5.80; Ease of Use – Gesture 4.00, Speech 5.90, MMI 6.00; Efficiency – Gesture 4.45, Speech 5.15, MMI 6.05; Physical Effort – Gesture 4.75, Speech 3.15, MMI 3.85   User subjective survey   Gesture significantly worse, MMI and Speech same   MMI perceived as most efficient   Preference   70% MMI, 25% speech only, 5% gesture only
  65. 65. Observations   Significant difference in number of commands   Gesture (6.14), Speech (5.23), MMI (4.93)   MMI Simultaneous vs. Sequential commands   79% sequential, 21% simultaneous   Reaction to system errors   Almost always repeated same command   In MMI rarely changes modalities
  66. 66. Lessons Learned   Multimodal interaction significantly better than gesture alone in AR interfaces for 3D tasks   Short task time, more efficient   Users felt that MMI was more natural, easier, and more effective than gesture/speech only   Simultaneous input rarely used   More studies need to be conducted
  67. 67. Intelligent Interfaces
  68. 68. Intelligent Interfaces   Most AR systems stupid   Don’t recognize user behaviour   Don’t provide feedback   Don’t adapt to user   Especially important for training   Scaffolded learning   Moving beyond check-lists of actions
  69. 69. Intelligent Interfaces   AR interface + intelligent tutoring system   ASPIRE constraint based system (from UC)   Constraints -  relevance cond., satisfaction cond., feedback Westerfield, G., Mitrovic, A., & Billinghurst, M. (2013). Intelligent Augmented Reality Training for Assembly Tasks. In Artificial Intelligence in Education (pp. 542-551). Springer Berlin Heidelberg.
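
Constraint-based tutoring of this kind can be pictured as pairs of relevance and satisfaction conditions with attached feedback. The assembly-state type and the example constraint below are illustrative assumptions, not ASPIRE's internal representation:

    // Sketch: the shape of a constraint in a constraint-based tutor
    // (relevance condition, satisfaction condition, feedback). The
    // AssemblyState type and example are illustrative assumptions.
    #include <functional>
    #include <iostream>
    #include <string>
    #include <vector>

    struct AssemblyState {
        bool baseAttached = false;
        bool motorMounted = false;
    };

    struct Constraint {
        std::function<bool(const AssemblyState&)> relevant;    // when does it apply?
        std::function<bool(const AssemblyState&)> satisfied;   // is the step correct?
        std::string feedback;                                  // shown when violated
    };

    std::vector<std::string> evaluate(const AssemblyState& s,
                                      const std::vector<Constraint>& constraints) {
        std::vector<std::string> messages;
        for (const auto& c : constraints)
            if (c.relevant(s) && !c.satisfied(s))
                messages.push_back(c.feedback);   // corrective feedback to show in AR
        return messages;
    }

    int main() {
        std::vector<Constraint> constraints = {
            { [](const AssemblyState& s) { return s.motorMounted; },  // relevant once motor step starts
              [](const AssemblyState& s) { return s.baseAttached; },  // base must already be attached
              "Attach the base plate before mounting the motor." }
        };
        AssemblyState state;        // base not attached...
        state.motorMounted = true;  // ...yet the motor has been mounted
        for (const std::string& msg : evaluate(state, constraints))
            std::cout << msg << "\n";
    }
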
  70. 70. Domain Ontology
  71. 71. Intelligent Feedback   Actively monitors user behaviour   Implicit vs. explicit interaction   Provides corrective feedback
  72. 72. Evaluation Results   16 subjects, with and without ITS   Improved task completion   Improved learning
  73. 73. Intelligent Agents   AR characters   Virtual embodiment of system   Multimodal input/output   Examples   AR Lego, Welbo, etc   Mr Virtuoso -  AR character more real, more fun -  On-screen 3D and AR similar in usefulness Wagner, D., Billinghurst, M., & Schmalstieg, D. (2006). How real should virtual characters be? In Proceedings of the 2006 ACM SIGCHI international conference on Advances in computer entertainment technology (p. 57). ACM.
  74. 74. Looking to the Future What’s Next?
  75. 75. Directions for Future Research   Mobile Gesture Interaction   Tablet, phone interfaces   Wearable Systems   Google Glass   Novel Displays   Contact lens   Environmental Understanding   Semantic representation
  76. 76. Mobile Gesture Interaction   Motivation   Richer interaction with handheld devices   Natural interaction with handheld AR   2D tracking   Finger tip tracking   3D tracking [Hurst and Wezel 2013]   Hand tracking [Henrysson et al. 2007] Henrysson, A., Marshall, J., & Billinghurst, M. (2007). Experiments in 3D interaction for mobile phone AR. In Proceedings of the 5th international conference on Computer graphics and interactive techniques in Australia and Southeast Asia (pp. 187-194). ACM.
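
For the 2D fingertip tracking mentioned here, a common baseline is to segment the hand and take the contour point farthest from the palm centre as the fingertip; a rough OpenCV sketch of that generic approach (not the cited papers' methods):

    // Sketch: locating a single fingertip in a binary hand mask as the contour
    // point farthest from the palm centre. A generic baseline for handheld AR
    // fingertip interaction, not the method from the cited papers.
    #include <opencv2/opencv.hpp>
    #include <vector>

    bool findFingertip(const cv::Mat& handMask, cv::Point& tip) {
        std::vector<std::vector<cv::Point>> contours;
        cv::findContours(handMask.clone(), contours, cv::RETR_EXTERNAL,
                         cv::CHAIN_APPROX_SIMPLE);
        if (contours.empty()) return false;

        // Largest contour = hand; palm centre approximated by the contour centroid.
        size_t hand = 0;
        for (size_t i = 1; i < contours.size(); ++i)
            if (cv::contourArea(contours[i]) > cv::contourArea(contours[hand]))
                hand = i;
        cv::Moments m = cv::moments(contours[hand]);
        if (m.m00 == 0) return false;
        double cx = m.m10 / m.m00, cy = m.m01 / m.m00;

        // Fingertip = contour point farthest from the palm centre.
        double best = -1.0;
        for (const cv::Point& p : contours[hand]) {
            double dx = p.x - cx, dy = p.y - cy;
            double d2 = dx * dx + dy * dy;
            if (d2 > best) { best = d2; tip = p; }
        }
        return true;
    }
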
  77. 77. Fingertip Based Interaction Running System System Setup Mobile Client + PC Server Bai, H., Gao, L., El-Sana, J., & Billinghurst, M. (2013). Markerless 3D gesture-based interaction for handheld augmented reality interfaces. In SIGGRAPH Asia 2013 Symposium on Mobile Graphics and Interactive Applications (p. 22). ACM.
  78. 78. System Architecture
  79. 79. 3D Prototype System   3 Gear + Vuforia   Hand tracking + phone tracking   Freehand interaction on phone   Skeleton model   3D interaction   20 fps performance
  80. 80. Google Glass
  81. 81. User Experience   Truly Wearable Computing   Less than 46 grams   Hands-free Information Access   Voice interaction, Ego-vision camera   Intuitive User Interface   Touch, Gesture, Speech, Head Motion   Access to all Google Services   Map, Search, Location, Messaging, Email, etc
  82. 82. Contact Lens Display   Babak Parviz   University of Washington   MEMS components   Transparent elements   Micro-sensors   Challenges   Miniaturization   Assembly   Eye-safe
  83. 83. Contact Lens Prototype
  84. 84. Environmental Understanding   Semantic understanding of environment   What are the key objects?   What are their relationships?   Represented in a form suitable for multimodal interaction?
  85. 85. Conclusion
  86. 86. Conclusions   AR experiences need new interaction methods   Enabling technologies are advancing quickly   Displays, tracking, depth capture devices   Natural user interfaces possible   Free hand gesture, speech, intelligent interfaces   Important research for the future   Mobile, wearable, displays
  87. 87. More Information •  Mark Billinghurst –  Email: mark.billinghurst@hitlabnz.org –  Twitter: @marknb00 •  Website –  http://www.hitlabnz.org/