PACER: Fine-grained Interactive Paper via Hybrid Camera and Touch Gestures on a Cell Phone


PACER is a gesture-based interactive paper system that supports fine-grained paper document content manipulation through the touch screen of a cameraphone. Using the phone’s camera, PACER links a paper document to its digital version based on visual features. It adopts camera-based phone motion detection for embodied gestures (e.g. marquees, underlines and lassos), with which users can flexibly select and interact with document details (e.g. individual words, symbols and pixels). The touch input is incorporated to facilitate target selection at fine granularity, and to address some limitations of the embodied interaction, such as hand jitter and low input sampling rate. This hybrid interaction is coupled with other techniques such as semi-real time document tracking and loose physical-digital document registration, offering a gesture-based command system. We demonstrate the use of PACER in various scenarios including work-related reading, maps and music score playing. A preliminary user study on the design has produced encouraging user feedback, and suggested future research for better understanding of embodied vs. touch interaction and one vs. two handed interaction.

Published in: Technology

  1. 1. PACER: Fine-grained Interactive Paper via Camera-touch Hybrid Gestures on a Cell Phone. Chunyuan Liao, Qiong Liu, Bee Liew, Lynn Wilcox (FX Palo Alto Laboratory). ACM CHI Conference, Atlanta, GA, U.S.A., 4/15/2010
  2. 2. Scenario One I should email Jenny about this interesting article, especially the pulses in this curve
  3. 3. Scenario Two I want to search for the definition of θ in this book
  4. 4. Typical Solutions <ul><li>Step 1: Switch to digital media </li></ul><ul><ul><li>On-the-fly conversion </li></ul></ul><ul><ul><ul><li>Capture pictures and apply OCR (Optical Character Recognition) </li></ul></ul></ul><ul><ul><li>Retrieval of the digital version </li></ul></ul><ul><ul><ul><li>Find the digital version by file system browsing or web search </li></ul></ul></ul><ul><li>Step 2: Locate the specific content </li></ul><ul><ul><li>Navigate to the specific page and find the figure </li></ul></ul><ul><li>Step 3: Interact </li></ul><ul><ul><li>Mark the interesting region and email </li></ul></ul><ul><ul><li>Type θ for full text search </li></ul></ul>
  5. 5. Problems <ul><li>Media switching </li></ul><ul><ul><li>On-the-fly conversion </li></ul></ul><ul><ul><ul><li>Inefficient, no contextual information, low quality </li></ul></ul></ul><ul><ul><li>Finding the digital version </li></ul></ul><ul><ul><ul><li>File system browsing and typing on cell phones are inconvenient </li></ul></ul></ul><ul><ul><ul><li>The keyword-based search may be inaccurate </li></ul></ul></ul><ul><li>Location & interaction on cell phones </li></ul><ul><ul><li>Not integrated with paper </li></ul></ul><ul><ul><ul><li>Lose the working context already established on paper </li></ul></ul></ul><ul><ul><ul><li>Redundant document navigation </li></ul></ul></ul><ul><ul><ul><li>No direct interaction with the paper </li></ul></ul></ul><ul><ul><li>Small screen, lower display quality and inconvenient input </li></ul></ul>
  6. 6. Video Demo <ul><li>PACER: Paper And Cell phone for Editing and Reading </li></ul><ul><ul><li>Better integration of paper and cell phones </li></ul></ul>
  7. 7. Paper vs. Cell phones <ul><li>Seamless integration of their complementary affordances </li></ul><ul><ul><li>Paper strengths: high display quality, flexible spatial arrangement, instant accessibility, high robustness </li></ul></ul><ul><ul><li>Paper weaknesses: static display, no computation capability, no digital communication </li></ul></ul><ul><ul><li>Cell phone strengths: dynamic rendering, rich digital interaction, digital communication </li></ul></ul><ul><ul><li>Cell phone weaknesses: lower display quality, small display size, inconvenient input, lower robustness </li></ul></ul><ul><li>Computer-like UX on paper, to put paper and digital media on a more equal footing for smooth integration </li></ul>
  8. 8. The State of the Art <ul><li>Interaction target granularity </li></ul><ul><ul><li>Text patches [Erol, 08] </li></ul></ul><ul><ul><li>Pre-defined map regions [Rohs,04] </li></ul></ul><ul><li>Interaction styles </li></ul><ul><ul><li>Point & Click [Hare,08] </li></ul></ul><ul><li>Role of paper </li></ul><ul><ul><li>Paper as transient input source [Arai,97] </li></ul></ul><ul><ul><li>Paper as a proxy of digital documents </li></ul></ul><ul><ul><ul><li>[Liao,08][Weibel,08][Tsandilas,09] </li></ul></ul></ul><ul><li>Recognition mechanisms </li></ul><ul><ul><li>Barcodes [Rohs,08] , RFID [Reilly,06] </li></ul></ul><ul><ul><li>Anoto [Liao,08][Weibel,08][Tsandilas,09] </li></ul></ul><ul><ul><li>Text visual features [Hull,07][Liu, 08] </li></ul></ul><ul><ul><li>Image visual features [Lowe,04] [Liu,09] </li></ul></ul><ul><li>Fine granularity </li></ul><ul><li>Rich interaction styles </li></ul><ul><li>User-specified arbitrary content and actions </li></ul><ul><li>Generic documents </li></ul>Computer-like UX: PACER
  9. 9. Overview of PACER <ul><li>Highlights </li></ul><ul><ul><li>Generic paper documents </li></ul></ul><ul><ul><ul><li>No barcodes, no RFID tags, no special devices </li></ul></ul></ul><ul><ul><ul><li>Text (language-independent), pictures, graphics </li></ul></ul></ul><ul><ul><li>Rich gesture-based interaction styles </li></ul></ul><ul><ul><ul><li>Hybrid camera and touch input </li></ul></ul></ul><ul><ul><li>Fine-grained interaction </li></ul></ul><ul><ul><ul><li>Individual Latin words, Chinese/Japanese characters, math symbols, user-specified map places and image regions </li></ul></ul></ul>Camera phone + normal printouts + gestures for fine-grained interaction
  10. 10. Architecture of the PACER system <ul><li>Client-server architecture </li></ul><ul><li>Data flow </li></ul><ul><ul><li>Print → Register → Capture → Recognize → Retrieve → Interact </li></ul></ul>
  11. 11. Overview of the PACER Interface Physical-digital Interaction Mapping Fine-grained Command System <ul><li>Semi-real time processing </li></ul><ul><li>Loose Registration </li></ul><ul><li>Hand Jitter Handling </li></ul><ul><li>Hybrid Camera-touch Gestures </li></ul>Application +
  12. 12. Content-based Physical-Digital Interaction Mapping <ul><li>Similar to Augmented Reality (AR) </li></ul><ul><li>Content-based approach </li></ul><ul><ul><li>Local image visual features </li></ul></ul><ul><ul><ul><li>SIFT [Lowe,04], FIT [Liu,09] </li></ul></ul></ul><ul><ul><li>Robust to partial documents, occlusion, scaling and rotation </li></ul></ul><ul><ul><li>Generic document content types </li></ul></ul><ul><li>Advantages of physical-digital linkage </li></ul><ul><ul><li>Rich contextual information </li></ul></ul><ul><ul><li>Persistent digital info. associated with paper </li></ul></ul>Feature extraction → feature matching → image registration
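The slide above ends with image registration, which is what lets a point in the camera frame be mapped to a location on the digital page. A minimal sketch of that last mapping step, assuming registration yields a 3x3 homography (an assumption; the slides do not name the transform). The function `map_point` and the matrix `H` are hypothetical, chosen only for illustration.

```python
from typing import Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]


def map_point(h: Matrix3, x: float, y: float) -> Tuple[float, float]:
    """Map a camera-frame point (x, y) to page coordinates with homography h."""
    xp = h[0][0] * x + h[0][1] * y + h[0][2]
    yp = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to get page coordinates.
    return xp / w, yp / w


# Hypothetical registration result: the frame is scaled 2x and shifted
# by (100, 50) relative to the page. Real values would come from matching.
H = [[2.0, 0.0, 100.0],
     [0.0, 2.0, 50.0],
     [0.0, 0.0, 1.0]]

# Centre of a 320x240 frame, i.e. where an on-screen crosshair might sit.
crosshair_on_page = map_point(H, 160, 120)
```

With a mapping like this, the word or region under the crosshair can be looked up in the digital version rather than read from the low-quality video frame.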
  13. 13. Highlights (1): Semi-Real-Time Processing <ul><li>Challenge: Recognition and transmission are too expensive for continuous document tracking </li></ul><ul><ul><li>~300 ms for 320x240 pictures with a 2.8 GHz 4-core CPU </li></ul></ul><ul><ul><li>~1000 ms for a complete SOAP call transaction </li></ul></ul><ul><li>Solutions: </li></ul><ul><ul><li>Fast algorithms optimized for cell phones </li></ul></ul><ul><ul><ul><li>[Wagner,08] </li></ul></ul></ul><ul><ul><li>Remote recognition + local motion detection </li></ul></ul><ul><ul><ul><li>Recognition is slower but more accurate </li></ul></ul></ul><ul><ul><ul><li>Camera-based motion detection is faster but accumulates errors </li></ul></ul></ul><ul><ul><ul><li>Motion detection is independent of background content (good for loose registration!) </li></ul></ul></ul>
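The combination of slow remote recognition and fast local motion detection described above can be sketched as a position estimate that is nudged every frame and reset whenever a recognition result arrives. This is an illustrative sketch, not PACER's implementation: the class `CrosshairTracker`, its method names, and the idea of re-applying the motion that occurred while the recognition request was in flight are all assumptions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class CrosshairTracker:
    """Estimated phone position on the page, in page coordinates."""
    x: float = 0.0
    y: float = 0.0
    frames_since_fix: int = 0  # rough indicator of accumulated drift

    def on_motion(self, dx: float, dy: float) -> None:
        """Fast local update from camera-based motion detection."""
        self.x += dx
        self.y += dy
        self.frames_since_fix += 1

    def on_recognition(self, x: float, y: float,
                       motion_since_capture: Tuple[float, float] = (0.0, 0.0)) -> None:
        """Slow absolute fix from remote recognition.

        The result describes where the phone was when the frame was
        captured, so motion observed since then is added back (assumption).
        """
        self.x = x + motion_since_capture[0]
        self.y = y + motion_since_capture[1]
        self.frames_since_fix = 0


tracker = CrosshairTracker()
tracker.on_motion(5.0, 0.0)   # per-frame deltas, drift builds up
tracker.on_motion(5.0, 1.0)
tracker.on_recognition(100.0, 200.0, motion_since_capture=(2.0, 0.0))
```

The design choice here is that recognition is treated as ground truth whenever it is available, while motion detection only fills the gaps between fixes.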
  14. 14. PACER Gestures <ul><li>Simulate pen-based computer interfaces [Hinckley,05] </li></ul><ul><ul><li>Pointer: an individual word or character </li></ul></ul><ul><ul><li>Underline, Bracket, Vertical Bar: a text line, sentence and chunk </li></ul></ul><ul><ul><li>Lasso, Marquee: an arbitrary document region </li></ul></ul><ul><ul><li>Path: a route </li></ul></ul><ul><ul><li>Free-form </li></ul></ul>Pointer Underline Bracket Vertical Bar Lasso Marquee Path Free-form
  15. 15. Highlights (2): Loose Registration <ul><li>Mobile AR UIs not optimized for fine-grained interaction </li></ul><ul><ul><li>Inaccurate image registration </li></ul></ul><ul><ul><li>Low image quality of cell phone video frames </li></ul></ul><ul><ul><ul><li>Low resolution, out-of-focus, bad lighting conditions, distortion </li></ul></ul></ul><ul><ul><li>Hand jitter and fatigue </li></ul></ul>displaced overlay illegible content
  16. 16. Highlights (2): Loose Registration <ul><li>Our solution </li></ul><ul><ul><li>Replace raw frames with high quality images </li></ul></ul><ul><ul><li>Perform recognition only when required </li></ul></ul><ul><ul><ul><li>background-independent motion detection </li></ul></ul></ul><ul><li>Advantages over the strict registration of normal AR UIs </li></ul><ul><ul><li>Robust to inaccurate image registration </li></ul></ul><ul><ul><li>Better legibility </li></ul></ul><ul><ul><li>Fewer demands on phone-paper coordination </li></ul></ul><ul><ul><li>More flexibility for user interface designs </li></ul></ul><ul><ul><ul><li>Stable zoom levels </li></ul></ul></ul><ul><ul><ul><li>User-changeable control-to-display ratio </li></ul></ul></ul>perfect overlay
  17. 17. Highlights (3): Hand Jitter Handling <ul><li>Hand jitter affects fine-grained interaction too </li></ul><ul><ul><li>Inherent in direct freehand pointing </li></ul></ul><ul><ul><ul><li>Hand-held projector [Forlines,05], Laser pointer [Olsen,01] </li></ul></ul></ul><ul><li>Solutions </li></ul><ul><ul><li>Filter (Zoom-and-pick [Forlines,05]) </li></ul></ul><ul><ul><li>Beautification (REXplorer [Kratz,09]) </li></ul></ul><ul><ul><li>Snap-to-object </li></ul></ul>Physical pointer Logical pointer
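Two of the jitter remedies named above, filtering and snap-to-object, can be sketched in a few lines. This is a hedged illustration only: the slides do not say which filter PACER uses, so exponential smoothing is swapped in as a stand-in, and `JitterFilter`, `snap_to_object`, and the word boxes are hypothetical names and data.

```python
from math import hypot, inf
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # left, top, right, bottom


class JitterFilter:
    """Exponential smoothing of the raw (physical) pointer position."""

    def __init__(self, alpha: float = 0.3) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.pos: Optional[Point] = None

    def update(self, x: float, y: float) -> Point:
        if self.pos is None:
            self.pos = (x, y)  # first sample: nothing to smooth yet
        else:
            px, py = self.pos
            self.pos = (px + self.alpha * (x - px), py + self.alpha * (y - py))
        return self.pos


def snap_to_object(point: Point, boxes: List[Box],
                   max_dist: float = 20.0) -> Optional[int]:
    """Return the index of the box nearest to point, or None if too far."""
    x, y = point
    best_index, best_dist = None, inf
    for i, (left, top, right, bottom) in enumerate(boxes):
        # Distance from the point to the box (0 if the point is inside).
        dx = max(left - x, 0.0, x - right)
        dy = max(top - y, 0.0, y - bottom)
        dist = hypot(dx, dy)
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index if best_dist <= max_dist else None


# Hypothetical bounding boxes of two adjacent words on a text line.
words = [(0.0, 0.0, 40.0, 10.0), (50.0, 0.0, 90.0, 10.0)]
selected = snap_to_object((47.0, 5.0), words)
```

In this reading, the filtered or snapped position plays the role of the logical pointer, while the raw camera-derived position is the physical pointer.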
  18. 18. Highlights (4): Hybrid Gestures <ul><li>Direct touch manipulation is faster and more intuitive for within-thumb-reach content </li></ul><ul><li>Embodied vs. Touch interaction </li></ul><ul><ul><li>Touch can enhance fine-grained interaction </li></ul></ul>
  19. 19. Highlights (4): Hybrid Gestures <ul><li>Our proposal: </li></ul><ul><ul><li>Embodied gestures for faster and coarser navigation </li></ul></ul><ul><ul><li>Touch gestures for slower and finer navigation </li></ul></ul><ul><ul><li>Automatically switch with touch down/up </li></ul></ul>
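The automatic switch between embodied and touch gestures on touch down/up can be sketched as a two-state controller. This is a minimal sketch under stated assumptions: the class `HybridCursor`, the `touch_gain` parameter, and the choice to ignore phone motion while a finger is down are illustrative and not confirmed by the slides.

```python
from enum import Enum


class Mode(Enum):
    EMBODIED = "embodied"  # phone motion drives the cursor (fast, coarse)
    TOUCH = "touch"        # finger motion drives the cursor (slow, fine)


class HybridCursor:
    def __init__(self, touch_gain: float = 0.25) -> None:
        self.mode = Mode.EMBODIED
        self.x = 0.0
        self.y = 0.0
        # Gain below 1 makes touch movement finer than finger movement.
        self.touch_gain = touch_gain

    def touch_down(self) -> None:
        self.mode = Mode.TOUCH

    def touch_up(self) -> None:
        self.mode = Mode.EMBODIED

    def phone_moved(self, dx: float, dy: float) -> None:
        # Assumption: phone motion is ignored while a finger is on screen.
        if self.mode is Mode.EMBODIED:
            self.x += dx
            self.y += dy

    def finger_moved(self, dx: float, dy: float) -> None:
        if self.mode is Mode.TOUCH:
            self.x += dx * self.touch_gain
            self.y += dy * self.touch_gain


cursor = HybridCursor(touch_gain=0.25)
cursor.phone_moved(40.0, 0.0)   # coarse navigation by moving the phone
cursor.touch_down()
cursor.phone_moved(40.0, 0.0)   # ignored while touching
cursor.finger_moved(8.0, 4.0)   # fine adjustment with the thumb
cursor.touch_up()
```

Ignoring phone motion during touch also sidesteps hand jitter while the user is making the fine adjustment, which is one plausible reason to pair the two inputs this way.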
  20. 20. Video Demo <ul><li>Hybrid camera-touch gestures </li></ul>
  21. 21. Video Demo of PACER Applications <ul><li>Highlights </li></ul><ul><ul><li>Fine-grained content manipulation </li></ul></ul><ul><ul><li>Contextual information </li></ul></ul><ul><ul><li>Rich and intuitive interaction with paper </li></ul></ul><ul><ul><li>Generic document content types </li></ul></ul>
  22. 22. Preliminary User study (1) <ul><li>Task </li></ul><ul><ul><li>Select designated words and pictures in a marked printout </li></ul></ul><ul><li>Participants </li></ul><ul><ul><li>Six colleagues not affiliated with PACER </li></ul></ul><ul><li>Settings </li></ul><ul><ul><li>4 testing pages </li></ul></ul><ul><ul><li>400+ database pages </li></ul></ul><ul><ul><li>2 sessions x 16 trials </li></ul></ul>Tracking tag
  23. 23. Preliminary User study (2) <ul><li>Overall feedback was positive </li></ul><ul><ul><li>Novel ideas, useful for mobile settings </li></ul></ul><ul><li>Document recognition </li></ul><ul><ul><li>81.6% accuracy for the first shots </li></ul></ul><ul><ul><li>Failure sources: motion blur, out of range, shadow </li></ul></ul>
  24. 24. Preliminary User study (3) <ul><li>Embodied vs. Touch </li></ul><ul><ul><li>Embodied gestures are faster but more difficult to learn </li></ul></ul><ul><ul><ul><li>Caused by inadvertent phone movement and unfamiliar mental model </li></ul></ul></ul><ul><ul><li>Touch gestures are more familiar but may be hard for one-handed operations </li></ul></ul><ul><li>Loose vs. Strict registration </li></ul><ul><ul><li>Feedback on loose registration was positive </li></ul></ul><ul><ul><ul><li>The retrieved high quality documents were appreciated </li></ul></ul></ul><ul><ul><ul><li>Lower mental and physical demands </li></ul></ul></ul><ul><ul><ul><li>Depends on the completeness of the digital models </li></ul></ul></ul>
  25. 25. Preliminary User study (4) <ul><li>One vs. two-handed input </li></ul><ul><ul><li>One-handed input is more flexible for simultaneous manipulation of paper and the phone </li></ul></ul><ul><ul><li>Two-handed input is more stable and easier for touch gestures </li></ul></ul>Participant-initiated two-handed interaction
  26. 26. Conclusion & future work <ul><li>PACER is a cell phone-based interactive paper system </li></ul><ul><ul><li>Generic paper documents linked to rich digital information </li></ul></ul><ul><ul><li>Flexible interaction with the hybrid camera-touch gestures </li></ul></ul><ul><ul><li>Fine-grained content manipulation </li></ul></ul><ul><li>Future work </li></ul><ul><ul><li>Understand embodied/touch and one/two-handed interaction </li></ul></ul><ul><ul><li>Investigate loose vs. strict registrations in more application scenarios </li></ul></ul><ul><ul><li>Integrate PACER with other devices like mobile projectors and digital pens </li></ul></ul><ul><ul><li>Explore more application areas </li></ul></ul>
  27. 27. Thank You! <ul><li>Acknowledgement </li></ul><ul><ul><li>Our colleague participants </li></ul></ul><ul><ul><li>Don Kimber and Tony Dunnigan </li></ul></ul><ul><ul><li>CHI reviewers </li></ul></ul><ul><li>More resources </li></ul><ul><ul><li> </li></ul></ul>