Drawing in Talking:
Using Pen and Voice for Drawing System Configuration
Figures in Talking
Research and Technology Department
Xingya Xu (xingya.xu@fujixerox.co.jp)
December 8, 2017
IDW/AD’ 17
December 6-8
Sendai, Japan
INP7/UXC6 - 2
Fuji Xerox Co., Ltd.
Drawing in talking vs Making in advance
2
Shop Server Database
Cloud
Drawing by hand
• Quick and easy
• Interact with listeners actively
Making in advance
• Neat and precise
• Well-designed icons and graphs
Purpose
3
How can we draw system configuration
figures easily and quickly?
Support drawing quickly and easily
Shop Server Database
Cloud
How can we draw system configuration
figures in real time while talking?
Support drawing in talking
4
To draw quickly and easily
Multimodal input
Make use of different input modalities such as
touch, pen, and speech in an integrated manner
The strength of Pen
• Talking or thinking during drawing
• Express the position and shape of objects
The strength of Voice
Express linguistic information
Approach 1
a. Circle → Icon (“PC”)
b. Line → Text (“smartphone”)
Previous Research
5
A user seated in a chair can move an object by pointing at it and saying “move that
there”.
Put-That-There
Bolt, R.A. Put-that-there: Voice and gesture at the graphics
interface. ACM Computer Graphics 14, 3 (1980), 262–270
Previous Research
6
The problem of Put-That-There
Voice serves two purposes
• to convey messages to the listeners
• to issue commands to the system
This causes unintended system behavior
when the speaker talks to the listeners
Diagram: Speaker → Listeners (Message); Speaker → System (Command)
When talking while drawing, voice conveys not only
commands to the system but also messages
to the listeners
7
To draw in talking smoothly and naturally
Approach 2
Free mode & Command mode
• In the free mode, pen or speech input is
not treated as a command.
• In the command mode, inputs are
treated as part of a command.
Smooth mode switching
Switch between the free mode and the
command mode smoothly, without disturbing
talking
a. Circle → Icon (“PC”)
b. Line → Text (“smartphone”)
8
Approach 2
Mode switch techniques
Button
A basic technique
Tap
No need to specify the end, but need to
change hand holding posture
Pen-holding
No need to change hand holding posture
Pigtail
Draw a pigtail at the end of drawing
Pigtail gesture examples
Technique   | Description                               | Start        | End
Button      | Press the button before and after drawing | Click        | Click
Tap         | Tap the panel before drawing              | Tap          | ―
Pen-holding | Hold the pen for a while before drawing   | Hold the pen | ―
Pigtail     | Draw a pigtail at the end of the stroke   | ―            | Pigtail
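The start and end events in the table can be read as a small state machine over the free mode and the command mode. The following is a minimal sketch of that reading, not the TalkingDraw implementation: the class name, the event names ("click", "tap", "hold", "pigtail"), and the choice to discard Pigtail input after a pause without a pigtail are all assumptions made for illustration.

```python
from enum import Enum


class Mode(Enum):
    FREE = "free"
    COMMAND = "command"


# Start/end events per technique, following the table above.
# None means the technique has no explicit event on that side.
TECHNIQUES = {
    "Button": {"start": "click", "end": "click"},
    "Tap": {"start": "tap", "end": None},
    "Pen-holding": {"start": "hold", "end": None},
    "Pigtail": {"start": None, "end": "pigtail"},
}


class ModeSwitcher:
    """Hypothetical mode switcher; collects inputs that belong to commands."""

    def __init__(self, technique):
        self.spec = TECHNIQUES[technique]
        self.mode = Mode.FREE
        self.buffer = []    # inputs since the last command boundary
        self.commands = []  # completed commands

    def on_event(self, event):
        """Handle an explicit mode-switch event."""
        if self.mode is Mode.FREE and event == self.spec["start"]:
            self.mode = Mode.COMMAND
            self.buffer = []
        elif self.mode is Mode.COMMAND and event == self.spec["end"]:
            self._finish()
        elif self.spec["start"] is None and event == self.spec["end"]:
            # Pigtail: buffered input becomes a command after the fact.
            self._finish()

    def on_input(self, item):
        """Handle a pen stroke or a recognized word."""
        if self.mode is Mode.COMMAND or self.spec["start"] is None:
            self.buffer.append(item)

    def on_timeout(self):
        """Called after a pause with no pen or voice input."""
        if self.mode is Mode.COMMAND and self.spec["end"] is None:
            self._finish()
        elif self.spec["start"] is None:
            # Assumption: no pigtail was drawn, so treat the input as free.
            self.buffer = []

    def _finish(self):
        if self.buffer:
            self.commands.append(list(self.buffer))
        self.buffer = []
        self.mode = Mode.FREE
```

With Tap, for example, calling `on_event("tap")`, then `on_input("circle")` and `on_input("PC")`, then `on_timeout()` leaves one command holding both inputs, while anything entered before the tap is ignored.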
System implementation
Design
9
TalkingDraw
A prototype system using C# on a Surface Pro 3
with a Surface Pen
Speech recognition
• Recognizes users’ speech during the command mode
Pen stroke recognition
• Recognizes the shape of users’ pen strokes when the command ends
• $P Point-Cloud Recognizer (R.-D. Vatavu et al., 2012)
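For readers unfamiliar with the $P recognizer cited above, the sketch below condenses its published steps: resample the stroke to a fixed number of points, scale and translate it, then compare point clouds with a greedy matching. It is an illustrative reimplementation, not the code used in TalkingDraw; `N = 32` and `eps = 0.5` are commonly used defaults taken here as assumptions, and the function names are hypothetical. Points are `(x, y, stroke_id)` tuples.

```python
import math

N = 32  # number of resampled points (assumed default)


def path_length(pts):
    # Sum segment lengths within each stroke; pen-up gaps are skipped.
    return sum(math.dist(a[:2], b[:2])
               for a, b in zip(pts, pts[1:]) if a[2] == b[2])


def resample(pts, n=N):
    """Resample the strokes to n points spaced evenly along the path."""
    interval = path_length(pts) / (n - 1)
    pts = list(pts)
    out = [pts[0]]
    acc = 0.0
    i = 1
    while i < len(pts):
        a, b = pts[i - 1], pts[i]
        if a[2] == b[2]:
            d = math.dist(a[:2], b[:2])
            if d > 0 and acc + d >= interval:
                t = (interval - acc) / d
                q = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]), a[2])
                out.append(q)
                pts.insert(i, q)  # q starts the next segment
                acc = 0.0
            else:
                acc += d
        i += 1
    while len(out) < n:  # rounding can leave the last point out
        out.append(pts[-1])
    return out[:n]


def normalize(pts, n=N):
    """Resample, scale by the bounding box, move the centroid to the origin."""
    pts = resample(pts, n)
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    size = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    cx, cy = sum(xs) / n, sum(ys) / n
    return [((x - cx) / size, (y - cy) / size) for x, y in zip(xs, ys)]


def cloud_distance(a, b, start):
    """Greedily match each point of a to the nearest unmatched point of b."""
    n = len(a)
    matched = [False] * n
    total = 0.0
    i = start
    while True:
        best, best_d = -1, math.inf
        for j in range(n):
            if not matched[j]:
                d = math.dist(a[i], b[j])
                if d < best_d:
                    best, best_d = j, d
        matched[best] = True
        weight = 1 - ((i - start + n) % n) / n  # earlier matches weigh more
        total += weight * best_d
        i = (i + 1) % n
        if i == start:
            return total


def greedy_cloud_match(a, b, eps=0.5):
    n = len(a)
    step = max(1, math.floor(n ** (1 - eps)))
    return min(min(cloud_distance(a, b, s), cloud_distance(b, a, s))
               for s in range(0, n, step))


def recognize(stroke_points, templates):
    """Return the name of the closest template (name -> normalized points)."""
    cand = normalize(stroke_points)
    return min(templates,
               key=lambda name: greedy_cloud_match(cand, templates[name]))
```

Because the strokes are normalized before matching, a template drawn at one size and position can match input drawn at another, which is what a shape classifier for circles, rectangles, and lines needs.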
Timing diagram: talking and drawing timelines; the voice between Start and End is recognized, and End follows a 0.5 s delay.
A command ends automatically when no pen or voice input is detected for 0.5 s.
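The 0.5 s rule can be illustrated by grouping a time-ordered stream of input events into commands. This is a minimal sketch under simplifying assumptions: each event is a hypothetical `(timestamp, kind, payload)` tuple with a single timestamp, whereas a real system would track the start and end of each stroke and utterance.

```python
PAUSE_S = 0.5  # pause length stated on the slide


def split_commands(events, pause=PAUSE_S):
    """Group (timestamp, kind, payload) events into commands.

    events must be sorted by timestamp. A gap of `pause` seconds or more
    with no pen or voice input closes the current command.
    """
    commands = []
    current = []
    last_t = None
    for t, kind, payload in events:
        if current and t - last_t >= pause:
            commands.append(current)
            current = []
        current.append((t, kind, payload))
        last_t = t
    if current:
        commands.append(current)
    return commands
```

Given a circle at 0.0 s, the word "PC" at 0.3 s, and a line at 1.2 s, the first two events form one command and the line starts a second one, because the 0.9 s gap exceeds the pause.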
Elements of system configurations
Design
a. Circle → Icon (“PC”)
b. Rectangle → Box text (“cloud”)
c. Line → Line text (“smartphone”)
d. Line → Link
10
Shape of a stroke | Text of voice | Behavior of TalkingDraw
Circle            | “PC”          | Insert an icon named “PC”
Rectangle         | “cloud”       | Show “cloud” in a text box
Line              | “smartphone”  | Show “smartphone” as plain text
Line              | ―             | Make a link between two objects
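The table maps a recognized stroke shape plus recognized speech to a drawing action, which can be sketched as a small dispatch function. The names `Action` and `interpret` and the `connects_two_objects` flag are hypothetical. The slides do not say what happens when a line connects two objects and speech is also present, so treating that case as text is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Action:
    kind: str                    # "icon", "text_box", "text", "link", "none"
    label: Optional[str] = None  # recognized speech, if any


def interpret(shape, spoken=None, connects_two_objects=False):
    """Map a stroke shape and recognized speech to a drawing action."""
    if shape == "circle" and spoken:
        return Action("icon", spoken)
    if shape == "rectangle" and spoken:
        return Action("text_box", spoken)
    if shape == "line":
        if spoken:
            # Assumption: speech takes priority over linking.
            return Action("text", spoken)
        if connects_two_objects:
            return Action("link")
    return Action("none")
```

For instance, `interpret("circle", "PC")` yields an icon action labeled "PC", and `interpret("line", connects_two_objects=True)` yields a link.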
11
12
Experiment 1
Participants: 16 people (12 males and 4 females, age avg. 48.1)
Scenario: TalkingDraw used as a drawing tool while talking.
Task: Participants must speak a given sentence and insert icons while speaking.
Task sentence: Practice 1) Let's download the “documents” from the “Internet”.
Figure: the task sentence and the icons to be inserted (talking-in-drawing task)
13
Experiment 1
Result
Task completion time
• One-way ANOVA: The main effect of technique was significant (F(3,45)=6.39, p<.01).
• Tukey's method: Pigtail = Tap << Pen = Button
Interview
• Pigtail was comfortable, although participants complained about the accuracy of gesture recognition.
• Pressing the button twice was a pain.
• It was hard to hold the pen still on the screen.
Figure (a) Experiment 1: task completion time (s) by technique (Button, Tap, Pen, Pigtail); axis range 0.0 to 16.0 s.
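The degrees of freedom in the reported F(3,45) are consistent with a one-way repeated-measures design in which each of the 16 participants used all 4 techniques. The slides do not state the design explicitly, so this reading is an assumption; the short check below only reproduces the arithmetic.

```python
def repeated_measures_df(n_participants, n_conditions):
    """Degrees of freedom for a one-way repeated-measures ANOVA."""
    df_effect = n_conditions - 1
    df_error = (n_conditions - 1) * (n_participants - 1)
    return df_effect, df_error


# 16 participants, 4 techniques (Button, Tap, Pen-holding, Pigtail)
print(repeated_measures_df(16, 4))
```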
14
Experiment 2
Participants: 16 people (12 males and 4 females, age avg. 48.1)
Scenario: TalkingDraw as a drawing tool for system configuration figures.
Task: Participants must draw a given figure.
Figure labels: mobile phone, photo, upload, database
Example: The given figure and the sample figure
Making-in-advance task
15
Experiment 2
Result
Task completion time
• One-way ANOVA: The main effect of technique was significant (F(3,45)=5.22, p<.004).
• Tukey's method: Tap = Pigtail = Pen < Button
Interview
• There was no big difference between the techniques.
• Button was more comfortable than in Experiment 1.
Figure (b) Experiment 2: task completion time (s) by technique (Button, Tap, Pen, Pigtail); axis range 0.0 to 35.0 s.
Discussion
16
Pigtail performs best in Experiment 1
• It specifies the command mode after actions
• The accuracy of pen gesture recognition needs to improve
No big difference in Experiment 2
• Participants don’t need to think while drawing a figure
• Techniques that specify the command mode before actions
perform better than in Experiment 1
17
Future work
The accuracy of Pigtail recognition
• More samples
• Normalize
The accuracy of speech recognition
• Google cloud speech recognition
Context sensitive
• The voice input and the drawing are not concurrent
• Timestamp
• Semantic analysis
Diagram: voice input and pen input timelines, showing the key content, the command duration, and noise.
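One way to picture the timestamp and semantic-analysis ideas listed above is to keep only spoken words that name a known object (the key content), treat the rest as noise, and pair each stroke with the keyword nearest in time. This is a hedged sketch of that idea, not the authors' planned method: the function name, the `vocabulary` set, and the `max_gap` threshold are hypothetical.

```python
def align(strokes, words, vocabulary, max_gap=2.0):
    """Pair each stroke with the keyword closest to it in time.

    strokes: list of (start, end, shape)
    words:   list of (start, end, text) from a speech recognizer
    vocabulary: lowercase object names treated as key content
    max_gap: largest allowed silence in seconds (assumed value)
    """
    keywords = [w for w in words if w[2].lower() in vocabulary]

    def gap(s, w):
        # 0 when the intervals overlap, else the time between them
        return max(0.0, w[0] - s[1], s[0] - w[1])

    pairs = []
    for s in strokes:
        best = min(keywords, key=lambda w: gap(s, w), default=None)
        if best is not None and gap(s, best) <= max_gap:
            pairs.append((s[2], best[2]))
        else:
            pairs.append((s[2], None))
    return pairs
```

If a circle is drawn from 0.0 s to 0.8 s while the speaker says "here is a PC" and the word "PC" only arrives at 1.0 s, the filler words are ignored and the circle is still paired with "PC".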
Xerox, the Xerox logo, and the Fuji Xerox logo are registered trademarks or trademarks of Xerox Corporation in the United States.
Thanks for Icons made by Freepik from www.flaticon.com

Editor's Notes

  1. The left one is a figure I drew by hand in five seconds to explain a fake cloud service. We often draw rough figures like this in discussions, brainstorming sessions, and so on. The advantages of drawing by hand are…. The right one is a figure I made in PowerPoint. Compared to the left one, it is neat and precise. I can also use well-designed icons and graphs.
  2. So how can we draw quickly and easily? Our approach is to use multimodal input, which means that…. One strength of the pen is that we can talk or think while drawing. The pen is also good at expressing the position and shape of objects. The strength of voice is expressing linguistic information: we can talk much faster than we can draw or write. For example, I draw a circle and say something like “Here is a PC”. The system can then get the position and shape of the object from the circle, and the type of icon, a “PC”, from the speech. If PowerPoint were clever enough, it could insert a PC icon here. Similarly, when I draw a line and say “smartphone”, a text label is inserted here.
  3. There is some previous research on multimodal input. Put-That-There is a voice and gesture system for inserting or modifying objects. As this picture shows, …
  4. In the talking-and-drawing case… The problem with Put-That-There is that, when users are just talking and drawing freely, it may cause unintended behavior such as inserting wrong objects or sending unintended commands.
  5. For example, if I am saying something about a PC while drawing a circle, the system may accidentally insert a PC icon. So we introduced two modes. The first is the free mode; in the free mode…. The other is the command mode; in the command mode…. Then how do we switch between…
  6. We explored four mode-switch techniques. The basic technique is pressing a button to mark the start and the end of the command mode. However, pressing a button twice may be a pain for users. We therefore introduce the Tap technique, where users tap a panel before drawing to start the command mode. Users do not have to specify the end of the command mode; the system detects it automatically by recognizing a break in drawing and talking. However, with Tap, users must change their hand posture to draw with the pen after tapping with a finger. To lessen this problem, we introduce the Pen-holding technique, where users keep the pen still for a short time to start the command mode. With this technique, users do not have to change their hand posture. These three techniques all require users to switch modes before entering the command mode. Specifying the mode before acting can be difficult and may disturb natural talking, because users must decide which mode to choose before drawing or talking. We therefore prepared another technique, called Pigtail. With this technique, users do not specify anything at the start of an action; instead, they mark the command at the end of drawing with a special gesture, a crossed curve called a pigtail. Users need not care about the mode while talking and drawing. They specify whether an input is a command after performing the action.
  7. We built a prototype system using C# on a Surface Pro 3 with a Surface Pen. The speech recognition engine recognizes users’ speech during the command mode, and we used an open-source pen gesture recognizer to recognize the shape of users’ pen strokes once the command is completed. This figure shows how it works. The command starts when the user starts to draw, and it ends when no pen or voice input is detected for 0.5 s.
  8. In the current system, we recognize three kinds of stroke shapes. The first is a circle: when I draw a circle here and say “PC”, a PC icon appears here. The second is … The third is …. As a special case, if the line connects two objects, it becomes a link.
  9. This demonstration video shows how TalkingDraw can be used in an elementary school class. We used Pigtail in this video.
  10. In the first experiment, we evaluated the four techniques in a talking-in-drawing task. The participants had to speak a given sentence and insert icons while speaking.
  11. This graph shows the task completion time for each technique. We found that Pigtail and Tap were much faster than Pen-holding and Button. Participants also reported that Pigtail was comfortable, although they complained about the accuracy of gesture recognition. In contrast, they found that pressing the button twice was a pain and that it was hard to hold the pen on the screen.
  12. In the second experiment, we evaluated the four techniques in a making-in-advance task. The participants had to draw a given figure like this.
  13. We found that Button was still the slowest, but there was no big difference among the other techniques. Participants also reported that Button was more comfortable than in Experiment 1.
  14. We found that Pigtail performed best in Experiment 1. We think the reason is that it specifies the command mode after actions, so users don’t need to think about the mode while drawing. We also need to improve the accuracy of pen gesture recognition for Pigtail, which affected its performance in the experiment. No big difference was found in Experiment 2: because participants don’t need to think while drawing a figure, techniques that specify the command mode before actions performed better than in Experiment 1.
  15. Finally, the future work. The accuracy of Pigtail recognition and of Japanese speech recognition can still be improved. Furthermore, we found that the voice input and the drawing are not concurrent. For example, if I want to insert a PC icon, I may draw the circle before I say “PC” in a sentence. This is a problem we need to solve in the next experiment.