Recognizing 3D Trajectories as 2D Multi-stroke
Gestures
ACM ISS 2020 (Lisbon, Portugal, November 9-11, 2020)
Paolo Roselli: Università degli Studi di Roma, Italy; Université catholique de Louvain, Belgium
Jean Vanderdonckt: LouRIM, Université catholique de Louvain, Belgium
Nathan Magrofuoco: LouRIM, Université catholique de Louvain, Belgium
Arthur Sluÿters: LouRIM, Université catholique de Louvain, Belgium
Mehdi Ousmer: LouRIM, Université catholique de Louvain, Belgium
Contents
• Introduction
• Challenges
• Recognizers
• Experiment
• Conclusion & Future
Introduction
Ousmer, M., Vanderdonckt, J., Buraga, S., 2019. An ontology for reasoning on body-based gestures, in: Proceedings of the ACM SIGCHI
Symposium on Engineering Interactive Computing Systems - EICS ’19. Presented at the ACM SIGCHI Symposium, ACM Press, Valencia,
Spain, pp. 1–6. https://doi.org/10.1145/3319499.3328238
• Deep Learning
• SVM
• Neural Networks
• Machine Learning
• Nearest-Neighbor Classification (NNC)
• Pattern matching
p_i = <x_i, y_i, z_i>
Challenges
• Which state-of-the-art 2D gesture recognizers are suitable for answering our main question? How can we recognize a 3D gesture?
• Which sampling rate and number of templates give the best results?
• Are these recognizers efficient? Can we compare them with existing 3D recognizers?
Recognizers
2D Recognizers
GRANDMA, Rubine’s recognizer (1991): the first published algorithm for stroke gesture recognition. It uses statistical matching.
- It describes gestures with a vector of geometrical features.
$P: an accurate recognizer of the $-family. $P is invariant to scaling, rotation, and to stroke order, number, and direction.
- It represents the gesture as a point cloud.
$Q: an improvement of the $P recognizer for devices with low computing power, obtained by reducing its computational needs. The recognizer uses a lookup table for point matching and implements early abandoning.
$P+: the most accurate recognizer of the $-family. An optimization of $P with good results for low-vision users. It has a more flexible cloud matching than $P.
3D Recognizers
Rubine3D:
• Projection of the gesture on each plane (XY, YZ,
ZX)
• Use of the original feature vector to describe a gesture.
• Use of a heuristic to determine the candidate gesture class when the planes return different classes.
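The projection-and-vote idea behind Rubine3D can be sketched in a few lines of Python. This is only an illustration: the names `project_to_planes` and `classify_3d` are hypothetical, the 2D classifier is passed in as a function returning a `(label, score)` pair, and the tie-breaking rule (majority vote, then highest score) is an assumption, since the slide does not say which heuristic is used.

```python
from collections import Counter

def project_to_planes(points):
    """Project a 3D trajectory onto the XY, YZ and ZX planes."""
    return {
        "XY": [(x, y) for x, y, z in points],
        "YZ": [(y, z) for x, y, z in points],
        "ZX": [(z, x) for x, y, z in points],
    }

def classify_3d(points, classify_2d):
    """Run a 2D classifier on each projection and merge the three answers.

    classify_2d(points_2d) must return (label, score), higher score = more
    confident. The merge rule below is an assumed heuristic, not the one
    used in the presented work.
    """
    results = [classify_2d(p) for p in project_to_planes(points).values()]
    votes = Counter(label for label, _ in results)
    label, count = votes.most_common(1)[0]
    if count >= 2:
        # At least two planes agree on the class.
        return label
    # All three planes disagree: keep the most confident answer.
    return max(results, key=lambda r: r[1])[0]
```

Any 2D recognizer can be plugged in as `classify_2d`, which is what makes the three-plane approach attractive: the 2D algorithm itself stays unchanged.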
3D Recognizers
Rubine-Sheng:
• Extension of Rubine’s recognizer to three dimensions.
• Description of the 3D gesture with a vector of 16 features.
• Extension of the original feature vector to 3D by adding three features (Sheng, “A Study of AdaBoost in 3D Gesture Recognition”).
3D Recognizers
$P3:
• Extension of the $P recognizer to three dimensions.
• The point cloud is defined as a set of 3D points.
• Pre-processing of the gesture (resample, scale, translate to origin).
• Implementation of the Greedy-5 heuristic used in $P.
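The $P3 steps listed above can be sketched as follows. This is a minimal Python sketch, not the authors' JavaScript implementation: it follows the published $P pseudocode as far as I recall it, with every 2D operation extended to a z coordinate. The function names, the single-stroke resampling, and the default `epsilon=0.5` are assumptions.

```python
import math

def resample(points, n):
    """Resample a trajectory to n points spaced evenly along its path."""
    pts = list(points)
    total = sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))
    interval = total / (n - 1)
    out, acc, i = [pts[0]], 0.0, 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if d > 0 and acc + d >= interval:
            t = (interval - acc) / d
            q = tuple(a + t * (b - a) for a, b in zip(pts[i - 1], pts[i]))
            out.append(q)
            pts.insert(i, q)  # q becomes the start of the next segment
            acc = 0.0
        else:
            acc += d
        i += 1
    while len(out) < n:  # rounding can leave the result short
        out.append(pts[-1])
    return out[:n]

def scale(pts):
    """Scale uniformly so the largest bounding-box side has length 1."""
    mins = [min(p[k] for p in pts) for k in range(3)]
    maxs = [max(p[k] for p in pts) for k in range(3)]
    size = max(hi - lo for lo, hi in zip(mins, maxs)) or 1.0
    return [tuple((p[k] - mins[k]) / size for k in range(3)) for p in pts]

def translate_to_origin(pts):
    """Move the centroid of the cloud to (0, 0, 0)."""
    c = [sum(p[k] for p in pts) / len(pts) for k in range(3)]
    return [tuple(p[k] - c[k] for k in range(3)) for p in pts]

def normalize(points, n=32):
    return translate_to_origin(scale(resample(points, n)))

def cloud_distance(a, b, start):
    """Greedy weighted matching of cloud a onto cloud b, starting at a[start]."""
    n = len(a)
    unmatched = list(range(n))
    total, i = 0.0, start
    for k in range(n):
        j = min(unmatched, key=lambda m: math.dist(a[i], b[m]))
        unmatched.remove(j)
        total += (1 - k / n) * math.dist(a[i], b[j])  # earlier matches weigh more
        i = (i + 1) % n
    return total

def greedy_cloud_match(a, b, epsilon=0.5):
    """Lowest cloud distance over several starting points, in both directions."""
    n = len(a)
    step = max(1, int(n ** (1 - epsilon)))
    best = math.inf
    for start in range(0, n, step):
        best = min(best, cloud_distance(a, b, start), cloud_distance(b, a, start))
    return best
```

A candidate would be classified by normalizing it with the same `n` as the templates and returning the class of the template with the lowest `greedy_cloud_match` score.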
$Q3:
• Use of a 16x16x16 3D grid for the lookup technique.
• Storage of the indices (row, column, layer) of the closest point in the
grid.
• The best matching point cloud is the point cloud with the lowest
dissimilarity score.
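One possible reading of the 16×16×16 lookup grid is sketched below in Python. It assumes the template cloud has already been scaled into the unit cube [0, 1]³ and that each grid cell stores the index of the template point closest to the cell centre; both are assumptions, and the helper names are hypothetical.

```python
import math

GRID = 16  # cells per axis, as on the slide

def build_lut(cloud):
    """For every grid cell, store the index of the nearest template point."""
    lut = [[[0] * GRID for _ in range(GRID)] for _ in range(GRID)]
    for r in range(GRID):
        for c in range(GRID):
            for l in range(GRID):
                centre = ((r + 0.5) / GRID, (c + 0.5) / GRID, (l + 0.5) / GRID)
                lut[r][c][l] = min(
                    range(len(cloud)), key=lambda j: math.dist(centre, cloud[j])
                )
    return lut

def cell_of(p):
    """Map a point in [0, 1]^3 to its (row, column, layer) cell indices."""
    return tuple(min(GRID - 1, max(0, int(v * GRID))) for v in p)

def nearest_index(lut, p):
    """Approximate nearest template point for p in constant time."""
    r, c, l = cell_of(p)
    return lut[r][c][l]
```

The table is built once per template, so the cost of the 4,096 nearest-point searches is paid offline; at recognition time each lookup is a single array access.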
3D Recognizers
$F:
• Extension of the $P3 recognizer with the flexible cloud matching of $P+.
• Pre-processing of the gestures.
• The best matching template is the template with the lowest dissimilarity
score.
$P+3:
• Extension of the $P+ recognizer to the third dimension.
• Pre-processing of the gestures.
• Inclusion of the ‘z’ coordinate in the turning angle and the point distance
computation.
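Including z in the turning angle and in the point distance could look like the Python sketch below. It assumes the $P+ convention of a turning angle normalized by π, with the first and last points given an angle of 0; the function names are hypothetical and the authors' implementation may differ.

```python
import math

def turning_angles(pts):
    """Normalized turning angle (0..1) at each point of a 3D trajectory."""
    angles = [0.0] * len(pts)  # endpoints keep an angle of 0
    for i in range(1, len(pts) - 1):
        u = [b - a for a, b in zip(pts[i - 1], pts[i])]
        v = [b - a for a, b in zip(pts[i], pts[i + 1])]
        nu, nv = math.hypot(*u), math.hypot(*v)
        if nu == 0 or nv == 0:
            continue  # repeated point: leave the angle at 0
        cos = sum(a * b for a, b in zip(u, v)) / (nu * nv)
        cos = max(-1.0, min(1.0, cos))  # guard against rounding outside [-1, 1]
        angles[i] = math.acos(cos) / math.pi
    return angles

def point_distance(a, theta_a, b, theta_b):
    """Euclidean distance over (x, y, z) plus the turning-angle difference."""
    squared = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.sqrt(squared + (theta_a - theta_b) ** 2)
```

Because the angle is computed from a dot product, the same code works in 2D and 3D; only the length of the coordinate tuples changes.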
3D Recognizers
FreeHandUni:
• Derived from the FreeHand recognizer pseudocode (Craciun et al.).
• Replacement of the hand pose structure with a 3D point (x, y, z).
• Extension of $P3 with the flexible cloud matching of $P+.
• No early abandoning.
• The best matching template is the template with the lowest dissimilarity
score.
Experiment
Experiment
• Recognizers: $P3, $Q3, $P+3, $F, FH, R3D, RS
• Gesture sets: SHREC2019, 3DTCGS, 3DMadLabSD (4 Domains)
• Number of Templates (T) : {1,2,4,8,16}
• Sampling (N) : {4,8,16,32,64}
• User-Independent Scenario:
6 (datasets) × 5 (sampling) × 5 (number of templates) × 100 (repetitions) × 7 (recognizers) = 105,000 recognition trials
• User-Dependent Scenario:
4 (datasets) × 10 (users) × 5 (sampling) × 4 (number of templates) × 100 (repetitions) × 7 (recognizers) = 560,000 recognition trials
• Quantitative Measures:
• Recognition rate
• Execution time
• Evaluate effects:
• Datasets
• Number of templates T
• Sampling N
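The trial counts for the two scenarios can be checked with a short Python calculation. The factor lists are copied from the slide; the variable names are only illustrative.

```python
from math import prod

# Factors in the order given on the slide.
user_independent = prod([6, 5, 5, 100, 7])    # datasets, sampling, templates, repetitions, recognizers
user_dependent = prod([4, 10, 5, 4, 100, 7])  # datasets, users, sampling, templates, repetitions, recognizers

total_trials = user_independent + user_dependent
print(user_independent, user_dependent, total_trials)
```

The sum, 665,000, matches the total number of recognition trials quoted in the speaker notes.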
Datasets:
SHREC 2019: Cross “X”, Circle “O”, V-mark “V”, Caret “^”, Square “[]”
3DTCGS: arc3Dleft, arc3Dright, caret, check, circle, curly-braket-left, curly-braket-right, delete, left-swipe, pigtail, poly3Dxyz, poly3Dxzy, poly3Dyxz, poly3Dyzx, poly3Dzxy, poly3Dzyx, rectangle, right-swipe, spiral, square-braket-left, square-braket-right, star, triangle, V, X, zig-zag
3DMadLabSD (four domains: Domain1, Domain2, Domain3, Domain4), with the following gesture groups:
• 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
• a, b, c, d, e, f, g, h, i, j
• Spring, Mass, Wheel, Pulley, Hinge, Fast Forward, Rewind, Play, Pause, Delete
• Cuboid, Cylinder, Sphere, Rectangular Pipe, Hemisphere, Cylindrical Pipe, Pyramid, Tetrahedron, Cone, Toroid
SHREC2019 Results: execution time tables and recognition tables
SHREC2019 Results: confusion matrix for $P+3
SHREC2019 Results: recognition rate
Results: recognition rate
[Bar chart: global recognition rate (%) of the seven recognizers ($P3, $Q3, $P+3, $F, R3D, FH, RS), averaged over all datasets; vertical axis from 40 to 100. Bar values, highest to lowest: 87.48, 79.54, 79.49, 79.09, 77.50, 67.22, 57.45.]
Results: recognition rate
[Two bar charts: global recognition rate (%) of the seven recognizers ($P3, $Q3, $P+3, $F, R3D, FH, RS); vertical axis from 40 to 100; significant pairwise differences are marked ** or ***. Bar values, highest to lowest:
3DTCGS: 84.28, 78.84, 78.83, 78.45, 77.43, 74.48, 68.82
SHREC2019: 86.94, 79.36, 79.36, 79.28, 78.10, 52.51, 40.82]
Results: recognition rate
[Four bar charts: global recognition rate (%) of the seven recognizers ($P3, $Q3, $P+3, $F, R3D, FH, RS); vertical axis from 40 to 100; significant pairwise differences are marked ** or ***. Bar values, highest to lowest:
Domain 1, user dependent: 98.45, 96.22, 96.21, 95.79, 95.28, 75.60, 69.07
Domain 1, user independent: 88.08, 76.38, 76.05, 75.20, 72.57, 72.57, 58.37
Domain 2, user dependent: 98.81, 96.78, 96.69, 96.28, 95.70, 75.94, 70.76
Domain 2, user independent: 87.87, 75.78, 75.45, 74.55, 71.55, 70.33, 58.10]
Results: recognition rate
[Four bar charts: global recognition rate (%) of the seven recognizers ($P3, $Q3, $P+3, $F, R3D, FH, RS); vertical axis from 40 to 100; significant pairwise differences are marked ** or ***. Bar values, highest to lowest:
Domain 3, user dependent: 99.26, 97.70, 97.67, 97.51, 97.18, 76.93, 73.36
Domain 3, user independent: 94.45, 85.66, 85.55, 84.88, 82.92, 76.52, 67.98
Domain 4, user dependent: 93.86, 91.96, 91.76, 91.31, 90.68, 72.99, 65.04
Domain 4, user independent: 66.36, 57.94, 57.75, 57.02, 54.98, 54.40, 43.49]
Results: recognition rate
• Loss when switching from UD to UI scenario
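As a rough illustration of this loss, the Python snippet below subtracts the user-independent (UI) rate from the user-dependent (UD) rate for the highest bar in each 3DMadLabSD panel. The values are copied from the recognition-rate charts; pairing the two highest bars of a domain assumes they belong to the same recognizer, which the charts suggest but the extracted text does not confirm.

```python
# Highest bar per panel, in percent, copied from the recognition-rate charts.
top_rate = {
    "Domain 1": {"UD": 98.45, "UI": 88.08},
    "Domain 2": {"UD": 98.81, "UI": 87.87},
    "Domain 3": {"UD": 99.26, "UI": 94.45},
    "Domain 4": {"UD": 93.86, "UI": 66.36},
}

def ud_to_ui_loss(rates):
    """Drop in percentage points when moving from UD to UI, per domain."""
    return {domain: round(r["UD"] - r["UI"], 2) for domain, r in rates.items()}

loss = ud_to_ui_loss(top_rate)
for domain, drop in loss.items():
    print(f"{domain}: {drop:.2f} percentage points")
```

Under that assumption the drop is about 10 points in Domains 1 and 2, smaller in Domain 3, and much larger in Domain 4.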
Discussion
• $P+3 is the best recognizer in most conditions.
• Rubine-Sheng should be avoided.
• $-like recognizers are more accurate than R3D and RS.
• $F and FH have small variation.
• Factors with a significant effect on recognition rate (see paper for details):
- Number of templates.
- Sampling.
- Datasets.
Limitations
• Rejection of some recognizers due to their supposed
computational complexity.
• The lack of diversity in the evaluated gestures.
• The impact of other factors on the measures.
Conclusion & Future
Conclusion
• Possibility to interpret a 3D trajectory as three 2D trajectories
projected on each plane.
• Extension of four 2D stroke gesture recognizers to three
dimensions.
• Implementation of a testing framework in JavaScript.
• Testing on an extensive range of 3D trajectories.
• The $P+3 recognizer gave the best results in the test.
Future work
• Inclusion of omitted recognizers in the test to compare them against
$P+3.
• Evaluation of the gestures' depth variation impact.
• Test of the uni-stroke single-point gestures in a real-world application.
• Evaluation of complex gestures.
• Comparison with other algorithms (not template-based).
Thank you very much
for your attention


Editor's Notes

  1. Hello everyone, my name is Mehdi Ousmer and I'm a PhD student at Université catholique de Louvain in Belgium. Today I have the pleasure of presenting our recent work, "Recognizing 3D Trajectories as 2D Multi-stroke Gestures".
  2. I will start with the introduction and then move on to the other parts.
  3. In a previous gesture elicitation study, we asked participants to propose gestures for actions and commands that control objects in an IoT environment. As we can see in this video, some participants proposed 2D symbolic gestures drawn in mid-air.
  4. In 3D space, we can represent a gesture as the sequence of positions of a single point. In the literature, these gestures are called “trajectories”. There are also several recognizers for 2D or 3D gestures, based on different techniques.
  5. For that reason, this work asks the question: "Can we recognize 3D trajectories as 2D multi-stroke gestures?"
  6. Now I'd like to move on to the challenges we faced in answering that question.
  7. After a targeted literature review of 2D stroke gesture recognizers, we selected four recognizers, mainly because they are simple and well known: Rubine's algorithm, $P, $Q, and $P+.
  8. Next, we extended these recognizers to three dimensions. Rubine3D: a 3D version of Rubine's recognizer made of three 2D Rubine recognizers.
  9. Rubine-Sheng: an extension of Rubine’s recognizer to three dimensions.
  10. $Pcube: the 3D version of $P. $Qcube: the 3D version of $Q.
  11. $F: an improved version of $Pcube with the flexible cloud matching of $P+. $P+cube: the 3D version of $P+.
  12. FreeHandUni: an improved version of $Pcube with the flexible cloud matching of $P+. It differs from $F in that early abandoning is not included.
  13. After creating the new 3D recognizers, we had to evaluate their efficiency. We therefore tested them in both user-independent and user-dependent scenarios, using three trajectory datasets. We also tested the recognizers under different conditions defined by two variables: the number of templates and the number of points (sampling). I would like to draw your attention to the total number of recognition trials performed: 665,000. We collected the results through two quantitative measures: recognition rate and execution time. With these results, we can analyze the effects of three factors on recognition rate and execution time: the dataset, the number of templates, and the sampling.
  14. During this experiment, we used three datasets. SHREC2019: a dataset proposed in a recent contest. 3DTCGS: the dataset used in the evaluation of the 3Cents recognizer. 3DMadLabSD: a dataset composed of four subsets (domains).
  15. Here are some examples of the charts and tables produced from the test results on each dataset. The tables contain the recognition rate and the execution time of every recognizer under the various conditions; the colour scheme is defined by the median value in each table.
  16. We generated a normalized balanced confusion matrix for $P+cube.
  17. We plot individual results for the different conditions.
  18. In this slide, we can examine the recognition rate of each recognizer averaged over all datasets. As you can see, $P+cube has the best recognition rate, at 87.48%. This is already a high rate, but $P+cube can do better depending on the gesture dataset, where its rate can exceed 90%. The other $-like recognizers have similar rates, around 80%. Finally, R3D and RS have low recognition rates, under 70%.
  19. The SHREC2019 and 3DTCGS datasets are evaluated only in the user-independent scenario. As these diagrams show, the recognition rates are similar to the averaged rates on the previous slide.
  20. In this slide, we note the difference between the user-dependent and user-independent scenarios. We observe a loss of nearly 10% for $P+cube and around 20% for the other $-like recognizers.
  21. In these diagrams, we notice the low rates of all recognizers for Domain 4. The main cause is the type of gestures in this dataset.
  22. This figure shows the recognition rates of all recognizers on 3DMadLabSD in the user-independent and user-dependent scenarios. We see a significant difference when switching from the user-dependent to the user-independent scenario, though there are some exceptions.
  23. Based on the results of the evaluation, we can say that $P+cube is the most efficient and effective recognizer in most conditions. Rubine-Sheng should be avoided because of its low recognition rate in all conditions compared to the other recognizers. Concerning the effect of the factors on the measures: the number of templates has a significant effect on the recognition rates of all recognizers; the number of points has a significant effect on the $-like recognizers; and the datasets significantly affect the recognition rates of all recognizers in the user-independent scenario.
  24. This experiment has some limitations: we rejected some recognizers, we evaluated only one kind of gesture, and we did not include the dataset size as a factor.
  25. To sum up, we built seven 3D gesture recognizers based on four 2D recognizers selected from the state of the art. We implemented a testing framework in JavaScript and tested the recognizers. The test results showed that $P+cube provides excellent results.
  26. We have a list of improvements for future work: evaluate the effect of depth variation in gestures, and compare with recognizers based on other techniques.
  27. Thank you.