While end-users can acquire full 3D gestures with many input devices, they often capture only 3D trajectories, i.e., 3D uni-path, uni-stroke, single-point gestures performed in thin air. Such trajectories, with their (x, y, z) coordinates, can be interpreted as three 2D stroke gestures projected on three planes, i.e., XY, YZ, and ZX, thus making them admissible for established 2D stroke gesture recognizers. To investigate whether 3D trajectories can be recognized effectively and efficiently, four 2D stroke gesture recognizers, i.e., $P, $P+, $Q, and Rubine, are extended to the third dimension: $P3, $P+3, $Q3, and Rubine-Sheng, an extension of Rubine for 3D with more features. Two new variations are also introduced: $F for flexible cloud matching and FreeHandUni for uni-path recognition. Rubine3D, another extension of Rubine for 3D that projects the 3D gesture on three orthogonal planes, is also included. These seven recognizers are compared on three challenging datasets containing 3D trajectories, i.e., SHREC2019 and 3DTCGS in a user-independent scenario, and 3DMadLabSD with its four domains in both user-dependent and user-independent scenarios, with varying numbers of templates and sampling rates. Individual recognition rates and execution times per dataset, as well as aggregated ones on all datasets, show a highly significant advantage of $P+3 over its competitors. The potential effects of the dataset, the number of templates, and the sampling are also studied.
Recognizing 3D Trajectories as 2D Multi-stroke Gestures
ACM ISS 2020 (Lisbon, Portugal, November 9-11, 2020)
Paolo Roselli (Università degli Studi di Roma, Italy; Université catholique de Louvain, Belgium)
Jean Vanderdonckt (LouRIM, Université catholique de Louvain, Belgium)
Nathan Magrofuoco (LouRIM, Université catholique de Louvain, Belgium)
Arthur Sluÿters (LouRIM, Université catholique de Louvain, Belgium)
Mehdi Ousmer (LouRIM, Université catholique de Louvain, Belgium)
Ousmer, M., Vanderdonckt, J., Buraga, S. (2019). An ontology for reasoning on body-based gestures. In Proceedings of the ACM SIGCHI Symposium on Engineering Interactive Computing Systems (EICS '19), ACM Press, Valencia, Spain, pp. 1–6. https://doi.org/10.1145/3319499.3328238
• Deep Learning
• SVM
• Neural Networks
• Machine Learning
• Nearest-Neighbor Classification (NNC)
• Pattern matching
p_i = (x_i, y_i, z_i)
What are the suitable state-of-the-art 2D gesture recognizers that could answer our main question? How can we recognize a 3D gesture?
Which sampling rate and number of templates give the best results?
Are these recognizers efficient? Can we compare them with existing 3D recognizers?
2D Recognizers
GRANDMA, Rubine's recognizer (1991): the first published algorithm for stroke gesture recognition; it uses statistical matching and describes gestures with a vector of geometrical features.
$P: one of the accurate $-family recognizers; invariant to scaling, rotation, stroke order, number, and direction; represents the gesture as a point cloud.
$Q: an improvement of the $P recognizer for low-power devices that reduces its computational needs; it uses a lookup table for point matching and implements early abandoning.
$P+: the most accurate recognizer of the $-family; an optimization of $P with good results for low-vision users; its cloud matching is more flexible than $P's.
3D Recognizers
Rubine3D:
• Projects the gesture on each plane (XY, YZ, ZX).
• Uses the original feature vector to describe a gesture.
• Uses a heuristic to determine the candidate gesture class when the resulting classes differ between planes.
3D Recognizers
Rubine-Sheng:
• Extension of Rubine's recognizer to three dimensions.
• Describes the 3D gesture with a vector of 16 features.
• Extends the original feature vector to 3D by adding three features (Sheng, "A Study of AdaBoost in 3D Gesture Recognition").
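To give a flavor of what a 3D geometric feature looks like, here is one plausible example: the total path length of the trajectory computed with full 3D Euclidean distances. This is a hypothetical illustration, not necessarily one of the three features Sheng actually added.

```javascript
// Total 3D path length of a trajectory: the sum of Euclidean
// distances between consecutive {x, y, z} points.
// (Hypothetical example of a 3D feature, for illustration only.)
function pathLength3D(points) {
  let length = 0;
  for (let i = 1; i < points.length; i++) {
    const dx = points[i].x - points[i - 1].x;
    const dy = points[i].y - points[i - 1].y;
    const dz = points[i].z - points[i - 1].z;
    length += Math.sqrt(dx * dx + dy * dy + dz * dz);
  }
  return length;
}
```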
3D Recognizers
$P3:
• Extension of the $P recognizer to three dimensions.
• The point cloud is defined as a set of 3D points.
• Pre-processes the gesture (resample, scale, translate to origin).
• Implements the Greedy-5 heuristic used in $P.
$Q3:
• Uses a 16×16×16 3D grid for the lookup technique.
• Stores the indices (row, column, layer) of the closest point in the grid.
• The best matching point cloud is the one with the lowest dissimilarity score.
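The resampling part of the pre-processing step can be sketched in 3D as follows (a sketch with an assumed `{x, y, z}` point format, in the spirit of the $-family pre-processing, not the authors' implementation): the trajectory is rewritten as n points spaced equally along its path.

```javascript
// 3D Euclidean distance between two {x, y, z} points.
function dist3D(a, b) {
  return Math.hypot(b.x - a.x, b.y - a.y, b.z - a.z);
}

// Total path length: sum of distances between consecutive points.
function totalLength3D(points) {
  let len = 0;
  for (let i = 1; i < points.length; i++) len += dist3D(points[i - 1], points[i]);
  return len;
}

// Resample a trajectory into n points spaced equally along its path.
function resample(points, n) {
  const interval = totalLength3D(points) / (n - 1);
  const pts = points.map(p => ({ ...p })); // work on a copy
  const out = [{ ...pts[0] }];
  let accumulated = 0;
  for (let i = 1; i < pts.length; i++) {
    const d = dist3D(pts[i - 1], pts[i]);
    if (accumulated + d >= interval && d > 0) {
      // Interpolate a new point at the exact interval boundary.
      const t = (interval - accumulated) / d;
      const q = {
        x: pts[i - 1].x + t * (pts[i].x - pts[i - 1].x),
        y: pts[i - 1].y + t * (pts[i].y - pts[i - 1].y),
        z: pts[i - 1].z + t * (pts[i].z - pts[i - 1].z),
      };
      out.push(q);
      pts.splice(i, 0, q); // q becomes the next segment's start point
      accumulated = 0;
    } else {
      accumulated += d;
    }
  }
  // Guard against floating-point shortfall at the path's end.
  while (out.length < n) out.push({ ...pts[pts.length - 1] });
  return out;
}
```

Scaling and translating to the origin then operate on the resampled points, exactly as in 2D but with the z coordinate included.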
3D Recognizers
$F:
• Extension of the $P3 recognizer with the flexible cloud matching of $P+.
• Pre-processes the gestures.
• The best matching template is the one with the lowest dissimilarity score.
$P+3:
• Extension of the $P+ recognizer to the third dimension.
• Pre-processes the gestures.
• Includes the z coordinate in the turning angle and point distance computations.
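How the z coordinate joins the turning-angle computation can be sketched as follows (illustrative only; names and shapes are assumptions, not the paper's code): the angle at a point is taken between its two adjacent segments, using full 3D vectors.

```javascript
// Turning angle at `curr`, between the segment arriving from `prev`
// and the segment leaving towards `next`, with the z coordinate
// included in both the dot product and the norms.
function turningAngle3D(prev, curr, next) {
  const u = { x: curr.x - prev.x, y: curr.y - prev.y, z: curr.z - prev.z };
  const v = { x: next.x - curr.x, y: next.y - curr.y, z: next.z - curr.z };
  const dot = u.x * v.x + u.y * v.y + u.z * v.z;
  const nu = Math.hypot(u.x, u.y, u.z);
  const nv = Math.hypot(v.x, v.y, v.z);
  if (nu === 0 || nv === 0) return 0;
  // Clamp to [-1, 1] to guard against floating-point drift.
  const c = Math.min(1, Math.max(-1, dot / (nu * nv)));
  return Math.acos(c); // 0 on a straight segment, up to PI for a reversal
}
```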
3D Recognizers
FreeHandUni:
• Derives from the FreeHand recognizer pseudocode (Craciun et al.).
• Replaces the hand pose structure with a 3D point (x, y, z).
• A $P3 extension with the flexible cloud matching of $P+.
• No early abandoning.
• The best matching template is the one with the lowest dissimilarity score.
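The "lowest dissimilarity score" selection shared by the cloud matchers is plain nearest-neighbor classification, which can be sketched as follows (`dissimilarity` is a placeholder for whatever cloud distance the recognizer uses, not a specific function from the paper):

```javascript
// Nearest-neighbor step: score the candidate against every stored
// template and keep the one with the lowest dissimilarity.
function classify(candidate, templates, dissimilarity) {
  let bestName = null;
  let bestScore = Infinity;
  for (const t of templates) {
    const score = dissimilarity(candidate, t.points);
    if (score < bestScore) {
      bestScore = score;
      bestName = t.name;
    }
  }
  return { name: bestName, score: bestScore };
}
```

Early abandoning, where $Q-style recognizers stop scoring a template as soon as its partial score exceeds the current best, would hook into the inner loop; FreeHandUni deliberately omits it.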
Experiment
• Recognizers: $P3, $Q3, $P+3, $F, FreeHandUni (FH), Rubine3D (R3D), Rubine-Sheng (RS)
• Gesture sets: SHREC2019, 3DTCGS, 3DMadLabSD (4 domains)
• Number of templates (T): {1, 2, 4, 8, 16}
• Sampling (N): {4, 8, 16, 32, 64}
• User-independent scenario: 6 (datasets) × 5 (sampling) × 5 (templates) × 100 (repetitions) × 7 (recognizers) = 105,000 recognition trials
• User-dependent scenario: 4 (datasets) × 10 (users) × 5 (sampling) × 4 (templates) × 100 (repetitions) × 7 (recognizers) = 560,000 recognition trials
• Quantitative measures: recognition rate, execution time
• Evaluated effects: dataset, number of templates T, sampling N
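The trial counts above are just products of the experimental factors; a quick sanity check (not part of the original framework):

```javascript
// datasets × sampling × templates × repetitions × recognizers
const userIndependentTrials = 6 * 5 * 5 * 100 * 7;
// datasets × users × sampling × templates × repetitions × recognizers
const userDependentTrials = 4 * 10 * 5 * 4 * 100 * 7;
console.log(userIndependentTrials, userDependentTrials,
            userIndependentTrials + userDependentTrials);
// → 105000 560000 665000
```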
Datasets:
• SHREC 2019: Cross "X", Circle "O", V-mark "V", Caret "^", Square "[]".
• 3DTCGS: arc3Dleft, arc3Dright, caret, check, circle, curly-braket-left, curly-braket-right, delete, left-swipe, pigtail, poly3Dxyz, poly3Dxzy, poly3Dyxz, poly3Dyzx, poly3Dzxy, poly3Dzyx, rectangle, right-swipe, spiral, square-braket-left, square-braket-right, star, triangle, V, X, zig-zag.
• 3DMadLabSD, four domains: digits 0–9; letters a–j; Spring, Mass, Wheel, Pulley, Hinge, Fast Forward, Rewind, Play, Pause, Delete; Cuboid, Cylinder, Sphere, Rectangular Pipe, Hemisphere, Cylindrical Pipe, Pyramid, Tetrahedron, Cone, Toroid.
SHREC2019 Results: recognition tables and execution time tables
SHREC2019 Results: confusion matrix for $P+3
SHREC2019 Results: recognition rate
Results: recognition rate
• Loss when switching from the user-dependent (UD) to the user-independent (UI) scenario
Discussion
• $P+3 is the best recognizer in most conditions.
• Rubine-Sheng should be avoided.
• $-like recognizers are more accurate than R3D and RS.
• $F and FH show only small variations.
• Significant effects on recognition rate (see paper for details): number of templates, sampling, and dataset.
Limitations
• Some recognizers were rejected due to their presumed computational complexity.
• Lack of diversity in the evaluated gestures.
• Other factors may impact the measures.
Conclusion
• A 3D trajectory can be interpreted as three 2D trajectories projected on each plane.
• Four 2D stroke gesture recognizers were extended to three dimensions.
• A testing framework was implemented in JavaScript.
• An extensive range of 3D trajectories was tested.
• The winner of the test is the $P+3 recognizer.
Future work
• Include the omitted recognizers in the test to compare them against $P+3.
• Evaluate the impact of the gestures' depth variation.
• Test uni-stroke single-point gestures in a real-world application.
• Evaluate complex gestures.
• Compare with other, non-template-based algorithms.
Hello everyone, my name is Mehdi Ousmer, and I'm a PhD student at the Université catholique de Louvain in Belgium.
Today I have the pleasure of presenting our recent work, "Recognizing 3D Trajectories as 2D Multi-stroke Gestures".
I will start with the introduction and then move on to the other parts.
In a previous gesture elicitation study, we asked participants to elicit gestures that fit actions and commands to control objects in an IoT environment.
As we can see in this video, some participants elicited 2D symbolic gestures in thin air.
In the 3D space, we can represent gestures as sequential data of a single point. In the literature, we call these kinds of gestures “trajectories”.
Moreover, I would like to mention that there are several gesture recognizers for 2D or 3D gestures based on different techniques.
For that reason, this work raises the question: "Can we recognize 3D trajectories as 2D multi-stroke gestures?"
Now I'd like to move on to the challenges we faced in answering this question.
After conducting a targeted literature review of 2D stroke gesture recognizers, we selected four recognizers for multiple reasons, mainly because they are simple and well known.
these recognizers are:
Rubine's algorithm
$P
$Q
$P+
Next, we extended these recognizers to three dimensions.
Rubine 3D: a 3D version of Rubine's recognizer formed of three 2D Rubine recognizers.
RubineSheng: An extension of Rubine’s recognizer to three dimensions.
$Pcube: The 3D version of $P.
$Qcube: The 3D version of $Q.
$F: An improved version of the $Pcube with a flexible cloud matching of $P+.
$P+cube: The 3D version of $P+.
FreeHandUni: An improved version of $Pcube with the flexible cloud matching of $P+. However, it differs from $F because early abandoning is not included.
After creating these new 3D recognizers, we had to prove their efficiency. Thus, we evaluated them in both user-independent and user-dependent scenarios, using three trajectory datasets.
Furthermore, we tested the recognizers under different conditions defined by two variables: the number of templates and the number of points (sampling).
Now, I would like to draw your attention to the total number of recognition trials performed, which is 665,000 in total.
We collected the test results through two quantitative measures: the recognition rate and the execution time.
Thanks to the collected results, we can analyze the effects of three factors on the recognition rate and the execution time: the datasets, the number of templates, and the sampling.
During this experimentation, we used three datasets:
SHREC2019: a dataset proposed in a recent contest.
3DTCGS: a dataset used in the evaluation of the 3Cents recognizer.
3DMadLabSD: a dataset composed of four subsets (domains).
We present some examples of graphics and tables produced from the test results on each dataset.
The tables contain the recognition rate and execution time of every recognizer under the diverse conditions; the colour scheme is defined by the median value in each table.
We generated a normalized balanced confusion matrix for $P+cube.
We plotted individual results for the different conditions.
In this slide, we can examine the averaged recognition rate of each recognizer on all datasets.
As you can notice in this diagram, $P+cube has the best recognition rate, at 87.48%. Although this is already high, $P+cube can reach even higher rates depending on the gesture dataset, exceeding 90%.
The other $-like recognizers have similar rates, at around 80%.
Finally, we can observe that R3D and RS have low recognition rates, under 70%.
The SHREC2019 and 3DTCGS datasets are evaluated only in the user-independent scenario. As we can see in these diagrams, the recognition rates are similar to the averaged recognition rates from the previous slide.
In this slide, we take note of the difference between the user-dependent and user-independent scenarios.
We observe a loss of nearly 10% for $P+cube, and around 20% for the other $-like recognizers.
In these diagrams, we notice the low rates of all recognizers on Domain 4; the principal cause is the type of gestures in this subset.
This figure shows the recognition rates of all recognizers on 3DMadLabSD in the user-independent and user-dependent scenarios.
We remark a significant difference when switching from the user-dependent to the user-independent scenario, though there are some exceptions.
Based on the results of the evaluation, we can say that $P+cube is the most efficient and effective recognizer in most conditions.
Rubine-Sheng should be avoided, due to its low recognition rate in all conditions compared to the other recognizers.
Concerning the effects of the factors on the measures, we can cite the significant effect of the number of templates on the recognition rates of all recognizers, the significant effect of the number of points on the $-like recognizers, and the significant effect of the datasets on the recognition rates of all recognizers in the user-independent scenario.
This experiment has some limitations: we rejected some recognizers, we evaluated only one kind of gesture, and we did not include the dataset size as a factor.
To sum up, we built seven 3D gesture recognizers based on four 2D recognizers selected from the state of the art.
We implemented a testing framework in JavaScript and tested the recognizers. The test results showed that $P+cube provides excellent results.
We have a list of improvements for future work:
Evaluate the effect of depth variations in gestures.
Compare with other recognizers based on other techniques.