Paper: http://ceur-ws.org/Vol-2882/paper2.pdf
YouTube: https://youtu.be/-bRL868b8ys
Pierre-Etienne Martin, Jenny Benois-Pineau, Boris Mansencal, Renaud Péteri, Laurent Mascarilla, Jordan Calandre and Julien Morlier: Sports Video Classification: Classification of Strokes in Table Tennis for MediaEval 2020. Proc. of MediaEval 2020, 14-15 December 2020, Online.
Fine-grained action classification raises new challenges compared to classical action classification. Sport video analysis is a very popular research topic because of its variety of application areas, ranging from intelligent multimedia devices with user-tailored digests to the analysis of athletes' performances. The task, run as part of MediaEval since 2019, consists in classifying table tennis strokes from videos recorded in natural conditions at the University of Bordeaux. The aim is to build tools for teachers, coaches and players to analyse table tennis games. Such tools could lead to automatic profiling of players and to adapting their training so that they improve their skills more efficiently.
Presented by: Pierre-Etienne Martin
2. MediaEval 2020
Organized by:
Pierre-Etienne Martin
Jenny Benois-Pineau
Boris Mansencal
Renaud Péteri
Laurent Mascarilla
Jordan Calandre
Julien Morlier
Sports Video Classification: Classification of Strokes in Table Tennis
9. Task
➢ Train set fully annotated
➢ Test set partially annotated
➢ We provide:
○ mp4 videos
○ train: xml files with temporal annotation, stroke class and the handedness of the player
○ test: xml files with temporal annotation and “Unknown” class
➢ Classification over 20 classes
➢ Participants have to return the test xml files with “Unknown” replaced by their prediction
➢ Participants can submit up to 5 runs
[Figure: a stroke segment on the time axis t, labelled “Unknown”, is replaced by the prediction “Offensive Backhand Flip”]
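The submission step described above (replacing “Unknown” by a predicted class in the test xml files) could be sketched as follows. This is only an illustration: the slides do not show the xml schema, so the `action` element and its `begin`, `end` and `move` attributes are assumptions, and `dummy_predict` is a hypothetical stand-in for a real classifier.

```python
import xml.etree.ElementTree as ET


def fill_predictions(xml_text, predict):
    """Replace every "Unknown" stroke class by the output of `predict`.

    Assumed (hypothetical) schema: <action begin=".." end=".." move=".."/>.
    """
    root = ET.fromstring(xml_text)
    for action in root.iter("action"):
        if action.get("move") == "Unknown":
            begin = int(action.get("begin"))
            end = int(action.get("end"))
            action.set("move", predict(begin, end))
    return ET.tostring(root, encoding="unicode")


def dummy_predict(begin, end):
    # Hypothetical classifier: always answers with the class shown on the slide.
    return "Offensive Backhand Flip"


# Made-up test annotation with two stroke segments to classify.
example = (
    '<video>'
    '<action begin="100" end="250" move="Unknown" />'
    '<action begin="400" end="520" move="Unknown" />'
    '</video>'
)

filled = fill_predictions(example, dummy_predict)
print(filled)
```

The temporal boundaries are left untouched; only the class attribute changes, which matches the requirement that participants return the provided files with their predictions filled in.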
10. Evaluation
➢ Global accuracy of the classification
➢ Confusion Matrix for the best run:
○ General
○ The drive: Forehand, Backhand
○ The context: Serve, Offensive, Defensive
○ Drive ∩ Context: 6 classes
○ General without the drive (on request)
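The evaluation levels listed above can be illustrated with a small sketch. It assumes that every class name follows the pattern "Context Drive Type", as in “Offensive Backhand Flip”; the other class names and the toy predictions below are made up for illustration and are not task results.

```python
from collections import Counter


def split_label(label):
    """Return (context, drive), assuming labels look like "Context Drive Type"."""
    context, drive = label.split()[:2]
    return context, drive


def accuracy(truth, pred, key=lambda label: label):
    """Share of samples whose projected labels (via `key`) agree."""
    hits = sum(key(t) == key(p) for t, p in zip(truth, pred))
    return hits / len(truth)


def confusion(truth, pred, key=lambda label: label):
    """Confusion counts indexed by (true, predicted) projected labels."""
    return Counter((key(t), key(p)) for t, p in zip(truth, pred))


# Illustrative labels only.
truth = ["Offensive Backhand Flip", "Offensive Forehand Hit",
         "Serve Forehand Backspin", "Defensive Backhand Push"]
pred = ["Offensive Backhand Hit", "Offensive Forehand Hit",
        "Serve Backhand Backspin", "Defensive Backhand Push"]

general = accuracy(truth, pred)                                   # full class
drive = accuracy(truth, pred, key=lambda l: split_label(l)[1])    # Forehand / Backhand
context = accuracy(truth, pred, key=lambda l: split_label(l)[0])  # Serve / Offensive / Defensive
both = accuracy(truth, pred, key=split_label)                     # Drive ∩ Context
drive_confusion = confusion(truth, pred, key=lambda l: split_label(l)[1])

print(general, drive, context, both)
```

Projecting the labels before comparing them is what lets a run score much higher on the drive or the context than on the full class: a prediction with the wrong stroke type still counts as correct at the coarser levels.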
11. Participation
Last year:
➢ 13 subscriptions
○ 1 was a robot
○ 1 subscribed twice
○ 1 without institutional address
○ 3 did not answer when asked for an institutional address
○ 1 did not sign the particular conditions
○ 6 participants with access to the data
➢ 3 submissions
○ 2 did not answer after being granted access to the data
○ 1 gave up
➢ 3 working note papers
➢ 2 presentations
This year:
➢ 22 subscriptions
○ 2 subscribed twice
○ 12 signed the MediaEval data agreement
○ 11 accepted our particular conditions
○ 10 valid participants with access to the data
➢ 5 submissions
○ many did not answer after being granted access to the data
➢ 5 working note papers
➢ 5 planned presentations, thanks to the workshop being online
12. Results - Best run
Accuracies in %

Rank  Team               General  Drive  Context  Drive ∩ Context
1     HBKU_UNITN_SIMULA  31.4     92.7   78.3     75.4
2     CRISP              26.6     72.3   76.8     60.7
3     KDEME              16.7     79.7   57.9     52.8
4     MIA                13       65.3   65.8     49.2
5     iCV-UT             9.32     66.7   50.8     39

Last year's submissions

Rank  Team               General  Drive  Context  Drive ∩ Context
1     CRISP              22.9     76.8   65.8     54.8
2     MIA                14.1     61.6   48.9     29.1
3     SSN                11.3     65.8   55.1     48.3
13. Suggestions for better performances
➢ Multi-label and/or cascade ✅
○ Forehand / Backhand
○ Offensive / Defensive / Serve
○ Type of stroke
➢ Build negative samples
➢ Data augmentation
➢ Split of the provided data ⚠
[1] P.-E. Martin et al., “Fine grained sport action recognition with Twin spatio-temporal convolutional neural networks,” in Multim. Tools Appl. 79, 27-28, 2020.
[2] D. Shao et al., “FineGym: A Hierarchical Video Dataset for Fine-Grained Action Understanding,” in CVPR, 2020.
[3] P.-E. Martin et al., “3D Attention Mechanisms in Twin Spatio-Temporal Convolutional Neural Networks. Application to Action Classification in Videos of Table Tennis Games,” in ICPR, 2021.
[4] M. E. Kalfaoglu et al., “Late Temporal Modeling in 3D CNN Architectures with BERT for Action Recognition,” in CoRR, 2020.
14. Conclusion
➢ Complicated task because of
○ a limited dataset
○ high inter-class similarity
→ Fine-grained classification from few examples
➢ Improvements are possible (best general accuracy about 9 points higher than last year)
➢ The task will be run again with a larger number of samples
New participants are most welcome!
15. LaBRI - Building A30
351 cours de la Libération
33405, Talence
pierre-etienne.martin@u-bordeaux.fr
https://www.labri.fr/projet/AIV/CRISP_presentation.php
+33 5 40 00 38 80
www.linkedin.com/in/p-e-martin
@P_eMartin
Thank you