Learn2Sign: Sign Language Recognition and Translation using Human Keypoint Estimation and Transformer Model
Master thesis presentation
Peter Muschick
Were my hands visible? Was the background not distracting? Did my clothes contrast with my skin color? Was the video quality sufficient?
Motivation
• Problem
  • Communication issues of sign language speakers (in digital environments) [DFG+]
• Proposed solutions
  • Automatically generated subtitles and translations of sign languages
  • Speech2Signs: spoken-to-sign-language translation using neural networks, by Prof. Xavier Giró and Amanda Duarte (PhD candidate) at Universitat Politècnica de Catalunya, Barcelona
  • Here: research on sign language translation with a new dataset called How2Sign and OpenPose
University of Stuttgart 06.11.2020
Content
• Introduction
• Sign language research
• Current state
• Related works
• Methods
• Results
• Discussion & Summary
Characteristics of neural sign language translation research
Introduction
• Sign languages are individual and independent languages
• Sign languages are expressed on multiple parallel channels [Dam11]
• Not all information conveyed in sign languages can be captured in written text [Sut95] [Sto05] [Pri90]
• Research on sign language translation depends on the translation direction
Translation direction
Introduction
• Research on sign language translation: sign language to spoken language
Input: image/video [DPG+20] → Output: text/audio (‘Hi my name is ...’ / Audio)
Translation direction
Introduction
• Research on sign language translation: spoken language to sign language
Input: text/audio (‘Hi my name is ...’ / Audio) → Output: animated avatar or generated videos (GAN) [DPG+20]
GAN = Generative Adversarial Networks
Translation direction
Introduction
• Research on sign language translation: sign language to sign language
Input: image/video [DPG+20] → Output: animated avatar or generated videos (GAN) [DPG+20]
‘Hi my name is ...’ / Audio
GAN = Generative Adversarial Networks
Current state of research
Introduction
• Sign language to sign language: no known publications
• Spoken language to sign language: Saunders et al., 2020 [SCB20]; Stoll et al., 2018 [STL+18]
• Sign language to spoken language:
  • Sign Recognition (Zafrulla et al., 2011 [ZAH+11])
  • Continuous Sign Recognition (Koller et al., 2015 [KFN15])
  • Sign Language Translation (Camgöz et al., 2018 [CHK+18]; Camgöz et al., 2020 [CKHB20])
Sign language to spoken language tasks
Introduction
Task                             Sign Recognition   Continuous Sign Recognition   Sign Language Translation
Sign language representation     Images             Videos                        Videos
Spoken language representation   Classes            Signs                         Text
Example                          “A”                “HI ME SARAH”                 “Hi my name is Sarah”
Sign language to spoken language translation
Introduction
• Enable use of sign language with sign language translation
• Issues with current sign language datasets
  • Limited range of topics, vocabulary, and number of speakers [DPG+20]
→ Collection and creation of the How2Sign dataset [DPG+20]
Proposed solution - Sign language into spoken language translation
Introduction
Task         Sign Recognition   Continuous Sign Recognition   Sign Language Translation
Dataset      SLR [GB]           PHOENIX14T [CHK+18]           PHOENIX14T, How2Sign [DPG+20]
Extraction   OpenPose [CHS+18]
Model        Transformer [VSP+17]
Evaluation   R, M, B, W
R = Rouge [Lin04], M = Meteor [BL02], B = BLEU [PRWZ02], W = Word Error Rate [KP02]
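Word Error Rate (W) is the metric reported for the recognition tasks. A minimal sketch of the usual definition, word-level edit distance divided by the number of reference words, is shown below. The function name and the example strings are illustrative assumptions and are not taken from the thesis code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far
    # (initially none) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical gloss sequences: one of three reference glosses is missing.
print(word_error_rate("HI ME SARAH", "HI SARAH"))
```

Because insertions are counted as errors, the value can exceed 1 (100 %) when the hypothesis is much longer than the reference.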
Dataset
Methods
Task         Sign Recognition   Continuous Sign Recognition   Sign Language Translation   Sign Language Translation
Dataset      SLR                PHOENIX14T (Glosses)          PHOENIX14T (German)         How2Sign (English)
Type         Images             Videos                        Videos                      Videos
Annotation   Classes            Glosses                       German                      English
Hours        -                  10.5                          10.5                        80
Utterances   5 000              8 200                         8 200                       35 000
Vocab        24                 1 000                         3 000                       16 000
OpenPose - Human Keypoint Estimation
Methods
• Human keypoint estimation with pretrained convolutional networks [CHS+18]
Figure: Input → Output
OpenPose - Human Keypoint Estimation
Methods
• Receive 137 estimated keypoints (body, face, hands) per frame
• Keypoint: x- and y-coordinates and confidence score
• Data normalization [KKJC19]
x = {x ∈ R | 0 ≤ x ≤ max(frame x-axis)}
n = {n ∈ N | 0 ≤ n ≤ #keypoints}
f = {f ∈ N | 0 < f ≤ #frames}
u = {u ∈ N | 0 < u ≤ #utterances}
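The keypoint step above can be sketched as follows. The JSON key names are those of OpenPose's per-frame JSON output, where the body (BODY_25), face, and two hand arrays hold 25 + 70 + 21 + 21 = 137 keypoints as flat (x, y, confidence) triples. The slide does not spell out the normalization formula, so the per-frame standardization of x and y below is an assumption standing in for [KKJC19], and all function names are hypothetical.

```python
import json
from statistics import mean, pstdev

# Key names as written by OpenPose's JSON output; concatenation order is an assumption.
PARTS = ("pose_keypoints_2d", "face_keypoints_2d",
         "hand_left_keypoints_2d", "hand_right_keypoints_2d")


def frame_keypoints(frame_json: str):
    """Return [(x, y, confidence), ...] for the first detected person."""
    person = json.loads(frame_json)["people"][0]
    flat = [value for part in PARTS for value in person[part]]
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


def standardize(values):
    """Zero-mean, unit-variance scaling (assumed normalization, see lead-in)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]


def normalize_frame(keypoints):
    """Standardize x and y separately and drop the confidence score."""
    xs = standardize([x for x, _, _ in keypoints])
    ys = standardize([y for _, y, _ in keypoints])
    return list(zip(xs, ys))


# Synthetic frame with made-up coordinates, only to exercise the functions.
fake_person = {part: [v for k in range(n) for v in (float(k), 2.0 * k, 0.5)]
               for part, n in zip(PARTS, (25, 70, 21, 21))}
keypoints = frame_keypoints(json.dumps({"people": [fake_person]}))
normalized = normalize_frame(keypoints)
print(len(keypoints), len(normalized))
```

Dropping the confidence score after normalization is a design choice of this sketch; it could equally be kept as a third input feature.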
Models
Methods
• Transformer models from “Attention Is All You Need” [VSP+17], based on self-attention
• Schematic structure of the Transformer model used [Ala18]:
N = Normalization layer
MLP = Multilayer perceptron
C = Classification layer
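Self-attention, the core operation of the Transformer in [VSP+17], can be illustrated with a single-head, pure-Python sketch of scaled dot-product attention. This is not the thesis implementation: the helper names, the identity projection matrices, and the toy input are assumptions chosen to keep the example small.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a sequence x."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    scores = matmul(q, [list(col) for col in zip(*k)])  # q times k transposed
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


# Toy input: 3 frames with 2 features each; identity projections for readability.
identity = [[1.0, 0.0], [0.0, 1.0]]
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
output, weights = self_attention(frames, identity, identity, identity)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to 1 and describes how strongly one frame attends to every frame in the sequence. A multi-head layer runs several such projections in parallel, which is what the #Heads hyperparameter in the results tables controls.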
SLR - Sign Recognition
Results
Work         Our study     Gupta et al. [GB]
Dataset      SLR           SLR
Extraction   OpenPose      CNN
Model        Transformer   MLP
Evaluation   W             W
R = Rouge [Lin04], M = Meteor [BL02], B = BLEU [PRWZ02], W = Word Error Rate [KP02]
SLR - Sign Recognition
Results
Column        Meaning
Experiment    Number
Hidden size   MLP size of the transformer layer
#Layer        Number of transformer layers
Dropout       Dropout in the transformer layer
LR            Learning rate
#Heads        Number of attention heads
WER (%)       Result
SLR - Sign Recognition
Results
Experiment     Hidden size   #Layer   Dropout   LR      #Heads   WER (%)
1              32            1        0.2       10^-4   1        3.5
2              64            1        0.2       10^-4   1        3.0
3              128           1        0.2       10^-4   1        3.0
4              256           1        0.2       10^-4   1        3.0
5              512           1        0.2       10^-4   1        3.0
Gupta et al.   -             -        -         -       -        3.1
PHOENIX14T - Continuous Sign Recognition
Results
Work         Our study     Camgöz et al., 2020 [CKHB20]
Dataset      PHOENIX14T    PHOENIX14T
Extraction   OpenPose      CNN
Model        Transformer   Transformer
Evaluation   W             W
R = Rouge [Lin04], M = Meteor [BL02], B = BLEU [PRWZ02], W = Word Error Rate [KP02]
PHOENIX14T - Continuous Sign Recognition
Results
Experiment            Hidden size   #Layer   Dropout   LR      #Heads   WER (%) Val   WER (%) Test
1                     128           1        0.2       10^-4   1        93.3          94.1
2                     512           2        0.2       10^-4   4        85.5          84.4
3                     2048          4        0.2       10^-4   8        79.3          81.2
Camgöz et al., 2020   -             -        -         -       -        24.88         24.59
PHOENIX14T - Sign Language Translation
Results
Work         Our study              Ko et al., 2019 [KKJC19]   Camgöz et al., 2020 [CKHB20]
Dataset      PHOENIX14T, How2Sign   KETI (na)                  PHOENIX14T
Extraction   OpenPose               OpenPose                   CNN
Model        Transformer            Seq2Seq                    Transformer
Evaluation   R, M, B, W             R, M, B, C                 B, W
na = not available
R = Rouge [Lin04], M = Meteor [BL02], B = BLEU [PRWZ02], W = Word Error Rate [KP02]
PHOENIX14T - Sign Language Translation
Results
Exp                                #Hid   #Lay   Drop   LR      #H   B1      B2      B3      B4      M      R
1                                  1024   1      0.2    10^-4   1    4.0     0.0     0.0     0.0     7.0    7.5
2                                  1024   1      0.2    10^-4   8    9.0     1.6     0.0     0.0     11.0   14.0
3                                  1024   4      0.4    10^-5   32   10.0    3.0     3.0     0.0     12.0   19.0
4                                  2048   6      0.4    10^-5   16   10.0    4.0     4.0     2.0     11.0   16.0
Camgöz et al., 2020 (Validation)   -      -      -      -       -    47.25   34.40   27.05   22.38   -      -
Camgöz et al., 2020 (Test)         -      -      -      -       -    46.61   33.73   26.19   21.32   -      -
B1-B4 = BLEU-1 to BLEU-4, M = Meteor, R = Rouge
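The B1 to B4 columns build on clipped (modified) n-gram precision [PRWZ02]. The sketch below computes only that building block for a single reference sentence. Full BLEU additionally combines the precisions geometrically and applies a brevity penalty, both omitted here. The function names and example sentences are illustrative assumptions.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(reference: str, hypothesis: str, n: int) -> float:
    """Clipped n-gram precision of a hypothesis against one reference."""
    ref_counts = Counter(ngrams(reference.split(), n))
    hyp_counts = Counter(ngrams(hypothesis.split(), n))
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0  # hypothesis shorter than n words
    # Each hypothesis n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    return clipped / total


reference = "hi my name is sarah"
hypothesis = "hi my name sarah"  # hypothetical model output
for n in (1, 2, 3, 4):
    print(n, round(modified_precision(reference, hypothesis, n), 3))
```

In this toy case a single missing word leaves the unigram precision at 1.0 but drives the 4-gram precision to 0.0, which mirrors how the higher-order BLEU columns in the table fall to zero while B1 stays above it.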
How2Sign - Sign Language Translation
Results
Exp   #Hid   #Lay   Drop   LR      #H   B1    B2    B3    B4    M     R
1     1024   4      0.4    10^-5   32   1.0   0.0   0.0   0.0   2.0   3.0
2     2048   6      0.4    10^-5   16   1.0   0.0   0.0   0.0   1.0   2.0
oom   2048   4      0.4    10^-5   64   -     -     -     -     -     -
oom   2048   8      0.4    10^-5   32   -     -     -     -     -     -
oom = out of memory error
B1-B4 = BLEU-1 to BLEU-4, M = Meteor, R = Rouge
Translation results
Discussion
Task                          Dataset      Translation/recognition quality
Sign Recognition              SLR          High
Continuous Sign Recognition   PHOENIX14T   Low
Sign Language Translation     PHOENIX14T   Low
Sign Language Translation     How2Sign     Not possible
→ The bigger and more complex datasets could not be translated
Limitations
Discussion
• Keypoint estimation accuracy of OpenPose might be too low
OpenPose - How2Sign: face & body confidence scores
Discussion
• Confidence scores of a video of ~2800 frames displaying a sign language speaker
OpenPose - How2Sign: left & right hand confidence scores
Discussion
• Confidence scores of a video of ~2800 frames displaying a sign language speaker
Limitations
Discussion
• Keypoint estimation accuracy of OpenPose might be too low
• Models with larger hyperparameter settings exceeded the server memory
• Complexity of the models used might be too low
Summary
• OpenPose and the transformer model are suited for sign recognition
• The proposed methods did not show satisfying results for continuous sign recognition and sign language translation
Outlook
• Run OpenPose with different datasets and examine accuracy
• Datasets with more repetitions of single signs
• Focus on hand recognition
• Continue with transformer models
• Use pre-defined transformer models from libraries
• Use OpenPose for facial recognition
Sources I
[Jac96] R. Jacobs. “Just how hard is it to learn ASL? The case for ASL as a truly foreign language.” In: Multicultural aspects of sociolinguistics in deaf communities 2 (1996), pp. 183–226.
[Dam11] S. Damian. “Spoken vs. Sign Languages-What’s the Difference?” In: Cognition, Brain, Behavior 15.2 (2011), p. 251.
[DFG+] P. Dreuw, J. Forster, Y. Gweth, D. Stein, H. Ney, G. Martinez, J. V. Llahi, O. Crasborn, E. Ormel, W. Du, T. Hoyoux, J. Piater, J. M. Moya, M. Wheatley. “SignSpeak – Understanding, Recognition, and Translation of Sign Languages.” p. 8.
[ACH+13] M. Adams, C. Castaneda, H. W. Hackman, M. L. Peters, X. Zuniga, W. J. Blumenfeld. Readings for diversity and social justice. Third edition. New York: Routledge, Taylor & Francis Group, 2013.
[Sut95] V. Sutton. Lessons in sign writing. SignWriting, 1995.
[Sto05] W. Stokoe. “Sign language structure: an outline of the visual communication systems of the American deaf. 1960.” In: Journal of deaf studies and deaf education 10.1 (2005), pp. 3–37.
[Pri90] S. Prillwitz. “Hamburger Notations-System - Entwicklung einer Gebärdenschrift mit Computeranwendung.” In: Gebärde, Laut und graphisches Zeichen: Schrifterwerb im Problemfeld von Mehrsprachigkeit. Ed. by G. List, G. List. Wiesbaden: VS Verlag für Sozialwissenschaften, 1990, pp. 60–82.
[DPG+20] A. Duarte, S. Palaskar, D. Ghadiyaram, K. DeHaan, F. Metze, J. Torres, X. Giro-i-Nieto. “How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language.”
[SCB20] B. Saunders, N. C. Camgoz, R. Bowden. “Progressive Transformers for End-to-End Sign Language Production.” (Apr. 2020).
Sources II
[CHS+18] Z. Cao, G. Hidalgo, T. Simon, S.-E. Wei, Y. Sheikh. OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields. 2018.
[GB] R. Gupta, V. Behl. imRishabhGupta/Indian-Sign-Language-Recognition. URL: https://github.com/imRishabhGupta/Indian-Sign-Language-Recognition
[CHK+18] N. C. Camgoz, S. Hadfield, O. Koller, H. Ney, R. Bowden. “Neural Sign Language Translation.” In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2018.
[CKHB20] N. C. Camgoz, O. Koller, S. Hadfield, R. Bowden. “Sign Language Transformers: Joint End-to-end Sign Language Recognition and Translation.” (Mar. 2020).
[VSP+17] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin. “Attention Is All You Need.” (Dec. 2017).
[KP02] D. Klakow, J. Peters. “Testing the correlation of word error rate and perplexity.” In: Speech Communication 38.1 (2002), pp. 19–28. ISSN: 0167-6393.
[PRWZ02] K. Papineni, S. Roukos, T. Ward, W. J. Zhu. “BLEU: a Method for Automatic Evaluation of Machine Translation.” (Oct. 2002).
[BL02] S. Banerjee, A. Lavie. “METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments.” (2002).
[Lin04] C.-Y. Lin. “Rouge: A package for automatic evaluation of summaries.” In: Text summarization branches out. 2004.
Sources III
[STL+18] S. Stoll, N. C. Camgoz, S. Hadfield, R. Bowden. Text2Sign: Towards Sign Language Production Using Neural Machine Translation and Generative Adversarial Networks. 2018.
[KKJC19] S.-K. Ko, C. J. Kim, H. Jung, C. Cho. “Neural Sign Language Translation based on Human Keypoint Estimation.” (June 2019).
[Ala18] J. Alammar. The Illustrated Transformer. June 2018. URL: http://jalammar.github.io/illustrated-transformer/
[KFN15] O. Koller, J. Forster, H. Ney. “Continuous sign language recognition: Towards large vocabulary statistical recognition systems handling multiple signers.” In: Computer Vision and Image Understanding 141 (Dec. 2015).
[ZAH+11] Z. Zafrulla, H. Brashear, T. Starner, H. Hamilton, P. Presti. American Sign Language Recognition with the Kinect. 2011.
Thank you!
Peter Muschick
University of Stuttgart
e-mail: swt89259@stud.uni-stuttgart.de
www: github.com/asdf11x/stt
Confidence score vs actual accuracy
OpenPose
Positional encoding
Sign language
• How hard is it actually to learn sign language (for native English speakers)? [Jac96]
• American Sign Language is as hard to learn as Japanese or Arabic
Sentence order: Time + Theme + Comment + Speaker
• Time = grammatical tense
• Theme = object of the sentence
• Comment = additional information about the subject
• Speaker = subject of the sentence
“I went to the university yesterday” -> YESTERDAY UNIVERSITY GO I
Attention heads
Self attention
OpenPose
Results
• Average confidence scores of OpenPose
             SLR    PHOENIX14T   How2Sign
body         -      0.31         0.40
face         -      0.77         0.84
left hand    0.55   0.31         0.47
right hand   -      0.29         0.43
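Averages of this kind can be computed by grouping the per-keypoint confidence scores by body part. The sketch below assumes each frame is a list of 137 confidence scores ordered as body (25), face (70), left hand (21), right hand (21). That ordering, the function name, and the toy values are assumptions and do not reproduce the numbers in the table.

```python
from statistics import mean

# Keypoint counts per part in the assumed 137-keypoint layout.
PART_SIZES = {"body": 25, "face": 70, "left hand": 21, "right hand": 21}


def mean_confidence_per_part(frames):
    """Average confidence per part over all frames of a video."""
    collected = {part: [] for part in PART_SIZES}
    for scores in frames:
        start = 0
        for part, size in PART_SIZES.items():
            collected[part].extend(scores[start:start + size])
            start += size
    return {part: mean(values) for part, values in collected.items()}


# Two synthetic frames with made-up confidence values.
frame = [0.4] * 25 + [0.8] * 70 + [0.5] * 21 + [0.3] * 21
averages = mean_confidence_per_part([frame, frame])
for part, value in averages.items():
    print(part, round(value, 2))
```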
OpenPose - SLR
Results
• Confidence scores of 242 images displaying the left hand showing the letter A from different angles
OpenPose - PHOENIX14T
Results
• Confidence scores of 120 frames displaying a sign language speaker
Discussion & Summary
• OpenPose: How2Sign vs PHOENIX14T

Universitat Politècnica de Catalunya
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Universitat Politècnica de Catalunya
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Universitat Politècnica de Catalunya
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
Universitat Politècnica de Catalunya
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Universitat Politècnica de Catalunya
 
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Universitat Politècnica de Catalunya
 

Learn2Sign: Sign language recognition and translation using human keypoint estimation and transformer model

  • 1. Learn2Sign: Sign Language Recognition and Translation using Human Keypoint Estimation and Transformer Model Master thesis presentation Peter Muschick Photo by Jo Hilton on Unsplash
  • 2. Were my hands visible? Was the background not distracting? Did my clothes contrast my skin color? Was the video quality sufficient?
  • 3. Motivation (University of Stuttgart, 06.11.2020). Problem: communication issues of sign language speakers (in digital environments) [DFG+]. Proposed solutions: creation of automatically generated subtitles and translations of sign languages; Speech2Signs: Spoken to Sign Language Translation using NN, a project of Prof. Xavier Giró and Amanda Duarte (PhD cand.) at Universitat Politècnica de Catalunya, Barcelona. Here: research on sign language translation with a new dataset called How2Sign and OpenPose.
  • 4. Content: Introduction; Sign language research; Current state; Related works; Methods; Results; Discussion & Summary.
  • 5. Introduction - Characteristics of neural sign language translation research: Sign languages are individual and independent languages. Sign languages are spoken on multiple parallel channels [Dam11]. Not all information in sign languages can be captured in text [Sut95] [Sto05] [Pri90]. Research on sign language translation depends on the translation direction.
  • 6. Introduction - Translation direction: Sign language to spoken language. Input: image/video [DPG+20]. Output: text/audio (‘Hi my name is ...’ / Audio).
  • 7. Introduction - Translation direction: Spoken language to sign language. Input: text/audio (‘Hi my name is ...’ / Audio). Output: animated avatar or generated videos (GAN) [DPG+20]. GAN = Generative Adversarial Networks.
  • 8. Introduction - Translation direction: Sign language to sign language. Input: image/video [DPG+20]; ‘Hi my name is ...’ / Audio. Output: animated avatar or generated videos (GAN) [DPG+20]. GAN = Generative Adversarial Networks.
  • 9. Introduction - Current state of research: Sign language to sign language: no known publications. Spoken language to sign language: Saunders et al., 2020 [SCB20]; Stoll et al., 2018 [STL+18]. Sign language to spoken language: Sign Recognition (Zafrulla et al., 2011 [ZAH+11]); Continuous Sign Recognition (Koller et al., 2015 [KFN15]); Sign Language Translation (Camgöz et al., 2018 [CHK+18]; Camgöz et al., 2020 [CKHB20]).
  • 10. Introduction - Sign language to spoken language tasks:
      Task | Sign Recognition | Continuous Sign Recognition | Sign Language Translation
      Sign language representation | Images | Videos | Videos
      Spoken language representation | Classes | Signs | Text
      Example | “A” | “HI ME SARAH” | “Hi my name is Sarah”
  • 11. Introduction - Sign language to spoken language translation: Enable the use of sign language through sign language translation. Issues with current sign language datasets: limited range of topics, vocabulary, and number of speakers [DPG+20] → collection and creation of the How2Sign dataset [DPG+20].
  • 12. Introduction - Proposed solution: sign language into spoken language translation:
      Task | Sign Recognition | Continuous Sign Recognition | Sign Language Translation
      Dataset | SLR [GB] | PHOENIX14T [CHK+18] | PHOENIX14T, How2Sign [DPG+20]
      Extraction | OpenPose [CHS+18] (all tasks)
      Model | Transformer [VSP+17] (all tasks)
      Evaluation | R, M, B, W
      R = Rouge [Lin04], M = Meteor [BL02], B = BLEU [PRWZ02], W = Word Error Rate [KP02]
  • 13. Methods - Dataset:
      Task | Sign Recognition | Continuous Sign Recognition | Sign Language Translation | Sign Language Translation
      Dataset | SLR | PHOENIX14T (Glosses) | PHOENIX14T (German) | How2Sign (English)
      Type | Images | Videos | Videos | Videos
      Annotation | Classes | Glosses | German | English
      Hours | - | 10.5 | 10.5 | 80
      Utterances | 5 000 | 8 200 | 8 200 | 35 000
      Vocab | 24 | 1 000 | 3 000 | 16 000
  • 16. Methods - OpenPose - Human Keypoint Estimation: Human keypoint estimation with pretrained convolutional networks [CHS+18]. (Figure: Input, Output.)
  • 17. Methods - OpenPose - Human Keypoint Estimation: Receive 137 estimated keypoints (body, face, hands) per frame. Keypoint: x- and y-coordinates and confidence score. Data normalization [KKJC19], with x = {x ∈ R | 0 ≤ x ≤ max(frame x-axis)}, n = {n ∈ N | 0 ≤ n ≤ #keypoints}, f = {f ∈ N | 0 < f ≤ #frames}, u = {u ∈ N | 0 < u ≤ #utterances}.
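The keypoint layout described on this slide can be sketched in a few lines. This is a minimal sketch, assuming OpenPose's JSON output with the BODY_25 model (25 body keypoints), 70 face keypoints and 2 × 21 hand keypoints, which adds up to the 137 keypoints mentioned above. The per-frame standardization of x and y is only one plausible reading of the normalization in [KKJC19], not necessarily the procedure used in the thesis, and the function names `frame_to_array` and `normalize_frame` are hypothetical.

```python
import numpy as np

# Keys used in OpenPose's per-person JSON output; each holds a flat list
# [x0, y0, c0, x1, y1, c1, ...] of coordinates and confidence scores.
PARTS = (
    "pose_keypoints_2d",
    "face_keypoints_2d",
    "hand_left_keypoints_2d",
    "hand_right_keypoints_2d",
)


def frame_to_array(person: dict) -> np.ndarray:
    """Stack all parts of one person into an (n_keypoints, 3) array of x, y, confidence."""
    flat = [value for part in PARTS for value in person[part]]
    return np.asarray(flat, dtype=np.float64).reshape(-1, 3)


def normalize_frame(kp: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Standardize x and y separately within one frame (assumed scheme).

    The confidence column is left unchanged.
    """
    out = kp.copy()
    for axis in (0, 1):  # 0 = x, 1 = y
        col = kp[:, axis]
        out[:, axis] = (col - col.mean()) / (col.std() + eps)
    return out
```

Applied to every frame of an utterance, this would give one array of shape (#frames, 137, 3) per utterance, which can then be flattened per frame before being fed to a sequence model.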
  • 18. Methods - Models: Transformer models from “Attention Is All You Need” [VSP+17], based on self-attention. Schematic structure of the used Transformer model [Ala18] (figure legend: N = Normalization layer, MLP = Multi-layer perceptron, C = Classification layer).
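The self-attention operation that the Transformer in [VSP+17] is built on can be illustrated as follows. This is a minimal single-head sketch in NumPy, assuming a sequence of per-frame feature vectors as input; the projection matrices `w_q`, `w_k`, `w_v` are placeholders that would be learned in a real model, and multi-head attention, masking, and the normalization and MLP layers from the schematic are left out.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention for one sequence.

    x:             (seq_len, d_model) input features, e.g. one row per frame
    w_q, w_k, w_v: (d_model, d_k) projection matrices (placeholders here)
    Returns the attended features (seq_len, d_k) and the attention
    weights (seq_len, seq_len).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])  # similarity of every pair of positions
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ v, weights
```

Each output row is a weighted average of the value vectors of all positions, so every frame can draw on information from every other frame in the sequence.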
  • 20. Results - SLR - Sign Recognition:
      Work | Our study | Gupta et al. [GB]
      Dataset | SLR | SLR
      Extraction | OpenPose | CNN
      Model | Transformer | MLP
      Evaluation | W | W
      W = Word Error Rate [KP02]
  • 21. Results - SLR - Sign Recognition, column legend: Experiment = number; Hidden size = MLP size of the Transformer layer; #Layer = number of Transformer layers; Dropout = dropout in the Transformer layer; LR = learning rate; #Heads = number of attention heads; WER (%) = result.
  • 22. Results - SLR - Sign Recognition:
      Experiment | Hidden size | #Layer | Dropout | LR | #Heads | WER (%)
      1 | 32 | 1 | 0.2 | 10^-4 | 1 | 3.5
      2 | 64 | 1 | 0.2 | 10^-4 | 1 | 3.0
      3 | 128 | 1 | 0.2 | 10^-4 | 1 | 3.0
      4 | 256 | 1 | 0.2 | 10^-4 | 1 | 3.0
      5 | 512 | 1 | 0.2 | 10^-4 | 1 | 3.0
      Gupta et al. | - | - | - | - | - | 3.1
  • 23. Results - PHOENIX14T - Continuous Sign Recognition:
      Work | Our study | Camgöz et al., 2020 [CKHB20]
      Dataset | PHOENIX14T | PHOENIX14T
      Extraction | OpenPose | CNN
      Model | Transformer | Transformer
      Evaluation | W | W
      W = Word Error Rate [KP02]
  • 24. Results - PHOENIX14T - Continuous Sign Recognition:
      Experiment | Hidden size | #Layer | Dropout | LR | #Heads | WER (%) Val | WER (%) Test
      1 | 128 | 1 | 0.2 | 10^-4 | 1 | 93.3 | 94.1
      2 | 512 | 2 | 0.2 | 10^-4 | 4 | 85.5 | 84.4
      3 | 2048 | 4 | 0.2 | 10^-4 | 8 | 79.3 | 81.2
      Camgöz et al., 2020 | - | - | - | - | - | 24.88 | 24.59
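The word error rate reported in these tables is conventionally the word-level edit distance between the predicted and the reference sequence, divided by the reference length. The following is a minimal sketch of that standard definition, assuming whitespace-separated glosses or words; the slides do not state which tooling was used for the reported numbers, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

For the gloss sequence “HI ME SARAH”, a prediction that drops or replaces one gloss has a WER of 1/3. Because insertions count as errors, the WER can exceed 100 % when the hypothesis is much longer than the reference.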
  • 25. Results - PHOENIX14T - Sign Language Translation:
      Work | Our study | Ko et al., 2019 [KKJC19] | Camgöz et al., 2020 [CKHB20]
      Dataset | PHOENIX14T, How2Sign | KETI (na) | PHOENIX14T
      Extraction | OpenPose | OpenPose | CNN
      Model | Transformer | Seq2Seq | Transformer
      Evaluation | R, M, B, W | R, M, B, C | B, W
      na = not available; R = Rouge [Lin04], M = Meteor [BL02], B = BLEU [PRWZ02], W = Word Error Rate [KP02]
  • 26. Results - PHOENIX14T - Sign Language Translation:
      Exp | #Hid | #Lay | Drop | LR | #H | B1 | B2 | B3 | B4 | M | R
      1 | 1024 | 1 | 0.2 | 10^-4 | 1 | 4.0 | 0.0 | 0.0 | 0.0 | 7.0 | 7.5
      2 | 1024 | 1 | 0.2 | 10^-4 | 8 | 9.0 | 1.6 | 0.0 | 0.0 | 11.0 | 14.0
      3 | 1024 | 4 | 0.4 | 10^-5 | 32 | 10.0 | 3.0 | 3.0 | 0.0 | 12.0 | 19.0
      4 | 2048 | 6 | 0.4 | 10^-5 | 16 | 10.0 | 4.0 | 4.0 | 2.0 | 11.0 | 16.0
      Camgöz et al., 2020 (Validation) | | | | | | 47.25 | 34.40 | 27.05 | 22.38 | - | -
      Camgöz et al., 2020 (Test) | | | | | | 46.61 | 33.73 | 26.19 | 21.32 | - | -
      B1 to B4 = BLEU-1 to BLEU-4, M = Meteor, R = Rouge
  • 27. Results - How2Sign - Sign Language Translation:
      Exp | #Hid | #Lay | Drop | LR | #H | B1 | B2 | B3 | B4 | M | R
      1 | 1024 | 4 | 0.4 | 10^-5 | 32 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | 3.0
      2 | 2048 | 6 | 0.4 | 10^-5 | 16 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 2.0
      oom | 2048 | 4 | 0.4 | 10^-5 | 64 | - | - | - | - | - | -
      oom | 2048 | 8 | 0.4 | 10^-5 | 32 | - | - | - | - | - | -
      oom = out of memory error; B1 to B4 = BLEU-1 to BLEU-4, M = Meteor, R = Rouge
  • 28. Discussion - Translation results:
      Task | Dataset | Translation/recognition quality
      Sign Recognition | SLR | High
      Continuous Sign Recognition | PHOENIX14T | Low
      Sign Language Translation | PHOENIX14T | Low
      Sign Language Translation | How2Sign | Not possible
      → Bigger and more complex datasets could not be translated.
  • 29. Discussion - Limitations: Keypoint estimation accuracy of OpenPose might be too low.
  • 30. Discussion - OpenPose - How2Sign: face & body confidence scores: Confidence scores of a video of ~2800 frames displaying a sign language speaker.
  • 31. Discussion - OpenPose - How2Sign: left & right hand confidence scores: Confidence scores of a video of ~2800 frames displaying a sign language speaker.
  • 32. Discussion - Limitations: Keypoint estimation accuracy of OpenPose might be too low. Models with bigger hyperparameters exceed the server memory. Complexity of the used models might be too low.
  • 33. Summary: OpenPose and the transformer model are suited for sign recognition. The proposed methods did not show satisfying results for continuous sign recognition and sign language translation.
  • 34. Outlook: Run OpenPose with different datasets and examine accuracy; datasets with more repetitions of single signs; focus on hand recognition; continue with transformer models; use pre-defined transformer models from libraries; use OpenPose for facial recognition.
  • 35. Sources I:
      [Jac96] R. Jacobs. “Just how hard is it to learn ASL? The case for ASL as a truly foreign language.” In: Multicultural aspects of sociolinguistics in deaf communities 2 (1996), pp. 183–226.
      [Dam11] S. Damian. “Spoken vs. Sign Languages-What’s the Difference?” In: Cognition, Brain, Behavior 15.2 (2011), p. 251.
      [DFG+] P. Dreuw, J. Forster, Y. Gweth, D. Stein, H. Ney, G. Martinez, J. V. Llahi, O. Crasborn, E. Ormel, W. Du, T. Hoyoux, J. Piater, J. M. Moya, M. Wheatley. “SignSpeak – Understanding, Recognition, and Translation of Sign Languages.” p. 8.
      [ACH+13] M. Adams, C. Castaneda, H. W. Hackman, M. L. Peters, X. Zuniga, W. J. Blumenfeld. Readings for diversity and social justice. Third edition. New York: Routledge Taylor & Francis Group, 2013.
      [Sut95] V. Sutton. Lessons in sign writing. SignWriting, 1995.
      [Sto05] W. Stokoe. “Sign language structure: an outline of the visual communication systems of the American deaf. 1960.” In: Journal of deaf studies and deaf education 10.1 (2005), pp. 3–37.
      [Pri90] S. Prillwitz. “Hamburger Notations-System - Entwicklung einer Gebärdenschrift mit Computeranwendung.” In: Gebärde, Laut und graphisches Zeichen: Schrifterwerb im Problemfeld von Mehrsprachigkeit. Ed. by G. List, G. List. Wiesbaden: VS Verlag für Sozialwissenschaften, 1990, pp. 60–82.
      [DPG+20] A. Duarte, S. Palaskar, D. Ghadiyaram, K. DeHaan, F. Metze, J. Torres, X. Giro-i-Nieto. “How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language.”
      [SCB20] B. Saunders, N. C. Camgoz, R. Bowden. “Progressive Transformers for End-to-End Sign Language Production.” (Apr. 2020).
  • 36. Sources II:
      [CHS+18] Z. Cao, G. Hidalgo, T. Simon, S.-E. Wei, Y. Sheikh. OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields. 2018.
      [GB] R. Gupta, V. Behl. imRishabhGupta/Indian-Sign-Language-Recognition. URL: https://github.com/imRishabhGupta/Indian-Sign-Language-Recognition
      [CHK+18] N. C. Camgoz, S. Hadfield, O. Koller, H. Ney, R. Bowden. “Neural Sign Language Translation.” In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2018.
      [CKHB20] N. C. Camgoz, O. Koller, S. Hadfield, R. Bowden. “Sign Language Transformers: Joint End-to-end Sign Language Recognition and Translation.” (Mar. 2020).
      [VSP+17] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin. “Attention Is All You Need.” (Dec. 2017).
      [KP02] D. Klakow, J. Peters. “Testing the correlation of word error rate and perplexity.” In: Speech Communication 38.1 (2002), pp. 19–28. ISSN: 0167-6393.
      [PRWZ02] K. Papineni, S. Roukos, T. Ward, W. J. Zhu. “BLEU: a Method for Automatic Evaluation of Machine Translation.” (Oct. 2002).
      [BL02] S. Banerjee, A. Lavie. “METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments.” (2002).
      [Lin04] C.-Y. Lin. “Rouge: A package for automatic evaluation of summaries.” In: Text summarization branches out. 2004.
  • 37. Sources III:
      [STL+18] S. Stoll, N. Camgoz, S. Hadfield, R. Bowden. Text2Sign: Towards Sign Language Production Using Neural Machine Translation and Generative Adversarial Networks. 2018.
      [KKJC19] S.-K. Ko, C. J. Kim, H. Jung, C. Cho. “Neural Sign Language Translation based on Human Keypoint Estimation.” (June 2019).
      [Ala18] J. Alammar. The Illustrated Transformer. June 2018. URL: http://jalammar.github.io/illustrated-transformer/
      [KFN15] O. Koller, J. Forster, H. Ney. “Continuous sign language recognition: Towards large vocabulary statistical recognition systems handling multiple signers.” In: Computer Vision and Image Understanding 141 (Dec. 2015).
      [ZAH+11] Z. Zafrulla, H. Brashear, T. Starner, H. Hamilton, P. Presti. American Sign Language Recognition with the Kinect. 2011.
  • 38. Thank you! Peter Muschick, University of Stuttgart. E-mail: swt89259@stud.uni-stuttgart.de. www: github.com/asdf11x/stt. Photo by Louisa Schaad on Unsplash.
  • 39. OpenPose: Confidence score vs actual accuracy.
  • 41. Sign language: How hard is it actually to learn sign language? [Jac96] (for native English speakers): American Sign Language is as hard to learn as Japanese or Arabic. Time + Theme + Comment + Speaker, where Time = grammatical tense, Theme = object of the sentence, Comment = additional information about the subject, Speaker = subject of the sentence. “I went to the university yesterday” -> YESTERDAY UNIVERSITY GO I.
  • 45. Results - OpenPose: Average confidence scores of OpenPose:
      Keypoints | SLR* | PHOENIX14T | How2Sign
      body | - | 0.31 | 0.40
      face | - | 0.77 | 0.84
      left hand | 0.55 | 0.31 | 0.47
      right hand | - | 0.29 | 0.43
  • 46. Results - OpenPose - SLR: Confidence scores of 242 images displaying a left hand showing the letter A from different angles.
  • 47. Results - OpenPose - PHOENIX14T: Confidence scores of 120 frames displaying a sign language speaker.
  • 49. Discussion & Summary - OpenPose: How2Sign vs PHOENIX14T.