IEEE International Conference on Interdisciplinary Approaches in Technology and
Management for Social Innovation (Hybrid)
December 21 – 23, 2022, Gwalior, India
Enabling the Change! Social Innovation
for sustainable societies
Paper Title: Human Pose Estimation: Benchmarking Deep Learning-based Methods
Authors: Mayank Lovanshi and Vivek Tiwari, IIIT-Naya Raipur
Paper ID: 8399
Track No.: 3
Presented by: Mayank Lovanshi, IIIT-Naya Raipur
Content
• Introduction
• Related Work
• Methodology: Human Pose Estimation Models
• Dataset Used
• Experiment & Results
• Conclusion
• References
INTRODUCTION
• Human pose estimation (HPE) identifies and classifies the joints of the human body [1,2].
• It captures a set of coordinates for each joint (arm, head, torso, etc.), known as key points [2,3].
• A connection between two key points is known as a pair [1,2,3].
• The angles between body joints can then be extracted from these key points [2,3].
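The angle extraction mentioned above can be sketched as follows. This is a minimal illustration, assuming 2D key points given as (x, y) tuples; the function name `joint_angle` and the sample coordinates are hypothetical and do not come from the presentation.

```python
import math


def joint_angle(a, b, c):
    """Return the angle in degrees at key point b, formed by the pairs b-a and b-c."""
    # Vectors pointing from the middle joint to its two neighbours.
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        raise ValueError("key points must not coincide")
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


# Hypothetical shoulder, elbow, and wrist coordinates in pixels.
shoulder, elbow, wrist = (0.0, 0.0), (1.0, 0.0), (1.0, 1.0)
elbow_angle = joint_angle(shoulder, elbow, wrist)
```

Here the elbow is the vertex, so the two pairs (shoulder-elbow and elbow-wrist) define the angle.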
Fig.1: Sample of Pose Estimation
Source:https://www.quickerhire.com/blogs/human-pose-estimation-for-multiple-subjects-with-machine-learning
Cont…
Fig.2: Human body modeling: a) Skeleton based, b) Contour based,
c) Volume-based
Source:https://shop62004.afacetoreframe.org/content?c=body%20pose%20estimation&id=1
Three types of approaches to HPE:
• The skeleton-based model includes a set of key points (joints) such as ankles, knees, shoulders, and elbows [1].
• The contour-based model consists of the contour and rough width of the body, torso, and limbs [1].
• The volume-based model consists of multiple popular 3D human body models and poses represented by human geometric meshes and shapes [1].
Cont…
2D Pose Estimation: Estimates the 2D pose, i.e. the spatial location of the human body's key points, from visual input such as images and video [2,3].
3D Pose Estimation: Locates human joints in 3D space [2,3].
Fig.3: 2D vs 3D Pose Estimation sample
Related Work
S.N. | Paper Title | Problem Statement | Method | Limitation
1. | DeepPose: Human Pose Estimation via Deep Neural Networks, A. Toshev et al. (2014) [8] | Extract 2D/3D key point information using a deep learning-based HPE algorithm | PosePipe: an open-source deep learning model used to extract 2D/3D key points | Hard to apply to video-based datasets
2. | Human Pose Estimation via Convolutional Part Heatmap Regression, A. Bulat et al. (2016) [10] | Extract human body joints using a deep learning-based approach | A convolutional neural network (CNN) based approach for identifying the human pose | The CNN-based approach does not work for part-based posture identification
3. | Combining local appearance and holistic view: Dual-Source Deep Neural Networks for human pose estimation, Xiaochuan Fan et al. (2018) [12] | Extract local part pose information to enhance human posture evaluation | Dual-Source Deep Convolutional Neural Network (DS-CNN) for posture evaluation | Pose estimation results are not very accurate
4. | Deep learning based 2D human pose estimation: A survey, Q. Dang et al. (2019) [2] | Identify 2D/3D human poses using a kinematic model | Two estimation approaches: model-free and model-based | Does not work on RGB-D images
5. | End-to-end recovery of human shape and pose, A. Kanazawa et al. (2018) [11] | Identify human posture from joint angles and key point information | Human Mesh Recovery (HMR): an end-to-end system for generating a complete 3D/2D mesh | A 3D mesh cannot be extracted from depth RGB images or video
6. | Hand Pose Estimation from RGB Images Based on Deep Learning: A Survey, Y. Liu et al. (2021) [1] | Identify 2D human poses | DeepPose: a cascaded deep learning-based regressor | Does not extract 3D human poses
HUMAN POSE ESTIMATION MODELS
1. OpenPose [4]
2. ViTPose [13]
3. HRNet [6]
4. AlphaPose [5]
5. DenseNet [14]
6. EfficientPose [15,16]
7. DensePose [17]
8. Hourglass [18]
Fig. 4: Architecture of our proposed work
1. OpenPose:
• OpenPose uses the VGG-19 convolutional neural network as its backbone.
• Its pipeline comprises four parts: input image, part confidence maps, bipartite matching, and output image.
2. ViTPose:
• Uses non-hierarchical vision transformers as backbones.
• The decoder consists of two deconvolution layers and one prediction layer.
Fig. 5: Image extraction through the OpenPose method
Fig. 6: Framework of the ViTPose method
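The bipartite matching step listed for OpenPose can be illustrated with a small sketch. This is a simple greedy stand-in under stated assumptions, not OpenPose's actual implementation: `scores[i][j]` is a hypothetical affinity between candidate `i` of one joint type (say, elbows) and candidate `j` of another (say, wrists), and the function name `greedy_bipartite_match` is illustrative only.

```python
def greedy_bipartite_match(scores, min_score=0.0):
    """Pair candidates of two joint types, best affinity first.

    Returns a list of (i, j, score) tuples in which every candidate index
    is used at most once on each side.
    """
    # Collect every candidate connection whose score clears the threshold.
    candidates = [
        (score, i, j)
        for i, row in enumerate(scores)
        for j, score in enumerate(row)
        if score > min_score
    ]
    # Visit the strongest connections first.
    candidates.sort(reverse=True)

    used_a, used_b, pairs = set(), set(), []
    for score, i, j in candidates:
        if i not in used_a and j not in used_b:
            pairs.append((i, j, score))
            used_a.add(i)
            used_b.add(j)
    return pairs


# Two elbow candidates and two wrist candidates with made-up affinities.
example_scores = [
    [0.9, 0.2],
    [0.3, 0.8],
]
matches = greedy_bipartite_match(example_scores)
```

With two people in the image, each elbow ends up connected to the wrist it scores highest with, which is how the separate key points are assembled into per-person pairs.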
3. HRNet:
• Uses a convolutional neural network as its backbone.
• Also used for semantic segmentation, object recognition, and image categorisation.
4. AlphaPose:
• Uses a Symmetric Spatial Transformer Network (SSTN).
• Uses a Single-Person Pose Estimator (SPPE).
Fig. 7: Framework of the HRNet method
Fig. 8: Image extraction through the AlphaPose method
5. DenseNet:
• The backbone model is ResNet (CNN-based).
• Addresses the vanishing gradient problem by using an LSTM as one layer.
6. EfficientPose:
• The backbone model is a convolutional neural network.
• It comprises two main parts: an efficient backbone and an efficient head.
Fig. 9: Layered structure of DenseNet method
Fig. 10: Architecture of the EfficientPose Methods
7. DensePose:
• Adopts the fully convolutional network design used in Dense Regression (DenseReg).
• Combines the DenseReg method with Mask R-CNN to improve pose estimation.
8. Hourglass:
• Based on tightly linked fully convolutional networks.
• Conv-deconv and encoder-decoder methods are related to the hourglass module.
Fig. 11: Architecture of the DensePose Methods
Fig. 12: Architecture of the Hourglass Methods
DATASET USED
A. COCO [20]:
• Images: 66,808
• Annotations: 273,469
• Key points: 17
B. MPII [12]:
• Images: 40,000
• Annotations: 223,589
• Classes: 410
Fig.13: COCO Dataset: sample images [20]
Fig.14: MPII Dataset: sample images [12]
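For readers unfamiliar with the 17 COCO key points, the sketch below lists them in the standard COCO order and decodes a flat COCO-style annotation of (x, y, visibility) triplets. The helper names `decode_keypoints` and `pair_indices`, the `SAMPLE_PAIRS` subset, and the sample values are illustrative assumptions rather than material from the presentation.

```python
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

# A hypothetical subset of pairs (limb connections) for one arm and one leg.
SAMPLE_PAIRS = [
    ("left_shoulder", "left_elbow"),
    ("left_elbow", "left_wrist"),
    ("left_hip", "left_knee"),
    ("left_knee", "left_ankle"),
]


def decode_keypoints(flat):
    """Turn a flat [x1, y1, v1, x2, y2, v2, ...] list into a name -> (x, y, v) dict.

    In COCO annotations v is 0 (not labelled), 1 (labelled but not visible)
    or 2 (labelled and visible).
    """
    if len(flat) != 3 * len(COCO_KEYPOINTS):
        raise ValueError("expected 17 (x, y, v) triplets")
    triplets = zip(flat[0::3], flat[1::3], flat[2::3])
    return dict(zip(COCO_KEYPOINTS, triplets))


def pair_indices(pairs, names=COCO_KEYPOINTS):
    """Map pairs of key point names to pairs of indices."""
    return [(names.index(a), names.index(b)) for a, b in pairs]


# Made-up annotation: only the nose is labelled and visible.
sample = [120, 80, 2] + [0, 0, 0] * 16
decoded = decode_keypoints(sample)
```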
RESULTS
The results are evaluated using the following metrics:
1. Average Precision (AP): The weighted mean of the precisions at each threshold, where the weight is the increase in recall from the prior threshold [7,8].
$$AP@\alpha = \int_{0}^{1} p(r)\,dr$$
2. Mean Average Precision (mAP): The mean of the average precision values over different IoU (intersection over union) thresholds [9].
$$mAP@\alpha = \frac{1}{n}\sum_{i=1}^{n} AP_i$$
3. Percentage of Correct Key Points (PCK): The fraction of predicted key points that fall within a given distance of the actual joint [10,11].
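The AP and mAP definitions can be sketched in a few lines. This is a minimal illustration that assumes the precision and recall values are already available as lists ordered by increasing recall; the function names and the sample numbers are hypothetical.

```python
def average_precision(precisions, recalls):
    """Discrete AP: each precision is weighted by the increase in recall.

    This is the finite-sum counterpart of integrating p(r) over r in [0, 1].
    """
    if len(precisions) != len(recalls):
        raise ValueError("precisions and recalls must have the same length")
    ap, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def mean_average_precision(ap_values):
    """mAP: the plain mean of n AP values, (1/n) * sum(AP_i)."""
    if not ap_values:
        raise ValueError("at least one AP value is required")
    return sum(ap_values) / len(ap_values)


# Made-up operating points: recall rises from 0.5 to 1.0 as precision drops.
ap_example = average_precision([1.0, 0.5], [0.5, 1.0])
map_example = mean_average_precision([0.75, 0.25])
```

In the first example the two recall increments are both 0.5, so the AP is 0.5 × 1.0 + 0.5 × 0.5.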
TABLE I: BENCHMARKING WITH SOTA POSE ESTIMATION NETWORKS ON COCO DATASET BASED ON AP & MAP
Algorithm | AP | AP0.5 | AP0.75 | APM | APL | mAP
OpenPose [4] | 60.5 | 83.4 | 66.4 | 55.1 | 68.1 | 65.9
AlphaPose [5] | 73.3 | 89.2 | 79.1 | 69.0 | 78.6 | 77.84
HRNet [6] | 77.4 | 92.6 | 84 | 73.6 | 83.7 | 82.3
ViTPose-B [13] | 81.1 | 95.0 | 88.2 | 87.8 | 86.0 | 85.6
DenseNet [14] | 77.1 | 93.3 | 83.6 | 72.2 | 83.6 | 82.6
EfficientPose [15,16] | 70.5 | 91.1 | 79.0 | 67.3 | 76.2 | 76.1
DensePose [17] | 55.8 | 83.7 | 56.3 | 42.2 | 53.8 | 61.1
Hourglass [18] | 65.6 | 88.8 | 69.3 | - | - | 74.5
4*RSN-50 [19] | 78.6 | 94.6 | 86.6 | 83.3 | 75.5 | 83.8
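The comparison in Table I can be checked with a short script. The values below are transcribed from the table, with `None` standing in for the two missing Hourglass entries; the names `TABLE_I` and `best_per_metric` are illustrative assumptions.

```python
METRICS = ["AP", "AP0.5", "AP0.75", "APM", "APL", "mAP"]

# Rows transcribed from Table I; None marks a value the table leaves blank.
TABLE_I = {
    "OpenPose": [60.5, 83.4, 66.4, 55.1, 68.1, 65.9],
    "AlphaPose": [73.3, 89.2, 79.1, 69.0, 78.6, 77.84],
    "HRNet": [77.4, 92.6, 84.0, 73.6, 83.7, 82.3],
    "ViTPose-B": [81.1, 95.0, 88.2, 87.8, 86.0, 85.6],
    "DenseNet": [77.1, 93.3, 83.6, 72.2, 83.6, 82.6],
    "EfficientPose": [70.5, 91.1, 79.0, 67.3, 76.2, 76.1],
    "DensePose": [55.8, 83.7, 56.3, 42.2, 53.8, 61.1],
    "Hourglass": [65.6, 88.8, 69.3, None, None, 74.5],
    "4*RSN-50": [78.6, 94.6, 86.6, 83.3, 75.5, 83.8],
}


def best_per_metric(table, metrics=METRICS):
    """Return a dict mapping each metric to the algorithm with the highest score."""
    best = {}
    for column, metric in enumerate(metrics):
        scored = [
            (row[column], name)
            for name, row in table.items()
            if row[column] is not None  # skip blank cells
        ]
        best[metric] = max(scored)[1]
    return best


winners = best_per_metric(TABLE_I)
```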
TABLE II: BENCHMARKING WITH SOTA POSE ESTIMATION NETWORKS ON MPII DATASET BASED ON PCK OF BODY PART & AVERAGE PCK
Algorithm | Ankle | Knee | Hip | Wrist | Elbow | Shoulder | Head | Avg. PCK
OpenPose [4] | 79.87 | 87.17 | 93.0 | 79.15 | 89.03 | 95.97 | 96.11 | 88.73
AlphaPose [5] | 72.4 | 79.9 | 80.3 | 76.4 | 84.0 | 90.5 | 91.3 | 82.1
HRNet [6] | 82.5 | 86.1 | 89.1 | 85.9 | 90.5 | 85.9 | 96.9 | 90.0
ViTPose-B [13] | 88.3 | 91.9 | 92.4 | 90.1 | 93.7 | 97.4 | 97.6 | 93.4
EfficientPose [15,16] | 83.9 | 87.5 | 90.3 | 87.5 | 91.7 | 96.0 | 98.2 | 91.2
Hourglass [18] | 89.3 | 92.2 | 93.2 | 91.2 | 94.4 | 97.5 | 98.8 | 94.1
4*RSN-50 [19] | 86.8 | 90.6 | 92.0 | 89.9 | 93.9 | 97.3 | 98.5 | 93.0
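A per-joint PCK score of the kind reported in Table II could be computed along the following lines. This is a minimal sketch, assuming 2D key points and a fixed pixel threshold; the slides do not state how the threshold is normalised (MPII results are often reported with a threshold tied to head size), so the `threshold` argument and the sample coordinates are assumptions.

```python
import math


def pck(predicted, ground_truth, threshold):
    """Fraction of predicted key points within `threshold` of the ground truth."""
    if len(predicted) != len(ground_truth) or not ground_truth:
        raise ValueError("need two non-empty lists of equal length")
    correct = sum(
        1
        for pred, true in zip(predicted, ground_truth)
        if math.dist(pred, true) <= threshold  # Euclidean distance
    )
    return correct / len(ground_truth)


# Made-up wrist predictions for three images; the second one is 10 px off.
predicted_wrists = [(0.0, 0.0), (10.0, 10.0), (5.0, 5.0)]
true_wrists = [(0.0, 1.0), (10.0, 20.0), (5.0, 5.0)]
wrist_pck = pck(predicted_wrists, true_wrists, threshold=2.0)
```

Multiplying the result by 100 gives a percentage on the same scale as the table.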
CONCLUSION
• This study helps identify methods that yield accurate and spatially precise key point heat maps, measured by average precision and the percentage of correct key points (PCK).
• Experimental analysis was carried out on two datasets: COCO and MPII.
• ViTPose-B outperformed the other methods on every AP variant on the COCO dataset because it uses a transformer backbone rather than a convolutional one.
• OpenPose underperformed on the COCO dataset.
• The Hourglass model achieved the best average PCK on the MPII dataset and the best PCK for every body part.
• AlphaPose underperformed on the MPII dataset.
WORK DONE (based on discussed work)
Task 1: Human Skeleton Pose and Spatio-Temporal Feature-based Activity
Recognition using ST-GCN
• Human activity recognition using pose estimation algorithm.
• Normalise human activity sequence with the Gaussian filter method.
• Investigate ST-GCN model for extraction of Spatial & Temporal features.
Task 2: 3D Skeleton-based Human Motion Prediction using Dynamic Multi-scale
Spatiotemporal Graph Recurrent Neural Networks
• Human motion prediction using graph recurrent neural network.
• Investigate a novel DMST-GRNN model on the multi-scale variation for the extraction of spatial &
temporal features.
• Validate human motion based on time series-based 3D sequential datasets.
REFERENCES
[1] Y. Liu, J. Jiang, and J. Sun, “Hand Pose Estimation from RGB Images Based on Deep Learning: A Survey.” 2021 IEEE 7th International Conference on
Virtual Reality (ICVR), 2021.
[2] Q. Dang, J. Yin, B. Wang, and W. Zheng, “Deep learning based 2D human pose estimation: A survey.” Tsinghua Science and Technology, vol. 24, no. 6,
pp. 663-676, 2019.
[3] M. Choudhary, V. Tiwari, and S. Jain, “Person reidentification using deep siamese network with multi-layer similarity constraints.” Multimedia Tools and Applications, pp. 1-17, 2021.
[4] D. Osokin, “Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose.” Proceedings of the 8th International Conference on Pattern
Recognition Applications and Methods, 2019.
[5] H.-S. Fang, S. Xie, Y.-W. Tai, and C. Lu, “RMPE: Regional Multiperson Pose Estimation.” 2017 IEEE International Conference on Computer Vision (ICCV),
2017.
[6] K. Sun, B. Xiao, D. Liu, and J. Wang, “Deep High-Resolution Representation Learning for Human Pose Estimation.” 2019 IEEE/CVF Conference on
Computer Vision and Pattern Recognition (CVPR), 2019.
[7] W. Li, R. Du, and S. Chen, “Skeleton-Based Spatio-Temporal UNetwork for 3D Human Pose Estimation in Video.” Sensors, vol. 22, no. 7, p. 2573, 2022.
[8] A. Toshev and C. Szegedy, “DeepPose: Human Pose Estimation via Deep Neural Networks.” 2014 IEEE Conference on Computer Vision and Pattern
Recognition, 2014.
[9] Z. Cao, T. Simon, S.-E. Wei, and Y. Sheikh, “Real-time Multi-person 2D Pose Estimation Using Part Affinity Fields.” 2017 IEEE Conference on Computer
Vision and Pattern Recognition (CVPR), 2017.
[10] A. Bulat and G. Tzimiropoulos, “Human Pose Estimation via Convolutional Part Heatmap Regression.” Computer Vision – ECCV 2016, pp. 717-732,
2016.
[11] A. Kanazawa, M. J. Black, D. W. Jacobs, and J. Malik, “End-to-end recovery of human shape and pose.” 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018.
[12] M. Andriluka, L. Pishchulin, P. Gehler, and B. Schiele, “2D Human Pose Estimation: New Benchmark and State of the Art Analysis.” 2014 IEEE
Conference on Computer Vision and Pattern Recognition, 2014.
[13] Y. Xu, J. Zhang, Q. Zhang, and D. Tao, “ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation.” Computer Vision – ECCV 2022,
2022.
[14] S. W. Chu, Y. Song, J. J. Zouo, and W. Cai, “Human Pose Estimation Using Deep Convolutional Densenet Hourglass Network with Intermediate
Points Voting.” 2019 IEEE International Conference on Image Processing (ICIP), 2019.
[15] J. Li, C. Wang, H. Zhu, Y. Mao, H.-S. Fang, and C. Lu, “CrowdPose: Efficient Crowded Scenes Pose Estimation and a New Benchmark.” 2019
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
[16] D. Groos, H. Ramampiaro, and E. A. Ihlen, “EfficientPose: Scalable single-person pose estimation.” Applied Intelligence, vol. 51, no. 4, pp. 2518-
2533, 2020.
[17] R. A. Guler, N. Neverova, and I. Kokkinos, “DensePose: Dense Human Pose Estimation in the Wild.” 2018 IEEE/CVF Conference on Computer
Vision and Pattern Recognition, 2018.
[18] T. Xu and W. Takano, “Graph Stacked Hourglass Networks for 3D Human Pose Estimation.” 2021 IEEE/CVF Conference on Computer Vision and
Pattern Recognition (CVPR), 2021.
[19] Y. Cai, “Learning Delicate Local Representations for Multi-person Pose Estimation.” Computer Vision – ECCV 2020, 2020.
[20] T.-Y. Lin, “Microsoft COCO: Common Objects in Context.” Computer Vision – ECCV 2014, pp. 740-755, 2014.
20
IEEE International Conference on Interdisciplinary Approaches in Technology and
Management for Social Innovation (Hybrid)
December 21 – 23, 2022, Gwalior, India
Enabling the Change! Social Innovation
for sustainable societies

More Related Content

Similar to IATMSI 2022 Presentation Format.pptx

An Extensible Web Mining Framework for Real Knowledge
An Extensible Web Mining Framework for Real KnowledgeAn Extensible Web Mining Framework for Real Knowledge
An Extensible Web Mining Framework for Real Knowledge
IJEACS
 

Similar to IATMSI 2022 Presentation Format.pptx (20)

A simplified and novel technique to retrieve color images from hand-drawn sk...
A simplified and novel technique to retrieve color images from  hand-drawn sk...A simplified and novel technique to retrieve color images from  hand-drawn sk...
A simplified and novel technique to retrieve color images from hand-drawn sk...
 
BTP Report.pdf
BTP Report.pdfBTP Report.pdf
BTP Report.pdf
 
A Literature Survey on Image Linguistic Visual Question Answering
A Literature Survey on Image Linguistic Visual Question AnsweringA Literature Survey on Image Linguistic Visual Question Answering
A Literature Survey on Image Linguistic Visual Question Answering
 
Web Image Retrieval Using Visual Dictionary
Web Image Retrieval Using Visual DictionaryWeb Image Retrieval Using Visual Dictionary
Web Image Retrieval Using Visual Dictionary
 
Web Image Retrieval Using Visual Dictionary
Web Image Retrieval Using Visual DictionaryWeb Image Retrieval Using Visual Dictionary
Web Image Retrieval Using Visual Dictionary
 
Fourth issue of Newsletter
Fourth issue of NewsletterFourth issue of Newsletter
Fourth issue of Newsletter
 
Automating Software Development Using Artificial Intelligence (AI)
Automating Software Development Using Artificial Intelligence (AI)Automating Software Development Using Artificial Intelligence (AI)
Automating Software Development Using Artificial Intelligence (AI)
 
FACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEW
FACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEWFACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEW
FACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEW
 
An Intelligent Approach for Effective Retrieval of Content from Large Data Se...
An Intelligent Approach for Effective Retrieval of Content from Large Data Se...An Intelligent Approach for Effective Retrieval of Content from Large Data Se...
An Intelligent Approach for Effective Retrieval of Content from Large Data Se...
 
A Survey on Human Pose Estimation
A Survey on Human Pose EstimationA Survey on Human Pose Estimation
A Survey on Human Pose Estimation
 
Top Cited Articles in Computer Graphics and Animation
Top Cited Articles in Computer Graphics and AnimationTop Cited Articles in Computer Graphics and Animation
Top Cited Articles in Computer Graphics and Animation
 
Global Descriptor Attributes Based Content Based Image Retrieval of Query Images
Global Descriptor Attributes Based Content Based Image Retrieval of Query ImagesGlobal Descriptor Attributes Based Content Based Image Retrieval of Query Images
Global Descriptor Attributes Based Content Based Image Retrieval of Query Images
 
Face Recognition Smart Attendance System: (InClass System)
Face Recognition Smart Attendance System: (InClass System)Face Recognition Smart Attendance System: (InClass System)
Face Recognition Smart Attendance System: (InClass System)
 
A virtual analysis on various techniques using ann with
A virtual analysis on various techniques using ann withA virtual analysis on various techniques using ann with
A virtual analysis on various techniques using ann with
 
Image Processing Compression and Reconstruction by Using New Approach Artific...
Image Processing Compression and Reconstruction by Using New Approach Artific...Image Processing Compression and Reconstruction by Using New Approach Artific...
Image Processing Compression and Reconstruction by Using New Approach Artific...
 
An Extensible Web Mining Framework for Real Knowledge
An Extensible Web Mining Framework for Real KnowledgeAn Extensible Web Mining Framework for Real Knowledge
An Extensible Web Mining Framework for Real Knowledge
 
Big Data visualization
Big Data visualizationBig Data visualization
Big Data visualization
 
User Identity Linkage: Data Collection, DataSet Biases, Method, Control and A...
User Identity Linkage: Data Collection, DataSet Biases, Method, Control and A...User Identity Linkage: Data Collection, DataSet Biases, Method, Control and A...
User Identity Linkage: Data Collection, DataSet Biases, Method, Control and A...
 
Welcome (D1L1 2017 UPC Deep Learning for Computer Vision)
Welcome (D1L1 2017 UPC Deep Learning for Computer Vision)Welcome (D1L1 2017 UPC Deep Learning for Computer Vision)
Welcome (D1L1 2017 UPC Deep Learning for Computer Vision)
 
Efficient content-based image retrieval using integrated dual deep convoluti...
Efficient content-based image retrieval using integrated dual  deep convoluti...Efficient content-based image retrieval using integrated dual  deep convoluti...
Efficient content-based image retrieval using integrated dual deep convoluti...
 

Recently uploaded

VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
dharasingh5698
 
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
dollysharma2066
 

Recently uploaded (20)

VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
 
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar  ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
 
Intze Overhead Water Tank Design by Working Stress - IS Method.pdf
Intze Overhead Water Tank  Design by Working Stress - IS Method.pdfIntze Overhead Water Tank  Design by Working Stress - IS Method.pdf
Intze Overhead Water Tank Design by Working Stress - IS Method.pdf
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdf
 
Vivazz, Mieres Social Housing Design Spain
Vivazz, Mieres Social Housing Design SpainVivazz, Mieres Social Housing Design Spain
Vivazz, Mieres Social Housing Design Spain
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
 
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
 
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELLPVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
 

IATMSI 2022 Presentation Format.pptx

  • 1. IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation (Hybrid) December 21 – 23, 2022, Gwalior, India Enabling the Change! Social Innovation for sustainable societies Paper Title: Human Pose Estimation: Benchmarking Deep Learning-based Methods All authors Name and Affiliation Mayank Lovanshi and Vivek Tiwari, IIIT-Naya Raipur Paper ID: 8399 Track No. : 3 Presented by Mayank Lovanshi, IIIT-Naya Raipur
  • 2. IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation (Hybrid) December 21 – 23, 2022, Gwalior, India Enabling the Change! Social Innovation for sustainable societies Content • Introduction • Related Work • Methodology: Human Pose Estimation Models • Dataset Used • Experiment & Results • Conclusion • References 2
  • 3. IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation (Hybrid) December 21 – 23, 2022, Gwalior, India Enabling the Change! Social Innovation for sustainable societies INTRODUCTION • Human Pose Estimation: Identifying and classifying the joints in the human body [1,2]. • Way to capture a set of coordinates for each joint (arm, head, torso, etc.,) Known as key points [2,3]. • The connection between these points is known as a Pair [1,2,3]. • Extraction of the angle information between the body joints [2,3]. 3 Fig.1: Sample of Pose Estimation Source:https://www.quickerhire.com/blogs/human-pose-estimation-for-multiple-subjects-with-machine-learning
  • 4. IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation (Hybrid) December 21 – 23, 2022, Gwalior, India Enabling the Change! Social Innovation for sustainable societies Cont… 4 Fig.2: Human body modeling: a) Skeleton based, b) Contour based, c) Volume-based Source:https://shop62004.afacetoreframe.org/content?c=body%20pose%20estimation&id=1 Three types of approaches to HPE:  The skeleton-based model includes a set of key points (joints) like ankles, knees, shoulders, and elbows [1].  The contour-based model consists of the contour and rough width of the body, torso, and limbs [1].  The volume-based model consists of multiple popular 3D human body models and poses represented by human geometric meshes and shapes [1].
  • 5. IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation (Hybrid) December 21 – 23, 2022, Gwalior, India Enabling the Change! Social Innovation for sustainable societies Cont… 2D Pose Estimation: 2D human pose estimation uses visuals like images and video to evaluate the 2D human pose or spatial location of the human body’s key points [2,3]. 3D Pose Estimation: The 3D Human Pose Estimation method is used to locate human joints in 3D space [2,3]. 5 Fig.3: 2D vs 3D Pose Estimation sample
  • 6. IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation (Hybrid) December 21 – 23, 2022, Gwalior, India Enabling the Change! Social Innovation for sustainable societies Related Work 6 S.N. Paper Title Problem Statement Method Limitation 1. DeepPose: Human Pose Estimation via Deep Neural Networks by A. Toshev et.al. (2014) [8]. Aim to extract 2D/3D key points information using deep learning based HPE algorithm PosePipe: a open-source deep learning model used to extract 2D/3D keypoints. Hard to work on the video based datasets. 2. Human Pose Estimation via Convolutional Part Heatmap Regression by A. Bulat et.al. (2016) [10]. Extraction of Human body joints using deep learning based approach A Convolutional Neural Network (CNN) based approach used for identification of the human pose CNN based approach doesn’t work on part based posture identification 3. Combining local appearance and holistic view: Dual-Source Deep Neural Networks for human pose estimation by Xiaochuan Fan et.al. (2018) [12]. Aim to extract local part pose information to enhance human posture evaluation Dual-Source Deep Convolutional Neural Network (DS-CNN) used for posture evaluation. Pose estimation results is not that much correct.
  • 7. IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation (Hybrid) December 21 – 23, 2022, Gwalior, India Enabling the Change! Social Innovation for sustainable societies Related Work 7 S.N. Paper Title Problem Statement Method Limitation 4. Deep learning based 2D human pose estimation: A survey by Q. Dang et.al. (2019) [2]. Identification of 2D/3D human pose estimation using kinematic model Model-free & model based;two estimation algorithm is used It doesn’t work on the RGB-D images. 5. End-to-end recovery of human shape and pose by A. Kanazawa (2020) [11]. Identification of the human posture with joints angle & key points information Human Mess Recovery (HMR): an end-to-end system for generating a complete 3D/2D mess. 3D mess can’t be extracted from the depth RGB image/video. 6. Hand Pose Estimation from RGB Images Based on Deep Learning: A Survey by Y. Liu et.al. (2021) [1]. Identification of 2D human pose estimation. DeepPose: a cascaded deep learning-based regressor used It doesn’t work to extract 3D human pose estimation
  • 8. HUMAN POSE ESTIMATION MODELS
      1. OpenPose [4]
      2. ViTPose [13]
      3. HRNet [6]
      4. AlphaPose [5]
      5. DenseNet [14]
      6. EfficientPose [15,16]
      7. DensePose [17]
      8. Hourglass [18]
      Fig. 4: Architecture of our proposed work
  • 9. 1. OpenPose:
      • OpenPose is based on the VGG-19 convolutional neural network.
      • It comprises four parts: input, part confidence map, bipartite matching, and output image.
      2. ViTPose:
      • Based on non-hierarchical vision transformers as backbones.
      • The decoder has two deconvolution layers and one prediction layer.
      Fig. 5: Image extraction through the OpenPose method
      Fig. 6: Framework of the ViTPose method
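To make the ViTPose decoder description more concrete, the sketch below traces spatial sizes from the input image through patch embedding and two deconvolution layers. The 256×192 input, the 16×16 patch size, and the deconvolution settings (kernel 4, stride 2, padding 1) are assumptions chosen for illustration and are not stated on the slide; the output-size formula is the standard one for transposed convolutions. The function names are hypothetical.

```python
def deconv_out(size, kernel=4, stride=2, padding=1, output_padding=0):
    """Output length of a transposed convolution along one spatial axis."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def vitpose_heatmap_size(height, width, patch=16, num_deconv=2):
    """Trace the spatial size from image to heatmap (illustrative assumptions only)."""
    # Patch embedding: one token per non-overlapping patch.
    h, w = height // patch, width // patch
    # Each deconvolution layer (assumed kernel 4, stride 2, padding 1) doubles the size.
    for _ in range(num_deconv):
        h, w = deconv_out(h), deconv_out(w)
    return h, w


heatmap_hw = vitpose_heatmap_size(256, 192)
print(heatmap_hw)
```

Under these assumptions the heatmap ends up at one quarter of the input resolution; the final prediction layer would then map the channels to one heatmap per key point without changing the spatial size.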
  • 10. 3. HRNet:
      • The backbone model is a convolutional neural network.
      • Used for semantic segmentation, object recognition, and image categorisation.
      4. AlphaPose:
      • Uses a Symmetric Spatial Transformer Network (SSTN).
      • Uses a Single-Person Pose Estimator (SPPE).
      Fig. 7: Framework of the HRNet method
      Fig. 8: Image extraction through the AlphaPose method
  • 11. 5. DenseNet:
      • The backbone model is ResNet (based on CNN).
      • Solves the vanishing gradient problem by using an LSTM as one layer.
      6. EfficientPose:
      • The backbone model is a convolutional neural network.
      • It comprises two main parts: an efficient backbone and an efficient head.
      Fig. 9: Layered structure of the DenseNet method
      Fig. 10: Architecture of the EfficientPose method
  • 12. 7. DensePose:
      • Uses the fully convolutional network design from Dense Regression (DenseReg).
      • It combines the DenseReg method with Mask-RCNN to improve pose estimation.
      8. Hourglass:
      • Based on tightly linked fully convolutional networks.
      • Conv-deconv and encoder-decoder methods are linked to the hourglass module.
      Fig. 11: Architecture of the DensePose method
      Fig. 12: Architecture of the Hourglass method
  • 13. DATASET USED
      A. COCO [20]:
      • Images: 66,808
      • Annotations: 273,469
      • Key points: 17
      B. MPII [12]:
      • Images: 40,000
      • Annotations: 223,589
      • Classes: 410
      Fig. 13: COCO dataset: sample images [20]
      Fig. 14: MPII dataset: sample images [12]
  • 14. RESULTS
      Results are evaluated using the following metrics:
      1. Average Precision (AP): The weighted mean of the precisions at each threshold, where the weight is the increase in recall from the prior threshold [7,8].
         \[ AP@\alpha = \int_0^1 p(r)\,dr \]
      2. Mean Average Precision (mAP): The average precision value over different IoUs [9].
         \[ mAP@\alpha = \frac{1}{n} \sum_{i=1}^{n} AP_i \]
      3. Percentage of Correct Key points (PCK): The fraction of predicted key points that fall within a given distance of the actual joint [10,11].
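The three metrics above can be sketched in a few lines of Python. This is a minimal illustration rather than the evaluation code behind the reported results: the function names, the sample precision/recall values, and the toy key points are all hypothetical, and the PCK threshold is taken as a plain pixel distance by assumption (benchmarks usually normalise it by a body-scale measure such as head size).

```python
import math


def average_precision(recalls, precisions):
    """Discrete AP: precision weighted by the increase in recall from the prior threshold."""
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(aps):
    """mAP = (1/n) * sum of AP_i."""
    return sum(aps) / len(aps)


def pck(predicted, ground_truth, threshold):
    """Fraction of predicted key points within `threshold` of the ground-truth joint."""
    correct = sum(
        1 for p, g in zip(predicted, ground_truth) if math.dist(p, g) <= threshold
    )
    return correct / len(ground_truth)


# Hypothetical precision/recall samples, ordered by increasing recall.
ap_example = average_precision([0.2, 0.5, 1.0], [1.0, 0.8, 0.6])

# Hypothetical AP values at three IoU settings.
map_example = mean_average_precision([0.5, 0.7, 0.9])

# Hypothetical key points: distances to ground truth are 2, 5 and 10 pixels.
pred = [(10.0, 10.0), (20.0, 20.0), (30.0, 35.0)]
truth = [(10.0, 12.0), (25.0, 20.0), (30.0, 25.0)]
pck_example = pck(pred, truth, threshold=5.0)

print(ap_example, map_example, pck_example)
```

With these made-up inputs, two of the three key points fall within the 5-pixel threshold, so the PCK is 2/3.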
  • 15. RESULTS
      TABLE I: BENCHMARKING WITH SOTA POSE ESTIMATION NETWORKS ON COCO DATASET BASED ON AP & MAP
      Algorithm               AP     AP0.5   AP0.75   APM    APL    mAP
      OpenPose [4]            60.5   83.4    66.4     55.1   68.1   65.9
      AlphaPose [5]           73.3   89.2    79.1     69.0   78.6   77.84
      HRNet [6]               77.4   92.6    84.0     73.6   83.7   82.3
      ViTPose-B [13]          81.1   95.0    88.2     87.8   86.0   85.6
      DenseNet [14]           77.1   93.3    83.6     72.2   83.6   82.6
      EfficientPose [15,16]   70.5   91.1    79.0     67.3   76.2   76.1
      DensePose [17]          55.8   83.7    56.3     42.2   53.8   61.1
      Hourglass [18]          65.6   88.8    69.3     -      -      74.5
      4*RSN-50 [19]           78.6   94.6    86.6     83.3   75.5   83.8
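As a worked instance of the mAP formula, the sketch below averages the AP variants in the AlphaPose row of Table I. It assumes that the mAP column is the mean of the AP values listed in the same row; this reading is an assumption, not something the slides state, and although it reproduces the AlphaPose entry, not every row of the table matches it exactly.

```python
# AlphaPose row of Table I (values copied from the table).
alphapose_ap = {
    "AP": 73.3,
    "AP0.5": 89.2,
    "AP0.75": 79.1,
    "APM": 69.0,
    "APL": 78.6,
}

# Assumed reading of the mAP column: mean of the AP variants in the row.
row_mean = sum(alphapose_ap.values()) / len(alphapose_ap)

print(round(row_mean, 2))  # 77.84, matching the AlphaPose mAP entry
```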
  • 16. RESULTS
      TABLE II: BENCHMARKING WITH SOTA POSE ESTIMATION NETWORKS ON MPII DATASET BASED ON PCK OF BODY PART & AVERAGE PCK
      Algorithm               Ankle   Knee    Hip    Wrist   Elbow   Shoulder   Head    Avg. PCK
      OpenPose [4]            79.87   87.17   93.0   79.15   89.03   95.97      96.11   88.73
      AlphaPose [5]           72.4    79.9    80.3   76.4    84.0    90.5       91.3    82.1
      HRNet [6]               82.5    86.1    89.1   85.9    90.5    85.9       96.9    90.0
      ViTPose-B [13]          88.3    91.9    92.4   90.1    93.7    97.4       97.6    93.4
      EfficientPose [15,16]   83.9    87.5    90.3   87.5    91.7    96.0       98.2    91.2
      Hourglass [18]          89.3    92.2    93.2   91.2    94.4    97.5       98.8    94.1
      4*RSN-50 [19]           86.8    90.6    92.0   89.9    93.9    97.3       98.5    93.0
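The per-joint comparison in Table II can be checked mechanically. The sketch below copies the table values into a dictionary and finds the best model for each column and the weakest model by average PCK; the variable and function names are hypothetical, and only the numbers come from the table.

```python
COLUMNS = ["Ankle", "Knee", "Hip", "Wrist", "Elbow", "Shoulder", "Head", "Avg. PCK"]

# PCK values copied from Table II (MPII dataset).
MPII_PCK = {
    "OpenPose": [79.87, 87.17, 93.0, 79.15, 89.03, 95.97, 96.11, 88.73],
    "AlphaPose": [72.4, 79.9, 80.3, 76.4, 84.0, 90.5, 91.3, 82.1],
    "HRNet": [82.5, 86.1, 89.1, 85.9, 90.5, 85.9, 96.9, 90.0],
    "ViTPose-B": [88.3, 91.9, 92.4, 90.1, 93.7, 97.4, 97.6, 93.4],
    "EfficientPose": [83.9, 87.5, 90.3, 87.5, 91.7, 96.0, 98.2, 91.2],
    "Hourglass": [89.3, 92.2, 93.2, 91.2, 94.4, 97.5, 98.8, 94.1],
    "4*RSN-50": [86.8, 90.6, 92.0, 89.9, 93.9, 97.3, 98.5, 93.0],
}


def best_per_column(table, columns):
    """Return the model with the highest score in each column."""
    return {
        col: max(table, key=lambda name: table[name][i])
        for i, col in enumerate(columns)
    }


best = best_per_column(MPII_PCK, COLUMNS)
# Model with the lowest average PCK (last column).
weakest_avg = min(MPII_PCK, key=lambda name: MPII_PCK[name][-1])

for column, model in best.items():
    print(f"{column}: {model}")
print("Lowest Avg. PCK:", weakest_avg)
```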
  • 17. CONCLUSION
      • This study benchmarks how well each model yields accurate and spatially precise key point heat maps, measured by average precision and the percentage of correct key points.
      • Experimental analysis was done over two datasets: COCO and MPII.
      • ViTPose-B outperformed the other models in every AP variant on the COCO dataset, which is attributed to its use of a transformer instead of convolution.
      • OpenPose underperformed on the COCO dataset (AP 60.5); only DensePose scored lower (AP 55.8).
      • The Hourglass model achieved the best average PCK on the MPII dataset and the best PCK for every body part.
      • AlphaPose had the lowest average PCK on the MPII dataset.
  • 18. WORK DONE (based on discussed work)
      Task 1: Human Skeleton Pose and Spatio-Temporal Feature-based Activity Recognition using ST-GCN
      • Human activity recognition using a pose estimation algorithm.
      • Normalise the human activity sequence with the Gaussian filter method.
      • Investigate the ST-GCN model for extraction of spatial and temporal features.
      Task 2: 3D Skeleton-based Human Motion Prediction using Dynamic Multi-scale Spatiotemporal Graph Recurrent Neural Networks
      • Human motion prediction using a graph recurrent neural network.
      • Investigate a novel DMST-GRNN model on multi-scale variation for the extraction of spatial and temporal features.
      • Validate human motion on time series-based 3D sequential datasets.
  • 19. REFERENCES
      [1] Y. Liu, J. Jiang, and J. Sun, “Hand Pose Estimation from RGB Images Based on Deep Learning: A Survey.” 2021 IEEE 7th International Conference on Virtual Reality (ICVR), 2021.
      [2] Q. Dang, J. Yin, B. Wang, and W. Zheng, “Deep learning based 2D human pose estimation: A survey.” Tsinghua Science and Technology, vol. 24, no. 6, pp. 663-676, 2019.
      [3] M. Choudhary, V. Tiwari, and S. Jain, “Person re-identification using deep siamese network with multi-layer similarity constraints.” Multimedia Tools and Applications, pp. 1-17, 2021.
      [4] D. Osokin, “Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose.” Proceedings of the 8th International Conference on Pattern Recognition Applications and Methods, 2019.
      [5] H.-S. Fang, S. Xie, Y.-W. Tai, and C. Lu, “RMPE: Regional Multi-person Pose Estimation.” 2017 IEEE International Conference on Computer Vision (ICCV), 2017.
      [6] K. Sun, B. Xiao, D. Liu, and J. Wang, “Deep High-Resolution Representation Learning for Human Pose Estimation.” 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
      [7] W. Li, R. Du, and S. Chen, “Skeleton-Based Spatio-Temporal U-Network for 3D Human Pose Estimation in Video.” Sensors, vol. 22, no. 7, p. 2573, 2022.
      [8] A. Toshev and C. Szegedy, “DeepPose: Human Pose Estimation via Deep Neural Networks.” 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014.
      [9] Z. Cao, T. Simon, S.-E. Wei, and Y. Sheikh, “Real-time Multi-person 2D Pose Estimation Using Part Affinity Fields.” 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
      [10] A. Bulat and G. Tzimiropoulos, “Human Pose Estimation via Convolutional Part Heatmap Regression.” Computer Vision – ECCV 2016, pp. 717-732, 2016.
  • 20. REFERENCES
      [11] A. Kanazawa, M. J. Black, D. W. Jacobs, and J. Malik, “End-to-end recovery of human shape and pose.” 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018.
      [12] M. Andriluka, L. Pishchulin, P. Gehler, and B. Schiele, “2D Human Pose Estimation: New Benchmark and State of the Art Analysis.” 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014.
      [13] Y. Xu, J. Zhang, Q. Zhang, and D. Tao, “ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation.” Computer Vision – ECCV 2022, 2022.
      [14] S. W. Chu, Y. Song, J. J. Zou, and W. Cai, “Human Pose Estimation Using Deep Convolutional Densenet Hourglass Network with Intermediate Points Voting.” 2019 IEEE International Conference on Image Processing (ICIP), 2019.
      [15] J. Li, C. Wang, H. Zhu, Y. Mao, H.-S. Fang, and C. Lu, “CrowdPose: Efficient Crowded Scenes Pose Estimation and a New Benchmark.” 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
      [16] D. Groos, H. Ramampiaro, and E. A. Ihlen, “EfficientPose: Scalable single-person pose estimation.” Applied Intelligence, vol. 51, no. 4, pp. 2518-2533, 2020.
      [17] R. A. Guler, N. Neverova, and I. Kokkinos, “DensePose: Dense Human Pose Estimation in the Wild.” 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018.
      [18] T. Xu and W. Takano, “Graph Stacked Hourglass Networks for 3D Human Pose Estimation.” 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
      [19] Y. Cai et al., “Learning Delicate Local Representations for Multi-person Pose Estimation.” Computer Vision – ECCV 2020, 2020.
      [20] T.-Y. Lin et al., “Microsoft COCO: Common Objects in Context.” Computer Vision – ECCV 2014, pp. 740-755, 2014.