Slide 1 (2022-04-21)
Sangmin Woo
Computational Intelligence Lab.
School of Electrical Engineering
Korea Advanced Institute of Science and Technology (KAIST)
Multi-modal Action Recognition: Datasets & Benchmarks
Slide 2
Action Recognition Datasets
Generic
Kinetics [1]
Charades [2]
ActivityNet [3]
UCF101 [4]
Instructional
YouCook [5]
COIN [6]
HowTo100M [7]
[1] Carreira, Joao, et al. "Quo vadis, action recognition? A new model and the Kinetics dataset." CVPR 2017
[2] Sigurdsson, Gunnar A., et al. "Hollywood in homes: Crowdsourcing data collection for activity understanding." ECCV 2016
[3] Caba Heilbron, Fabian, et al. "ActivityNet: A large-scale video benchmark for human activity understanding." CVPR 2015
[4] Soomro, Khurram, et al. "UCF101: A dataset of 101 human actions classes from videos in the wild." arXiv 2012
[5] Zhou, Luowei, et al. "Towards automatic learning of procedures from web instructional videos." AAAI 2018
[6] Tang, Yansong, et al. "COIN: A large-scale dataset for comprehensive instructional video analysis." CVPR 2019
[7] Miech, Antoine, et al. "HowTo100M: Learning a text-video embedding by watching hundred million narrated video clips." ICCV 2019
Slide 3
Action Recognition Datasets
Ego-centric
EPIC Kitchens [1]
Compositional
Action Genome [2]
Something-Something [3]
HOMAGE [4]
[1] Damen, Dima, et al. "Scaling egocentric vision: The epic-kitchens dataset." ECCV 2018
[2] Ji, Jingwei, et al. "Action genome: Actions as compositions of spatio-temporal scene graphs." CVPR 2020
[3] Goyal, Raghav, et al. "The 'Something Something' video database for learning and evaluating visual common sense." ICCV 2017
[4] Rai, Nishant, et al. "Home Action Genome: Cooperative Compositional Action Understanding." CVPR 2021
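The compositional datasets above (Action Genome, HOMAGE) decompose an activity into per-frame scene graphs of (subject, relationship, object) triplets. A minimal sketch of that representation, assuming illustrative field names rather than the datasets' actual annotation schema:

```python
from typing import List, NamedTuple

class SceneGraphEdge(NamedTuple):
    """One (subject, relationship, object) triplet for a single frame.
    Field names are illustrative, not Action Genome's real schema."""
    subject: str
    relationship: str
    obj: str

# A compositional action as a sequence of per-frame scene graphs,
# e.g. "person drinks from cup" decomposed over time.
frames: List[List[SceneGraphEdge]] = [
    [SceneGraphEdge("person", "holding", "cup")],
    [SceneGraphEdge("person", "drinking_from", "cup")],
    [SceneGraphEdge("person", "holding", "cup")],
]

# The distinct relationships that compose the action.
relations = sorted({e.relationship for frame in frames for e in frame})
```

The point of the decomposition is that a model can be supervised on the intermediate relationships, not just the final action class.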
Slide 5
Action Recognition Datasets
Multi-modal
Single-Label (Video-level Action)
MSR-Action3D [1]: depth maps
PKU-MMD [2]: RGB + D + IR + skeleton
NTU RGB+D [3, 4]: RGB + D + IR + 3D human joints
Multi-Label (Temporally Localized Actions)
MMAct [5]: RGB + keypoints + acc + gyro + orientation
LEMMA [6]: RGB + D
HOMAGE [7]: RGB + IR + audio + RGB light + light + acc + gyro
etc.
[1] Li, Wanqing, et al. "Action recognition based on a bag of 3D points." CVPRW 2010
[2] Liu, Chunhui, et al. "PKU-MMD: A large scale benchmark for continuous multi-modal human action understanding." arXiv 2017
[3] Shahroudy, Amir, et al. "NTU RGB+D: A large scale dataset for 3D human activity analysis." CVPR 2016
[4] Liu, Jun, et al. "NTU RGB+D 120: A large-scale benchmark for 3D human activity understanding." TPAMI 2019
[5] Kong, Quan, et al. "MMAct: A large-scale dataset for cross modal human action understanding." ICCV 2019
[6] Jia, Baoxiong, et al. "LEMMA: A multi-view dataset for learning multi-agent multi-task activities." ECCV 2020
[7] Rai, Nishant, et al. "Home Action Genome: Cooperative Compositional Action Understanding." CVPR 2021
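A sample in these multi-modal datasets bundles several aligned streams and carries either one video-level label or a list of temporally localized segments. A minimal container sketch, assuming placeholder field names rather than any dataset's real file format:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class MultiModalSample:
    """One training sample; all field names are illustrative placeholders."""
    rgb_path: Optional[str] = None
    depth_path: Optional[str] = None
    ir_path: Optional[str] = None
    skeleton_path: Optional[str] = None
    acc: Optional[list] = None   # accelerometer time series
    gyro: Optional[list] = None  # gyroscope time series
    # Single-label (video-level) datasets fill `label`;
    # multi-label datasets fill `segments` with (start_s, end_s, class_id).
    label: Optional[int] = None
    segments: List[Tuple[float, float, int]] = field(default_factory=list)

    def modalities(self) -> List[str]:
        """List which modality fields are actually present."""
        names = ("rgb_path", "depth_path", "ir_path",
                 "skeleton_path", "acc", "gyro")
        return [n for n in names if getattr(self, n) is not None]

# Usage: a video-level sample with two modalities present.
sample = MultiModalSample(rgb_path="v1.mp4", depth_path="v1_d.npy", label=3)
```

Keeping every modality optional mirrors the tables above: no two datasets share the same modality set, so loaders typically check availability per sample.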
Slide 6
Action Recognition Datasets
Multi-modal
Single-Label (Video-level Action)
PKU-MMD: RGB + D + IR + skeleton
Liu, Chunhui, et al. "PKU-MMD: A large scale benchmark for continuous multi-modal human action understanding." arXiv 2017
Slide 7
Action Recognition Datasets
Multi-modal
Single-Label (Video-level Action)
NTU RGB+D: RGB + D + IR + 3D human joints
Shahroudy, Amir, et al. "NTU RGB+D: A large scale dataset for 3D human activity analysis." CVPR 2016
Liu, Jun, et al. "NTU RGB+D 120: A large-scale benchmark for 3D human activity understanding." TPAMI 2019
[Figure: sample frames per modality — panels: RGB, RGB+Joints, Depth, Depth+Joints, IR]
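The joint modality in NTU RGB+D is a sequence of 3D skeletons with 25 Kinect v2 joints per body. A sketch of a common in-memory layout and a simple root-centering normalization, assuming a (frames, joints, xyz) array rather than the dataset's raw file format:

```python
import numpy as np

# Illustrative layout: T frames x 25 joints x 3D coordinates.
T, J = 64, 25
skeleton = np.random.default_rng(0).normal(size=(T, J, 3))

def center_on_joint(seq: np.ndarray, root: int = 0) -> np.ndarray:
    """Translate every frame so the chosen root joint sits at the
    origin, a simple view-normalization step often applied before
    feeding skeleton sequences to a recognition model."""
    return seq - seq[:, root:root + 1, :]

centered = center_on_joint(skeleton)
```

Broadcasting against the `(T, 1, 3)` root slice shifts all 25 joints per frame in one vectorized subtraction.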
Slide 8
Action Recognition Datasets
Multi-modal
Multi-Label (Temporally Localized Actions)
MMAct: RGB + keypoints + acc + gyro + orientation
Kong, Quan, et al. "MMAct: A large-scale dataset for cross modal human action understanding." ICCV 2019
Slide 9
Action Recognition Datasets
Multi-modal
Multi-Label (Temporally Localized Actions)
HOMAGE: RGB + IR + audio + RGB light + light + acc + gyro
etc.
Rai, Nishant, et al. "Home Action Genome: Cooperative Compositional Action Understanding." CVPR 2021
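For the multi-label datasets above, actions are annotated as (start, end, class) segments, and predictions are typically scored by temporal IoU against ground truth. A minimal sketch of that overlap measure plus a simplified greedy matching, not any benchmark's official evaluation code:

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_s, end_s)

def temporal_iou(a: Segment, b: Segment) -> float:
    """IoU of two temporal segments in seconds: the standard overlap
    measure for temporally localized action labels."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

def match_predictions(preds: List[Segment],
                      gts: List[Segment],
                      thresh: float = 0.5) -> int:
    """Greedy one-to-one matching of predictions to ground truth at a
    tIoU threshold (a simplified form of detection-style scoring)."""
    matched, used = 0, set()
    for p in preds:
        for i, g in enumerate(gts):
            if i not in used and temporal_iou(p, g) >= thresh:
                used.add(i)
                matched += 1
                break
    return matched
```

Full benchmarks usually sweep several tIoU thresholds (e.g. 0.5 to 0.95) and average per-class precision; the sketch shows only the core matching step.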