This document summarizes recent developments in action recognition using deep learning techniques. It discusses early approaches using improved dense trajectories and two-stream convolutional neural networks. It then focuses on advances using 3D convolutional networks, enabled by large video datasets like Kinetics. State-of-the-art results are achieved using inflated 3D convolutional networks and temporal aggregation methods like temporal linear encoding. The document provides an overview of popular datasets and challenges and concludes with tips on training models at scale.
HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Est... (harmonylab)
Public URL: https://arxiv.org/abs/1908.10357
Source: Cheng B, Xiao B, Wang J, Shi H, Huang T S, Zhang L: HigherHRNet: Scale-aware representation learning for bottom-up human pose estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386-5395 (2020). https://arxiv.org/abs/1908.10357
Summary: A bottom-up pose estimation method that uses a high-resolution feature pyramid to account for differences in person scale. It combines the feature-map output of HRNet with a higher-resolution output upsampled by transposed convolution. On COCO test-dev, it outperforms the previous bottom-up method by 2.5% AP for medium-sized persons and achieves the bottom-up SOTA (70.5% AP) without post-processing.
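As a rough illustration of how transposed-convolution upsampling yields a higher-resolution output, the sketch below computes output sizes with the standard transposed-convolution size formula. The kernel size, stride, padding, the 512-pixel input, and the 1/4-resolution backbone output are illustrative assumptions, not values stated in the summary above, and the function name is hypothetical.

```python
def transposed_conv_output_size(size, kernel=4, stride=2, padding=1,
                                output_padding=0, dilation=1):
    """Spatial output size of a transposed convolution along one dimension.

    Standard formula:
    (size - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1
    The default kernel/stride/padding values are assumptions for illustration.
    """
    return ((size - 1) * stride - 2 * padding
            + dilation * (kernel - 1) + output_padding + 1)


# Assumed setup: a 512-pixel input and a backbone feature map at 1/4 resolution.
input_size = 512
backbone_size = input_size // 4

# One transposed convolution with the assumed settings doubles the resolution.
upsampled_size = transposed_conv_output_size(backbone_size)

print(backbone_size, upsampled_size)
```

With these assumed settings, each transposed convolution doubles the spatial size, so the two outputs together form a small pyramid of a base-resolution map and a higher-resolution map.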
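Assorted notes on object detection, in the guise of an explanation of You Only Look One-level Feature (Yusuke Uchida)
Presentation slides from the 7th All-Japan Computer Vision Study Group, "CVPR2021 Reading Session" (Part 1).
https://kantocv.connpass.com/event/216701/
The slides explain You Only Look One-level Feature and broadly cover informal discussion of the YOLO family and related object detection methods.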
Semi supervised, weakly-supervised, unsupervised, and active learning (Yusuke Uchida)
An overview of semi-supervised learning, weakly-supervised learning, unsupervised learning, and active learning.
Focused on recent deep learning-based image recognition approaches.
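Slides presented at a reading group within the DeNA AI System Department. They introduce the types of deep fakes and methods for detecting them.
They mainly cover the following papers: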
S. Agarwal, et al., "Protecting World Leaders Against Deep Fakes," in Proc. of CVPR Workshop on Media Forensics, 2019.
A. Rossler, et al., "FaceForensics++: Learning to Detect Manipulated Facial Images," in Proc. of ICCV, 2019.