2. Introduction
Inductive Transfer: 10 Years Later (NIPS 2005 Workshop)
"Inductive transfer or transfer learning refers to the problem of retaining and applying the knowledge learned in one or more tasks to efficiently develop an effective hypothesis for a new task."
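As a sketch not in the original slides, the definition above ("retain and apply knowledge learned in one task to efficiently develop a hypothesis for a new task") can be illustrated in a few lines of NumPy: a linear classifier is pretrained on a data-rich source task and its weights are reused as the starting point for a related target task that has almost no labels. The synthetic data, helper names, and hyperparameters below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task(n, mean):
    # binary task: class 1 centered at +mean, class 0 centered at -mean
    X = np.vstack([rng.normal(mean, 1.0, (n, 2)),
                   rng.normal(-np.asarray(mean), 1.0, (n, 2))])
    y = np.concatenate([np.ones(n), np.zeros(n)])
    return X, y

def fit(X, y, w=None, lr=0.1, steps=500):
    # logistic regression by gradient descent; `w` allows warm-starting
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1]) if w is None else w.copy()
    for _ in range(steps):
        p = 1 / (1 + np.exp(-np.clip(Xb @ w, -30, 30)))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def accuracy(w, X, y):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return float(np.mean((Xb @ w > 0) == (y == 1)))

# source task: plentiful data; target task: related but shifted, with 2 labeled points
X_src, y_src = make_task(1000, (1.0, 1.0))
X_tgt, y_tgt = make_task(1, (1.2, 0.8))
X_test, y_test = make_task(500, (1.2, 0.8))

w_src = fit(X_src, y_src)                      # knowledge learned on the source task
w_warm = fit(X_tgt, y_tgt, w=w_src, steps=10)  # retained and adapted to the target task
w_scratch = fit(X_tgt, y_tgt, steps=10)        # target data alone, no transfer

print("warm-started:", accuracy(w_warm, X_test, y_test))
print("from scratch:", accuracy(w_scratch, X_test, y_test))
```

With only two target labels, the warm-started model inherits a usable decision boundary from the source task, which is the "effective hypothesis for a new task" of the definition.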
Goals of this talk
• Organize transfer learning systematically
• Explain the problem settings of transfer learning and their concrete formulations
• Introduce examples of concrete transfer learning methods
Note: slides and chapters marked with ∗ are skipped in the talk due to time constraints
松井 (Nagoya University), Fundamentals of Transfer Learning, 1 / 41
14. When to transfer: negative transfer
Negative transfer: compare
1. a model trained on only one of the domains, used for the target task
2. a model trained on both domains, used for the target task
Negative transfer has occurred when (task performance of 2) ≤ (task performance of 1) (panel (b) in the figure below).
[Figure: AUC (0.0–1.0) vs. the number of target training cases, panels (a) and (b); legend: source only, transfer, target only]
• The further apart the two domains are, the more likely negative transfer is to occur
• Preventing negative transfer is an important challenge in transfer learning
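The criterion above, (performance with transfer) ≤ (performance without), can be checked numerically. The following is a minimal sketch, not code from the slides: synthetic 2-D Gaussian domains, a hand-rolled logistic regression, and a rank-based AUC (matching the AUC axis of the figure). The source domain is constructed to disagree with the target, so pooling the two domains hurts, reproducing the panel (b) situation.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_domain(n, mean_pos, mean_neg):
    # binary domain: positives around mean_pos, negatives around mean_neg
    X = np.vstack([rng.normal(mean_pos, 1.0, (n, 2)),
                   rng.normal(mean_neg, 1.0, (n, 2))])
    y = np.concatenate([np.ones(n), np.zeros(n)])
    return X, y

def fit_logreg(X, y, lr=0.1, steps=500):
    # logistic regression by gradient descent
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1 / (1 + np.exp(-np.clip(Xb @ w, -30, 30)))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def auc(w, X, y):
    # rank-based AUC: probability that a positive outranks a negative
    s = np.hstack([X, np.ones((len(X), 1))]) @ w
    ranks = np.empty(len(s))
    ranks[np.argsort(s)] = np.arange(1, len(s) + 1)
    n_pos, n_neg = y.sum(), len(y) - y.sum()
    return (ranks[y == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# target domain: few labels; source domain: large but its label-feature relation is reversed
X_tgt, y_tgt = make_domain(20, (+1, +1), (-1, -1))
X_src, y_src = make_domain(500, (-1, -1), (+1, +1))   # diverged from the target
X_test, y_test = make_domain(500, (+1, +1), (-1, -1))

auc_target_only = auc(fit_logreg(X_tgt, y_tgt), X_test, y_test)
auc_transfer = auc(fit_logreg(np.vstack([X_src, X_tgt]),
                              np.concatenate([y_src, y_tgt])), X_test, y_test)

print(f"target only AUC: {auc_target_only:.3f}, transfer AUC: {auc_transfer:.3f}")
if auc_transfer <= auc_target_only:
    print("negative transfer: pooling the source domain hurt target performance")
```

Because the two domains are maximally divergent here, the pooled model's AUC drops far below the target-only baseline, illustrating the first bullet above: the further apart the domains, the more likely negative transfer becomes.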