[DL Reading Group] Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
1.
http://deeplearning.jp/
Model soups: averaging weights of multiple fine-tuned models
improves accuracy without increasing inference time
小林 範久 Present Square Co.,Ltd.
DEEP LEARNING JP
[DL Papers]
2.
Copyright (C) PresentSquare Co., Ltd. All Rights Reserved.
Paper Information
Title: Model soups: averaging weights of multiple fine-tuned models improves accuracy
without increasing inference time
URL: https://arxiv.org/abs/2203.05482
Authors: Mitchell Wortsman, Gabriel Ilharco, Samir Yitzhak Gadre, Rebecca Roelofs,
Raphael Gontijo-Lopes, Ari S. Morcos, Hongseok Namkoong, Ali Farhadi,
Yair Carmon, Simon Kornblith, Ludwig Schmidt
Overview:
• Proposes "model soups": averaging the weights of multiple models fine-tuned with
different hyperparameter configurations improves both accuracy and robustness.
• Unlike conventional ensembling, many models can be averaged without incurring any
additional inference or memory cost.
• Applied to CLIP, ALIGN, and a ViT-G model pre-trained on JFT, the approach
substantially improves over the best individual model, achieving 90.94% top-1
accuracy on ImageNet.
• The approach further extends to multiple image-classification and natural language
processing tasks, improving out-of-distribution performance and zero-shot performance
on new downstream tasks.
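The core idea above (a "uniform soup") is simply an element-wise average of the fine-tuned models' parameters. A minimal sketch, using plain Python dicts of float lists in place of framework tensors (real implementations would average PyTorch `state_dict` tensors, but the logic is identical):

```python
def uniform_soup(state_dicts):
    """Average the weights of several fine-tuned models ("uniform soup").

    Each state dict maps a parameter name to a flat list of floats.
    The result has the same keys, with each parameter averaged
    element-wise across all models.
    """
    n = len(state_dicts)
    return {
        key: [sum(vals) / n for vals in zip(*(sd[key] for sd in state_dicts))]
        for key in state_dicts[0]
    }

# Three hypothetical models fine-tuned with different hyperparameters.
models = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
    {"w": [5.0, 6.0], "b": [2.0]},
]
soup = uniform_soup(models)
# soup == {"w": [3.0, 4.0], "b": [1.0]}
```

Because the soup is a single set of weights, inference runs one forward pass, unlike an ensemble, which must run every member model and average their outputs.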
3.
Agenda
1. Introduction
2. Related Work
3. Method
4. Experiments
5. Conclusion