[DL Paper Reading Group] A Generalization of Otsu’s Method and Minimum Error Thresholding [ECCV2020]
1. DEEP LEARNING JP
[DL Papers]
http://deeplearning.jp/
A Generalization of Otsu’s Method and Minimum Error Thresholding [ECCV2020]
Masashi Yokota, RESTAR Inc.
2. Paper Information
• Author: Jonathan T. Barron (Google Research)
• Accepted at ECCV 2020
• Generalizes Otsu’s binarization and Minimum Error Thresholding (MET), two well-known binarization algorithms, into a single method with few parameters and simple logic (a few dozen lines of Python; see the reference sketch below). It achieves better performance than deep-learning methods and methods that rely on complex preprocessing.
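For reference, here is a minimal Python sketch of standard Otsu’s method, one of the two classical algorithms the paper generalizes. This is a common histogram-based textbook formulation, not the paper’s own GHT code; the function name otsu_threshold and the 8-bit grayscale assumption are illustrative choices.

import numpy as np

def otsu_threshold(image):
    """Standard Otsu's method: choose the threshold that maximizes
    the between-class variance of the grayscale histogram."""
    # 256-bin histogram (assumes 8-bit grayscale input)
    counts, bin_edges = np.histogram(image, bins=256, range=(0, 256))
    bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2

    # Cumulative class sizes and means for every candidate threshold
    w0 = np.cumsum(counts)                       # pixels at or below each bin
    w1 = w0[-1] - w0                             # pixels above each bin
    sum0 = np.cumsum(counts * bin_centers)
    mu0 = sum0 / np.maximum(w0, 1)               # mean of the lower class
    mu1 = (sum0[-1] - sum0) / np.maximum(w1, 1)  # mean of the upper class

    # Otsu's criterion: maximize the between-class variance
    between_var = w0 * w1 * (mu0 - mu1) ** 2
    return bin_centers[np.argmax(between_var)]

In the paper, Otsu’s criterion and MET both emerge as limiting cases of the proposed method’s hyperparameters, which is why the generalized algorithm can stay comparably compact.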