【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...Deep Learning JP
The document proposes modifications to self-attention in Transformers to improve faithful signal propagation without shortcuts like skip connections or layer normalization. Specifically, it introduces a normalization-free network that uses dynamic isometry to ensure unitary transformations, a ReZero technique to implement skip connections without adding shortcuts, and modifications to attention and normalization techniques to address issues like rank collapse in Transformers. The methods are evaluated on tasks like CIFAR-10 classification and language modeling, demonstrating improved performance over standard Transformer architectures.
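The ReZero technique mentioned above gates each residual branch with a learned scalar initialized to zero, so every block starts as the identity map and signal propagates faithfully from the first step. A minimal sketch of that idea (function and parameter names here are illustrative, not from the paper's code):

```python
import numpy as np

def rezero_block(x, sublayer, alpha):
    """ReZero-style residual: x + alpha * F(x).

    alpha is a learned scalar initialized to 0, so at initialization
    the block is exactly the identity and depth does not distort signals.
    """
    return x + alpha * sublayer(x)

x = np.ones(4)
out_init = rezero_block(x, lambda h: 2.0 * h, alpha=0.0)   # identity at init
out_trained = rezero_block(x, lambda h: 2.0 * h, alpha=0.5)  # branch gradually enabled
```

As training progresses, alpha moves away from zero and the sublayer's contribution is blended in, which is how ReZero recovers the effect of a skip connection without adding an explicit shortcut path.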
[DLHacks]Privacy-preserving generative deep neural networks support clinical data sharing
1. Privacy-preserving generative deep neural networks support clinical data sharing
Brett K. Beaulieu-Jones, Zhiwei Steven Wu, Chris Williams,
James Brian Byrd, Casey S. Greene
2018/6/10
DL Hacks research task presentation
古賀樹
14. Bridging GANs and differential privacy
• Obtained a tighter upper bound on the privacy loss than the conventional approach (the strong composition theorem)
• The implementation relies on a theorem based on the moments (λ) of the privacy-loss distribution
Deep Learning with Differential Privacy
Martín Abadi et al.
The Moments Accountant
λ ≤ 32
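The mechanism whose cumulative privacy cost the moments accountant of Abadi et al. tracks is DP-SGD: clip each per-example gradient to a fixed norm, then add Gaussian noise to the sum. A minimal NumPy sketch of one such step (a simplified illustration under assumed parameter names `clip_norm` and `noise_multiplier`, not the paper's implementation):

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """One DP-SGD update: clip each per-example gradient to clip_norm,
    sum, add Gaussian noise scaled by noise_multiplier * clip_norm,
    and average over the batch."""
    rng = np.random.default_rng(0) if rng is None else rng
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose L2 norm exceeds clip_norm.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    summed = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=summed.shape)
    return (summed + noise) / len(per_example_grads)
```

Clipping bounds each example's influence on the update; the moments accountant then composes the per-step Gaussian-mechanism moments across training steps to obtain the tighter overall (ε, δ) bound referenced on this slide.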