Deep Transformers without Shortcuts:
Modifying Self-attention for Faithful Signal Propagation
Shohei Taniguchi, Matsuo Lab
Deep Transformers without Shortcuts
Bibliographic information
Authors
• Bobby He, James Martens, Guodong Zhang, Aleksandar Botev, Andrew Brock, Samuel L Smith, Yee Whye Teh (DeepMind)
Overview
• Modifies the Transformer so that it can be trained without layer normalization or skip connections
• ICLR 2023 accepted
Presentation outline
• Background
• Related work
• Method
• Experimental results
• Summary
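Background
Transformer
• A Transformer is a repeated stack of attention and MLP blocks
• Skip connections and layer normalization are typically applied in every module
• What role these components actually play is currently unclear
• They are treated as techniques for making training work well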
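Related work
Normalization-free Network
• For MLPs and CNNs, methods are known for training deep networks without skip connections or normalization
• Essentially, if the weights are initialized appropriately so that gradients neither vanish nor explode, normalization and similar devices are unnecessary
• The concept of dynamical isometry is especially important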
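Isometry
For an $L$-layer MLP, the Jacobian $J$ from input to output is a matrix product of per-layer factors built from the weights:
$$x^l = \phi(h^l), \qquad h^l = W^l x^{l-1} + b^l$$
$$J = \frac{\partial x^L}{\partial h^0} = \prod_{l=1}^{L} D^l W^l$$
where $D^l$ is the diagonal matrix satisfying $D^l_{ij} = \phi'(h^l_i)\,\delta_{ij}$.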
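Isometry
• If this Jacobian neither vanishes nor explodes, training should be stable
→ It suffices for the singular values of the matrix to be close to 1
$$J = \frac{\partial x^L}{\partial h^0} = \prod_{l=1}^{L} D^l W^l$$
• $W^l$ satisfies isometry when the mean of its singular values is 1
• $D^l$ is isometric if the activation function is the identity near the origin (e.g. tanh)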
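Dynamical Isometry
• Furthermore, dynamical isometry holds when all singular values are 1
• This is satisfied when the weights are orthogonal matrices
→ With orthogonal initialization, gradients neither vanish nor explode!
$$J = \frac{\partial x^L}{\partial h^0} = \prod_{l=1}^{L} D^l W^l$$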
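The effect of orthogonal initialization on the Jacobian can be checked numerically. The following is a minimal NumPy sketch, not taken from the slides: the width (64), depth (50), bias-free tanh layers, small input scale, and helper names are all illustrative assumptions. It accumulates $J = \prod_l D^l W^l$ and compares its singular values under orthogonal and Gaussian initialization.

```python
import numpy as np

def random_orthogonal(n, rng):
    # The Q factor of a QR decomposition of a Gaussian matrix is orthogonal.
    return np.linalg.qr(rng.standard_normal((n, n)))[0]

def jacobian_singular_values(weights, x0):
    # Accumulates J = D^L W^L ... D^1 W^1 for x^l = tanh(W^l x^{l-1}),
    # where D^l = diag(tanh'(h^l)) and biases are omitted (assumption).
    x = x0
    jac = np.eye(len(x0))
    for w in weights:
        h = w @ x
        d = 1.0 - np.tanh(h) ** 2          # diagonal of D^l
        jac = (d[:, None] * w) @ jac       # left-multiply by D^l W^l
        x = np.tanh(h)
    return np.linalg.svd(jac, compute_uv=False)

rng = np.random.default_rng(0)
n, depth = 64, 50
x0 = 1e-3 * rng.standard_normal(n)  # small input keeps tanh near its linear regime

orth = [random_orthogonal(n, rng) for _ in range(depth)]
gauss = [rng.standard_normal((n, n)) / np.sqrt(n) for _ in range(depth)]

sv_orth = jacobian_singular_values(orth, x0)
sv_gauss = jacobian_singular_values(gauss, x0)
print("orthogonal: min", sv_orth.min(), "max", sv_orth.max())
print("gaussian:   min", sv_gauss.min(), "max", sv_gauss.max())
```

Under these assumptions the orthogonal network keeps every singular value close to 1, while the Gaussian network spreads them over many orders of magnitude.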
Related work [1]
• Classification of CIFAR-10 with an MLP
• Orthogonal initialization + tanh converges faster than the alternatives
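Related work [2]
The CNN case
• CNNs too can be trained as deep models without normalization if they are initialized to satisfy dynamical isometry
• Orthogonally initialize only the center of the convolution kernel and set everything else to 0
• In effect, an orthogonally initialized 1x1 conv with zeros filled in around it
• The whole convolution, viewed as a matrix operation, is then also an orthogonal matrix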
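The kernel construction described above can be sketched as follows. This is an illustrative NumPy version only: the odd kernel size, equal input and output channel counts, the (H, W, C) layout, and the naive loop-based convolution are assumptions made to keep the example self-contained. It builds a kernel that is zero except for an orthogonal matrix at the spatial center and checks that stacking 100 such convolutions preserves the norm of the input.

```python
import numpy as np

def delta_orthogonal_kernel(k, channels, rng):
    # Kernel of shape (k, k, channels, channels): zero everywhere except the
    # spatial center, which holds an orthogonal channel-mixing matrix.
    q = np.linalg.qr(rng.standard_normal((channels, channels)))[0]
    kernel = np.zeros((k, k, channels, channels))
    kernel[k // 2, k // 2] = q
    return kernel

def conv2d_same(x, kernel):
    # Naive zero-padded "same" cross-correlation; x has shape (H, W, C_in).
    k = kernel.shape[0]
    p = k // 2
    height, width, _ = x.shape
    xp = np.pad(x, ((p, p), (p, p), (0, 0)))
    out = np.zeros((height, width, kernel.shape[3]))
    for di in range(k):
        for dj in range(k):
            out += xp[di:di + height, dj:dj + width] @ kernel[di, dj]
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 16))
kernel = delta_orthogonal_kernel(3, 16, rng)

y = x
for _ in range(100):                 # 100 stacked linear conv layers
    y = conv2d_same(y, kernel)

ratio = np.linalg.norm(y) / np.linalg.norm(x)
print("output norm / input norm:", ratio)
```

Because only the center tap is non-zero, each layer applies the same orthogonal matrix to every pixel, so the signal norm is unchanged regardless of depth.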
Related work [2]
The CNN case
• MNIST is trained with a 4,000-layer CNN
• No normalization or skip connections are used
• Training is faster than with Gaussian initialization
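Related work [2]
The CNN case
• Models of various depths are trained on MNIST and CIFAR-10
• Training still works with up to 10,000 layers
• However, test accuracy saturates on CIFAR-10
→ Suggests that normalization and skip connections contribute to generalization more than to training stability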
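Related work [3]
ReZero
• Even when skip connections are used, initializing to satisfy dynamical isometry looks likely to improve performance further
$$x_{i+1} = x_i + \alpha_i F(x_i)$$
• Normally $\alpha_i = 1$, but here $\alpha_i$ is initialized to 0 and made a learnable parameter
• At initialization $x_{i+1} = x_i$, so dynamical isometry clearly holds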
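A minimal sketch of the update $x_{i+1} = x_i + \alpha_i F(x_i)$, assuming for illustration that $F$ is a single tanh layer with Gaussian weights (the slides do not specify $F$; the sizes and names below are hypothetical). It shows that with $\alpha_i = 0$ the whole stack is the identity map at initialization.

```python
import numpy as np

def residual_stack(x, weights, alphas):
    # x_{i+1} = x_i + alpha_i * F(x_i), with F(x) = tanh(W x) as a stand-in.
    for w, alpha in zip(weights, alphas):
        x = x + alpha * np.tanh(w @ x)
    return x

rng = np.random.default_rng(0)
n, depth = 32, 64
weights = [rng.standard_normal((n, n)) / np.sqrt(n) for _ in range(depth)]
x0 = rng.standard_normal(n)

out_rezero = residual_stack(x0, weights, np.zeros(depth))  # alpha_i = 0 at init
out_plain = residual_stack(x0, weights, np.ones(depth))    # usual alpha_i = 1

print("input norm:          ", np.linalg.norm(x0))
print("ReZero output norm:  ", np.linalg.norm(out_rezero))
print("alpha = 1 output norm:", np.linalg.norm(out_plain))
```

In training, each $\alpha_i$ would be a learnable scalar that moves away from 0; here they are fixed so that only the state at initialization is shown.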
Related work [3]
ReZero
• A 32-layer MLP is trained on CIFAR-10
• Training becomes considerably faster even without normalization
Related work [3]
ReZero
• A ResNet is trained on CIFAR-10
• Training is faster and performance also improves
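Related work [4]
The ReLU case
• With ReLU, dynamical isometry can be satisfied by flipping the sign of part of the orthogonal weights
• Intuitively, ReLU cuts every negative input signal to 0, so the signs are flipped to cancel that out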
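One concrete way to realize this sign-flip idea is the block structure $W = \begin{pmatrix} W_0 & -W_0 \\ -W_0 & W_0 \end{pmatrix}$ with orthogonal $W_0$, applied to a mirrored signal $(u, -u)$. The sketch below is an assumption about how the idea can be implemented, not a transcription of [4]; the bias-free layers, sizes, and names are illustrative. Since $\mathrm{ReLU}(u) - \mathrm{ReLU}(-u) = u$, the ReLU network propagates the signal exactly like the linear network built from the $W_0$ matrices.

```python
import numpy as np

def sign_flipped_weight(w0):
    # Block matrix [[W0, -W0], [-W0, W0]]: half of the blocks carry a flipped sign.
    return np.block([[w0, -w0], [-w0, w0]])

rng = np.random.default_rng(0)
n, depth = 16, 100
u0 = rng.standard_normal(n)

h = np.concatenate([u0, -u0])   # mirrored pre-activation (u, -u)
u = u0.copy()                   # reference: the same depth without any ReLU
for _ in range(depth):
    w0 = np.linalg.qr(rng.standard_normal((n, n)))[0]   # orthogonal W0
    h = sign_flipped_weight(w0) @ np.maximum(h, 0.0)    # ReLU, then linear layer
    u = w0 @ u

print("input norm:", np.linalg.norm(u0))
print("norm after", depth, "ReLU layers:", np.linalg.norm(h[:n]))
```

The first half of `h` tracks the linear signal `u` and the second half tracks `-u`, so the information zeroed by ReLU in one half is carried by the other.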
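Related work [5]
Rank collapse in Transformers
• For an attention-only Transformer without MLPs, skip connections, or LayerNorm, it can also be shown theoretically that, already at initialization, the matrix of the whole model loses rank exponentially in the number of layers
• Suggests that a Transformer cannot be trained with attention alone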
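Rank collapse is easy to observe numerically. The sketch below is illustrative only: it assumes single-head, unmasked softmax attention with Gaussian query and key weights, orthogonal value weights, and small hypothetical sizes. It tracks how far the features are from a matrix whose rows are all identical (rank at most 1) as attention-only layers are stacked.

```python
import numpy as np

def softmax_rows(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def relative_residual(x):
    # Frobenius distance from X to the closest matrix with identical rows,
    # relative to the norm of X. A value of 0 means every token is the same.
    return np.linalg.norm(x - x.mean(axis=0, keepdims=True)) / np.linalg.norm(x)

rng = np.random.default_rng(0)
T, d, depth = 8, 16, 20
x = rng.standard_normal((T, d))

residuals = [relative_residual(x)]
for _ in range(depth):
    w_q = rng.standard_normal((d, d)) / np.sqrt(d)
    w_k = rng.standard_normal((d, d)) / np.sqrt(d)
    w_v = np.linalg.qr(rng.standard_normal((d, d)))[0]   # orthogonal value weights
    attn = softmax_rows((x @ w_q) @ (x @ w_k).T / np.sqrt(d))
    x = attn @ x @ w_v                                    # attention only, no skip or MLP
    residuals.append(relative_residual(x))

print("relative residual per layer:", residuals)
```

Each layer replaces every token by a weighted average of all tokens, so the token representations are pulled toward one another until they are numerically indistinguishable.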
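Deep Transformers without Shortcuts
• Can Transformers also be trained without normalization or skip connections?
→ Yes, with some effort
• Simply removing normalization and skip connections makes the gradients explode
• The proposed method suppresses this considerably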
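Deep Transformers without Shortcuts
• This paper targets causal masked attention as used in GPT-style models
• A mask $M$ prevents attention to future positions in the sequence
$$\mathrm{Attn}(X) = A(X)\,V(X)$$
$$A(X) = \mathrm{softmax}\left( M \circ \frac{1}{\sqrt{d_k}} Q(X) K(X)^\top - \Gamma\,(1 - M) \right)$$
$$M_{i,j} = \mathbb{1}_{i \ge j}$$
$\Gamma$ is a sufficiently large constant.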
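The masked softmax above can be written directly in NumPy. This is a minimal single-head sketch; the value $\Gamma = 10^9$, the shapes, and the function name are illustrative assumptions.

```python
import numpy as np

def causal_attention_matrix(q, k, gamma=1e9):
    # A = softmax(M * Q K^T / sqrt(d_k) - Gamma * (1 - M)), M_ij = 1 if i >= j else 0.
    t, d_k = q.shape
    mask = np.tril(np.ones((t, t)))
    logits = mask * (q @ k.T) / np.sqrt(d_k) - gamma * (1.0 - mask)
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(logits)                               # masked entries underflow to 0
    return weights / weights.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
T, d_k = 5, 4
q, k, v = (rng.standard_normal((T, d_k)) for _ in range(3))

attn_matrix = causal_attention_matrix(q, k)
out = attn_matrix @ v          # Attn(X) = A(X) V(X)
print(np.round(attn_matrix, 3))
```

The printed matrix is lower-triangular with rows summing to 1, so position $i$ only mixes values from positions $j \le i$.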
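Deep Transformers without Shortcuts
• First consider an attention-only model without MLPs; the features at layer $L$ are
$$X_L = \left[ A_L A_{L-1} \cdots A_1 \right] X_0 W, \qquad W = \prod_{l=1}^{L} W^V_l W^O_l$$
• Defining $\Sigma_l = X_l X_l^\top$ and $\Pi_l = A_l A_{l-1} \cdots A_1$, when $W$ is an orthogonal matrix
$$\Sigma_l = \Pi_l \cdot \Sigma_0 \cdot \Pi_l^\top$$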
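The identity $\Sigma_l = \Pi_l \Sigma_0 \Pi_l^\top$ follows because an orthogonal $W$ cancels in $X_l X_l^\top$. The NumPy sketch below checks it numerically, under the simplifying assumptions that each $A_l$ is a fixed (input-independent) lower-triangular row-stochastic matrix and that each product $W^V_l W^O_l$ is replaced by a single random orthogonal matrix; sizes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d, depth = 6, 8, 5
x0 = rng.standard_normal((T, d))

x, pi = x0, np.eye(T)
for _ in range(depth):
    a = np.tril(rng.random((T, T)))
    a /= a.sum(axis=1, keepdims=True)                   # causal, rows sum to 1
    w = np.linalg.qr(rng.standard_normal((d, d)))[0]    # stands in for W^V_l W^O_l
    x = a @ x @ w                                       # X_l = A_l X_{l-1} W^V_l W^O_l
    pi = a @ pi                                         # Pi_l = A_l ... A_1

sigma_0 = x0 @ x0.T
sigma_l = x @ x.T
max_error = np.abs(sigma_l - pi @ sigma_0 @ pi.T).max()
print("max |Sigma_l - Pi_l Sigma_0 Pi_l^T| =", max_error)
```

The value weights drop out of the kernel entirely, which is why the analysis can focus on the product of attention matrices.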
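Deep Transformers without Shortcuts
• Defining $\Sigma_l = X_l X_l^\top$ and $\Pi_l = A_l A_{l-1} \cdots A_1$, when $W$ is an orthogonal matrix
$$\Sigma_l = \Pi_l \cdot \Sigma_0 \cdot \Pi_l^\top$$
• If $\Sigma_l$ is close to the identity matrix, the gradients are stable
→ We want to design $A_l$ so that this happens
• However, $A_l$ is constrained to be a lower-triangular matrix with non-negative entries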
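Deep Transformers without Shortcuts
• Setting $A_l = L_l L_{l-1}^{-1}$, the product telescopes to $\Pi_l = L_l L_0^{-1}$, so provided $L_0^{-1} \Sigma_0 L_0^{-1\top} = I_T$ holds,
$$\Sigma_l = L_l L_l^\top$$
• This corresponds to a Cholesky decomposition
→ Design a suitable $\Sigma_l$ and compute its Cholesky factor $L_l$; this yields an $A_l$ that satisfies the conditions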
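The construction above can be sketched in a few lines of NumPy. The target kernels here are arbitrary random positive-definite matrices chosen only to exercise the algebra (an assumption for illustration, not the kernels used in the paper), and the function name is hypothetical.

```python
import numpy as np

def attention_from_kernels(sigmas):
    # Given targets Sigma_0..Sigma_L, return A_l = L_l L_{l-1}^{-1} for l = 1..L,
    # where Sigma_l = L_l L_l^T is the (lower-triangular) Cholesky factorization.
    chols = [np.linalg.cholesky(s) for s in sigmas]
    # solve(prev^T, cur^T) = prev^{-T} cur^T, whose transpose is cur prev^{-1}.
    return [np.linalg.solve(prev.T, cur.T).T for prev, cur in zip(chols[:-1], chols[1:])]

rng = np.random.default_rng(0)
T, depth = 5, 4
sigmas = []
for _ in range(depth + 1):
    g = rng.standard_normal((T, T))
    sigmas.append(g @ g.T + T * np.eye(T))   # arbitrary symmetric positive-definite target

attn = attention_from_kernels(sigmas)

pi = np.eye(T)
for a in attn:
    pi = a @ pi                               # Pi_L = A_L ... A_1
recovered = pi @ sigmas[0] @ pi.T             # should reproduce Sigma_L
print("max error:", np.abs(recovered - sigmas[-1]).max())
print("smallest entry of any A_l:", min(a.min() for a in attn))
```

Every $A_l$ built this way is lower-triangular, but for arbitrary targets like these nothing guarantees non-negative entries, which is why the choice of $\Sigma_l$ matters.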
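Deep Transformers without Shortcuts
U-SPA
• A matrix with 1 on the diagonal and $\rho_l$ everywhere else
$$\Sigma_l(\rho_l) = (1 - \rho_l)\, I_T + \rho_l \mathbf{1}\mathbf{1}^\top$$
• The conditions are satisfied if $0 \le \rho_0 \le \rho_1 \le \cdots \le \rho_L < 1$
• Rank collapse is also prevented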
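Deep Transformers without Shortcuts
E-SPA
• A matrix with 1 on the diagonal whose other entries are determined by the distance from the diagonal
$$\left(\Sigma_l(\gamma_l)\right)_{i,j} = \exp\left(-\gamma_l\, |i - j|\right)$$
• The conditions are satisfied if $\gamma_0 \ge \gamma_1 \ge \cdots \ge \gamma_L > 0$
• Rank collapse is also prevented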
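Both kernel families can be checked numerically against the constraint that $A_l = L_l L_{l-1}^{-1}$ be lower-triangular with non-negative entries. The sketch below is illustrative: the sequence length and the particular $\rho_l$ and $\gamma_l$ schedules are arbitrary choices that satisfy the stated orderings, and the function names are hypothetical.

```python
import numpy as np

def uspa_kernel(T, rho):
    # U-SPA: (1 - rho) I_T + rho 1 1^T, i.e. ones on the diagonal and rho elsewhere.
    return (1.0 - rho) * np.eye(T) + rho * np.ones((T, T))

def espa_kernel(T, gamma):
    # E-SPA: entry (i, j) is exp(-gamma * |i - j|).
    idx = np.arange(T)
    return np.exp(-gamma * np.abs(idx[:, None] - idx[None, :]))

def attention_factors(kernels):
    # A_l = L_l L_{l-1}^{-1}, with L_l the Cholesky factor of the l-th kernel.
    chols = [np.linalg.cholesky(s) for s in kernels]
    return [np.linalg.solve(prev.T, cur.T).T for prev, cur in zip(chols[:-1], chols[1:])]

T = 6
rhos = [0.0, 0.2, 0.4, 0.6, 0.8]      # non-decreasing, last value < 1
gammas = [4.0, 2.0, 1.0, 0.5, 0.25]   # non-increasing, last value > 0

results = {}
for name, kernels in [("U-SPA", [uspa_kernel(T, r) for r in rhos]),
                      ("E-SPA", [espa_kernel(T, g) for g in gammas])]:
    attn = attention_factors(kernels)
    results[name] = {
        "min_entry": min(a.min() for a in attn),
        "max_above_diagonal": max(np.abs(np.triu(a, 1)).max() for a in attn),
        "rank_of_last_kernel": int(np.linalg.matrix_rank(kernels[-1])),
    }
    print(name, results[name])
```

With these schedules the factors have no negative entries beyond rounding error, nothing above the diagonal, and the final kernel keeps full rank, matching the three bullet points for each family.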
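Deep Transformers without Shortcuts
Redefining attention
• The $A$ obtained by working backwards from the $\Sigma$ above is decomposed as $A = DP$
• $D$ is a positive diagonal matrix and $P$ is a lower-triangular matrix whose rows each sum to 1
• Setting $B = \log(P)$, attention is redefined as follows
$$\mathrm{Attn}(X) = D\,P(X)\,V(X), \qquad P(X) = \mathrm{softmax}\left( M \circ \left[ \frac{1}{\sqrt{d_k}} Q(X) K(X)^\top + B \right] - \Gamma\,(1 - M) \right)$$
• Initializing the weights $W^Q$ of $Q(X)$ to 0 makes $\Sigma$ take the desired form at initialization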
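The redefinition can be checked end to end: build a target $A$, split it into $D$ and $P$, and confirm that the modified softmax with $W^Q = 0$ reproduces $A$. The sketch below assumes a single U-SPA step from $\rho = 0.2$ to $\rho = 0.5$, a small sequence length, and $\Gamma = 10^9$; all of these values and the variable names are illustrative.

```python
import numpy as np

def uspa_kernel(T, rho):
    return (1.0 - rho) * np.eye(T) + rho * np.ones((T, T))

T, d_k = 5, 4
l_prev = np.linalg.cholesky(uspa_kernel(T, 0.2))
l_cur = np.linalg.cholesky(uspa_kernel(T, 0.5))
a = np.linalg.solve(l_prev.T, l_cur.T).T          # target A = L_l L_{l-1}^{-1}

d = a.sum(axis=1)                                  # D: positive diagonal (row sums of A)
p = a / d[:, None]                                 # P: lower-triangular, rows sum to 1
mask = np.tril(np.ones((T, T)))
b = np.log(np.where(mask > 0, p, 1.0))             # B = log(P) on the causal part, 0 elsewhere

rng = np.random.default_rng(0)
x = rng.standard_normal((T, d_k))
w_q = np.zeros((d_k, d_k))                         # W^Q initialized to zero
w_k = rng.standard_normal((d_k, d_k))
gamma = 1e9

logits = mask * ((x @ w_q) @ (x @ w_k).T / np.sqrt(d_k) + b) - gamma * (1.0 - mask)
logits = logits - logits.max(axis=1, keepdims=True)
p_x = np.exp(logits)
p_x /= p_x.sum(axis=1, keepdims=True)              # P(X) at initialization
a_init = d[:, None] * p_x                          # D P(X)

print("max |D P(X) - A| =", np.abs(a_init - a).max())
```

With $W^Q = 0$ the query-key term vanishes, so the softmax sees only the bias $B$ and returns $P$ exactly; once $W^Q$ is trained, the attention pattern becomes input-dependent again.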
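Experiments
WikiText-103
• A 36-layer Transformer is trained
• Naively removing the skip connections gives a model that does not train at all
• The proposed method trains properly
• However, training is considerably slower than the standard model with skip connections + LN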
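Experiments
C4 dataset
• A 32-layer Transformer is trained
• With longer training, the model reaches the performance of the skip + LN baseline
• It takes roughly 5 times as long
• In Transformers, do skip connections and LN contribute to faster training?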
Experiments
Experiments on the C4 dataset
• With skip connections added, the proposed method beats the skip + LN baseline
• So are skip connections extremely important for Transformers after all?
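Summary
• For MLPs and CNNs, deep networks can be trained without normalization or skip connections if they are initialized to satisfy dynamical isometry
• For Transformers too, careful initialization of the same kind was found to allow training without skip connections or LN
• However, training takes considerably longer
Impressions
• It does feel somewhat forced
• In the end it should come down to making attention close to the identity matrix at initialization
• It feels like a simpler method should exist
• It is not well understood where the slowdown in training comes from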
References
[1] Pennington, Jeffrey, Samuel Schoenholz, and Surya Ganguli. "Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice." Advances in Neural Information Processing Systems 30 (2017).
[2] Xiao, Lechao, et al. "Dynamical isometry and a mean field theory of CNNs: How to train 10,000-layer vanilla convolutional neural networks." International Conference on Machine Learning. PMLR, 2018.
[3] Bachlechner, Thomas, et al. "ReZero is all you need: Fast convergence at large depth." Uncertainty in Artificial Intelligence. PMLR, 2021.
[4] Burkholz, Rebekka, and Alina Dubatovka. "Initialization of ReLUs for dynamical isometry." Advances in Neural Information Processing Systems 32 (2019).
[5] Dong, Yihe, Jean-Baptiste Cordonnier, and Andreas Loukas. "Attention is not all you need: Pure attention loses rank doubly exponentially with depth." International Conference on Machine Learning. PMLR, 2021.
[6] He, Bobby, et al. "Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation." The Eleventh International Conference on Learning Representations. 2023.

[DL Reading Group] Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation