Deep Transformers without Shortcuts:
Modifying Self-attention for Faithful Signal Propagation
Shohei Taniguchi, Matsuo Lab
Deep Transformers without Shortcuts
Bibliographic information
Authors
• Bobby He, James Martens, Guodong Zhang, Aleksandar Botev, Andrew Brock, Samuel L Smith, Yee Whye Teh (DeepMind)
Summary
• Modifies the Transformer so that it can be trained without layer normalization or skip connections
• Accepted at ICLR 2023
Outline
• Background
• Related work
• Method
• Experimental results
• Summary
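Background
Transformer
• A Transformer is a repeated stack of attention and MLP blocks
• It is standard to apply a skip connection and layer normalization in each module
• What role these components actually play is currently unclear
• They are regarded as tricks that make training work well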
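Related work
Normalization-free Network
• For MLPs and CNNs, methods are known for training deep networks without skip connections or normalization
• Essentially, if the weights are initialized so that gradients neither vanish nor explode, normalization is not needed
• The concept of dynamic isometry is especially important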
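Isometry
Consider an $L$-layer MLP. The input-output Jacobian $J$ is a matrix product over the layers:
$$x^l = \phi(h^l), \quad h^l = W^l x^{l-1} + b^l$$
$$J = \frac{\partial x^L}{\partial h^0} = \prod_{l=1}^{L} D^l W^l$$
where $D^l$ is the diagonal matrix with entries $D^l_{ij} = \phi'(h^l_i)\,\delta_{ij}$.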
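Isometry
• If this Jacobian neither vanishes nor explodes, training should be stable
→ It suffices for the singular values of the matrix to be close to 1
• $W^l$ satisfies isometry when the mean of its singular values is 1
• $D^l$ is isometric when the activation function is the identity near the origin (e.g. tanh)
$$J = \frac{\partial x^L}{\partial h^0} = \prod_{l=1}^{L} D^l W^l$$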
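Dynamic Isometry
• Furthermore, when all singular values equal 1, dynamic isometry is satisfied
• This holds when the weights are orthogonal matrices
→ With orthogonal initialization, gradients neither vanish nor explode!
$$J = \frac{\partial x^L}{\partial h^0} = \prod_{l=1}^{L} D^l W^l$$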
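The claim about orthogonal weights can be checked numerically. The following is a minimal NumPy sketch, not code from the paper or the slides: the width, depth, input scale, and helper names (`orthogonal`, `jacobian_singular_values`) are assumptions chosen for illustration. It accumulates $J = \prod_l D^l W^l$ along a forward pass and compares orthogonal and Gaussian initialization.

```python
import numpy as np

rng = np.random.default_rng(0)
n, depth = 32, 50  # hypothetical width and depth

def orthogonal(n, rng):
    # QR of a Gaussian matrix yields an orthogonal Q; the sign fix keeps it well distributed.
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

def jacobian_singular_values(weights, x0, phi, dphi):
    """Singular values of J = prod_l D^l W^l along the forward pass from x0 (biases omitted)."""
    J = np.eye(x0.shape[0])
    x = x0
    for W in weights:
        h = W @ x
        J = np.diag(dphi(h)) @ W @ J  # left-multiply by D^l W^l
        x = phi(h)
    return np.linalg.svd(J, compute_uv=False)

identity = (lambda h: h, lambda h: np.ones_like(h))
tanh = (np.tanh, lambda h: 1.0 - np.tanh(h) ** 2)

x0 = 0.01 * rng.standard_normal(n)  # small input keeps tanh near its linear regime
orth_weights = [orthogonal(n, rng) for _ in range(depth)]
gauss_weights = [rng.standard_normal((n, n)) / np.sqrt(n) for _ in range(depth)]

sv_orth_linear = jacobian_singular_values(orth_weights, x0, *identity)
sv_orth_tanh = jacobian_singular_values(orth_weights, x0, *tanh)
sv_gauss_linear = jacobian_singular_values(gauss_weights, x0, *identity)

print("orthogonal, linear: min/max =", sv_orth_linear.min(), sv_orth_linear.max())
print("orthogonal, tanh:   min/max =", sv_orth_tanh.min(), sv_orth_tanh.max())
print("Gaussian, linear:   min/max =", sv_gauss_linear.min(), sv_gauss_linear.max())
```

With orthogonal weights every singular value stays at 1 in the linear case and close to 1 with tanh on small inputs, while the Gaussian product spreads its singular values over many orders of magnitude.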
Related work [1]
• Classification on CIFAR-10 with an MLP
• Orthogonal initialization + tanh converges faster than the alternatives
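Related work [2]
The CNN case
• CNNs can also be trained as deep models without normalization if they are initialized to satisfy dynamic isometry
• Only the center of the convolution kernel is orthogonally initialized; everything else is initialized to 0
• In effect, a 1x1 conv is orthogonally initialized and zero-padded around it
• The convolution as a whole is then also an orthogonal matrix when viewed as a single matrix operation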
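The kernel construction described above can be sketched as follows. This is an illustrative NumPy version under stated assumptions: equal input and output channel counts, circular padding, and hypothetical helper names (`delta_orthogonal_kernel`, `circular_conv2d`). It is not the initialization code from [2].

```python
import numpy as np

rng = np.random.default_rng(0)

def orthogonal(n, rng):
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

def delta_orthogonal_kernel(k, channels, rng):
    """k x k kernel that is zero everywhere except an orthogonal matrix at the center tap."""
    kernel = np.zeros((k, k, channels, channels))
    kernel[k // 2, k // 2] = orthogonal(channels, rng)
    return kernel

def circular_conv2d(x, kernel):
    """Cross-correlation with circular padding. x: (H, W, C_in), kernel: (k, k, C_in, C_out)."""
    k = kernel.shape[0]
    center = k // 2
    out = np.zeros(x.shape[:2] + (kernel.shape[3],))
    for di in range(k):
        for dj in range(k):
            # np.roll with shift s gives x[i - s], so this reads x[i + di - center, j + dj - center].
            shifted = np.roll(x, shift=(center - di, center - dj), axis=(0, 1))
            out += shifted @ kernel[di, dj]
    return out

kernel = delta_orthogonal_kernel(k=3, channels=4, rng=rng)
x = rng.standard_normal((5, 5, 4))
y = circular_conv2d(x, kernel)

print("input norm: ", np.linalg.norm(x))
print("output norm:", np.linalg.norm(y))
```

Because only the center tap is non-zero, the layer acts as the same orthogonal map at every pixel, so the norm of the whole feature map is preserved.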
Related work [2]
The CNN case
• Trains a 4,000-layer CNN on MNIST
• No normalization or skip connections are used
• Training is faster than with Gaussian initialization
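Related work [2]
The CNN case
• Trains models of various depths on MNIST and CIFAR-10
• Training still works when the depth is increased to 10,000 layers
• However, test accuracy saturates on CIFAR-10
→ This suggests that normalization and skip connections contribute to generalization more than to training stability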
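Related work [3]
ReZero
• Even with skip connections, initializing to satisfy dynamic isometry may improve performance further
$$x_{i+1} = x_i + \alpha_i F(x_i)$$
• Normally $\alpha_i = 1$, but ReZero initializes $\alpha_i = 0$ and makes $\alpha_i$ a learnable parameter
• At initialization $x_{i+1} = x_i$, so dynamic isometry clearly holds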
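A tiny sketch of the ReZero update makes the initialization argument concrete. The residual branch $F$ below is a hypothetical tanh layer and the sizes are arbitrary; only the update rule $x_{i+1} = x_i + \alpha_i F(x_i)$ comes from the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
n, depth = 16, 64  # hypothetical width and depth

def rezero_forward(x, weights, alphas):
    # x_{i+1} = x_i + alpha_i * F(x_i), with F(x) = tanh(W x) as a stand-in residual branch
    for W, alpha in zip(weights, alphas):
        x = x + alpha * np.tanh(W @ x)
    return x

weights = [rng.standard_normal((n, n)) for _ in range(depth)]
x0 = rng.standard_normal(n)

out_rezero = rezero_forward(x0, weights, np.zeros(depth))  # alpha_i = 0 (ReZero init)
out_plain = rezero_forward(x0, weights, np.ones(depth))    # alpha_i = 1 (standard residual)

print("ReZero init is the identity map:", np.array_equal(out_rezero, x0))
print("norm with alpha = 1:", np.linalg.norm(out_plain), "vs input norm:", np.linalg.norm(x0))
```

With every $\alpha_i = 0$ the network is exactly the identity at initialization, so its Jacobian is the identity matrix regardless of depth or of how the branch weights are initialized.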
Related work [3]
ReZero
• Trains a 32-layer MLP on CIFAR-10
• Training becomes considerably faster even without normalization
Related work [3]
ReZero
• Trains a ResNet on CIFAR-10
• Training becomes faster and performance also improves
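Related work [4]
The ReLU case
• With ReLU, dynamic isometry can be satisfied by flipping the sign of part of the orthogonal weights
• Intuitively, ReLU cuts every negative input signal to 0, so the signs are flipped in a way that cancels this out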
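One way to see how sign flipping can cancel the ReLU cut-off is a mirrored block structure, sketched below. This is an illustration under assumptions: the block layout `[[W0, -W0], [-W0, W0]]`, the sizes, and the helper names are chosen here and may differ in detail from the scheme in [4].

```python
import numpy as np

rng = np.random.default_rng(0)
n, depth = 8, 30  # hypothetical half-width and depth

def orthogonal(n, rng):
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

def relu(h):
    return np.maximum(h, 0.0)

blocks = [orthogonal(n, rng) for _ in range(depth)]
x = rng.standard_normal(n)

# First layer: mirrored pre-activation [u; -u] with u = W_0 x.
h = np.concatenate([blocks[0] @ x, -blocks[0] @ x])
for W0 in blocks[1:]:
    # Sign-flipped copies of one orthogonal block.
    W = np.block([[W0, -W0], [-W0, W0]])
    # relu(h) = [relu(u); relu(-u)], and relu(u) - relu(-u) = u, so the ReLU cut-off cancels.
    h = W @ relu(h)

# Reference: the same orthogonal blocks applied as a purely linear network.
linear = x
for W0 in blocks:
    linear = W0 @ linear

print("max deviation from linear network:", np.abs(h[:n] - linear).max())
print("norm preserved:", np.linalg.norm(h[:n]), "vs", np.linalg.norm(x))
```

The top half of the pre-activations matches the linear orthogonal network at every depth, so the signal norm is preserved even though a ReLU is applied at each layer.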
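Related work [5]
Rank collapse in Transformers
• For an attention-only Transformer without MLPs, skip connections, or LayerNorm, it can be shown theoretically that at initialization the matrix of the whole model loses rank exponentially in the number of layers
• This suggests that a Transformer cannot be trained with attention alone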
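The collapse can be illustrated with a simplified experiment: multiply independent random row-stochastic matrices and measure how far the product is from a matrix whose rows are all identical (rank one). This sketch is a simplification made here for illustration. It uses fixed random attention matrices rather than input-dependent attention, so it does not reproduce the analysis in [5].

```python
import numpy as np

rng = np.random.default_rng(0)
T = 8  # hypothetical sequence length

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def rank_one_residual(P):
    # Relative Frobenius distance from P to the closest matrix with identical rows.
    return np.linalg.norm(P - P.mean(axis=0, keepdims=True)) / np.linalg.norm(P)

product = np.eye(T)
residuals = []
for _ in range(100):
    A = softmax(rng.standard_normal((T, T)))  # random row-stochastic "attention" matrix
    product = A @ product
    residuals.append(rank_one_residual(product))

for depth in (1, 5, 10, 20, 100):
    print(f"depth {depth:3d}: residual = {residuals[depth - 1]:.3e}")
```

The residual shrinks geometrically with depth, meaning every token ends up with nearly the same representation.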
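Deep Transformers without Shortcuts
• Can Transformers also be trained without normalization or skip connections?
→ Yes, with some effort
• Naively removing normalization and skip connections makes the gradients explode
• The proposed method suppresses this considerably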
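Deep Transformers without Shortcuts
• This paper targets causal masked attention, as used in GPT-style models
• The mask $M$ prevents attention to future positions in the sequence
$$\mathrm{Attn}(X) = A(X)V(X)$$
$$A(X) = \mathrm{softmax}\left(M \circ \frac{1}{\sqrt{d_k}} Q(X)K(X)^\top - \Gamma(1 - M)\right)$$
$$M_{i,j} = \mathbb{1}_{i \ge j}$$
where $\Gamma$ is a sufficiently large constant.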
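The masked attention matrix above can be written directly in NumPy. This is a minimal sketch: the function names, the sizes, and the value `gamma=1e9` standing in for the large constant $\Gamma$ are assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def causal_attention_matrix(Q, K, gamma=1e9):
    """A(X) = softmax(M * Q K^T / sqrt(d_k) - gamma * (1 - M)) with M[i, j] = 1 if i >= j."""
    T, d_k = Q.shape
    M = np.tril(np.ones((T, T)))
    logits = M * (Q @ K.T) / np.sqrt(d_k) - gamma * (1.0 - M)
    return softmax(logits)

rng = np.random.default_rng(0)
T, d_k = 5, 4
Q = rng.standard_normal((T, d_k))
K = rng.standard_normal((T, d_k))
A = causal_attention_matrix(Q, K)

print(np.round(A, 3))
```

Each row of `A` sums to 1 and has zeros above the diagonal, so position $i$ attends only to positions $j \le i$. The first row is always $[1, 0, \dots, 0]$.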
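Deep Transformers without Shortcuts
• First, consider an attention-only model without MLPs. The features at layer $L$ are
$$X_L = [A_L A_{L-1} \cdots A_1]\, X_0 W, \quad W = \prod_{l=1}^{L} W^V_l W^O_l$$
• Defining $\Sigma_l = X_l X_l^\top$ and $\Pi_l = A_l A_{l-1} \cdots A_1$, when $W$ is orthogonal we have
$$\Sigma_l = \Pi_l \cdot \Sigma_0 \cdot \Pi_l^\top$$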
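The covariance identity follows in one line from the orthogonality of $W$. A short derivation, assuming $W$ here denotes the product of value and output weights up to layer $l$ so that $X_l = \Pi_l X_0 W$:

```latex
\begin{aligned}
\Sigma_l &= X_l X_l^\top
          = (\Pi_l X_0 W)(\Pi_l X_0 W)^\top \\
         &= \Pi_l X_0 \,(W W^\top)\, X_0^\top \Pi_l^\top
          = \Pi_l \,(X_0 X_0^\top)\, \Pi_l^\top
          = \Pi_l \,\Sigma_0\, \Pi_l^\top ,
\qquad \text{since } W W^\top = I .
\end{aligned}
```

The value and output weights therefore drop out of the kernel, and only the attention matrices determine how $\Sigma_l$ evolves with depth.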
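Deep Transformers without Shortcuts
• Defining $\Sigma_l = X_l X_l^\top$ and $\Pi_l = A_l A_{l-1} \cdots A_1$, when $W$ is orthogonal we have
$$\Sigma_l = \Pi_l \cdot \Sigma_0 \cdot \Pi_l^\top$$
• If $\Sigma_l$ is close to the identity matrix, the gradients are stable
→ We want to design $A_l$ so that this happens
• However, $A_l$ is constrained to be a lower-triangular matrix with non-negative entries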
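Deep Transformers without Shortcuts
• Setting $A_l = L_l L_{l-1}^{-1}$, and provided that $L_0^{-1} \Sigma_0 L_0^{-1\top} = I_T$ holds, we get
$$\Sigma_l = L_l L_l^\top$$
• This corresponds to a Cholesky decomposition
→ Design a suitable $\Sigma_l$ and compute its Cholesky factor $L_l$ to obtain an $A_l$ that satisfies the condition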
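The step from $A_l = L_l L_{l-1}^{-1}$ to $\Sigma_l = L_l L_l^\top$ is a telescoping product. A brief derivation using only the definitions already introduced:

```latex
\begin{aligned}
\Pi_l &= A_l A_{l-1} \cdots A_1
       = (L_l L_{l-1}^{-1})(L_{l-1} L_{l-2}^{-1}) \cdots (L_1 L_0^{-1})
       = L_l L_0^{-1}, \\
\Sigma_l &= \Pi_l \,\Sigma_0\, \Pi_l^\top
          = L_l \,\bigl(L_0^{-1} \Sigma_0 L_0^{-1\top}\bigr)\, L_l^\top
          = L_l \, I_T \, L_l^\top
          = L_l L_l^\top .
\end{aligned}
```

So once a target sequence of kernels $\Sigma_l$ is chosen, the attention matrices that realize it are fixed by the Cholesky factors of consecutive kernels.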
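Deep Transformers without Shortcuts
U-SPA
• A matrix with 1 on the diagonal and $\rho_l$ everywhere else
$$\Sigma_l(\rho_l) = (1 - \rho_l)\, I_T + \rho_l \mathbf{1}\mathbf{1}^\top$$
• The condition is satisfied if $0 \le \rho_0 \le \rho_1 \le \cdots \le \rho_L < 1$
• This also prevents rank collapse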
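Deep Transformers without Shortcuts
E-SPA
• A matrix with 1 on the diagonal and off-diagonal values determined by the distance from the diagonal
$$\left(\Sigma_l(\gamma_l)\right)_{i,j} = \exp\left(-\gamma_l\, |i - j|\right)$$
• The condition is satisfied if $\gamma_0 \ge \gamma_1 \ge \cdots \ge \gamma_L > 0$
• This also prevents rank collapse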
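Both kernel families can be checked numerically against the constraints on $A_l$. The sketch below is illustrative only: the sequence length, the particular $\rho_l$ and $\gamma_l$ schedules, and the helper names are assumptions. It builds $A_l = L_l L_{l-1}^{-1}$ from Cholesky factors and tests that each $A_l$ is lower triangular and non-negative, that the product of the $A_l$ reproduces $\Sigma_L$, and that $\Sigma_L$ keeps full rank.

```python
import numpy as np

T = 6  # hypothetical sequence length

def uspa(rho, T):
    # U-SPA kernel: 1 on the diagonal, rho elsewhere.
    return (1.0 - rho) * np.eye(T) + rho * np.ones((T, T))

def espa(gamma, T):
    # E-SPA kernel: entries decay exponentially with distance from the diagonal.
    idx = np.arange(T)
    return np.exp(-gamma * np.abs(idx[:, None] - idx[None, :]))

def attention_from_kernels(sigmas):
    """A_l = L_l L_{l-1}^{-1}, where L_l is the lower Cholesky factor of Sigma_l."""
    chols = [np.linalg.cholesky(S) for S in sigmas]
    return [L @ np.linalg.inv(L_prev) for L_prev, L in zip(chols[:-1], chols[1:])]

def check(sigmas):
    attn = attention_from_kernels(sigmas)
    prod = np.eye(T)
    for A in attn:
        prod = A @ prod  # Pi_l = A_l ... A_1
    return {
        "lower_triangular": all(np.allclose(np.triu(A, 1), 0.0) for A in attn),
        "non_negative": all(A.min() > -1e-10 for A in attn),
        "reproduces_sigma_L": bool(np.allclose(prod @ sigmas[0] @ prod.T, sigmas[-1])),
        "full_rank": bool(np.linalg.matrix_rank(sigmas[-1]) == T),
    }

uspa_report = check([uspa(rho, T) for rho in (0.0, 0.2, 0.5, 0.8)])      # rho non-decreasing
espa_report = check([espa(gamma, T) for gamma in (2.0, 1.0, 0.5, 0.1)])  # gamma non-increasing

print("U-SPA:", uspa_report)
print("E-SPA:", espa_report)
```

Reversing either schedule, for example a decreasing $\rho_l$, would be expected to produce negative entries in some $A_l$, which is why the monotonicity conditions are needed.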
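Deep Transformers without Shortcuts
Redefining attention
• Decompose the $A$ obtained by working backward from the $\Sigma$ above as $A = DP$
• $D$ is a positive diagonal matrix and $P$ is a lower-triangular matrix whose rows each sum to 1
• Setting $B = \log(P)$, attention is redefined as
$$\mathrm{Attn}(X) = D\,P(X)\,V(X), \quad P(X) = \mathrm{softmax}\left(M \circ \left[\frac{1}{\sqrt{d_k}} Q(X)K(X)^\top + B\right] - \Gamma(1 - M)\right)$$
• Initializing the weights $W^Q$ of $Q(X)$ to 0 makes $\Sigma$ take the desired form at initialization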
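The decomposition and the effect of zero-initializing $W^Q$ can be verified on a small example. This sketch makes several assumptions for illustration: the target $A$ is taken to be the Cholesky factor of a U-SPA kernel with $\rho = 0.5$, and `gamma`, `d_k`, and all variable names are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

T, d_k, gamma = 5, 4, 1e9
M = np.tril(np.ones((T, T)))

# Target attention matrix: lower triangular with positive entries on and below the diagonal.
rho = 0.5
Sigma = (1.0 - rho) * np.eye(T) + rho * np.ones((T, T))
A = np.linalg.cholesky(Sigma)

# A = D P with D positive diagonal and P row-stochastic.
row_sums = A.sum(axis=1)
D = np.diag(row_sums)
P = A / row_sums[:, None]

# B = log(P) on the causal part; masked entries are set to 0 because M zeroes them anyway.
B = np.where(M > 0, np.log(np.where(M > 0, P, 1.0)), 0.0)

# With W^Q = 0 the query-key logits vanish at initialization.
QK = np.zeros((T, T))
logits = M * (QK / np.sqrt(d_k) + B) - gamma * (1.0 - M)
P_init = softmax(logits)

print("P recovered at init:", np.allclose(P_init, P))
print("D P equals target A:", np.allclose(D @ P_init, A))
```

Since each row of $P$ sums to 1, the softmax of $\log P$ returns $P$ itself, so the layer starts exactly at the designed attention matrix and the query-key term only departs from it as $W^Q$ is trained.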
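Experiments
WikiText-103
• Trains a 36-layer Transformer
• The model with skip connections naively removed does not train at all
• The proposed method trains properly
• However, training is considerably slower than the standard model with skip + LN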
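Experiments
C4 dataset
• Trains a 32-layer Transformer
• With longer training, the proposed method reaches the performance of the model with skip + LN
• This takes about 5 times as long
• In Transformers, do skip connections and LN mainly contribute to faster training?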
Experiments
Experiments on the C4 dataset
• With skip connections added, the proposed method beats the skip + LN baseline
• So are skip connections extremely important in Transformers after all?
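Summary
• For MLPs and CNNs, deep networks can be trained without normalization or skip connections if the initialization satisfies dynamic isometry
• For Transformers too, careful initialization of the same kind makes training possible without skip connections or LN
• However, training takes considerably longer
Impressions
• The approach feels somewhat forced
• In the end, it should come down to making attention close to the identity matrix at initialization
• A simpler method seems likely to exist
• It is not well understood what causes the slowdown in training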
References
[1] Pennington, Jeffrey, Samuel Schoenholz, and Surya Ganguli. "Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice." Advances in Neural Information Processing Systems 30 (2017).
[2] Xiao, Lechao, et al. "Dynamical isometry and a mean field theory of CNNs: How to train 10,000-layer vanilla convolutional neural networks." International Conference on Machine Learning. PMLR, 2018.
[3] Bachlechner, Thomas, et al. "ReZero is all you need: Fast convergence at large depth." Uncertainty in Artificial Intelligence. PMLR, 2021.
[4] Burkholz, Rebekka, and Alina Dubatovka. "Initialization of ReLUs for dynamical isometry." Advances in Neural Information Processing Systems 32 (2019).
[5] Dong, Yihe, Jean-Baptiste Cordonnier, and Andreas Loukas. "Attention is not all you need: Pure attention loses rank doubly exponentially with depth." International Conference on Machine Learning. PMLR, 2021.
[6] He, Bobby, et al. "Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation." The Eleventh International Conference on Learning Representations. 2023.

More Related Content

What's hot

【DL輪読会】Toolformer: Language Models Can Teach Themselves to Use Tools
【DL輪読会】Toolformer: Language Models Can Teach Themselves to Use Tools【DL輪読会】Toolformer: Language Models Can Teach Themselves to Use Tools
【DL輪読会】Toolformer: Language Models Can Teach Themselves to Use ToolsDeep Learning JP
 
【DL輪読会】時系列予測 Transfomers の精度向上手法
【DL輪読会】時系列予測 Transfomers の精度向上手法【DL輪読会】時系列予測 Transfomers の精度向上手法
【DL輪読会】時系列予測 Transfomers の精度向上手法Deep Learning JP
 
全力解説!Transformer
全力解説!Transformer全力解説!Transformer
全力解説!TransformerArithmer Inc.
 
semantic segmentation サーベイ
semantic segmentation サーベイsemantic segmentation サーベイ
semantic segmentation サーベイyohei okawa
 
[DL輪読会]Learning Latent Dynamics for Planning from Pixels
[DL輪読会]Learning Latent Dynamics for Planning from Pixels[DL輪読会]Learning Latent Dynamics for Planning from Pixels
[DL輪読会]Learning Latent Dynamics for Planning from PixelsDeep Learning JP
 
劣モジュラ最適化と機械学習1章
劣モジュラ最適化と機械学習1章劣モジュラ最適化と機械学習1章
劣モジュラ最適化と機械学習1章Hakky St
 
【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving PlannersDeep Learning JP
 
最近のKaggleに学ぶテーブルデータの特徴量エンジニアリング
最近のKaggleに学ぶテーブルデータの特徴量エンジニアリング最近のKaggleに学ぶテーブルデータの特徴量エンジニアリング
最近のKaggleに学ぶテーブルデータの特徴量エンジニアリングmlm_kansai
 
最適輸送の解き方
最適輸送の解き方最適輸送の解き方
最適輸送の解き方joisino
 
【DL輪読会】論文解説:Offline Reinforcement Learning as One Big Sequence Modeling Problem
【DL輪読会】論文解説:Offline Reinforcement Learning as One Big Sequence Modeling Problem【DL輪読会】論文解説:Offline Reinforcement Learning as One Big Sequence Modeling Problem
【DL輪読会】論文解説:Offline Reinforcement Learning as One Big Sequence Modeling ProblemDeep Learning JP
 
PRML学習者から入る深層生成モデル入門
PRML学習者から入る深層生成モデル入門PRML学習者から入る深層生成モデル入門
PRML学習者から入る深層生成モデル入門tmtm otm
 
深層生成モデルと世界モデル(2020/11/20版)
深層生成モデルと世界モデル(2020/11/20版)深層生成モデルと世界モデル(2020/11/20版)
深層生成モデルと世界モデル(2020/11/20版)Masahiro Suzuki
 
【DL輪読会】A Time Series is Worth 64 Words: Long-term Forecasting with Transformers
【DL輪読会】A Time Series is Worth 64 Words: Long-term Forecasting with Transformers【DL輪読会】A Time Series is Worth 64 Words: Long-term Forecasting with Transformers
【DL輪読会】A Time Series is Worth 64 Words: Long-term Forecasting with TransformersDeep Learning JP
 
強化学習その3
強化学習その3強化学習その3
強化学習その3nishio
 
[DL輪読会]GQNと関連研究,世界モデルとの関係について
[DL輪読会]GQNと関連研究,世界モデルとの関係について[DL輪読会]GQNと関連研究,世界モデルとの関係について
[DL輪読会]GQNと関連研究,世界モデルとの関係についてDeep Learning JP
 
最適化超入門
最適化超入門最適化超入門
最適化超入門Takami Sato
 
【DL輪読会】Perceiver io a general architecture for structured inputs &amp; outputs
【DL輪読会】Perceiver io  a general architecture for structured inputs &amp; outputs 【DL輪読会】Perceiver io  a general architecture for structured inputs &amp; outputs
【DL輪読会】Perceiver io a general architecture for structured inputs &amp; outputs Deep Learning JP
 
【DL輪読会】AUTOGT: AUTOMATED GRAPH TRANSFORMER ARCHITECTURE SEARCH
【DL輪読会】AUTOGT: AUTOMATED GRAPH TRANSFORMER ARCHITECTURE SEARCH【DL輪読会】AUTOGT: AUTOMATED GRAPH TRANSFORMER ARCHITECTURE SEARCH
【DL輪読会】AUTOGT: AUTOMATED GRAPH TRANSFORMER ARCHITECTURE SEARCHDeep Learning JP
 
【DL輪読会】The Forward-Forward Algorithm: Some Preliminary
【DL輪読会】The Forward-Forward Algorithm: Some Preliminary【DL輪読会】The Forward-Forward Algorithm: Some Preliminary
【DL輪読会】The Forward-Forward Algorithm: Some PreliminaryDeep Learning JP
 

What's hot (20)

【DL輪読会】Toolformer: Language Models Can Teach Themselves to Use Tools
【DL輪読会】Toolformer: Language Models Can Teach Themselves to Use Tools【DL輪読会】Toolformer: Language Models Can Teach Themselves to Use Tools
【DL輪読会】Toolformer: Language Models Can Teach Themselves to Use Tools
 
【DL輪読会】時系列予測 Transfomers の精度向上手法
【DL輪読会】時系列予測 Transfomers の精度向上手法【DL輪読会】時系列予測 Transfomers の精度向上手法
【DL輪読会】時系列予測 Transfomers の精度向上手法
 
全力解説!Transformer
全力解説!Transformer全力解説!Transformer
全力解説!Transformer
 
semantic segmentation サーベイ
semantic segmentation サーベイsemantic segmentation サーベイ
semantic segmentation サーベイ
 
[DL輪読会]Learning Latent Dynamics for Planning from Pixels
[DL輪読会]Learning Latent Dynamics for Planning from Pixels[DL輪読会]Learning Latent Dynamics for Planning from Pixels
[DL輪読会]Learning Latent Dynamics for Planning from Pixels
 
劣モジュラ最適化と機械学習1章
劣モジュラ最適化と機械学習1章劣モジュラ最適化と機械学習1章
劣モジュラ最適化と機械学習1章
 
【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
 
最近のKaggleに学ぶテーブルデータの特徴量エンジニアリング
最近のKaggleに学ぶテーブルデータの特徴量エンジニアリング最近のKaggleに学ぶテーブルデータの特徴量エンジニアリング
最近のKaggleに学ぶテーブルデータの特徴量エンジニアリング
 
最適輸送の解き方
最適輸送の解き方最適輸送の解き方
最適輸送の解き方
 
【DL輪読会】論文解説:Offline Reinforcement Learning as One Big Sequence Modeling Problem
【DL輪読会】論文解説:Offline Reinforcement Learning as One Big Sequence Modeling Problem【DL輪読会】論文解説:Offline Reinforcement Learning as One Big Sequence Modeling Problem
【DL輪読会】論文解説:Offline Reinforcement Learning as One Big Sequence Modeling Problem
 
PRML学習者から入る深層生成モデル入門
PRML学習者から入る深層生成モデル入門PRML学習者から入る深層生成モデル入門
PRML学習者から入る深層生成モデル入門
 
深層生成モデルと世界モデル(2020/11/20版)
深層生成モデルと世界モデル(2020/11/20版)深層生成モデルと世界モデル(2020/11/20版)
深層生成モデルと世界モデル(2020/11/20版)
 
【DL輪読会】A Time Series is Worth 64 Words: Long-term Forecasting with Transformers
【DL輪読会】A Time Series is Worth 64 Words: Long-term Forecasting with Transformers【DL輪読会】A Time Series is Worth 64 Words: Long-term Forecasting with Transformers
【DL輪読会】A Time Series is Worth 64 Words: Long-term Forecasting with Transformers
 
強化学習その3
強化学習その3強化学習その3
強化学習その3
 
[DL輪読会]GQNと関連研究,世界モデルとの関係について
[DL輪読会]GQNと関連研究,世界モデルとの関係について[DL輪読会]GQNと関連研究,世界モデルとの関係について
[DL輪読会]GQNと関連研究,世界モデルとの関係について
 
最適化超入門
最適化超入門最適化超入門
最適化超入門
 
【DL輪読会】Perceiver io a general architecture for structured inputs &amp; outputs
【DL輪読会】Perceiver io  a general architecture for structured inputs &amp; outputs 【DL輪読会】Perceiver io  a general architecture for structured inputs &amp; outputs
【DL輪読会】Perceiver io a general architecture for structured inputs &amp; outputs
 
【DL輪読会】AUTOGT: AUTOMATED GRAPH TRANSFORMER ARCHITECTURE SEARCH
【DL輪読会】AUTOGT: AUTOMATED GRAPH TRANSFORMER ARCHITECTURE SEARCH【DL輪読会】AUTOGT: AUTOMATED GRAPH TRANSFORMER ARCHITECTURE SEARCH
【DL輪読会】AUTOGT: AUTOMATED GRAPH TRANSFORMER ARCHITECTURE SEARCH
 
【DL輪読会】The Forward-Forward Algorithm: Some Preliminary
【DL輪読会】The Forward-Forward Algorithm: Some Preliminary【DL輪読会】The Forward-Forward Algorithm: Some Preliminary
【DL輪読会】The Forward-Forward Algorithm: Some Preliminary
 
深層強化学習と実装例
深層強化学習と実装例深層強化学習と実装例
深層強化学習と実装例
 

Similar to 【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation

【DL輪読会】SUMO: Unbiased Estimation of Log Marginal Probability for Latent Varia...
【DL輪読会】SUMO: Unbiased Estimation of Log Marginal Probability for Latent Varia...【DL輪読会】SUMO: Unbiased Estimation of Log Marginal Probability for Latent Varia...
【DL輪読会】SUMO: Unbiased Estimation of Log Marginal Probability for Latent Varia...Deep Learning JP
 
Deep learning study 2
Deep learning study 2Deep learning study 2
Deep learning study 2San Kim
 
NIPS KANSAI Reading Group #5: State Aware Imitation Learning
NIPS KANSAI Reading Group #5: State Aware Imitation LearningNIPS KANSAI Reading Group #5: State Aware Imitation Learning
NIPS KANSAI Reading Group #5: State Aware Imitation LearningEiji Uchibe
 
Brief Introduction About Topological Interference Management (TIM)
Brief Introduction About Topological Interference Management (TIM)Brief Introduction About Topological Interference Management (TIM)
Brief Introduction About Topological Interference Management (TIM)Pei-Che Chang
 
【DL輪読会】Unbiased Gradient Estimation for Marginal Log-likelihood
【DL輪読会】Unbiased Gradient Estimation for Marginal Log-likelihood【DL輪読会】Unbiased Gradient Estimation for Marginal Log-likelihood
【DL輪読会】Unbiased Gradient Estimation for Marginal Log-likelihoodDeep Learning JP
 
A compact zero knowledge proof to restrict message space in homomorphic encry...
A compact zero knowledge proof to restrict message space in homomorphic encry...A compact zero knowledge proof to restrict message space in homomorphic encry...
A compact zero knowledge proof to restrict message space in homomorphic encry...MITSUNARI Shigeo
 
Crash course in control theory for neuroscientists and biologists
Crash course in control theory for neuroscientists and biologistsCrash course in control theory for neuroscientists and biologists
Crash course in control theory for neuroscientists and biologistsMatteo Mischiati
 
Back propagation
Back propagationBack propagation
Back propagationSan Kim
 
Max flows via electrical flows (long talk)
Max flows via electrical flows (long talk)Max flows via electrical flows (long talk)
Max flows via electrical flows (long talk)Thatchaphol Saranurak
 
[DL輪読会]Understanding Measures of Uncertainty for Adversarial Example Detection
[DL輪読会]Understanding Measures of Uncertainty for Adversarial Example Detection[DL輪読会]Understanding Measures of Uncertainty for Adversarial Example Detection
[DL輪読会]Understanding Measures of Uncertainty for Adversarial Example DetectionDeep Learning JP
 
Circuit Network Analysis - [Chapter5] Transfer function, frequency response, ...
Circuit Network Analysis - [Chapter5] Transfer function, frequency response, ...Circuit Network Analysis - [Chapter5] Transfer function, frequency response, ...
Circuit Network Analysis - [Chapter5] Transfer function, frequency response, ...Simen Li
 
Seminar Talk: Multilevel Hybrid Split Step Implicit Tau-Leap for Stochastic R...
Seminar Talk: Multilevel Hybrid Split Step Implicit Tau-Leap for Stochastic R...Seminar Talk: Multilevel Hybrid Split Step Implicit Tau-Leap for Stochastic R...
Seminar Talk: Multilevel Hybrid Split Step Implicit Tau-Leap for Stochastic R...Chiheb Ben Hammouda
 
Euler lagrange equations of motion mit-holonomic constraints_lecture7
Euler lagrange equations of motion  mit-holonomic  constraints_lecture7Euler lagrange equations of motion  mit-holonomic  constraints_lecture7
Euler lagrange equations of motion mit-holonomic constraints_lecture7JOHN OBIDI
 
Digital Electronics Fundamentals
Digital Electronics Fundamentals Digital Electronics Fundamentals
Digital Electronics Fundamentals Darwin Nesakumar
 
Paper Study: OptNet: Differentiable Optimization as a Layer in Neural Networks
Paper Study: OptNet: Differentiable Optimization as a Layer in Neural NetworksPaper Study: OptNet: Differentiable Optimization as a Layer in Neural Networks
Paper Study: OptNet: Differentiable Optimization as a Layer in Neural NetworksChenYiHuang5
 
Lecture 5 backpropagation
Lecture 5 backpropagationLecture 5 backpropagation
Lecture 5 backpropagationParveenMalik18
 

Similar to 【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation (20)

【DL輪読会】SUMO: Unbiased Estimation of Log Marginal Probability for Latent Varia...
【DL輪読会】SUMO: Unbiased Estimation of Log Marginal Probability for Latent Varia...【DL輪読会】SUMO: Unbiased Estimation of Log Marginal Probability for Latent Varia...
【DL輪読会】SUMO: Unbiased Estimation of Log Marginal Probability for Latent Varia...
 
Mod 3.pptx
Mod 3.pptxMod 3.pptx
Mod 3.pptx
 
Deep learning study 2
Deep learning study 2Deep learning study 2
Deep learning study 2
 
NIPS KANSAI Reading Group #5: State Aware Imitation Learning
NIPS KANSAI Reading Group #5: State Aware Imitation LearningNIPS KANSAI Reading Group #5: State Aware Imitation Learning
NIPS KANSAI Reading Group #5: State Aware Imitation Learning
 
Brief Introduction About Topological Interference Management (TIM)
Brief Introduction About Topological Interference Management (TIM)Brief Introduction About Topological Interference Management (TIM)
Brief Introduction About Topological Interference Management (TIM)
 
【DL輪読会】Unbiased Gradient Estimation for Marginal Log-likelihood
【DL輪読会】Unbiased Gradient Estimation for Marginal Log-likelihood【DL輪読会】Unbiased Gradient Estimation for Marginal Log-likelihood
【DL輪読会】Unbiased Gradient Estimation for Marginal Log-likelihood
 
Presentation OCIP2014
Presentation OCIP2014Presentation OCIP2014
Presentation OCIP2014
 
Lec05.pptx
Lec05.pptxLec05.pptx
Lec05.pptx
 
A compact zero knowledge proof to restrict message space in homomorphic encry...
A compact zero knowledge proof to restrict message space in homomorphic encry...A compact zero knowledge proof to restrict message space in homomorphic encry...
A compact zero knowledge proof to restrict message space in homomorphic encry...
 
Crash course in control theory for neuroscientists and biologists
Crash course in control theory for neuroscientists and biologistsCrash course in control theory for neuroscientists and biologists
Crash course in control theory for neuroscientists and biologists
 
Back propagation
Back propagationBack propagation
Back propagation
 
Max flows via electrical flows (long talk)
Max flows via electrical flows (long talk)Max flows via electrical flows (long talk)
Max flows via electrical flows (long talk)
 
[DL輪読会]Understanding Measures of Uncertainty for Adversarial Example Detection
[DL輪読会]Understanding Measures of Uncertainty for Adversarial Example Detection[DL輪読会]Understanding Measures of Uncertainty for Adversarial Example Detection
[DL輪読会]Understanding Measures of Uncertainty for Adversarial Example Detection
 
Lec10.pptx
Lec10.pptxLec10.pptx
Lec10.pptx
 
Circuit Network Analysis - [Chapter5] Transfer function, frequency response, ...
Circuit Network Analysis - [Chapter5] Transfer function, frequency response, ...Circuit Network Analysis - [Chapter5] Transfer function, frequency response, ...
Circuit Network Analysis - [Chapter5] Transfer function, frequency response, ...
 
Seminar Talk: Multilevel Hybrid Split Step Implicit Tau-Leap for Stochastic R...
Seminar Talk: Multilevel Hybrid Split Step Implicit Tau-Leap for Stochastic R...Seminar Talk: Multilevel Hybrid Split Step Implicit Tau-Leap for Stochastic R...
Seminar Talk: Multilevel Hybrid Split Step Implicit Tau-Leap for Stochastic R...
 
Euler lagrange equations of motion mit-holonomic constraints_lecture7
Euler lagrange equations of motion  mit-holonomic  constraints_lecture7Euler lagrange equations of motion  mit-holonomic  constraints_lecture7
Euler lagrange equations of motion mit-holonomic constraints_lecture7
 
Digital Electronics Fundamentals
Digital Electronics Fundamentals Digital Electronics Fundamentals
Digital Electronics Fundamentals
 
Paper Study: OptNet: Differentiable Optimization as a Layer in Neural Networks
Paper Study: OptNet: Differentiable Optimization as a Layer in Neural NetworksPaper Study: OptNet: Differentiable Optimization as a Layer in Neural Networks
Paper Study: OptNet: Differentiable Optimization as a Layer in Neural Networks
 
Lecture 5 backpropagation
Lecture 5 backpropagationLecture 5 backpropagation
Lecture 5 backpropagation
 

More from Deep Learning JP

【DL輪読会】事前学習用データセットについて
【DL輪読会】事前学習用データセットについて【DL輪読会】事前学習用データセットについて
【DL輪読会】事前学習用データセットについてDeep Learning JP
 
【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...
【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...
【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...Deep Learning JP
 
【DL輪読会】Zero-Shot Dual-Lens Super-Resolution
【DL輪読会】Zero-Shot Dual-Lens Super-Resolution【DL輪読会】Zero-Shot Dual-Lens Super-Resolution
【DL輪読会】Zero-Shot Dual-Lens Super-ResolutionDeep Learning JP
 
【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv
【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv
【DL輪読会】BloombergGPT: A Large Language Model for Finance arxivDeep Learning JP
 
【DL輪読会】マルチモーダル LLM
【DL輪読会】マルチモーダル LLM【DL輪読会】マルチモーダル LLM
【DL輪読会】マルチモーダル LLMDeep Learning JP
 
【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo...
 【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo... 【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo...
【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo...Deep Learning JP
 
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
【DL輪読会】AnyLoc: Towards Universal Visual Place RecognitionDeep Learning JP
 
【DL輪読会】Can Neural Network Memorization Be Localized?
【DL輪読会】Can Neural Network Memorization Be Localized?【DL輪読会】Can Neural Network Memorization Be Localized?
【DL輪読会】Can Neural Network Memorization Be Localized?Deep Learning JP
 
【DL輪読会】Hopfield network 関連研究について
【DL輪読会】Hopfield network 関連研究について【DL輪読会】Hopfield network 関連研究について
【DL輪読会】Hopfield network 関連研究についてDeep Learning JP
 
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )Deep Learning JP
 
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...Deep Learning JP
 
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"Deep Learning JP
 
【DL輪読会】"Language Instructed Reinforcement Learning for Human-AI Coordination "
【DL輪読会】"Language Instructed Reinforcement Learning  for Human-AI Coordination "【DL輪読会】"Language Instructed Reinforcement Learning  for Human-AI Coordination "
【DL輪読会】"Language Instructed Reinforcement Learning for Human-AI Coordination "Deep Learning JP
 
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat ModelsDeep Learning JP
 
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"Deep Learning JP
 
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...Deep Learning JP
 
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...Deep Learning JP
 
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...Deep Learning JP
 
【DL輪読会】VIP: Towards Universal Visual Reward and Representation via Value-Impl...
【DL輪読会】VIP: Towards Universal Visual Reward and Representation via Value-Impl...【DL輪読会】VIP: Towards Universal Visual Reward and Representation via Value-Impl...
【DL輪読会】VIP: Towards Universal Visual Reward and Representation via Value-Impl...Deep Learning JP
 
【DL輪読会】マルチモーダル 基盤モデル
【DL輪読会】マルチモーダル 基盤モデル【DL輪読会】マルチモーダル 基盤モデル
【DL輪読会】マルチモーダル 基盤モデルDeep Learning JP
 

More from Deep Learning JP (20)

【DL輪読会】事前学習用データセットについて
【DL輪読会】事前学習用データセットについて【DL輪読会】事前学習用データセットについて
【DL輪読会】事前学習用データセットについて
 
【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...
【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...
【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...
 
【DL輪読会】Zero-Shot Dual-Lens Super-Resolution
【DL輪読会】Zero-Shot Dual-Lens Super-Resolution【DL輪読会】Zero-Shot Dual-Lens Super-Resolution
【DL輪読会】Zero-Shot Dual-Lens Super-Resolution
 
【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv
【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv
【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv
 
【DL輪読会】マルチモーダル LLM
【DL輪読会】マルチモーダル LLM【DL輪読会】マルチモーダル LLM
【DL輪読会】マルチモーダル LLM
 
【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo...
 【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo... 【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo...
【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo...
 
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
 
【DL輪読会】Can Neural Network Memorization Be Localized?
【DL輪読会】Can Neural Network Memorization Be Localized?【DL輪読会】Can Neural Network Memorization Be Localized?
【DL輪読会】Can Neural Network Memorization Be Localized?
 
【DL輪読会】Hopfield network 関連研究について
【DL輪読会】Hopfield network 関連研究について【DL輪読会】Hopfield network 関連研究について
【DL輪読会】Hopfield network 関連研究について
 
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
 
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
 
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
 
【DL輪読会】"Language Instructed Reinforcement Learning for Human-AI Coordination "
【DL輪読会】"Language Instructed Reinforcement Learning  for Human-AI Coordination "【DL輪読会】"Language Instructed Reinforcement Learning  for Human-AI Coordination "
【DL輪読会】"Language Instructed Reinforcement Learning for Human-AI Coordination "
 
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
 
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
 
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
 
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
 
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
 
【DL輪読会】VIP: Towards Universal Visual Reward and Representation via Value-Impl...
【DL輪読会】VIP: Towards Universal Visual Reward and Representation via Value-Impl...【DL輪読会】VIP: Towards Universal Visual Reward and Representation via Value-Impl...
【DL輪読会】VIP: Towards Universal Visual Reward and Representation via Value-Impl...
 
【DL輪読会】マルチモーダル 基盤モデル
【DL輪読会】マルチモーダル 基盤モデル【DL輪読会】マルチモーダル 基盤モデル
【DL輪読会】マルチモーダル 基盤モデル
 


【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation