SlideShare a Scribd company logo
1 of 89
Download to read offline
takigawa.ichigaku.8s@kyoto-u.ac.jp
15 2024 1 26 ( )
Slido
https://app.sli.do/event/d13KtHT3VXxFbmLvxPSQjn
PDF
https://itakigawa.github.io/data/autograd.pdf
3
§
§
§
§
§ ( )
§ ( )
§ https://itakigawa.github.io/
4
1.
2.
3. ( )
5
§ 画像生成AIが爆速で進化した2023年をまとめて振り返る
https://ascii.jp/elem/000/004/174/4174570/
2023 AI
6
§
AI
https://www.itmedia.co.jp/news/articles/2310/05/news179.html
7
AI
§
https://www.itmedia.co.jp/news/articles/2310/05/news179.html
8
Adobe Photoshop
https://www.adobe.com/jp/products/photoshop/generative-fill.html
9
§ ChatGPT 2022 11 2 1
§ AI
§ GAFAM IT 2023
OpenAI ChatGPT
10
1.
-
-
-
2.
-
-
-
3.
-
-
- DIY
OpenAI ChatGPT
4.
-
-
5. GPTs
- DALL-E
-
-
6.
-
-
§
§ ChatGPT (ChatGPT )
13
§ Microsoft OpenAI ChatGPT DALL-E
AI
§ AI 3 (440 )
3 Apple 2 (🇬🇧🇫🇷🇮🇹 GDP )
§ Bing ( Edge) ChatGPT
ChatGPT Web
§ GitHub Copilot( )
§ ( )
§
§ Windows 11 Microsoft 365 Copilot/Copilot Pro
Microsoft Copilot ( )
14
§ AI Windows 11/ Microsoft 365 Copilot
§ MS (Word, Excel, Powerpoint, Outlook, Teams) AI
§
§ Word
§ Powerpoint
§ Powerpoint
§ Excel
§ Web
§ Windows
MS
15
§
16
AI
§
§ ( 2023 12 )
§ ChatGPT AI 2
§ http://hdl.handle.net/2433/286548
§
( ), ( )
§ 1
(Preferred Networks)
§ 2
( )
§
( )
1.
18
..
.
A B
( )
• /
•
•
19
コンピュータ
プログラム
入力 出力
my_function
x1
x2
y
20
( )
J aime la
musique I love music
(Software 2.0 )
21
=
(g) (cm)
(g)
(cm)
(cm)
(g)
or
22
=
or
(g)
(cm)
(cm)
(g)
23
=
Random Forest
Gaussian Process
Logistic Regression
P( =red)
0 1
or
24
Random Forest Neural Network SVR Kernel Ridge
p1 p2 p3 p4
…
( )
( )
( )
⾒
25
§ : 1 : 1
= ( )
26
Decision
Tree
Random
Forest GBDT
Nearest
Neighbor
Logistic
Regression
SVM
Gaussian
Process
Neural
Network
(
)
2.
28
vs ( )
q1 q2 q3 q4
…
Random
Forest
GBDT
Nearest
Neighbor
SVM
Gaussian
Process
Neural
Network
29
vs ( )
p1 p2 p3 p4
…
( )
q1 q2 q3 q4
…
30
§
§ 𝑓(𝑥) 𝑦
ℓ(𝑓 𝑥 , 𝑦)
§ (MSE)
§ (Cross Entropy)
§ =
(optimization)
Minimize! 𝐿 𝜃 𝐿 𝜃 = ∑"#$
%
ℓ(𝑓 𝑥", 𝜃 , 𝑦") + Ω(𝜃)
31
𝐿 𝑎, 𝑏 = − log *
!:#!$%
𝑃(𝑦 = 1|𝑥!) *
!:#!$&
𝑃(𝑦 = 0|𝑥!)
データ
モデル 𝑓 𝑥 = 𝜎 𝑎𝑥 + 𝑏 =
1
1 + 𝑒'()*+,)
= 𝑃(𝑦 = 1|𝑥)
𝑥%, 𝑦% , 𝑥., 𝑦. , ⋯ , (𝑥/, 𝑦/) 各𝑦! 0 1
2
(Cross Entropy)
(-1)×
= − 8
!$%
/
𝑦! log 𝑓(𝑥!) + 1 − 𝑦! log(1 − 𝑓 𝑥! )
( )
0 1
( )
32
1 1
𝑓 𝑥 = 𝜎 𝑎𝑥 + 𝑏 =
1
1 + 𝑒'()*+,)
𝑎𝑥 + 𝑏
𝑎
𝑏
𝑥
9
𝑦
𝑧
𝑥", 𝑦" , 𝑥#, 𝑦# , (𝑥$, 𝑦$)
Minimize 𝐿 𝑎, 𝑏
𝐿 𝑎, 𝑏 = − 𝑦" log
"
"%&! "#$%& + 1 − 𝑦" log 1 −
"
"%&! "#$%&
− 𝑦# log
1
1 + 𝑒'()*'%+)
+ 1 − 𝑦# log 1 −
1
1 + 𝑒'()*'%+)
− 𝑦$ log
1
1 + 𝑒'()*(%+)
+ 1 − 𝑦$ log 1 −
1
1 + 𝑒'()*(%+)
𝜎 𝑥 =
1
1 + 𝑒'*
Linear Sigmoid
( )
33
1 1 3 ( 1 )
Minimize 𝐿 𝑎!, 𝑎", 𝑏!, 𝑏"
𝐿 𝑎), 𝑎*, 𝑏), 𝑏* = − 𝑦) log
1
1 + 𝑒
+ ,!
)
)-."($%&%'(%)
-/!
+ 1 − 𝑦) log 1 −
1
1 + 𝑒
+ ,!
)
)-."($%&%'(%)
-/!
− 𝑦* log
1
1 + 𝑒
+ ,!
)
)-."($%&!'(%)
-/!
+ 1 − 𝑦* log 1 −
1
1 + 𝑒
+ ,!
)
)-."($%&!'(%)
-/!
− 𝑦0 log
1
1 + 𝑒
+ ,!
)
)-."($%&*'(%)
-/!
+ 1 − 𝑦0 log 1 −
1
1 + 𝑒
+ ,!
)
)-."($%&*'(%)
-/!
𝑎%
𝑏%
𝑥
𝑧
1
1 + 𝑒#(%!&'(!)
𝑎.
𝑏. 𝑦
1
1 + 𝑒#(%"*'(")
𝑥 𝑧 𝑦
34
1 1 3 ( 2 )
𝑥 𝑦
𝑎", 𝑏"
Linear 𝑢"
𝑎#, 𝑏#
Linear 𝑢#
Sigmoid 𝑧"
Sigmoid 𝑧#
𝑎$, 𝑏$, 𝑐$
Linear 𝑣 Sigmoid
Minimize 𝐿 𝑎!, 𝑎", 𝑎+, 𝑏!, 𝑏", 𝑏+, 𝑐+
𝑎0𝑧) + 𝑏0𝑧* + 𝑐0
35
§
§ ResNet50 2600
§ AlexNet 6100
§ ResNeXt101 8400
§ VGG19 1 4300
§ LLaMa 650
§ Chinchilla 700
§ GPT-3 1750
§ Gopher 2800
§ PaLM 5400
(CNN)
Transformer ( )
36
𝐿 𝜃%, 𝜃., ⋯
𝜃% 𝜃.
1. θ$, θ;, ⋯
2. 𝐿 𝜃$, 𝜃;, ⋯
θ$, θ;, ⋯
3.
( )
37
𝑓 𝑥, 𝑦 (𝑎, 𝑏)
𝑥
𝑥
𝑦 = 𝑏
𝑥 = 𝑎
𝑓* 𝑎, 𝑏 = lim
0→&
2 )+0, , '2(),,)
0
𝑓&(𝑎, 𝑏)
𝑓 𝑎, 𝑏 𝑎
38
(Gradient)
=
=
( )
39
=
( )
40
§
41
=
• 𝒙
𝑓(𝒙) 𝒙
(learning rate)
𝛼 : 1 " "(step size)
𝒙
𝑥%
𝑥.
⋮
𝑥4
←
𝑥%
𝑥.
⋮
𝑥4
− 𝛼 ×
𝜕𝑓/𝜕𝑥%
𝜕𝑓/𝜕𝑥.
⋮
𝜕𝑓/𝜕𝑥4
𝑓(𝒙)
𝑥!
𝑥"
𝒙 =
𝑥!
𝑥"
42
( )
43
§
§ lr
( )
( )
44
§
§
§ Momentum
§ AdaGrad
§ RMSProp
§ Adam
https://towardsdatascience.com/a-visual-
explanation-of-gradient-descent-methods-
momentum-adagrad-rmsprop-adam-f898b102325c
45
§
45
𝑥 𝑦
𝑎), 𝑏)
Linear 𝑢)
𝑎*, 𝑏*
Linear 𝑢*
Sigmoid 𝑧)
Sigmoid 𝑧*
𝑎0, 𝑏0, 𝑐0
Linear 𝑣 Sigmoid
𝑦
𝜕𝑦/𝜕𝑎!, 𝜕𝑦/𝜕𝑏!, 𝜕𝑦/𝜕𝑐!,
3. ( )
47
§ (Chain Rule)
§
𝑥 𝑢 = 𝑓(𝑥) 𝑦 = 𝑔(𝑢)
𝑑𝑦
𝑑𝑥
=
𝑑𝑦
𝑑𝑢
𝑑𝑢
𝑑𝑥
§
𝑡 𝑥! = 𝑓! 𝑡 , 𝑥" = 𝑓" 𝑡 , ⋯
⾒ 𝑧 = 𝑔(𝑥!, 𝑥", ⋯ )
𝜕𝑧
𝜕𝑡
=
𝜕𝑧
𝜕𝑥%
𝜕𝑥%
𝜕𝑡
+
𝜕𝑧
𝜕𝑥.
𝜕𝑥.
𝜕𝑡
+ ⋯
𝑡 𝑥%
𝑥.
𝑧
⋯
𝑥 𝑢 𝑦
𝑓 𝑔
𝑓!
𝑓"
𝑔
48
§ 𝑦 = 𝑒;< = >
<
𝑥
𝑒'*
1/𝑥
𝑎
𝑏
𝑦
𝑎.𝑏
𝑎 = 𝑒'*
𝑏 =
1
𝑥
𝑦 = 𝑎.𝑏
𝜕𝑓
𝜕𝑥
=
𝑒'.*
𝑥
5
= −
𝑒'.*(2𝑥 + 1)
𝑥.
( or )
𝜕𝑦
𝜕𝑥
=
𝜕𝑦
𝜕𝑎
𝜕𝑎
𝜕𝑥
+
𝜕𝑦
𝜕𝑏
𝜕𝑏
𝜕𝑥
= 2𝑎𝑏 −𝑒'* + 𝑎. −
1
𝑥.
= 2𝑒'* J
%
*
−𝑒'* + 𝑒'* . −
%
*- = −
6.-/(.*+%)
*-
( )
49
1 3
𝜕𝑦/𝜕𝑥
𝑥 𝑦
𝑎", 𝑏"
Linear 𝑢"
𝑎#, 𝑏#
Linear 𝑢#
Sigmoid 𝑧"
Sigmoid 𝑧#
𝑎$, 𝑏$, 𝑐$
Linear 𝑣 Sigmoid
50
1 3
𝑥 𝑦
𝑢"
𝑢#
𝑧"
𝑧#
𝑣
𝑑𝑦
𝑑𝑥
=
𝑑𝑢%
𝑑𝑥
𝑑𝑧%
𝑑𝑢%
𝑑𝑣
𝑑𝑧%
𝑑𝑦
𝑑𝑣
+
𝑑𝑢.
𝑑𝑥
𝑑𝑧.
𝑑𝑢.
𝑑𝑣
𝑑𝑧.
𝑑𝑦
𝑑𝑣
𝑑𝑦
𝑑𝑣
𝑑𝑣
𝑑𝑧"
𝑑𝑣
𝑑𝑧#
𝑑𝑧#
𝑑𝑢#
𝑑𝑧"
𝑑𝑢"
𝑑𝑢
𝑑𝑥
𝑑𝑢"
𝑑𝑥
𝑑𝑦
𝑑𝑣
( )
51
2
𝑣
𝑢
𝑓%
𝑣 = 3 𝑢 𝑣 = log 𝑢 𝑣 = 1/𝑢
2
𝑣
𝑎
𝑏
𝑔%
𝑣 = 𝑎 𝑏 𝑣 = 𝑎. + 𝑏
𝑑𝑣
𝑑𝑢
=
3
2 𝑢
𝑑𝑣
𝑑𝑢
=
1
𝑢
𝑑𝑣
𝑑𝑢
= −
1
𝑢.
𝜕𝑣
𝜕𝑎
= 𝑏
𝜕𝑣
𝜕𝑏
= 𝑎
𝜕𝑣
𝜕𝑎
= 2𝑎
𝜕𝑣
𝜕𝑏
= 1
𝑣
𝑢
𝑓.
𝑣
𝑢
𝑓7
𝑣
𝑎
𝑏
𝑔.
𝑦 = 𝑔%(𝑔. 𝑓7 𝑥 , 𝑓. 𝑥 , 𝑓% 𝑥 ) = 3 𝑥 log 𝑥 +
1
𝑥.
52
2
𝑑𝑦
𝑑𝑥
=
𝑑𝑧,
𝑑𝑥
𝑑𝑧!
𝑑𝑧,
𝑑𝑦
𝑑𝑧!
+
𝑑𝑧+
𝑑𝑥
𝑑𝑧!
𝑑𝑧+
𝑑𝑦
𝑑𝑧!
+
𝑑𝑧"
𝑑𝑥
𝑑𝑦
𝑑𝑧"
= 3 𝑥
1
𝑥
−
2
𝑥+ +
3 log 𝑥 +
1
𝑥"
2 𝑥
𝑦 = 𝑔% 𝑧%, 𝑧. = 𝑧%𝑧.
𝑧% = 𝑔. 𝑧8, 𝑧7 = 𝑧7 + 𝑧8
.
𝑧. = 𝑓% 𝑥 = 3 𝑥
𝑧7 = 𝑓. 𝑥 = log 𝑥
𝑧8 = 𝑓7 𝑥 = 1/𝑥
𝑥
𝑧.
𝑧7
𝑧8
𝑧%
𝑦
𝑓"
𝑓#
𝑓$
𝑔#
𝑔"
𝑦 = 𝑔%(𝑔. 𝑓7 𝑥 , 𝑓. 𝑥 , 𝑓% 𝑥 ) = 3 𝑥 log 𝑥 +
1
𝑥.
53
(1 )
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑧% = 𝑧8
.
+ 𝑧7
𝑦 = 𝑧%𝑧.
2 x=1.2
𝑦 = 3 𝑥 log 𝑥 +
1
𝑥.
2
•
• backward
x=1.2 dy/dx
( )
54
§ ( )
§
§
𝑑𝑦
𝑑𝑥
= 3 𝑥
1
𝑥
−
2
𝑥?
+
3 log 𝑥 +
1
𝑥;
2 𝑥
𝑥 = 1.2
@A
@B
0.14
§
( )
55
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2
𝑧7
0.18
𝑧.
3.29
𝑧8
0.83
data
data data
data
Forward
56
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2
𝑧7
0.18
𝑧.
3.29
𝑧8
0.83
𝑧%
0.88
data
data
data data
data
Forward
57
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2
𝑧7
0.18
𝑧.
3.29
𝑧8
0.83
𝑧%
0.88
𝑦
2.88
data
data
data
data data
data
Forward
58
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2
𝑧7
0.18
𝑧.
3.29
𝑧8
0.83
𝑧%
0.88
𝑦
2.88 1.00
data grad
data grad
data grad
data grad
𝜕𝑦
𝜕𝑦
= 1
data grad
data grad
Backward
59
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2
𝑧7
0.18
𝑧.
3.29 0.88
𝑧8
0.83
𝑧%
0.88 3.29
𝑦
2.88 1.00
data grad
data grad
data grad
data grad
𝜕𝑦
𝜕𝑦
= 1
𝜕𝑦
𝜕𝑧"
= 𝑧#
𝜕𝑦
𝜕𝑧#
= 𝑧"
data grad
data grad
= 0.88
= 3.29
Backward
60
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2
𝑧7
0.18
𝑧.
3.29 0.88
𝑧8
0.83
𝑧%
0.88 3.29
𝑦
2.88 1.00
data grad
data grad
data grad
data grad
𝜕𝑦
𝜕𝑦
= 1
𝜕𝑦
𝜕𝑧"
= 𝑧#
𝜕𝑦
𝜕𝑧#
= 𝑧"
𝜕𝑧"
𝜕𝑧$
= 1
𝜕𝑧"
𝜕𝑧0
= 2 𝑧0
data grad
data grad
= 1.67
𝜕𝑦
𝜕𝑧$
=
𝜕𝑦
𝜕𝑧"
𝜕𝑧"
𝜕𝑧$
𝜕𝑦
𝜕𝑧0
=
𝜕𝑦
𝜕𝑧"
𝜕𝑧"
𝜕𝑧0
= 0.88
= 3.29
Backward
61
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2
𝑧7
0.18 3.29
𝑧.
3.29 0.88
𝑧8
0.83 5.48
𝑧%
0.88 3.29
𝑦
2.88 1.00
data grad
data grad
data grad
data grad
𝜕𝑦
𝜕𝑦
= 1
𝜕𝑦
𝜕𝑧"
= 𝑧#
𝜕𝑦
𝜕𝑧#
= 𝑧"
𝜕𝑧"
𝜕𝑧$
= 1
𝜕𝑧"
𝜕𝑧0
= 2 𝑧0
data grad
data grad
= 1.67
𝜕𝑦
𝜕𝑧$
=
𝜕𝑦
𝜕𝑧"
𝜕𝑧"
𝜕𝑧$
𝜕𝑦
𝜕𝑧0
=
𝜕𝑦
𝜕𝑧"
𝜕𝑧"
𝜕𝑧0
= 0.88
= 3.29
Backward
62
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2
𝑧7
0.18 3.29
𝑧.
3.29 0.88
𝑧8
0.83 5.48
𝑧%
0.88 3.29
𝑦
2.88 1.00
data grad
data grad
data grad
data grad
𝜕𝑦
𝜕𝑦
= 1
𝜕𝑦
𝜕𝑧"
= 𝑧#
𝜕𝑦
𝜕𝑧#
= 𝑧"
𝜕𝑧"
𝜕𝑧$
= 1
𝜕𝑧"
𝜕𝑧0
= 2 𝑧0
data grad
data grad
= 1.67
𝜕𝑦
𝜕𝑥
=
𝜕𝑦
𝜕𝑧#
𝜕𝑧#
𝜕𝑥
+
𝜕𝑦
𝜕𝑧"
𝜕𝑧"
𝜕𝑧$
𝜕𝑧$
𝜕𝑥
+
𝜕𝑦
𝜕𝑧"
𝜕𝑧"
𝜕𝑧0
𝜕𝑧0
𝜕𝑥
𝜕𝑧#
𝜕𝑥
=
3
2 𝑥
= 1.37
= 0.88
= 3.29
Backward
63
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2
𝑧7
0.18 3.29
𝑧.
3.29 0.88
𝑧8
0.83 5.48
𝑧%
0.88 3.29
𝑦
2.88 1.00
data grad
data grad
data grad
data grad
𝜕𝑦
𝜕𝑦
= 1
𝜕𝑦
𝜕𝑧"
= 𝑧#
𝜕𝑦
𝜕𝑧#
= 𝑧"
𝜕𝑧"
𝜕𝑧$
= 1
𝜕𝑧"
𝜕𝑧0
= 2 𝑧0
data grad
data grad
= 1.67
𝜕𝑦
𝜕𝑥
=
𝜕𝑦
𝜕𝑧#
𝜕𝑧#
𝜕𝑥
+
𝜕𝑦
𝜕𝑧"
𝜕𝑧"
𝜕𝑧$
𝜕𝑧$
𝜕𝑥
+
𝜕𝑦
𝜕𝑧"
𝜕𝑧"
𝜕𝑧0
𝜕𝑧0
𝜕𝑥
𝜕𝑧#
𝜕𝑥
=
3
2 𝑥
= 1.37
𝜕𝑧$
𝜕𝑥
=
1
𝑥
= 0.83
= 0.88
= 3.29
Backward
64
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2
𝑧7
0.18 3.29
𝑧.
3.29 0.88
𝑧8
0.83 5.48
𝑧%
0.88 3.29
𝑦
2.88 1.00
data grad
data grad
data grad
data grad
𝜕𝑦
𝜕𝑦
= 1
𝜕𝑦
𝜕𝑧"
= 𝑧#
𝜕𝑦
𝜕𝑧#
= 𝑧"
𝜕𝑧"
𝜕𝑧$
= 1
𝜕𝑧"
𝜕𝑧0
= 2 𝑧0
data grad
data grad
= 1.67
𝜕𝑦
𝜕𝑥
=
𝜕𝑦
𝜕𝑧#
𝜕𝑧#
𝜕𝑥
+
𝜕𝑦
𝜕𝑧"
𝜕𝑧"
𝜕𝑧$
𝜕𝑧$
𝜕𝑥
+
𝜕𝑦
𝜕𝑧"
𝜕𝑧"
𝜕𝑧0
𝜕𝑧0
𝜕𝑥
𝜕𝑧#
𝜕𝑥
=
3
2 𝑥
= 1.37
𝜕𝑧$
𝜕𝑥
=
1
𝑥
= 0.83
𝜕𝑧0
𝜕𝑥
= −
1
𝑥#
= -0.69
= 0.88
= 3.29
Backward
65
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2 0.14
𝑧7
0.18 3.29
𝑧.
3.29 0.88
𝑧8
0.83 5.48
𝑧%
0.88 3.29
𝑦
2.88 1.00
data grad
data grad
data grad
data grad
𝜕𝑦
𝜕𝑦
= 1
data grad
data grad
1.67
𝜕𝑦
𝜕𝑥
=
𝜕𝑦
𝜕𝑧#
𝜕𝑧#
𝜕𝑥
+
𝜕𝑦
𝜕𝑧"
𝜕𝑧"
𝜕𝑧$
𝜕𝑧$
𝜕𝑥
+
𝜕𝑦
𝜕𝑧"
𝜕𝑧"
𝜕𝑧0
𝜕𝑧0
𝜕𝑥
1.37
0.83
-0.69
0.88
3.29
1
Backward
66
𝑥
1.2 0.14
𝑧7
0.18 3.29
𝑧.
3.29 0.88
𝑧8
0.83 5.48
𝑧%
0.88 3.29
𝑦
2.88 1.0
data
data
data
data data
data
grad
grad
grad
grad grad
grad
𝜕𝑦
𝜕𝑥
𝜕𝑦
𝜕𝑧#
𝜕𝑦
𝜕𝑧$
𝜕𝑦
𝜕𝑧0
𝜕𝑦
𝜕𝑧"
𝜕𝑦
𝜕𝑦
Forward Backward y
( )
67
𝑧. = 3 𝑥 𝑥
1.2
data grad
𝑧.
3.29
data grad
Forward
Backward
z2.data
← 3 * sqrt(x.data)
𝑥
1.2
data grad
𝑧.
3.29
data grad
backward
x.grad
← 3/(2*sqrt(x.data)) * z2.grad
𝜕𝑧#
𝜕𝑥
=
3
2 𝑥
x.grad
← 3/(2*sqrt(x.data))
* z2.grad
𝜕𝑦
𝜕𝑥
=
𝜕𝑦
𝜕𝑧#
×
𝜕𝑧#
𝜕𝑥
68
𝑧. = 3 𝑥 𝑥
1.2
data grad
𝑧.
3.29
data grad
Forward
Backward
z2.data
← 3 * sqrt(x.data)
𝑥
1.2
data grad
𝑧.
3.29
data grad
backward
x.grad
← 3/(2*sqrt(x.data)) * z2.grad
𝜕𝑧#
𝜕𝑥
=
3
2 𝑥
x.grad
← 3/(2*sqrt(x.data))
* z2.grad
𝜕𝑦
𝜕𝑧#
= 0.88
1.20 0.88
𝜕𝑦
𝜕𝑥
=
𝜕𝑦
𝜕𝑧#
×
𝜕𝑧#
𝜕𝑥
69
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2
𝑧7
0.18
𝑧.
3.29
𝑧8
0.83
𝑧%
0.88
𝑦
2.88
data
data grad
data grad
data grad
grad data grad
data grad
( )
1. Forward ( & )
70
𝜕𝑧+
𝜕𝑥
= −
1
𝑥,
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2
𝑧7
0.18
𝑧.
3.29
𝑧8
0.83
𝑧%
0.88
𝑦
2.88
data
data
data
data data
data
grad
grad
grad
grad grad
grad
⾒ Backward
⾒
𝜕𝑧-
𝜕𝑥
=
1
𝑥
𝜕𝑧,
𝜕𝑥
=
3
2 𝑥
𝜕𝑦
𝜕𝑧,
= 𝑧.
𝜕𝑧.
𝜕𝑧-
= 1
𝜕𝑧.
𝜕𝑧+
= 2 𝑧+
𝜕𝑦
𝜕𝑧.
= 𝑧,
1. Forward ( & )
71
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2
𝑧7
0.18
𝑧.
3.29
𝑧8
0.83
𝑧%
0.88
𝑦
2.88
data
data
data
data data
data
grad
grad
grad
grad grad
grad
x.grad
← 3/(2*sqrt(x.data)) * z2.grad
x.grad
← (-1/x.data**2) * z4.grad
z2.grad
← z1.data* y.grad
z1.grad
← z2.data* y.grad
z3.grad
← 1.0 * z1.grad
z4.grad
← (2*z4.data)* z1.grad
x.grad
← 1/x.data * z3.grad
1. Forward ( & )
⾒ Backward
⾒
72
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2 0.0
𝑧7
0.18 0.0
𝑧.
3.29 0.0
𝑧8
0.83 0.0
𝑧%
0.88 0.0
𝑦
2.88 0.0
data
data
data
data data
data
grad
grad
grad
grad grad
grad
2. Backward (grad 0 )
73
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2 0.0
𝑧7
0.18 0.0
𝑧.
3.29 0.0
𝑧8
0.83 0.0
𝑧%
0.88 0.0
𝑦
2.88 1.0
data
data
data
data data
data
grad
grad
grad
grad grad
grad
3. Backward ( grad 1 )
74
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2 0.0
𝑧7
0.18 0.0
𝑧.
3.29 0.88
𝑧8
0.83 0.0
𝑧%
0.88 3.29
𝑦
2.88 1.0
data
data
data
data data
data
grad
grad
grad
grad grad
grad
z2.grad
← z1.data* y.grad
z1.grad
← z2.data* y.grad
4. Backward (y grad )
75
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2 0.0
𝑧7
0.18 3.29
𝑧.
3.29 0.88
𝑧8
0.83 5.48
𝑧%
0.88 3.29
𝑦
2.88 1.0
data
data
data
data data
data
grad
grad
grad
grad grad
grad
z3.grad
← 1.0 * z1.grad
z4.grad
← (2*z4.data)* z1.grad
𝜕𝑦
𝜕𝑧$
=
𝜕𝑦
𝜕𝑧"
𝜕𝑧"
𝜕𝑧$
𝜕𝑦
𝜕𝑧0
=
𝜕𝑦
𝜕𝑧"
𝜕𝑧"
𝜕𝑧0
5. Backward (z1 grad )
76
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2 1.20
𝑧7
0.18 3.29
𝑧.
3.29 0.88
𝑧8
0.83 5.48
𝑧%
0.88 3.29
𝑦
2.88 1.0
data
data
data
data data
data
grad
grad
grad
grad grad
grad
x.grad
← 3/(2*sqrt(x.data)) * z2.grad
1.20
grad=0.0
𝜕𝑦
𝜕𝑥
=
𝜕𝑦
𝜕𝑧#
𝜕𝑧#
𝜕𝑥
+
𝜕𝑦
𝜕𝑧"
𝜕𝑧"
𝜕𝑧$
𝜕𝑧$
𝜕𝑥
+
𝜕𝑦
𝜕𝑧"
𝜕𝑧"
𝜕𝑧0
𝜕𝑧0
𝜕𝑥
6. Backward (z2 grad )
77
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2 3.94
𝑧7
0.18 3.29
𝑧.
3.29 0.88
𝑧8
0.83 5.48
𝑧%
0.88 3.29
𝑦
2.88 1.0
data
data
data
data data
data
grad
grad
grad
grad grad
grad
x.grad
← 1/x.data * z3.grad
2.74
grad=1.20
𝜕𝑦
𝜕𝑥
=
𝜕𝑦
𝜕𝑧#
𝜕𝑧#
𝜕𝑥
+
𝜕𝑦
𝜕𝑧"
𝜕𝑧"
𝜕𝑧$
𝜕𝑧$
𝜕𝑥
+
𝜕𝑦
𝜕𝑧"
𝜕𝑧"
𝜕𝑧0
𝜕𝑧0
𝜕𝑥
𝜕𝑦
𝜕𝑧$
6. Backward (z3 grad )
78
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2 0.14
𝑧7
0.18 3.29
𝑧.
3.29 0.88
𝑧8
0.83 5.48
𝑧%
0.88 3.29
𝑦
2.88 1.0
data
data
data
data data
data
grad
grad
grad
grad grad
grad
-3.80
grad=3.94
x.grad
← (-1/x.data**2) * z4.grad
𝜕𝑦
𝜕𝑥
=
𝜕𝑦
𝜕𝑧#
𝜕𝑧#
𝜕𝑥
+
𝜕𝑦
𝜕𝑧"
𝜕𝑧"
𝜕𝑧$
𝜕𝑧$
𝜕𝑥
+
𝜕𝑦
𝜕𝑧"
𝜕𝑧"
𝜕𝑧0
𝜕𝑧0
𝜕𝑥
𝜕𝑦
𝜕𝑧0
6. Backward (z4 grad )
79
𝑦 = 𝑧%𝑧.
𝑧% = 𝑧8
.
+ 𝑧7
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑥
1.2 0.14
𝑧7
0.18 3.29
𝑧.
3.29 0.88
𝑧8
0.83 5.48
𝑧%
0.88 3.29
𝑦
2.88 1.0
data
data
data
data data
data
grad
grad
grad
grad grad
grad
𝜕𝑦
𝜕𝑥
𝜕𝑦
𝜕𝑧#
𝜕𝑦
𝜕𝑧$
𝜕𝑦
𝜕𝑧0
𝜕𝑦
𝜕𝑧"
𝜕𝑦
𝜕𝑦
( )
y
80
𝑧. = 3 𝑥
𝑧7 = log 𝑥
𝑧8 = 1/𝑥
𝑧% = 𝑧8
.
+ 𝑧7
𝑦 = 𝑧%𝑧.
PyTorch
Backward
Forward
81
grad
x = tensor(1.2, requires_grad=True)
z2 = 3*sqrt(x)
z3 = log(x)
z4 = 1/x
z1 = z4**2 + z3
y = z1 * z2
y.retain_grad()
z1.retain_grad()
z2.retain_grad()
z3.retain_grad()
z4.retain_grad()
y.backward()
print(x.data, z2.data, z3.data, z4.data, z1.data, y.data)
print(x.grad, z2.grad, z3.grad, z4.grad, z1.grad, y.grad)
tensor(1.20) tensor(3.29) tensor(0.18) tensor(0.83) tensor(0.88) tensor(2.88)
tensor(0.14) tensor(0.88) tensor(3.29) tensor(5.48) tensor(3.29) tensor(1.)
import torch
torch.set_printoptions(2)
2
PyTorch
82
𝑦 = 𝑥. + 2𝑥 + 3 = 𝑥 + 1 . + 2
(2.0)
• (Forward)
• (Backward)
•
• x.grad
𝑥 = −1 𝑦 = 2
83
𝑦 = 𝑥$ + 3.2 𝑥# + 1.3 𝑥 − 2
20 ( -1.5 1.5)
84
x.grad
(Forward)
(Backward)
3
85
MSE = mean squared
errors ( )
SGD = stochastic
gradient descent
( )
PyTorch
86
( )
•
• (lr )
•
•
• GPU
87
• https://github.com/karpathy/micrograd
• Python 94
•
https://github.com/karpathy/micrograd/blob/master/micrograd/engine.py
• Andrej Karpathy
OpenAI
(Telsa AI
2023 OpenAI )
•
https://youtu.be/VMj-3S1tku0?si=91ZWzaA4ECidua4g
micrograd
88
§ 𝑦 = 3 𝑥 𝑧 = 𝑥 𝑦 = 3𝑧
§
PyTorch https://pytorch.org/docs/stable/torch.html#math-operations
§
§
89
( )
J aime la
musique I love music
(Software 2.0 )
Q & A
91
1.
2.
3. ( )
Slido
https://app.sli.do/event/d13KtHT3VXxFbmLvxPSQjn

More Related Content

Similar to 機械学習と自動微分

1º rueda de prensa
1º rueda de prensa1º rueda de prensa
1º rueda de prensacomovamosNL
 
第5回 様々なファイル形式の読み込みとデータの書き出し
第5回 様々なファイル形式の読み込みとデータの書き出し第5回 様々なファイル形式の読み込みとデータの書き出し
第5回 様々なファイル形式の読み込みとデータの書き出しWataru Shito
 
Capurro o conceito de informação
Capurro  o conceito de informaçãoCapurro  o conceito de informação
Capurro o conceito de informaçãoTatiana Ufes
 
Identifying Attributes
Identifying AttributesIdentifying Attributes
Identifying Attributestmra
 
(KO) 온라인 뉴스 댓글 플랫폼을 흐리는 어뷰저 분석기 / (EN) Online ...
(KO) 온라인 뉴스 댓글 플랫폼을 흐리는 어뷰저 분석기 / (EN) Online ...(KO) 온라인 뉴스 댓글 플랫폼을 흐리는 어뷰저 분석기 / (EN) Online ...
(KO) 온라인 뉴스 댓글 플랫폼을 흐리는 어뷰저 분석기 / (EN) Online ...Ji Hyung Moon
 
Despacho conjunto nº300 97 de 09 setembro
Despacho conjunto nº300 97 de 09 setembroDespacho conjunto nº300 97 de 09 setembro
Despacho conjunto nº300 97 de 09 setembropatronatobonanca
 
Connectix webserver
Connectix webserverConnectix webserver
Connectix webserversteveheer
 
Connectix webserver
Connectix webserverConnectix webserver
Connectix webserversteveheer
 
Functional Gradient Boosting based on Residual Network Perception
Functional Gradient Boosting based on Residual Network PerceptionFunctional Gradient Boosting based on Residual Network Perception
Functional Gradient Boosting based on Residual Network PerceptionAtsushi Nitanda
 
Adapt, Collaborate, Innovate
Adapt, Collaborate, InnovateAdapt, Collaborate, Innovate
Adapt, Collaborate, InnovateJim Smurro
 
SUEC 高中 Adv Maths (Sin and Cos Rule)
SUEC 高中 Adv Maths (Sin and Cos Rule)SUEC 高中 Adv Maths (Sin and Cos Rule)
SUEC 高中 Adv Maths (Sin and Cos Rule)tungwc
 
Evergreen trails master plan community meeting 1 boards
Evergreen trails master plan community meeting 1 boardsEvergreen trails master plan community meeting 1 boards
Evergreen trails master plan community meeting 1 boardsOV Consulting
 

Similar to 機械学習と自動微分 (20)

1º rueda de prensa
1º rueda de prensa1º rueda de prensa
1º rueda de prensa
 
第5回 様々なファイル形式の読み込みとデータの書き出し
第5回 様々なファイル形式の読み込みとデータの書き出し第5回 様々なファイル形式の読み込みとデータの書き出し
第5回 様々なファイル形式の読み込みとデータの書き出し
 
Capurro o conceito de informação
Capurro  o conceito de informaçãoCapurro  o conceito de informação
Capurro o conceito de informação
 
Identifying Attributes
Identifying AttributesIdentifying Attributes
Identifying Attributes
 
(KO) 온라인 뉴스 댓글 플랫폼을 흐리는 어뷰저 분석기 / (EN) Online ...
(KO) 온라인 뉴스 댓글 플랫폼을 흐리는 어뷰저 분석기 / (EN) Online ...(KO) 온라인 뉴스 댓글 플랫폼을 흐리는 어뷰저 분석기 / (EN) Online ...
(KO) 온라인 뉴스 댓글 플랫폼을 흐리는 어뷰저 분석기 / (EN) Online ...
 
Ph 37
Ph 37Ph 37
Ph 37
 
Ph 38
Ph 38Ph 38
Ph 38
 
Safeway presentation
Safeway presentationSafeway presentation
Safeway presentation
 
Despacho conjunto nº300 97 de 09 setembro
Despacho conjunto nº300 97 de 09 setembroDespacho conjunto nº300 97 de 09 setembro
Despacho conjunto nº300 97 de 09 setembro
 
الجبر
الجبرالجبر
الجبر
 
Connectix webserver
Connectix webserverConnectix webserver
Connectix webserver
 
Connectix webserver
Connectix webserverConnectix webserver
Connectix webserver
 
Ph 3
Ph 3Ph 3
Ph 3
 
Functional Gradient Boosting based on Residual Network Perception
Functional Gradient Boosting based on Residual Network PerceptionFunctional Gradient Boosting based on Residual Network Perception
Functional Gradient Boosting based on Residual Network Perception
 
Planning v2
Planning v2Planning v2
Planning v2
 
Vietnamese concerns on the environment issues
Vietnamese concerns on the environment issuesVietnamese concerns on the environment issues
Vietnamese concerns on the environment issues
 
Adapt, Collaborate, Innovate
Adapt, Collaborate, InnovateAdapt, Collaborate, Innovate
Adapt, Collaborate, Innovate
 
SUEC 高中 Adv Maths (Sin and Cos Rule)
SUEC 高中 Adv Maths (Sin and Cos Rule)SUEC 高中 Adv Maths (Sin and Cos Rule)
SUEC 高中 Adv Maths (Sin and Cos Rule)
 
Evergreen trails master plan community meeting 1 boards
Evergreen trails master plan community meeting 1 boardsEvergreen trails master plan community meeting 1 boards
Evergreen trails master plan community meeting 1 boards
 
Liminar acp nasf
Liminar acp nasfLiminar acp nasf
Liminar acp nasf
 

More from Ichigaku Takigawa

データ社会を生きる技術
〜機械学習の夢と現実〜
データ社会を生きる技術
〜機械学習の夢と現実〜データ社会を生きる技術
〜機械学習の夢と現実〜
データ社会を生きる技術
〜機械学習の夢と現実〜Ichigaku Takigawa
 
機械学習を科学研究で使うとは?
機械学習を科学研究で使うとは?機械学習を科学研究で使うとは?
機械学習を科学研究で使うとは?Ichigaku Takigawa
 
A Modern Introduction to Decision Tree Ensembles
A Modern Introduction to Decision Tree EnsemblesA Modern Introduction to Decision Tree Ensembles
A Modern Introduction to Decision Tree EnsemblesIchigaku Takigawa
 
Exploring Practices in Machine Learning and Machine Discovery for Heterogeneo...
Exploring Practices in Machine Learning and Machine Discovery for Heterogeneo...Exploring Practices in Machine Learning and Machine Discovery for Heterogeneo...
Exploring Practices in Machine Learning and Machine Discovery for Heterogeneo...Ichigaku Takigawa
 
機械学習と機械発見:自然科学融合が誘起するデータ科学の新展開
機械学習と機械発見:自然科学融合が誘起するデータ科学の新展開機械学習と機械発見:自然科学融合が誘起するデータ科学の新展開
機械学習と機械発見:自然科学融合が誘起するデータ科学の新展開Ichigaku Takigawa
 
機械学習と機械発見:自然科学研究におけるデータ利活用の再考
機械学習と機械発見:自然科学研究におけるデータ利活用の再考機械学習と機械発見:自然科学研究におけるデータ利活用の再考
機械学習と機械発見:自然科学研究におけるデータ利活用の再考Ichigaku Takigawa
 
小1にルービックキューブを教えてみた 〜群論スポーツの教育とパターン認知〜
小1にルービックキューブを教えてみた 〜群論スポーツの教育とパターン認知〜小1にルービックキューブを教えてみた 〜群論スポーツの教育とパターン認知〜
小1にルービックキューブを教えてみた 〜群論スポーツの教育とパターン認知〜Ichigaku Takigawa
 
"データ化"する化学と情報技術・人工知能・データサイエンス
"データ化"する化学と情報技術・人工知能・データサイエンス"データ化"する化学と情報技術・人工知能・データサイエンス
"データ化"する化学と情報技術・人工知能・データサイエンスIchigaku Takigawa
 
自然科学における機械学習と機械発見
自然科学における機械学習と機械発見自然科学における機械学習と機械発見
自然科学における機械学習と機械発見Ichigaku Takigawa
 
幾何と機械学習: A Short Intro
幾何と機械学習: A Short Intro幾何と機械学習: A Short Intro
幾何と機械学習: A Short IntroIchigaku Takigawa
 
決定森回帰の信頼区間推定, Benign Overfitting, 多変量木とReLUネットの入力空間分割
決定森回帰の信頼区間推定, Benign Overfitting, 多変量木とReLUネットの入力空間分割決定森回帰の信頼区間推定, Benign Overfitting, 多変量木とReLUネットの入力空間分割
決定森回帰の信頼区間推定, Benign Overfitting, 多変量木とReLUネットの入力空間分割Ichigaku Takigawa
 
Machine Learning for Molecules: Lessons and Challenges of Data-Centric Chemistry
Machine Learning for Molecules: Lessons and Challenges of Data-Centric ChemistryMachine Learning for Molecules: Lessons and Challenges of Data-Centric Chemistry
Machine Learning for Molecules: Lessons and Challenges of Data-Centric ChemistryIchigaku Takigawa
 
機械学習を自然現象の理解・発見に使いたい人に知っておいてほしいこと
機械学習を自然現象の理解・発見に使いたい人に知っておいてほしいこと機械学習を自然現象の理解・発見に使いたい人に知っておいてほしいこと
機械学習を自然現象の理解・発見に使いたい人に知っておいてほしいことIchigaku Takigawa
 
自己紹介:機械学習・機械発見とデータ中心的自然科学
自己紹介:機械学習・機械発見とデータ中心的自然科学自己紹介:機械学習・機械発見とデータ中心的自然科学
自己紹介:機械学習・機械発見とデータ中心的自然科学Ichigaku Takigawa
 
機械学習・機械発見から見るデータ中心型化学の野望と憂鬱
機械学習・機械発見から見るデータ中心型化学の野望と憂鬱機械学習・機械発見から見るデータ中心型化学の野望と憂鬱
機械学習・機械発見から見るデータ中心型化学の野望と憂鬱Ichigaku Takigawa
 
Machine Learning for Molecular Graph Representations and Geometries
Machine Learning for Molecular Graph Representations and GeometriesMachine Learning for Molecular Graph Representations and Geometries
Machine Learning for Molecular Graph Representations and GeometriesIchigaku Takigawa
 
(2021.11) 機械学習と機械発見:データ中心型の化学・材料科学の教訓とこれから
(2021.11) 機械学習と機械発見:データ中心型の化学・材料科学の教訓とこれから(2021.11) 機械学習と機械発見:データ中心型の化学・材料科学の教訓とこれから
(2021.11) 機械学習と機械発見:データ中心型の化学・材料科学の教訓とこれからIchigaku Takigawa
 
機械学習~データを予測に変える技術~で化学に挑む! (サイエンスアゴラ2021)
機械学習~データを予測に変える技術~で化学に挑む! (サイエンスアゴラ2021)機械学習~データを予測に変える技術~で化学に挑む! (サイエンスアゴラ2021)
機械学習~データを予測に変える技術~で化学に挑む! (サイエンスアゴラ2021)Ichigaku Takigawa
 
(2021.10) 機械学習と機械発見 データ中心型の化学・材料科学の教訓とこれから
(2021.10) 機械学習と機械発見 データ中心型の化学・材料科学の教訓とこれから (2021.10) 機械学習と機械発見 データ中心型の化学・材料科学の教訓とこれから
(2021.10) 機械学習と機械発見 データ中心型の化学・材料科学の教訓とこれから Ichigaku Takigawa
 
Machine Learning for Molecules
Machine Learning for MoleculesMachine Learning for Molecules
Machine Learning for MoleculesIchigaku Takigawa
 

More from Ichigaku Takigawa (20)

データ社会を生きる技術
〜機械学習の夢と現実〜
データ社会を生きる技術
〜機械学習の夢と現実〜データ社会を生きる技術
〜機械学習の夢と現実〜
データ社会を生きる技術
〜機械学習の夢と現実〜
 
機械学習を科学研究で使うとは?
機械学習を科学研究で使うとは?機械学習を科学研究で使うとは?
機械学習を科学研究で使うとは?
 
A Modern Introduction to Decision Tree Ensembles
A Modern Introduction to Decision Tree EnsemblesA Modern Introduction to Decision Tree Ensembles
A Modern Introduction to Decision Tree Ensembles
 
Exploring Practices in Machine Learning and Machine Discovery for Heterogeneo...
Exploring Practices in Machine Learning and Machine Discovery for Heterogeneo...Exploring Practices in Machine Learning and Machine Discovery for Heterogeneo...
Exploring Practices in Machine Learning and Machine Discovery for Heterogeneo...
 
機械学習と機械発見:自然科学融合が誘起するデータ科学の新展開
機械学習と機械発見:自然科学融合が誘起するデータ科学の新展開機械学習と機械発見:自然科学融合が誘起するデータ科学の新展開
機械学習と機械発見:自然科学融合が誘起するデータ科学の新展開
 
機械学習と機械発見:自然科学研究におけるデータ利活用の再考
機械学習と機械発見:自然科学研究におけるデータ利活用の再考機械学習と機械発見:自然科学研究におけるデータ利活用の再考
機械学習と機械発見:自然科学研究におけるデータ利活用の再考
 
小1にルービックキューブを教えてみた 〜群論スポーツの教育とパターン認知〜
小1にルービックキューブを教えてみた 〜群論スポーツの教育とパターン認知〜小1にルービックキューブを教えてみた 〜群論スポーツの教育とパターン認知〜
小1にルービックキューブを教えてみた 〜群論スポーツの教育とパターン認知〜
 
"データ化"する化学と情報技術・人工知能・データサイエンス
"データ化"する化学と情報技術・人工知能・データサイエンス"データ化"する化学と情報技術・人工知能・データサイエンス
"データ化"する化学と情報技術・人工知能・データサイエンス
 
自然科学における機械学習と機械発見
自然科学における機械学習と機械発見自然科学における機械学習と機械発見
自然科学における機械学習と機械発見
 
幾何と機械学習: A Short Intro
幾何と機械学習: A Short Intro幾何と機械学習: A Short Intro
幾何と機械学習: A Short Intro
 
決定森回帰の信頼区間推定, Benign Overfitting, 多変量木とReLUネットの入力空間分割
決定森回帰の信頼区間推定, Benign Overfitting, 多変量木とReLUネットの入力空間分割決定森回帰の信頼区間推定, Benign Overfitting, 多変量木とReLUネットの入力空間分割
決定森回帰の信頼区間推定, Benign Overfitting, 多変量木とReLUネットの入力空間分割
 
Machine Learning for Molecules: Lessons and Challenges of Data-Centric Chemistry
Machine Learning for Molecules: Lessons and Challenges of Data-Centric ChemistryMachine Learning for Molecules: Lessons and Challenges of Data-Centric Chemistry
Machine Learning for Molecules: Lessons and Challenges of Data-Centric Chemistry
 
機械学習を自然現象の理解・発見に使いたい人に知っておいてほしいこと
機械学習を自然現象の理解・発見に使いたい人に知っておいてほしいこと機械学習を自然現象の理解・発見に使いたい人に知っておいてほしいこと
機械学習を自然現象の理解・発見に使いたい人に知っておいてほしいこと
 
自己紹介:機械学習・機械発見とデータ中心的自然科学
自己紹介:機械学習・機械発見とデータ中心的自然科学自己紹介:機械学習・機械発見とデータ中心的自然科学
自己紹介:機械学習・機械発見とデータ中心的自然科学
 
機械学習・機械発見から見るデータ中心型化学の野望と憂鬱
機械学習・機械発見から見るデータ中心型化学の野望と憂鬱機械学習・機械発見から見るデータ中心型化学の野望と憂鬱
機械学習・機械発見から見るデータ中心型化学の野望と憂鬱
 
Machine Learning for Molecular Graph Representations and Geometries
Machine Learning for Molecular Graph Representations and GeometriesMachine Learning for Molecular Graph Representations and Geometries
Machine Learning for Molecular Graph Representations and Geometries
 
(2021.11) 機械学習と機械発見:データ中心型の化学・材料科学の教訓とこれから
(2021.11) 機械学習と機械発見:データ中心型の化学・材料科学の教訓とこれから(2021.11) 機械学習と機械発見:データ中心型の化学・材料科学の教訓とこれから
(2021.11) 機械学習と機械発見:データ中心型の化学・材料科学の教訓とこれから
 
機械学習~データを予測に変える技術~で化学に挑む! (サイエンスアゴラ2021)
機械学習~データを予測に変える技術~で化学に挑む! (サイエンスアゴラ2021)機械学習~データを予測に変える技術~で化学に挑む! (サイエンスアゴラ2021)
機械学習~データを予測に変える技術~で化学に挑む! (サイエンスアゴラ2021)
 
(2021.10) 機械学習と機械発見 データ中心型の化学・材料科学の教訓とこれから
(2021.10) 機械学習と機械発見 データ中心型の化学・材料科学の教訓とこれから (2021.10) 機械学習と機械発見 データ中心型の化学・材料科学の教訓とこれから
(2021.10) 機械学習と機械発見 データ中心型の化学・材料科学の教訓とこれから
 
Machine Learning for Molecules
Machine Learning for MoleculesMachine Learning for Molecules
Machine Learning for Molecules
 

Recently uploaded

"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 

Recently uploaded (20)

"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 

機械学習と自動微分

  • 3. 3 § § § § § ( ) § ( ) § https://itakigawa.github.io/
  • 9. 9 § ChatGPT 2022 11 2 1 § AI § GAFAM IT 2023 OpenAI ChatGPT
  • 10. 10 1. - - - 2. - - - 3. - - - DIY OpenAI ChatGPT 4. - - 5. GPTs - DALL-E - - 6. - - § § ChatGPT (ChatGPT )
  • 11. 13 § Microsoft OpenAI ChatGPT DALL-E AI § AI 3 (440 ) 3 Apple 2 (🇬🇧🇫🇷🇮🇹 GDP ) § Bing ( Edge) ChatGPT ChatGPT Web § GitHub Copilot( ) § ( ) § § Windows 11 Microsoft 365 Copilot/Copilot Pro Microsoft Copilot ( )
  • 12. 14 § AI Windows 11/ Microsoft 365 Copilot § MS (Word, Excel, Powerpoint, Outlook, Teams) AI § § Word § Powerpoint § Powerpoint § Excel § Web § Windows MS
  • 13. 15 §
  • 14. 16 AI § § ( 2023 12 ) § ChatGPT AI 2 § http://hdl.handle.net/2433/286548 § ( ), ( ) § 1 (Preferred Networks) § 2 ( ) § ( )
  • 15. 1.
  • 16. 18 .. . A B ( ) • / • •
  • 18. 20 ( ) J aime la musique I love music (Software 2.0 )
  • 21. 23 = Random Forest Gaussian Process Logistic Regression P( =red) 0 1 or
  • 22. 24 Random Forest Neural Network SVR Kernel Ridge p1 p2 p3 p4 … ( ) ( ) ( ) ⾒
  • 23. 25 § : 1 : 1 = ( )
  • 25. 2.
  • 26. 28 vs ( ) q1 q2 q3 q4 … Random Forest GBDT Nearest Neighbor SVM Gaussian Process Neural Network
  • 27. 29 vs ( ) p1 p2 p3 p4 … ( ) q1 q2 q3 q4 …
  • 28. 30 § § 𝑓(𝑥) 𝑦 ℓ(𝑓 𝑥 , 𝑦) § (MSE) § (Cross Entropy) § = (optimization) Minimize! 𝐿 𝜃 𝐿 𝜃 = ∑"#$ % ℓ(𝑓 𝑥", 𝜃 , 𝑦") + Ω(𝜃)
  • 29. 31 𝐿 𝑎, 𝑏 = − log * !:#!$% 𝑃(𝑦 = 1|𝑥!) * !:#!$& 𝑃(𝑦 = 0|𝑥!) データ モデル 𝑓 𝑥 = 𝜎 𝑎𝑥 + 𝑏 = 1 1 + 𝑒'()*+,) = 𝑃(𝑦 = 1|𝑥) 𝑥%, 𝑦% , 𝑥., 𝑦. , ⋯ , (𝑥/, 𝑦/) 各𝑦! 0 1 2 (Cross Entropy) (-1)× = − 8 !$% / 𝑦! log 𝑓(𝑥!) + 1 − 𝑦! log(1 − 𝑓 𝑥! ) ( ) 0 1 ( )
  • 30. 32 1 1 𝑓 𝑥 = 𝜎 𝑎𝑥 + 𝑏 = 1 1 + 𝑒'()*+,) 𝑎𝑥 + 𝑏 𝑎 𝑏 𝑥 9 𝑦 𝑧 𝑥", 𝑦" , 𝑥#, 𝑦# , (𝑥$, 𝑦$) Minimize 𝐿 𝑎, 𝑏 𝐿 𝑎, 𝑏 = − 𝑦" log " "%&! "#$%& + 1 − 𝑦" log 1 − " "%&! "#$%& − 𝑦# log 1 1 + 𝑒'()*'%+) + 1 − 𝑦# log 1 − 1 1 + 𝑒'()*'%+) − 𝑦$ log 1 1 + 𝑒'()*(%+) + 1 − 𝑦$ log 1 − 1 1 + 𝑒'()*(%+) 𝜎 𝑥 = 1 1 + 𝑒'* Linear Sigmoid ( )
  • 31. 33 1 1 3 ( 1 ) Minimize 𝐿 𝑎!, 𝑎", 𝑏!, 𝑏" 𝐿 𝑎), 𝑎*, 𝑏), 𝑏* = − 𝑦) log 1 1 + 𝑒 + ,! ) )-."($%&%'(%) -/! + 1 − 𝑦) log 1 − 1 1 + 𝑒 + ,! ) )-."($%&%'(%) -/! − 𝑦* log 1 1 + 𝑒 + ,! ) )-."($%&!'(%) -/! + 1 − 𝑦* log 1 − 1 1 + 𝑒 + ,! ) )-."($%&!'(%) -/! − 𝑦0 log 1 1 + 𝑒 + ,! ) )-."($%&*'(%) -/! + 1 − 𝑦0 log 1 − 1 1 + 𝑒 + ,! ) )-."($%&*'(%) -/! 𝑎% 𝑏% 𝑥 𝑧 1 1 + 𝑒#(%!&'(!) 𝑎. 𝑏. 𝑦 1 1 + 𝑒#(%"*'(") 𝑥 𝑧 𝑦
  • 32. 34 1 1 3 ( 2 ) 𝑥 𝑦 𝑎", 𝑏" Linear 𝑢" 𝑎#, 𝑏# Linear 𝑢# Sigmoid 𝑧" Sigmoid 𝑧# 𝑎$, 𝑏$, 𝑐$ Linear 𝑣 Sigmoid Minimize 𝐿 𝑎!, 𝑎", 𝑎+, 𝑏!, 𝑏", 𝑏+, 𝑐+ 𝑎0𝑧) + 𝑏0𝑧* + 𝑐0
  • 33. 35 § § ResNet50 2600 § AlexNet 6100 § ResNeXt101 8400 § VGG19 1 4300 § LLaMa 650 § Chinchilla 700 § GPT-3 1750 § Gopher 2800 § PaLM 5400 (CNN) Transformer ( )
  • 34. 36 𝐿 𝜃%, 𝜃., ⋯ 𝜃% 𝜃. 1. θ$, θ;, ⋯ 2. 𝐿 𝜃$, 𝜃;, ⋯ θ$, θ;, ⋯ 3. ( )
  • 35. 37 𝑓 𝑥, 𝑦 (𝑎, 𝑏) 𝑥 𝑥 𝑦 = 𝑏 𝑥 = 𝑎 𝑓* 𝑎, 𝑏 = lim 0→& 2 )+0, , '2(),,) 0 𝑓&(𝑎, 𝑏) 𝑓 𝑎, 𝑏 𝑎
  • 38. 40 §
  • 39. 41 = • 𝒙 𝑓(𝒙) 𝒙 (learning rate) 𝛼 : 1 " "(step size) 𝒙 𝑥% 𝑥. ⋮ 𝑥4 ← 𝑥% 𝑥. ⋮ 𝑥4 − 𝛼 × 𝜕𝑓/𝜕𝑥% 𝜕𝑓/𝜕𝑥. ⋮ 𝜕𝑓/𝜕𝑥4 𝑓(𝒙) 𝑥! 𝑥" 𝒙 = 𝑥! 𝑥"
  • 42. 44 § § § Momentum § AdaGrad § RMSProp § Adam https://towardsdatascience.com/a-visual- explanation-of-gradient-descent-methods- momentum-adagrad-rmsprop-adam-f898b102325c
  • 43. 45 § 45 𝑥 𝑦 𝑎), 𝑏) Linear 𝑢) 𝑎*, 𝑏* Linear 𝑢* Sigmoid 𝑧) Sigmoid 𝑧* 𝑎0, 𝑏0, 𝑐0 Linear 𝑣 Sigmoid 𝑦 𝜕𝑦/𝜕𝑎!, 𝜕𝑦/𝜕𝑏!, 𝜕𝑦/𝜕𝑐!,
  • 45. 47 § (Chain Rule) § 𝑥 𝑢 = 𝑓(𝑥) 𝑦 = 𝑔(𝑢) 𝑑𝑦 𝑑𝑥 = 𝑑𝑦 𝑑𝑢 𝑑𝑢 𝑑𝑥 § 𝑡 𝑥! = 𝑓! 𝑡 , 𝑥" = 𝑓" 𝑡 , ⋯ ⾒ 𝑧 = 𝑔(𝑥!, 𝑥", ⋯ ) 𝜕𝑧 𝜕𝑡 = 𝜕𝑧 𝜕𝑥% 𝜕𝑥% 𝜕𝑡 + 𝜕𝑧 𝜕𝑥. 𝜕𝑥. 𝜕𝑡 + ⋯ 𝑡 𝑥% 𝑥. 𝑧 ⋯ 𝑥 𝑢 𝑦 𝑓 𝑔 𝑓! 𝑓" 𝑔
  • 46. 48 § 𝑦 = 𝑒;< = > < 𝑥 𝑒'* 1/𝑥 𝑎 𝑏 𝑦 𝑎.𝑏 𝑎 = 𝑒'* 𝑏 = 1 𝑥 𝑦 = 𝑎.𝑏 𝜕𝑓 𝜕𝑥 = 𝑒'.* 𝑥 5 = − 𝑒'.*(2𝑥 + 1) 𝑥. ( or ) 𝜕𝑦 𝜕𝑥 = 𝜕𝑦 𝜕𝑎 𝜕𝑎 𝜕𝑥 + 𝜕𝑦 𝜕𝑏 𝜕𝑏 𝜕𝑥 = 2𝑎𝑏 −𝑒'* + 𝑎. − 1 𝑥. = 2𝑒'* J % * −𝑒'* + 𝑒'* . − % *- = − 6.-/(.*+%) *- ( )
  • 47. 49 1 3 𝜕𝑦/𝜕𝑥 𝑥 𝑦 𝑎", 𝑏" Linear 𝑢" 𝑎#, 𝑏# Linear 𝑢# Sigmoid 𝑧" Sigmoid 𝑧# 𝑎$, 𝑏$, 𝑐$ Linear 𝑣 Sigmoid
  • 49. 51 2 𝑣 𝑢 𝑓% 𝑣 = 3 𝑢 𝑣 = log 𝑢 𝑣 = 1/𝑢 2 𝑣 𝑎 𝑏 𝑔% 𝑣 = 𝑎 𝑏 𝑣 = 𝑎. + 𝑏 𝑑𝑣 𝑑𝑢 = 3 2 𝑢 𝑑𝑣 𝑑𝑢 = 1 𝑢 𝑑𝑣 𝑑𝑢 = − 1 𝑢. 𝜕𝑣 𝜕𝑎 = 𝑏 𝜕𝑣 𝜕𝑏 = 𝑎 𝜕𝑣 𝜕𝑎 = 2𝑎 𝜕𝑣 𝜕𝑏 = 1 𝑣 𝑢 𝑓. 𝑣 𝑢 𝑓7 𝑣 𝑎 𝑏 𝑔. 𝑦 = 𝑔%(𝑔. 𝑓7 𝑥 , 𝑓. 𝑥 , 𝑓% 𝑥 ) = 3 𝑥 log 𝑥 + 1 𝑥.
  • 50. 52 2 𝑑𝑦 𝑑𝑥 = 𝑑𝑧, 𝑑𝑥 𝑑𝑧! 𝑑𝑧, 𝑑𝑦 𝑑𝑧! + 𝑑𝑧+ 𝑑𝑥 𝑑𝑧! 𝑑𝑧+ 𝑑𝑦 𝑑𝑧! + 𝑑𝑧" 𝑑𝑥 𝑑𝑦 𝑑𝑧" = 3 𝑥 1 𝑥 − 2 𝑥+ + 3 log 𝑥 + 1 𝑥" 2 𝑥 𝑦 = 𝑔% 𝑧%, 𝑧. = 𝑧%𝑧. 𝑧% = 𝑔. 𝑧8, 𝑧7 = 𝑧7 + 𝑧8 . 𝑧. = 𝑓% 𝑥 = 3 𝑥 𝑧7 = 𝑓. 𝑥 = log 𝑥 𝑧8 = 𝑓7 𝑥 = 1/𝑥 𝑥 𝑧. 𝑧7 𝑧8 𝑧% 𝑦 𝑓" 𝑓# 𝑓$ 𝑔# 𝑔" 𝑦 = 𝑔%(𝑔. 𝑓7 𝑥 , 𝑓. 𝑥 , 𝑓% 𝑥 ) = 3 𝑥 log 𝑥 + 1 𝑥.
  • 51. 53 (1 ) 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑧% = 𝑧8 . + 𝑧7 𝑦 = 𝑧%𝑧. 2 x=1.2 𝑦 = 3 𝑥 log 𝑥 + 1 𝑥. 2 • • backward x=1.2 dy/dx ( )
  • 52. 54 § ( ) § § 𝑑𝑦 𝑑𝑥 = 3 𝑥 1 𝑥 − 2 𝑥? + 3 log 𝑥 + 1 𝑥; 2 𝑥 𝑥 = 1.2 @A @B 0.14 § ( )
  • 53. 55 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 𝑧7 0.18 𝑧. 3.29 𝑧8 0.83 data data data data Forward
  • 54. 56 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 𝑧7 0.18 𝑧. 3.29 𝑧8 0.83 𝑧% 0.88 data data data data data Forward
  • 55. 57 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 𝑧7 0.18 𝑧. 3.29 𝑧8 0.83 𝑧% 0.88 𝑦 2.88 data data data data data data Forward
  • 56. 58 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 𝑧7 0.18 𝑧. 3.29 𝑧8 0.83 𝑧% 0.88 𝑦 2.88 1.00 data grad data grad data grad data grad 𝜕𝑦 𝜕𝑦 = 1 data grad data grad Backward
  • 57. 59 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 𝑧7 0.18 𝑧. 3.29 0.88 𝑧8 0.83 𝑧% 0.88 3.29 𝑦 2.88 1.00 data grad data grad data grad data grad 𝜕𝑦 𝜕𝑦 = 1 𝜕𝑦 𝜕𝑧" = 𝑧# 𝜕𝑦 𝜕𝑧# = 𝑧" data grad data grad = 0.88 = 3.29 Backward
  • 58. 60 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 𝑧7 0.18 𝑧. 3.29 0.88 𝑧8 0.83 𝑧% 0.88 3.29 𝑦 2.88 1.00 data grad data grad data grad data grad 𝜕𝑦 𝜕𝑦 = 1 𝜕𝑦 𝜕𝑧" = 𝑧# 𝜕𝑦 𝜕𝑧# = 𝑧" 𝜕𝑧" 𝜕𝑧$ = 1 𝜕𝑧" 𝜕𝑧0 = 2 𝑧0 data grad data grad = 1.67 𝜕𝑦 𝜕𝑧$ = 𝜕𝑦 𝜕𝑧" 𝜕𝑧" 𝜕𝑧$ 𝜕𝑦 𝜕𝑧0 = 𝜕𝑦 𝜕𝑧" 𝜕𝑧" 𝜕𝑧0 = 0.88 = 3.29 Backward
  • 59. 61 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 𝑧7 0.18 3.29 𝑧. 3.29 0.88 𝑧8 0.83 5.48 𝑧% 0.88 3.29 𝑦 2.88 1.00 data grad data grad data grad data grad 𝜕𝑦 𝜕𝑦 = 1 𝜕𝑦 𝜕𝑧" = 𝑧# 𝜕𝑦 𝜕𝑧# = 𝑧" 𝜕𝑧" 𝜕𝑧$ = 1 𝜕𝑧" 𝜕𝑧0 = 2 𝑧0 data grad data grad = 1.67 𝜕𝑦 𝜕𝑧$ = 𝜕𝑦 𝜕𝑧" 𝜕𝑧" 𝜕𝑧$ 𝜕𝑦 𝜕𝑧0 = 𝜕𝑦 𝜕𝑧" 𝜕𝑧" 𝜕𝑧0 = 0.88 = 3.29 Backward
  • 60. 62 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 𝑧7 0.18 3.29 𝑧. 3.29 0.88 𝑧8 0.83 5.48 𝑧% 0.88 3.29 𝑦 2.88 1.00 data grad data grad data grad data grad 𝜕𝑦 𝜕𝑦 = 1 𝜕𝑦 𝜕𝑧" = 𝑧# 𝜕𝑦 𝜕𝑧# = 𝑧" 𝜕𝑧" 𝜕𝑧$ = 1 𝜕𝑧" 𝜕𝑧0 = 2 𝑧0 data grad data grad = 1.67 𝜕𝑦 𝜕𝑥 = 𝜕𝑦 𝜕𝑧# 𝜕𝑧# 𝜕𝑥 + 𝜕𝑦 𝜕𝑧" 𝜕𝑧" 𝜕𝑧$ 𝜕𝑧$ 𝜕𝑥 + 𝜕𝑦 𝜕𝑧" 𝜕𝑧" 𝜕𝑧0 𝜕𝑧0 𝜕𝑥 𝜕𝑧# 𝜕𝑥 = 3 2 𝑥 = 1.37 = 0.88 = 3.29 Backward
  • 61. 63 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 𝑧7 0.18 3.29 𝑧. 3.29 0.88 𝑧8 0.83 5.48 𝑧% 0.88 3.29 𝑦 2.88 1.00 data grad data grad data grad data grad 𝜕𝑦 𝜕𝑦 = 1 𝜕𝑦 𝜕𝑧" = 𝑧# 𝜕𝑦 𝜕𝑧# = 𝑧" 𝜕𝑧" 𝜕𝑧$ = 1 𝜕𝑧" 𝜕𝑧0 = 2 𝑧0 data grad data grad = 1.67 𝜕𝑦 𝜕𝑥 = 𝜕𝑦 𝜕𝑧# 𝜕𝑧# 𝜕𝑥 + 𝜕𝑦 𝜕𝑧" 𝜕𝑧" 𝜕𝑧$ 𝜕𝑧$ 𝜕𝑥 + 𝜕𝑦 𝜕𝑧" 𝜕𝑧" 𝜕𝑧0 𝜕𝑧0 𝜕𝑥 𝜕𝑧# 𝜕𝑥 = 3 2 𝑥 = 1.37 𝜕𝑧$ 𝜕𝑥 = 1 𝑥 = 0.83 = 0.88 = 3.29 Backward
  • 62. 64 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 𝑧7 0.18 3.29 𝑧. 3.29 0.88 𝑧8 0.83 5.48 𝑧% 0.88 3.29 𝑦 2.88 1.00 data grad data grad data grad data grad 𝜕𝑦 𝜕𝑦 = 1 𝜕𝑦 𝜕𝑧" = 𝑧# 𝜕𝑦 𝜕𝑧# = 𝑧" 𝜕𝑧" 𝜕𝑧$ = 1 𝜕𝑧" 𝜕𝑧0 = 2 𝑧0 data grad data grad = 1.67 𝜕𝑦 𝜕𝑥 = 𝜕𝑦 𝜕𝑧# 𝜕𝑧# 𝜕𝑥 + 𝜕𝑦 𝜕𝑧" 𝜕𝑧" 𝜕𝑧$ 𝜕𝑧$ 𝜕𝑥 + 𝜕𝑦 𝜕𝑧" 𝜕𝑧" 𝜕𝑧0 𝜕𝑧0 𝜕𝑥 𝜕𝑧# 𝜕𝑥 = 3 2 𝑥 = 1.37 𝜕𝑧$ 𝜕𝑥 = 1 𝑥 = 0.83 𝜕𝑧0 𝜕𝑥 = − 1 𝑥# = -0.69 = 0.88 = 3.29 Backward
  • 63. 65 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 0.14 𝑧7 0.18 3.29 𝑧. 3.29 0.88 𝑧8 0.83 5.48 𝑧% 0.88 3.29 𝑦 2.88 1.00 data grad data grad data grad data grad 𝜕𝑦 𝜕𝑦 = 1 data grad data grad 1.67 𝜕𝑦 𝜕𝑥 = 𝜕𝑦 𝜕𝑧# 𝜕𝑧# 𝜕𝑥 + 𝜕𝑦 𝜕𝑧" 𝜕𝑧" 𝜕𝑧$ 𝜕𝑧$ 𝜕𝑥 + 𝜕𝑦 𝜕𝑧" 𝜕𝑧" 𝜕𝑧0 𝜕𝑧0 𝜕𝑥 1.37 0.83 -0.69 0.88 3.29 1 Backward
  • 64. 66 𝑥 1.2 0.14 𝑧7 0.18 3.29 𝑧. 3.29 0.88 𝑧8 0.83 5.48 𝑧% 0.88 3.29 𝑦 2.88 1.0 data data data data data data grad grad grad grad grad grad 𝜕𝑦 𝜕𝑥 𝜕𝑦 𝜕𝑧# 𝜕𝑦 𝜕𝑧$ 𝜕𝑦 𝜕𝑧0 𝜕𝑦 𝜕𝑧" 𝜕𝑦 𝜕𝑦 Forward Backward y ( )
  • 65. 67 𝑧. = 3 𝑥 𝑥 1.2 data grad 𝑧. 3.29 data grad Forward Backward z2.data ← 3 * sqrt(x.data) 𝑥 1.2 data grad 𝑧. 3.29 data grad backward x.grad ← 3/(2*sqrt(x.data)) * z2.grad 𝜕𝑧# 𝜕𝑥 = 3 2 𝑥 x.grad ← 3/(2*sqrt(x.data)) * z2.grad 𝜕𝑦 𝜕𝑥 = 𝜕𝑦 𝜕𝑧# × 𝜕𝑧# 𝜕𝑥
  • 66. 68 𝑧. = 3 𝑥 𝑥 1.2 data grad 𝑧. 3.29 data grad Forward Backward z2.data ← 3 * sqrt(x.data) 𝑥 1.2 data grad 𝑧. 3.29 data grad backward x.grad ← 3/(2*sqrt(x.data)) * z2.grad 𝜕𝑧# 𝜕𝑥 = 3 2 𝑥 x.grad ← 3/(2*sqrt(x.data)) * z2.grad 𝜕𝑦 𝜕𝑧# = 0.88 1.20 0.88 𝜕𝑦 𝜕𝑥 = 𝜕𝑦 𝜕𝑧# × 𝜕𝑧# 𝜕𝑥
  • 67. 69 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 𝑧7 0.18 𝑧. 3.29 𝑧8 0.83 𝑧% 0.88 𝑦 2.88 data data grad data grad data grad grad data grad data grad ( ) 1. Forward ( & )
  • 68. 70 𝜕𝑧+ 𝜕𝑥 = − 1 𝑥, 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 𝑧7 0.18 𝑧. 3.29 𝑧8 0.83 𝑧% 0.88 𝑦 2.88 data data data data data data grad grad grad grad grad grad ⾒ Backward ⾒ 𝜕𝑧- 𝜕𝑥 = 1 𝑥 𝜕𝑧, 𝜕𝑥 = 3 2 𝑥 𝜕𝑦 𝜕𝑧, = 𝑧. 𝜕𝑧. 𝜕𝑧- = 1 𝜕𝑧. 𝜕𝑧+ = 2 𝑧+ 𝜕𝑦 𝜕𝑧. = 𝑧, 1. Forward ( & )
  • 69. 71 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 𝑧7 0.18 𝑧. 3.29 𝑧8 0.83 𝑧% 0.88 𝑦 2.88 data data data data data data grad grad grad grad grad grad x.grad ← 3/(2*sqrt(x.data)) * z2.grad x.grad ← (-1/x.data**2) * z4.grad z2.grad ← z1.data* y.grad z1.grad ← z2.data* y.grad z3.grad ← 1.0 * z1.grad z4.grad ← (2*z4.data)* z1.grad x.grad ← 1/x.data * z3.grad 1. Forward ( & ) ⾒ Backward ⾒
  • 70. 72 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 0.0 𝑧7 0.18 0.0 𝑧. 3.29 0.0 𝑧8 0.83 0.0 𝑧% 0.88 0.0 𝑦 2.88 0.0 data data data data data data grad grad grad grad grad grad 2. Backward (grad 0 )
  • 71. 73 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 0.0 𝑧7 0.18 0.0 𝑧. 3.29 0.0 𝑧8 0.83 0.0 𝑧% 0.88 0.0 𝑦 2.88 1.0 data data data data data data grad grad grad grad grad grad 3. Backward ( grad 1 )
  • 72. 74 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 0.0 𝑧7 0.18 0.0 𝑧. 3.29 0.88 𝑧8 0.83 0.0 𝑧% 0.88 3.29 𝑦 2.88 1.0 data data data data data data grad grad grad grad grad grad z2.grad ← z1.data* y.grad z1.grad ← z2.data* y.grad 4. Backward (y grad )
  • 73. 75 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 0.0 𝑧7 0.18 3.29 𝑧. 3.29 0.88 𝑧8 0.83 5.48 𝑧% 0.88 3.29 𝑦 2.88 1.0 data data data data data data grad grad grad grad grad grad z3.grad ← 1.0 * z1.grad z4.grad ← (2*z4.data)* z1.grad 𝜕𝑦 𝜕𝑧$ = 𝜕𝑦 𝜕𝑧" 𝜕𝑧" 𝜕𝑧$ 𝜕𝑦 𝜕𝑧0 = 𝜕𝑦 𝜕𝑧" 𝜕𝑧" 𝜕𝑧0 5. Backward (z1 grad )
  • 74. 76 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 1.20 𝑧7 0.18 3.29 𝑧. 3.29 0.88 𝑧8 0.83 5.48 𝑧% 0.88 3.29 𝑦 2.88 1.0 data data data data data data grad grad grad grad grad grad x.grad ← 3/(2*sqrt(x.data)) * z2.grad 1.20 grad=0.0 𝜕𝑦 𝜕𝑥 = 𝜕𝑦 𝜕𝑧# 𝜕𝑧# 𝜕𝑥 + 𝜕𝑦 𝜕𝑧" 𝜕𝑧" 𝜕𝑧$ 𝜕𝑧$ 𝜕𝑥 + 𝜕𝑦 𝜕𝑧" 𝜕𝑧" 𝜕𝑧0 𝜕𝑧0 𝜕𝑥 6. Backward (z2 grad )
  • 75. 77 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 3.94 𝑧7 0.18 3.29 𝑧. 3.29 0.88 𝑧8 0.83 5.48 𝑧% 0.88 3.29 𝑦 2.88 1.0 data data data data data data grad grad grad grad grad grad x.grad ← 1/x.data * z3.grad 2.74 grad=1.20 𝜕𝑦 𝜕𝑥 = 𝜕𝑦 𝜕𝑧# 𝜕𝑧# 𝜕𝑥 + 𝜕𝑦 𝜕𝑧" 𝜕𝑧" 𝜕𝑧$ 𝜕𝑧$ 𝜕𝑥 + 𝜕𝑦 𝜕𝑧" 𝜕𝑧" 𝜕𝑧0 𝜕𝑧0 𝜕𝑥 𝜕𝑦 𝜕𝑧$ 6. Backward (z3 grad )
  • 76. 78 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 0.14 𝑧7 0.18 3.29 𝑧. 3.29 0.88 𝑧8 0.83 5.48 𝑧% 0.88 3.29 𝑦 2.88 1.0 data data data data data data grad grad grad grad grad grad -3.80 grad=3.94 x.grad ← (-1/x.data**2) * z4.grad 𝜕𝑦 𝜕𝑥 = 𝜕𝑦 𝜕𝑧# 𝜕𝑧# 𝜕𝑥 + 𝜕𝑦 𝜕𝑧" 𝜕𝑧" 𝜕𝑧$ 𝜕𝑧$ 𝜕𝑥 + 𝜕𝑦 𝜕𝑧" 𝜕𝑧" 𝜕𝑧0 𝜕𝑧0 𝜕𝑥 𝜕𝑦 𝜕𝑧0 6. Backward (z4 grad )
  • 77. 79 𝑦 = 𝑧%𝑧. 𝑧% = 𝑧8 . + 𝑧7 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑥 1.2 0.14 𝑧7 0.18 3.29 𝑧. 3.29 0.88 𝑧8 0.83 5.48 𝑧% 0.88 3.29 𝑦 2.88 1.0 data data data data data data grad grad grad grad grad grad 𝜕𝑦 𝜕𝑥 𝜕𝑦 𝜕𝑧# 𝜕𝑦 𝜕𝑧$ 𝜕𝑦 𝜕𝑧0 𝜕𝑦 𝜕𝑧" 𝜕𝑦 𝜕𝑦 ( ) y
  • 78. 80 𝑧. = 3 𝑥 𝑧7 = log 𝑥 𝑧8 = 1/𝑥 𝑧% = 𝑧8 . + 𝑧7 𝑦 = 𝑧%𝑧. PyTorch Backward Forward
  • 79. 81 grad x = tensor(1.2, requires_grad=True) z2 = 3*sqrt(x) z3 = log(x) z4 = 1/x z1 = z4**2 + z3 y = z1 * z2 y.retain_grad() z1.retain_grad() z2.retain_grad() z3.retain_grad() z4.retain_grad() y.backward() print(x.data, z2.data, z3.data, z4.data, z1.data, y.data) print(x.grad, z2.grad, z3.grad, z4.grad, z1.grad, y.grad) tensor(1.20) tensor(3.29) tensor(0.18) tensor(0.83) tensor(0.88) tensor(2.88) tensor(0.14) tensor(0.88) tensor(3.29) tensor(5.48) tensor(3.29) tensor(1.) import torch torch.set_printoptions(2) 2 PyTorch
  • 80. 82 𝑦 = 𝑥. + 2𝑥 + 3 = 𝑥 + 1 . + 2 (2.0) • (Forward) • (Backward) • • x.grad 𝑥 = −1 𝑦 = 2
  • 81. 83 𝑦 = 𝑥$ + 3.2 𝑥# + 1.3 𝑥 − 2 20 ( -1.5 1.5)
  • 83. 85 MSE = mean squared errors ( ) SGD = stochastic gradient descent ( ) PyTorch
  • 84. 86 ( ) • • (lr ) • • • GPU
  • 85. 87 • https://github.com/karpathy/micrograd • Python 94 • https://github.com/karpathy/micrograd/blob/master/micrograd/engine.py • Andrej Karpathy OpenAI (Telsa AI 2023 OpenAI ) • https://youtu.be/VMj-3S1tku0?si=91ZWzaA4ECidua4g micrograd
  • 86. 88 § 𝑦 = 3 𝑥 𝑧 = 𝑥 𝑦 = 3𝑧 § PyTorch https://pytorch.org/docs/stable/torch.html#math-operations § §
  • 87. 89 ( ) J aime la musique I love music (Software 2.0 )
  • 88. Q & A