Advanced Topics in Computational Science A
Lecture 14: Fundamentals and Practice of Deep Learning Frameworks, Part 1
Tokyo Institute of Technology, Global Scientific Information and Computing Center
Rio Yokota
rioyokota@gsic.titech.ac.jp
Deep Learning Only Possible on a Supercomputer

[Figure: Top-1 accuracy on ImageNet versus number of parameters (10^7 to 10^13, a span of ×100,000) for ResNet-50, DistilBERT, ELMo, BERT-Large, GPT-2, Megatron LM, Turing-NLG, GPT-3, and Switch Transformer.]

The ImageNet SOTA of 90.45% Top-1 accuracy was reached on TPU v3 at a cost of 10,000 TPUv3 core-days, versus the MLPerf target score of 75.9. Bring on the compute.

Papers with Code (https://paperswithcode.com), ImageNet-1k: many of the papers reporting the best results can be reproduced on local hardware.
Evolution of Major Deep Neural Network Models
https://towardsdatascience.com/from-lenet-to-efficientnet-the-evolution-of-cnns-3a57eb34672f

1995: LeNet-5 (convolution); LSTM
2012: AlexNet (ReLU, Dropout, GPU)
2015: ResNet (skip connections)
2017: MobileNet (1x1 convolutions)
2019: EfficientNet (neural architecture search)
2021: Vision Transformer (image patches), building on the Transformer (attention mechanism)
Training Deep Neural Networks

[Figure: a two-layer network maps an input image x to class probabilities p, e.g. p = (0.9, 0.1) for the classes "Labradoodle" and "Fried chicken", compared against the one-hot label y = (1, 0).]

Forward propagation:
x = h_0
u_0 = W_0 h_0,   h_1 = f_0(u_0)
u_1 = W_1 h_1,   p = f_1(u_1)
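The forward propagation above can be sketched in a few lines of NumPy. This is a minimal illustration, not the lecture's code: the layer sizes are arbitrary, and f_0 = tanh and f_1 = softmax are assumed choices of activation.

```python
import numpy as np

rng = np.random.default_rng(0)

def f0(u):                    # hidden activation (tanh, an assumed choice)
    return np.tanh(u)

def f1(u):                    # output activation: softmax -> probabilities
    e = np.exp(u - u.max())
    return e / e.sum()

h0 = rng.standard_normal(5)           # input x = h0
W0 = rng.standard_normal((4, 5))      # first-layer weights
W1 = rng.standard_normal((2, 4))      # second-layer weights

u0 = W0 @ h0
h1 = f0(u0)
u1 = W1 @ h1
p = f1(u1)                            # class probabilities, sum to 1
print(p)
```

Each layer is a matrix-vector product followed by an elementwise nonlinearity, which is exactly the structure the backward pass will exploit.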
[Network diagram: fully connected layers; y is the label.]

Cross entropy loss:
L = −Σ_data Σ_class y log p = −Σ_data log p   (since y is one-hot)

Backward propagation (chain rule):
∂L/∂W_1 = Σ_data (∂L/∂p)(∂p/∂u_1)(∂u_1/∂W_1)
∂L/∂W_0 = Σ_data (∂L/∂p)(∂p/∂u_1)(∂u_1/∂h_1)(∂h_1/∂u_0)(∂u_0/∂W_0)

The derivatives of the four arithmetic operations and of the elementary functions are defined inside the framework; chaining them lets the gradient be computed with matrix products. Multiplying the factors from the back turns everything into matrix-vector products. This is done for every image, and the results are summed at the end.
[Diagram: the factors ∂h_1/∂u_0, ∂u_1/∂h_1, ∂p/∂u_1, ∂u_1/∂W_1, ∂u_0/∂W_0, and ∂L/∂p annotate the forward-propagation and backward-propagation passes, with the cross entropy loss at the output.]

Stochastic Gradient Descent (SGD)

With gradients computed by backpropagation, the weights are updated as
W_{t+1} = W_t − η ∂L/∂W_t
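The whole loop above — forward pass, cross entropy, backpropagation by multiplying Jacobians from the back, and the SGD update — fits in a short NumPy sketch. This is an illustrative toy (single example, assumed tanh/softmax activations and learning rate), not the lecture's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def f0(u): return np.tanh(u)          # hidden activation (assumed)
def f1(u):                            # softmax output
    e = np.exp(u - u.max())
    return e / e.sum()

h0 = rng.standard_normal(5)           # input x = h0
y = np.array([1.0, 0.0])              # one-hot label
W0 = 0.1 * rng.standard_normal((4, 5))
W1 = 0.1 * rng.standard_normal((2, 4))
eta = 0.1                             # learning rate (assumed)

def loss(W0, W1):
    p = f1(W1 @ f0(W0 @ h0))
    return -np.sum(y * np.log(p))     # cross entropy with one-hot y

L_init = loss(W0, W1)
for _ in range(50):
    # forward propagation
    u0 = W0 @ h0; h1 = f0(u0)
    u1 = W1 @ h1; p = f1(u1)
    # backward propagation: multiply the Jacobians from the back,
    # so every step is a matrix-vector product
    dL_du1 = p - y                    # (dL/dp)(dp/du1) for softmax + CE
    dL_dW1 = np.outer(dL_du1, h1)     # ... times du1/dW1
    dL_dh1 = W1.T @ dL_du1            # ... times du1/dh1
    dL_du0 = dL_dh1 * (1 - h1**2)     # tanh derivative, dh1/du0
    dL_dW0 = np.outer(dL_du0, h0)     # ... times du0/dW0
    # SGD update: W_{t+1} = W_t - eta * dL/dW_t
    W1 -= eta * dL_dW1
    W0 -= eta * dL_dW0

print(L_init, loss(W0, W1))           # the loss shrinks as the net fits y
```

Note how the softmax-plus-cross-entropy combination collapses (∂L/∂p)(∂p/∂u_1) into the simple residual p − y, which is why frameworks fuse the two.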
Optimization Methods
https://losslandscape.com

Let θ denote the weights W and the biases b together. Note that the shape of the loss landscape changes with every mini-batch.

SGD:
θ_{t+1} = θ_t − η ∇L(θ_t)

momentum SGD (momentum term):
v_{t+1} = α v_t + η ∇L(θ_t)
θ_{t+1} = θ_t − v_{t+1}

Written in semi-implicit-Euler style, to make it physically consistent (the negative gradient plays the role of an acceleration acting on a velocity v):
a_t = −∇L(θ_t)
v_{t+1} = α v_t + η a_t
θ_{t+1} = θ_t + v_{t+1}

Nesterov momentum (momentum term, with the gradient taken at the look-ahead point):
v_{t+1} = α v_t + η ∇L(θ_t − α v_t)
θ_{t+1} = θ_t − v_{t+1}

[Diagram: with plain momentum the step η ∇L(θ_t) is evaluated at θ_t before adding v_t; with Nesterov momentum it is evaluated at the look-ahead point θ_t − α v_t.]

RMSProp (gradient-variance term + normalization):
v_{t+1} = ρ v_t + (1 − ρ) ∇L(θ_t)²
θ_{t+1} = θ_t − η ∇L(θ_t) / (√v_{t+1} + ε)

Adam (momentum term, gradient-variance term, normalization, and an initial bias-correction term):
m_{t+1} = β₁ m_t + (1 − β₁) ∇L(θ_t)
v_{t+1} = β₂ v_t + (1 − β₂) ∇L(θ_t)²
b_{t+1} = √(1 − β₂^{t+1}) / (1 − β₁^{t+1})
θ_{t+1} = θ_t − α b_{t+1} m_{t+1} / (√v_{t+1} + ε)
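The update rules above can be compared directly on a toy problem. The sketch below (an illustration, not the lecture's code) runs SGD, momentum SGD, and Adam on the quadratic loss L(θ) = ½‖θ‖², whose gradient is simply θ; the hyperparameter values are illustrative assumptions.

```python
import numpy as np

def grad_L(theta):                       # gradient of L = 0.5 * ||theta||^2
    return theta

theta0 = np.array([1.0, -2.0])

# SGD: theta <- theta - eta * grad
theta = theta0.copy()
for _ in range(100):
    theta = theta - 0.1 * grad_L(theta)

# momentum SGD: velocity accumulates the gradient
theta_m, v = theta0.copy(), np.zeros(2)
for _ in range(100):
    v = 0.9 * v + 0.1 * grad_L(theta_m)
    theta_m = theta_m - v

# Adam: momentum term, gradient-variance term, and bias correction
theta_a = theta0.copy()
m, va = np.zeros(2), np.zeros(2)
beta1, beta2, alpha, eps = 0.9, 0.999, 0.1, 1e-8
for t in range(1, 101):
    g = grad_L(theta_a)
    m = beta1 * m + (1 - beta1) * g
    va = beta2 * va + (1 - beta2) * g**2
    b = np.sqrt(1 - beta2**t) / (1 - beta1**t)   # initial bias correction
    theta_a = theta_a - alpha * b * m / (np.sqrt(va) + eps)

print(np.linalg.norm(theta), np.linalg.norm(theta_m), np.linalg.norm(theta_a))
```

All three drive θ toward the minimum at the origin; on real loss landscapes their behavior diverges, which is what the comparison papers below measure.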
https://arxiv.org/pdf/2007.01547.pdf

Optimization Methods

These optimization methods form an inclusion hierarchy: an optimizer that subsumes another will always reach at least the same accuracy, provided its hyperparameters are tuned.
https://arxiv.org/pdf/1910.05446.pdf
(lower is better)
Evolution of Major Deep Neural Network Models
https://towardsdatascience.com/from-lenet-to-efficientnet-the-evolution-of-cnns-3a57eb34672f

1995: LeNet-5 (convolution); LSTM
2012: AlexNet (ReLU, Dropout, GPU)
2015: ResNet (skip connections)
2017: MobileNet (squeeze and excite)
2019: EfficientNet (neural architecture search)
2021: Vision Transformer (image patches), building on the Transformer (attention mechanism)
Convolutional Neural Networks
https://cs231n.github.io/convolutional-networks/

[Input/output tensor dimensions]
N: batch size
C: number of channels
H: image height
W: image width

[Convolution parameters]
F: filter size
P: padding width
S: stride

Example with 3 input channels and 2 output channels:
[Input]  N: 1, Cin: 3, Hin: 5, Win: 5
[Output] N: 1, Cout: 2, Hout: 3, Wout: 3
(F: 3, P: 1, S: 2)

[Figure: the input image and filters mapped to the output image under each computation scheme: GEMM, batched GEMM, Winograd, FFT.]
http://cs231n.stanford.edu/reports/2016/pdfs/117_Report.pdf
https://arxiv.org/abs/1410.0759
https://www.slideshare.net/nervanasys/an-analysis-of-convolution-for-inference
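Of the schemes above, the GEMM approach (im2col) is the easiest to sketch: each receptive-field patch is unrolled into a column, and one matrix multiply then produces every output channel at every position. The NumPy toy below is an illustration using the slide's example shapes (Cin=3, Cout=2, 5×5 input, F=3, P=1, S=2), not a production implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
N, Cin, Hin, Win = 1, 3, 5, 5
Cout, F, P, S = 2, 3, 1, 2

x = rng.standard_normal((N, Cin, Hin, Win))   # input image
w = rng.standard_normal((Cout, Cin, F, F))    # filters

# output size: (H + 2P - F) / S + 1
Hout = (Hin + 2 * P - F) // S + 1             # = 3
Wout = (Win + 2 * P - F) // S + 1             # = 3

xp = np.pad(x, ((0, 0), (0, 0), (P, P), (P, P)))  # zero padding

# im2col: every output position becomes one unrolled column
cols = np.empty((Cin * F * F, N * Hout * Wout))
idx = 0
for n in range(N):
    for i in range(Hout):
        for j in range(Wout):
            patch = xp[n, :, i * S:i * S + F, j * S:j * S + F]
            cols[:, idx] = patch.ravel()
            idx += 1

# a single GEMM computes all output channels at all positions
out = (w.reshape(Cout, -1) @ cols)
out = out.reshape(Cout, N, Hout, Wout).transpose(1, 0, 2, 3)
print(out.shape)   # (1, 2, 3, 3)
```

This trades memory (the patches are duplicated in `cols`) for the ability to use a highly tuned GEMM kernel, which is why cuDNN and similar libraries offer it as one of several algorithms.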
Normalization
https://theaisummer.com/normalization/

Batch normalization (BN), Layer normalization (LN), Group normalization (GN), Weight standardization (WS)

All apply the same affine standardization
x̂ = γ (x − μ(x)) / σ(x) + β
and differ only in the axes over which the statistics μ and σ are taken.

BN (over N, H, W, per channel):
μ(x) = (1 / NHW) Σ_{n=1}^{N} Σ_{h=1}^{H} Σ_{w=1}^{W} x_{nchw}
σ(x) = √( (1 / NHW) Σ_{n=1}^{N} Σ_{h=1}^{H} Σ_{w=1}^{W} (x_{nchw} − μ(x))² )

LN (over C, H, W, per sample):
μ(x) = (1 / CHW) Σ_{c=1}^{C} Σ_{h=1}^{H} Σ_{w=1}^{W} x_{nchw}
σ(x) = √( (1 / CHW) Σ_{c=1}^{C} Σ_{h=1}^{H} Σ_{w=1}^{W} (x_{nchw} − μ(x))² )

GN (over a group of channels, per sample; here with two groups of C/2 channels each):
μ(x) = (2 / CHW) Σ_{c=1}^{C/2} Σ_{h=1}^{H} Σ_{w=1}^{W} x_{nchw}
with σ(x) defined analogously over the same group.

WS applies the same standardization to the weights rather than to the activations.
(x) =
v
u
u
t 2
CHW
C/2
X
c=1
H
X
h=1
W
X
w=1
(xnchw µ(x))2
<latexit sha1_base64="PSqEPc30bQy6FHyPWYe6IWsLh00=">AAACZnicbVFNS8MwGM7q9/dUxIOX4BDmwdFOUS/CcDB2VHBWWGdJs3SLS9qapMII/Y9e/QGCv8CrprMH3Xwh8LzP837xJEgYlcq230rW3PzC4tLyyura+sbmVnl7517GqcCkg2MWi4cAScJoRDqKKkYeEkEQDxhxg1Ez190XIiSNozs1TkiPo0FEQ4qRMpRffvIkHXBUdY+vPPkslPZCgbB2Mt1s+e2W72aeTPmjbma+xldOkRnJ5KHf/sW4E8Y1TNU1tUY0WXbi8dQMP36sZ365YtfsScBZ4BSgAoq48cvvXj/GKSeRwgxJ2XXsRPU0EopiRrJVL5UkQXiEBqRrYIQ4kT098SSDR4bpwzAW5kUKTtjfHRpxKcc8MJUcqaGc1nLyP62bqvCyp2mUpIpE+GdRmDKoYpgbDPtUEKzY2ACEBTW3QjxExlRlvuHPlr7MT8t9caZdmAX39ZpzXju9Pas0rguHlsEBOARV4IAL0ABtcAM6AINX8Am+SqD0YW1ae9b+T6lVKnp2wZ+w4Dc1/7tI</latexit>
(W) =
v
u
u
t 1
CFHFW
C
X
c=1
FH
X
fH =1
FW
X
fW =1
(WcfH fW
µ(W))2
<latexit sha1_base64="kB8kd70X3SXw/F5W+lkWy3s6d8o=">AAACUXicbZDNSsNAFIVv41+tf1WXboJF0E1JVNSNUCyULivYptDUMJlO2sGZJMxMhBLyZD6GK5cu3OgTuHPSZtGqFwbO/c4d5s7xY0alsqy3krGyura+Ud6sbG3v7O5V9w96MkoEJl0csUj0fSQJoyHpKqoY6ceCIO4z4vhPzdx3nomQNAof1DQmQ47GIQ0oRkojr9p1eXLqnN26gUA4Pc/SZstrtzwnc2XCH9Nm5qX41i46bek+8NoLxJkRRxNHj2ovyFG1ZtWtWZl/hV2IGhTV8aof7ijCCSehwgxJObCtWA1TJBTFjGQVN5EkRvgJjclAyxBxIofp7PuZeaLJyAwioU+ozBldvJEiLuWU+3qSIzWRv70c/ucNEhXcDFMaxokiIZ4/FCTMVJGZZ2mOqCBYsakWCAuqdzXxBOkglU586ZWRzFfLc7F/p/BX9M7r9lX94v6y1rgrEirDERzDKdhwDQ1oQwe6gOEF3uETvkqvpW8DDGM+apSKO4ewVMbWD9qTtTQ=</latexit>
µ(W) =
2
CFHFW
C
X
c=1
FH
X
fH =1
FW
X
fW =1
WcfH fW
<latexit sha1_base64="Pt2kdQmIa3vxI/2P4C+R6402oY0=">AAACK3icbZDNTsJAFIWn+If4h7p000hMYCG2atSNCdGNS0yEklBCpsMUJsy0zcytCWn6GD6GT+BWn8CVxi3v4RRYCHiSSb6ce2/uneNFnCmwrC8jt7K6tr6R3yxsbe/s7hX3D5oqjCWhDRLyULY8rChnAW0AA05bkaRYeJw63vA+qzvPVCoWBk8wimhH4H7AfEYwaKtbPHMHGBInvXU59aHs+hKTxDl1RVx2KmniKtYXOENXsv4AKt1iyapaE5nLYM+ghGaqd4tjtxeSWNAACMdKtW0rgk6CJTDCaVpwY0UjTIa4T9saAyyo6iSTj6XmiXZ6ph9K/QIwJ+7fiQQLpUbC050Cw0At1jLzv1o7Bv+mk7AgioEGZLrIj7kJoZmlZPaYpAT4SAMmkulbTTLAOhvQWc5t6anstFTnYi+msAzN86p9Vb14vCzV7mYJ5dEROkZlZKNrVEMPqI4aiKAX9Ibe0Yfxanwa38bPtDVnzGYO0ZyM8S8fuqhU</latexit>
Ŵ =
✓
W µ(W)
(W)
◆
(figure: accuracy comparison of BN and BN + LN combinations; higher is better)
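The BN and LN statistics above can be sketched in plain NumPy (a minimal illustration with scalar γ and β; not the framework implementation):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # Batch Norm: statistics over N, H, W for each channel (layout N, C, H, W)
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

def layer_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # Layer Norm: statistics over C, H, W for each sample
    mu = x.mean(axis=(1, 2, 3), keepdims=True)
    var = x.var(axis=(1, 2, 3), keepdims=True)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

x = np.random.default_rng(0).standard_normal((8, 4, 16, 16))  # (N, C, H, W)
print(batch_norm(x).mean(axis=(0, 2, 3)))  # ~0 for every channel
print(layer_norm(x).var(axis=(1, 2, 3)))   # ~1 for every sample
```

The only difference between the two is the set of axes the statistics are averaged over.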
Data augmentation
Flipping
Rotation
Cutout
Random crop
Scale
Random Erasing
Mixup
CutMix
AugMix
AutoAugment
Searches for the optimal augmentation policy with reinforcement learning
Fast AutoAugment
Shortens the search with reinforcement learning + Bayesian optimization
Faster AutoAugment
Further reduces search time with gradient-based search
https://openreview.net/pdf?id=S1gmrxHFvB
https://github.com/xkumiyu/numpy-data-augmentation
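A few of the augmentations listed above can be sketched in plain NumPy (function names are illustrative, not taken from the linked repository):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_flip(img):
    # horizontal flip with probability 0.5
    return img[:, ::-1] if rng.random() < 0.5 else img

def random_crop(img, size):
    # crop a random (size x size) window
    h, w = img.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return img[top:top + size, left:left + size]

def cutout(img, size):
    # Cutout: zero out a random square patch
    out = img.copy()
    h, w = img.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    out[top:top + size, left:left + size] = 0
    return out

img = rng.random((32, 32, 3))
aug = cutout(random_crop(random_flip(img), 28), 8)
print(aug.shape)  # (28, 28, 3)
```

Augmentations compose: each transform takes an image and returns an image, so pipelines are just function composition.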
Regularization

Loss function:

L = \sum_{\mathrm{data}} \left( -\log p \right)

L2 regularization:

L = \sum_{\mathrm{data}} \left( -\log p \right) + \lambda \|W\|_2^2

L1 regularization:

L = \sum_{\mathrm{data}} \left( -\log p \right) + \lambda \|W\|_1

Flooding (https://arxiv.org/abs/2002.08709):

L = \left| \sum_{\mathrm{data}} \left( -\log p \right) - b \right| + b

Sharpness-Aware Minimization (SAM, https://arxiv.org/abs/2010.01412):

L = \sum_{\mathrm{data}} \max_{\|\epsilon\| \le \rho} \left[ -\log p(W + \epsilon) \right]

Dropout
Stochastic depth (https://arxiv.org/abs/1603.09382)
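The loss variants above can be sketched in NumPy (a minimal illustration; `p` stands for the predicted probabilities of the correct classes, and the values of `lam` and the flood level `b` are arbitrary):

```python
import numpy as np

def nll(p):
    # base loss: sum of negative log-likelihoods over the data
    return -np.log(p).sum()

def l2_loss(p, W, lam):
    # L2 regularization adds a penalty on the squared weight norm
    return nll(p) + lam * np.sum(W ** 2)

def flooding_loss(p, b):
    # flooding keeps the training loss from falling below the flood level b
    return np.abs(nll(p) - b) + b

p = np.array([0.9, 0.8, 0.95])
W = np.ones((2, 2))
print(l2_loss(p, W, lam=1e-2))
print(flooding_loss(p, b=0.5))
```

Note that the flooding loss is always at least b, so gradient descent oscillates around the flood level instead of driving the training loss to zero.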
Distributed parallelization

Data parallelism: data is partitioned, the model is replicated; gradients are communicated; the batch size becomes huge. Example: Horovod
Tensor parallelism: data is replicated, the model is partitioned; activations are communicated; communication is frequent. Example: Mesh TensorFlow
Pipeline (layer) parallelism: data is replicated, the model is partitioned; activations are communicated; computation is sequential. Example: GPipe
"Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis", Ben-Nun and Hoefler, ACM Computing Surveys, Article No. 65
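Synchronous data parallelism can be simulated in a few lines of NumPy: each worker computes the gradient on its shard of the batch, and averaging those local gradients (what an allreduce does collectively, e.g. in Horovod) recovers exactly the full-batch gradient. This is a sketch, not a distributed implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_workers = 4
X = rng.standard_normal((32, 8))   # one global batch of 32 samples
y = rng.standard_normal((32, 1))
w = rng.standard_normal((8, 1))

def grad(Xb, yb, w):
    # gradient of the mean squared error (1/B) * ||Xb w - yb||^2
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(Xb)

# each worker holds one shard of the batch (data parallelism)
shards = np.split(np.arange(32), n_workers)
local_grads = [grad(X[idx], y[idx], w) for idx in shards]

# "allreduce": average the local gradients across workers
g_avg = sum(local_grads) / n_workers

# equals the gradient computed on the full batch
print(np.allclose(g_avg, grad(X, y, w)))  # True
```

This identity is why data parallelism preserves the mathematics of SGD while multiplying the effective batch size by the number of workers.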
Communication and synchronization in data parallelism

Parameter server: synchronous / asynchronous
Collective communication: synchronous / asynchronous
Problems of large-scale data-parallel distributed training

Batch size = batch size per GPU × number of GPUs

(figure: loss landscape with full batch, large mini-batch, and small mini-batch)

Why does generalization degrade as the batch size grows?
The batch size grows in proportion to the number of GPUs:
small batches waste updates (noise dominates), large batches waste data (curvature dominates)
A comprehensive study of the large-batch problem

Optimizers designed for large batches?
LARS
LAMB (Adam + LARS)

w \leftarrow w - \eta \frac{\|w\|}{\|\nabla L\|} \nabla L
Even at a batch size of 32k:
Nesterov matches the performance of LARS
Adam matches the performance of LAMB
In the end, it comes down to hyperparameter tuning
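The LARS update above can be sketched as follows (a minimal version omitting momentum and weight decay):

```python
import numpy as np

def lars_update(w, grad, eta):
    # LARS scales the step by the layer-wise trust ratio ||w|| / ||grad||,
    # so the step length is eta * ||w|| regardless of the gradient magnitude
    trust = np.linalg.norm(w) / np.linalg.norm(grad)
    return w - eta * trust * grad

w = np.array([3.0, 4.0])   # ||w|| = 5
g = np.array([0.0, 2.0])   # ||g|| = 2
w_new = lars_update(w, g, eta=0.1)
print(w_new)  # [3.  3.5]
```

Because each layer's step is normalized by its own weight and gradient norms, layers with small weights are not blown away by large gradients at huge batch sizes.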
Tokyo Tech HPC lecture
https://github.com/rioyokotalab/hpc_lecture_2021
Image classification problem
Two-layer fully connected NN
D_in=3
H=5
D_out=2
Data
batch_size(BS)=2
x(BS,D_in)
w1(D_in,H) w2(H,D_out)
y_p(BS,D_out)
h_r=f(x*w1)
y=f(x)
ReLU (Rectified Linear Unit)
y_p=h_r*w2
Back propagation

L = \frac{1}{N_O} \sum (y_p - y)^2

\frac{\partial L}{\partial w_2} = \frac{\partial L}{\partial y_p} \frac{\partial y_p}{\partial w_2} = \frac{1}{N_O} 2 (y_p - y) h_r

\frac{\partial L}{\partial w_1} = \frac{\partial L}{\partial y_p} \frac{\partial y_p}{\partial h_r} \frac{\partial h_r}{\partial w_1} = \frac{1}{N_O} 2 (y_p - y) w_2 x \quad (h_r > 0)

w_1 \leftarrow w_1 - \eta \frac{\partial L}{\partial w_1}

w_2 \leftarrow w_2 - \eta \frac{\partial L}{\partial w_2}
Implementation in pure NumPy

import numpy as np

epochs = 300
batch_size = 32
D_in = 784
H = 100
D_out = 10
learning_rate = 1.0e-06

# create random input and output data
x = np.random.randn(batch_size, D_in)
y = np.random.randn(batch_size, D_out)

# randomly initialize weights
w1 = np.random.randn(D_in, H)
w2 = np.random.randn(H, D_out)

for epoch in range(epochs):
    # forward pass
    h = x.dot(w1)           # h = x * w1
    h_r = np.maximum(h, 0)  # h_r = ReLU(h)
    y_p = h_r.dot(w2)       # y_p = h_r * w2
    # compute mean squared error and print loss
    loss = np.square(y_p - y).sum()
    print(epoch, loss)
    # backward pass: compute gradients of loss with respect to w2
    grad_y_p = 2.0 * (y_p - y)
    grad_w2 = h_r.T.dot(grad_y_p)
    # backward pass: compute gradients of loss with respect to w1
    grad_h_r = grad_y_p.dot(w2.T)
    grad_h = grad_h_r.copy()
    grad_h[h < 0] = 0
    grad_w1 = x.T.dot(grad_h)
    # update weights
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
00_numpy.py
Introducing PyTorch

import torch

epochs = 300
batch_size = 32
D_in = 784
H = 100
D_out = 10
learning_rate = 1.0e-06

# create random input and output data
x = torch.randn(batch_size, D_in)
y = torch.randn(batch_size, D_out)

# randomly initialize weights
w1 = torch.randn(D_in, H)
w2 = torch.randn(H, D_out)

for epoch in range(epochs):
    # forward pass: compute predicted y
    h = x.mm(w1)
    h_r = h.clamp(min=0)
    y_p = h_r.mm(w2)

    # compute and print loss
    loss = (y_p - y).pow(2).sum().item()
    print(epoch, loss)

    # backward pass: compute gradients of loss with respect to w2
    grad_y_p = 2.0 * (y_p - y)
    grad_w2 = h_r.t().mm(grad_y_p)

    # backward pass: compute gradients of loss with respect to w1
    grad_h_r = grad_y_p.mm(w2.t())
    grad_h = grad_h_r.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)

    # update weights
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2

NumPy → PyTorch:
np.random → torch
np → torch
x.dot(w1) → x.mm(w1)
np.maximum(h, 0) → h.clamp(min=0)
np.square(y_p-y) → (y_p-y).pow(2)
copy() → clone()
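The correspondences in the table above can be checked directly; the following is a small sketch (assuming both NumPy and PyTorch are installed; the values are illustrative):

```python
import numpy as np
import torch

a = np.array([[1.0, -2.0], [3.0, -4.0]])
t = torch.tensor(a)

# np.maximum(h, 0) corresponds to h.clamp(min=0)
assert np.allclose(np.maximum(a, 0), t.clamp(min=0).numpy())

# np.square(d) corresponds to d.pow(2)
assert np.allclose(np.square(a), t.pow(2).numpy())

# x.dot(w) corresponds to x.mm(w) for 2-D tensors
w = np.eye(2)
assert np.allclose(a.dot(w), t.mm(torch.tensor(w)).numpy())
```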
01_tensors.py
Introducing automatic differentiation

01_tensors.py:

# randomly initialize weights
w1 = torch.randn(D_in, H)
w2 = torch.randn(H, D_out)

for epoch in range(epochs):
    # forward pass: compute predicted y
    h = x.mm(w1)
    h_r = h.clamp(min=0)
    y_p = h_r.mm(w2)

    # compute and print loss
    loss = (y_p - y).pow(2).sum().item()
    print(epoch, loss)

    # backward pass: compute gradients of loss with respect to w2
    grad_y_p = 2.0 * (y_p - y)
    grad_w2 = h_r.t().mm(grad_y_p)

    # backward pass: compute gradients of loss with respect to w1
    grad_h_r = grad_y_p.mm(w2.t())
    grad_h = grad_h_r.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)

    # update weights
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2

02_autograd.py:

# randomly initialize weights
w1 = torch.randn(D_in, H, requires_grad=True)
w2 = torch.randn(H, D_out, requires_grad=True)

for epoch in range(epochs):
    # forward pass: compute predicted y
    h = x.mm(w1)
    h_r = h.clamp(min=0)
    y_p = h_r.mm(w2)

    # compute and print loss
    loss = (y_p - y).pow(2).sum()
    print(epoch, loss.item())

    # backward pass
    loss.backward()

    with torch.no_grad():
        # update weights
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad
        # zero the gradients
        w1.grad.zero_()
        w2.grad.zero_()

\frac{\partial L}{\partial w_1} = \frac{\partial L}{\partial y_p} \frac{\partial y_p}{\partial h_r} \frac{\partial h_r}{\partial w_1} = \frac{1}{N_O} 2 (y_p - y)\, w_2 x

Autograd computes these gradients for us automatically.
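What loss.backward() does can be seen on a single scalar parameter; this is a minimal sketch (the values are hypothetical, not from the slides):

```python
import torch

# w is the parameter we want the gradient for
w = torch.tensor(2.0, requires_grad=True)
x = torch.tensor(3.0)

loss = (w * x - 1.0).pow(2)   # loss = (2*3 - 1)^2 = 25
loss.backward()               # autograd fills w.grad with d(loss)/dw

# chain rule by hand: d(loss)/dw = 2 * (w*x - 1) * x = 2 * 5 * 3 = 30
print(w.grad)                 # tensor(30.)
```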
Writing your own activation function (03_function.py)

02_autograd.py:

import torch
...

for epoch in range(epochs):
    # forward pass: compute predicted y
    h = x.mm(w1)
    h_r = h.clamp(min=0)
    y_p = h_r.mm(w2)
    ...

03_function.py:

import torch
...

class ReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        return input.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input < 0] = 0
        return grad_input

for epoch in range(epochs):
    # forward pass: compute predicted y
    relu = ReLU.apply
    h = x.mm(w1)
    h_r = relu(h)
    y_p = h_r.mm(w2)
    ...

y = f(x)
ReLU (Rectified Linear Unit)
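One way to convince yourself the custom Function is correct is to compare its gradient against the built-in torch.relu; a self-contained sketch (the class is restated here under a hypothetical name so the block runs on its own):

```python
import torch

class MyReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        return input.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input < 0] = 0
        return grad_input

x = torch.tensor([-1.0, 0.5, 2.0], requires_grad=True)
MyReLU.apply(x).sum().backward()
grad_custom = x.grad.clone()

x.grad.zero_()
torch.relu(x).sum().backward()

# both give gradient 0 for negative inputs and 1 for positive inputs
assert torch.equal(grad_custom, x.grad)
```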
Using torch.nn (04_nn_module.py)

02_autograd.py:

# create random input and output data
x = torch.randn(batch_size, D_in)
y = torch.randn(batch_size, D_out)

# randomly initialize weights
w1 = torch.randn(D_in, H, requires_grad=True)
w2 = torch.randn(H, D_out, requires_grad=True)

for epoch in range(epochs):
    # forward pass: compute predicted y
    h = x.mm(w1)
    h_r = h.clamp(min=0)
    y_p = h_r.mm(w2)

    # compute and print loss
    loss = (y_p - y).pow(2).sum()
    print(epoch, loss.item())

    # backward pass
    loss.backward()

    with torch.no_grad():
        # update weights
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad
        # zero the gradients
        w1.grad.zero_()
        w2.grad.zero_()

04_nn_module.py:

# create random input and output data
x = torch.randn(batch_size, D_in)
y = torch.randn(batch_size, D_out)

# define model
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

# define loss function
criterion = torch.nn.MSELoss(reduction='sum')

for epoch in range(epochs):
    # forward pass: compute predicted y
    y_p = model(x)

    # compute and print loss
    loss = criterion(y_p, y)
    print(epoch, loss.item())

    # backward pass
    model.zero_grad()
    loss.backward()

    with torch.no_grad():
        # update weights
        for param in model.parameters():
            param -= learning_rate * param.grad
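MSELoss(reduction='sum') reproduces the hand-written sum-of-squared-errors loss exactly; a quick sketch with illustrative values:

```python
import torch

criterion = torch.nn.MSELoss(reduction='sum')

y_p = torch.tensor([1.0, 2.0, 3.0])
y = torch.tensor([0.0, 0.0, 0.0])

# sum of squared errors: 1 + 4 + 9 = 14
manual = (y_p - y).pow(2).sum()
assert criterion(y_p, y).item() == manual.item() == 14.0
```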
Calling an optimizer (05_optimizer.py)

04_nn_module.py:

# define loss function
criterion = torch.nn.MSELoss(reduction='sum')

for epoch in range(epochs):
    # forward pass: compute predicted y
    y_p = model(x)

    # compute and print loss
    loss = criterion(y_p, y)
    print(epoch, loss.item())

    # backward pass
    model.zero_grad()
    loss.backward()

    with torch.no_grad():
        # update weights
        for param in model.parameters():
            param -= learning_rate * param.grad

05_optimizer.py:

# define loss function
criterion = torch.nn.MSELoss(reduction='sum')

# define optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

for epoch in range(epochs):
    # forward pass: compute predicted y
    y_p = model(x)

    # compute and print loss
    loss = criterion(y_p, y)
    print(epoch, loss.item())

    # backward pass
    optimizer.zero_grad()
    loss.backward()

    # update weights
    optimizer.step()
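optimizer.step() performs the same update the manual loop did by hand; a minimal single-step check on a hypothetical one-parameter "model":

```python
import torch

w = torch.tensor([1.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

loss = (2.0 * w).sum()   # d(loss)/dw = 2
optimizer.zero_grad()
loss.backward()
optimizer.step()         # w <- w - lr * grad = 1 - 0.1 * 2 = 0.8

assert torch.allclose(w, torch.tensor([0.8]))
```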
Defining your own model (06_mm_module.py)

05_optimizer.py:

# create random input and output data
x = torch.randn(batch_size, D_in)
y = torch.randn(batch_size, D_out)

# define model
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

# define loss function
criterion = torch.nn.MSELoss(reduction='sum')

06_mm_module.py:

import torch.nn as nn
import torch.nn.functional as F

class TwoLayerNet(nn.Module):
    def __init__(self, D_in, H, D_out):
        super(TwoLayerNet, self).__init__()
        self.fc1 = nn.Linear(D_in, H)
        self.fc2 = nn.Linear(H, D_out)

    def forward(self, x):
        h = self.fc1(x)
        h_r = F.relu(h)
        y_p = self.fc2(h_r)
        return y_p

# create random input and output data
x = torch.randn(batch_size, D_in)
y = torch.randn(batch_size, D_out)

# define model
model = TwoLayerNet(D_in, H, D_out)

# define loss function
criterion = nn.MSELoss(reduction='sum')
...

The model structure stays fixed during training.
Loading the MNIST dataset (07_mnist.py)

06_mm_module.py:

import torch.nn as nn
import torch.nn.functional as F

# create random input and output data
x = torch.randn(batch_size, D_in)
y = torch.randn(batch_size, D_out)

for epoch in range(epochs):
    # forward pass: compute predicted y
    y_p = model(x)
    ...

07_mnist.py:

import torch.nn as nn
import torch.nn.functional as F
from torchvision import datasets, transforms

# read input data and labels
train_dataset = datasets.MNIST('./data',
                               train=True,
                               download=True,
                               transform=transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)

for epoch in range(epochs):
    # set model to training mode
    model.train()
    # loop over each batch from the training set
    for batch_idx, (x, y) in enumerate(train_loader):
        # forward pass: compute predicted y
        y_p = model(x)
        ...
Evaluating on validation data (08_validate.py)

def validate():
    model.eval()
    val_loss, val_acc = 0, 0
    for data, target in val_loader:
        output = model(data)
        loss = criterion(output, target)
        val_loss += loss.item()          # loss on the validation data
        pred = output.data.max(1)[1]     # does the predicted class match the label?
        # convert to a percentage; sum() is slow on the GPU, so compute it on the CPU
        val_acc += 100. * pred.eq(target.data).cpu().sum() / target.size(0)
    val_loss /= len(val_loader)
    val_acc /= len(val_loader)
    print('\nValidation set: Average loss: {:.4f}, Accuracy: {:.1f}%\n'.format(
        val_loss, val_acc))

Training data: used to fit the model.
Validation data: used when trying out different hyperparameters or models.
Test data: used only for the final accuracy evaluation.
Structuring the code as train() and main() functions (09_train.py)

def train(train_loader, model, criterion, optimizer, epoch):
    model.train()
    t = time.perf_counter()
    for batch_idx, (data, target) in enumerate(train_loader):
        output = model(data)
        loss = criterion(output, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if batch_idx % 200 == 0:
            print('Train Epoch: {} [{:>5}/{} ({:.0%})]\tLoss: {:.6f}\tTime: {:.4f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                batch_idx / len(train_loader), loss.data.item(),
                time.perf_counter() - t))
            t = time.perf_counter()

def main():
    epochs = 10
    batch_size = 32
    learning_rate = 1.0e-02

    train_dataset = datasets.MNIST('./data',
                                   train=True,
                                   download=True,
                                   transform=transforms.ToTensor())
    val_dataset = datasets.MNIST('./data',
                                 train=False,
                                 transform=transforms.ToTensor())
    train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                               batch_size=batch_size,
                                               shuffle=True)
    val_loader = torch.utils.data.DataLoader(dataset=val_dataset,
                                             batch_size=batch_size,
                                             shuffle=False)

    model = TwoLayerNet(D_in, H, D_out)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

    for epoch in range(epochs):
        model.train()
        train(train_loader, model, criterion, optimizer, epoch)
        validate(val_loader, model, criterion)
A convolutional neural network model (10_cnn.py)

10_cnn.py:

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout2d(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output

09_train.py:

class TwoLayerNet(nn.Module):
    def __init__(self, D_in, H, D_out):
        super(TwoLayerNet, self).__init__()
        self.fc1 = nn.Linear(D_in, H)
        self.fc2 = nn.Linear(H, D_out)

    def forward(self, x):
        x = x.view(-1, D_in)
        h = self.fc1(x)
        h_r = F.relu(h)
        y_p = self.fc2(h_r)
        return F.log_softmax(y_p, dim=1)
Using the GPU (11_gpu.py)

device = torch.device('cuda')
model = CNN().to(device)

def train(train_loader, model, criterion, optimizer, epoch):
    model.train()
    t = time.perf_counter()
    for batch_idx, (data, target) in enumerate(train_loader):
        data = data.to(device)
        target = target.to(device)
        ...

def validate(loss_vector, accuracy_vector):
    model.eval()
    val_loss, correct = 0, 0
    for data, target in validation_loader:
        data = data.to(device)
        target = target.to(device)
        ...

PyTorch calls cuDNN behind the scenes.
1. Select the device with torch.device('cuda')
2. Send data and target to the device
3. All computation is then carried out on the GPU automatically
Distributed parallelism (12_distributed.py)

import os
import torch
import torch.distributed as dist

# host address and port number used for communication
master_addr = os.getenv("MASTER_ADDR", default="localhost")
master_port = os.getenv('MASTER_PORT', default='8888')
method = "tcp://{}:{}".format(master_addr, master_port)

# get rank and size from the Open MPI environment variables
rank = int(os.getenv('OMPI_COMM_WORLD_RANK', '0'))
world_size = int(os.getenv('OMPI_COMM_WORLD_SIZE', '1'))

# hand these to PyTorch
dist.init_process_group("nccl", init_method=method, rank=rank, world_size=world_size)
print('Rank: {}, Size: {}'.format(dist.get_rank(), dist.get_world_size()))

ngpus = 4
device = rank % ngpus
x = torch.randn(1).to(device)
print('rank {}: {}'.format(rank, x))

# collective communication through PyTorch
dist.broadcast(x, src=0)
print('rank {}: {}'.format(rank, x))

Add the following to .bashrc:

if [ -f "$SGE_JOB_SPOOL_DIR/pe_hostfile" ]; then
    export MASTER_ADDR=`head -n 1 $SGE_JOB_SPOOL_DIR/pe_hostfile | cut -d " " -f 1`
fi

mpirun -np 4 python 12_distributed.py
Distributed data-parallel MNIST (13_ddp.py)

# output from every process is hard to read, so define a print function
# that prints from only one process
def print0(message):
    if torch.distributed.is_initialized():
        if torch.distributed.get_rank() == 0:
            print(message, flush=True)
    else:
        print(message, flush=True)

# make different processes read different parts of the training data
train_sampler = torch.utils.data.distributed.DistributedSampler(
    train_dataset,
    num_replicas=torch.distributed.get_world_size(),
    rank=torch.distributed.get_rank())

# wrapping the model in DDP() enables distributed data-parallel training
model = DDP(model, device_ids=[rank])
...
Argparse (14_args.py)

import argparse
import torch
import torch.distributed as dist
import torch.nn as nn

parser = argparse.ArgumentParser(description='PyTorch MNIST Example')
parser.add_argument('--batch-size', type=int, default=32, metavar='N',
                    help='input batch size for training (default: 32)')
parser.add_argument('--epochs', type=int, default=10, metavar='N',
                    help='number of epochs to train (default: 10)')
parser.add_argument('--lr', type=float, default=1.0e-02, metavar='LR',
                    help='learning rate (default: 1.0e-02)')
args = parser.parse_args()

epochs = args.epochs
batch_size = args.batch_size
learning_rate = args.lr * world_size

Values that used to be hard-coded can now be supplied through the args variables.
https://docs.python.org/ja/3/library/argparse.html#action
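parse_args also accepts an explicit argument list, which makes a parser like the one above easy to try out without a real command line; a sketch using the same flags:

```python
import argparse

parser = argparse.ArgumentParser(description='PyTorch MNIST Example')
parser.add_argument('--batch-size', type=int, default=32, metavar='N')
parser.add_argument('--epochs', type=int, default=10, metavar='N')
parser.add_argument('--lr', type=float, default=1.0e-02, metavar='LR')

# simulate "python 14_args.py --batch-size 64 --lr 0.1"
args = parser.parse_args(['--batch-size', '64', '--lr', '0.1'])
assert args.batch_size == 64   # overridden on the command line
assert args.epochs == 10       # falls back to the default
assert args.lr == 0.1
```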
AverageMeter (15_meter.py)

def train(train_loader, model, criterion, optimizer, epoch, device):
    batch_time = AverageMeter('Time', ':.4f')
    train_loss = AverageMeter('Loss', ':.6f')

class AverageMeter(object):
    def __init__(self, name, fmt=':f'):
        self.name = name
        self.fmt = fmt        # output format
        self.reset()

    def reset(self):
        self.val = 0          # value
        self.avg = 0          # average
        self.sum = 0          # sum
        self.count = 0        # count

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n   # n > 1 covers the case where val is already an average over n items
        self.count += n
        self.avg = self.sum / self.count

    def __str__(self):
        fmtstr = '{name} {val' + self.fmt + '} ({avg' + self.fmt + '})'
        return fmtstr.format(**self.__dict__)
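A usage sketch of AverageMeter (the class is restated compactly so the block is self-contained; the update values are illustrative):

```python
class AverageMeter(object):
    """Tracks the current value, running sum, count, and average."""
    def __init__(self, name, fmt=':f'):
        self.name = name
        self.fmt = fmt
        self.reset()

    def reset(self):
        self.val = self.avg = self.sum = self.count = 0

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count

m = AverageMeter('Loss', ':.4f')
m.update(2.0)        # one sample with loss 2.0
m.update(4.0, n=3)   # a batch of 3 samples whose mean loss is 4.0
assert m.count == 4
assert m.avg == 3.5  # (2.0 + 4.0*3) / 4
```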
ProgressMeter (15_meter.py)

def train(train_loader, model, criterion, optimizer, epoch, device):
    batch_time = AverageMeter('Time', ':.4f')
    train_loss = AverageMeter('Loss', ':.6f')
    progress = ProgressMeter(
        len(train_loader),
        [train_loss, batch_time],            # the meters we want to print
        prefix="Epoch: [{}]".format(epoch))

class ProgressMeter(object):
    def __init__(self, num_batches, meters, prefix="", postfix=""):
        self.batch_fmtstr = self._get_batch_fmtstr(num_batches)
        self.meters = meters
        self.prefix = prefix     # printed before the meters
        self.postfix = postfix   # printed after the meters

    def display(self, batch):
        entries = [self.prefix + self.batch_fmtstr.format(batch)]
        entries += [str(meter) for meter in self.meters]
        entries += self.postfix
        print0('\t'.join(entries))   # join everything we want to print

    def _get_batch_fmtstr(self, num_batches):
        # produce output like [ current batch / total batches ]
        num_digits = len(str(num_batches // 1))
        fmt = '{:' + str(num_digits) + 'd}'
        return '[' + fmt + '/' + fmt.format(num_batches) + ']'
Weights and Biases

pip install wandb
wandb login

import wandb

os.environ['MASTER_ADDR'] = 'localhost'
os.environ['MASTER_PORT'] = '8888'
rank = int(os.getenv('OMPI_COMM_WORLD_RANK', '0'))
world_size = int(os.getenv('OMPI_COMM_WORLD_SIZE', '1'))
dist.init_process_group("nccl", rank=rank, world_size=world_size)
device = torch.device('cuda', rank)

if torch.distributed.get_rank() == 0:
    wandb.init(project="example-project")   # initialize wandb
    wandb.config.update(args)               # passing args records the experiment settings automatically

epochs = args.epochs
batch_size = args.batch_size
learning_rate = args.lr * world_size

for epoch in range(epochs):
    model.train()
    # make train and validate return the loss and accuracy
    train_loss, train_acc = train(train_loader, model, criterion, optimizer, epoch, device)
    val_loss, val_acc = validate(val_loader, model, criterion, device)
    if torch.distributed.get_rank() == 0:
        wandb.log({                         # the variables we want wandb to record
            'train_loss': train_loss,
            'train_acc': train_acc,
            'val_loss': val_loss,
            'val_acc': val_acc
        })

16_wandb.py
CIFAR10 (17_cifar10.py)

# change the dataset
train_dataset = datasets.CIFAR10('./data',
                                 train=True,
                                 download=True,
                                 transform=transforms.ToTensor())
val_dataset = datasets.CIFAR10('./data',
                               train=False,
                               download=True,
                               transform=transforms.ToTensor())

# change the model
model = VGG('VGG19').to(device)
model = DDP(model, device_ids=[rank % 4])
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
Data augmentation (18_augmentation.py)

transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    # normalize the pixel values
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])

transform_val = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])
Regularization (19_regularization.py)

parser.add_argument('--momentum', type=float, default=0.9, metavar='M',
                    help='momentum (default: 0.9)')
parser.add_argument('--wd', '--weight_decay', type=float, default=5.0e-04, metavar='W',
                    help='weight decay (default: 5.0e-04)')

optimizer = torch.optim.SGD(model.parameters(), lr=args.lr,
                            momentum=args.momentum, weight_decay=args.wd)

L2 regularization adds a penalty on the weights to the loss:

\tilde{l} = l + \frac{\lambda}{2} \|\theta\|^2

\nabla \tilde{l} = \nabla l + \lambda \theta

\theta_{t+1} = \theta_t - \eta \nabla l(\theta_t) - \eta \lambda \theta_t

Momentum / L2 regularization
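The weight_decay argument of SGD implements the L2 term above inside the optimizer: the gradient becomes nabla l + lambda * theta before the update. A single-step sketch with hypothetical values:

```python
import torch

w = torch.tensor([1.0], requires_grad=True)
# lr = eta = 0.1, weight_decay = lambda = 0.5
optimizer = torch.optim.SGD([w], lr=0.1, weight_decay=0.5)

loss = (2.0 * w).sum()   # plain gradient = 2
loss.backward()
optimizer.step()
# effective gradient = 2 + 0.5*1 = 2.5, so w <- 1 - 0.1*2.5 = 0.75
assert torch.allclose(w, torch.tensor([0.75]))
```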
Sweep

sweep.yaml:

program: wrapper.py
method: grid
metric:
  goal: minimize
  name: val_loss
parameters:
  epochs:
    values: [100]
  batch_size:
    values: [32]
  learning_rate:
    values: [0.005, 0.01, 0.02, 0.05, 0.1]
  momentum:
    values: [0.85, 0.9, 0.95]
  weight_decay:
    values: [1.0e-4, 2.0e-4, 5.0e-4, 1.0e-3, 2.0e-3]

wandb sweep sweep.yaml
Models (19_regularization.py)

model = VGG('VGG19').to(device)           # the one we are using now
# model = ResNet18().to(device)
# model = PreActResNet18().to(device)
# model = GoogLeNet().to(device)
# model = DenseNet121().to(device)
# model = ResNeXt29_2x64d().to(device)
# model = MobileNet().to(device)
# model = MobileNetV2().to(device)
# model = DPN92().to(device)
# model = ShuffleNetG2().to(device)
# model = SENet18().to(device)
# model = ShuffleNetV2(1).to(device)
# model = EfficientNetB0().to(device)
# model = RegNetX_200MF().to(device)

Try the other models as well.
References

Learning PyTorch with Examples
https://pytorch.org/tutorials/beginner/pytorch_with_examples.html
PyTorch Examples (GitHub)
https://github.com/pytorch/examples
PyTorch Tutorial (GitHub)
https://github.com/yunjey/pytorch-tutorial
Understanding PyTorch with an example: a step-by-step tutorial, by Daniel Godoy
https://towardsdatascience.com/understanding-pytorch-with-an-example-a-step-by-step-tutorial-81fc5f8c4e8e
Practical Deep Learning for Coders, v3, by fast.ai
https://course.fast.ai
PyTorch, by Beeren Sahu
https://beerensahu.wordpress.com/2018/03/21/pytorch-tutorial-lesson-1-tensor/
Writing Distributed Applications with PyTorch, by Séb Arnold
https://pytorch.org/tutorials/intermediate/dist_tuto.html

 
Training ImageNet-1k ResNet50 in 15min pfn
Training ImageNet-1k ResNet50 in 15min pfnTraining ImageNet-1k ResNet50 in 15min pfn
Training ImageNet-1k ResNet50 in 15min pfn
 
Resume
Resume Resume
Resume
 
Resume
ResumeResume
Resume
 
Introduction to computing Processing and performance.pdf
Introduction to computing Processing and performance.pdfIntroduction to computing Processing and performance.pdf
Introduction to computing Processing and performance.pdf
 
Ceh v8 labs module 02 footprinting and reconnaissance
Ceh v8 labs module 02 footprinting and reconnaissanceCeh v8 labs module 02 footprinting and reconnaissance
Ceh v8 labs module 02 footprinting and reconnaissance
 
Lecture-1-2-+(1).pdf
Lecture-1-2-+(1).pdfLecture-1-2-+(1).pdf
Lecture-1-2-+(1).pdf
 
Lecture-1-2-+(1).pdf
Lecture-1-2-+(1).pdfLecture-1-2-+(1).pdf
Lecture-1-2-+(1).pdf
 
PREDICTING STOCK PRICE MOVEMENTS BASED ON NEWSPAPER ARTICLES USING A NOVEL DE...
PREDICTING STOCK PRICE MOVEMENTS BASED ON NEWSPAPER ARTICLES USING A NOVEL DE...PREDICTING STOCK PRICE MOVEMENTS BASED ON NEWSPAPER ARTICLES USING A NOVEL DE...
PREDICTING STOCK PRICE MOVEMENTS BASED ON NEWSPAPER ARTICLES USING A NOVEL DE...
 
Session 1 - The Current Landscape of Big Data Benchmarks
Session 1 - The Current Landscape of Big Data BenchmarksSession 1 - The Current Landscape of Big Data Benchmarks
Session 1 - The Current Landscape of Big Data Benchmarks
 
Real Time Sign Language Translation Using Tensor Flow Object Detection
Real Time Sign Language Translation Using Tensor Flow Object DetectionReal Time Sign Language Translation Using Tensor Flow Object Detection
Real Time Sign Language Translation Using Tensor Flow Object Detection
 
Resume
ResumeResume
Resume
 
Best Data Science Online Training in Hyderabad
  Best Data Science Online Training in Hyderabad  Best Data Science Online Training in Hyderabad
Best Data Science Online Training in Hyderabad
 
JobTech at PyCon 2018
JobTech at PyCon 2018JobTech at PyCon 2018
JobTech at PyCon 2018
 
Machine Learning for Everyone
Machine Learning for EveryoneMachine Learning for Everyone
Machine Learning for Everyone
 
IRJET- Recognition of Handwritten Characters based on Deep Learning with Tens...
IRJET- Recognition of Handwritten Characters based on Deep Learning with Tens...IRJET- Recognition of Handwritten Characters based on Deep Learning with Tens...
IRJET- Recognition of Handwritten Characters based on Deep Learning with Tens...
 

More from RCCSRENKEI

第15回 配信講義 計算科学技術特論B(2022)
第15回 配信講義 計算科学技術特論B(2022)第15回 配信講義 計算科学技術特論B(2022)
第15回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第14回 配信講義 計算科学技術特論B(2022)
第14回 配信講義 計算科学技術特論B(2022)第14回 配信講義 計算科学技術特論B(2022)
第14回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第12回 配信講義 計算科学技術特論B(2022)
第12回 配信講義 計算科学技術特論B(2022)第12回 配信講義 計算科学技術特論B(2022)
第12回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第13回 配信講義 計算科学技術特論B(2022)
第13回 配信講義 計算科学技術特論B(2022)第13回 配信講義 計算科学技術特論B(2022)
第13回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第11回 配信講義 計算科学技術特論B(2022)
第11回 配信講義 計算科学技術特論B(2022)第11回 配信講義 計算科学技術特論B(2022)
第11回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第10回 配信講義 計算科学技術特論B(2022)
第10回 配信講義 計算科学技術特論B(2022)第10回 配信講義 計算科学技術特論B(2022)
第10回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第9回 配信講義 計算科学技術特論B(2022)
 第9回 配信講義 計算科学技術特論B(2022) 第9回 配信講義 計算科学技術特論B(2022)
第9回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第8回 配信講義 計算科学技術特論B(2022)
第8回 配信講義 計算科学技術特論B(2022)第8回 配信講義 計算科学技術特論B(2022)
第8回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第7回 配信講義 計算科学技術特論B(2022)
第7回 配信講義 計算科学技術特論B(2022)第7回 配信講義 計算科学技術特論B(2022)
第7回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第6回 配信講義 計算科学技術特論B(2022)
第6回 配信講義 計算科学技術特論B(2022)第6回 配信講義 計算科学技術特論B(2022)
第6回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第5回 配信講義 計算科学技術特論B(2022)
第5回 配信講義 計算科学技術特論B(2022)第5回 配信講義 計算科学技術特論B(2022)
第5回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
Realization of Innovative Light Energy Conversion Materials utilizing the Sup...
Realization of Innovative Light Energy Conversion Materials utilizing the Sup...Realization of Innovative Light Energy Conversion Materials utilizing the Sup...
Realization of Innovative Light Energy Conversion Materials utilizing the Sup...RCCSRENKEI
 
Current status of the project "Toward a unified view of the universe: from la...
Current status of the project "Toward a unified view of the universe: from la...Current status of the project "Toward a unified view of the universe: from la...
Current status of the project "Toward a unified view of the universe: from la...RCCSRENKEI
 
Fugaku, the Successes and the Lessons Learned
Fugaku, the Successes and the Lessons LearnedFugaku, the Successes and the Lessons Learned
Fugaku, the Successes and the Lessons LearnedRCCSRENKEI
 
第4回 配信講義 計算科学技術特論B(2022)
第4回 配信講義 計算科学技術特論B(2022)第4回 配信講義 計算科学技術特論B(2022)
第4回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第3回 配信講義 計算科学技術特論B(2022)
第3回 配信講義 計算科学技術特論B(2022)第3回 配信講義 計算科学技術特論B(2022)
第3回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第2回 配信講義 計算科学技術特論B(2022)
第2回 配信講義 計算科学技術特論B(2022)第2回 配信講義 計算科学技術特論B(2022)
第2回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第1回 配信講義 計算科学技術特論B(2022)
第1回 配信講義 計算科学技術特論B(2022)第1回 配信講義 計算科学技術特論B(2022)
第1回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
210603 yamamoto
210603 yamamoto210603 yamamoto
210603 yamamotoRCCSRENKEI
 
第12回 配信講義 計算科学技術特論A(2021)
第12回 配信講義 計算科学技術特論A(2021)第12回 配信講義 計算科学技術特論A(2021)
第12回 配信講義 計算科学技術特論A(2021)RCCSRENKEI
 

More from RCCSRENKEI (20)

第15回 配信講義 計算科学技術特論B(2022)
第15回 配信講義 計算科学技術特論B(2022)第15回 配信講義 計算科学技術特論B(2022)
第15回 配信講義 計算科学技術特論B(2022)
 
第14回 配信講義 計算科学技術特論B(2022)
第14回 配信講義 計算科学技術特論B(2022)第14回 配信講義 計算科学技術特論B(2022)
第14回 配信講義 計算科学技術特論B(2022)
 
第12回 配信講義 計算科学技術特論B(2022)
第12回 配信講義 計算科学技術特論B(2022)第12回 配信講義 計算科学技術特論B(2022)
第12回 配信講義 計算科学技術特論B(2022)
 
第13回 配信講義 計算科学技術特論B(2022)
第13回 配信講義 計算科学技術特論B(2022)第13回 配信講義 計算科学技術特論B(2022)
第13回 配信講義 計算科学技術特論B(2022)
 
第11回 配信講義 計算科学技術特論B(2022)
第11回 配信講義 計算科学技術特論B(2022)第11回 配信講義 計算科学技術特論B(2022)
第11回 配信講義 計算科学技術特論B(2022)
 
第10回 配信講義 計算科学技術特論B(2022)
第10回 配信講義 計算科学技術特論B(2022)第10回 配信講義 計算科学技術特論B(2022)
第10回 配信講義 計算科学技術特論B(2022)
 
第9回 配信講義 計算科学技術特論B(2022)
 第9回 配信講義 計算科学技術特論B(2022) 第9回 配信講義 計算科学技術特論B(2022)
第9回 配信講義 計算科学技術特論B(2022)
 
第8回 配信講義 計算科学技術特論B(2022)
第8回 配信講義 計算科学技術特論B(2022)第8回 配信講義 計算科学技術特論B(2022)
第8回 配信講義 計算科学技術特論B(2022)
 
第7回 配信講義 計算科学技術特論B(2022)
第7回 配信講義 計算科学技術特論B(2022)第7回 配信講義 計算科学技術特論B(2022)
第7回 配信講義 計算科学技術特論B(2022)
 
第6回 配信講義 計算科学技術特論B(2022)
第6回 配信講義 計算科学技術特論B(2022)第6回 配信講義 計算科学技術特論B(2022)
第6回 配信講義 計算科学技術特論B(2022)
 
第5回 配信講義 計算科学技術特論B(2022)
第5回 配信講義 計算科学技術特論B(2022)第5回 配信講義 計算科学技術特論B(2022)
第5回 配信講義 計算科学技術特論B(2022)
 
Realization of Innovative Light Energy Conversion Materials utilizing the Sup...
Realization of Innovative Light Energy Conversion Materials utilizing the Sup...Realization of Innovative Light Energy Conversion Materials utilizing the Sup...
Realization of Innovative Light Energy Conversion Materials utilizing the Sup...
 
Current status of the project "Toward a unified view of the universe: from la...
Current status of the project "Toward a unified view of the universe: from la...Current status of the project "Toward a unified view of the universe: from la...
Current status of the project "Toward a unified view of the universe: from la...
 
Fugaku, the Successes and the Lessons Learned
Fugaku, the Successes and the Lessons LearnedFugaku, the Successes and the Lessons Learned
Fugaku, the Successes and the Lessons Learned
 
第4回 配信講義 計算科学技術特論B(2022)
第4回 配信講義 計算科学技術特論B(2022)第4回 配信講義 計算科学技術特論B(2022)
第4回 配信講義 計算科学技術特論B(2022)
 
第3回 配信講義 計算科学技術特論B(2022)
第3回 配信講義 計算科学技術特論B(2022)第3回 配信講義 計算科学技術特論B(2022)
第3回 配信講義 計算科学技術特論B(2022)
 
第2回 配信講義 計算科学技術特論B(2022)
第2回 配信講義 計算科学技術特論B(2022)第2回 配信講義 計算科学技術特論B(2022)
第2回 配信講義 計算科学技術特論B(2022)
 
第1回 配信講義 計算科学技術特論B(2022)
第1回 配信講義 計算科学技術特論B(2022)第1回 配信講義 計算科学技術特論B(2022)
第1回 配信講義 計算科学技術特論B(2022)
 
210603 yamamoto
210603 yamamoto210603 yamamoto
210603 yamamoto
 
第12回 配信講義 計算科学技術特論A(2021)
第12回 配信講義 計算科学技術特論A(2021)第12回 配信講義 計算科学技術特論A(2021)
第12回 配信講義 計算科学技術特論A(2021)
 

Recently uploaded

Cultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxCultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxpradhanghanshyam7136
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PPRINCE C P
 
Work, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE PhysicsWork, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE Physicsvishikhakeshava1
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 sciencefloriejanemacaya1
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PPRINCE C P
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoSérgio Sacani
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfSELF-EXPLANATORY
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdfNAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdfWadeK3
 

Recently uploaded (20)

Cultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxCultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptx
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C P
 
Work, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE PhysicsWork, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE Physics
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 science
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C P
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdfNAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
 

第14回 配信講義 計算科学技術特論A(2021)

  • 2. Deep learning that only a supercomputer can do. Reaching the ImageNet state of the art (90.45% top-1 accuracy) took roughly 10,000 TPUv3 core-days. [Figure: parameter counts of ResNet-50, DistilBERT, ELMo, BERT-Large, GPT-2, MegatronLM, Turing-NLG, GPT-3, and Switch Transformer, growing from 10^7 to 10^13 — a factor of 100,000.] Bring on the compute.
  • 3. Papers with Code (https://paperswithcode.com), ImageNet-1k leaderboard. The MLPerf target score is 75.9% top-1 accuracy, and many of the papers reporting the best results can be reproduced on local hardware.
  • 4. Evolution of the major deep neural network models (https://towardsdatascience.com/from-lenet-to-efficientnet-the-evolution-of-cnns-3a57eb34672f): 1995 LeNet-5 (convolution) and LSTM; 2012 AlexNet (ReLU, Dropout, GPU); 2015 ResNet (skip connections); 2017 MobileNet (1×1 convolutions) and Transformer (attention mechanism); 2019 EfficientNet (neural architecture search); 2021 Vision Transformer (image patches).
  • 5. Training a deep neural network by backpropagation. Example: the network outputs $p = (0.9, 0.1)$ for the classes (Labradoodle, Fried chicken) against the one-hot label $y = (1, 0)$.

Forward propagation:
$x = h_0$, $u_0 = W_0 h_0$, $h_1 = f_0(u_0)$, $u_1 = W_1 h_1$, $p = f_1(u_1)$

Cross-entropy loss ($y$ is the one-hot label, so only the correct class contributes):
$L = -\sum_{\mathrm{data}} \sum_{\mathrm{class}} y \log p = -\sum_{\mathrm{data}} \log p$

Backward propagation applies the chain rule from the output back toward the input:
$\dfrac{\partial L}{\partial W_1} = \sum_{\mathrm{data}} \dfrac{\partial L}{\partial p} \dfrac{\partial p}{\partial u_1} \dfrac{\partial u_1}{\partial W_1}$, $\quad \dfrac{\partial L}{\partial W_0} = \sum_{\mathrm{data}} \dfrac{\partial L}{\partial p} \dfrac{\partial p}{\partial u_1} \dfrac{\partial u_1}{\partial h_1} \dfrac{\partial h_1}{\partial u_0} \dfrac{\partial u_0}{\partial W_0}$

The derivatives of the arithmetic operations and elementary functions are defined inside the framework; chaining them lets the gradient be computed as matrix products, and multiplying from the back turns everything into matrix-vector products. This is done for each image, and the results are summed at the end. Stochastic gradient descent (SGD) then updates the weights:
$W_{t+1} = W_t - \eta \dfrac{\partial L}{\partial W_t}$
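The forward/backward pass above can be sketched directly in NumPy. This is a minimal illustration, not the lecture's code: the layer sizes, random data, and the choice of ReLU for $f_0$ and softmax for $f_1$ are assumptions made here for concreteness.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out, n_data = 4, 8, 3, 32
W0 = rng.normal(0.0, 0.1, (n_hid, n_in))
W1 = rng.normal(0.0, 0.1, (n_out, n_hid))
x = rng.normal(size=(n_in, n_data))                    # columns are samples
y = np.eye(n_out)[:, rng.integers(0, n_out, n_data)]   # one-hot labels

def softmax(u):
    e = np.exp(u - u.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

eta, losses = 0.05, []
for step in range(200):
    # forward propagation: h0 = x, u0 = W0 h0, h1 = f0(u0), u1 = W1 h1, p = f1(u1)
    h0 = x
    u0 = W0 @ h0
    h1 = np.maximum(u0, 0.0)                 # f0 = ReLU
    u1 = W1 @ h1
    p = softmax(u1)                          # f1 = softmax
    losses.append(-np.sum(y * np.log(p)))    # cross-entropy, summed over data

    # backward propagation: chain rule from the output back to the input
    du1 = p - y                              # dL/du1 for softmax + cross-entropy
    dW1 = du1 @ h1.T                         # dL/dW1
    dh1 = W1.T @ du1
    du0 = dh1 * (u0 > 0)                     # ReLU derivative
    dW0 = du0 @ h0.T                         # dL/dW0

    # SGD: W <- W - eta * dL/dW (gradient averaged over the batch)
    W1 -= eta * dW1 / n_data
    W0 -= eta * dW0 / n_data
```

Note that the per-sample gradients are accumulated by the matrix products `du1 @ h1.T` and `du0 @ h0.T`, which is exactly the "done per image and summed at the end" step of the slide.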
  • 6. Optimization methods https://losslandscape.com
Let θ denote the weights W and the biases b together. Note that the shape of the loss landscape changes with every mini-batch.
SGD: $\theta_{t+1} = \theta_t - \eta \nabla L(\theta_t)$
Momentum SGD (momentum term): $v_{t+1} = \mu v_t + \eta \nabla L(\theta_t)$, $\theta_{t+1} = \theta_t - v_{t+1}$. Written semi-implicit-Euler-style, to give the update physical consistency: $a_t = -\nabla L(\theta_t)$, $v_{t+1} = v_t + a_t \eta$, $\theta_{t+1} = \theta_t + v_{t+1}$
Nesterov momentum (momentum term): $v_{t+1} = \mu v_t + \eta \nabla L(\theta_t - \mu v_t)$, $\theta_{t+1} = \theta_t - v_{t+1}$
RMSProp (gradient-variance term): $v_{t+1} = \rho v_t + (1 - \rho) \nabla L(\theta_t)^2$, $m_{t+1} = \mu m_t + \frac{\eta}{\sqrt{v_{t+1} + \epsilon}} \nabla L(\theta_t)$, $\theta_{t+1} = \theta_t - m_{t+1}$
Adam (momentum and gradient-variance terms, normalization, and an initial bias-correction term $b_{t+1}$): $m_{t+1} = \beta_1 m_t + (1 - \beta_1) \nabla L(\theta_t)$, $v_{t+1} = \beta_2 v_t + (1 - \beta_2) \nabla L(\theta_t)^2$, $b_{t+1} = \frac{\sqrt{1 - \beta_2^{t+1}}}{1 - \beta_1^{t+1}}$, $\theta_{t+1} = \theta_t - \alpha \frac{m_{t+1}}{\sqrt{v_{t+1}} + \epsilon} b_{t+1}$
https://arxiv.org/pdf/2007.01547.pdf
  • 8. Evolution of the major deep neural network models https://towardsdatascience.com/from-lenet-to-efficientnet-the-evolution-of-cnns-3a57eb34672f
1995: LeNet-5 (convolution), LSTM / 2012: AlexNet (ReLU, Dropout, GPU) / 2015: ResNet (skip connections) / 2017: MobileNet (1x1 convolutions) / 2019: EfficientNet (neural architecture search) / 2021: Transformer (attention mechanism), Vision Transformer (image patches)
  • 9. Convolutional neural networks https://cs231n.github.io/convolutional-networks/
[Input/output tensor dimensions] N: batch size, C: number of channels, H: image height, W: image width
[Convolution parameters] F: filter size, P: padding width, S: stride
Example with 3 input channels and 2 output channels: [input] N: 1, Cin: 3, Hin: 5, Win: 5; [output] N: 1, Cout: 2, Hout: 3, Wout: 3; with F: 3, P: 1, S: 2
Implementation strategies (input image × filter → output image): GEMM, batched GEMM, Winograd, FFT
http://cs231n.stanford.edu/reports/2016/pdfs/117_Report.pdf
https://arxiv.org/abs/1410.0759
https://www.slideshare.net/nervanasys/an-analysis-of-convolution-for-inference
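The output size in the example follows from the standard formula $H_{out} = (H_{in} + 2P - F)/S + 1$. A minimal sketch checking it against the slide's numbers:

```python
# Convolution output-size formula: H_out = (H_in + 2P - F) / S + 1.
def conv_out_size(h_in, f, p, s):
    return (h_in + 2 * p - f) // s + 1

# The slide's example: 5x5 input, filter 3, padding 1, stride 2.
n, c_in, h_in, w_in = 1, 3, 5, 5
c_out, f, p, s = 2, 3, 1, 2
h_out = conv_out_size(h_in, f, p, s)
w_out = conv_out_size(w_in, f, p, s)
print((n, c_out, h_out, w_out))  # -> (1, 2, 3, 3)
```

The output channel count Cout is set by the number of filters, independent of the spatial formula.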
  • 10. Normalization https://theaisummer.com/normalization/
All variants share the form $\hat{x} = \gamma \left( \frac{x - \mu(x)}{\sigma(x)} \right) + \beta$ and differ in the axes over which the statistics are taken:
Batch normalization (BN): $\mu(x) = \frac{1}{NHW} \sum_{n=1}^{N} \sum_{h=1}^{H} \sum_{w=1}^{W} x_{nchw}$, $\sigma(x) = \sqrt{\frac{1}{NHW} \sum_{n=1}^{N} \sum_{h=1}^{H} \sum_{w=1}^{W} (x_{nchw} - \mu(x))^2}$
Layer normalization (LN): $\mu(x) = \frac{1}{CHW} \sum_{c=1}^{C} \sum_{h=1}^{H} \sum_{w=1}^{W} x_{nchw}$, with $\sigma(x)$ defined analogously
Group normalization (GN), shown here with G = 2 groups: $\mu(x) = \frac{2}{CHW} \sum_{c=1}^{C/2} \sum_{h=1}^{H} \sum_{w=1}^{W} x_{nchw}$, with $\sigma(x)$ defined analogously
Weight standardization (WS): $\hat{W} = \frac{W - \mu(W)}{\sigma(W)}$, with $\mu(W) = \frac{1}{C F_H F_W} \sum_{c=1}^{C} \sum_{f_H=1}^{F_H} \sum_{f_W=1}^{F_W} W_{c f_H f_W}$ and $\sigma(W)$ defined analogously
(Figure: accuracy comparison of the normalization variants; higher is better)
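Since the variants differ only in the reduction axes, they can all be sketched with one NumPy helper (a minimal illustration with assumed shapes, γ and β omitted; not the lecture's code):

```python
import numpy as np

# Normalize an (N, C, H, W) tensor over the given axes; gamma=1, beta=0.
def normalize(x, axes, eps=1e-5):
    mu = x.mean(axis=axes, keepdims=True)
    sigma = x.std(axis=axes, keepdims=True)
    return (x - mu) / (sigma + eps)

x = np.random.default_rng(0).standard_normal((4, 8, 5, 5))
bn = normalize(x, (0, 2, 3))                   # Batch norm: over N, H, W
ln = normalize(x, (1, 2, 3))                   # Layer norm: over C, H, W
g = x.reshape(4, 2, 4, 5, 5)                   # Group norm with G=2 groups
gn = normalize(g, (2, 3, 4)).reshape(x.shape)  # over C/G, H, W within each group
print(bn.mean(axis=(0, 2, 3)).round(6))        # per-channel means are ~0
```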
  • 11. Data augmentation https://github.com/xkumiyu/numpy-data-augmentation
Flipping, Rotation, Cutout, Random crop, Scale, Random Erasing, Mixup, CutMix, AugMix
AutoAugment: searches for the best augmentation policy with reinforcement learning
Fast AutoAugment: shortens the search with reinforcement learning + Bayesian optimization
Faster AutoAugment: shortens it further with gradient-based search
https://openreview.net/pdf?id=S1gmrxHFvB
(Figure: error-rate comparison; lower is better)
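Of the augmentations listed, Mixup has a particularly compact definition: a convex combination of two samples and their labels. A minimal sketch assuming one-hot labels (not from the lecture):

```python
import numpy as np

# Mixup: blend two images and their one-hot labels with a weight lam
# drawn from a Beta(alpha, alpha) distribution.
def mixup(x1, y1, x2, y2, alpha=0.2, rng=np.random.default_rng(0)):
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

x1, x2 = np.ones((3, 32, 32)), np.zeros((3, 32, 32))
y1, y2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
x_mix, y_mix = mixup(x1, y1, x2, y2)
print(y_mix.sum())  # the mixed label still sums to 1
```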
  • 12. Regularization
Loss function: $L = \sum_{data} -\log p$  https://arxiv.org/abs/2002.08709
L2 regularization: $L = \sum_{data} -\log p + \lambda |W|^2$
L1 regularization: $L = \sum_{data} -\log p + \lambda |W|$
Flooding: $L = \left| \sum_{data} -\log p - b \right| + b$
Sharpness Aware Minimization (SAM): $L = \sum_{data} \max_{\|\epsilon\| \le \rho} \left[ -\log p(W + \epsilon) \right]$  https://arxiv.org/abs/2010.01412
Dropout, Stochastic depth  https://arxiv.org/abs/1603.09382
  • 13. Distributed parallelism
Data parallelism: data is partitioned, the model is replicated, gradients are communicated; the effective batch size becomes huge (e.g. Horovod)
Tensor parallelism: data is replicated, the model is partitioned, activations are communicated; communication is frequent (e.g. Mesh TensorFlow)
Layer (pipeline) parallelism: data is replicated, the model is partitioned, activations are communicated; computation is sequential (e.g. GPipe)
"Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis", Ben-Nun and Hoefler, ACM Computing Surveys, Article No. 65
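The key property of data parallelism is that averaging per-worker gradients reproduces the full-batch gradient. A minimal single-process sketch (no real MPI/Horovod calls; the allreduce is simulated):

```python
import numpy as np

# Four "workers" each average the gradient over their own shard of the
# batch; an allreduce-style average then recovers the full-batch gradient,
# so every replica applies an identical update.
full_batch_grad = np.arange(8.0)                  # pretend per-sample gradients
shards = np.split(full_batch_grad, 4)             # 2 samples per worker
local_grads = [s.mean() for s in shards]          # per-worker local average
allreduced = sum(local_grads) / len(local_grads)  # average across workers
print(allreduced == full_batch_grad.mean())       # True
```

This identity only holds for equal shard sizes; uneven shards need a weighted average.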
  • 14. Communication and synchronization in data parallelism
Parameter server: synchronous / asynchronous
Collective communication: synchronous / asynchronous
"Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis", Ben-Nun and Hoefler, ACM Computing Surveys, Article No. 65
  • 16. Why does generalization performance degrade as the batch size grows?
(Figure: loss landscapes for full batch, large mini-batch, and small mini-batch)
  • 19. Optimizers for large batches?
LARS: $w \leftarrow w - \eta \frac{\|w\|}{\|\nabla L\|} \nabla L$
LAMB (Adam + LARS)
Even at a batch size of 32k, Nesterov matches LARS and Adam matches LAMB; in the end it comes down to hyperparameter tuning.
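The LARS update scales the learning rate per layer by the "trust ratio" ||w|| / ||∇L||. A minimal sketch of one step (the small eps guards against a zero gradient and is an added assumption, not part of the slide's formula):

```python
import numpy as np

# One LARS step: per-layer learning rate scaled by ||w|| / ||grad||.
def lars_step(w, grad, eta=0.1, eps=1e-9):
    trust = np.linalg.norm(w) / (np.linalg.norm(grad) + eps)
    return w - eta * trust * grad

w = np.array([3.0, 4.0])   # ||w|| = 5
g = np.array([0.0, 2.0])   # ||grad|| = 2
print(lars_step(w, g))     # step scaled by the trust ratio 5/2
```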
  • 22. Two-layer fully connected NN
D_in=3, H=5, D_out=2, batch_size (BS)=2
x(BS,D_in), w1(D_in,H), w2(H,D_out), y_p(BS,D_out)
Forward: h_r = f(x*w1) with f = ReLU (Rectified Linear Unit), then y_p = h_r*w2
Loss: $L = \frac{1}{N_O} \sum (y_p - y)^2$
Back propagation: $\frac{\partial L}{\partial w_2} = \frac{\partial L}{\partial y_p} \frac{\partial y_p}{\partial w_2} = \frac{1}{N_O} 2 (y_p - y) h_r$
sha1_base64="HuDWCNBZ3fiwLb58W34hfFqbE4Y=">AAACGHicbZDLSsNAFIYn9VbrLepSF8Ei1IUlKYJuhKIbF6IV7AWaGCaTSTt0cmFmIoSQjY/hE7jVJ3Anbt35AL6HkzYL2/rDwM9/zuGc+ZyIEi50/VspLSwuLa+UVytr6xubW+r2ToeHMUO4jUIasp4DOaYkwG1BBMW9iGHoOxR3ndFlXu8+YsZJGNyLJMKWDwcB8QiCQka2un99bnoMotTI0hv7NjN57NcSOzpOjh4atlrV6/pY2rwxClMFhVq2+mO6IYp9HAhEIed9Q4+ElUImCKI4q5gxxxFEIzjAfWkD6GNupeNfZNqhTFzNC5l8gdDG6d+JFPqcJ74jO30ohny2lof/1fqx8M6slARRLHCAJou8mGoi1HIkmksYRoIm0kDEiLxVQ0MooQgJbmqLy/PTMsnFmKUwbzqNuqHXjbuTavOiIFQGe+AA1IABTkETXIEWaAMEnsALeAVvyrPyrnwon5PWklLM7IIpKV+/xXGgTg==</latexit> <latexit sha1_base64="HuDWCNBZ3fiwLb58W34hfFqbE4Y=">AAACGHicbZDLSsNAFIYn9VbrLepSF8Ei1IUlKYJuhKIbF6IV7AWaGCaTSTt0cmFmIoSQjY/hE7jVJ3Anbt35AL6HkzYL2/rDwM9/zuGc+ZyIEi50/VspLSwuLa+UVytr6xubW+r2ToeHMUO4jUIasp4DOaYkwG1BBMW9iGHoOxR3ndFlXu8+YsZJGNyLJMKWDwcB8QiCQka2un99bnoMotTI0hv7NjN57NcSOzpOjh4atlrV6/pY2rwxClMFhVq2+mO6IYp9HAhEIed9Q4+ElUImCKI4q5gxxxFEIzjAfWkD6GNupeNfZNqhTFzNC5l8gdDG6d+JFPqcJ74jO30ohny2lof/1fqx8M6slARRLHCAJou8mGoi1HIkmksYRoIm0kDEiLxVQ0MooQgJbmqLy/PTMsnFmKUwbzqNuqHXjbuTavOiIFQGe+AA1IABTkETXIEWaAMEnsALeAVvyrPyrnwon5PWklLM7IIpKV+/xXGgTg==</latexit> @L @w1 = @L @yp @yp @hr @hr @w1 = 1 NO 2(yp y)w2x <latexit sha1_base64="V1OkoDW7pmfxcKKULrjvVELuV8s=">AAACmHicfVHdSsMwGE3r//ybeqc3wSHMC0dTBL0Rhl6oIDrFqbCNkmbpFkx/SFJnKX0TX8wH8D1Mt8JcFT8InJzzfTkfJ27EmVSW9WmYc/MLi0vLK5XVtfWNzerW9pMMY0Fom4Q8FC8ulpSzgLYVU5y+RIJi3+X02X29yPXnNyokC4NHlUS05+NBwDxGsNKUU/3oegKTtBthoRjm8Cab4pGDsrN/9MSJspKcU9Pb0BHlhpyadagUFihLb527zK7rN46Sw5FjvzvVmtWwxgV/A1SAGiiq5VS/uv2QxD4NFOFYyg6yItVLczfCaVbpxpJGmLziAe1oGGCfyl46TjGDB5rpQy8U+gQKjtmfEyn2pUx8V3f6WA1lWcvJv7ROrLzTXsqCKFY0IBMjL+ZQhTD/EthnghLFEw0wEUzvCskQ60yU/rgZl77MV8t0Lqicwm/wZDeQ1UD3x7XmeZHQMtgD+6AOEDgBTXAFWqANiGEadQMZtrlrNs1L83rSahrFzA6YKfPhG92OzlM=</latexit> <latexit 
sha1_base64="V1OkoDW7pmfxcKKULrjvVELuV8s=">AAACmHicfVHdSsMwGE3r//ybeqc3wSHMC0dTBL0Rhl6oIDrFqbCNkmbpFkx/SFJnKX0TX8wH8D1Mt8JcFT8InJzzfTkfJ27EmVSW9WmYc/MLi0vLK5XVtfWNzerW9pMMY0Fom4Q8FC8ulpSzgLYVU5y+RIJi3+X02X29yPXnNyokC4NHlUS05+NBwDxGsNKUU/3oegKTtBthoRjm8Cab4pGDsrN/9MSJspKcU9Pb0BHlhpyadagUFihLb527zK7rN46Sw5FjvzvVmtWwxgV/A1SAGiiq5VS/uv2QxD4NFOFYyg6yItVLczfCaVbpxpJGmLziAe1oGGCfyl46TjGDB5rpQy8U+gQKjtmfEyn2pUx8V3f6WA1lWcvJv7ROrLzTXsqCKFY0IBMjL+ZQhTD/EthnghLFEw0wEUzvCskQ60yU/rgZl77MV8t0Lqicwm/wZDeQ1UD3x7XmeZHQMtgD+6AOEDgBTXAFWqANiGEadQMZtrlrNs1L83rSahrFzA6YKfPhG92OzlM=</latexit> <latexit sha1_base64="V1OkoDW7pmfxcKKULrjvVELuV8s=">AAACmHicfVHdSsMwGE3r//ybeqc3wSHMC0dTBL0Rhl6oIDrFqbCNkmbpFkx/SFJnKX0TX8wH8D1Mt8JcFT8InJzzfTkfJ27EmVSW9WmYc/MLi0vLK5XVtfWNzerW9pMMY0Fom4Q8FC8ulpSzgLYVU5y+RIJi3+X02X29yPXnNyokC4NHlUS05+NBwDxGsNKUU/3oegKTtBthoRjm8Cab4pGDsrN/9MSJspKcU9Pb0BHlhpyadagUFihLb527zK7rN46Sw5FjvzvVmtWwxgV/A1SAGiiq5VS/uv2QxD4NFOFYyg6yItVLczfCaVbpxpJGmLziAe1oGGCfyl46TjGDB5rpQy8U+gQKjtmfEyn2pUx8V3f6WA1lWcvJv7ROrLzTXsqCKFY0IBMjL+ZQhTD/EthnghLFEw0wEUzvCskQ60yU/rgZl77MV8t0Lqicwm/wZDeQ1UD3x7XmeZHQMtgD+6AOEDgBTXAFWqANiGEadQMZtrlrNs1L83rSahrFzA6YKfPhG92OzlM=</latexit> <latexit sha1_base64="V1OkoDW7pmfxcKKULrjvVELuV8s=">AAACmHicfVHdSsMwGE3r//ybeqc3wSHMC0dTBL0Rhl6oIDrFqbCNkmbpFkx/SFJnKX0TX8wH8D1Mt8JcFT8InJzzfTkfJ27EmVSW9WmYc/MLi0vLK5XVtfWNzerW9pMMY0Fom4Q8FC8ulpSzgLYVU5y+RIJi3+X02X29yPXnNyokC4NHlUS05+NBwDxGsNKUU/3oegKTtBthoRjm8Cab4pGDsrN/9MSJspKcU9Pb0BHlhpyadagUFihLb527zK7rN46Sw5FjvzvVmtWwxgV/A1SAGiiq5VS/uv2QxD4NFOFYyg6yItVLczfCaVbpxpJGmLziAe1oGGCfyl46TjGDB5rpQy8U+gQKjtmfEyn2pUx8V3f6WA1lWcvJv7ROrLzTXsqCKFY0IBMjL+ZQhTD/EthnghLFEw0wEUzvCskQ60yU/rgZl77MV8t0Lqicwm/wZDeQ1UD3x7XmeZHQMtgD+6AOEDgBTXAFWqANiGEadQMZtrlrNs1L83rSahrFzA6YKfPhG92OzlM=</latexit> w1 w1 ⌘ @L @w1 <latexit 
sha1_base64="kN3sQo8OP8glKG68w4PsUtC/f3k=">AAACMXicbVDLSgNBEJyN73fUo5fBIHgx7IqiR9GLBw8RjArZEHonvcmQ2QczvYaw5Ev8DL/Aq35BbiJ48iecjQGfDQM1VdXTPRWkShpy3ZFTmpqemZ2bX1hcWl5ZXSuvb1ybJNMC6yJRib4NwKCSMdZJksLbVCNEgcKboHdW6Dd3qI1M4isapNiMoBPLUAogS7XKh/2W5ysMCbRO+tze9nwk8EMNIvdT0CRB8YvhF7aWYatccavuuPhf4E1AhU2q1iq/+e1EZBHGJBQY0/DclJp58aRQOFz0M4MpiB50sGFhDBGaZj7+3pDvWKbNw0TbExMfs987coiMGUSBdUZAXfNbK8j/tEZG4XEzl3GaEcbic1CYKU4JL7LibalRkBpYAEJLuysXXbDJkE30x5S2KVYrcvF+p/AXXO9XPbfqXR5UTk4nCc2zLbbNdpnHjtgJO2c1VmeC3bNH9sSenQdn5Lw4r5/WkjPp2WQ/ynn/AO/Rq1o=</latexit> <latexit sha1_base64="kN3sQo8OP8glKG68w4PsUtC/f3k=">AAACMXicbVDLSgNBEJyN73fUo5fBIHgx7IqiR9GLBw8RjArZEHonvcmQ2QczvYaw5Ev8DL/Aq35BbiJ48iecjQGfDQM1VdXTPRWkShpy3ZFTmpqemZ2bX1hcWl5ZXSuvb1ybJNMC6yJRib4NwKCSMdZJksLbVCNEgcKboHdW6Dd3qI1M4isapNiMoBPLUAogS7XKh/2W5ysMCbRO+tze9nwk8EMNIvdT0CRB8YvhF7aWYatccavuuPhf4E1AhU2q1iq/+e1EZBHGJBQY0/DclJp58aRQOFz0M4MpiB50sGFhDBGaZj7+3pDvWKbNw0TbExMfs987coiMGUSBdUZAXfNbK8j/tEZG4XEzl3GaEcbic1CYKU4JL7LibalRkBpYAEJLuysXXbDJkE30x5S2KVYrcvF+p/AXXO9XPbfqXR5UTk4nCc2zLbbNdpnHjtgJO2c1VmeC3bNH9sSenQdn5Lw4r5/WkjPp2WQ/ynn/AO/Rq1o=</latexit> <latexit sha1_base64="kN3sQo8OP8glKG68w4PsUtC/f3k=">AAACMXicbVDLSgNBEJyN73fUo5fBIHgx7IqiR9GLBw8RjArZEHonvcmQ2QczvYaw5Ev8DL/Aq35BbiJ48iecjQGfDQM1VdXTPRWkShpy3ZFTmpqemZ2bX1hcWl5ZXSuvb1ybJNMC6yJRib4NwKCSMdZJksLbVCNEgcKboHdW6Dd3qI1M4isapNiMoBPLUAogS7XKh/2W5ysMCbRO+tze9nwk8EMNIvdT0CRB8YvhF7aWYatccavuuPhf4E1AhU2q1iq/+e1EZBHGJBQY0/DclJp58aRQOFz0M4MpiB50sGFhDBGaZj7+3pDvWKbNw0TbExMfs987coiMGUSBdUZAXfNbK8j/tEZG4XEzl3GaEcbic1CYKU4JL7LibalRkBpYAEJLuysXXbDJkE30x5S2KVYrcvF+p/AXXO9XPbfqXR5UTk4nCc2zLbbNdpnHjtgJO2c1VmeC3bNH9sSenQdn5Lw4r5/WkjPp2WQ/ynn/AO/Rq1o=</latexit> <latexit 
sha1_base64="kN3sQo8OP8glKG68w4PsUtC/f3k=">AAACMXicbVDLSgNBEJyN73fUo5fBIHgx7IqiR9GLBw8RjArZEHonvcmQ2QczvYaw5Ev8DL/Aq35BbiJ48iecjQGfDQM1VdXTPRWkShpy3ZFTmpqemZ2bX1hcWl5ZXSuvb1ybJNMC6yJRib4NwKCSMdZJksLbVCNEgcKboHdW6Dd3qI1M4isapNiMoBPLUAogS7XKh/2W5ysMCbRO+tze9nwk8EMNIvdT0CRB8YvhF7aWYatccavuuPhf4E1AhU2q1iq/+e1EZBHGJBQY0/DclJp58aRQOFz0M4MpiB50sGFhDBGaZj7+3pDvWKbNw0TbExMfs987coiMGUSBdUZAXfNbK8j/tEZG4XEzl3GaEcbic1CYKU4JL7LibalRkBpYAEJLuysXXbDJkE30x5S2KVYrcvF+p/AXXO9XPbfqXR5UTk4nCc2zLbbNdpnHjtgJO2c1VmeC3bNH9sSenQdn5Lw4r5/WkjPp2WQ/ynn/AO/Rq1o=</latexit> w2 w2 ⌘ @L @w2 <latexit sha1_base64="XDVYNpwB7UZogd7iSX6oklwpHpw=">AAACMXicbVDLSgNBEJz1bXxFPXoZDIIXw64oehS9ePAQwZhANoTeSa8Ozj6Y6TWEJV/iZ/gFXvULchPBkz/hbAz4iA0DNVXV0z0VpEoact2hMzU9Mzs3v7BYWlpeWV0rr29cmyTTAusiUYluBmBQyRjrJElhM9UIUaCwEdydFXrjHrWRSXxF/RTbEdzEMpQCyFKd8mGvs+8rDAm0Tnrc3vZ8JPBDDSL3U9AkQfGLwTe2lkGnXHGr7qj4JPDGoMLGVeuU3/1uIrIIYxIKjGl5bkrtvHhSKByU/MxgCuIObrBlYQwRmnY++t6A71imy8NE2xMTH7E/O3KIjOlHgXVGQLfmr1aQ/2mtjMLjdi7jNCOMxdegMFOcEl5kxbtSoyDVtwCElnZXLm7BJkM20V9TuqZYrcjF+5vCJLjer3pu1bs8qJycjhNaYFtsm+0yjx2xE3bOaqzOBHtgT+yZvTiPztB5dd6+rFPOuGeT/Srn4xP07atd</latexit> <latexit sha1_base64="XDVYNpwB7UZogd7iSX6oklwpHpw=">AAACMXicbVDLSgNBEJz1bXxFPXoZDIIXw64oehS9ePAQwZhANoTeSa8Ozj6Y6TWEJV/iZ/gFXvULchPBkz/hbAz4iA0DNVXV0z0VpEoact2hMzU9Mzs3v7BYWlpeWV0rr29cmyTTAusiUYluBmBQyRjrJElhM9UIUaCwEdydFXrjHrWRSXxF/RTbEdzEMpQCyFKd8mGvs+8rDAm0Tnrc3vZ8JPBDDSL3U9AkQfGLwTe2lkGnXHGr7qj4JPDGoMLGVeuU3/1uIrIIYxIKjGl5bkrtvHhSKByU/MxgCuIObrBlYQwRmnY++t6A71imy8NE2xMTH7E/O3KIjOlHgXVGQLfmr1aQ/2mtjMLjdi7jNCOMxdegMFOcEl5kxbtSoyDVtwCElnZXLm7BJkM20V9TuqZYrcjF+5vCJLjer3pu1bs8qJycjhNaYFtsm+0yjx2xE3bOaqzOBHtgT+yZvTiPztB5dd6+rFPOuGeT/Srn4xP07atd</latexit> <latexit 
sha1_base64="XDVYNpwB7UZogd7iSX6oklwpHpw=">AAACMXicbVDLSgNBEJz1bXxFPXoZDIIXw64oehS9ePAQwZhANoTeSa8Ozj6Y6TWEJV/iZ/gFXvULchPBkz/hbAz4iA0DNVXV0z0VpEoact2hMzU9Mzs3v7BYWlpeWV0rr29cmyTTAusiUYluBmBQyRjrJElhM9UIUaCwEdydFXrjHrWRSXxF/RTbEdzEMpQCyFKd8mGvs+8rDAm0Tnrc3vZ8JPBDDSL3U9AkQfGLwTe2lkGnXHGr7qj4JPDGoMLGVeuU3/1uIrIIYxIKjGl5bkrtvHhSKByU/MxgCuIObrBlYQwRmnY++t6A71imy8NE2xMTH7E/O3KIjOlHgXVGQLfmr1aQ/2mtjMLjdi7jNCOMxdegMFOcEl5kxbtSoyDVtwCElnZXLm7BJkM20V9TuqZYrcjF+5vCJLjer3pu1bs8qJycjhNaYFtsm+0yjx2xE3bOaqzOBHtgT+yZvTiPztB5dd6+rFPOuGeT/Srn4xP07atd</latexit> <latexit sha1_base64="XDVYNpwB7UZogd7iSX6oklwpHpw=">AAACMXicbVDLSgNBEJz1bXxFPXoZDIIXw64oehS9ePAQwZhANoTeSa8Ozj6Y6TWEJV/iZ/gFXvULchPBkz/hbAz4iA0DNVXV0z0VpEoact2hMzU9Mzs3v7BYWlpeWV0rr29cmyTTAusiUYluBmBQyRjrJElhM9UIUaCwEdydFXrjHrWRSXxF/RTbEdzEMpQCyFKd8mGvs+8rDAm0Tnrc3vZ8JPBDDSL3U9AkQfGLwTe2lkGnXHGr7qj4JPDGoMLGVeuU3/1uIrIIYxIKjGl5bkrtvHhSKByU/MxgCuIObrBlYQwRmnY++t6A71imy8NE2xMTH7E/O3KIjOlHgXVGQLfmr1aQ/2mtjMLjdi7jNCOMxdegMFOcEl5kxbtSoyDVtwCElnZXLm7BJkM20V9TuqZYrcjF+5vCJLjer3pu1bs8qJycjhNaYFtsm+0yjx2xE3bOaqzOBHtgT+yZvTiPztB5dd6+rFPOuGeT/Srn4xP07atd</latexit> hr > 0 <latexit sha1_base64="cIbhX78ttMoC8DFM/VOzOTz2skU=">AAAB/3icbVDLSsNAFL2pr1pfUZduBovgqiQi6EqKblxWMG2hDWUymbRDJ5MwMxFK6MIvcKtf4E7c+il+gP/hpM3Cth4YOJxzL/fMCVLOlHacb6uytr6xuVXdru3s7u0f2IdHbZVkklCPJDyR3QArypmgnmaa024qKY4DTjvB+K7wO09UKpaIRz1JqR/joWARI1gbyRsN5I0zsOtOw5kBrRK3JHUo0RrYP/0wIVlMhSYcK9VznVT7OZaaEU6ntX6maIrJGA9pz1CBY6r8fBZ2is6MEqIokeYJjWbq340cx0pN4sBMxliP1LJXiP95vUxH137ORJppKsj8UJRxpBNU/ByFTFKi+cQQTCQzWREZYYmJNv0sXAlVEW1qenGXW1gl7YuG6zTch8t687ZsqAoncArn4MIVNOEeWuABAQYv8Apv1rP1bn1Yn/PRilXuHMMCrK9f9mSWwA==</latexit> <latexit 
sha1_base64="cIbhX78ttMoC8DFM/VOzOTz2skU=">AAAB/3icbVDLSsNAFL2pr1pfUZduBovgqiQi6EqKblxWMG2hDWUymbRDJ5MwMxFK6MIvcKtf4E7c+il+gP/hpM3Cth4YOJxzL/fMCVLOlHacb6uytr6xuVXdru3s7u0f2IdHbZVkklCPJDyR3QArypmgnmaa024qKY4DTjvB+K7wO09UKpaIRz1JqR/joWARI1gbyRsN5I0zsOtOw5kBrRK3JHUo0RrYP/0wIVlMhSYcK9VznVT7OZaaEU6ntX6maIrJGA9pz1CBY6r8fBZ2is6MEqIokeYJjWbq340cx0pN4sBMxliP1LJXiP95vUxH137ORJppKsj8UJRxpBNU/ByFTFKi+cQQTCQzWREZYYmJNv0sXAlVEW1qenGXW1gl7YuG6zTch8t687ZsqAoncArn4MIVNOEeWuABAQYv8Apv1rP1bn1Yn/PRilXuHMMCrK9f9mSWwA==</latexit> <latexit sha1_base64="cIbhX78ttMoC8DFM/VOzOTz2skU=">AAAB/3icbVDLSsNAFL2pr1pfUZduBovgqiQi6EqKblxWMG2hDWUymbRDJ5MwMxFK6MIvcKtf4E7c+il+gP/hpM3Cth4YOJxzL/fMCVLOlHacb6uytr6xuVXdru3s7u0f2IdHbZVkklCPJDyR3QArypmgnmaa024qKY4DTjvB+K7wO09UKpaIRz1JqR/joWARI1gbyRsN5I0zsOtOw5kBrRK3JHUo0RrYP/0wIVlMhSYcK9VznVT7OZaaEU6ntX6maIrJGA9pz1CBY6r8fBZ2is6MEqIokeYJjWbq340cx0pN4sBMxliP1LJXiP95vUxH137ORJppKsj8UJRxpBNU/ByFTFKi+cQQTCQzWREZYYmJNv0sXAlVEW1qenGXW1gl7YuG6zTch8t687ZsqAoncArn4MIVNOEeWuABAQYv8Apv1rP1bn1Yn/PRilXuHMMCrK9f9mSWwA==</latexit> <latexit sha1_base64="cIbhX78ttMoC8DFM/VOzOTz2skU=">AAAB/3icbVDLSsNAFL2pr1pfUZduBovgqiQi6EqKblxWMG2hDWUymbRDJ5MwMxFK6MIvcKtf4E7c+il+gP/hpM3Cth4YOJxzL/fMCVLOlHacb6uytr6xuVXdru3s7u0f2IdHbZVkklCPJDyR3QArypmgnmaa024qKY4DTjvB+K7wO09UKpaIRz1JqR/joWARI1gbyRsN5I0zsOtOw5kBrRK3JHUo0RrYP/0wIVlMhSYcK9VznVT7OZaaEU6ntX6maIrJGA9pz1CBY6r8fBZ2is6MEqIokeYJjWbq340cx0pN4sBMxliP1LJXiP95vUxH137ORJppKsj8UJRxpBNU/ByFTFKi+cQQTCQzWREZYYmJNv0sXAlVEW1qenGXW1gl7YuG6zTch8t687ZsqAoncArn4MIVNOEeWuABAQYv8Apv1rP1bn1Yn/PRilXuHMMCrK9f9mSWwA==</latexit>
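The chain rule above can be sanity-checked numerically. A minimal NumPy sketch (sizes and seed are arbitrary; the $1/N_O$ normalization is omitted to match the sum-of-squares loss used in the code on the following slides) comparing the analytic gradient with a central finite difference:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3))
y = rng.standard_normal((4, 2))
w1 = rng.standard_normal((3, 5))
w2 = rng.standard_normal((5, 2))

def loss(w1, w2):
    h = x.dot(w1)
    h_r = np.maximum(h, 0)       # ReLU
    y_p = h_r.dot(w2)
    return np.square(y_p - y).sum()

# analytic gradient via the chain rule
h = x.dot(w1)
h_r = np.maximum(h, 0)
y_p = h_r.dot(w2)
grad_y_p = 2.0 * (y_p - y)
grad_w2 = h_r.T.dot(grad_y_p)
grad_h = grad_y_p.dot(w2.T) * (h > 0)   # dh_r/dh is 1 only where h > 0
grad_w1 = x.T.dot(grad_h)

# central finite difference for one entry of w1
eps = 1e-6
w1p = w1.copy(); w1p[0, 0] += eps
w1m = w1.copy(); w1m[0, 0] -= eps
num = (loss(w1p, w2) - loss(w1m, w2)) / (2 * eps)
agree = abs(num - grad_w1[0, 0]) < 1e-4
print(agree)
```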
  • 23. Implementation in NumPy only (00_numpy.py)

import numpy as np

epochs = 300
batch_size = 32
D_in = 784
H = 100
D_out = 10
learning_rate = 1.0e-06

# create random input and output data
x = np.random.randn(batch_size, D_in)
y = np.random.randn(batch_size, D_out)

# randomly initialize weights
w1 = np.random.randn(D_in, H)
w2 = np.random.randn(H, D_out)

for epoch in range(epochs):
    # forward pass
    h = x.dot(w1)           # h = x * w1
    h_r = np.maximum(h, 0)  # h_r = ReLU(h)
    y_p = h_r.dot(w2)       # y_p = h_r * w2

    # compute mean squared error and print loss
    loss = np.square(y_p - y).sum()
    print(epoch, loss)

    # backward pass: compute gradients of loss with respect to w2
    grad_y_p = 2.0 * (y_p - y)
    grad_w2 = h_r.T.dot(grad_y_p)

    # backward pass: compute gradients of loss with respect to w1
    grad_h_r = grad_y_p.dot(w2.T)
    grad_h = grad_h_r.copy()
    grad_h[h < 0] = 0
    grad_w1 = x.T.dot(grad_h)

    # update weights
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2

These lines implement the gradients and update rules

$\frac{\partial L}{\partial w_2} = \frac{\partial L}{\partial y_p}\frac{\partial y_p}{\partial w_2} = \frac{1}{N_O}\, 2 (y_p - y)\, h_r$

$\frac{\partial L}{\partial w_1} = \frac{\partial L}{\partial y_p}\frac{\partial y_p}{\partial h_r}\frac{\partial h_r}{\partial w_1} = \frac{1}{N_O}\, 2 (y_p - y)\, w_2\, x$

$w_1 \leftarrow w_1 - \eta \frac{\partial L}{\partial w_1}, \qquad w_2 \leftarrow w_2 - \eta \frac{\partial L}{\partial w_2}$
  • 24. Introducing PyTorch (01_tensors.py)

import torch

epochs = 300
batch_size = 32
D_in = 784
H = 100
D_out = 10
learning_rate = 1.0e-06

# create random input and output data
x = torch.randn(batch_size, D_in)
y = torch.randn(batch_size, D_out)

# randomly initialize weights
w1 = torch.randn(D_in, H)
w2 = torch.randn(H, D_out)

for epoch in range(epochs):
    # forward pass: compute predicted y
    h = x.mm(w1)
    h_r = h.clamp(min=0)
    y_p = h_r.mm(w2)

    # compute and print loss
    loss = (y_p - y).pow(2).sum().item()
    print(epoch, loss)

    # backward pass: compute gradients of loss with respect to w2
    grad_y_p = 2.0 * (y_p - y)
    grad_w2 = h_r.t().mm(grad_y_p)

    # backward pass: compute gradients of loss with respect to w1
    grad_h_r = grad_y_p.mm(w2.t())
    grad_h = grad_h_r.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)

    # update weights
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2

NumPy → PyTorch correspondence:
    np.random           →  torch
    x.dot(w1)           →  x.mm(w1)
    np.maximum(h, 0)    →  h.clamp(min=0)
    np.square(y_p - y)  →  (y_p - y).pow(2)
    copy()              →  clone()
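The NumPy → PyTorch correspondences can be checked directly. A minimal sketch (assuming both NumPy and PyTorch are installed; data and sizes are arbitrary):

```python
import numpy as np
import torch

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 3))
b = rng.standard_normal((3, 2))
ta, tb = torch.from_numpy(a), torch.from_numpy(b)

# x.dot(w) in NumPy corresponds to x.mm(w) in PyTorch
dot_ok = np.allclose(a.dot(b), ta.mm(tb).numpy())
# np.maximum(h, 0) corresponds to h.clamp(min=0)
relu_ok = np.allclose(np.maximum(a, 0), ta.clamp(min=0).numpy())
# np.square(d) corresponds to d.pow(2)
square_ok = np.allclose(np.square(a), ta.pow(2).numpy())
print(dot_ok, relu_ok, square_ok)
```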
  • 25. Introducing automatic differentiation (02_autograd.py)

# randomly initialize weights
w1 = torch.randn(D_in, H, requires_grad=True)
w2 = torch.randn(H, D_out, requires_grad=True)

for epoch in range(epochs):
    # forward pass: compute predicted y
    h = x.mm(w1)
    h_r = h.clamp(min=0)
    y_p = h_r.mm(w2)

    # compute and print loss
    loss = (y_p - y).pow(2).sum()
    print(epoch, loss.item())

    # backward pass
    loss.backward()

    with torch.no_grad():
        # update weights
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad

        # reset gradients
        w1.grad.zero_()
        w2.grad.zero_()

Compared with 01_tensors.py, the weights are created with requires_grad=True and the entire hand-written backward pass (grad_y_p, grad_w2, grad_h_r, grad_h, grad_w1) is replaced by a single loss.backward() call: the derivatives $\frac{\partial L}{\partial w_1} = \frac{\partial L}{\partial y_p}\frac{\partial y_p}{\partial h_r}\frac{\partial h_r}{\partial w_1} = \frac{1}{N_O}\, 2 (y_p - y)\, w_2\, x$ are computed automatically.
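That loss.backward() really computes the same gradients as the manual backward pass can be verified on small tensors. A sketch (sizes and seed are arbitrary, assuming PyTorch is installed):

```python
import torch

torch.manual_seed(0)
x = torch.randn(8, 5)
y = torch.randn(8, 3)
w1 = torch.randn(5, 4, requires_grad=True)
w2 = torch.randn(4, 3, requires_grad=True)

# forward + autograd backward, as in 02_autograd.py
h = x.mm(w1)
h_r = h.clamp(min=0)
y_p = h_r.mm(w2)
loss = (y_p - y).pow(2).sum()
loss.backward()

# manual backward pass, as in 01_tensors.py
with torch.no_grad():
    grad_y_p = 2.0 * (y_p - y)
    grad_w2 = h_r.t().mm(grad_y_p)
    grad_h = grad_y_p.mm(w2.t()).clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)

match_w1 = torch.allclose(w1.grad, grad_w1)
match_w2 = torch.allclose(w2.grad, grad_w2)
print(match_w1, match_w2)
```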
  • 26. Writing your own activation function (03_function.py)

In 02_autograd.py the ReLU (Rectified Linear Unit) y = f(x) was written inline:

import torch

for epoch in range(epochs):
    # forward pass: compute predicted y
    h = x.mm(w1)
    h_r = h.clamp(min=0)
    y_p = h_r.mm(w2)
    ...

In 03_function.py it is defined as a custom torch.autograd.Function with its own forward and backward:

import torch

class ReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        return input.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input < 0] = 0
        return grad_input

for epoch in range(epochs):
    # forward pass: compute predicted y
    relu = ReLU.apply
    h = x.mm(w1)
    h_r = relu(h)
    y_p = h_r.mm(w2)
    ...
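As a quick check, the custom Function's gradient can be compared against PyTorch's built-in ReLU. A sketch (the class is repeated so the block is self-contained; input values are random, so the non-differentiable point at 0 is not hit in practice):

```python
import torch

class ReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        return input.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input < 0] = 0
        return grad_input

torch.manual_seed(0)
x = torch.randn(6, requires_grad=True)

# gradient through the custom Function
g_custom, = torch.autograd.grad(ReLU.apply(x).sum(), x)
# gradient through the built-in ReLU
g_builtin, = torch.autograd.grad(torch.relu(x).sum(), x)

same = torch.equal(g_custom, g_builtin)
print(same)
```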
  • 27. Using torch.nn (04_nn_module.py)

The manually initialized weights, forward pass, and loss of 02_autograd.py are replaced by torch.nn modules:

# create random input and output data
x = torch.randn(batch_size, D_in)
y = torch.randn(batch_size, D_out)

# define model
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

# define loss function
criterion = torch.nn.MSELoss(reduction='sum')

for epoch in range(epochs):
    # forward pass: compute predicted y
    y_p = model(x)

    # compute and print loss
    loss = criterion(y_p, y)
    print(epoch, loss.item())

    # backward pass
    model.zero_grad()
    loss.backward()

    with torch.no_grad():
        # update weights
        for param in model.parameters():
            param -= learning_rate * param.grad
  • 28. Calling an optimizer (05_optimizer.py)

The manual parameter-update loop of 04_nn_module.py is replaced by torch.optim:

# define loss function
criterion = torch.nn.MSELoss(reduction='sum')

# define optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

for epoch in range(epochs):
    # forward pass: compute predicted y
    y_p = model(x)

    # compute and print loss
    loss = criterion(y_p, y)
    print(epoch, loss.item())

    # backward pass
    optimizer.zero_grad()
    loss.backward()

    # update weights
    optimizer.step()
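For plain SGD without momentum, optimizer.step() performs exactly the same update as the manual loop param -= lr * param.grad. A sketch comparing the two on identical single-layer models (sizes and data are arbitrary, assuming PyTorch is installed):

```python
import torch

torch.manual_seed(0)
x = torch.randn(4, 3)
y = torch.randn(4, 2)
lr = 0.1

# two identical one-layer models
m1 = torch.nn.Linear(3, 2)
m2 = torch.nn.Linear(3, 2)
m2.load_state_dict(m1.state_dict())

criterion = torch.nn.MSELoss(reduction='sum')

# manual update, as in 04_nn_module.py
loss = criterion(m1(x), y)
m1.zero_grad()
loss.backward()
with torch.no_grad():
    for p in m1.parameters():
        p -= lr * p.grad

# optimizer update, as in 05_optimizer.py
opt = torch.optim.SGD(m2.parameters(), lr=lr)
loss = criterion(m2(x), y)
opt.zero_grad()
loss.backward()
opt.step()

same = all(torch.allclose(p1, p2)
           for p1, p2 in zip(m1.parameters(), m2.parameters()))
print(same)
```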
  • 29. Writing your own model (06_mm_module.py)

The torch.nn.Sequential model of 05_optimizer.py is rewritten as a subclass of nn.Module:

import torch.nn as nn
import torch.nn.functional as F

class TwoLayerNet(nn.Module):
    def __init__(self, D_in, H, D_out):
        super(TwoLayerNet, self).__init__()
        self.fc1 = nn.Linear(D_in, H)
        self.fc2 = nn.Linear(H, D_out)

    def forward(self, x):
        h = self.fc1(x)
        h_r = F.relu(h)
        y_p = self.fc2(h_r)
        return y_p

# create random input and output data
x = torch.randn(batch_size, D_in)
y = torch.randn(batch_size, D_out)

# define model
model = TwoLayerNet(D_in, H, D_out)

# define loss function
criterion = nn.MSELoss(reduction='sum')
...

The layer structure defined in __init__ stays fixed during training, while forward() is executed at every step.
  • 30. Loading the MNIST dataset (07_mnist.py)

The random input and output data of 06_mm_module.py are replaced by the MNIST dataset, loaded through torchvision:

import torch.nn as nn
import torch.nn.functional as F
from torchvision import datasets, transforms

# read input data and labels
train_dataset = datasets.MNIST('./data',
                               train=True,
                               download=True,
                               transform=transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)

for epoch in range(epochs):
    # set model to training mode
    model.train()

    # loop over each batch from the training set
    for batch_idx, (x, y) in enumerate(train_loader):
        # forward pass: compute predicted y
        y_p = model(x)
        ...
  • 31. Validation with held-out data (08_validate.py)

Datasets are split into three roles: training data (used to fit the weights), validation data (used when trying out different hyperparameters or models), and test data (used only for the final accuracy evaluation).

def validate():
    model.eval()
    val_loss, val_acc = 0, 0
    for data, target in val_loader:
        output = model(data)
        loss = criterion(output, target)
        val_loss += loss.item()          # loss on the validation data
        pred = output.data.max(1)[1]     # predicted class
        # does the predicted class match the label? converted to a percentage;
        # sum() is slow on the GPU, so it is done on the CPU
        val_acc += 100. * pred.eq(target.data).cpu().sum() / target.size(0)

    val_loss /= len(val_loader)
    val_acc /= len(val_loader)
    print('\nValidation set: Average loss: {:.4f}, Accuracy: {:.1f}%\n'.format(
        val_loss, val_acc))
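The accuracy arithmetic in validate() (argmax over class scores, compare with labels, convert to a percentage) can be sketched without PyTorch. A NumPy version with a hypothetical batch of scores:

```python
import numpy as np

# hypothetical batch: 3 samples, 4 classes
output = np.array([[0.1, 2.0, 0.3, 0.1],
                   [1.5, 0.2, 0.1, 0.4],
                   [0.0, 0.1, 0.2, 3.0]])
target = np.array([1, 0, 2])

pred = output.argmax(axis=1)                         # like output.data.max(1)[1]
acc = 100.0 * (pred == target).sum() / target.size   # like pred.eq(target)...
print(pred, acc)   # two of three predictions match the labels
```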
  • 32. Writing train() and main() functions (09_train.py)

def train(train_loader, model, criterion, optimizer, epoch):
    model.train()
    t = time.perf_counter()
    for batch_idx, (data, target) in enumerate(train_loader):
        output = model(data)
        loss = criterion(output, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if batch_idx % 200 == 0:
            print('Train Epoch: {} [{:>5}/{} ({:.0%})]\tLoss: {:.6f}\tTime: {:.4f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                batch_idx / len(train_loader), loss.data.item(),
                time.perf_counter() - t))
            t = time.perf_counter()

def main():
    epochs = 10
    batch_size = 32
    learning_rate = 1.0e-02

    train_dataset = datasets.MNIST('./data',
                                   train=True,
                                   download=True,
                                   transform=transforms.ToTensor())
    val_dataset = datasets.MNIST('./data',
                                 train=False,
                                 transform=transforms.ToTensor())
    train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                               batch_size=batch_size,
                                               shuffle=True)
    val_loader = torch.utils.data.DataLoader(dataset=val_dataset,
                                             batch_size=batch_size,
                                             shuffle=False)

    model = TwoLayerNet(D_in, H, D_out)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

    for epoch in range(epochs):
        model.train()
        train(train_loader, model, criterion, optimizer, epoch)
        validate(val_loader, model, criterion)
  • 33. Convolutional neural network model (10_cnn.py)

The fully connected TwoLayerNet of 09_train.py:

class TwoLayerNet(nn.Module):
    def __init__(self, D_in, H, D_out):
        super(TwoLayerNet, self).__init__()
        self.fc1 = nn.Linear(D_in, H)
        self.fc2 = nn.Linear(H, D_out)

    def forward(self, x):
        x = x.view(-1, D_in)
        h = self.fc1(x)
        h_r = F.relu(h)
        y_p = self.fc2(h_r)
        return F.log_softmax(y_p, dim=1)

is replaced by a convolutional network:

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout2d(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output
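Where does nn.Linear(9216, 128) get 9216 from? It is the flattened feature-map size of a 28x28 MNIST image after the layers above; the shape arithmetic can be checked directly:

```python
# 28x28 input through two valid 3x3 convolutions (stride 1) and one 2x2 max-pool
size = 28
size = size - 3 + 1   # conv1: 3x3, stride 1 -> 26
size = size - 3 + 1   # conv2: 3x3, stride 1 -> 24
size = size // 2      # max_pool2d(x, 2)     -> 12
channels = 64         # output channels of conv2
flat = channels * size * size
print(flat)  # 9216
```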
  • 34. Using the GPU (11_gpu.py)

PyTorch calls cuDNN under the hood.
1. Select the device with torch.device('cuda').
2. Send data and target to the device.
3. All computation is then performed on the GPU automatically.

device = torch.device('cuda')
model = CNN().to(device)

def train(train_loader, model, criterion, optimizer, epoch):
    model.train()
    t = time.perf_counter()
    for batch_idx, (data, target) in enumerate(train_loader):
        data = data.to(device)
        target = target.to(device)
        ...

def validate(loss_vector, accuracy_vector):
    model.eval()
    val_loss, correct = 0, 0
    for data, target in validation_loader:
        data = data.to(device)
        target = target.to(device)
        ...
  • 35. Distributed parallelism (12_distributed.py)

import os
import torch
import torch.distributed as dist

# host address and port used for communication
master_addr = os.getenv("MASTER_ADDR", default="localhost")
master_port = os.getenv('MASTER_PORT', default='8888')
method = "tcp://{}:{}".format(master_addr, master_port)

# get rank and size from the OpenMPI environment variables
rank = int(os.getenv('OMPI_COMM_WORLD_RANK', '0'))
world_size = int(os.getenv('OMPI_COMM_WORLD_SIZE', '1'))

# pass them to PyTorch
dist.init_process_group("nccl", init_method=method, rank=rank, world_size=world_size)
print('Rank: {}, Size: {}'.format(dist.get_rank(), dist.get_world_size()))

ngpus = 4
device = rank % ngpus
x = torch.randn(1).to(device)
print('rank {}: {}'.format(rank, x))

# collective communication through PyTorch
dist.broadcast(x, src=0)
print('rank {}: {}'.format(rank, x))

Add the following to .bashrc:

if [ -f "$SGE_JOB_SPOOL_DIR/pe_hostfile" ]; then
    export MASTER_ADDR=`head -n 1 $SGE_JOB_SPOOL_DIR/pe_hostfile | cut -d " " -f 1`
fi

Run with:

mpirun -np 4 python 12_distributed.py
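The environment-variable logic above degrades gracefully: outside of mpirun the OpenMPI variables are unset and the script falls back to a single-process configuration. A small sketch of that fallback (the variable values set here are only for illustration):

```python
import os

# without mpirun the OpenMPI variables are unset -> rank 0, world size 1
os.environ.pop('OMPI_COMM_WORLD_RANK', None)
os.environ.pop('OMPI_COMM_WORLD_SIZE', None)
rank = int(os.getenv('OMPI_COMM_WORLD_RANK', '0'))
world_size = int(os.getenv('OMPI_COMM_WORLD_SIZE', '1'))

# under mpirun -np 4, each process sees its own rank in 0..3
os.environ['OMPI_COMM_WORLD_RANK'] = '2'
os.environ['OMPI_COMM_WORLD_SIZE'] = '4'
mpi_rank = int(os.getenv('OMPI_COMM_WORLD_RANK', '0'))
mpi_size = int(os.getenv('OMPI_COMM_WORLD_SIZE', '1'))
print(rank, world_size, mpi_rank, mpi_size)  # 0 1 2 4
```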
  • 36. Distributed-parallel MNIST (13_ddp.py)

    # Output from every process is hard to read, so define a print
    # function that only rank 0 uses
    def print0(message):
        if torch.distributed.is_initialized():
            if torch.distributed.get_rank() == 0:
                print(message, flush=True)
        else:
            print(message, flush=True)

    # Make different processes read different parts of the training data
    train_sampler = torch.utils.data.distributed.DistributedSampler(
        train_dataset,
        num_replicas=torch.distributed.get_world_size(),
        rank=torch.distributed.get_rank())
    ...
    # Wrapping the model in DDP() enables distributed-parallel computation
    model = DDP(model, device_ids=[rank])
  • 37. Argparse (14_args.py)

    import argparse
    import torch
    import torch.distributed as dist
    import torch.nn as nn

    parser = argparse.ArgumentParser(description='PyTorch MNIST Example')
    parser.add_argument('--batch-size', type=int, default=32, metavar='N',
                        help='input batch size for training (default: 32)')
    parser.add_argument('--epochs', type=int, default=10, metavar='N',
                        help='number of epochs to train (default: 10)')
    parser.add_argument('--lr', type=float, default=1.0e-02, metavar='LR',
                        help='learning rate (default: 1.0e-02)')
    args = parser.parse_args()

    epochs = args.epochs
    batch_size = args.batch_size
    learning_rate = args.lr * world_size

    Hard-coded numbers can now be replaced by the args variables.
    https://docs.python.org/ja/3/library/argparse.html#action
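  The parser on slide 37 can be tried on its own without any PyTorch code. A stdlib-only sketch, trimmed to the three options on the slide (note that --batch-size becomes args.batch_size: argparse turns hyphens into underscores):

```python
import argparse

parser = argparse.ArgumentParser(description='PyTorch MNIST Example')
parser.add_argument('--batch-size', type=int, default=32, metavar='N',
                    help='input batch size for training (default: 32)')
parser.add_argument('--epochs', type=int, default=10, metavar='N',
                    help='number of epochs to train (default: 10)')
parser.add_argument('--lr', type=float, default=1.0e-02, metavar='LR',
                    help='learning rate (default: 1.0e-02)')

# parse_args accepts an explicit argument list, which is handy for testing;
# options that are not given keep their defaults
args = parser.parse_args(['--batch-size', '64', '--lr', '0.05'])
print(args.batch_size, args.epochs, args.lr)  # 64 10 0.05
```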
  • 38. AverageMeter (15_meter.py)

    def train(train_loader, model, criterion, optimizer, epoch, device):
        batch_time = AverageMeter('Time', ':.4f')
        train_loss = AverageMeter('Loss', ':.6f')
        ...

    class AverageMeter(object):
        def __init__(self, name, fmt=':f'):
            self.name = name
            self.fmt = fmt       # output format
            self.reset()

        def reset(self):
            self.val = 0         # value
            self.avg = 0         # average
            self.sum = 0         # sum
            self.count = 0       # count

        def update(self, val, n=1):   # n > 1 when val is already the average of n items
            self.val = val
            self.sum += val * n
            self.count += n
            self.avg = self.sum / self.count

        def __str__(self):
            fmtstr = '{name} {val' + self.fmt + '} ({avg' + self.fmt + '})'
            return fmtstr.format(**self.__dict__)
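  A quick way to see the meter in action; this is a self-contained copy of the class from slide 38 so it runs on its own:

```python
class AverageMeter(object):
    """Tracks the latest value and the running average (as in 15_meter.py)."""
    def __init__(self, name, fmt=':f'):
        self.name = name
        self.fmt = fmt
        self.reset()

    def reset(self):
        self.val = 0
        self.avg = 0
        self.sum = 0
        self.count = 0

    def update(self, val, n=1):
        # n > 1 means val is already the mean of n samples
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count

    def __str__(self):
        fmtstr = '{name} {val' + self.fmt + '} ({avg' + self.fmt + '})'
        return fmtstr.format(**self.__dict__)

loss = AverageMeter('Loss', ':.6f')
loss.update(2.0)        # one sample with loss 2.0
loss.update(1.0, n=3)   # 1.0 is already the mean of 3 samples
# sum = 2.0 + 3*1.0 = 5.0, count = 4, so avg = 1.25
print(loss)             # Loss 1.000000 (1.250000)
```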
  • 39. ProgressMeter (15_meter.py)

    # Goal: display progress like [ current batch / total batches ]
    def train(train_loader, model, criterion, optimizer, epoch, device):
        batch_time = AverageMeter('Time', ':.4f')
        train_loss = AverageMeter('Loss', ':.6f')
        progress = ProgressMeter(
            len(train_loader),
            [train_loss, batch_time],            # variables to print
            prefix="Epoch: [{}]".format(epoch))

    class ProgressMeter(object):
        def __init__(self, num_batches, meters, prefix="", postfix=""):
            self.batch_fmtstr = self._get_batch_fmtstr(num_batches)
            self.meters = meters
            self.prefix = prefix     # printed before the meters
            self.postfix = postfix   # printed after the meters

        def display(self, batch):
            entries = [self.prefix + self.batch_fmtstr.format(batch)]
            entries += [str(meter) for meter in self.meters]
            entries += self.postfix
            print0('\t'.join(entries))   # join everything to print

        def _get_batch_fmtstr(self, num_batches):
            num_digits = len(str(num_batches // 1))
            fmt = '{:' + str(num_digits) + 'd}'
            return '[' + fmt + '/' + fmt.format(num_batches) + ']'
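  The _get_batch_fmtstr trick pads the current batch number to the width of the total, so the brackets stay aligned across the whole epoch. A stdlib-only sketch of just that helper (as a standalone function rather than a method):

```python
def get_batch_fmtstr(num_batches):
    """Build a '[  7/100]'-style template whose field width matches num_batches."""
    num_digits = len(str(num_batches))
    fmt = '{:' + str(num_digits) + 'd}'      # e.g. '{:3d}' for 100 batches
    return '[' + fmt + '/' + fmt.format(num_batches) + ']'

template = get_batch_fmtstr(100)
print(template.format(7))    # [  7/100]
print(template.format(42))   # [ 42/100]
```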
  • 40. Weights and Biases (16_wandb.py)

    pip install wandb
    wandb login

    import wandb

    os.environ['MASTER_ADDR'] = 'localhost'
    os.environ['MASTER_PORT'] = '8888'
    rank = int(os.getenv('OMPI_COMM_WORLD_RANK', '0'))
    world_size = int(os.getenv('OMPI_COMM_WORLD_SIZE', '1'))
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    device = torch.device('cuda', rank)

    if torch.distributed.get_rank() == 0:
        wandb.init(project="example-project")   # initialize wandb
        wandb.config.update(args)               # passing args records the experiment settings automatically

    epochs = args.epochs
    batch_size = args.batch_size
    learning_rate = args.lr * world_size

    # train and validate now return loss and accuracy
    for epoch in range(epochs):
        model.train()
        train_loss, train_acc = train(train_loader, model, criterion, optimizer, epoch, device)
        val_loss, val_acc = validate(val_loader, model, criterion, device)
        if torch.distributed.get_rank() == 0:
            wandb.log({                # variables to record with wandb
                'train_loss': train_loss,
                'train_acc': train_acc,
                'val_loss': val_loss,
                'val_acc': val_acc
            })
  • 41. CIFAR10 (17_cifar10.py)

    # Change the dataset
    train_dataset = datasets.CIFAR10('./data',
                                     train=True,
                                     download=True,
                                     transform=transforms.ToTensor())
    val_dataset = datasets.CIFAR10('./data',
                                   train=False,
                                   download=True,
                                   transform=transforms.ToTensor())

    # Change the model
    model = VGG('VGG19').to(device)
    model = DDP(model, device_ids=[rank % 4])
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
  • 42. Data augmentation (18_augmentation.py)

    transform_train = transforms.Compose([
        transforms.RandomCrop(32, padding=4),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        # Normalize pixel intensities (per-channel mean, std)
        transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
    ])
    transform_val = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
    ])
  • 43. Regularization (19_regularization.py)

    parser.add_argument('--momentum', type=float, default=0.9, metavar='M',
                        help='momentum (default: 0.9)')
    parser.add_argument('--wd', '--weight_decay', type=float, default=5.0e-04, metavar='W',
                        help='weight decay (default: 5.0e-04)')

    optimizer = torch.optim.SGD(model.parameters(), lr=args.lr,
                                momentum=args.momentum, weight_decay=args.wd)

    L2 regularization adds a penalty on the parameter norm to the loss l:

        \tilde{l} = l + \frac{\lambda}{2} \lVert\theta\rVert^2
        \nabla\tilde{l} = \nabla l + \lambda\theta

    so the SGD update on the regularized loss becomes

        \theta_{t+1} = \theta_t - \eta\,\nabla l(\theta_t) - \eta\lambda\theta_t

    where the last term is the L2 (weight-decay) term; momentum is applied
    on top of this via the momentum=args.momentum argument.
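  The regularized update rule on slide 43 can be checked numerically. Below is a stdlib-only sketch of one SGD step with L2 regularization (weight decay), plus a PyTorch-style momentum variant; the function names and the lr/momentum/weight-decay values are illustrative, not from the course code:

```python
def sgd_step(theta, grad, lr, weight_decay):
    """Plain SGD on the L2-regularized loss:
    theta_{t+1} = theta_t - lr*grad - lr*weight_decay*theta_t"""
    return theta - lr * grad - lr * weight_decay * theta

def sgd_momentum_step(theta, v, grad, lr, momentum, weight_decay):
    """PyTorch-style SGD with momentum: weight decay is folded into the
    gradient, which is then accumulated in the velocity buffer v."""
    g = grad + weight_decay * theta   # gradient of the regularized loss
    v = momentum * v + g
    return theta - lr * v, v

# theta=2.0, grad=1.0, lr=0.1, wd=0.5: 2.0 - 0.1 - 0.1*0.5*2.0 = 1.8
print(sgd_step(2.0, 1.0, 0.1, 0.5))
# The first momentum step (v=0) matches plain SGD; it also returns the new v
print(sgd_momentum_step(2.0, 0.0, 1.0, 0.1, 0.9, 0.5))
```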
  • 44. Sweep (sweep.yaml)

    program: wrapper.py
    method: grid
    metric:
      goal: minimize
      name: val_loss
    parameters:
      epochs:
        values: [100]
      batch_size:
        values: [32]
      learning_rate:
        values: [0.005, 0.01, 0.02, 0.05, 0.1]
      momentum:
        values: [0.85, 0.9, 0.95]
      weight_decay:
        values: [1.0e-4, 2.0e-4, 5.0e-4, 1.0e-3, 2.0e-3]

    Launch the sweep with:

    wandb sweep sweep.yaml
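  A grid sweep runs every combination of the listed values, so the configuration on slide 44 launches 1 x 1 x 5 x 3 x 5 = 75 runs. A stdlib-only sketch that enumerates the same grid (wandb builds these combinations for you; this just shows what "method: grid" expands to):

```python
from itertools import product

# Same parameter grid as sweep.yaml
grid = {
    'epochs': [100],
    'batch_size': [32],
    'learning_rate': [0.005, 0.01, 0.02, 0.05, 0.1],
    'momentum': [0.85, 0.9, 0.95],
    'weight_decay': [1.0e-4, 2.0e-4, 5.0e-4, 1.0e-3, 2.0e-3],
}

# Every combination the grid sweep would launch, one dict per run
configs = [dict(zip(grid, combo)) for combo in product(*grid.values())]
print(len(configs))   # 75
print(configs[0])
```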
  • 45. Models (19_regularization.py)

    model = VGG('VGG19').to(device)            # currently in use
    # model = ResNet18().to(device)
    # model = PreActResNet18().to(device)
    # model = GoogLeNet().to(device)
    # model = DenseNet121().to(device)
    # model = ResNeXt29_2x64d().to(device)
    # model = MobileNet().to(device)
    # model = MobileNetV2().to(device)
    # model = DPN92().to(device)
    # model = ShuffleNetG2().to(device)
    # model = SENet18().to(device)
    # model = ShuffleNetV2(1).to(device)
    # model = EfficientNetB0().to(device)
    # model = RegNetX_200MF().to(device)

    Try the other models as well.
  • 46. References

    Learning PyTorch with Examples
    https://pytorch.org/tutorials/beginner/pytorch_with_examples.html
    PyTorch Examples github
    https://github.com/pytorch/examples
    PyTorch Tutorial github
    https://github.com/yunjey/pytorch-tutorial
    Understanding PyTorch with an example: a step-by-step tutorial by Daniel Godoy
    https://towardsdatascience.com/understanding-pytorch-with-an-example-a-step-by-step-tutorial-81fc5f8c4e8e
    Practical Deep Learning for Coders, v3 by fast.ai
    https://course.fast.ai
    PyTorch by Beeren Sahu
    https://beerensahu.wordpress.com/2018/03/21/pytorch-tutorial-lesson-1-tensor/
    Writing Distributed Applications with PyTorch by Séb Arnold
    https://pytorch.org/tutorials/intermediate/dist_tuto.html