Special Lecture on Computational Science and Technology A
Lecture 14: Fundamentals and Practice of Deep Learning Frameworks 1
Tokyo Institute of Technology, Global Scientific Information and Computing Center
Rio Yokota
rioyokota@gsic.titech.ac.jp
Deep Learning Only Possible on Supercomputers

[Figure: Top-1 accuracy vs. number of parameters (10^7 to 10^13, a range of roughly x100,000) for ResNet-50, DistilBERT, ELMo, BERT-Large, GPT-2, MegatronLM, Turing-NLG, GPT-3, and Switch Transformer. The ImageNet SOTA of 90.45% Top-1 accuracy cost about 10,000 TPUv3 core-days on TPU v3.]

Compute cost is no object. On Papers with Code (https://paperswithcode.com; ImageNet-1k, MLPerf target score: 75.9), many of the papers reporting the best results can be reproduced on hardware at hand.
Evolution of Major Deep Neural Network Models
https://towardsdatascience.com/from-lenet-to-efficientnet-the-evolution-of-cnns-3a57eb34672f

1995: LeNet-5 (convolution), LSTM
2012: AlexNet (ReLU, Dropout, GPU)
2015: ResNet (skip connections)
2017: MobileNet (1x1 convolutions)
2019: EfficientNet (neural architecture search)
2021: Transformer (attention mechanism), Vision Transformer (image patches)
Training Deep Neural Networks

Example: an image is classified as Labradoodle with p = 0.9 and as Fried chicken with p = 0.1, against the one-hot label y = (1, 0) (y: label).

Forward propagation:

x = h_0
u_0 = W_0 h_0
h_1 = f_0(u_0)
u_1 = W_1 h_1
p = f_1(u_1)

Cross entropy loss:

L = -Σ_data Σ_class y log p = -Σ_data log p

Backward propagation (error backpropagation):

∂L/∂W_1 = Σ_data (∂L/∂p) (∂p/∂u_1) (∂u_1/∂W_1)
∂L/∂W_0 = Σ_data (∂L/∂p) (∂p/∂u_1) (∂u_1/∂h_1) (∂h_1/∂u_0) (∂u_0/∂W_0)

- The derivatives of the four arithmetic operations and of elementary functions are defined inside the framework.
- Chaining them lets the gradient be computed through matrix products.
- Multiplying from the back turns everything into matrix-vector products.
- This is done for each image, and the sum is taken at the end.

Stochastic gradient descent (SGD):

W_{t+1} = W_t - η ∂L/∂W_t
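A minimal NumPy sketch of the forward and backward pass above (the choices f_0 = ReLU, f_1 = softmax, and the layer sizes are assumptions for illustration), checked against a numerical gradient:

```python
import numpy as np

def forward(x, W0, W1):
    # x = h0; u0 = W0 h0; h1 = f0(u0); u1 = W1 h1; p = f1(u1)
    h0 = x
    u0 = W0 @ h0
    h1 = np.maximum(u0, 0.0)           # f0 = ReLU
    u1 = W1 @ h1
    e = np.exp(u1 - u1.max())
    p = e / e.sum()                    # f1 = softmax
    return h0, u0, h1, u1, p

def backward(y, h0, u0, h1, p, W1):
    # Cross entropy L = -sum(y log p); with softmax, dL/du1 = p - y
    du1 = p - y
    dW1 = np.outer(du1, h1)            # (dL/du1)(du1/dW1)
    dh1 = W1.T @ du1                   # multiplying from the back: matrix-vector products
    du0 = dh1 * (u0 > 0)               # through f0' (ReLU)
    dW0 = np.outer(du0, h0)
    return dW0, dW1

rng = np.random.default_rng(0)
x = rng.standard_normal(5)
y = np.array([1.0, 0.0])               # one-hot label
W0 = rng.standard_normal((3, 5)) * 0.1
W1 = rng.standard_normal((2, 3)) * 0.1

h0, u0, h1, u1, p = forward(x, W0, W1)
dW0, dW1 = backward(y, h0, u0, h1, p, W1)

# Compare with a central-difference gradient of L = -sum(y log p)
def loss(W0, W1):
    *_, p = forward(x, W0, W1)
    return -np.sum(y * np.log(p))

eps = 1e-6
num = np.zeros_like(W1)
for i in range(W1.shape[0]):
    for j in range(W1.shape[1]):
        Wp = W1.copy(); Wp[i, j] += eps
        Wm = W1.copy(); Wm[i, j] -= eps
        num[i, j] = (loss(W0, Wp) - loss(W0, Wm)) / (2 * eps)
print(np.allclose(dW1, num, atol=1e-5))   # True
```

Each image would contribute one such gradient; the framework sums them over the mini-batch.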
Optimization Methods
https://losslandscape.com

The weights W and biases b are lumped together as θ. The shape of the loss function changes with every mini-batch.

SGD:

θ_{t+1} = θ_t - η ∇L(θ_t)

momentum SGD (adds a momentum term):

v_{t+1} = v_t + η ∇L(θ_t)
θ_{t+1} = θ_t - v_{t+1}

Written in semi-implicit Euler style:

a_t = -∇L(θ_t)
v_{t+1} = v_t + a_t Δt
θ_{t+1} = θ_t + v_{t+1} Δt

so for physical consistency one needs η = Δt².

Nesterov momentum (gradient evaluated at the look-ahead point):

v_{t+1} = v_t + η ∇L(θ_t - v_t)
θ_{t+1} = θ_t - v_{t+1}

[Figure: update vectors for momentum vs. Nesterov momentum: from θ_t, the velocity v_t and the gradient step η∇L(θ_t) (for Nesterov, η∇L(θ_t - v_t)) combine to give θ_{t+1}.]

RMSProp (momentum term + normalization by a gradient-variance term):

v_{t+1} = ρ v_t + (1 - ρ) ∇L(θ_t)²      (gradient-variance term)
m_{t+1} = m_t + η ∇L(θ_t) / √(v_{t+1} + ε)   (momentum term + normalization)
θ_{t+1} = θ_t - m_{t+1}

Adam:

m_{t+1} = β_1 m_t + (1 - β_1) ∇L(θ_t)        (momentum term)
v_{t+1} = β_2 v_t + (1 - β_2) ∇L(θ_t)²       (gradient-variance term)
b_{t+1} = √(1 - β_2^{t+1}) / (1 - β_1^{t+1})   (initial-bias correction term)
θ_{t+1} = θ_t - α b_{t+1} m_{t+1} / √(v_{t+1} + ε)   (normalization)
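The update rules above can be written out directly. A minimal sketch on a made-up quadratic loss, with the usual default hyperparameters assumed:

```python
import numpy as np

def grad(theta):
    # gradient of the toy loss L(theta) = 0.5 * ||theta||^2 (a made-up example)
    return theta

def sgd(theta, eta=0.1, steps=100):
    # theta_{t+1} = theta_t - eta * grad L(theta_t)
    for _ in range(steps):
        theta = theta - eta * grad(theta)
    return theta

def momentum_sgd(theta, eta=0.01, steps=100):
    # v_{t+1} = v_t + eta * grad L(theta_t); theta_{t+1} = theta_t - v_{t+1}
    v = np.zeros_like(theta)
    for _ in range(steps):
        v = v + eta * grad(theta)
        theta = theta - v
    return theta

def adam(theta, alpha=0.1, b1=0.9, b2=0.999, eps=1e-8, steps=100):
    m, v = np.zeros_like(theta), np.zeros_like(theta)
    for t in range(1, steps + 1):
        g = grad(theta)
        m = b1 * m + (1 - b1) * g              # momentum term
        v = b2 * v + (1 - b2) * g ** 2         # gradient-variance term
        b = np.sqrt(1 - b2 ** t) / (1 - b1 ** t)   # initial-bias correction
        theta = theta - alpha * b * m / np.sqrt(v + eps)
    return theta

theta0 = np.array([2.0, -3.0])
print(np.abs(sgd(theta0)).max() < 1e-3)                             # True
print(0.5 * np.sum(adam(theta0) ** 2) < 0.5 * np.sum(theta0 ** 2))  # True
```

On this quadratic, SGD contracts geometrically; the momentum variant, having no friction term, keeps oscillating, which is exactly the semi-implicit Euler behavior the slide alludes to.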
Optimization Methods
https://arxiv.org/pdf/2007.01547.pdf

These optimization methods form inclusion relations: an optimizer that includes another, once its hyperparameters are tuned, is guaranteed to achieve accuracy at least as good (https://arxiv.org/pdf/1910.05446.pdf; lower is better).
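The simplest such inclusion can be checked mechanically. The momentum rule on the earlier slide carries no velocity-decay coefficient; adding one (γ, an assumption for this sketch, as in the common formulation) makes the inclusion explicit: setting γ = 0 collapses momentum SGD to plain SGD, step for step:

```python
import numpy as np

def grad(theta):
    # toy loss L(theta) = 0.5 * ||theta||^2 (a made-up example)
    return theta

def momentum_sgd(theta, gamma, eta=0.1, steps=10):
    # v_{t+1} = gamma * v_t + eta * grad L(theta_t); theta_{t+1} = theta_t - v_{t+1}
    traj, v = [], np.zeros_like(theta)
    for _ in range(steps):
        v = gamma * v + eta * grad(theta)
        theta = theta - v
        traj.append(theta.copy())
    return traj

def sgd(theta, eta=0.1, steps=10):
    traj = []
    for _ in range(steps):
        theta = theta - eta * grad(theta)
        traj.append(theta.copy())
    return traj

theta0 = np.array([2.0, -3.0])
same = all(np.allclose(a, b)
           for a, b in zip(momentum_sgd(theta0, gamma=0.0), sgd(theta0)))
print(same)   # True: gamma = 0 recovers plain SGD exactly
```

The paper's stronger claim is the same idea at scale: the special case can never beat the general optimizer once the extra hyperparameters are tuned.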
Evolution of Major Deep Neural Network Models
https://towardsdatascience.com/from-lenet-to-efficientnet-the-evolution-of-cnns-3a57eb34672f

1995: LeNet-5 (convolution), LSTM
2012: AlexNet (ReLU, Dropout, GPU)
2015: ResNet (skip connections)
2017: MobileNet (squeeze and excite)
2019: EfficientNet (neural architecture search)
2021: Transformer (attention mechanism), Vision Transformer (image patches)
Convolutional Neural Networks
https://cs231n.github.io/convolutional-networks/

[Input/output tensor dimensions]
N: batch size
C: number of channels
H: image height
W: image width

[Convolution parameters]
F: filter size
P: padding width
S: stride

Example with 3 input channels and 2 output channels:
[Input] N: 1, C_in: 3, H_in: 5, W_in: 5
[Output] N: 1, C_out: 2, H_out: 3, W_out: 3
F: 3, P: 1, S: 2
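The output size in the example follows the usual relation H_out = (H_in + 2P - F)/S + 1 = (5 + 2·1 - 3)/2 + 1 = 3, which a naive convolution (a sketch written for clarity, not speed) confirms:

```python
import numpy as np

def conv2d(x, w, P=0, S=1):
    """Naive 2D convolution: x is (Cin, H, W), w is (Cout, Cin, F, F)."""
    Cout, Cin, F, _ = w.shape
    _, H, W = x.shape
    Hout = (H + 2 * P - F) // S + 1
    Wout = (W + 2 * P - F) // S + 1
    xp = np.pad(x, ((0, 0), (P, P), (P, P)))    # zero padding of width P
    out = np.zeros((Cout, Hout, Wout))
    for co in range(Cout):
        for i in range(Hout):
            for j in range(Wout):
                patch = xp[:, i * S:i * S + F, j * S:j * S + F]
                out[co, i, j] = np.sum(patch * w[co])
    return out

# The slide's example: Cin=3, Hin=Win=5, F=3, P=1, S=2 -> Cout=2, Hout=Wout=3
x = np.random.rand(3, 5, 5)
w = np.random.rand(2, 3, 3, 3)
print(conv2d(x, w, P=1, S=2).shape)   # (2, 3, 3)
```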
Convolution algorithms: GEMM, batched GEMM, FFT, Winograd. Each transforms the input image and the filter into a form whose product yields the output image.
http://cs231n.stanford.edu/reports/2016/pdfs/117_Report.pdf
https://arxiv.org/abs/1410.0759
https://www.slideshare.net/nervanasys/an-analysis-of-convolution-for-inference
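Of these, the GEMM approach (im2col) is the easiest to sketch: each receptive-field patch is unrolled into a column, after which the whole convolution is a single matrix multiply. A minimal illustration, assuming stride 1 and no padding:

```python
import numpy as np

def im2col(x, F):
    # x: (Cin, H, W) -> columns of shape (Cin*F*F, Hout*Wout), stride 1, no padding
    Cin, H, W = x.shape
    Hout, Wout = H - F + 1, W - F + 1
    cols = np.zeros((Cin * F * F, Hout * Wout))
    for i in range(Hout):
        for j in range(Wout):
            cols[:, i * Wout + j] = x[:, i:i + F, j:j + F].ravel()
    return cols

def conv_gemm(x, w):
    # w: (Cout, Cin, F, F); flatten filters and do one GEMM against the columns
    Cout, Cin, F, _ = w.shape
    cols = im2col(x, F)
    out = w.reshape(Cout, -1) @ cols            # (Cout, Hout*Wout)
    Hout = x.shape[1] - F + 1
    return out.reshape(Cout, Hout, -1)

x = np.random.rand(3, 5, 5)
w = np.random.rand(2, 3, 3, 3)
print(conv_gemm(x, w).shape)   # (2, 3, 3)
```

The memory cost is the duplicated patches in `cols`; the payoff is that the heavy lifting lands in a highly tuned GEMM kernel.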
Normalization
https://theaisummer.com/normalization/

Each method standardizes activations as

x̂ = γ (x - μ(x)) / σ(x) + β

and they differ only in the axes over which the mean μ and standard deviation σ are computed.

Batch normalization (BN): statistics over the batch and spatial axes, separately for each channel c:

μ(x) = 1/(NHW) Σ_{n=1}^{N} Σ_{h=1}^{H} Σ_{w=1}^{W} x_{nchw}
σ(x) = √( 1/(NHW) Σ_{n=1}^{N} Σ_{h=1}^{H} Σ_{w=1}^{W} (x_{nchw} - μ(x))² )

Layer normalization (LN): statistics over the channel and spatial axes, separately for each sample n:

μ(x) = 1/(CHW) Σ_{c=1}^{C} Σ_{h=1}^{H} Σ_{w=1}^{W} x_{nchw}
σ(x) = √( 1/(CHW) Σ_{c=1}^{C} Σ_{h=1}^{H} Σ_{w=1}^{W} (x_{nchw} - μ(x))² )

Group normalization (GN): statistics over a group of channels (here 2 groups of C/2 channels each) and the spatial axes:

μ(x) = 2/(CHW) Σ_{c=1}^{C/2} Σ_{h=1}^{H} Σ_{w=1}^{W} x_{nchw}

Weight standardization (WS): the same standardization applied to the convolution weights instead of the activations.
(x) =
v
u
u
t 2
CHW
C/2
X
c=1
H
X
h=1
W
X
w=1
(xnchw µ(x))2
<latexit sha1_base64="PSqEPc30bQy6FHyPWYe6IWsLh00=">AAACZnicbVFNS8MwGM7q9/dUxIOX4BDmwdFOUS/CcDB2VHBWWGdJs3SLS9qapMII/Y9e/QGCv8CrprMH3Xwh8LzP837xJEgYlcq230rW3PzC4tLyyura+sbmVnl7517GqcCkg2MWi4cAScJoRDqKKkYeEkEQDxhxg1Ez190XIiSNozs1TkiPo0FEQ4qRMpRffvIkHXBUdY+vPPkslPZCgbB2Mt1s+e2W72aeTPmjbma+xldOkRnJ5KHf/sW4E8Y1TNU1tUY0WXbi8dQMP36sZ365YtfsScBZ4BSgAoq48cvvXj/GKSeRwgxJ2XXsRPU0EopiRrJVL5UkQXiEBqRrYIQ4kT098SSDR4bpwzAW5kUKTtjfHRpxKcc8MJUcqaGc1nLyP62bqvCyp2mUpIpE+GdRmDKoYpgbDPtUEKzY2ACEBTW3QjxExlRlvuHPlr7MT8t9caZdmAX39ZpzXju9Pas0rguHlsEBOARV4IAL0ABtcAM6AINX8Am+SqD0YW1ae9b+T6lVKnp2wZ+w4Dc1/7tI</latexit>
(W) =
v
u
u
t 1
CFHFW
C
X
c=1
FH
X
fH =1
FW
X
fW =1
(WcfH fW
µ(W))2
<latexit sha1_base64="kB8kd70X3SXw/F5W+lkWy3s6d8o=">AAACUXicbZDNSsNAFIVv41+tf1WXboJF0E1JVNSNUCyULivYptDUMJlO2sGZJMxMhBLyZD6GK5cu3OgTuHPSZtGqFwbO/c4d5s7xY0alsqy3krGyura+Ud6sbG3v7O5V9w96MkoEJl0csUj0fSQJoyHpKqoY6ceCIO4z4vhPzdx3nomQNAof1DQmQ47GIQ0oRkojr9p1eXLqnN26gUA4Pc/SZstrtzwnc2XCH9Nm5qX41i46bek+8NoLxJkRRxNHj2ovyFG1ZtWtWZl/hV2IGhTV8aof7ijCCSehwgxJObCtWA1TJBTFjGQVN5EkRvgJjclAyxBxIofp7PuZeaLJyAwioU+ozBldvJEiLuWU+3qSIzWRv70c/ucNEhXcDFMaxokiIZ4/FCTMVJGZZ2mOqCBYsakWCAuqdzXxBOkglU586ZWRzFfLc7F/p/BX9M7r9lX94v6y1rgrEirDERzDKdhwDQ1oQwe6gOEF3uETvkqvpW8DDGM+apSKO4ewVMbWD9qTtTQ=</latexit>
µ(W) =
2
CFHFW
C
X
c=1
FH
X
fH =1
FW
X
fW =1
WcfH fW
<latexit sha1_base64="Pt2kdQmIa3vxI/2P4C+R6402oY0=">AAACK3icbZDNTsJAFIWn+If4h7p000hMYCG2atSNCdGNS0yEklBCpsMUJsy0zcytCWn6GD6GT+BWn8CVxi3v4RRYCHiSSb6ce2/uneNFnCmwrC8jt7K6tr6R3yxsbe/s7hX3D5oqjCWhDRLyULY8rChnAW0AA05bkaRYeJw63vA+qzvPVCoWBk8wimhH4H7AfEYwaKtbPHMHGBInvXU59aHs+hKTxDl1RVx2KmniKtYXOENXsv4AKt1iyapaE5nLYM+ghGaqd4tjtxeSWNAACMdKtW0rgk6CJTDCaVpwY0UjTIa4T9saAyyo6iSTj6XmiXZ6ph9K/QIwJ+7fiQQLpUbC050Cw0At1jLzv1o7Bv+mk7AgioEGZLrIj7kJoZmlZPaYpAT4SAMmkulbTTLAOhvQWc5t6anstFTnYi+msAzN86p9Vb14vCzV7mYJ5dEROkZlZKNrVEMPqI4aiKAX9Ibe0Yfxanwa38bPtDVnzGYO0ZyM8S8fuqhU</latexit>
Ŵ =
✓
W µ(W)
(W)
◆
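As a sanity check, the per-channel BN statistics and per-sample LN statistics above can be sketched in NumPy for an activation tensor of shape (N, C, H, W). This is an illustrative sketch only (the learnable γ and β are omitted), not a framework implementation:

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # statistics over (N, H, W): one mean/std per channel c
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    sigma = x.std(axis=(0, 2, 3), keepdims=True)
    return (x - mu) / (sigma + eps)

def layer_norm(x, eps=1e-5):
    # statistics over (C, H, W): one mean/std per sample n
    mu = x.mean(axis=(1, 2, 3), keepdims=True)
    sigma = x.std(axis=(1, 2, 3), keepdims=True)
    return (x - mu) / (sigma + eps)

x = np.random.randn(8, 4, 16, 16)  # (N, C, H, W)
# after BN, each channel is zero-mean; after LN, each sample is zero-mean
assert np.allclose(batch_norm(x).mean(axis=(0, 2, 3)), 0, atol=1e-6)
assert np.allclose(layer_norm(x).mean(axis=(1, 2, 3)), 0, atol=1e-6)
```

The only difference between the two is which axes the statistics are reduced over, which is exactly what the sums in the formulas express.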
Data augmentation
Flipping
Rotation
Cutout
Random crop
Scale
Random Erasing
Mixup
CutMix
AugMix
AutoAugment
Searches for the optimal data augmentation policy using reinforcement learning
Fast AutoAugment
Shortens the search time with reinforcement learning + Bayesian optimization
Faster AutoAugment
Shortens it further with gradient-based search
https://openreview.net/pdf?id=S1gmrxHFvB
https://github.com/xkumiyu/numpy-data-augmentation
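In the spirit of the pure-NumPy repository linked above, two of the basic augmentations listed (flipping and random crop) can be sketched in a few lines; this is an illustrative sketch, not that repository's API:

```python
import numpy as np

def horizontal_flip(img):
    # reverse the width axis of an (H, W, C) image
    return img[:, ::-1, :]

def random_crop(img, size, rng=None):
    # cut a random (ch, cw) window out of an (H, W, C) image
    if rng is None:
        rng = np.random.default_rng()
    h, w = img.shape[:2]
    ch, cw = size
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    return img[top:top + ch, left:left + cw]

img = np.arange(32 * 32 * 3).reshape(32, 32, 3)
flipped = horizontal_flip(img)
crop = random_crop(img, (24, 24))
```

Both operations are label-preserving, which is why they can be applied on the fly during training without changing the targets.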
Regularization

Loss function (negative log-likelihood):

$$L = -\sum_{\text{data}} \log p$$

L2 regularization:

$$L = -\sum_{\text{data}} \log p + \lambda\|W\|^2$$

L1 regularization:

$$L = -\sum_{\text{data}} \log p + \lambda\|W\|_1$$

Flooding (the training loss is held near a flood level $b$):

$$L = \left|-\sum_{\text{data}} \log p - b\right| + b$$

https://arxiv.org/abs/2002.08709

Sharpness Aware Minimization (SAM):

$$L = \max_{\|\epsilon\|\le\rho}\left[-\sum_{\text{data}} \log p(W+\epsilon)\right]$$

https://arxiv.org/abs/2010.01412

Dropout

Stochastic depth
https://arxiv.org/abs/1603.09382
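Among these, the flooding loss is the simplest to compute: it just reflects the loss back above the flood level. A minimal sketch (the flood level b = 0.1 here is an arbitrary illustration, not a recommended value):

```python
import numpy as np

def flooding(loss, b=0.1):
    # when loss < b, the sign of the gradient flips, pushing the
    # training loss back up toward the flood level b
    return np.abs(loss - b) + b

# above the flood level the loss value is unchanged
assert np.isclose(flooding(0.5, b=0.1), 0.5)
# below it, the loss is reflected back above b
assert np.isclose(flooding(0.05, b=0.1), 0.15)
```

Because the transformation is piecewise linear, it changes only the sign of the gradient below b, not its magnitude.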
Distributed parallelization

Data parallelism: the data is partitioned and the model is replicated; gradients are communicated; the global batch becomes huge. Example: Horovod
Tensor parallelism: the data is replicated and the model is partitioned; activations are communicated; communication is frequent. Example: Mesh TensorFlow
Layer (pipeline) parallelism: the data is replicated and the model is partitioned; activations are communicated; computation is sequential. Example: GPipe

"Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis", Ben-Nun and Hoefler, ACM Computing Surveys, Article No.: 65
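The communication step of synchronous data parallelism can be simulated in plain NumPy. This toy sketch only mimics what an allreduce computes (the mean of all workers' gradients); it is not Horovod's API or an actual multi-process implementation:

```python
import numpy as np

def allreduce_average(local_grads):
    # the collective-communication step of synchronous data parallelism:
    # after the allreduce, every worker holds the mean of all gradients
    mean = np.mean(local_grads, axis=0)
    return [mean.copy() for _ in local_grads]

# four simulated GPUs, each with its own gradient for the same parameter
grads = [np.array([1.0, 2.0, 3.0]) * (i + 1) for i in range(4)]
synced = allreduce_average(grads)
# every replica now holds the same averaged gradient
```

Because all replicas apply the same averaged gradient, their model copies stay bit-identical, which is what keeps the model "redundant" in the data-parallel column above.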
Communication and synchronization in data parallelism

Parameter server: synchronous / asynchronous
Collective communication: synchronous / asynchronous
Problems of large-scale data-parallel distributed training

Batch size = batch size per GPU × number of GPUs

full batch / large mini-batch / small mini-batch

Why does generalization degrade when the batch size is increased?

The batch size grows in proportion to the number of GPUs
Wasted updates / wasted data
Noise-dominated vs. curvature-dominated regimes
A comprehensive study of the large-batch problem
Optimization methods for large batches?
LARS
LAMB (Adam + LARS)
$$w \leftarrow w - \eta\,\frac{\|w\|}{\|\nabla L\|}\,\nabla L$$
Even at a batch size of 32k:
Nesterov momentum achieves the same performance as LARS
Adam achieves the same performance as LAMB
In the end, it comes down to hyperparameter tuning
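The layer-wise LARS update above can be sketched in NumPy. This is a minimal illustration of the trust-ratio scaling only; the full LARS algorithm also includes weight decay and momentum, which are omitted here:

```python
import numpy as np

def lars_update(w, grad, eta):
    # the trust ratio ||w|| / ||grad L|| rescales the step per layer,
    # so layers with small gradients are not starved of updates
    trust_ratio = np.linalg.norm(w) / np.linalg.norm(grad)
    return w - eta * trust_ratio * grad

w = np.array([3.0, 4.0])      # ||w|| = 5
grad = np.array([0.0, 2.0])   # ||grad|| = 2  ->  trust ratio 2.5
w_new = lars_update(w, grad, eta=0.1)
```

The step size thus adapts to the ratio between weight norm and gradient norm in each layer, which is what makes LARS robust at very large batch sizes.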
Tokyo Tech HPC lecture
https://github.com/rioyokotalab/hpc_lecture_2021
Image classification problem
Two-layer fully connected NN
D_in=3
H=5
D_out=2
Data
batch_size(BS)=2
x(BS,D_in)
w1(D_in,H) w2(H,D_out)
y_p(BS,D_out)
h_r=f(x*w1)
y=f(x)
ReLU (Rectified Linear Unit)
y_p=h_r*w2
Back propagation

$$L = \frac{1}{NO}\sum (y_p - y)^2$$

$$\frac{\partial L}{\partial w_2} = \frac{\partial L}{\partial y_p}\frac{\partial y_p}{\partial w_2} = \frac{1}{NO}\,2(y_p - y)\,h_r$$

$$\frac{\partial L}{\partial w_1} = \frac{\partial L}{\partial y_p}\frac{\partial y_p}{\partial h_r}\frac{\partial h_r}{\partial w_1} = \frac{1}{NO}\,2(y_p - y)\,w_2\,x \qquad (\text{where } h_r > 0)$$

$$w_1 \leftarrow w_1 - \eta\,\frac{\partial L}{\partial w_1}, \qquad w_2 \leftarrow w_2 - \eta\,\frac{\partial L}{\partial w_2}$$
Implementation using only NumPy

import numpy as np

epochs = 300
batch_size = 32
D_in = 784
H = 100
D_out = 10
learning_rate = 1.0e-06

# create random input and output data
x = np.random.randn(batch_size, D_in)
y = np.random.randn(batch_size, D_out)

# randomly initialize weights
w1 = np.random.randn(D_in, H)
w2 = np.random.randn(H, D_out)

for epoch in range(epochs):
    # forward pass
    h = x.dot(w1)            # h = x * w1
    h_r = np.maximum(h, 0)   # h_r = ReLU(h)
    y_p = h_r.dot(w2)        # y_p = h_r * w2

    # compute mean squared error and print loss
    loss = np.square(y_p - y).sum()
    print(epoch, loss)

    # backward pass: compute gradients of loss with respect to w2
    grad_y_p = 2.0 * (y_p - y)
    grad_w2 = h_r.T.dot(grad_y_p)

    # backward pass: compute gradients of loss with respect to w1
    grad_h_r = grad_y_p.dot(w2.T)
    grad_h = grad_h_r.copy()
    grad_h[h < 0] = 0
    grad_w1 = x.T.dot(grad_h)

    # update weights
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
w1 w1 ⌘
@L
@w1
<latexit sha1_base64="kN3sQo8OP8glKG68w4PsUtC/f3k=">AAACMXicbVDLSgNBEJyN73fUo5fBIHgx7IqiR9GLBw8RjArZEHonvcmQ2QczvYaw5Ev8DL/Aq35BbiJ48iecjQGfDQM1VdXTPRWkShpy3ZFTmpqemZ2bX1hcWl5ZXSuvb1ybJNMC6yJRib4NwKCSMdZJksLbVCNEgcKboHdW6Dd3qI1M4isapNiMoBPLUAogS7XKh/2W5ysMCbRO+tze9nwk8EMNIvdT0CRB8YvhF7aWYatccavuuPhf4E1AhU2q1iq/+e1EZBHGJBQY0/DclJp58aRQOFz0M4MpiB50sGFhDBGaZj7+3pDvWKbNw0TbExMfs987coiMGUSBdUZAXfNbK8j/tEZG4XEzl3GaEcbic1CYKU4JL7LibalRkBpYAEJLuysXXbDJkE30x5S2KVYrcvF+p/AXXO9XPbfqXR5UTk4nCc2zLbbNdpnHjtgJO2c1VmeC3bNH9sSenQdn5Lw4r5/WkjPp2WQ/ynn/AO/Rq1o=</latexit>
w2 ← w2 − η ∂L/∂w2

∂L/∂w2 = (∂L/∂y_p)(∂y_p/∂w2) = (1/N_O) · 2(y_p − y) h_r

∂L/∂w1 = (∂L/∂y_p)(∂y_p/∂h_r)(∂h_r/∂w1) = (1/N_O) · 2(y_p − y) w2 x

L = (1/N_O) Σ (y_p − y)²
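As a sanity check (not part of the lecture code), the chain-rule gradients above can be verified against a finite-difference approximation. The sketch below uses a scalar version of the two-layer net (batch of one, N_O = 1) with arbitrary example values for w1, w2, x, y:

```python
# Scalar two-layer net: h = w1*x, h_r = relu(h), y_p = w2*h_r, L = (y_p - y)^2
def forward(w1, w2, x, y):
    h = w1 * x
    h_r = max(h, 0.0)          # ReLU
    y_p = w2 * h_r
    return (y_p - y) ** 2, h_r, y_p

w1, w2, x, y = 0.5, -0.3, 2.0, 1.0   # arbitrary example values
L, h_r, y_p = forward(w1, w2, x, y)

# analytic gradients from the slide (h > 0 branch)
grad_w2 = 2.0 * (y_p - y) * h_r
grad_w1 = 2.0 * (y_p - y) * w2 * x

# central finite differences for comparison
eps = 1e-6
fd_w2 = (forward(w1, w2 + eps, x, y)[0] - forward(w1, w2 - eps, x, y)[0]) / (2 * eps)
fd_w1 = (forward(w1 + eps, w2, x, y)[0] - forward(w1 - eps, w2, x, y)[0]) / (2 * eps)
```

The analytic and numerical gradients agree to within the finite-difference error.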
00_numpy.py
Introducing PyTorch
import torch

epochs = 300
batch_size = 32
D_in = 784
H = 100
D_out = 10
learning_rate = 1.0e-06

# create random input and output data
x = torch.randn(batch_size, D_in)
y = torch.randn(batch_size, D_out)

# randomly initialize weights
w1 = torch.randn(D_in, H)
w2 = torch.randn(H, D_out)

for epoch in range(epochs):
    # forward pass: compute predicted y
    h = x.mm(w1)
    h_r = h.clamp(min=0)
    y_p = h_r.mm(w2)

    # compute and print loss
    loss = (y_p - y).pow(2).sum().item()
    print(epoch, loss)

    # backward pass: compute gradients of loss with respect to w2
    grad_y_p = 2.0 * (y_p - y)
    grad_w2 = h_r.t().mm(grad_y_p)

    # backward pass: compute gradients of loss with respect to w1
    grad_h_r = grad_y_p.mm(w2.t())
    grad_h = grad_h_r.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)

    # update weights
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
NumPy → PyTorch correspondence:
np.random          → torch
np                 → torch
x.dot(w1)          → x.mm(w1)
np.maximum(h, 0)   → h.clamp(min=0)
np.square(y_p - y) → (y_p - y).pow(2)
copy()             → clone()
01_tensors.py
Introducing automatic differentiation
# randomly initialize weights
w1 = torch.randn(D_in, H)
w2 = torch.randn(H, D_out)

for epoch in range(epochs):
    # forward pass: compute predicted y
    h = x.mm(w1)
    h_r = h.clamp(min=0)
    y_p = h_r.mm(w2)

    # compute and print loss
    loss = (y_p - y).pow(2).sum().item()
    print(epoch, loss)

    # backward pass: compute gradients of loss with respect to w2
    grad_y_p = 2.0 * (y_p - y)
    grad_w2 = h_r.t().mm(grad_y_p)

    # backward pass: compute gradients of loss with respect to w1
    grad_h_r = grad_y_p.mm(w2.t())
    grad_h = grad_h_r.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)

    # update weights
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
01_tensor.py 02_autograd.py
# randomly initialize weights
w1 = torch.randn(D_in, H, requires_grad=True)
w2 = torch.randn(H, D_out, requires_grad=True)

for epoch in range(epochs):
    # forward pass: compute predicted y
    h = x.mm(w1)
    h_r = h.clamp(min=0)
    y_p = h_r.mm(w2)

    # compute and print loss
    loss = (y_p - y).pow(2).sum()
    print(epoch, loss.item())

    # backward pass
    loss.backward()

    with torch.no_grad():
        # update weights
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad

        # reset gradients
        w1.grad.zero_()
        w2.grad.zero_()
∂L/∂w1 = (∂L/∂y_p)(∂y_p/∂h_r)(∂h_r/∂w1) = (1/N_O) · 2(y_p − y) w2 x
The gradients are computed automatically
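To see what autograd does under the hood, here is a minimal pure-Python sketch (not PyTorch's actual implementation): each operation records its inputs and local derivatives, and backward() propagates upstream gradients through that graph by the chain rule.

```python
# Minimal reverse-mode automatic differentiation for scalars.
class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # list of (parent, local_gradient) pairs
        self.grad = 0.0

    def __mul__(self, other):
        # d(a*b)/da = b, d(a*b)/db = a
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def relu(self):
        # gradient is 1 where the input is positive, 0 otherwise
        return Var(max(self.value, 0.0),
                   [(self, 1.0 if self.value > 0 else 0.0)])

    def backward(self, upstream=1.0):
        self.grad += upstream
        for parent, local in self.parents:
            parent.backward(upstream * local)

a = Var(3.0); b = Var(2.0)
c = (a * b).relu()    # relu(6) = 6, so dc/da = 2, dc/db = 3
c.backward()

w = Var(3.0); x = Var(-2.0)
y = (w * x).relu()    # relu(-6) = 0, so the gradient is cut off
y.backward()
```

PyTorch does the same bookkeeping on tensors: every op on a `requires_grad` tensor records a backward function, and `loss.backward()` walks the graph.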
Writing your own activation function — 03_function.py
import torch
...

for epoch in range(epochs):
    # forward pass: compute predicted y
    h = x.mm(w1)
    h_r = h.clamp(min=0)
    y_p = h_r.mm(w2)
02_autograd.py
import torch
...

class ReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        return input.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input < 0] = 0
        return grad_input

for epoch in range(epochs):
    # forward pass: compute predicted y
    relu = ReLU.apply
    h = x.mm(w1)
    h_r = relu(h)
    y_p = h_r.mm(w2)
y = f(x)
ReLU (Rectified Linear Unit)
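The two static methods of the custom Function above can be sketched as plain functions (a pure-Python illustration, operating on lists instead of tensors): the forward pass clamps negatives to zero, and the backward pass passes gradients through only where the saved input was positive.

```python
def relu_forward(x):
    # clamp(min=0): pass positives through, zero out negatives
    return [max(v, 0.0) for v in x]

def relu_backward(x, grad_output):
    # gradient flows only where the *input* was positive
    return [g if v > 0 else 0.0 for v, g in zip(x, grad_output)]

x = [-2.0, 0.5, 3.0]
out = relu_forward(x)                        # [0.0, 0.5, 3.0]
grads = relu_backward(x, [1.0, 1.0, 1.0])    # [0.0, 1.0, 1.0]
```

This is why `forward` calls `ctx.save_for_backward(input)`: the backward pass needs the original input to know where to zero the gradient.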
Using torch.nn — 04_nn_module.py
# create random input and output data
x = torch.randn(batch_size, D_in)
y = torch.randn(batch_size, D_out)

# randomly initialize weights
w1 = torch.randn(D_in, H, requires_grad=True)
w2 = torch.randn(H, D_out, requires_grad=True)

for epoch in range(epochs):
    # forward pass: compute predicted y
    h = x.mm(w1)
    h_r = h.clamp(min=0)
    y_p = h_r.mm(w2)

    # compute and print loss
    loss = (y_p - y).pow(2).sum()
    print(epoch, loss.item())

    # backward pass
    loss.backward()

    with torch.no_grad():
        # update weights
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad

        # reset gradients
        w1.grad.zero_()
        w2.grad.zero_()
02_autograd.py
# create random input and output data
x = torch.randn(batch_size, D_in)
y = torch.randn(batch_size, D_out)

# define model
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

# define loss function
criterion = torch.nn.MSELoss(reduction='sum')

for epoch in range(epochs):
    # forward pass: compute predicted y
    y_p = model(x)

    # compute and print loss
    loss = criterion(y_p, y)
    print(epoch, loss.item())

    # backward pass
    model.zero_grad()
    loss.backward()

    with torch.no_grad():
        # update weights
        for param in model.parameters():
            param -= learning_rate * param.grad
Calling an optimizer — 05_optimizer.py
04_nn_module.py

# define loss function
criterion = torch.nn.MSELoss(reduction='sum')

for t in range(epochs):
    # forward pass: compute predicted y
    y_p = model(x)

    # compute and print loss
    loss = criterion(y_p, y)
    print(t, loss.item())

    # backward pass
    model.zero_grad()
    loss.backward()

    with torch.no_grad():
        # update weights
        for param in model.parameters():
            param -= learning_rate * param.grad
# define loss function
criterion = torch.nn.MSELoss(reduction='sum')

# define optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

for epoch in range(epochs):
    # forward pass: compute predicted y
    y_p = model(x)

    # compute and print loss
    loss = criterion(y_p, y)
    print(epoch, loss.item())

    # backward pass
    optimizer.zero_grad()
    loss.backward()

    # update weights
    optimizer.step()
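For plain SGD, `optimizer.step()` performs exactly the manual update from the previous slide, `param -= learning_rate * param.grad`. A scalar simulation (pure Python, illustrative values) makes the equivalence concrete:

```python
# What optimizer.step() computes for plain SGD, on a scalar parameter.
def sgd_step(param, grad, lr):
    return param - lr * grad

lr, grad = 0.1, 0.5        # illustrative values
p_manual = 1.0
p_optim = 1.0

p_manual -= lr * grad                    # manual update loop
p_optim = sgd_step(p_optim, grad, lr)    # optimizer-style update
```

Both paths land on the same parameter value; the optimizer just centralizes the update rule (and, for other optimizers, the extra state such as momentum buffers).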
Defining your own model — 06_mm_module.py
05_optimizer.py
# create random input and output data
x = torch.randn(batch_size, D_in)
y = torch.randn(batch_size, D_out)

# define model
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

# define loss function
criterion = torch.nn.MSELoss(reduction='sum')
import torch.nn as nn
import torch.nn.functional as F

class TwoLayerNet(nn.Module):
    def __init__(self, D_in, H, D_out):
        super(TwoLayerNet, self).__init__()
        self.fc1 = nn.Linear(D_in, H)
        self.fc2 = nn.Linear(H, D_out)

    def forward(self, x):
        h = self.fc1(x)
        h_r = F.relu(h)
        y_p = self.fc2(h_r)
        return y_p

# create random input and output data
x = torch.randn(batch_size, D_in)
y = torch.randn(batch_size, D_out)

# define model
model = TwoLayerNet(D_in, H, D_out)

# define loss function
criterion = nn.MSELoss(reduction='sum')
(unchanged during training)
Loading the MNIST dataset — 07_mnist.py
06_mm_module.py
import torch.nn as n
n

import torch.nn.functional as
F

from torchvision import datasets, transform
s

# read input data and label
s

train_dataset = datasets.MNIST('./data'
,

train=True
,

download=True
,

transform=transforms.ToTensor()
)

train_loader = torch.utils.data.DataLoader(dataset=train_dataset
,

batch_size=batch_size
,

shuffle=True
)

for epoch in range(epochs)
:

# Set model to training mod
e

model.train(
)

# Loop over each batch from the training se
t

for batch_idx, (x, y) in enumerate(train_loader):
# forward pass: compute predicted
y

y_p = model(x)
.
.
.
import torch.nn as nn
import torch.nn.functional as F

# create random input and output data
x = torch.randn(batch_size, D_in)
y = torch.randn(batch_size, D_out)

for t in range(epochs):
    # forward pass: compute predicted y
    y_p = model(x)
    ...
Evaluating on validation data
08_validate.py
def validate():
    model.eval()
    val_loss, val_acc = 0, 0
    for data, target in val_loader:
        output = model(data)
        loss = criterion(output, target)
        val_loss += loss.item()
        pred = output.data.max(1)[1]
        val_acc += 100. * pred.eq(target.data).cpu().sum() / target.size(0)
    val_loss /= len(val_loader)
    val_acc /= len(val_loader)
    print('\nValidation set: Average loss: {:.4f}, Accuracy: {:.1f}%\n'.format(
        val_loss, val_acc))
Training data: used to fit the model
Validation data: used when trying different hyperparameters or models
Test data: used for the final accuracy evaluation
val_loss: the loss on the validation data
pred.eq(target): does the predicted class match the label?
100. *: convert to a percentage
.cpu(): sum() is slow on the GPU, so do it on the CPU
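The accuracy computation in validate() can be sketched in pure Python: `output.data.max(1)[1]` takes the argmax over classes, `pred.eq(target)` counts matches, and `100. * … / target.size(0)` turns that into a percentage. The lists below are illustrative stand-ins for a batch of network outputs:

```python
def batch_accuracy(outputs, targets):
    # argmax over the class dimension for each sample
    pred = [row.index(max(row)) for row in outputs]
    # count predictions that match the labels
    correct = sum(p == t for p, t in zip(pred, targets))
    # convert to a percentage of the batch
    return 100.0 * correct / len(targets)

outputs = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
targets = [1, 0, 0, 0]
acc = batch_accuracy(outputs, targets)   # 3 of 4 correct -> 75.0
```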
Structuring the code as train() and main() functions
09_train.py
def train(train_loader, model, criterion, optimizer, epoch):
    model.train()
    t = time.perf_counter()
    for batch_idx, (data, target) in enumerate(train_loader):
        output = model(data)
        loss = criterion(output, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if batch_idx % 200 == 0:
            print('Train Epoch: {} [{:>5}/{} ({:.0%})]\tLoss: {:.6f}\tTime: {:.4f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                batch_idx / len(train_loader), loss.data.item(),
                time.perf_counter() - t))
            t = time.perf_counter()
def main():
    epochs = 10
    batch_size = 32
    learning_rate = 1.0e-02
    D_in, H, D_out = 784, 100, 10

    train_dataset = datasets.MNIST('./data',
                                   train=True,
                                   download=True,
                                   transform=transforms.ToTensor())
    val_dataset = datasets.MNIST('./data',
                                 train=False,
                                 transform=transforms.ToTensor())
    train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                               batch_size=batch_size,
                                               shuffle=True)
    val_loader = torch.utils.data.DataLoader(dataset=val_dataset,
                                             batch_size=batch_size,
                                             shuffle=False)

    model = TwoLayerNet(D_in, H, D_out)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

    for epoch in range(epochs):
        model.train()
        train(train_loader, model, criterion, optimizer, epoch)
        validate(val_loader, model, criterion)
A convolutional NN model — 10_cnn.py
09_train.py
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout2d(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output
class TwoLayerNet(nn.Module):
    def __init__(self, D_in, H, D_out):
        super(TwoLayerNet, self).__init__()
        self.fc1 = nn.Linear(D_in, H)
        self.fc2 = nn.Linear(H, D_out)

    def forward(self, x):
        x = x.view(-1, D_in)
        h = self.fc1(x)
        h_r = F.relu(h)
        y_p = self.fc2(h_r)
        return F.log_softmax(y_p, dim=1)
Using the GPU
11_gpu.py
device = torch.device('cuda')
model = CNN().to(device)

def train(train_loader, model, criterion, optimizer, epoch):
    model.train()
    t = time.perf_counter()
    for batch_idx, (data, target) in enumerate(train_loader):
        data = data.to(device)
        target = target.to(device)
def validate(loss_vector, accuracy_vector):
    model.eval()
    val_loss, correct = 0, 0
    for data, target in validation_loader:
        data = data.to(device)
        target = target.to(device)
        ...
PyTorch calls cuDNN behind the scenes.
1. Specify the device with torch.device('cuda')
2. Send data and target to the device
3. All computation is then performed on the GPU automatically
Distributed parallelism
12_distributed.py
import os
import torch
import torch.distributed as dist

master_addr = os.getenv("MASTER_ADDR", default="localhost")
master_port = os.getenv('MASTER_PORT', default='8888')
method = "tcp://{}:{}".format(master_addr, master_port)
rank = int(os.getenv('OMPI_COMM_WORLD_RANK', '0'))
world_size = int(os.getenv('OMPI_COMM_WORLD_SIZE', '1'))
dist.init_process_group("nccl", init_method=method, rank=rank, world_size=world_size)
print('Rank: {}, Size: {}'.format(dist.get_rank(), dist.get_world_size()))

ngpus = 4
device = rank % ngpus
x = torch.randn(1).to(device)
print('rank {}: {}'.format(rank, x))
dist.broadcast(x, src=0)
print('rank {}: {}'.format(rank, x))
Specify the host address and port used for communication
Get the rank and size from the OpenMPI environment variables
Pass them to PyTorch
Collective communication through PyTorch
Add the following to .bashrc:
if [ -f "$SGE_JOB_SPOOL_DIR/pe_hostfile" ]; then
    export MASTER_ADDR=`head -n 1 $SGE_JOB_SPOOL_DIR/pe_hostfile | cut -d " " -f 1`
fi

mpirun -np 4 python 12_distributed.py
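The rank/size parsing above can be isolated into a small helper that takes the environment as a mapping, so it can be exercised without mpirun (a sketch; the function name is ours):

```python
def rank_and_size(env):
    # OpenMPI exports these for every process started by mpirun;
    # fall back to a single-process default when they are absent.
    rank = int(env.get('OMPI_COMM_WORLD_RANK', '0'))
    world_size = int(env.get('OMPI_COMM_WORLD_SIZE', '1'))
    return rank, world_size

# without mpirun: the defaults
print(rank_and_size({}))                                  # (0, 1)
# under `mpirun -np 4`, rank 2 would see:
print(rank_and_size({'OMPI_COMM_WORLD_RANK': '2',
                     'OMPI_COMM_WORLD_SIZE': '4'}))       # (2, 4)
```

In the real script you would pass `os.environ` as the mapping.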
Distributed MNIST
13_ddp.py
def print0(message):
    if torch.distributed.is_initialized():
        if torch.distributed.get_rank() == 0:
            print(message, flush=True)
    else:
        print(message, flush=True)

train_sampler = torch.utils.data.distributed.DistributedSampler(
    train_dataset,
    num_replicas=torch.distributed.get_world_size(),
    rank=torch.distributed.get_rank())

model = DDP(model, device_ids=[rank])
...
Output from every process is hard to read, so define a print function that prints from only one process
Use a DistributedSampler so that different processes read different parts of the training data
Wrapping the model in DDP() enables distributed data-parallel training
Argparse
14_args.py
import argparse
import torch
import torch.distributed as dist
import torch.nn as nn

parser = argparse.ArgumentParser(description='PyTorch MNIST Example')
parser.add_argument('--batch-size', type=int, default=32, metavar='N',
                    help='input batch size for training (default: 32)')
parser.add_argument('--epochs', type=int, default=10, metavar='N',
                    help='number of epochs to train (default: 10)')
parser.add_argument('--lr', type=float, default=1.0e-02, metavar='LR',
                    help='learning rate (default: 1.0e-02)')
args = parser.parse_args()

epochs = args.epochs
batch_size = args.batch_size
learning_rate = args.lr * world_size
Hard-coded numbers can now be replaced by the args variables
https://docs.python.org/ja/3/library/argparse.html#action
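The parser above can be tried without launching a full training run by passing an explicit argument list to parse_args (it normally reads sys.argv). Note that `--batch-size` becomes the attribute `args.batch_size`:

```python
import argparse

parser = argparse.ArgumentParser(description='PyTorch MNIST Example')
parser.add_argument('--batch-size', type=int, default=32, metavar='N')
parser.add_argument('--epochs', type=int, default=10, metavar='N')
parser.add_argument('--lr', type=float, default=1.0e-02, metavar='LR')

# override two options, leave batch size at its default
args = parser.parse_args(['--lr', '0.05', '--epochs', '3'])
```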
AverageMeter
15_meter.py
def train(train_loader, model, criterion, optimizer, epoch, device):
    batch_time = AverageMeter('Time', ':.4f')
    train_loss = AverageMeter('Loss', ':.6f')

class AverageMeter(object):
    def __init__(self, name, fmt=':f'):
        self.name = name
        self.fmt = fmt
        self.reset()

    def reset(self):
        self.val = 0
        self.avg = 0
        self.sum = 0
        self.count = 0

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count

    def __str__(self):
        fmtstr = '{name} {val' + self.fmt + '} ({avg' + self.fmt + '})'
        return fmtstr.format(**self.__dict__)
update(val, n): val is already the average over n items
val: current value
avg: running average
sum: running sum
count: number of items
fmt: output format
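The point of `update(val, n)` is that a batch's mean loss must be weighted by the batch size when accumulating the running average. A minimal self-contained version (name/format arguments dropped for brevity):

```python
class AverageMeter:
    def __init__(self):
        self.val = self.avg = self.sum = self.count = 0

    def update(self, val, n=1):
        # val is the average over n samples, so sum accumulates val*n
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count

loss_meter = AverageMeter()
loss_meter.update(2.0, n=10)   # batch of 10 with mean loss 2.0
loss_meter.update(1.0, n=30)   # batch of 30 with mean loss 1.0
# running average = (2.0*10 + 1.0*30) / 40 = 1.25, not (2.0+1.0)/2
```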
ProgressMeter
15_meter.py
def train(train_loader, model, criterion, optimizer, epoch, device):
    batch_time = AverageMeter('Time', ':.4f')
    train_loss = AverageMeter('Loss', ':.6f')
    progress = ProgressMeter(
        len(train_loader),
        [train_loss, batch_time],
        prefix="Epoch: [{}]".format(epoch))

class ProgressMeter(object):
    def __init__(self, num_batches, meters, prefix="", postfix=""):
        self.batch_fmtstr = self._get_batch_fmtstr(num_batches)
        self.meters = meters
        self.prefix = prefix
        self.postfix = postfix

    def display(self, batch):
        entries = [self.prefix + self.batch_fmtstr.format(batch)]
        entries += [str(meter) for meter in self.meters]
        entries += [self.postfix]
        print0('\t'.join(entries))

    def _get_batch_fmtstr(self, num_batches):
        num_digits = len(str(num_batches // 1))
        fmt = '{:' + str(num_digits) + 'd}'
        return '[' + fmt + '/' + fmt.format(num_batches) + ']'
prefix: printed before the meters
postfix: printed after the meters
meters: the variables to print
join: concatenate everything to print
Goal: a display of the form [ current batch / total batches ]
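The "[ current / total ]" formatting done by _get_batch_fmtstr can be checked on its own: it pads the batch index to the width of the total batch count so the display stays aligned.

```python
def batch_fmtstr(num_batches):
    # pad the batch index to the width of the total batch count
    num_digits = len(str(num_batches))
    fmt = '{:' + str(num_digits) + 'd}'
    return '[' + fmt + '/' + fmt.format(num_batches) + ']'

fmt = batch_fmtstr(600)
print(fmt.format(7))    # [  7/600]
```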
Weights and Biases

pip install wandb
wandb login
import wandb

os.environ['MASTER_ADDR'] = 'localhost'
os.environ['MASTER_PORT'] = '8888'
rank = int(os.getenv('OMPI_COMM_WORLD_RANK', '0'))
world_size = int(os.getenv('OMPI_COMM_WORLD_SIZE', '1'))
dist.init_process_group("nccl", rank=rank, world_size=world_size)
device = torch.device('cuda', rank)

if torch.distributed.get_rank() == 0:
    wandb.init(project="example-project")
    wandb.config.update(args)

epochs = args.epochs
batch_size = args.batch_size
learning_rate = args.lr * world_size
for epoch in range(epochs):
    model.train()
    train_loss, train_acc = train(train_loader, model, criterion, optimizer, epoch, device)
    val_loss, val_acc = validate(val_loader, model, criterion, device)
    if torch.distributed.get_rank() == 0:
        wandb.log({
            'train_loss': train_loss,
            'train_acc': train_acc,
            'val_loss': val_loss,
            'val_acc': val_acc
        })
The variables to log with wandb
Make train and validate return the loss and accuracy
Initialize wandb
Passing args records the experiment settings automatically
16_wandb.py
train_dataset = datasets.CIFAR10('./data',
                                 train=True,
                                 download=True,
                                 transform=transforms.ToTensor())
val_dataset = datasets.CIFAR10('./data',
                               train=False,
                               download=True,
                               transform=transforms.ToTensor())
CIFAR10
17_cifar10.py
model = VGG('VGG19').to(device)
model = DDP(model, device_ids=[rank % 4])
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
Changing the dataset
Changing the model
transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])

transform_val = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])
Data augmentation
18_augmentation.py
Normalizing pixel intensities
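transforms.Normalize applies x' = (x − mean) / std per channel; the means and standard deviations above are per-channel CIFAR-10 statistics. A pure-Python sketch of the same operation on a single (R, G, B) pixel:

```python
# per-channel statistics from the transform above
means = (0.4914, 0.4822, 0.4465)
stds = (0.2023, 0.1994, 0.2010)

def normalize(pixel):
    # pixel: one (R, G, B) value already scaled to [0, 1] by ToTensor()
    return tuple((v - m) / s for v, m, s in zip(pixel, means, stds))

out = normalize(means)   # a pixel equal to the channel means maps to (0, 0, 0)
```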
parser.add_argument('--momentum', type=float, default=0.9, metavar='M',
                    help='momentum (default: 0.9)')
parser.add_argument('--wd', '--weight_decay', type=float, default=5.0e-04, metavar='W',
                    help='weight decay (default: 5.0e-04)')
Regularization
19_regularization.py
optimizer = torch.optim.SGD(model.parameters(), lr=args.lr,
                            momentum=args.momentum, weight_decay=args.wd)
l̃ = l + (λ/2)‖θ‖²

∇l̃ = ∇l + λθ

θ_{t+1} = θ_t − η∇l(θ_t) − ηλθ_t

(momentum and L2 regularization)
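A quick numeric check (pure Python, illustrative values) that the two readings of the update rule coincide: taking a gradient step on the regularized loss l̃, i.e. θ − η(∇l + λθ), is the same as subtracting ηλθ separately, which is what the weight_decay option adds to SGD.

```python
eta, lam = 0.1, 5.0e-4     # learning rate and weight decay, as in the defaults above
theta, grad = 2.0, 0.3     # hypothetical parameter and gradient of l

# gradient step on the regularized loss: theta - eta * (grad(l) + lam*theta)
step_l2 = theta - eta * (grad + lam * theta)

# weight-decay form from the slide: theta - eta*grad(l) - eta*lam*theta
step_wd = theta - eta * grad - eta * lam * theta
```

Both expressions shrink the parameter toward zero by a factor (1 − ηλ) on top of the usual gradient step.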
Sweep
sweep.yaml
program: wrapper.py
method: grid
metric:
  goal: minimize
  name: val_loss
parameters:
  epochs:
    values: [100]
  batch_size:
    values: [32]
  learning_rate:
    values: [0.005, 0.01, 0.02, 0.05, 0.1]
  momentum:
    values: [0.85, 0.9, 0.95]
  weight_decay:
    values: [1.0e-4, 2.0e-4, 5.0e-4, 1.0e-3, 2.0e-3]
wandb sweep sweep.yaml
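A grid sweep runs every combination in the Cartesian product of the `values` lists, so it is worth counting the runs before launching. For the sweep.yaml above (epochs and batch_size each have one value):

```python
import itertools

# the swept "values" lists from sweep.yaml
learning_rate = [0.005, 0.01, 0.02, 0.05, 0.1]
momentum = [0.85, 0.9, 0.95]
weight_decay = [1.0e-4, 2.0e-4, 5.0e-4, 1.0e-3, 2.0e-3]

# every hyperparameter combination the grid sweep will launch
runs = list(itertools.product(learning_rate, momentum, weight_decay))
print(len(runs))   # 5 * 3 * 5 = 75 training runs
```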
Models
19_regularization.py

model = VGG('VGG19').to(device)
# model = ResNet18().to(device)
# model = PreActResNet18().to(device)
# model = GoogLeNet().to(device)
# model = DenseNet121().to(device)
# model = ResNeXt29_2x64d().to(device)
# model = MobileNet().to(device)
# model = MobileNetV2().to(device)
# model = DPN92().to(device)
# model = ShuffleNetG2().to(device)
# model = SENet18().to(device)
# model = ShuffleNetV2(1).to(device)
# model = EfficientNetB0().to(device)
# model = RegNetX_200MF().to(device)
VGG19 is used here
Try the other models as well
References

Learning PyTorch with Examples
https://pytorch.org/tutorials/beginner/pytorch_with_examples.html
PyTorch Examples github
https://github.com/pytorch/examples
PyTorch Tutorial github
https://github.com/yunjey/pytorch-tutorial
Understanding PyTorch with an example: a step-by-step tutorial by Daniel Godoy
https://towardsdatascience.com/understanding-pytorch-with-an-example-a-step-by-step-tutorial-81fc5f8c4e8e
Practical Deep Learning for Coders, v3 by fast.ai
https://course.fast.ai
PyTorch by Beeren Sahu
https://beerensahu.wordpress.com/2018/03/21/pytorch-tutorial-lesson-1-tensor/
Writing Distributed Applications with PyTorch by Séb Arnold
https://pytorch.org/tutorials/intermediate/dist_tuto.html

第14回 配信講義 計算科学技術特論A(2021)

sha1_base64="CX39qY1yvYuKVy5jRO31RUVPsKU=">AAACG3icbVDLSsNAFJ34rPUVdenCwSK4CkmrVndFNy4r2Ac0IUwmk3bo5MHMRCihSz/DL3CrX+BO3LrwA/wPJ21QWz0wcDjnvuZ4CaNCmuaHtrC4tLyyWlorr29sbm3rO7ttEacckxaOWcy7HhKE0Yi0JJWMdBNOUOgx0vGGV7nfuSNc0Di6laOEOCHqRzSgGEklufqBHXCEMztBXFLEYDL+4alrjV29YhrmBNA0TutV86IGvxWrIBVQoOnqn7Yf4zQkkcQMCdGzzEQ6WT4SMzIu26kgCcJD1Cc9RSMUEuFkk4+M4ZFSfBjEXL1Iwon6uyNDoRCj0FOVIZIDMe/l4n9eL5XBuZPRKEklifB0UZAyKGOYpwJ9ygmWbKQIwpyqWyEeIJWMVNnNbPFFflqeizWfwl/SrhrWmVG7Oak0LouESmAfHIJjYIE6aIBr0AQtgME9eARP4Fl70F60V+1tWrqgFT17YAba+xfXKqKf</latexit> @p @u1 <latexit sha1_base64="rQepUHmc6aWrxxB1wV5j6ZVGC6k=">AAACHXicdVDLSsNAFJ3UV62vqEs3o0VwVRKt2mXRjcsK9gFNCJPJpB06mYSZiVBC136GX+BWv8CduBU/wP9w0ka0ohcGzj3nvub4CaNSWda7UVpYXFpeKa9W1tY3NrfM7Z2OjFOBSRvHLBY9H0nCKCdtRRUjvUQQFPmMdP3RZa53b4mQNOY3apwQN0IDTkOKkdKUZ+47oUA4cxIkFEUMpp49+c66OvPMql2zpgGt2lm9bp00NCiYL6kKimh55ocTxDiNCFeYISn7tpUoN8tHYkYmFSeVJEF4hAakryFHEZFuNv3KBB5qJoBhLPTjCk7Znx0ZiqQcR76ujJAayt9aTv6l9VMVNtyM8iRVhOPZojBlUMUw9wUGVBCs2FgDhAXVt0I8RNobpd2b2xLI/LQ5X/4HneOafVqzruvV5kXhUBnsgQNwBGxwDprgCrRAG2BwBx7AI3gy7o1n48V4nZWWjKJnF8yF8fYJ8zGjJg==</latexit> @u1 @W1 <latexit sha1_base64="2g++4FK2qtbTNVizFSiWmGPzmRk=">AAACHXicdVDLSsNAFJ34rPUVdelmtAiuQlJabXdFNy4r2Ac0IUwmk3bo5MHMRCihaz/DL3CrX+BO3Iof4H84aSNa0QMD555779x7j5cwKqRpvmtLyyura+uljfLm1vbOrr633xVxyjHp4JjFvO8hQRiNSEdSyUg/4QSFHiM9b3yZ53u3hAsaRzdykhAnRMOIBhQjqSRXP7IDjnBmJ4hLihhMXXP6HfVU5OoV02g26s1aA5qGOUNOqmfNugWtQqmAAm1X/7D9GKchiSRmSIiBZSbSyfIvMSPTsp0KkiA8RkMyUDRCIRFONjtlCk+U4sMg5upFEs7Unx0ZCoWYhJ6qDJEcid+5XPwrN0hl0HAyGiWpJBGeDwpSBmUMc1+gTznBkk0UQZhTtSvEI6S8kcq9hSm+yFfLffk6Hv5PulXDqhvmda3SuigcKoFDcAxOgQXOQQtcgTboAAzuwAN4BE/avfasvWiv89Ilreg5AAvQ3j4BLYSjTA==</latexit> @u0 @W0 <latexit 
sha1_base64="RlrtYxiGwNDm/OSOojM6YjHJMWs=">AAACI3icbVC7TsMwFHV4lvIKMLJYVAimKgEEjBUsDAxFog+piSrHdVqrjmPZDlIV5RP4DL6AFb6ADbEwMPIfOG0kaMuRLB2d+zo+gWBUacf5tBYWl5ZXVktr5fWNza1te2e3qeJEYtLAMYtlO0CKMMpJQ1PNSFtIgqKAkVYwvM7rrQciFY35vR4J4keoz2lIMdJG6tpHXigRTj2BpKaIQS9CeoARS2+z7FcVWdeuOFVnDDhP3IJUQIF61/72ejFOIsI1ZkipjusI7af5QsxIVvYSRQTCQ9QnHUM5iojy0/GHMnholB4MY2ke13Cs/p1IUaTUKApMZ+5XzdZy8b9aJ9HhpZ9SLhJNOJ4cChMGdQzzdGCPSoI1GxmCsKTGK8QDZBLSJsOpKz2VW8tzcWdTmCfNk6p7Xj29O6vUroqESmAfHIBj4IILUAM3oA4aAINH8AxewKv1ZL1Z79bHpHXBKmb2wBSsrx/HuKZK</latexit> @L @p Forward propagation Backward propagation Cross entropy loss 確率的勾配降下法 (SGD) :ラベル 誤差逆伝播法 <latexit sha1_base64="BpaiO7b9hbl/rfkDJL7cM+CMhk8=">AAACNHicbVDLSsNAFJ34rO+qSzeDRRDEkqhoN0LRjQsXFawtNCXcTCc6dPJg5kYoIb/iZ/gFbnUvuJNu/QYntYhVDwwczn2dOX4ihUbbfrWmpmdm5+ZLC4tLyyura+X1jRsdp4rxJotlrNo+aC5FxJsoUPJ2ojiEvuQtv39e1Fv3XGkRR9c4SHg3hNtIBIIBGskr11pehntOftrycN/lCG6ggGVuAgoFSDcEvGMgs8s8/xap6c29csWu2iPQv8QZkwoZo+GVh24vZmnII2QStO44doLdrFjJJM8X3VTzBFgfbnnH0AhCrrvZ6Ic53TFKjwaxMi9COlJ/TmQQaj0IfdNZONa/a4X4X62TYlDrZiJKUuQR+zoUpJJiTIu4aE8ozlAODAGmhPFK2R2YiNCEOnGlpwtrRS7O7xT+kpuDqnNcPbw6qtTPxgmVyBbZJrvEISekTi5IgzQJIw/kiTyTF+vRerPereFX65Q1ntkkE7A+PgFacq02</latexit> Wt+1 = Wt ⌘ @L @Wt
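The backward pass above can be sketched in NumPy. This is a minimal illustration (layer sizes, the seed, and the finite-difference check are my own, not from the lecture) of how multiplying the chain of derivatives from the loss side keeps every step a matrix-vector product:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy network in the slide's notation: u0 = W0 h0, h1 = f(u0),
# u1 = W1 h1, p = softmax(u1). Sizes are arbitrary.
x = rng.standard_normal(4)              # h0
W0 = rng.standard_normal((6, 4))
W1 = rng.standard_normal((3, 6))

u0 = W0 @ x
h1 = np.maximum(u0, 0.0)                # f = ReLU
u1 = W1 @ h1
p = np.exp(u1 - u1.max())
p /= p.sum()                            # softmax
L = -np.log(p[0])                       # cross entropy, true class = 0

# Backward pass: multiply from the back, so each step is matrix-vector.
y = np.array([1.0, 0.0, 0.0])           # one-hot label
g_u1 = p - y                            # dL/du1 (softmax + cross entropy)
g_h1 = W1.T @ g_u1                      # dL/dh1
g_u0 = g_h1 * (u0 > 0)                  # through the ReLU
grad_W1 = np.outer(g_u1, h1)            # dL/dW1
grad_W0 = np.outer(g_u0, x)             # dL/dW0

# Finite-difference check of one entry of dL/dW0
eps = 1e-6
W0p = W0.copy()
W0p[2, 1] += eps
u0p = W0p @ x
h1p = np.maximum(u0p, 0.0)
u1p = W1 @ h1p
pp = np.exp(u1p - u1p.max())
pp /= pp.sum()
fd = (-np.log(pp[0]) - L) / eps
```

The finite-difference value `fd` should agree with `grad_W0[2, 1]` to several decimal places.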
  • 6.
Optimization methods
https://losslandscape.com

The weights W and the biases b are collected into a single parameter vector θ. Note that the shape of the loss function changes from one mini-batch to the next.

SGD:
θ_{t+1} = θ_t − η∇L(θ_t)

momentum SGD (momentum term):
v_{t+1} = μv_t + η∇L(θ_t)
θ_{t+1} = θ_t − v_{t+1}

Written in semi-implicit Euler style, to make it physically consistent:
a_t = −∇L(θ_t)
v_{t+1} = v_t + ηa_t
θ_{t+1} = θ_t + v_{t+1}

Nesterov momentum (momentum term, gradient taken at the look-ahead point):
v_{t+1} = μv_t + η∇L(θ_t − μv_t)
θ_{t+1} = θ_t − v_{t+1}

RMSProp (gradient variance term):
v_{t+1} = ρv_t + (1 − ρ)∇L(θ_t)²
m_{t+1} = μm_t + η∇L(θ_t)/(√v_{t+1} + ε)
θ_{t+1} = θ_t − m_{t+1}

Adam (momentum term + gradient variance term + normalization):
m_{t+1} = β₁m_t + (1 − β₁)∇L(θ_t)
v_{t+1} = β₂v_t + (1 − β₂)∇L(θ_t)²
b_{t+1} = √(1 − β₂^{t+1})/(1 − β₁^{t+1})   (initial bias correction term)
θ_{t+1} = θ_t − α b_{t+1} m_{t+1}/(√v_{t+1} + ε)

https://arxiv.org/pdf/2007.01547.pdf
  • 7.
  • 8.
Evolution of the major deep neural network models
https://towardsdatascience.com/from-lenet-to-efficientnet-the-evolution-of-cnns-3a57eb34672f
1995 LSTM; LeNet-5: convolution
2012 AlexNet: ReLU, Dropout, GPU
2015 ResNet: Skip connection
2017 MobileNet: Squeeze and excite
2019 EfficientNet: Neural architecture search
2021 Transformer: attention mechanism; Vision Transformer: image patches
  • 9.
Convolutional neural networks
https://cs231n.github.io/convolutional-networks/

[Dimensions of the input/output tensors] N: batch size, C: number of channels, H: image height, W: image width
[Convolution parameters] F: filter size, P: padding width, S: stride

Example with 3 input channels and 2 output channels:
[Input] N: 1, Cin: 3, Hin: 5, Win: 5
[Filter] F: 3, P: 1, S: 2
[Output] N: 1, Cout: 2, Hout: 3, Wout: 3

Implementation strategies: GEMM, batched GEMM, FFT, Winograd
http://cs231n.stanford.edu/reports/2016/pdfs/117_Report.pdf
https://arxiv.org/abs/1410.0759
https://www.slideshare.net/nervanasys/an-analysis-of-convolution-for-inference
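The shape arithmetic of the example can be checked directly. Below is a naive NCHW convolution in NumPy (function names `conv_out` and `conv2d` are my own; real frameworks lower this loop to the GEMM, FFT, or Winograd kernels listed above):

```python
import numpy as np

def conv_out(n_in, f, p, s):
    """Output spatial size: (N_in + 2P - F) // S + 1."""
    return (n_in + 2 * p - f) // s + 1

def conv2d(x, w, p=1, s=2):
    """Naive NCHW convolution: x is (N, Cin, H, W), w is (Cout, Cin, F, F)."""
    n, c_in, h, wd = x.shape
    c_out, _, f, _ = w.shape
    xp = np.pad(x, ((0, 0), (0, 0), (p, p), (p, p)))
    h_out, w_out = conv_out(h, f, p, s), conv_out(wd, f, p, s)
    y = np.zeros((n, c_out, h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = xp[:, :, i*s:i*s+f, j*s:j*s+f]      # (N, Cin, F, F)
            # contract over input channels and filter positions
            y[:, :, i, j] = np.einsum('ncuv,ocuv->no', patch, w)
    return y

# The slide's example: Cin=3, Hin=Win=5, F=3, P=1, S=2 -> Cout=2, Hout=Wout=3
x = np.random.default_rng(0).standard_normal((1, 3, 5, 5))
w = np.random.default_rng(1).standard_normal((2, 3, 3, 3))
y = conv2d(x, w)
```

With the slide's numbers, (5 + 2·1 − 3)/2 + 1 = 3, matching the 3×3 output.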
  • 10.
Normalization
Batch normalization (BN), Layer normalization (LN), Group normalization (GN), Weight standardization (WS)
https://theaisummer.com/normalization/

All four standardize and then apply a learned affine transform; they differ only in the axes over which the statistics μ and σ are taken:
x̂ = γ (x − μ(x))/σ(x) + β

BN (per channel c, over the batch and spatial axes):
μ(x) = 1/(NHW) Σ_{n=1}^{N} Σ_{h=1}^{H} Σ_{w=1}^{W} x_{nchw}
σ(x) = √( 1/(NHW) Σ_{n=1}^{N} Σ_{h=1}^{H} Σ_{w=1}^{W} (x_{nchw} − μ(x))² )

LN (per sample n, over the channel and spatial axes):
μ(x) = 1/(CHW) Σ_{c=1}^{C} Σ_{h=1}^{H} Σ_{w=1}^{W} x_{nchw}
σ(x) = √( 1/(CHW) Σ_{c=1}^{C} Σ_{h=1}^{H} Σ_{w=1}^{W} (x_{nchw} − μ(x))² )

GN (per sample, over a group of channels; here 2 groups of C/2 channels each):
μ(x) = 2/(CHW) Σ_{c=1}^{C/2} Σ_{h=1}^{H} Σ_{w=1}^{W} x_{nchw}
σ(x) = √( 2/(CHW) Σ_{c=1}^{C/2} Σ_{h=1}^{H} Σ_{w=1}^{W} (x_{nchw} − μ(x))² )

WS (standardize each filter's weights instead of the activations):
Ŵ = (W − μ(W))/σ(W)
μ(W) = 1/(C F_H F_W) Σ_{c=1}^{C} Σ_{f_H=1}^{F_H} Σ_{f_W=1}^{F_W} W_{c f_H f_W}
σ(W) = √( 1/(C F_H F_W) Σ_{c=1}^{C} Σ_{f_H=1}^{F_H} Σ_{f_W=1}^{F_W} (W_{c f_H f_W} − μ(W))² )

[Figure: accuracy comparison including BN + LN combinations; higher is better]
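The difference between these schemes is just the reduction axes. A minimal NumPy sketch of BN and LN on an NCHW tensor (the ε added under the square root is the usual numerical safeguard, not shown in the slide's formulas):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """BN: statistics per channel, reduced over batch and spatial axes (N, H, W)."""
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

def layer_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """LN: statistics per sample, reduced over channel and spatial axes (C, H, W)."""
    mu = x.mean(axis=(1, 2, 3), keepdims=True)
    var = x.var(axis=(1, 2, 3), keepdims=True)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

x = np.random.default_rng(0).standard_normal((4, 3, 8, 8))  # (N, C, H, W)
bn = batch_norm(x)
ln = layer_norm(x)
```

After BN each channel has (near-)zero mean and unit variance across the batch; after LN each sample does across its own channels.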
  • 11.
Data augmentation
https://github.com/xkumiyu/numpy-data-augmentation
Flipping, Rotation, Scale, Random crop, Cutout, Random Erasing, Mixup, CutMix, AugMix

AutoAugment: searches for the best augmentation policy with reinforcement learning
Fast AutoAugment: reinforcement learning + Bayesian optimization shortens the search
Faster AutoAugment: a gradient-based search shortens it further
https://openreview.net/pdf?id=S1gmrxHFvB
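Several of these augmentations are one-liners in NumPy. A minimal sketch in the spirit of the numpy-data-augmentation repository linked above (function names and parameters here are my own):

```python
import numpy as np

def random_flip(img, rng):
    """Horizontal flip with probability 0.5 (img: H x W x C)."""
    return img[:, ::-1] if rng.random() < 0.5 else img

def random_crop(img, size, rng):
    """Crop a size x size window at a uniformly random position."""
    h, w = img.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return img[top:top + size, left:left + size]

def mixup(x1, x2, y1, y2, lam):
    """Mixup: convex combination of two images AND of their labels."""
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

rng = np.random.default_rng(0)
img = rng.standard_normal((32, 32, 3))
crop = random_crop(random_flip(img, rng), 24, rng)
```

Mixup differs from the geometric augmentations in that it also mixes the targets, which is why it acts as a label-smoothing-like regularizer.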
  • 12.
Regularization

Loss function: L = −Σ_data log p

L2 regularization: L = −Σ_data log p + λ|W|²
L1 regularization: L = −Σ_data log p + λ|W|

Sharpness-Aware Minimization (SAM): L = Σ_data max_{|ε|≤ρ} [−log p(W + ε)]
https://arxiv.org/abs/2010.01412

Flooding: L = |−Σ_data log p − b| + b
https://arxiv.org/abs/2002.08709

Dropout / Stochastic depth
https://arxiv.org/abs/1603.09382
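L2 regularization and flooding in particular are one-line changes to the training loss. A minimal sketch (the flood level b = 0.1 and the coefficient λ are illustrative values):

```python
import numpy as np

def l2_loss(data_loss, weights, lam=1e-4):
    """L2 regularization: add lam * |W|^2 to the data loss."""
    return data_loss + lam * sum(float((w ** 2).sum()) for w in weights)

def flooded(loss, b=0.1):
    """Flooding: |L - b| + b. The gradient flips sign whenever L < b,
    so the training loss is kept hovering around the flood level b."""
    return abs(loss - b) + b
```

For example, `flooded(0.05)` is 0.15: once the loss dips below b, gradient descent on the flooded loss pushes it back up instead of driving it to zero.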
  • 13.
Distributed parallelism

Data parallelism: the data are partitioned, the model is replicated, and gradients are communicated; the effective batch size becomes huge. Example: Horovod
Tensor parallelism: the data are replicated, the model is partitioned, and activations are communicated; communication is frequent. Example: MeshTensorFlow
Layer (pipeline) parallelism: the data are replicated, the model is partitioned, and activations are communicated; the computation is sequential. Example: GPipe

"Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis", Ben-Nun and Hoefler, ACM Computing Surveys, Article No. 65
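Synchronous data parallelism can be simulated on a single process to see why it works. In this sketch (a linear least-squares model and two simulated workers, all my own illustration), averaging the per-shard gradients, as an allreduce would, reproduces the full-batch gradient exactly:

```python
import numpy as np

def local_grad(w, x, y):
    """Gradient of the mean squared error of a linear model on one shard."""
    return 2.0 * x.T @ (x @ w - y) / len(x)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4))
y = rng.standard_normal((8, 1))
w = np.zeros((4, 1))

# Each "worker" holds a replica of w and an equal-sized shard of the batch;
# an allreduce averages the local gradients (simulated here by a Python sum).
shards = [(x[0::2], y[0::2]), (x[1::2], y[1::2])]
grads = [local_grad(w, xs, ys) for xs, ys in shards]
g_avg = sum(grads) / len(grads)
```

Because the shards are equal-sized, the mean of the shard means equals the full-batch mean, which is exactly what makes synchronous data parallelism equivalent to training with one large batch.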
  • 14.
Communication and synchronization in data parallelism

Parameter server vs. collective communication, each in a synchronous or an asynchronous variant.

"Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis", Ben-Nun and Hoefler, ACM Computing Surveys, Article No. 65
  • 15.
  • 16.
full batch / large mini-batch / small mini-batch
Why does generalization performance degrade when the batch size is increased?
  • 17.
  • 18.
  • 19.
Optimizers tailored to large batches?

LARS (layer-wise scaling of the learning rate):
w ← w − η (‖w‖/‖∇L‖) ∇L
LAMB: Adam + LARS

Even at a batch size of 32k:
Nesterov momentum reaches the same accuracy as LARS, and Adam reaches the same accuracy as LAMB.
In the end it comes down to hyperparameter tuning.
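The LARS rule above rescales the step per layer by the trust ratio ‖w‖/‖∇L‖. A minimal sketch (the ε guard against division by zero and the numbers are my own):

```python
import numpy as np

def lars_update(w, grad, eta=0.1, eps=1e-9):
    """LARS: w <- w - eta * (||w|| / ||grad||) * grad, applied per layer."""
    trust = np.linalg.norm(w) / (np.linalg.norm(grad) + eps)
    return w - eta * trust * grad

w = np.array([3.0, 4.0])      # ||w|| = 5
g = np.array([0.0, 2.0])      # ||g|| = 2, so the trust ratio is 2.5
w_new = lars_update(w, g)     # step = 0.1 * 2.5 * g = [0, 0.5]
```

The point is that layers whose weights are large relative to their gradient take proportionally larger steps, which is what keeps very large batches trainable.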
  • 20.
  • 21.
  • 22.
Two-layer fully connected NN
    D_in = 3, H = 5, D_out = 2, batch_size (BS) = 2
    Shapes: x (BS, D_in), w1 (D_in, H), w2 (H, D_out), y_p (BS, D_out)

Forward pass:
    h_r = f(x · w1), where f is ReLU (Rectified Linear Unit)
    y_p = h_r · w2

Loss:
    L = (1/N_O) Σ (y_p − y)²

Back propagation:
    ∂L/∂w2 = (∂L/∂y_p)(∂y_p/∂w2) = (1/N_O) · 2(y_p − y) · h_r
    ∂L/∂w1 = (∂L/∂y_p)(∂y_p/∂h_r)(∂h_r/∂w1) = (1/N_O) · 2(y_p − y) · w2 · x   (where h_r > 0)

Weight updates:
    w1 ← w1 − η ∂L/∂w1
    w2 ← w2 − η ∂L/∂w2
  • 23.
Implementation in pure NumPy (00_numpy.py)

import numpy as np

epochs = 300
batch_size = 32
D_in = 784
H = 100
D_out = 10
learning_rate = 1.0e-6

# create random input and output data
x = np.random.randn(batch_size, D_in)
y = np.random.randn(batch_size, D_out)

# randomly initialize weights
w1 = np.random.randn(D_in, H)
w2 = np.random.randn(H, D_out)

for epoch in range(epochs):
    # forward pass
    h = x.dot(w1)            # h = x * w1
    h_r = np.maximum(h, 0)   # h_r = ReLU(h)
    y_p = h_r.dot(w2)        # y_p = h_r * w2

    # compute squared error and print loss
    loss = np.square(y_p - y).sum()
    print(epoch, loss)

    # backward pass: compute gradient of loss with respect to w2
    grad_y_p = 2.0 * (y_p - y)
    grad_w2 = h_r.T.dot(grad_y_p)

    # backward pass: compute gradient of loss with respect to w1
    grad_h_r = grad_y_p.dot(w2.T)
    grad_h = grad_h_r.copy()
    grad_h[h < 0] = 0
    grad_w1 = x.T.dot(grad_h)

    # update weights: w ← w − η ∂L/∂w
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
  • 24.
Introducing PyTorch (01_tensors.py)

import torch

epochs = 300
batch_size = 32
D_in = 784
H = 100
D_out = 10
learning_rate = 1.0e-6

# create random input and output data
x = torch.randn(batch_size, D_in)
y = torch.randn(batch_size, D_out)

# randomly initialize weights
w1 = torch.randn(D_in, H)
w2 = torch.randn(H, D_out)

for epoch in range(epochs):
    # forward pass: compute predicted y
    h = x.mm(w1)
    h_r = h.clamp(min=0)
    y_p = h_r.mm(w2)

    # compute and print loss
    loss = (y_p - y).pow(2).sum().item()
    print(epoch, loss)

    # backward pass: compute gradient of loss with respect to w2
    grad_y_p = 2.0 * (y_p - y)
    grad_w2 = h_r.t().mm(grad_y_p)

    # backward pass: compute gradient of loss with respect to w1
    grad_h_r = grad_y_p.mm(w2.t())
    grad_h = grad_h_r.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)

    # update weights
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2

NumPy → PyTorch correspondence:
    np                  → torch
    np.random.randn     → torch.randn
    x.dot(w1)           → x.mm(w1)
    np.maximum(h, 0)    → h.clamp(min=0)
    np.square(y_p - y)  → (y_p - y).pow(2)
    copy()              → clone()
  • 25.
Introducing automatic differentiation (01_tensors.py → 02_autograd.py)

Autograd computes the gradients ∂L/∂w1 and ∂L/∂w2 automatically: the whole manual backward pass of 01_tensors.py is replaced by a single loss.backward() call.

# randomly initialize weights
w1 = torch.randn(D_in, H, requires_grad=True)
w2 = torch.randn(H, D_out, requires_grad=True)

for epoch in range(epochs):
    # forward pass: compute predicted y
    h = x.mm(w1)
    h_r = h.clamp(min=0)
    y_p = h_r.mm(w2)

    # compute and print loss
    loss = (y_p - y).pow(2).sum()
    print(epoch, loss.item())

    # backward pass
    loss.backward()

    with torch.no_grad():
        # update weights
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad

        # zero the gradients for the next iteration
        w1.grad.zero_()
        w2.grad.zero_()
  • 26.
Writing your own activation function (02_autograd.py → 03_function.py)

y = f(x) with f = ReLU (Rectified Linear Unit): instead of h.clamp(min=0), define the forward and backward passes yourself as a torch.autograd.Function.

import torch

class ReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        return input.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input < 0] = 0
        return grad_input

for epoch in range(epochs):
    # forward pass: compute predicted y
    relu = ReLU.apply
    h = x.mm(w1)
    h_r = relu(h)
    y_p = h_r.mm(w2)
    . . .
  • 27.
Using torch.nn (02_autograd.py → 04_nn_module.py)

The hand-written weight tensors of 02_autograd.py are replaced by a torch.nn model and loss function.

# create random input and output data
x = torch.randn(batch_size, D_in)
y = torch.randn(batch_size, D_out)

# define model
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

# define loss function
criterion = torch.nn.MSELoss(reduction='sum')

for epoch in range(epochs):
    # forward pass: compute predicted y
    y_p = model(x)

    # compute and print loss
    loss = criterion(y_p, y)
    print(epoch, loss.item())

    # backward pass
    model.zero_grad()
    loss.backward()

    with torch.no_grad():
        # update weights
        for param in model.parameters():
            param -= learning_rate * param.grad
  • 28.
Calling the optimizer (04_nn_module.py → 05_optimizer.py)

The manual weight update of 04_nn_module.py is replaced by a torch.optim optimizer.

# define loss function
criterion = torch.nn.MSELoss(reduction='sum')

# define optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

for epoch in range(epochs):
    # forward pass: compute predicted y
    y_p = model(x)

    # compute and print loss
    loss = criterion(y_p, y)
    print(epoch, loss.item())

    # backward pass
    optimizer.zero_grad()
    loss.backward()

    # update weights
    optimizer.step()
  • 29.
Writing your own model (05_optimizer.py → 06_mm_module.py)

The torch.nn.Sequential model of 05_optimizer.py is replaced by a custom nn.Module subclass. The model definition itself does not change during training.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoLayerNet(nn.Module):
    def __init__(self, D_in, H, D_out):
        super(TwoLayerNet, self).__init__()
        self.fc1 = nn.Linear(D_in, H)
        self.fc2 = nn.Linear(H, D_out)

    def forward(self, x):
        h = self.fc1(x)
        h_r = F.relu(h)
        y_p = self.fc2(h_r)
        return y_p

# create random input and output data
x = torch.randn(batch_size, D_in)
y = torch.randn(batch_size, D_out)

# define model
model = TwoLayerNet(D_in, H, D_out)

# define loss function
criterion = nn.MSELoss(reduction='sum')
. . .
  • 30.
Loading the MNIST dataset

06_mm_module.py (random data):

import torch.nn as nn
import torch.nn.functional as F

# create random input and output data
x = torch.randn(batch_size, D_in)
y = torch.randn(batch_size, D_out)
for epoch in range(epochs):
    # forward pass: compute predicted y
    y_p = model(x)
    ...

07_mnist.py (real data via torchvision):

import torch.nn as nn
import torch.nn.functional as F
from torchvision import datasets, transforms

# read input data and labels
train_dataset = datasets.MNIST('./data',
                               train=True,
                               download=True,
                               transform=transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)
for epoch in range(epochs):
    # set model to training mode
    model.train()
    # loop over each batch from the training set
    for batch_idx, (x, y) in enumerate(train_loader):
        # forward pass: compute predicted y
        y_p = model(x)
        ...
  • 31.
Evaluating on validation data

08_validate.py:

def validate():
    model.eval()
    val_loss, val_acc = 0, 0
    for data, target in val_loader:
        output = model(data)
        # loss on the validation data
        loss = criterion(output, target)
        val_loss += loss.item()
        # does the predicted class match the label?
        pred = output.data.max(1)[1]
        # convert to a percentage; sum() is slow on the GPU, so do it on the CPU
        val_acc += 100. * pred.eq(target.data).cpu().sum() / target.size(0)
    val_loss /= len(val_loader)
    val_acc /= len(val_loader)
    print('\nValidation set: Average loss: {:.4f}, Accuracy: {:.1f}%\n'.format(
        val_loss, val_acc))

Training data: used to train the model. Validation data: used when trying different hyperparameters or models. Test data: used for the final accuracy evaluation.
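The accuracy line is easy to unpack: output.data.max(1)[1] takes the arg-max over the class dimension, and pred.eq(target) counts matches. A pure-Python sketch of the same arithmetic, with hypothetical logits:

```python
# Hypothetical logits for a batch of 4 samples over 3 classes.
logits = [[0.1, 2.0, 0.3],
          [1.5, 0.2, 0.1],
          [0.0, 0.1, 3.0],
          [0.9, 1.1, 0.2]]
targets = [1, 0, 2, 0]

# equivalent of output.data.max(1)[1]: index of the largest logit per row
pred = [row.index(max(row)) for row in logits]

# equivalent of 100. * pred.eq(target).sum() / target.size(0)
acc = 100.0 * sum(p == t for p, t in zip(pred, targets)) / len(targets)
print(pred, acc)  # [1, 0, 2, 1] 75.0
```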
  • 32.
Structuring the code as train() and main() functions

09_train.py:

def train(train_loader, model, criterion, optimizer, epoch):
    model.train()
    t = time.perf_counter()
    for batch_idx, (data, target) in enumerate(train_loader):
        output = model(data)
        loss = criterion(output, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if batch_idx % 200 == 0:
            print('Train Epoch: {} [{:>5}/{} ({:.0%})]\tLoss: {:.6f}\tTime: {:.4f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                batch_idx / len(train_loader), loss.data.item(),
                time.perf_counter() - t))
            t = time.perf_counter()

def main():
    epochs = 10
    batch_size = 32
    learning_rate = 1.0e-02
    train_dataset = datasets.MNIST('./data',
                                   train=True,
                                   download=True,
                                   transform=transforms.ToTensor())
    val_dataset = datasets.MNIST('./data',
                                 train=False,
                                 transform=transforms.ToTensor())
    train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                               batch_size=batch_size,
                                               shuffle=True)
    val_loader = torch.utils.data.DataLoader(dataset=val_dataset,
                                             batch_size=batch_size,
                                             shuffle=False)
    model = TwoLayerNet(D_in, H, D_out)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
    for epoch in range(epochs):
        model.train()
        train(train_loader, model, criterion, optimizer, epoch)
        validate(val_loader, model, criterion)
  • 33.
A convolutional NN model

09_train.py (fully connected):

class TwoLayerNet(nn.Module):
    def __init__(self, D_in, H, D_out):
        super(TwoLayerNet, self).__init__()
        self.fc1 = nn.Linear(D_in, H)
        self.fc2 = nn.Linear(H, D_out)
    def forward(self, x):
        x = x.view(-1, D_in)
        h = self.fc1(x)
        h_r = F.relu(h)
        y_p = self.fc2(h_r)
        return F.log_softmax(y_p, dim=1)

10_cnn.py (convolutional):

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout2d(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)
    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output
  • 34.
Using the GPU

11_gpu.py:

device = torch.device('cuda')
model = CNN().to(device)

def train(train_loader, model, criterion, optimizer, epoch):
    model.train()
    t = time.perf_counter()
    for batch_idx, (data, target) in enumerate(train_loader):
        data = data.to(device)
        target = target.to(device)
        ...

def validate(loss_vector, accuracy_vector):
    model.eval()
    val_loss, correct = 0, 0
    for data, target in validation_loader:
        data = data.to(device)
        target = target.to(device)
        ...

PyTorch calls cuDNN behind the scenes.
1. Select the device with torch.device('cuda')
2. Send data and target to the device
3. All computation is then performed on the GPU automatically
  • 35.
Distributed parallelism

12_distributed.py:

import os
import torch
import torch.distributed as dist

# host address and port used for communication
master_addr = os.getenv("MASTER_ADDR", default="localhost")
master_port = os.getenv('MASTER_PORT', default='8888')
method = "tcp://{}:{}".format(master_addr, master_port)
# get rank and size from the OpenMPI environment variables
rank = int(os.getenv('OMPI_COMM_WORLD_RANK', '0'))
world_size = int(os.getenv('OMPI_COMM_WORLD_SIZE', '1'))
# pass them to PyTorch
dist.init_process_group("nccl", init_method=method, rank=rank, world_size=world_size)
print('Rank: {}, Size: {}'.format(dist.get_rank(), dist.get_world_size()))
ngpus = 4
device = rank % ngpus
x = torch.randn(1).to(device)
print('rank {}: {}'.format(rank, x))
# collective communication through PyTorch
dist.broadcast(x, src=0)
print('rank {}: {}'.format(rank, x))

Add the following to .bashrc:

if [ -f "$SGE_JOB_SPOOL_DIR/pe_hostfile" ]; then
    export MASTER_ADDR=`head -n 1 $SGE_JOB_SPOOL_DIR/pe_hostfile | cut -d " " -f 1`
fi

Run with:

mpirun -np 4 python 12_distributed.py
  • 36.
Distributed-parallel MNIST

13_ddp.py:

# all processes printing at once is hard to read, so define a print
# function that only prints from rank 0
def print0(message):
    if torch.distributed.is_initialized():
        if torch.distributed.get_rank() == 0:
            print(message, flush=True)
    else:
        print(message, flush=True)

# make different processes read different parts of the training data
train_sampler = torch.utils.data.distributed.DistributedSampler(
    train_dataset,
    num_replicas=torch.distributed.get_world_size(),
    rank=torch.distributed.get_rank())
...
# wrapping the model in DDP() enables distributed-parallel computation
model = DDP(model, device_ids=[rank])
...
  • 37.
Argparse

14_args.py:

import argparse
import torch
import torch.distributed as dist
import torch.nn as nn

parser = argparse.ArgumentParser(description='PyTorch MNIST Example')
parser.add_argument('--batch-size', type=int, default=32, metavar='N',
                    help='input batch size for training (default: 32)')
parser.add_argument('--epochs', type=int, default=10, metavar='N',
                    help='number of epochs to train (default: 10)')
parser.add_argument('--lr', type=float, default=1.0e-02, metavar='LR',
                    help='learning rate (default: 1.0e-02)')
args = parser.parse_args()
epochs = args.epochs
batch_size = args.batch_size
learning_rate = args.lr * world_size

Values that were previously hard-coded can now be supplied through args.
https://docs.python.org/ja/3/library/argparse.html#action
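parse_args() can also be fed an explicit list of strings instead of reading sys.argv, which is handy for checking how the flags above behave. A self-contained sketch using the same flag names (note that --batch-size becomes the attribute batch_size):

```python
import argparse

# Same flags as in 14_args.py, reduced to a standalone example.
parser = argparse.ArgumentParser(description='PyTorch MNIST Example')
parser.add_argument('--batch-size', type=int, default=32, metavar='N')
parser.add_argument('--epochs', type=int, default=10, metavar='N')
parser.add_argument('--lr', type=float, default=1.0e-02, metavar='LR')

# Passing a list instead of reading sys.argv; unspecified flags keep their defaults.
args = parser.parse_args(['--lr', '0.1', '--epochs', '5'])
print(args.lr, args.epochs, args.batch_size)  # 0.1 5 32
```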
  • 38.
AverageMeter

15_meter.py:

def train(train_loader, model, criterion, optimizer, epoch, device):
    batch_time = AverageMeter('Time', ':.4f')
    train_loss = AverageMeter('Loss', ':.6f')

class AverageMeter(object):
    def __init__(self, name, fmt=':f'):
        self.name = name
        self.fmt = fmt      # output format
        self.reset()
    def reset(self):
        self.val = 0        # value
        self.avg = 0        # average
        self.sum = 0        # sum
        self.count = 0      # count
    def update(self, val, n=1):
        # pass n > 1 when val is already the average of n items
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count
    def __str__(self):
        fmtstr = '{name} {val' + self.fmt + '} ({avg' + self.fmt + '})'
        return fmtstr.format(**self.__dict__)
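The update(val, n) bookkeeping can be checked in isolation. The sketch below is a minimal standalone copy of that arithmetic with hypothetical values:

```python
class AverageMeter:
    """Minimal standalone copy of the slide's AverageMeter bookkeeping."""
    def __init__(self):
        self.val = self.avg = self.sum = self.count = 0
    def update(self, val, n=1):
        self.val = val
        self.sum += val * n   # val is the mean of n items, so the sum grows by val * n
        self.count += n
        self.avg = self.sum / self.count

m = AverageMeter()
m.update(2.0, n=3)  # e.g. a batch of 3 samples with mean loss 2.0
m.update(4.0)       # a single item with loss 4.0
print(m.val, m.sum, m.count, m.avg)  # 4.0 10.0 4 2.5
```

The running average 2.5 is the mean over all 4 items, not the mean of the two update() calls, which is why per-batch losses can be accumulated correctly even when the last batch is smaller.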
  • 39.
ProgressMeter

15_meter.py:

def train(train_loader, model, criterion, optimizer, epoch, device):
    batch_time = AverageMeter('Time', ':.4f')
    train_loss = AverageMeter('Loss', ':.6f')
    progress = ProgressMeter(
        len(train_loader),
        [train_loss, batch_time],            # variables to print
        prefix="Epoch: [{}]".format(epoch))

class ProgressMeter(object):
    def __init__(self, num_batches, meters, prefix="", postfix=""):
        self.batch_fmtstr = self._get_batch_fmtstr(num_batches)
        self.meters = meters
        self.prefix = prefix      # printed before the meters
        self.postfix = postfix    # printed after the meters
    def display(self, batch):
        entries = [self.prefix + self.batch_fmtstr.format(batch)]
        entries += [str(meter) for meter in self.meters]
        if self.postfix:
            entries += [self.postfix]
        # join everything to be printed
        print0('\t'.join(entries))
    def _get_batch_fmtstr(self, num_batches):
        # produce a display like [ current batch / total batches ]
        num_digits = len(str(num_batches // 1))
        fmt = '{:' + str(num_digits) + 'd}'
        return '[' + fmt + '/' + fmt.format(num_batches) + ']'
  • 40.
Weights and Biases

pip install wandb
wandb login

16_wandb.py:

import wandb

os.environ['MASTER_ADDR'] = 'localhost'
os.environ['MASTER_PORT'] = '8888'
rank = int(os.getenv('OMPI_COMM_WORLD_RANK', '0'))
world_size = int(os.getenv('OMPI_COMM_WORLD_SIZE', '1'))
dist.init_process_group("nccl", rank=rank, world_size=world_size)
device = torch.device('cuda', rank)
if torch.distributed.get_rank() == 0:
    # initialize wandb; passing args records the experimental conditions automatically
    wandb.init(project="example-project")
    wandb.config.update(args)
epochs = args.epochs
batch_size = args.batch_size
learning_rate = args.lr * world_size
for epoch in range(epochs):
    model.train()
    # train and validate now return loss and accuracy
    train_loss, train_acc = train(train_loader, model, criterion, optimizer, epoch, device)
    val_loss, val_acc = validate(val_loader, model, criterion, device)
    if torch.distributed.get_rank() == 0:
        # variables to record with wandb
        wandb.log({
            'train_loss': train_loss,
            'train_acc': train_acc,
            'val_loss': val_loss,
            'val_acc': val_acc
        })
  • 41.
CIFAR10

17_cifar10.py:

# change the dataset
train_dataset = datasets.CIFAR10('./data',
                                 train=True,
                                 download=True,
                                 transform=transforms.ToTensor())
val_dataset = datasets.CIFAR10('./data',
                               train=False,
                               download=True,
                               transform=transforms.ToTensor())

# change the model
model = VGG('VGG19').to(device)
model = DDP(model, device_ids=[rank % 4])
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
  • 42.
Data augmentation

18_augmentation.py:

transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    # normalize the pixel intensities (per-channel mean and std)
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])
transform_val = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])
  • 43.
Regularization

19_regularization.py:

parser.add_argument('--momentum', type=float, default=0.9, metavar='M',
                    help='momentum (default: 0.9)')
parser.add_argument('--wd', '--weight_decay', type=float, default=5.0e-04, metavar='W',
                    help='weight decay (default: 5.0e-04)')

optimizer = torch.optim.SGD(model.parameters(), lr=args.lr,
                            momentum=args.momentum, weight_decay=args.wd)

L2 regularization adds a penalty to the loss,
\tilde{l} = l + \frac{\lambda}{2}\lVert\theta\rVert^2,
so its gradient becomes
\nabla\tilde{l} = \nabla l + \lambda\theta,
and the SGD update turns into
\theta_{t+1} = \theta_t - \eta\nabla l(\theta_t) - \eta\lambda\theta_t,
i.e. each step also shrinks the weights (weight decay). Momentum is handled by the momentum=args.momentum argument of torch.optim.SGD.
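One SGD step with L2 regularization is θ ← θ − η(∇l + λθ); the sketch below checks this numerically on a hypothetical scalar parameter (all values invented for illustration):

```python
# One SGD step with L2 regularization (weight decay) on a scalar parameter:
# theta <- theta - eta * (grad + lam * theta)
theta = 1.0
grad = 0.5     # hypothetical gradient of the unregularized loss
eta = 0.1      # learning rate
lam = 1.0e-3   # weight decay coefficient, as passed via --wd above

theta_plain = theta - eta * grad                  # plain SGD: 0.95
theta_decay = theta - eta * (grad + lam * theta)  # with decay: 0.9499
print(theta_plain, theta_decay)
```

The decayed step lands slightly closer to zero than the plain step, which is exactly the shrinking effect the penalty term introduces.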
  • 44.
Sweep

sweep.yaml:

program: wrapper.py
method: grid
metric:
  goal: minimize
  name: val_loss
parameters:
  epochs:
    values: [100]
  batch_size:
    values: [32]
  learning_rate:
    values: [0.005, 0.01, 0.02, 0.05, 0.1]
  momentum:
    values: [0.85, 0.9, 0.95]
  weight_decay:
    values: [1.0e-4, 2.0e-4, 5.0e-4, 1.0e-3, 2.0e-3]

wandb sweep sweep.yaml
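With method: grid, wandb enumerates the Cartesian product of all parameter value lists. The number of runs the sweep above launches can be computed directly (epochs and batch_size contribute one value each):

```python
from itertools import product

# Parameter value lists from sweep.yaml above.
learning_rate = [0.005, 0.01, 0.02, 0.05, 0.1]
momentum = [0.85, 0.9, 0.95]
weight_decay = [1.0e-4, 2.0e-4, 5.0e-4, 1.0e-3, 2.0e-3]

# method: grid tries every combination: 5 * 3 * 5 = 75 runs
runs = list(product(learning_rate, momentum, weight_decay))
print(len(runs))  # 75
```

This multiplicative growth is why grid search becomes impractical with many hyperparameters; wandb also supports random and Bayesian search methods for that case.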
  • 45.
Models

19_regularization.py:

model = VGG('VGG19').to(device)            # currently using this one
# model = ResNet18().to(device)
# model = PreActResNet18().to(device)
# model = GoogLeNet().to(device)
# model = DenseNet121().to(device)
# model = ResNeXt29_2x64d().to(device)
# model = MobileNet().to(device)
# model = MobileNetV2().to(device)
# model = DPN92().to(device)
# model = ShuffleNetG2().to(device)
# model = SENet18().to(device)
# model = ShuffleNetV2(1).to(device)
# model = EfficientNetB0().to(device)
# model = RegNetX_200MF().to(device)

Try the other models as well.
  • 46.
References

Learning PyTorch with Examples
https://pytorch.org/tutorials/beginner/pytorch_with_examples.html
PyTorch Examples on GitHub
https://github.com/pytorch/examples
PyTorch Tutorial on GitHub
https://github.com/yunjey/pytorch-tutorial
Understanding PyTorch with an example: a step-by-step tutorial by Daniel Godoy
https://towardsdatascience.com/understanding-pytorch-with-an-example-a-step-by-step-tutorial-81fc5f8c4e8e
Practical Deep Learning for Coders, v3 by fast.ai
https://course.fast.ai
PyTorch by Beeren Sahu
https://beerensahu.wordpress.com/2018/03/21/pytorch-tutorial-lesson-1-tensor/
Writing Distributed Applications with PyTorch by Séb Arnold
https://pytorch.org/tutorials/intermediate/dist_tuto.html