4. Expected Specs for In-Vehicle Use
Cloud                               Edge
Many classes (1000s)                Few classes (<10)
Large workloads                     Frame rates (15-30 FPS)
High efficiency (Performance/W)     Low cost & low power (1W-5W)
Server form factor                  Custom form factor
J. Freeman (Intel), “FPGA Acceleration in the era of high level design”, 2017
9. Artificial Neuron (AN)
[Figure: an artificial neuron; inputs x0 = 1, x1, x2, ..., xN are weighted by w0 (bias), w1, w2, ..., wN, summed into u, and passed through f(u) to produce y]

xi: Input signal
wi: Weight
u: Internal state
f(u): Activation function (Sigmoid, ReLU, etc.)
y: Output signal

y = f(u),  u = Σ_{i=0}^{N} wi · xi
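The neuron defined above can be written directly from the two equations; a minimal Python sketch (the function and variable names are illustrative, not from the slides):

```python
import math

def neuron(x, w, f):
    """One artificial neuron: u = sum_i w_i * x_i, then y = f(u).

    By convention x[0] = 1, so w[0] acts as the bias term w0.
    """
    u = sum(wi * xi for wi, xi in zip(w, x))  # internal state u
    return f(u)                               # output y = f(u)

# Two common activation functions named on the slide
sigmoid = lambda u: 1.0 / (1.0 + math.exp(-u))
relu = lambda u: max(0.0, u)

x = [1.0, 0.5, -0.2]  # x[0] = 1 carries the bias
w = [0.1, 0.8, 0.3]   # w[0] is the bias weight
y = neuron(x, w, relu)
```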
20. Normalization for Binarized DNN
• Normalizing the result of MAC operations
• Batch normalization is necessary for the binarized CNN to improve its accuracy
[Figure: error rate [%] vs. epoch (1-200), comparing training with and without batch normalization]
H. Nakahara, H. Yonekawa, T. Sasao, H. Iwamoto, and M. Motomura, "A Memory-Based Realization of a Binarized Deep Convolutional Neural Network," The International Conference on Field-Programmable Technology (FPT 2016), pp. 273-76, 2016.
Batch normalization computes the mean and variance over a batch, normalizes by them, then applies a learned scaling (γ) and shift (β):
y = γ · (x − mean) / √(variance + ε) + β
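As a concrete reference, the standard batch-normalization formula above can be sketched in a few lines of Python (gamma, beta, and eps defaults are illustrative):

```python
import math

def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a batch of MAC outputs: subtract the mean, divide by the
    standard deviation, then apply scaling (gamma) and shift (beta)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta
            for v in values]
```

After this transform the batch has (approximately) zero mean and unit variance before the affine scaling and shift are applied.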
21. Circuit with Batch Normalization
• Batch normalization is implemented with fixed-point adders and multipliers
[Figure: circuit with XNOR gates feeding an adder tree, followed by batch normalization and a sign bit]
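In the circuit above, the binarized MAC needs no multipliers: with the usual BNN encoding (bit 1 for +1, bit 0 for -1, an assumption not spelled out on the slide), each ±1 product is an XNOR of two bits, and the adder tree reduces to a population count. A small Python sketch of that reduction:

```python
def binarized_mac(x_bits, w_bits):
    """Binarized multiply-accumulate.

    Inputs and weights are bits encoding +/-1 (1 -> +1, 0 -> -1).
    The product of two +/-1 values is +1 exactly when the bits match,
    i.e. an XNOR; the adder tree then reduces to a popcount:
        sum = 2 * popcount(xnor) - N
    """
    n = len(x_bits)
    # not (xb ^ wb) is the XNOR of the two bits
    popcount = sum(1 for xb, wb in zip(x_bits, w_bits) if not (xb ^ wb))
    return 2 * popcount - n
```

For example, x = (+1, -1, +1) and w = (+1, +1, -1) have products (+1, -1, -1), so the MAC result is -1.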
22. Realizing Batch Normalization as a Bias
• The output of batch normalization is the input to the sign function, so the positive constant scaling factor can be ignored
• The input to batch normalization is an integer value, so the remaining shift can be converted to an integer bias
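The two observations above let batch norm collapse into a single integer threshold: for γ > 0, sign(γ·(u − mean)/√(var + ε) + β) equals sign(u − t) with t = mean − β·√(var + ε)/γ, and since the MAC output u is an integer, t can be rounded up to an integer. A sketch under that assumption (helper names are illustrative):

```python
import math

def bn_sign_threshold(mean, var, gamma, beta, eps=1e-5):
    """Fold batch norm followed by sign() into one integer bias.

    sign(gamma*(u - mean)/sqrt(var + eps) + beta) == sign(u - t)
    for gamma > 0, where t = mean - beta*sqrt(var + eps)/gamma.
    Because u is an integer MAC result, u >= t iff u >= ceil(t).
    """
    assert gamma > 0, "for gamma < 0 the comparison direction flips"
    t = mean - beta * math.sqrt(var + eps) / gamma
    return math.ceil(t)

def binarized_activation(u, t):
    # sign of the biased integer MAC output (ties go to +1)
    return 1 if u - t >= 0 else -1
```

This is why the hardware needs only an adder for the folded bias, rather than fixed-point multipliers, at inference time.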