2. Weights and biases
All Zero Initialization: every neuron in the network computes the same output, so there is no source of asymmetry between neurons. They all receive the same gradient and remain identical throughout training.

Initialization with Small Random Numbers: weights are drawn as small random numbers very close to zero. The randomness breaks the symmetry between neurons, so each one can learn a different function.
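The contrast above can be seen in a minimal NumPy sketch (the layer sizes and the 0.01 scale are illustrative choices, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4))  # one input sample with 4 features

# All-zero initialization: every hidden neuron has identical weights,
# so every neuron computes the same output (and would get the same gradient).
W_zero = np.zeros((4, 3))
h_zero = np.tanh(x @ W_zero)
print(h_zero)  # every neuron outputs the same value

# Small random initialization: weights close to zero but all different,
# which breaks the symmetry between neurons.
W_small = 0.01 * rng.normal(size=(4, 3))
h_small = np.tanh(x @ W_small)
print(h_small)  # each neuron's output now differs
```

With zero weights the three hidden outputs are indistinguishable; with small random weights each neuron already responds differently to the same input.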
6. Xavier initialization
The variance of the output is the variance of the input scaled by n Var(Wi):

Var(y) = n Var(Wi) Var(x)

So if we want the variance of the input and output to be the same, n Var(Wi) should be 1, which means the variance of the weights should be

Var(Wi) = 1/n
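A quick NumPy check of this derivation, assuming unit-variance inputs and a plain linear layer (the layer width of 512 is an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 512, 512

# Xavier initialization: Var(Wi) = 1/n_in, i.e. scale a standard
# normal by sqrt(1/n_in).
W = rng.normal(size=(n_in, n_out)) * np.sqrt(1.0 / n_in)

x = rng.normal(size=(10_000, n_in))  # inputs with variance ~1
y = x @ W                            # linear layer, no activation

# Var(y) = n_in * Var(Wi) * Var(x) = n_in * (1/n_in) * 1 = 1
print(x.var(), y.var())              # both close to 1
```

The output variance stays close to the input variance, which is exactly the property the scaling 1/n was chosen to achieve; without it, the variance would grow by a factor of n at each layer.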