3. Background
Deep Neural Networks' Mystery
● Huge number of parameters
○ Ex.) VGGNet has 155 million parameters
● Training data can be relatively small
○ Ex.) ImageNet has 1.2 million samples
● Yet they still show high generalization performance
59. What does this paper imply?
With early stopping:
Larger model → Better generalization
Deep learning's mystery!
How can this paper explain it?
Larger model → Larger eigenvalues
(Marchenko–Pastur distribution)
Discussion
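A minimal sketch of the eigenvalue claim above, not taken from the slides: by the Marchenko–Pastur law, for an n × p random matrix W with unit-variance i.i.d. entries, the eigenvalues of (1/n)WᵀW concentrate in [(1 − √(p/n))², (1 + √(p/n))²]. So growing the model (larger p at fixed sample count n) pushes the top eigenvalue up toward a larger edge. The sizes below are arbitrary illustration values.

```python
import numpy as np

# Assumption: illustrative sizes, not from the paper or slides.
rng = np.random.default_rng(0)
n = 2000  # number of samples (rows)

for p in (200, 1000, 1800):  # "model size" (columns)
    W = rng.standard_normal((n, p))
    # Eigenvalues of the sample covariance-like matrix (1/n) W^T W.
    eigs = np.linalg.eigvalsh(W.T @ W / n)
    # Marchenko–Pastur upper edge for aspect ratio p/n.
    mp_edge = (1 + np.sqrt(p / n)) ** 2
    print(f"p/n = {p/n:.1f}  max eigenvalue = {eigs.max():.2f}  "
          f"MP upper edge = {mp_edge:.2f}")
```

Running this shows the largest eigenvalue tracking the Marchenko–Pastur upper edge (1 + √(p/n))², which grows with p/n — a larger model yields bigger eigenvalues.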