“On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima”
Rating: 6: Marginally above acceptance threshold
Rating: 10: Top 5% of accepted papers, seminal paper
Review: … the work is of great value in shedding light on some interesting
questions around the generalization of deep networks. … Such results may have
impact on both theory and practice, respectively by suggesting what assumptions
are legitimate in real scenarios for building new theories, and by being used
heuristically to develop new algorithms …
Rating: 8: Top 50% of accepted papers, clear accept
Review: Interesting paper; it definitely provides value to the community by
discussing why large-batch gradient descent does not work well.
Comment: All reviews (including the public one) were extremely positive, and this
sheds light on a universal engineering issue that arises in fitting non-convex models.
I think the community will benefit a lot from the insights here.
Decision: Accept (Oral)