References
• Sebastian Ruder. “An overview of gradient descent optimization algorithms”. http://ruder.io/optimizing-gradient-descent/.
• Sebastian Ruder. “Optimization for Deep Learning Highlights in 2017”. http://ruder.io/deep-learning-optimization-2017/index.html.
• Ian Goodfellow, Yoshua Bengio, and Aaron Courville. “Deep Learning”. http://www.deeplearningbook.org.
• Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … Polosukhin, I. (2017). Attention Is All You Need. In Advances in Neural Information Processing Systems.
• Loshchilov, I., & Hutter, F. (2017). SGDR: Stochastic Gradient Descent with Warm Restarts. In Proceedings of ICLR 2017.
• Loshchilov, I., & Hutter, F. (2017). Fixing Weight Decay Regularization in Adam. arXiv preprint arXiv:1711.05101. Retrieved from http://arxiv.org/abs/1711.05101
• Zeiler, M. D. (2012). ADADELTA: An Adaptive Learning Rate Method. Retrieved from http://arxiv.org/abs/1212.5701
• Kingma, D. P., & Ba, J. L. (2015). Adam: A Method for Stochastic Optimization. International Conference on Learning Representations, 1–13.
• Masaaki Imaizumi. “深層学習による非滑らかな関数の推定 [Estimation of Non-Smooth Functions by Deep Learning]”. SlideShare. https://www.slideshare.net/masaakiimaizumi1/ss-87969960.
• nishio. “勾配降下法の最適化アルゴリズム [Optimization Algorithms for Gradient Descent]”. SlideShare. https://www.slideshare.net/nishio/ss-66840545.