This document summarizes recent progress in distributed deep learning. It reviews the state of the art in neural networks and deep learning, along with the factors driving advances in the field, such as big data and increased computing power. It then covers approaches to scaling deep learning through model parallelism, data parallelism, and distributed training frameworks. Several deep learning applications developed in Vietnam are presented as examples, including optical character recognition and predictive text. The document concludes with principles for designing machine learning systems in distributed settings.
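The summary mentions data parallelism as one scaling approach. As a minimal toy sketch of the idea (all function names here are hypothetical, and real systems would use a framework such as PyTorch DistributedDataParallel or Horovod to run this across devices): each simulated worker holds a full copy of the model, computes a gradient on its own data shard, and the gradients are averaged before a single shared update.

```python
# Toy illustration of synchronous data parallelism: the batch is
# split into shards, each "worker" computes a local gradient for the
# same model parameters, and the gradients are averaged (the
# all-reduce step) before one shared update. Names are hypothetical.

def local_gradient(w, shard):
    """Gradient of mean squared error for the model y = w * x on one shard."""
    n = len(shard)
    return sum(2 * (w * x - y) * x for x, y in shard) / n

def data_parallel_step(w, shards, lr=0.01):
    """One synchronous step: average per-worker gradients, then update."""
    grads = [local_gradient(w, s) for s in shards]  # parallel in practice
    avg = sum(grads) / len(grads)                   # stands in for all-reduce
    return w - lr * avg

# Synthetic data for the target y = 3x, split into two worker shards.
data = [(x, 3.0 * x) for x in range(1, 9)]
shards = [data[:4], data[4:]]

w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards)
print(round(w, 3))  # converges toward 3.0
```

Because every worker applies the same averaged gradient, all replicas stay in sync after each step, which is the defining property of synchronous data-parallel training.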