This document discusses strategies for training deep neural networks, introducing two approaches: stacked restricted Boltzmann machine networks and stacked autoencoder networks. In the first, individual restricted Boltzmann machine layers are trained with contrastive divergence and then stacked, each layer's hidden activations serving as input to the next. In the second, autoencoder layers are trained to minimize reconstruction loss and stacked in the same way. Experimental results show that greedy, layer-by-layer unsupervised pre-training allows deep networks to learn more complex representations than single-layer networks.
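To make the first approach concrete, here is a minimal toy sketch of a Bernoulli-Bernoulli restricted Boltzmann machine trained with one-step contrastive divergence (CD-1). This is an illustrative implementation under standard assumptions, not the document's own code; the class and variable names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Bernoulli-Bernoulli RBM trained with one-step contrastive divergence."""
    def __init__(self, n_visible, n_hidden):
        self.W = rng.normal(0, 0.01, (n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # visible bias
        self.b_h = np.zeros(n_hidden)    # hidden bias

    def sample_h(self, v):
        p = sigmoid(v @ self.W + self.b_h)
        return p, (rng.random(p.shape) < p).astype(float)

    def sample_v(self, h):
        p = sigmoid(h @ self.W.T + self.b_v)
        return p, (rng.random(p.shape) < p).astype(float)

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden probabilities given the data
        ph0, h0 = self.sample_h(v0)
        # Negative phase: one Gibbs step (reconstruct, then re-infer hiddens)
        pv1, _ = self.sample_v(h0)
        ph1, _ = self.sample_h(pv1)
        # CD-1 approximation to the log-likelihood gradient
        self.W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(v0)
        self.b_v += lr * (v0 - pv1).mean(axis=0)
        self.b_h += lr * (ph0 - ph1).mean(axis=0)

# Toy binary data; in a stacked network, the hidden activations of a
# trained RBM become the "visible" data for the next RBM in the stack.
data = rng.integers(0, 2, (64, 6)).astype(float)
rbm = RBM(n_visible=6, n_hidden=4)
for epoch in range(50):
    rbm.cd1_step(data)
hidden_repr, _ = rbm.sample_h(data)  # input to the next stacked layer
```

Stacking then amounts to repeating this procedure, feeding each trained layer's hidden representation into a fresh RBM.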
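The second approach, greedy layer-wise pre-training of stacked autoencoders, could be sketched as follows. This is a toy single-hidden-layer autoencoder with tied weights, trained on squared reconstruction error; all names and the layer sizes are assumptions made for the example, not taken from the document.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TiedAutoencoder:
    """One-layer autoencoder with tied weights, minimizing squared reconstruction error."""
    def __init__(self, n_in, n_hidden):
        self.W = rng.normal(0, 0.1, (n_in, n_hidden))
        self.b_h = np.zeros(n_hidden)
        self.b_o = np.zeros(n_in)

    def encode(self, x):
        return sigmoid(x @ self.W + self.b_h)

    def decode(self, h):
        return sigmoid(h @ self.W.T + self.b_o)

    def train_step(self, x, lr=0.5):
        h = self.encode(x)
        r = self.decode(h)
        # Backpropagate the reconstruction error through both tied paths
        d_r = (r - x) * r * (1 - r)          # output pre-activation gradient
        d_h = (d_r @ self.W) * h * (1 - h)   # hidden pre-activation gradient
        grad_W = x.T @ d_h + d_r.T @ h       # encoder and decoder share W
        self.W -= lr * grad_W / len(x)
        self.b_h -= lr * d_h.mean(axis=0)
        self.b_o -= lr * d_r.mean(axis=0)
        return np.mean((r - x) ** 2)

# Greedy layer-wise pre-training: train each autoencoder to reconstruct
# the codes produced by the previous layer, then stack the encoders.
x = rng.random((128, 8))
sizes = [8, 6, 4]
layers, inp = [], x
for n_in, n_hid in zip(sizes[:-1], sizes[1:]):
    ae = TiedAutoencoder(n_in, n_hid)
    for _ in range(200):
        loss = ae.train_step(inp)
    layers.append(ae)
    inp = ae.encode(inp)  # this layer's codes become the next layer's input
# `inp` now holds the deep representation from the stacked encoder
```

Each layer is trained in isolation on an unsupervised objective; only afterwards are the encoders composed into a deep network, which is the layer-by-layer pre-training the experimental results refer to.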