The paper proposes incorporating a simple recurrent unit (SRU) into deep Gaussian processes (DGPs) for speech synthesis to enable utterance-level sequential modeling. On a Japanese speech corpus, the SRU-DGP model yielded smaller spectral distortion than other neural network and Bayesian neural network baselines, outperformed feedforward DGP and LSTM-RNN baselines in subjective evaluations, and generated speech faster than an LSTM-RNN. Future work will investigate incorporating other differentiable components, such as attention, into the DGP framework.
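To make the sequential-modeling component concrete, below is a minimal NumPy sketch of the standard SRU recurrence (forget gate, elementwise cell update, and a highway output gate). This illustrates the generic SRU cell, not the paper's DGP-specific formulation; all function and parameter names here are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sru_step(x_t, c_prev, params):
    """One SRU step. Note the cell update uses only elementwise
    operations on c_prev; all matrix products involve x_t alone,
    so they can be precomputed for the whole utterance at once,
    which is the source of the SRU's speed advantage over an LSTM."""
    W, W_f, b_f, W_r, b_r = params
    x_tilde = W @ x_t                       # candidate state
    f_t = sigmoid(W_f @ x_t + b_f)          # forget gate
    c_t = f_t * c_prev + (1.0 - f_t) * x_tilde
    r_t = sigmoid(W_r @ x_t + b_r)          # highway (reset) gate
    h_t = r_t * np.tanh(c_t) + (1.0 - r_t) * x_t  # highway connection
    return h_t, c_t

def sru_layer(xs, params, dim):
    """Run the recurrence over a sequence of frame vectors."""
    c = np.zeros(dim)
    outputs = []
    for x_t in xs:
        h, c = sru_step(x_t, c, params)
        outputs.append(h)
    return np.stack(outputs)
```

The highway term `(1 - r_t) * x_t` assumes the hidden dimension equals the input dimension; a practical layer would add a projection when they differ.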