This document discusses SampleRNN, a neural audio generation model capable of producing high-quality speech. SampleRNN uses a multi-rate recurrent neural network architecture with learned upsampling to model raw audio waveforms directly, one sample at a time. Conditioned on vocoder features, it enables high-quality speech coding at bitrates as low as 6.4 kbps, well below those of traditional speech codecs. Experimental results show that SampleRNN achieves better quality than existing codecs such as AMR-WB at comparable or lower bitrates. Future work may focus on improving SampleRNN's robustness and reducing its computational complexity.
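The multi-rate idea can be illustrated with a minimal sketch: a slow, frame-level tier produces one conditioning vector per frame, and learned upsampling expands it to one vector per audio sample for the fast, sample-level tier. The shapes, names, and the einsum-based upsampling below are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

frame_size = 16   # samples per frame, i.e. the upsampling ratio (assumed value)
cond_dim = 8      # conditioning feature dimension (assumed value)
hidden_dim = 8    # width of the per-sample conditioning vectors (assumed value)

# Learned upsampling: a separate learned projection for each of the
# `frame_size` sample positions within a frame. This is equivalent to a
# transposed 1-D convolution with stride == frame_size.
W_up = rng.standard_normal((frame_size, cond_dim, hidden_dim)) * 0.1

def upsample(frame_feats):
    """Expand (n_frames, cond_dim) -> (n_frames * frame_size, hidden_dim)."""
    n_frames = frame_feats.shape[0]
    # (n_frames, frame_size, hidden_dim): one projection per sample position
    out = np.einsum('fc,pch->fph', frame_feats, W_up)
    return out.reshape(n_frames * frame_size, hidden_dim)

frames = rng.standard_normal((4, cond_dim))  # e.g. vocoder features per frame
per_sample = upsample(frames)
print(per_sample.shape)  # (64, 8): one conditioning vector per audio sample
```

In the real model these per-sample vectors would feed the sample-level RNN alongside previous waveform samples; the sketch only shows how learned upsampling bridges the two clock rates.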