The document proposes a neural source-filter waveform model (NSF) for speech synthesis. The NSF model consists of three modules: a condition module that upsamples spectral features and F0, a source module that generates a sine excitation signal, and a filter module with dilated convolutional blocks. The model is trained directly on waveforms using a spectral distance criterion in the STFT domain. Experiments show the NSF model generates high quality waveforms comparable to WaveNet, with faster generation speed. Ablation tests analyze the importance of the sine excitation source and different spectral loss terms. The NSF provides a simpler alternative to autoregressive models for neural speech synthesis.