This document presents a system for transcribing singing voice melodies from polyphonic music signals using deep neural networks (DNNs). It introduces two DNN models: one for fundamental frequency (f0) estimation of the melody, and another for voice activity detection. The f0 estimation DNN takes log-spectrograms as input and classifies each time frame into one of 193 pitch classes or an unvoiced class. It achieves higher accuracy than a state-of-the-art method by generalizing better across genres and reducing octave errors. The DNNs are evaluated on several music databases and show promising results for singing voice melody transcription.
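To make the frame-wise classification setup concrete, the sketch below shows one plausible shape for the f0 estimation DNN: each log-spectrogram frame is mapped to 194 logits (193 pitch classes plus one unvoiced class). The network depth, hidden size, and spectrogram bin count (`n_bins=513`) are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class MelodyF0Net(nn.Module):
    """Hypothetical frame-wise f0 classifier: one log-spectrogram
    frame in, logits over 193 pitch classes + 1 unvoiced class out."""

    def __init__(self, n_bins: int = 513, n_pitch_classes: int = 193, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_bins, hidden),   # layer sizes are assumptions
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_pitch_classes + 1),  # +1 for the unvoiced class
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_frames, n_bins) log-spectrogram frames -> (n_frames, 194) logits
        return self.net(x)

# Usage: classify 100 frames; class index 193 denotes "unvoiced".
model = MelodyF0Net()
frames = torch.randn(100, 513)       # stand-in for real log-spectrogram frames
logits = model(frames)               # shape (100, 194)
pred = logits.argmax(dim=1)          # per-frame pitch class, or 193 if unvoiced
```

Framing f0 estimation as classification over discrete pitch classes (rather than regression) lets the model express the unvoiced decision as just another output class, which fits the per-frame labeling described above.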