The document proposes a novel linear associative unit (LAU) for neural machine translation. LAU shortens the gradient propagation path inside recurrent units, addressing the optimization difficulties of deep neural networks. Experiments on Chinese-English, English-German, and English-French translation tasks show that LAU outperforms GRU baselines, and that increasing model depth is more effective than increasing width. Analysis indicates that LAU transfers information through the network better than GRU, especially for longer sentences.
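The core idea, carrying a gated linear transform of the input alongside the usual nonlinear candidate state so gradients have a shorter path, can be sketched as follows. This is a minimal NumPy illustration under assumed notation (parameter names and the exact gating arrangement are illustrative, not the paper's formulation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lau_step(x, h_prev, params):
    """One step of a linear-associative-unit-style recurrent cell (sketch).

    In addition to the GRU-style nonlinear candidate, a linear transform
    of the input is mixed in through its own gate, giving gradients a
    short, mostly linear path from input to hidden state.
    Parameter names are hypothetical, not the paper's notation.
    """
    Wg, Ug, Wr, Ur, Wc, Uc, Wl, Wf = params
    g = sigmoid(Wg @ x + Ug @ h_prev)        # update gate (as in GRU)
    r = sigmoid(Wr @ x + Ur @ h_prev)        # reset gate (as in GRU)
    c = np.tanh(Wc @ x + Uc @ (r * h_prev))  # nonlinear candidate state
    lin = Wl @ x                             # linear transform of the input
    f = sigmoid(Wf @ x)                      # gate on the linear path
    c_mix = f * lin + (1.0 - f) * c          # blend linear and nonlinear paths
    return (1.0 - g) * h_prev + g * c_mix    # GRU-style interpolation

# Toy usage: one step with random weights
rng = np.random.default_rng(0)
n, m = 4, 3  # hidden size, input size
params = (
    rng.standard_normal((n, m)), rng.standard_normal((n, n)),  # Wg, Ug
    rng.standard_normal((n, m)), rng.standard_normal((n, n)),  # Wr, Ur
    rng.standard_normal((n, m)), rng.standard_normal((n, n)),  # Wc, Uc
    rng.standard_normal((n, m)), rng.standard_normal((n, m)),  # Wl, Wf
)
h = lau_step(rng.standard_normal(m), np.zeros(n), params)
```

When the linear-path gate `f` saturates near 1, the cell passes `Wl @ x` almost directly into the state update, which is the mechanism the summary credits for easier optimization of deep stacks.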