
Quoc Le, Software Engineer, Google at MLconf SF

Title: Deep Learning for Language Understanding

Abstract:
Many current language understanding algorithms rely on expert knowledge to engineer models and features. In this talk, I will discuss how to use Deep Learning to understand text without much prior knowledge. In particular, our algorithms learn vector representations of words. These vector representations can be used to solve word analogies or to translate unknown words between languages. Our algorithms also learn vector representations of sentences and documents. These representations preserve the semantics of sentences and documents and can therefore be used for machine translation, text classification, information retrieval, and sentiment analysis.
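The word-analogy use mentioned in the abstract can be sketched with toy vectors. This is a minimal, hypothetical illustration: the vectors below are invented, whereas real systems learn them from large corpora.

```python
import numpy as np

# Invented toy word vectors -- for illustration only.
# Real systems learn these from large text corpora.
vectors = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
    "apple": np.array([0.5, 0.5, 0.5]),
}

def analogy(a, b, c):
    """Solve 'a is to b as c is to ?' by vector arithmetic."""
    target = vectors[b] - vectors[a] + vectors[c]
    def cos(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    # Exclude the query words themselves from the candidates.
    candidates = {w: v for w, v in vectors.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cos(candidates[w], target))

print(analogy("man", "king", "woman"))  # -> "queen" with these toy vectors
```

The same nearest-neighbor search, applied to learned cross-lingual vectors, is the basis for translating unknown words between languages.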



  1. Sequence Learning for Language Understanding
     Presenter: Quoc V. Le, Google
     Thanks: Andrew Dai, Jeff Dean, Matthieu Devin, Geoff Hinton, Thang Luong, Rajat Monga, Ilya Sutskever, Oriol Vinyals
  2. Sequence Learning
     Typical success of Machine Learning: mapping a fixed-length input to a single label:
     - Image recognition (pixels -> "cat")
     - Speech recognition (waveforms -> the utterance of "cat")
     Many language understanding problems require mapping sequences to sequences:
     - Machine Translation ("I love music" -> "J'aime la musique")
     Quoc V. Le
  4. How does Machine Translation work?
     - Use a dictionary to translate one word at a time.
     - Use a model to reorder the words so that the sentence looks reasonable.
     Lots of rules:
     - Phrases instead of words ("New York" should not be translated as "New" + "York")
     - The meaning of a word depends on its context.
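The dictionary-plus-rules pipeline this slide describes can be sketched as follows. Everything here is a hypothetical toy: the dictionaries are invented, and a real system would add many more rules. Note that even on the talk's own example the word-by-word output is "je aime la musique" rather than the correct "j'aime la musique", which is exactly the context problem the slide points out.

```python
# Toy word-by-word English -> French translation, illustrating the slide's
# point: a plain dictionary mishandles multi-word phrases like "New York".
word_dict = {"i": "je", "love": "aime", "music": "la musique",
             "new": "nouveau", "york": "york"}
phrase_dict = {"new york": "new york"}  # phrases kept as whole units

def translate(sentence):
    words = sentence.lower().split()
    out, i = [], 0
    while i < len(words):
        # Prefer a matching two-word phrase before falling back to words.
        bigram = " ".join(words[i:i + 2])
        if bigram in phrase_dict:
            out.append(phrase_dict[bigram])
            i += 2
        else:
            out.append(word_dict.get(words[i], words[i]))
            i += 1
    return " ".join(out)

print(translate("I love music"))  # -> "je aime la musique" (awkward: context ignored)
print(translate("New York"))      # phrase match: "new york", not "nouveau york"
```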
  5. Idea: Sequence Learning
     - Use a Recurrent Neural Net encoder to map an input sequence to a vector.
     - Use a Recurrent Neural Net decoder to map the vector to another sequence.
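The encoder-decoder idea above can be sketched in NumPy. This is a minimal sketch of the data flow only: the weights are random and untrained, the vocabulary size and hidden size are assumed values, and a real system (such as the LSTM model in the paper) would learn these parameters from data.

```python
import numpy as np

rng = np.random.default_rng(0)
H, V = 8, 5          # hidden size and toy vocabulary size (assumed values)
EOS = V - 1          # reserve the last token id as <EOS>

# Random, untrained toy parameters -- this shows data flow, not learning.
W_xh = rng.normal(scale=0.1, size=(H, V))   # input-to-hidden
W_hh = rng.normal(scale=0.1, size=(H, H))   # hidden-to-hidden
W_hy = rng.normal(scale=0.1, size=(V, H))   # hidden-to-output

def one_hot(i):
    v = np.zeros(V)
    v[i] = 1.0
    return v

def encode(tokens):
    """Encoder RNN: fold the whole input sequence into one vector h."""
    h = np.zeros(H)
    for t in tokens:
        h = np.tanh(W_xh @ one_hot(t) + W_hh @ h)
    return h

def decode(h, max_len=10):
    """Decoder RNN: start from the encoder's vector and feed each
    output token back in as the next input (greedy decoding)."""
    out, prev = [], EOS
    for _ in range(max_len):
        h = np.tanh(W_xh @ one_hot(prev) + W_hh @ h)
        prev = int(np.argmax(W_hy @ h))
        if prev == EOS:
            break
        out.append(prev)
    return out

h = encode([0, 1, 2])   # e.g. the sequence "A B C" as token ids
print(decode(h))        # some token-id sequence; meaningless until trained
```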
  6. Sequence Learning
     [Diagram: example network that maps ABC -> WXYZ. Inputs: A B C <EOS> W X Y Z; outputs: W X Y Z <EOS>.]
     At test time, feed the output back into the decoder as the input.
     For better output sequences, generate many candidates and feed each candidate to the decoder to maintain a beam of possible sequences; use "beam search" to find the top sequences.
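The beam-search decoding described on this slide can be sketched as follows. The next-token scores here come from an invented lookup table, purely for illustration; in the actual system they would come from the decoder RNN.

```python
import math

# Invented next-token log-probabilities standing in for the decoder RNN.
# The score depends only on the last token, to keep the toy model simple.
def next_log_probs(prefix):
    table = {
        "<s>": {"W": math.log(0.6), "X": math.log(0.4)},
        "W":   {"X": math.log(0.7), "</s>": math.log(0.3)},
        "X":   {"Y": math.log(0.8), "</s>": math.log(0.2)},
        "Y":   {"Z": math.log(0.9), "</s>": math.log(0.1)},
        "Z":   {"</s>": math.log(1.0)},
    }
    return table[prefix[-1]]

def beam_search(beam_size=2, max_len=6):
    beams = [(["<s>"], 0.0)]                        # (prefix, total log-prob)
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            if prefix[-1] == "</s>":
                candidates.append((prefix, score))  # finished beam carries over
                continue
            for tok, lp in next_log_probs(prefix).items():
                candidates.append((prefix + [tok], score + lp))
        # Keep only the top-scoring prefixes.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
        if all(p[-1] == "</s>" for p, _ in beams):
            break
    return beams

best, score = beam_search()[0]
print(" ".join(best))   # -> "<s> W X Y Z </s>" under this toy model
```

With greedy decoding only the single best token is kept at each step; the beam keeps several live prefixes, which is how the network in the slide can recover the full sequence W X Y Z.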
  8. A machine translation experiment
     WMT'2014 (small in comparison to Google's data):
     - State of the art (a combination of many methods that took 20 years to develop): 37 (BLEU)
     - Our method (took 3 person-years): 37 (BLEU)
     This is an important achievement because it is a new way to represent input and output texts, and a potential breakthrough in many other areas of language understanding.
  9. Sequence Learning
     [Diagram: inputs A B C <EOS> W X Y Z; outputs W X Y Z <EOS>.]
  10. Contact: Quoc V. Le (qvl@google.com), Ilya Sutskever (ilyasu@google.com), Oriol Vinyals (vinyals@google.com), Minh-Thang Luong (lmthang@cs.stanford.edu)
      Papers: "Sequence to Sequence Learning with Neural Networks"; "Addressing the Rare Word Problem in Neural Machine Translation"; upcoming NIPS paper
