Quoc Le, slides, MLconf 11/15/13

2. Deep Learning
•  Google is using Machine Learning
•  Machine Learning is difficult
•  Requires domain knowledge from human experts
Deep Learning:
•  Great performance on many problems
•  Works well with a large amount of data
•  Requires less domain knowledge
Focus:
•  Scale deep learning to bigger models and bigger problems
Quoc V. Le

4. What is Deep Learning?

5. What is Deep Learning?
Network, bottom to top: input x (images, audio, texts, etc.); u = g(A x) with weights A; v = g(B u) with weights B; …
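
The two equations on this slide, u = g(A x) and v = g(B u), can be sketched as a small forward pass. This is only an illustration: the slide does not say which nonlinearity g is, so the logistic sigmoid is assumed here, and the weights in A and B and the input x are made-up toy values.

```python
import math

def g(z):
    """Element-wise logistic sigmoid (an assumed choice for the nonlinearity g)."""
    return [1.0 / (1.0 + math.exp(-zi)) for zi in z]

def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

def forward(A, B, x):
    """Compute the two layers from the slide: u = g(A x), then v = g(B u)."""
    u = g(matvec(A, x))
    v = g(matvec(B, u))
    return u, v

# Toy weights and input, invented for illustration only.
A = [[0.5, -0.5], [1.0, 1.0], [0.0, 2.0]]  # 3 x 2: maps x to 3 features
B = [[1.0, 0.0, -1.0]]                     # 1 x 3: maps u to 1 feature
x = [1.0, 1.0]

u, v = forward(A, B, x)
print(u, v)
```

Deeper networks repeat the same pattern, feeding each layer's output into the next.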

7. High-level features by Deep Learning
Layers, bottom to top: Pixels; Edge detectors; …; Face detector, Cat detector

8. Google’s DistBelief
Goal: Train deep learning on many machines
Model: A multi-layered architecture
Forward pass to compute the features
Backward pass to compute the gradient
Diagram labels: Model, Training Data

9. Model partition with DistBelief
DistBelief distributes a model across multiple machines and multiple cores.
Diagram labels: Model, Machine (Model Partition), Training Data

10. Model partition with DistBelief
DistBelief distributes a model across multiple machines and cores.
Diagram labels: Model, Machine (Model Partition), Core, Training Data
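
One way to picture model partitioning is to split the rows of a layer's weight matrix across machines, so that each machine computes only its own slice of the layer's output. The sketch below runs in a single process and is not DistBelief's actual partitioning scheme; `partition_rows` and `partitioned_matvec` are hypothetical helper names, and the matrix is toy data.

```python
def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

def partition_rows(M, n_parts):
    """Split the rows of M into n_parts contiguous blocks (sizes differ by at most one)."""
    base, extra = divmod(len(M), n_parts)
    blocks, start = [], 0
    for i in range(n_parts):
        size = base + (1 if i < extra else 0)
        blocks.append(M[start:start + size])
        start += size
    return blocks

def partitioned_matvec(M, x, n_parts):
    """Each simulated machine multiplies its own row block; the slices are concatenated."""
    out = []
    for block in partition_rows(M, n_parts):  # in a real system these run in parallel
        out.extend(matvec(block, x))
    return out

# Toy 5 x 3 weight matrix and input, invented for illustration only.
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]]
x = [1, 0, -1]

full = matvec(A, x)
split = partitioned_matvec(A, x, 2)
print(full == split)
```

The partitioned result matches the unpartitioned one because each output element depends on a single row of the matrix.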

11. Model partition with DistBelief
Stochastic Gradient Descent (SGD)
Model parameters are partitioned
Can use up to 1000 cores
Diagram labels: Model, Training Data
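
As a reminder of what a single-machine SGD loop does, here is a minimal sketch on a toy problem. It assumes a one-dimensional linear model with squared-error loss rather than a deep network, and the data, learning rate, and function name `sgd_linear` are all invented for illustration. The forward pass computes the prediction, the backward pass computes the gradient, and the parameters move a small step against it.

```python
import random

def sgd_linear(data, lr=0.05, epochs=200, seed=0):
    """Fit y = w * x + b by SGD on the loss 0.5 * (prediction - y) ** 2."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)             # visit the examples in a random order
        for x, y in data:
            pred = w * x + b          # forward pass
            err = pred - y
            grad_w = err * x          # backward pass: d(loss)/dw
            grad_b = err              # backward pass: d(loss)/db
            w -= lr * grad_w          # parameter update
            b -= lr * grad_b
    return w, b

# Noise-free toy data generated from y = 2x + 1.
data = [(x, 2.0 * x + 1.0) for x in [-2.0, -1.0, 0.0, 1.0, 2.0]]
w, b = sgd_linear(data)
print(w, b)
```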

12. Model partition with DistBelief
But training is still slow on large data sets.
Can we add more parallelism?
Idea: Train multiple models on different partitions of the data, and merge them.
Diagram labels: Model, Training Data

13. Data partition with DistBelief
Diagram: Model Workers, each reading its own Data Shard, send updates ∆p to the Parameter Server; the Parameter Server computes p’ = p + ∆p and returns p’ to the workers.
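
The update rule p’ = p + ∆p can be sketched with a toy parameter server. This is a single-process simulation under stated assumptions, not DistBelief's implementation: the class `ParameterServer`, the helper `worker_delta`, the linear model, and the data are all hypothetical. Staleness is imitated by letting every worker read the same snapshot of p before any of them pushes its update.

```python
class ParameterServer:
    """Holds the shared parameters p and applies p' = p + delta_p."""
    def __init__(self, params):
        self.params = list(params)

    def fetch(self):
        return list(self.params)      # workers receive a copy of p

    def push(self, delta):
        self.params = [p + d for p, d in zip(self.params, delta)]

def worker_delta(params, shard, lr):
    """One worker's update: a gradient step for y = w * x + b on its own data shard."""
    w, b = params
    grad_w = grad_b = 0.0
    for x, y in shard:
        err = (w * x + b) - y
        grad_w += err * x
        grad_b += err
    n = len(shard)
    return [-lr * grad_w / n, -lr * grad_b / n]

def train(shards, rounds=200, lr=0.1):
    server = ParameterServer([0.0, 0.0])
    for _ in range(rounds):
        # Every worker fetches first, so later pushes are based on stale parameters.
        snapshots = [server.fetch() for _ in shards]
        for snapshot, shard in zip(snapshots, shards):
            server.push(worker_delta(snapshot, shard, lr))
    return server.params

# Noise-free toy data from y = 2x + 1, split into three data shards.
points = [(x, 2.0 * x + 1.0) for x in [-1.5, -1.0, -0.5, 0.5, 1.0, 1.5]]
shards = [points[0:2], points[2:4], points[4:6]]
w, b = train(shards)
print(w, b)
```

In a real deployment the workers would run concurrently on separate machines and push at different times; the loop above only mimics the effect of updates computed from slightly outdated parameters.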

14. Parallelism in DistBelief
Model parallelism via model partitioning
Data parallelism via data partitioning and asynchronous communication
DistBelief can scale to a billion examples and use 100,000 cores or more
Thanks to its speed, DistBelief dramatically improves many applications

15. Applications: Voice Search, Photo Search, Text Understanding

16. Voice Search
Network, bottom to top: Speech frame; Hidden layers with 1000s of nodes; Classifier; label

17. Voice Search

18. Applications: Voice Search, Photo Search, Text Understanding

19. Photo Search

20. Cat detector
Front page of the New York Times

21. Seat-belt, Archery, Boston rocker, Shredder

22. Face, Amusement Park, Hammock

23. Google+ PhotoSearch

24. Applications: Voice Search, Photo Search, Text Understanding

25. Text understanding
Very useful but also difficult
We should try to understand the meaning of words
Deep Learning can learn the meaning of words

26. Text understanding
~100-D vector space
Words plotted as points: Clinton, Paris, Obama, whale, dolphin
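
The idea that related words sit close together in the vector space can be sketched with cosine similarity. The vectors below are invented 3-D toy values, not the learned ~100-D vectors from the talk, and `nearest` is a hypothetical helper name.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(word, vectors):
    """Return the other word whose vector has the highest cosine similarity to `word`."""
    others = [w for w in vectors if w != word]
    return max(others, key=lambda w: cosine(vectors[word], vectors[w]))

# Made-up toy vectors; real word vectors would be learned from text.
vectors = {
    "Clinton": [0.9, 0.1, 0.0],
    "Obama":   [0.8, 0.2, 0.1],
    "Paris":   [0.1, 0.9, 0.0],
    "whale":   [0.0, 0.1, 0.9],
    "dolphin": [0.1, 0.0, 0.8],
}

print(nearest("whale", vectors))
print(nearest("Clinton", vectors))
```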

27. Predicting the next word in a sentence
Input words: the, cat, sat, on, the
Network, bottom to top: Word Matrix E applied to each input word; Hidden Layers; Classifier
E is a matrix of dimension |Vocab| x d
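
The architecture on this slide can be sketched as a forward pass: look up each context word's row in E, concatenate the rows, pass them through a hidden layer, and let a softmax classifier score every vocabulary word. Everything concrete below is an assumption for illustration: the five-word vocabulary, the sizes, the tanh hidden layer, and the random untrained weights, so the output probabilities carry no meaning.

```python
import math
import random

def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

def softmax(z):
    """Turn scores into probabilities (max is subtracted for numerical stability)."""
    m = max(z)
    exps = [math.exp(zi - m) for zi in z]
    total = sum(exps)
    return [e / total for e in exps]

def make_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

vocab = ["the", "cat", "sat", "on", "mat"]       # toy vocabulary
index = {word: i for i, word in enumerate(vocab)}
d, context, hidden = 4, 5, 8                     # assumed sizes

rng = random.Random(0)
E = make_matrix(len(vocab), d, rng)              # word matrix, |Vocab| x d
W_h = make_matrix(hidden, context * d, rng)      # hidden layer weights
W_out = make_matrix(len(vocab), hidden, rng)     # classifier weights

def next_word_probs(words):
    """Probability of each vocabulary word following the given context words."""
    x = [value for w in words for value in E[index[w]]]  # concatenate embeddings
    h = [math.tanh(a) for a in matvec(W_h, x)]
    return softmax(matvec(W_out, h))

probs = next_word_probs(["the", "cat", "sat", "on", "the"])
print(dict(zip(vocab, probs)))
```

Training would adjust E, W_h, and W_out together, which is how the rows of E come to encode word meaning.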

28. Visualizing the word vectors
•  Example nearest neighbors trained on Google News
Words shown: apple, Apple, iPhone

29. Relation Extraction
Mikolov, Sutskever, Le. Learning the Meaning behind Words. Google Open Source Blog, 2013.

30. Machine Translation

31. Summary
Model partition, Data partition
Voice Search, Photo Search, Text Understanding

32. Joint work with: Kai Chen, Greg Corrado, Rajat Monga, Andrew Ng, Jeff Dean, Matthieu Devin, Paul Tucker, Ke Yang
Additional Thanks: Samy Bengio, Tom Dean, Josh Levenberg, Geoff Hinton, Tomas Mikolov, Mark Mao, Patrick Nguyen, Marc’Aurelio Ranzato, Mark Segal, Jon Shlens, Ilya Sutskever, Vincent Vanhoucke
