
Deep Learning, an interactive introduction for NLP-ers


Published 22 January 2015
Deep Learning intro for NLP Meetup Stockholm
http://www.meetup.com/Stockholm-Natural-Language-Processing-Meetup/events/219787462/


  1. @graphific Roelof Pieters. Introduction to Deep Learning for NLP. 22 January 2015, Stockholm Natural Language Processing Meetup (Feeda). Slides at: http://www.slideshare.net/roelofp/220115dlmeetup
  2. Deep Learning ???
  3. A couple of headlines… [all November ’14]
  4. (source: Google Trends)
  5. Machine Learning ?? - Audience Check -
  6. Deep Learning ?? • “Brain” inspired / simulations • vision: make learning algorithms better and easier to use • goal: revolutions in (practical) advances for machine learning and AI • Deep Learning = subfield of Machine Learning
  7. Biological Inspiration
  8. Deep Learning ??
  9. DL: Impact. Speech Recognition
  10. DL: Impact. Deep Learning for the win! A few examples: • IJCNN 2011 Traffic Sign Recognition Competition • ISBI 2012 Segmentation of neuronal structures in EM stacks challenge • ICDAR 2011 Chinese handwriting recognition
  11. Machine Learning ?? • Deals with the “construction and study of systems that can learn from data” • “A computer program is said to learn from experience (E) with respect to some class of tasks (T) and performance measure (P), if its performance at tasks in T, as measured by P, improves with experience E” (T. Mitchell, 1997)
  12. Machine Learning ?? Traditional Programming: Data + Program → Output. Machine Learning: Data + Output → Program.
  13. Types of Learning • Supervised (inductive) learning: training data includes desired outputs • Unsupervised learning: training data does not include desired outputs • Semi-supervised learning: training data includes a few desired outputs • Reinforcement learning: rewards from a sequence of actions
  14. ML: Traditional Approach. For each new problem/question: 1. Gather as much LABELED data as you can get 2. Throw some algorithms at it (mainly put in an SVM and keep it at that) 3. If you actually have tried more algos: pick the best 4. Spend hours hand-engineering some features / feature selection / dimensionality reduction (PCA, SVD, etc.) 5. Repeat…
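That traditional recipe, sketched with scikit-learn (a minimal illustration of mine, not from the talk; the toy sentences, labels, and feature choices are placeholders):

```python
# Classic ML-for-NLP: hand-engineered features -> dim. reduction -> SVM
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

texts = ["fruit flies like a banana", "time flies like an arrow",
         "the students went to class", "plays well with others"]
labels = [0, 0, 1, 1]                      # step 1: labeled data

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),   # step 4: hand-chosen features
    TruncatedSVD(n_components=2),          # step 4: SVD dim. reduction
    LinearSVC(),                           # step 2: "put in an SVM"
)
model.fit(texts, labels)
print(model.predict(["students like a banana"]))
```

Every new problem means redoing the feature engineering by hand, which is exactly the pain point the deep learning approach removes.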
  15. Machine Learning for NLP. Classic approach: Data is fed into a Learning Algorithm
  16. Machine Learning for NLP. Some of the (many) treebank datasets. source: http://www-nlp.stanford.edu/links/statnlp.html#Treebanks
  17. Penn Treebank. That’s a lot of “manual” work:
  18. Penn Treebank. With a lot of issues, e.g. several possible tag sequences per sentence: • the students went to class: DT NN VB P NN • plays well with others: VB ADV P NN / NN NN P DT • fruit flies like a banana: NN NN VB DT NN / NN VB P DT NN / NN NN P DT NN / NN VB VB DT NN
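To see that ambiguity concretely, NLTK's off-the-shelf tagger can be run on the slide's sentences; it silently commits to one reading per sentence, which is exactly the decision treebank annotators had to make by hand (an illustration of mine, not from the talk):

```python
# Tag the slide's ambiguous sentences with NLTK's default POS tagger.
import nltk
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

for sent in ["the students went to class",
             "plays well with others",
             "fruit flies like a banana"]:
    # pos_tag returns ONE tag sequence, hiding the alternatives above
    print(nltk.pos_tag(nltk.word_tokenize(sent)))
```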
  19. Machine Learning for NLP. Data → “Features” → Learning Algorithm → Prediction/Classifier (train set / test set)
  20. Machine Learning for NLP. “Features” → Learning Algorithm → Prediction/Classifier (train set / test set)
  21. Machine Learning for NLP • Until the early 1990s, NLP systems were built manually with hand-crafted dictionaries and rules. • As large electronic text corpora became increasingly available, researchers began using machine learning techniques to automatically build NLP systems. • Today, the vast majority of NLP systems use machine learning.
  22. 2. Neural Networks, and a short history lesson
  23. Perceptron (1957). Frank Rosenblatt (1928-1971). Original Perceptron / Simplified model. (From Perceptrons by M. L. Minsky and S. Papert, 1969, Cambridge, MA: MIT Press. Copyright 1969 by MIT Press.)
  24. Perceptron (1957). Perceptron Research, YouTube clip: https://www.youtube.com/watch?v=cNxadbrN_aI&feature=youtu.be&t=12
  25. Perceptron (1957)
  26. Perceptron, or Multilayer Perceptron (1986): inputs, weights, bias, activation
  27. Neuron Model. All you need to know:
  28. Activation functions
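In symbols, a neuron computes a = f(w·x + b): weighted inputs plus a bias, passed through a nonlinear activation f. A minimal NumPy sketch with three common activation functions (illustrative only, not the talk's code):

```python
import numpy as np

# Three common activation functions
def sigmoid(z): return 1.0 / (1.0 + np.exp(-z))
def tanh(z):    return np.tanh(z)
def relu(z):    return np.maximum(0.0, z)

def neuron(x, w, b, f=sigmoid):
    """One artificial neuron: weighted sum of inputs plus bias,
    passed through a nonlinear activation f."""
    return f(np.dot(w, x) + b)

x = np.array([0.5, -1.0, 2.0])   # inputs
w = np.array([0.1, 0.4, -0.2])   # weights
b = 0.3                          # bias
print(neuron(x, w, b))           # the neuron's activation
```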
  29. Backpropagation (1974/1986). 1974: Paul Werbos invents the backpropagation algorithm for NNs. 1986: Backprop popularized by Rumelhart, Hinton, Williams. 1990: renewed interest in NNs
  30. Backprop Renaissance. Forward Propagation • Sum inputs, produce activation, feed forward
  31. Backprop Renaissance. Back Propagation (of error) • Calculate total error at the top • Calculate contributions to error at each step, going backwards
  32. Backpropagation • Compute the gradient of the example-wise loss with respect to the parameters • Simply applying the derivative chain rule wisely • If computing the loss(example, parameters) is O(n) computation, then so is computing the gradient
  33. Simple Chain Rule
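The rule the slide refers to, written out (standard textbook form, not copied from the slide):

```latex
% Scalar chain rule: for z = f(y) with y = g(x),
\frac{dz}{dx} = \frac{dz}{dy} \cdot \frac{dy}{dx}

% Backprop applies it layer by layer: with loss L and hidden
% activations h^{(k)} = f(W^{(k)} h^{(k-1)}), the gradient of one
% layer's weights reuses the error signal computed for the layer above:
\frac{\partial L}{\partial W^{(k)}}
  = \frac{\partial L}{\partial h^{(k)}} \cdot
    \frac{\partial h^{(k)}}{\partial W^{(k)}}
```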
  34. Training procedure. To reiterate: • Initialize randomly • Sequentially give it data • See what the difference is between network output and actual output • Update the weights according to this error • End result: give the model input, and it produces a proper output. Quest for the weights: the weights are the model!
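Slides 30-34 in one runnable piece: a one-hidden-layer network trained on XOR, a minimal NumPy sketch under my own simplifying assumptions (sigmoid units, squared-error loss, full-batch gradient descent):

```python
import numpy as np

rng = np.random.default_rng(0)
def sigmoid(z): return 1.0 / (1.0 + np.exp(-z))

# Toy data: XOR, the classic not-linearly-separable problem
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Initialize randomly (the weights ARE the model)
W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)
lr = 1.0

for step in range(5000):
    # Forward propagation: sum inputs, produce activation, feed forward
    h   = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Error at the top: network output vs actual output
    err = out - y

    # Backpropagate: error contributions at each step, going backwards
    d_out = err * out * (1 - out)
    d_h   = (d_out @ W2.T) * h * (1 - h)

    # Update the weights according to this error
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())   # should approach [0, 1, 1, 0]
```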
  35. So why only now? • Inspired by the architectural depth of the brain, researchers wanted for decades to train deep multi-layer neural networks. • No successful attempts were reported before 2006… Exception: convolutional neural networks (LeCun, 1998). • SVM: Vapnik and his co-workers developed the Support Vector Machine (1993), a shallow architecture. • Breakthrough in 2006!
  36. 2006 Breakthrough • More data • Faster hardware: GPUs, multi-core CPUs • Working ideas on how to train deep architectures
  37.-41. 2006 Breakthrough (build-up slides repeating the same three points, interleaved with illustrations)
  42. 2006 Breakthrough. Stacked Restricted Boltzmann Machines* (RBM): Hinton, G. E., Osindero, S., and Teh, Y. W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18:1527-1554. Stacked Autoencoders (AE): Bengio, Y., Lamblin, P., Popovici, P., Larochelle, H. (2007). Greedy Layer-Wise Training of Deep Networks, Advances in Neural Information Processing Systems 19. (* stacked RBMs are called Deep Belief Networks, DBN)
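The autoencoder half of that recipe in miniature: one layer learns to reconstruct its input through a narrow hidden layer, and training such layers one at a time, then stacking them, was the greedy layer-wise trick. A NumPy sketch under my own simplifying assumptions (tied weights, sigmoid units, full-batch gradient descent):

```python
import numpy as np

rng = np.random.default_rng(1)
def sigmoid(z): return 1.0 / (1.0 + np.exp(-z))

X = rng.random((100, 8))                 # toy data: 100 samples, 8 features

# Encode 8 features down to 3; decode back with tied weights (W and W.T)
W = rng.normal(scale=0.1, size=(8, 3))
b_h, b_o = np.zeros(3), np.zeros(8)
lr = 0.05

for step in range(2000):
    h     = sigmoid(X @ W + b_h)         # encode
    X_hat = sigmoid(h @ W.T + b_o)       # decode (reconstruction)
    err   = X_hat - X                    # reconstruction error

    d_o = err * X_hat * (1 - X_hat)
    d_h = (d_o @ W) * h * (1 - h)
    W   -= lr * (X.T @ d_h + d_o.T @ h)  # tied weights: two gradient terms
    b_h -= lr * d_h.sum(axis=0)
    b_o -= lr * d_o.sum(axis=0)

print(np.mean(err ** 2))                 # falls as training proceeds
# h is the learned representation; a second autoencoder trained on h
# would give the next layer of the stack.
```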
  43. 3. Deep Learning, onwards we go…
  44. (illustration slide)
  45. Why go Deep? Hierarchies, Efficient, Generalization, Distributed, Sharing, Unsupervised*, Black Box, Training Time, Major PWNAGE!, Much Data
  46. No More Handcrafted Features!
  47. Deep Learning: Why? “I’ve worked all my life in Machine Learning, and I’ve never seen one algorithm knock over benchmarks like Deep Learning” (Andrew Ng)
  48. Biological Justification. Deep Learning = Brain “inspired”; the audio/visual cortex has multiple stages == hierarchical. “Brainiacs” vs “Pragmatists” • Computational Biology • CVAP • Jorge Dávila-Chacón • “that guy”
  49. Different Levels of Abstraction
  50. Different Levels of Abstraction. Feature Representation. Hierarchical Learning • Natural progression from low-level to high-level structure, as seen in natural complexity
  51.-53. (build-up slides, adding:) • Easier to monitor what is being learnt and to guide the machine to better subspaces • A good lower-level representation can be used for many distinct tasks
  54. Generalizable Learning • Shared Low-Level Representations • Multi-Task Learning • Unsupervised Training
  55. Generalizable Learning (continued) • Partial Feature Sharing • Mixed Mode Learning • Composition of Functions
  56. Classic Deep Architecture: input layer → hidden layers → output layer
  57. Modern Deep Architecture: input layer → hidden layers → output layer
  58. Deep Learning: Why? (again) Beat state of the art in many areas: • Language Modeling (Mikolov et al., 2012) • Image Recognition (Krizhevsky won the 2012 ImageNet competition) • Sentiment Classification (Socher et al., 2011) • Speech Recognition (Dahl et al., 2010) • MNIST hand-written digit recognition (Ciresan et al., 2010)
  59. Deep Learning: Why for NLP? One model rules them all? DL approaches have been successfully applied to: automatic summarization, coreference resolution, discourse analysis, machine translation, morphological segmentation, named entity recognition (NER), natural language generation, natural language understanding, optical character recognition (OCR), part-of-speech tagging, parsing, question answering, relationship extraction, sentence boundary disambiguation, sentiment analysis, speech recognition, speech segmentation, topic segmentation and recognition, word segmentation, word sense disambiguation, information retrieval (IR), information extraction (IE), speech processing
  60. - COFFEE BREAK - After the break we return with: CODE. Download the code samples now from: https://github.com/graphific/DL-Meetup-intro (shortened url: http://goo.gl/abX1E2)
  61. 1. MLP • Deep Neural Network: Multilayer Perceptron (MLP) or Artificial Neural Network (ANN) • Logistic regression • Simple hidden layer • MNIST dataset • Training regime: Stochastic Gradient Descent (SGD) with minibatches
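The talk's actual code is in the repo linked above; for flavor, here is a minimal 2015-era Theano sketch of the logistic-regression part with minibatch SGD (my own reconstruction in the style of the deeplearning.net tutorials, not the talk's code; X_train/y_train are hypothetical NumPy arrays):

```python
import numpy as np
import theano
import theano.tensor as T

n_in, n_out, lr, batch = 784, 10, 0.1, 128   # MNIST: 28x28 -> 10 digits

x = T.matrix('x')     # a minibatch of flattened images
y = T.ivector('y')    # the corresponding integer labels

W = theano.shared(np.zeros((n_in, n_out), dtype=theano.config.floatX))
b = theano.shared(np.zeros(n_out, dtype=theano.config.floatX))

p_y  = T.nnet.softmax(T.dot(x, W) + b)
loss = -T.mean(T.log(p_y)[T.arange(y.shape[0]), y])  # neg. log-likelihood

g_W, g_b = T.grad(loss, [W, b])                      # symbolic gradients
train = theano.function([x, y], loss,
                        updates=[(W, W - lr * g_W), (b, b - lr * g_b)])

# One epoch of minibatch SGD (X_train, y_train are hypothetical arrays):
# for i in range(0, len(X_train), batch):
#     train(X_train[i:i + batch], y_train[i:i + batch])
```

Adding the slide's "simple hidden layer" means inserting h = tanh(x·W1 + b1) before the softmax and taking gradients with respect to both layers' parameters.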
  62. 2. Convolutional Neural Network. From: Krizhevsky, Sutskever, Hinton (2012). ImageNet Classification with Deep Convolutional Neural Networks [breakthrough in object recognition, ImageNet 2012]
  63. Convolutional Neural Network. http://ufldl.stanford.edu/wiki/index.php/Feature_extraction_using_convolution. Movie time: http://www.cs.toronto.edu/~hinton/adi/index.htm
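The core operation, feature extraction by convolution, in a few lines of NumPy (a naive sketch for clarity, not an efficient implementation; strictly speaking this is cross-correlation, as in most deep learning libraries):

```python
import numpy as np

def conv2d(image, kernel):
    """Naive 'valid' convolution: slide the kernel over the image and
    take a weighted sum at each position, producing one feature map."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(36, dtype=float).reshape(6, 6)
edge = np.array([[1.0, -1.0]])   # a crude vertical-edge detector
print(conv2d(image, edge))
```

A convolutional layer learns many such kernels at once, and the same small kernel is reused at every image position, which is what keeps the parameter count low.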
  64. That’s it, no more code! (for now)
  65. Deep Learning: Future Developments. Currently an explosion of developments • Hessian-Free networks (2010) • Long Short-Term Memory (2011) • Large convolutional nets, max-pooling (2011) • Nesterov’s Gradient Descent (2013). Currently state of the art, but… • No way of doing logical inference (extrapolation) • No easy integration of abstract knowledge • Hypothesis-space bias might not conform with reality
  66. Deep Learning: Future Challenges. Szegedy, C., Wojciech, Z., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., Fergus, R. (2013). Intriguing properties of neural networks. [Left: correctly identified. Center: added noise ×10. Right: “Ostrich”]
  67. Wanna Play? • cuda-convnet2 (Alex Krizhevsky, Toronto) (C++/CUDA, optimized for GTX 580) https://code.google.com/p/cuda-convnet2/ • Caffe (Berkeley) (CUDA/OpenCL, Theano, Python) http://caffe.berkeleyvision.org/ • OverFeat (NYU) http://cilvr.nyu.edu/doku.php?id=code:start
  68. Wanna Play? • Theano: CPU/GPU symbolic expression compiler in Python (from the LISA lab at University of Montreal) http://deeplearning.net/software/theano/ • Pylearn2: library designed to make machine learning research easy http://deeplearning.net/software/pylearn2/ • Torch: Matlab-like environment for state-of-the-art machine learning algorithms in Lua (from Ronan Collobert, Clement Farabet and Koray Kavukcuoglu) http://torch.ch/ • More info: http://deeplearning.net/software_links/
  69. In touch! Academic/Research: as PhD candidate, KTH/CSC: “Always interested in discussing Machine Learning, Deep Architectures, Graphs, and Language Technology” roelof@kth.se www.csc.kth.se/~roelof/ Internship / Entrepreneurship: as CIO/CTO, Feeda: “Always looking for additions to our brand new R&D team” [Internships upcoming on the KTH exjobb website…] roelof@feeda.com www.feeda.com
  70. We’re Hiring! Feeda • Dev Ops • Software Developers • Data Scientists roelof@feeda.com www.feeda.com
  71. Thanks for listening. Mingling time!
  72. Can’t get enough? Come to my talk tomorrow (Friday): Visual-Semantic Embeddings: some thoughts on Language. Roelof Pieters, TCS/CSC. Friday Jan 23, 13:30, Room 304, Teknikringen 14, level 3. Description on the KTH website.
  73. Addendum: some of the exciting recent developments in NLP, especially Distributed Semantics
  74. Word Embeddings: Turian (2010). Turian, J., Ratinov, L., Bengio, Y. (2010). Word representations: A simple and general method for semi-supervised learning. code & info: http://metaoptimize.com/projects/wordreprs/
  75. Word Embeddings: Turian (2010), continued
  76. Word Embeddings: Collobert & Weston (2011). Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., Kuksa, P. (2011). Natural Language Processing (almost) from Scratch
  77. Multi-embeddings: Stanford (2012). Huang, E. H., Socher, R., Manning, C. D., Ng, A. Y. (2012). Improving Word Representations via Global Context and Multiple Word Prototypes
  78. Linguistic Regularities: Mikolov (2013). Mikolov, T., Yih, W., & Zweig, G. (2013). Linguistic Regularities in Continuous Space Word Representations. code & info: https://code.google.com/p/word2vec/
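The headline result of this paper is that simple vector arithmetic captures analogies: vec(king) - vec(man) + vec(woman) lands nearest to vec(queen). A toy NumPy sketch of that lookup (the 3-d vectors are made up for illustration; real word2vec embeddings have hundreds of dimensions):

```python
import numpy as np

# Hypothetical toy embeddings, chosen by hand to make the analogy work
emb = {
    "king":  np.array([0.8, 0.9, 0.1]),
    "queen": np.array([0.8, 0.1, 0.9]),
    "man":   np.array([0.2, 0.9, 0.1]),
    "woman": np.array([0.2, 0.1, 0.9]),
}

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

target = emb["king"] - emb["man"] + emb["woman"]   # king - man + woman
best = max((w for w in emb if w not in ("king", "man", "woman")),
           key=lambda w: cos(emb[w], target))
print(best)   # -> queen
```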
  79. Word Embeddings for MT: Mikolov (2013). Mikolov, T., Le, Q. V., Sutskever, I. (2013). Exploiting Similarities among Languages for Machine Translation
  80. Recursive Deep Models & Sentiment: Socher (2013). Socher, R., Perelygin, A., Wu, J., Chuang, J., Manning, C., Ng, A., Potts, C. (2013). Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. EMNLP 2013. code & demo: http://nlp.stanford.edu/sentiment/index.html
  81. Paragraph Vectors: Le & Mikolov (2014). Le, Q., Mikolov, T. (2014). Distributed Representations of Sentences and Documents • add context (sentence, paragraph, document) to word vectors during training. Results on the Stanford Sentiment Treebank dataset:
  82. Global Vectors, GloVe: Stanford (2014). Pennington, J., Socher, R., Manning, C. D. (2014). GloVe: Global Vectors for Word Representation. code & demo: http://nlp.stanford.edu/projects/glove/ Results vs word2vec on the word analogy task: “similar accuracy”
  83. Dependency-based Embeddings: Levy & Goldberg (2014). Levy, O., Goldberg, Y. (2014). Dependency-Based Word Embeddings. code & demo: https://levyomer.wordpress.com/2014/04/25/dependency-based-word-embeddings/ • Syntactic dependency context vs bag-of-words (BoW) context, e.g. “Australian scientist discovers star with telescope” [figure: precision-recall curves comparing the two context types] “Dependency-based embeddings have more functional similarities”
