
Deep Learning for NLP

How to take your first steps in text generation with Deep Learning, using an example case.


  1. Deep Learning for Natural Language Processing (Bargava Subramanian @bargava, Amit Kapoor @amitkaps)
  2. Language Challenge
  3. Put these adjectives in order: [adj.] + [Knife] — old — French — lovely — green — rectangular — whittling — silver — little
  4. Which order is correct?
     — lovely old silver rectangular green little French whittling knife
     — old lovely French rectangular green little whittling silver knife
     — lovely little old rectangular green French silver whittling knife
  5. Grammar has rules: opinion - size - age - shape - colour - origin - material - purpose + [Noun]. The right version: lovely little old rectangular green French silver whittling knife
  6. We speak the grammar, yet we don't know it
  7. Natural Language Problems are hard
  8. Natural Language Processing Problems — Summarization — Text Classification (e.g. spam) — Sentiment / Emotion Analysis — Topic Modelling — Recommendations — Text Evaluation (e.g. grading)
  9. Plan for this Session — Moving beyond Statistical Learning — Take first steps in NLP with Deep Learning — Showcase an example — Practical challenges to overcome
  10. NLP Learning Process
      [1] Frame: Problem definition
      [2] Acquire: Text ingestion
      [3] Refine: Text wrangling
      [4] Transform: Feature creation
      [5] Explore: Feature selection
      [6] Model: Model selection
      [7] Insight: Solution communication
  11. Simple Case: Demonetisation in India
  12. Demonetisation in India: On Nov 8th, 2016, the Government of India announced that existing INR 1000 and INR 500 notes were no longer legal tender.
  14. Reactions on Twitter: People started tweeting with the tag #demonetisation
  15. [1] Frame: Create a viral tweet on #demonetisation
  16. Traditional way of framing:
      1. Someone writes a tweet.
      2. Run it through the classifier.
      3. If the predicted probability is high, post it.
      4. Else, go to step 1.
      The prediction is the probability that a new tweet will go viral.
  17. Generating tweets — Can we learn algorithmically from historical tweets to generate a viral tweet? — Not possible using traditional methods
  18. Revised framing for Text Generation: Generate a tweet algorithmically that is likely to go viral
  19. [2] Acquire: Get the raw tweets data
  20. Get Tweets on #demonetisation: Write your own Twitter API client to fetch the JSON, or use a Python package like Tweepy (you then need to manage rate limiting etc.). We used tweezer, an open-source project for getting Twitter data. Raw dataset: 30,000+ tweets from the past week. A Tweepy sketch follows below.
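For illustration only (the deck itself used tweezer), a minimal acquisition sketch using the Tweepy API of that era; all credentials are placeholders:

```python
import tweepy

# Placeholder credentials -- substitute your own Twitter app keys
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")

# wait_on_rate_limit=True sleeps through rate-limit windows automatically
api = tweepy.API(auth, wait_on_rate_limit=True)

tweets = []
# Cursor paginates the search endpoint (tweepy < 4.0 API; 4.x renames it search_tweets)
for status in tweepy.Cursor(api.search, q="#demonetisation", lang="en").items(1000):
    tweets.append({
        "text": status.text,
        "retweets": status.retweet_count,
        "favourites": status.favorite_count,
    })
```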
  21. [3] Refine: How to categorise a tweet as viral or not?
  22. Simple Approach for Labelling: IF retweets + favourites >= 100 THEN label = viral ELSE label = normal
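The same rule in Python, assuming the tweet dictionaries from the acquisition sketch above:

```python
# Label a tweet "viral" when engagement crosses the threshold of 100
def label(tweet):
    return "viral" if tweet["retweets"] + tweet["favourites"] >= 100 else "normal"
```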
  23. Sanitizing Tweets — Stopword removal — Stemming — Remove URLs — Remove 'RT' — Remove '\n' (newlines)
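A minimal sanitization sketch covering these steps; NLTK is an assumed choice here (its stopwords corpus must be downloaded once), not necessarily what the deck used:

```python
import re
from nltk.corpus import stopwords      # requires nltk.download("stopwords") once
from nltk.stem import PorterStemmer

STOPWORDS = set(stopwords.words("english"))
STEMMER = PorterStemmer()

def sanitize(text):
    text = re.sub(r"https?://\S+", "", text)            # remove URLs
    text = text.replace("RT", " ").replace("\n", " ")   # remove 'RT' and newlines
    tokens = [t.lower() for t in text.split()]
    tokens = [t for t in tokens if t not in STOPWORDS]  # stopword removal
    return " ".join(STEMMER.stem(t) for t in tokens)    # stemming
```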
  24. [4] Transform: Creating Features from Text
  25. Traditional methods to convert text to numeric — TF-IDF: Measures the importance of a word in a document relative to the corpus — Bag-of-Words: Count of occurrences of a word in a document — n-grams: Count of every 1-word, 2-word, etc. combination in a document — Entity & POS tagging: Transform a sentence into parts of speech, extract entities, and encode
  26. Challenges in traditional methods of encoding — Sparse inputs — Input data space explodes — Context lost in encoding: A quiet crowd entered the historic church != A historic crowd entered the quiet church
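A quick sketch of these encodings with scikit-learn (an assumed choice; the deck does not name its tooling here). Note that bag-of-words and TF-IDF give identical vectors for the two sentences above, which is exactly the lost-context problem:

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["a quiet crowd entered the historic church",
        "a historic crowd entered the quiet church"]

bow = CountVectorizer().fit_transform(docs)        # bag-of-words counts
tfidf = TfidfVectorizer().fit_transform(docs)      # TF-IDF weights
bigrams = CountVectorizer(ngram_range=(1, 2)).fit_transform(docs)  # 1- and 2-grams

# Word order is discarded: both sentences get the same bag-of-words row
print((bow.toarray()[0] == bow.toarray()[1]).all())  # True
```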
  27. Deep Learning Approach: Low-dimensional dense vectors for representation — Tokenise characters (faster) — Tokenise words (more accurate, but needs more memory)
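The two tokenisations are one flag apart in Keras (an illustrative sketch; the deck does not prescribe a tokeniser):

```python
from keras.preprocessing.text import Tokenizer

texts = ["people started tweeting with the tag #demonetisation"]

word_tok = Tokenizer()                 # word-level: larger vocabulary, more memory
word_tok.fit_on_texts(texts)

char_tok = Tokenizer(char_level=True)  # character-level: tiny vocabulary, faster
char_tok.fit_on_texts(texts)

# Compare vocabulary sizes of the two schemes
print(len(word_tok.word_index), len(char_tok.word_index))
```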
  28. Word Embedding — Learn high-quality word vectors — Similar words need to be close to each other — Words can have multiple degrees of similarity
  29. Word Embedding using word2vec, which offers two training approaches — skip-gram: predicting the context given a word — continuous bag-of-words: predicting a word given its context
  30. word2vec example: vec[queen] − vec[king] = vec[woman] − vec[man] [1]
      [1] https://www.tensorflow.org/versions/r0.12/tutorials/word2vec/index.html
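Training such an embedding on the tweet corpus takes a few lines with gensim (a sketch using the gensim API of that era; the toy sentences are placeholders, and the analogy query only holds with a model trained on a much larger corpus):

```python
from gensim.models import Word2Vec

# sentences: token lists produced by the sanitization step (toy examples here)
sentences = [["demonetisation", "queue", "bank", "cash"],
             ["note", "ban", "atm", "cash"]]

# sg=1 selects skip-gram, sg=0 continuous bag-of-words; size is the vector dimension
model = Word2Vec(sentences, size=100, window=5, min_count=1, sg=1)

# Analogy arithmetic from the slide (meaningful only with a large training corpus):
# model.wv.most_similar(positive=["king", "woman"], negative=["man"])
```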
  31. [5] Explore: Feature Selection
  32. Feature Selection — A manual process in the traditional approach — Happens automatically in Deep Learning
  33. [6] Model: Model Selection
  34. Recurrent Neural Network (RNN) — Network with loops — Allows information to persist — Enables connecting previous information to the present task — Context preserved: "I grew up in Brazil and I speak ______." (Portuguese)
  35. Unrolling over Time — Think sequences, in input & output — Recognize an image -> explain it in words — Sentence(s) -> sentiment analysis — English -> Spanish translation — Video -> task classification
  36. Unrolled RNN — Multiple copies of the same network — Each passes a message to its successor [2]
      [2] http://colah.github.io/posts/2015-08-Understanding-LSTMs/
  37. Architecture Overview
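The architecture appears as a diagram on the slide. As a minimal sketch, a character-level language model of the kind described above looks roughly like this in Keras (layer and vocabulary sizes are assumptions following the standard char-RNN recipe, not the deck's exact settings):

```python
from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation

MAXLEN, N_CHARS = 40, 60   # assumed context window and character-vocabulary size

# Given MAXLEN one-hot-encoded characters, predict the next character
model = Sequential()
model.add(LSTM(128, input_shape=(MAXLEN, N_CHARS)))
model.add(Dense(N_CHARS))
model.add(Activation("softmax"))
model.compile(loss="categorical_crossentropy", optimizer="adam")
```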
  38. [7] Insight: Solution Communication
  39. Generated Tweets
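The slide shows sample generated tweets. Generation itself is a sampling loop over the trained model; a sketch continuing from the Keras model above (char_to_ix and ix_to_char are assumed character-index lookups built during tokenisation):

```python
import numpy as np

def sample(preds, temperature=1.0):
    # Reweight the softmax output by temperature and draw the next character index
    preds = np.log(np.asarray(preds, dtype="float64") + 1e-10) / temperature
    probs = np.exp(preds) / np.sum(np.exp(preds))
    return int(np.argmax(np.random.multinomial(1, probs, 1)))

def generate(seed, length=140):
    # Assumed from training: model, MAXLEN, N_CHARS, char_to_ix, ix_to_char
    text = seed
    for _ in range(length - len(seed)):
        # One-hot encode the last MAXLEN characters of the running text
        x = np.zeros((1, MAXLEN, N_CHARS))
        for t, ch in enumerate(text[-MAXLEN:]):
            x[0, t, char_to_ix[ch]] = 1.0
        next_ix = sample(model.predict(x)[0])
        text += ix_to_char[next_ix]
    return text
```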
  40. Deep Learning Challenges — Data size: RNNs don't generalize well on small datasets — Relevant corpus: required to create domain-specific word embeddings — Deeper networks: empirically, deeper networks have better accuracy — Training time: RNNs take a long time to train
  41. Use case: Chat Bots — Bookings — Customer Support — Help Desk Automation — ...
  42. Tools to get started: Software — Python stack: use spacy for NLP preprocessing, gensim for word2vec training; start with keras, with tensorflow as the backend — Use pre-trained models (e.g. word2vec for word embeddings), and likewise for RNNs; a loading sketch follows below
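Loading a pre-trained embedding is one call in gensim; the GoogleNews vectors file here is an assumed, commonly used choice (a large download), not one the deck names:

```python
from gensim.models import KeyedVectors

# Assumed pre-trained model: the GoogleNews word2vec vectors (300 dimensions)
w2v = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

print(w2v.most_similar("bank")[:3])   # sanity-check the embedding
```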
  43. Tools to get started: Hardware — Work on GPUs: Nvidia TitanX (suitable for consumers), Tesla K80 (suitable for professionals) — For detailed hardware choices: http://timdettmers.com/2015/03/09/deep-learning-hardware-guide/
  44. Closing thoughts
  45. Reference: Deep Learning for NLP — Notebooks and material @ https://github.com/rouseguy/DeepLearningNLP_Py — What is deep learning? — Motivation: some use cases — Building blocks of Neural Networks (Neuron, Activation Function) — Backpropagation Algorithm — Word Embedding — word2vec — Introduction to keras — Multi-layer perceptron — Convolutional Neural Network — Recurrent Neural Network — Challenges in Deep Learning
  46. Contact: Bargava Subramanian @bargava — Amit Kapoor @amitkaps — amitkaps.com
