
Deep learning for product title summarization

Joan Xiao, Lead Machine Learning Scientist, Figure Eight

Published in: Technology

  1. Deep Learning for Product Title Summarization
     Joan Xiao, Lead Machine Learning Scientist, Figure Eight
     Nov. 14, 2018
  2. Our Mission
     Figure Eight is the essential Human-in-the-Loop AI platform for data science & machine learning teams. Our software platform trains, tests, and tunes machine learning models to make AI work in the real world.
     Proprietary and Confidential - Do Not Distribute | © 2018 Figure Eight. All Rights Reserved.
  3. Agenda
     • Motivation
     • Two deep learning-based approaches
     • Our implementations
     • Results
  4. Product Title Summarization: Motivation
  7. Need Short Titles!
     Original Title: KROSER Laptop Backpack Computer Backpack School Backpack Casual Daypack Water-Repellent Laptop Bag with USB Charging Port for Travel/Business/College/Women/Men Grey
     Short Title: Kroser Laptop Backpack, Grey
  8. Product Title Summarization: Approach #1
  9. Original Title: KROSER Laptop Backpack Computer Backpack School Backpack Casual Daypack Water-Repellent Laptop Bag with USB Charging Port for Travel/Business/College/Women/Men Grey
     Short Title: Kroser Laptop Backpack, Grey
  10. Original Title: KROSER Laptop Backpack Computer Backpack School Backpack Casual Daypack Water-Repellent Laptop Bag with USB Charging Port for Travel/Business/College/Women/Men Grey
      Short Title: Kroser Laptop Backpack, Grey
      Entities: Kroser (Brand), Laptop Backpack (Function), Grey (Variation)
  11. Named Entity Recognition
      Named entity recognition (NER, also known as entity chunking and entity extraction):
      - A subtask of information extraction
      - Locates named entities in text and classifies them into pre-defined categories
      - Common categories: persons, organizations, locations, quantities, monetary values, percentages, etc.
  12. NER Example
  13. NER – Traditional Approaches
      - Rule-based: hand-crafted, grammar-based linguistic rules
      - Supervised Learning
        • Decision Trees
        • Maximum Entropy Models
        • Support Vector Machines
        • Hidden Markov Models
        • Conditional Random Fields
  14. NER – Open Source APIs
      - NLTK
      - Stanford NLP
      - OpenNLP
      - spaCy
  15. CNN
      Kim, Y. (2014). Convolutional Neural Networks for Sentence Classification
  16. RNN
  17. RNN
      http://colah.github.io/posts/2015-08-Understanding-LSTMs/
  18. RNN
      Unfolded BLSTM architecture with 3 consecutive steps. From Cui et al. (2017)
      http://colah.github.io/posts/2015-08-Understanding-LSTMs/
  19. Encoder-decoder RNN
      http://www.wildml.com/2016/01/attention-and-memory-in-deep-learning-and-nlp/
  20. Encoder-decoder RNN with Attention
      From Bahdanau et al. (2015), Neural Machine Translation by Jointly Learning to Align and Translate
  21. https://ai.googleblog.com/2016/09/a-neural-network-for-machine.html
  22. Attention Is All You Need
      Vaswani et al., 2017
  23. Pre-trained language models
      • ELMo
      • ULMFiT
      • OpenAI Transformer
      • BERT
  24. Deep Learning Results on CoNLL 2003
      Year  Model Architecture         Author(s)         F1
      2018  BiLSTM-CRF+Flair           Akbik et al.      93.09
      2018  BERT Large                 Devlin et al.     92.8
      2018  CVT + Multi Task Learning  Clark et al.      92.6
      2018  BERT Base                  Devlin et al.     92.4
      2018  BiLSTM-CRF+ELMo            Peters et al.     92.22
      2017  GRU-GRU-CRF                Yang et al.       91.26
      2016  BiLSTM-CNN-CRF             Ma and Hovy       91.21
      2016  LSTM-LSTM-CRF              Lample et al.     90.94
      …
      2011  CNN-CRF                    Collobert et al.  88.67
      https://github.com/sebastianruder/NLP-progress
  25. Product Title Summarization: Approach #2
  26. Automatic Text Summarization
      - The task of producing a concise and fluent summary while preserving key information content and overall meaning.
  27. Two Types of Text Summarization
      - Extractive Summarization
        • Extracts key sections of the text and composes a summary without modifying the original text.
      - Abstractive Summarization
        • Generates a new, shorter text that conveys the most critical information from the original text.
  28. Text Summarization Example
      - Original Text: Alice and Bob took the train to visit the zoo. They saw a baby giraffe, a lion, and a flock of colorful tropical birds.
      - Extractive Summary: Alice and Bob visit the zoo. saw a flock of birds.
      - Abstractive Summary: Alice and Bob visited the zoo and saw animals and birds.
      https://ai.googleblog.com/2016/08/text-summarization-with-tensorflow.html
  29. Automatic Text Summarization Evaluation Methods
      - Human Evaluation
      - Recall-Oriented Understudy for Gisting Evaluation (ROUGE)
        • Compares a candidate summary to a human (reference) summary.
        • ROUGE-n: based on comparison of n-grams, where n is 1, 2, 3, etc. Defined as the number of n-grams common to the candidate and reference summaries, divided by the number of n-grams in the reference summary.
        • ROUGE-L: based on the longest common subsequence (LCS) between the candidate and reference summaries.
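The ROUGE-n and ROUGE-L definitions on slide 29 can be sketched in a few lines of Python. This is an illustration only, not the scoring code used for the talk: lowercasing, whitespace tokenization, and reporting ROUGE-L as LCS length divided by reference length (a recall-style variant) are all assumptions, and the function names are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """ROUGE-n as on the slide: common n-grams / n-grams in the reference."""
    cand = ngrams(candidate.lower().split(), n)  # assumed tokenization
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    return overlap / total


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(candidate, reference):
    """Recall-style ROUGE-L (an assumption): LCS length / reference length."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return lcs_length(cand, ref) / len(ref) if ref else 0.0
```

For a reference of "kroser laptop backpack grey" and a candidate of "kroser laptop bag grey", three of the four reference unigrams and one of the three reference bigrams are matched, and the longest common subsequence has length three.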
  30. Deep Learning Results on Gigaword (Abstractive Text Summarization)
      Year  Model                                                          ROUGE-1  ROUGE-2  ROUGE-L
      2018  Re^3 Sum (Cao et al.)                                          37.04    19.03    34.46
      2018  CGU (Lin et al.)                                               36.3     18.0     33.8
      2018  Pointer + Coverage + EntailmentGen + QuestionGen (Guo et al.)  35.98    17.76    33.63
      2016  words-lvt5k-1sent (Nallapati et al.)                           36.4     17.7     33.71
      2018  Struct+2Way+Word (Song et al.)                                 35.47    17.66    33.52
      https://github.com/sebastianruder/NLP-progress
  31. Product Title Summarization: Implementation
  32. Implementation Details (NER)
      - A dataset of about 56K product titles
      - Obtain entity labels for NER
      - Train NER model
      - Predict entities for a title
      - Compose a short title from predicted entities
      Output: KROSER Laptop Backpack, Grey
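The last step on slide 32, composing a short title from predicted entities, might look like the sketch below. The slides do not give the tagging scheme or the composition rule, so everything here is an assumption: BIO-style tags, the label names BRAND, FUNCTION and VARIATION (taken from the entity types shown on slide 10), keeping only the first span of each type, and the "Brand Function, Variation" layout. The function names are hypothetical.

```python
def extract_spans(tokens, tags):
    """Group BIO-tagged tokens into (label, text) spans, in title order."""
    spans, label, words = [], None, []
    for token, tag in zip(tokens, tags):
        starts_span = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != label
        )
        if starts_span:
            if words:
                spans.append((label, " ".join(words)))
            label, words = tag[2:], [token]
        elif tag.startswith("I-"):
            words.append(token)  # continues the current span
        else:  # "O": token is outside any entity
            if words:
                spans.append((label, " ".join(words)))
            label, words = None, []
    if words:
        spans.append((label, " ".join(words)))
    return spans


def compose_short_title(tokens, tags):
    """Build 'Brand Function, Variation' from the first span of each type."""
    first = {}
    for label, text in extract_spans(tokens, tags):
        first.setdefault(label, text)  # keep the first span per label
    head = " ".join(first[k] for k in ("BRAND", "FUNCTION") if k in first)
    variation = first.get("VARIATION")
    if head and variation:
        return f"{head}, {variation}"
    return head or variation or ""


tokens = "KROSER Laptop Backpack Computer Backpack School Backpack Grey".split()
tags = ["B-BRAND", "B-FUNCTION", "I-FUNCTION", "O", "O", "O", "O", "B-VARIATION"]
print(compose_short_title(tokens, tags))  # KROSER Laptop Backpack, Grey
```

Keeping only the first span per label is one simple way to drop the repeated "Backpack" phrases in the original title; a real system would need a rule for titles with several brands or variations.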
  33. Implementation Details (Text Summarization)
      - Ground truth label for each title is composed of the words corresponding to the entity tags in NER implementation
      - Model predicts short title directly
      Output: KROSER Laptop Backpack, Grey
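As a rough illustration of the first bullet on slide 33, a (source, target) training pair for the summarization model could be derived from the NER annotations as follows. The slides do not describe the data format, so the "O" tag for untagged words, the space-joined target, and the function name are assumptions.

```python
def make_summarization_pair(tokens, tags):
    """Return (full title, target short title) for sequence-to-sequence training.

    The target keeps, in order, every word that carries an entity tag;
    words tagged "O" (assumed to mean "no entity") are dropped.
    """
    source = " ".join(tokens)
    target = " ".join(tok for tok, tag in zip(tokens, tags) if tag != "O")
    return source, target


tokens = ["KROSER", "Laptop", "Backpack", "Computer", "Backpack", "Grey"]
tags = ["B-BRAND", "B-FUNCTION", "I-FUNCTION", "O", "O", "B-VARIATION"]
source, target = make_summarization_pair(tokens, tags)
print(target)  # KROSER Laptop Backpack Grey
```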
  34. Model Architecture
      - Bi-directional LSTM encoder/decoder with attention.
        • Batch size: 256
        • Bidirectional encoding layers: 2
        • Word embedding size: 128
        • LSTM hidden units: 512
        • Dropout at embedding and decoder layers: 0.5
        • Beam search length: 5
  35. Evaluation on test set
      Method                          ROUGE-1  ROUGE-2
      NER                             84.71    65.98
      Abstractive Text Summarization  78.83    47.85
  36. Human Evaluation
      • 1000 random samples from the test set
      • Manual summarization by crowd workers (on Figure Eight)
      • Crowd workers rated 4 short titles side-by-side, on a scale from 1 to 10:
        o Short title composed from the ground truth label of the NER model
        o Short title composed from the NER model prediction
        o Short title from the Abstractive Text Summarization model
        o Human summarization
  37. Human Evaluation Results (average rating)
      Ground Truth  NER          Text Summarization  Human Summarization
      8.26 ± 1.04   8.16 ± 1.17  6.73 ± 2.24         8.67 ± 1.27
      • The differences between the short titles generated from the NER prediction, the NER ground truth, and human summarization are not statistically significant.
      • The Text Summarization ratings are significantly lower than those of the other 3 versions.
  38. Summary
      • Reviewed NER and Text Summarization approaches and the latest advancements
      • Showed how to summarize product titles using NER and Text Summarization
      • Evaluation results show that NER performs much better than Text Summarization
  39. Thank You
      Joan.xiao@figure-eight.com
      https://linked.in/joanxiao
