Deep learning for product title summarization

Joan Xiao
Lead Machine Learning Scientist
Figure Eight
Nov. 14, 2018

Proprietary and Confidential - Do Not Distribute | © 2018 Figure Eight. All Rights Reserved.
Our Mission
Figure Eight is the essential Human-in-the-Loop AI platform for data science &
machine learning teams. Our software platform trains, tests, and tunes machine
learning models to make AI work in the real world.

Agenda
• Motivation
• Two deep learning based approaches
• Our implementations
• Results

Motivation

Need Short Titles!
Original Title:
KROSER Laptop Backpack Computer Backpack School Backpack Casual Daypack Water-Repellent Laptop Bag with USB Charging Port for Travel/Business/College/Women/Men Grey
Short Title:
Kroser Laptop Backpack, Grey

Approach #1

Original Title:
Travel/Business/College/Women/Men Grey
Short Title:
Kroser Laptop Backpack, Grey
Entities:
- Kroser: Brand
- Laptop Backpack: Function
- Grey: Variation

Named Entity Recognition
Named entity recognition (NER, also known as entity chunking and entity extraction)
- A subtask of information extraction
- Locates named entities in text and classifies them into pre-defined categories
- Common categories: persons, organizations, locations, quantities, monetary
values, percentages, etc.

NER Example

NER – Traditional Approaches
- Rule-based: hand-crafted, grammar-based linguistic rules
- Supervised Learning
• Decision Trees
• Maximum Entropy Models
• Support Vector Machines
• Hidden Markov Models
• Conditional Random Fields
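
To make the rule-based approach concrete, here is a minimal sketch that combines two regular expressions with a small lookup list (gazetteer). The patterns, the `ORGANIZATIONS` list, and the sample sentence are hypothetical illustrations, not taken from the talk; real rule-based systems rely on much larger hand-written grammars.

```python
import re

# Hypothetical hand-written patterns for two of the common entity categories.
PATTERNS = [
    ("MONEY", re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?")),
    ("PERCENT", re.compile(r"\d+(?:\.\d+)?%")),
]
# Hypothetical gazetteer of known organization names.
ORGANIZATIONS = ["Acme Corp"]


def rule_based_ner(text):
    """Return (entity_text, label, start_offset) tuples sorted by position."""
    found = []
    for label, pattern in PATTERNS:
        for match in pattern.finditer(text):
            found.append((match.group(), label, match.start()))
    for name in ORGANIZATIONS:
        for match in re.finditer(re.escape(name), text):
            found.append((match.group(), "ORG", match.start()))
    return sorted(found, key=lambda item: item[2])


entities = rule_based_ner("Acme Corp paid $1,250.50 for a 12.5% stake.")
for text, label, start in entities:
    print(f"{label:8s} {text!r} at offset {start}")
```

Rules like these are precise on the cases they anticipate but miss anything outside the patterns, which is one reason to learn the tagger from labeled data instead.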

NER – Open Source APIs
- NLTK
- Stanford NLP
- OpenNLP
- spaCy

CNN
Kim, Y. (2014). Convolutional Neural Networks for Sentence Classification

RNN
http://colah.github.io/posts/2015-08-Understanding-LSTMs/

RNN
Unfolded BLSTM architecture with 3 consecutive steps. From Cui et al. (2017)
http://colah.github.io/posts/2015-08-Understanding-LSTMs/

Encoder-decoder RNN
http://www.wildml.com/2016/01/attention-and-memory-in-deep-learning-and-nlp/

Encoder-decoder RNN with Attention
Neural Machine Translation by Jointly Learning to Align and Translate. From Bahdanau et al. (2015)

https://ai.googleblog.com/2016/09/a-neural-network-for-machine.html

Attention is all you need
- Vaswani et al., 2017
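
As a rough illustration of the core operation in that paper, the sketch below computes scaled dot-product attention, softmax(QK^T / sqrt(d_k))V, in plain Python. The function names and toy vectors are assumptions made for this example. It leaves out multi-head projections, masking, and batching, so treat it as a sketch of the idea rather than a Transformer implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights); each argument is a list of vectors."""
    scale = math.sqrt(len(keys[0]))  # sqrt(d_k)
    width = len(values[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        context = [sum(w * v[j] for w, v in zip(weights, values))
                   for j in range(width)]
        outputs.append(context)
        all_weights.append(weights)
    return outputs, all_weights


# Toy input: the query points in the direction of the first key.
outputs, weights = scaled_dot_product_attention(
    queries=[[5.0, 0.0]],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[10.0, 0.0], [0.0, 10.0]],
)
print(weights[0], outputs[0])
```

In the toy call, most of the attention weight goes to the first key, so the output is dominated by the first value vector.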

Pre-trained language models
• ELMo
• ULMFiT
• OpenAI Transformer
• BERT

Deep Learning Results on CoNLL 2003
Year  Model Architecture         Author(s)         F1
2018  BiLSTM-CRF+Flair           Akbik et al.      93.09
2018  BERT Large                 Devlin et al.     92.8
2018  CVT + Multi Task Learning  Clark et al.      92.6
2018  BERT Base                  Devlin et al.     92.4
2018  BiLSTM-CRF+ELMo            Peters et al.     92.22
2017  GRU-GRU-CRF                Yang et al.       91.26
2016  BiLSTM-CNN-CRF             Ma and Hovy       91.21
2016  LSTM-LSTM-CRF              Lample et al.     90.94
…
2011  CNN-CRF                    Collobert et al.  88.67
https://github.com/sebastianruder/NLP-progress

Approach #2

Automatic Text Summarization
- The task of producing a concise and fluent summary while
preserving key information content and overall meaning.

Two Types of Text Summarization
- Extractive Summarization
• Extracts key sections of the text and composes a summary without modifying the original text.
- Abstractive Summarization
• Generates a new, shorter text that conveys the most critical information from the original text.

Text Summarization Example
- Original Text: Alice and Bob took the train to visit the zoo. They saw a
baby giraffe, a lion, and a flock of colorful tropical birds.
- Extractive Summary: Alice and Bob visit the zoo. saw a flock of birds.
- Abstractive summary: Alice and Bob visited the zoo and saw animals
and birds.
https://ai.googleblog.com/2016/08/text-summarization-with-tensorflow.html
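
To show what "extractive" means in practice, here is a minimal sketch of one classic baseline: score each sentence by the average frequency of its words and return the top sentences unchanged. The stopword list, the scoring rule, and the function name are assumptions for illustration; this is not the method used in the talk or in the linked post.

```python
import re
from collections import Counter

# Tiny hypothetical stopword list; real systems use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "they", "in"}


def content_words(text):
    """Lowercased words with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def extractive_summary(text, max_sentences=1):
    """Return the highest-scoring sentences, verbatim and in original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


text = ("Alice and Bob took the train to visit the zoo. They saw a baby giraffe, "
        "a lion, and a flock of colorful tropical birds.")
print(extractive_summary(text))
```

Whatever the scoring rule, the output is always copied from the input. An abstractive model, by contrast, may produce words such as "animals" that never appear in the source.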

Automatic Text Summarization Evaluation Methods
- Human Evaluation
- Recall-Oriented Understudy for Gisting Evaluation (ROUGE)
• Compares a candidate summary to a human (reference) summary.
• ROUGE-n: based on comparison of n-grams, where n is 1, 2, 3, etc. Defined as the number of n-grams shared by the candidate and reference summaries, divided by the number of n-grams in the reference summary.
• ROUGE-L: based on the longest common subsequence (LCS) between the candidate and reference summaries.
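
The two definitions above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: whitespace tokenization, lowercasing, clipped n-gram counts, and recall only (ROUGE-L is often reported as an F-measure instead). The function names and the candidate title are made up for the example, and published scores come from the official ROUGE toolkit, which this sketch does not replicate.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def rouge_n(candidate, reference, n=1):
    """ROUGE-n recall: shared n-grams / n-grams in the reference."""
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    if not ref:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / sum(ref.values())


def lcs_length(a, b):
    """Length of the longest common subsequence (dynamic programming)."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]


def rouge_l_recall(candidate, reference):
    """ROUGE-L recall: LCS length / number of reference tokens."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    return lcs_length(cand, ref) / len(ref) if ref else 0.0


reference = "Kroser Laptop Backpack Grey"
candidate = "Kroser Laptop Bag Grey"
print(rouge_n(candidate, reference, n=1))    # 0.75 (3 of 4 unigrams match)
print(rouge_n(candidate, reference, n=2))    # 1 of 3 reference bigrams matches
print(rouge_l_recall(candidate, reference))  # 0.75 (LCS: kroser laptop grey)
```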

Deep Learning Results on Gigaword (Abstractive Text Summarization)
Year  Model                                                              ROUGE-1  ROUGE-2  ROUGE-L
2018  Re^3 Sum (Cao et al.)                                              37.04    19.03    34.46
2018  CGU (Lin et al.)                                                   36.3     18.0     33.8
2018  Pointer + Coverage + EntailmentGen + QuestionGen (Guo et al.)      35.98    17.76    33.63
2018  words-lvt5k-1sent (Nallapati et al.)                               36.4     17.7     33.71
2018  Struct+2Way+Word (Song et al.)                                     35.47    17.66    33.52
https://github.com/sebastianruder/NLP-progress

Implementation

Implementation Details (NER)
- A dataset of about 56K product titles
- Obtain entity labels for NER
- Train NER model
- Predict entities for a title
- Compose a short title from predicted entities
Output: KROSER Laptop Backpack, Grey
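
The last two steps (predict entities, then compose a short title) might look like the sketch below. The deck does not describe the tagging scheme or the composition rule, so several things here are assumptions: the BIO tag format, the rule "Brand and Function, then a comma, then Variation", and keeping only the first span per label. The tags in the example are hand-written stand-ins for model predictions.

```python
def decode_entities(tokens, tags):
    """Group BIO-tagged tokens into (label, text) spans."""
    spans, label, words = [], None, []
    for token, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != label)
        if starts_new or tag == "O":
            if words:  # close the span that was open
                spans.append((label, " ".join(words)))
            label, words = (tag[2:], [token]) if starts_new else (None, [])
        else:  # "I-" tag continuing the current span
            words.append(token)
    if words:
        spans.append((label, " ".join(words)))
    return spans


def compose_short_title(spans):
    """Assumed rule: '<Brand> <Function>, <Variation>'."""
    first = {}
    for label, text in spans:
        first.setdefault(label, text)  # keep the first span per label
    head = " ".join(first[k] for k in ("Brand", "Function") if k in first)
    tail = first.get("Variation")
    return f"{head}, {tail}" if head and tail else head or tail or ""


# Shortened version of the example title, with hypothetical predicted tags.
tokens = ["KROSER", "Laptop", "Backpack", "Computer", "Backpack",
          "School", "Backpack", "Grey"]
tags = ["B-Brand", "B-Function", "I-Function", "O", "O", "O", "O", "B-Variation"]
print(compose_short_title(decode_entities(tokens, tags)))
# KROSER Laptop Backpack, Grey
```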

Implementation Details (Text Summarization)
- The ground-truth label for each title is composed of the words corresponding to the entity tags from the NER implementation
- The model predicts the short title directly
Output: KROSER Laptop Backpack, Grey

Model Architecture
- Bi-directional LSTM encoder/decoder with attention.
• Batch size: 256
• Bidirectional encoding layers: 2
• Word embedding size: 128
• LSTM hidden units: 512
• Dropout at embedding and decoder layers: 0.5
• Beam search length: 5
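
Beam search, the last setting listed, can be sketched generically as follows. This is not the talk's decoder. `step_fn`, `TOY_MODEL`, and the token names are hypothetical, the toy "model" looks only at the previous token, and the sketch assumes the slide's "beam search length" refers to the beam width. It also omits length normalization, which real decoders often add.

```python
import math


def beam_search(step_fn, start, end, beam_width=5, max_len=10):
    """Return the highest-scoring token sequence found.

    step_fn(sequence) must return a dict {next_token: probability}.
    """
    beams = [(0.0, [start])]  # (log-probability, tokens so far)
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            for token, prob in step_fn(seq).items():
                if prob > 0:
                    candidates.append((score + math.log(prob), seq + [token]))
        # Keep only the best `beam_width` partial hypotheses.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, seq in candidates[:beam_width]:
            (finished if seq[-1] == end else beams).append((score, seq))
        if not beams:
            break
    pool = finished or beams
    return max(pool, key=lambda c: c[0])[1] if pool else [start]


# Hypothetical next-token table standing in for a trained decoder.
TOY_MODEL = {
    "<s>": {"laptop": 0.6, "kroser": 0.4},
    "laptop": {"bag": 0.5, "backpack": 0.5},
    "kroser": {"backpack": 1.0},
    "backpack": {"</s>": 1.0},
    "bag": {"</s>": 1.0},
}


def toy_step(seq):
    return TOY_MODEL.get(seq[-1], {})


print(beam_search(toy_step, "<s>", "</s>", beam_width=5))
print(beam_search(toy_step, "<s>", "</s>", beam_width=1))  # greedy decoding
```

With a width of 1 the search commits to the locally best first token and ends on a lower-probability sequence (0.3). The wider beam keeps the alternative alive and finds the sequence with probability 0.4.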

Evaluation on test set
Method                          ROUGE-1  ROUGE-2
NER                             84.71    65.98
Abstractive Text Summarization  78.83    47.85

Human Evaluation
• 1000 random samples from the test set
• Manual summarization by crowd workers (on Figure Eight)
• Crowd workers rated 4 short titles side by side, on a scale from 1 to 10:
o Short title composed from the ground truth label of the NER model
o Short title composed from the NER model prediction
o Short title from the Abstractive Text Summarization model
o Human summarization

Human Evaluation Results (average rating)
Ground Truth   NER            Text Summarization   Human Summarization
8.26 ± 1.04    8.16 ± 1.17    6.73 ± 2.24          8.67 ± 1.27
• The differences between the short titles generated from the NER prediction, the NER ground truth, and human summarization are not statistically significant.
• The Text Summarization results are rated significantly lower than the other 3 versions.

Summary
• Reviewed NER and Text Summarization approaches and the latest advancements
• How to summarize product titles using NER and Text Summarization
• Evaluation results show that NER performs much better than Text Summarization

Thank You
Joan.xiao@figure-eight.com
https://linked.in/joanxiao
