2. Abstract
• Sentences generated by current captioning models describe only shallow
appearances and read as dull.
• Netizen Style Commenting automatically generates characteristic
comments for a user-contributed fashion photo.
• Three major components:
• Construct a large-scale clothing dataset
• Marry topic models with neural networks
• Propose three unique measures to estimate the diversity of comments
• Improve accuracy and diversity
4. Introduction
• Modern models can achieve good scores on machine translation
metrics, but their output still lacks a human touch.
• Collect a large corpus of paired user-contributed fashion photos and
comments, called NetiLook
• Existing models may overfit the dataset and generate comments such as
“love the ….”.
• Integrate latent topic models with state-of-the-art methods to
make the generated sentences more lively.
• Propose performance measures for diversity.
7. Related work
• Image captioning helps visually impaired
users and supports human-robot interaction.
• State-of-the-art models are mostly
attention-based because they
focus on the correctness of the description.
• Compared with depicting images,
giving comments is more challenging
because it requires not only
understanding images but also
engaging with users.
(Jonghwan Mun, AAAI 2017)
8. Dataset - NetiLook
• Collect photos and comments from
Lookbook to construct NetiLook.
9. Method - Netizen Style Commenting
• Some sentences appear frequently alongside posts (e.g., “love this!”,
“nice”), which inclines current models to generate similar
sentences.
10. Method - Netizen Style Commenting (cont.)
• Introduce a style-weight w_style, multiplied element-wise (◦) with the
outputs at each LSTM step, to season the generated sentences.
• The style-weight w_style represents the comment style and teaches
the model the style of the corpus while it generates
captions.
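A minimal sketch of the element-wise weighting described on this slide, assuming each LSTM step emits one score per vocabulary word and that w_style has the same length. The function name `apply_style_weight` and the toy vocabulary and numbers are illustrative assumptions, not taken from the original work.

```python
def apply_style_weight(step_outputs, w_style):
    """Multiply each step's vocabulary scores element-wise by w_style."""
    styled = []
    for scores in step_outputs:
        if len(scores) != len(w_style):
            raise ValueError("scores and w_style must have the same length")
        styled.append([s * w for s, w in zip(scores, w_style)])
    return styled


# Toy example (hypothetical values): 4-word vocabulary, 2 decoding steps.
vocab = ["love", "this", "jacket", "nice"]
step_outputs = [
    [0.4, 0.1, 0.3, 0.2],  # unweighted argmax would be "love"
    [0.1, 0.6, 0.2, 0.1],  # unweighted argmax would be "this"
]
w_style = [0.1, 0.3, 0.5, 0.1]  # assumed style-weight favouring "jacket"

styled = apply_style_weight(step_outputs, w_style)
# Greedy pick per step after weighting.
picked = [vocab[max(range(len(row)), key=row.__getitem__)] for row in styled]
print(picked)
```

In this toy run the weighting moves the first step away from the generic word, which is the intended effect of seasoning the output with the corpus style.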
11. Method - Netizen Style Commenting (cont.)
• Abstract concepts are hard for people to define
precisely.
• Apply LDA to discover latent topics and fuse them with current models.
• LDA:
• Topic-word vectors:
• Comment-topic vectors:
• N: word dictionary
• z: topics
• m: comments
12. Method - Netizen Style Commenting (cont.)
• To find the topic distribution of the corpus, each comment votes for
the topic to which it assigns the highest probability.
• The voting gives the most characteristic style in the corpus:
• The topic distribution of the corpus:
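The voting step can be sketched as follows, assuming each comment comes with a comment-topic vector from LDA and that the corpus topic distribution is the normalized vote count. The normalization and the name `topic_distribution` are assumptions, since the slide's formulas did not survive.

```python
from collections import Counter


def topic_distribution(comment_topic_vectors):
    """Each comment votes for its most probable topic; return vote shares."""
    if not comment_topic_vectors:
        raise ValueError("need at least one comment")
    num_topics = len(comment_topic_vectors[0])
    # max() returns the first index on ties, so ties go to the lower topic id.
    votes = Counter(
        max(range(num_topics), key=vector.__getitem__)
        for vector in comment_topic_vectors
    )
    total = len(comment_topic_vectors)
    return [votes[k] / total for k in range(num_topics)]


# Toy comment-topic vectors (hypothetical): 4 comments, 3 topics.
theta = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.2, 0.6],
]
y = topic_distribution(theta)
print(y)
```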
13. Method - Netizen Style Commenting (cont.)
• With the topic distribution of the corpus y and the topic-word vectors ϕ,
the style-weight w_style is defined as:
where y_k denotes the k-th dimension of y
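The defining formula is missing from this slide. One reading consistent with the surrounding text, offered here only as an assumption, is a sum of the topic-word vectors weighted by the corpus topic distribution, w_style = Σ_k y_k ϕ_k. The sketch below implements that assumed form; `style_weight` and the toy values are hypothetical.

```python
def style_weight(y, phi):
    """Assumed form: w_style[n] = sum over topics k of y[k] * phi[k][n]."""
    if len(y) != len(phi):
        raise ValueError("y needs one entry per topic-word vector")
    vocab_size = len(phi[0])
    return [
        sum(y[k] * phi[k][n] for k in range(len(y)))
        for n in range(vocab_size)
    ]


# Toy values (hypothetical): 2 topics over a 3-word vocabulary.
phi = [
    [0.5, 0.25, 0.25],  # topic 0 word probabilities
    [0.0, 0.5, 0.5],    # topic 1 word probabilities
]
y = [0.5, 0.5]          # corpus topic distribution from the voting step

w_style = style_weight(y, phi)
print(w_style)
```

Under this assumption w_style is itself a distribution over the vocabulary whenever y and every ϕ_k sum to one.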
14. Diversity measures
• BLEU and METEOR do not measure diversity, yet diversity is
increasingly important for sentence generation models.
• The more diverse the generated sentences, the more unique words are
used.
• DicRate: the ratio between the numbers of unique words in the ground truth and in the generated sentences.
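A small sketch of DicRate, assuming the ratio is taken as generated vocabulary size over ground-truth vocabulary size and that words are obtained by lowercasing and splitting on whitespace. Both choices are assumptions, because the slide does not state the direction of the ratio or the tokenization.

```python
def unique_words(sentences):
    """Collect the set of lowercase whitespace-separated tokens."""
    return {word for sentence in sentences for word in sentence.lower().split()}


def dic_rate(generated, ground_truth):
    """Assumed DicRate: |unique generated words| / |unique ground-truth words|."""
    truth_vocab = unique_words(ground_truth)
    if not truth_vocab:
        raise ValueError("ground truth contains no words")
    return len(unique_words(generated)) / len(truth_vocab)


# Toy comments (hypothetical).
ground_truth = ["love this red jacket", "nice boots and scarf"]  # 8 unique words
generated = ["love this", "love this jacket", "nice"]            # 4 unique words

rate = dic_rate(generated, ground_truth)
print(rate)
```

Under this reading, a model that keeps repeating a few stock phrases scores low even when each individual sentence is acceptable.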
15. Diversity measures (cont.)
• WF-KL: The KL divergence of word frequency distribution.
• Frequency distribution:
• KL:
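The formulas on this slide are missing, so the following is only a sketch of WF-KL under stated assumptions: frequencies are relative token counts over the joint vocabulary, add-one smoothing avoids zero probabilities, and the divergence is taken from the ground-truth distribution to the generated one. The names `frequency_distribution`, `kl_divergence`, and `wf_kl` are illustrative.

```python
import math
from collections import Counter


def frequency_distribution(tokens, vocabulary, smoothing=1.0):
    """Relative frequency of each vocabulary item, with additive smoothing."""
    counts = Counter(tokens)
    total = len(tokens) + smoothing * len(vocabulary)
    return {item: (counts[item] + smoothing) / total for item in vocabulary}


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same keys."""
    return sum(p[item] * math.log(p[item] / q[item]) for item in p if p[item] > 0)


def wf_kl(ground_truth_tokens, generated_tokens):
    """Assumed WF-KL: KL between ground-truth and generated word frequencies."""
    vocabulary = sorted(set(ground_truth_tokens) | set(generated_tokens))
    p = frequency_distribution(ground_truth_tokens, vocabulary)
    q = frequency_distribution(generated_tokens, vocabulary)
    return kl_divergence(p, q)


# Toy token streams (hypothetical).
truth = "love this red jacket nice boots and scarf".split()
generic = "love this love this nice nice love this".split()

print(wf_kl(truth, truth))    # identical distributions
print(wf_kl(truth, generic))  # repetitive output diverges from the ground truth
```

Because the helpers only count items, the same functions could be fed sequences of part-of-speech tags instead of words.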
16. Diversity measures (cont.)
• POS-KL: The KL divergence of part-of-speech (POS) distribution.
• Frequency distribution:
• KL:
17. Experiment
• Setting: beam size = 3; k = 3 or 5
• Topic models do not benefit the attention-based approach
because attention-based models greatly restrict
word selection.
18. Experiment (cont.)
• A comment, whether given by a human or a machine, is difficult to
evaluate with conventional measures such as BLEU on NetiLook.
• NetiLook has much more diversity and many more unique words than other
datasets.
19. Experiment (cont.)
• Compared with Flickr30k, comments on clothing style share some
common words and general patterns.
• On NetiLook, the experiment in Table 3 shows that our method
greatly improves diversity.
21. Experiment (cont.)
• User study:
• Participants were about 25 years old and familiar with netizen-style communities
• 2.83 males per female
22. Conclusion
• The style-weight strongly influences current captioning models and
helps them immerse in human online society.
• The proposed approaches benefit fashion photo commenting and
improve the image captioning task.
• The style-weight idea could be applied to other fields to help
generate sentences in various styles.