When E-commerce Meets Social Media: Identifying Business on WeChat Moment Using Bilateral-Attention LSTM
1. When E-commerce Meets Social Media: Identifying Business on WeChat Moment Using Bilateral-Attention LSTM
Tianlang Chen1, Yuxiao Chen1, Han Guo2, Jiebo Luo1
1. University of Rochester
2. Institute of Computing Technology, Chinese Academy of Sciences
5. Slide 4
1 >WeChat is one of the most well-known messaging platforms in China, with 806 million monthly active users.
>WeChat Business refers to the activity of sellers advertising their products, companies, platforms, or other services via WeChat.
>They can advertise by posting a WeChat Moment, which contains up to 9 images and a text, similar to an Instagram post.
6. Task and Motivation
• Research goal:
Identifying WeChat Business Moment
• Motivation:
For online shopping lovers/addicts: a push notification service for these Moments provides instant and comprehensive information for potential purchases
For online shopping haters: block these Moments because they are a nuisance
A robust classifier benefits both kinds of users.
• Task:
Binary Classification
8. Slide 6
2 In particular, different from previous schemes that consider visual and textual modalities equally for a joint visual-textual
classification task, we start our work with a text classification task based on an LSTM network, then incorporate a
bilateral-attention mechanism that automatically learns two kinds of explicit attention weights for each word, namely
1) a global weight that is insensitive to the images in the same Moment as the word, and 2) a local weight that is sensitive to those
images. In this process, we use visual
information as guidance to determine the local weight of a word in a specific Moment.
>The local weight represents the semantic similarity between a word and its image environment, indicating the importance of the word in a
Moment.
>The global weight indicates the importance of a word in the classification task.
9. Architecture of Bilateral-Attention LSTM
Blue part:
basic LSTM-based text
classification model
Green part:
self-learned attention sub-network
computes word’s global weight
Red part:
image-guided attention sub-network
computes word’s local weight
3
10. Slide 7
3 >The blue part is a combination of word embedding, LSTM, and fully-connected layers for text classification.
>The local and global weights are computed and fed into the LSTM equations to guide the model's prediction.
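The weighting step described above can be sketched as follows. This is a minimal illustration, not the authors' exact equations: each word embedding is scaled by the product of its global and local attention weights before entering the LSTM; the function name and the point of gating are assumptions.

```python
import numpy as np

def weight_words(embeddings, global_w, local_w):
    """Scale each word embedding by its final attention weight.

    embeddings: (T, d) word embeddings for one Moment's text
    global_w, local_w: (T,) per-word global and local weights
    """
    final_w = global_w * local_w          # final weight = global x local
    return embeddings * final_w[:, None]  # broadcast over embedding dims

# Toy usage on random data
T, d = 5, 8
rng = np.random.default_rng(0)
emb = rng.normal(size=(T, d))
g = rng.uniform(size=T)
l = rng.uniform(size=T)
weighted = weight_words(emb, g, l)
```

Words whose weights are near zero contribute little to the downstream LSTM states, which is how the attention guides the prediction.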
11. Word Local Weight
Meaning:
semantic similarity between a word and
its image environment, representing the
importance of the word in a Moment
Unsupervised Image classification:
ResNet-50 + K-means
Image environment feature:
feature construction based on
category
Category-based Word feature:
Bayesian model:
Local weight:
Category names and illustrations: (a) Flower, (b) Meal, (c) Cosmetics, (d) Chat screenshot
4
12. Slide 8
4 >The image environment feature and the category-based word feature of a Moment are constructed based on image category. Therefore,
we first perform unsupervised image classification.
>The image environment feature (category distribution feature) is an n-dimensional feature, where n is the number of categories; for
each category, if at least one image of the Moment belongs to it, the corresponding dimension is set to 1.
>The category-based word feature (CC word feature) is an n-dimensional vector that records the correlation coefficient theta_WC
between the word and each image category; it is denoted theta_W.
>The local weight is computed from the image environment feature and the category-based word feature.
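The two features and the local weight can be sketched as below. The binary environment feature follows the slide's definition; the similarity function (a dot product squashed by a sigmoid) is an assumption for illustration, as the slide's exact formula is not shown, and the theta values are made up.

```python
import numpy as np

def image_environment_feature(image_categories, n_categories):
    # n-dim binary vector: 1 if at least one image of that category
    # appears in the Moment
    env = np.zeros(n_categories)
    env[list(image_categories)] = 1.0
    return env

def local_weight(theta_w, env):
    # Score a word by its correlation with the categories actually present,
    # squashed into (0, 1). The paper's exact form may differ.
    score = float(theta_w @ env)
    return 1.0 / (1.0 + np.exp(-score))

n = 4  # hypothetical categories: 0 Flower, 1 Meal, 2 Cosmetics, 3 Chat screenshot
theta_facemask = np.array([0.1, 0.2, 3.0, 0.0])  # strongly tied to Cosmetics
env = image_environment_feature({2}, n)          # Moment has a cosmetics image
w = local_weight(theta_facemask, env)            # near 1: word matters here
```

The same word would receive a much lower local weight in a Moment whose images fall in unrelated categories.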
13. Word Global Weight
Meaning:
Representing the importance of a word in the classification task, unrelated to
the corresponding images of the Moment
Example:
"mountain" and "face mask" both hold high global weights: "mountain" has a strong
negative correlation with WeChat Business Moments, while "face mask" has a strong
positive correlation with them
Word feature:
Embedding feature extracted from word embedding layer
Global weight
5
14. Slide 9
5 The global weight is self-learned from the word embedding feature
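A self-learned attention sub-network over the embedding feature can be sketched as a one-layer scorer. The particular parameterization, sigmoid(v·tanh(Wx + b)), is a common attention form assumed here for illustration; the slide does not give the sub-network's exact equations.

```python
import numpy as np

def global_weight(x, W, b, v):
    # One-layer attention scorer over a word's embedding feature x.
    # W, b, v would be learned jointly with the classifier.
    s = v @ np.tanh(W @ x + b)
    return 1.0 / (1.0 + np.exp(-s))  # weight in (0, 1)

d, h = 8, 4
rng = np.random.default_rng(1)
W = rng.normal(size=(h, d))
b = np.zeros(h)
v = rng.normal(size=h)
x = rng.normal(size=d)      # embedding feature of one word
gw = global_weight(x, W, b, v)
```

Because the scorer sees only the embedding, the same word gets the same global weight in every Moment, matching the "insensitive to the images" property.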
15. Word Final Weight
Meaning:
Representing the overall importance of a word
Representation:
Feeding:
6
16. Slide 10
6 The final weight is the product of the global weight and the local weight, which achieves the best performance
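The feeding step can be sketched as an LSTM cell whose input is scaled by the word's final weight. The gate equations below are the textbook LSTM formulation, an assumption here, since the slide does not show the exact equations the weights are fed into.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step_weighted(x, w_t, h, c, P):
    # Scale the word input by its final weight (global * local) before
    # applying standard LSTM gates. P holds the gate weight matrices.
    x = w_t * x
    z = np.concatenate([x, h])
    i = sigmoid(P["Wi"] @ z)   # input gate
    f = sigmoid(P["Wf"] @ z)   # forget gate
    o = sigmoid(P["Wo"] @ z)   # output gate
    g = np.tanh(P["Wg"] @ z)   # candidate cell state
    c = f * c + i * g
    return o * np.tanh(c), c

rng = np.random.default_rng(3)
d, hdim = 4, 3
P = {k: rng.normal(size=(hdim, d + hdim)) for k in ("Wi", "Wf", "Wo", "Wg")}
h0, c0 = np.zeros(hdim), np.zeros(hdim)
h1, c1 = lstm_step_weighted(rng.normal(size=d), 0.8, h0, c0, P)
```

With a final weight near zero the step degenerates toward ignoring the word, which is the intended guiding effect.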
17. Experiments
Dataset:
In total: 570 users, 37,359 WeChat Moments, and 109,545 Moment images
10,078 Moments labeled; 4,309 of them are WeChat Business Moments
Randomly select 80% of the Moments as training/validation samples and 20% as test
samples
Vertical Experiment:
B-LSTM over other textual models
Parallel Experiment:
B-LSTM over other fusion schemes
7
18. Slide 11
7 >The vertical experiment demonstrates that LSTM achieves better performance than other text classification models (a decision tree
model, and two models that use LDA and doc2vec to extract features).
>For the parallel experiment, we compare the model's performance with late fusion, factorization machines, LSTM Q, and CCR. For
late fusion, we fuse the text feature and the image environment feature (category distribution feature) at the top of the model. For CCR,
we add a new loss term that imposes consistency constraints across the visual and textual modalities. For factorization machines, we
replace the top fully-connected layers with factorization-machine layers. For LSTM Q, we normalize the visual feature (CD feature)
and also implement common-space mapping and element-wise multiplication.
>CD + LF = late fusion with the category distribution feature (image environment feature)
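The CD + LF baseline described above can be sketched as follows: the LSTM's text feature and the category-distribution feature are concatenated at the top of the model and passed through a linear classification head. The feature sizes and the single-layer head are illustrative assumptions.

```python
import numpy as np

def late_fusion(text_feature, cd_feature, W, b):
    # Concatenate modalities at the top of the model, then classify.
    z = np.concatenate([text_feature, cd_feature])
    logit = W @ z + b
    return 1.0 / (1.0 + np.exp(-logit))  # P(Business Moment)

rng = np.random.default_rng(2)
text_f = rng.normal(size=16)            # e.g. final LSTM hidden state
cd_f = np.array([1.0, 0.0, 1.0, 0.0])   # binary category-distribution feature
W = rng.normal(size=20)
b = 0.0
p = late_fusion(text_f, cd_f, W, b)
```

Unlike the bilateral-attention model, late fusion only mixes the modalities after the text has been encoded, so the images cannot re-weight individual words.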
20. Slide 12
8 Train a logistic regression model on the image environment feature (category distribution feature), and use the model's learned
weight for each category to represent the correlation between that category and WeChat Business
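The analysis above can be sketched with a minimal logistic regression fitted by gradient descent; the learned per-category weight is then read off as that category's correlation with the Business label. The toy data below is invented for illustration, and a library implementation (e.g. scikit-learn) would serve equally well.

```python
import numpy as np

def fit_logreg(X, y, lr=0.5, epochs=500):
    # Plain gradient-descent logistic regression on binary
    # category-distribution features.
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        g = p - y                       # gradient of the log loss
        w -= lr * X.T @ g / len(y)
        b -= lr * g.mean()
    return w, b

# Toy data: category 0 present -> Business, category 1 present -> non-Business
X = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
y = np.array([1.0, 1.0, 0.0, 0.0])
w, b = fit_logreg(X, y)
# w[0] ends positive (category 0 correlates with Business), w[1] negative
```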
21. Visualization
Typical words with their local weights in a Moment that contains
images of a single category (the sigmoid unit is replaced with ReLU
for a better display of the differences between words).
23. Slide 14
9 Two Moments with the same text and different images; therefore, each word has the same global weight but different local weights.
Moment A contains images related to cosmetics, which gives the last several words high local weights; Moment B contains images of a
different category, which gives the first several words high local weights. In the end, the model predicts Moment A as a WeChat
Business Moment and B as a non-business Moment.
25. Slide 15
10 Two Moments with the same images and different texts. The model ultimately captures the most significant words and improves the
classification accuracy.
(Not very typical; this slide can be dropped if it is not necessary)