The document discusses several studies that use recurrent neural networks to predict customer behavior from e-commerce data. Specifically, it examines research that uses RNNs to predict future customer purchases and product ratings. It also describes experiments on datasets from online retailers and reviews the results of predicting customer shopping patterns and lifetime value with RNN models.
2. BACKGROUND TO THIS STUDY
The rapid development of e-commerce in recent decades has clearly driven
the growth of business-to-consumer (B2C) commerce.
Those numbers indicate that B2C commerce is still growing, and that
traditional retailers are not in danger of being replaced by electronic commerce.
3. USING RECURRENT NEURAL NETWORKS TO PREDICT CUSTOMER
BEHAVIOR FROM INTERACTION DATA
• In e-commerce, predicting future customer behavior is an important
task for offering the best possible experience and improving customer
satisfaction.
• The goal of this study is to compare the performance of different attention
mechanisms and explore their utility for explaining the predictions of
the model.
4. DATASETS USED
- Santander product recommendation dataset: it covers 24 different products.
- The data is sampled regularly, once per month. For every month it
records the products that each customer holds at that moment. The
researcher does not use the customer's age or country of residence
and focuses instead on the users' interactions.
5. 2- MOVIELENS
• The dataset consists of the movie rating histories of different users. Each
rating carries a timestamp of when the user rated the movie. However, the
researcher uses only the fact that the user rated a movie, not the score.
• The researcher focuses on predicting users' ratings from April 2014
until April 2015. Although the model is optimized to learn which movie will be
rated next (short-term prediction), the researcher also measures which movies
the user will rate eventually (long-term prediction).
6. PREDICTING SHOPPING BEHAVIOR WITH
MIXTURE OF RNNS
• The researchers compare two machine learning approaches for early
prediction of shopper behavior, leveraging features to predict three
outcomes: purchase, abandoned shopping cart, and browsing-only. A mixture
of Markov models serves as one of the two approaches.
• Clickstream data is used to experiment with mixtures of high-order
Markov Chain Models (MCMs) and mixtures of Recurrent Neural Networks
(RNNs) that use the Long Short-Term Memory (LSTM) architecture.
7. EXPERIMENTS
• The dataset was divided into two sets: it was partitioned into an
80% training/20% testing split.
• All RNNs were trained for 10 epochs, using batches of 20
sequences.
• The number of recurrent layers was 1 or 2, the keep probability was
1 or 0.5, and the hidden state size was 10, 20, 40, or 80. For a
particular mixture model, all the RNNs used the same parameter
values.
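The parameter values listed above define a small search grid. As a minimal sketch (the dictionary keys and the helper name `enumerate_configs` are illustrative assumptions, not names from the study), the combinations can be enumerated as follows. The count agrees with the 16 combinations mentioned in the notes.

```python
from itertools import product

# Values taken from the slide; the key names are illustrative assumptions.
GRID = {
    "num_layers": [1, 2],
    "keep_prob": [1.0, 0.5],
    "hidden_size": [10, 20, 40, 80],
}


def enumerate_configs(grid):
    """Return one dict per combination of the grid values."""
    keys = list(grid)
    value_lists = [grid[key] for key in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


configs = enumerate_configs(GRID)
print(len(configs))  # 2 * 2 * 4 combinations
```

Within one mixture model every RNN would share a single entry of `configs`, as the slide states.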
9. CUSTOMER SHOPPING PATTERN PREDICTION:
A RECURRENT NEURAL NETWORK APPROACH
• One of the most popular methods for dealing with this challenge, given the
availability of big data from online and offline channels, is the customer
lifetime value (CLV) model. CLV is defined as the value of the relationship
with current customers.
• CLV models use different strategies for customer behavior modelling.
One of the most reliable uses the recency (R), frequency (F), and
monetary value (M) variables, together called RFM.
• The researchers show that RNNs can predict customers' RFM values
efficiently.
10. EXPERIMENTS
The dataset used in the experiments is the Ta-Feng dataset, containing
817,741 transactions belonging to 32,266 users and 23,812 items.
RNN setting parameters.
11. • Performance analysis of the proposed RNN with ReLU activation
function (ReLU-RNN) compared with LSTM-RNN and SRNN.
12. WHAT TO DO NEXT: MODELING USER
BEHAVIORS BY TIME-LSTM
• The researchers propose a new LSTM variant called Time-LSTM. Time-
LSTM equips LSTM with time gates to model time intervals.
• These time gates are specially designed so that, compared with traditional
RNN solutions, Time-LSTM better captures both users' short-term and
long-term interests and thereby improves recommendation performance.
• Time-LSTM is proposed in three versions. The first version has only one
time gate, which exploits time intervals for both short-term and long-term
interests. The second version has two time gates: one is designed to exploit
time intervals to capture short-term interests, and the other saves time
intervals to model long-term interests.
• In the third version, they use coupled input and forget gates to reduce the
number of parameters.
13. DATASETS
• To evaluate the algorithm, the researchers used two
datasets, LastFM and CiteULike. From the first dataset they
extract three-element tuples meaning that user “user id”
listens to song “song id” at time “timestamp”.
• In the CiteULike dataset, one user annotating one research
paper at a certain time may have several records, in order to
distinguish different tags. They extract tuples “user id”,
“paper id”, “timestamp”.
14. SESSION-BASED RECOMMENDATIONS WITH
RECURRENT NEURAL NETWORKS
• They applied a modern recurrent neural network, the Gated Recurrent Unit
(GRU), to a new application domain: recommender systems.
• The GRU aims at dealing with the vanishing gradient problem. GRU gates
essentially learn when and by how much to update the hidden state of the
unit.
• Two datasets are used:
• The first dataset is that of RecSys Challenge 2015. This dataset contains
click-streams of an e-commerce site that sometimes end in purchase events.
The network is trained on 6 months of data, containing 7,966,257 sessions of
31,637,239 clicks on 37,483 items.
15. • The second dataset was collected from a YouTube-like OTT video
service platform. The training data consists of all but the last day of
the aforementioned period and has 3 million sessions of 13
million watch events on 330 thousand videos. The test set contains
the sessions of the last day of the collection period and has 37
thousand sessions with 180 thousand watch events. This dataset
will be referred to as VIDEO.
16. UNDERSTANDING CONSUMER BEHAVIOR
WITH RECURRENT NEURAL NETWORKS
• The researchers note that recurrent neural networks (RNNs) are a natural
fit for modeling and predicting consumer behavior.
• The researchers demonstrate the advantages of RNNs in experiments on
large-scale datasets from Europe's online fashion platform Zalando, operating
in multiple countries with millions of customers and a catalog comprising
hundreds of thousands of products at any given moment. They focus on the
example of predicting order probabilities, which is fundamental to many
e-commerce and recommender system scenarios.
17. EXPLAINING PREDICTION
• They summarize their contributions: (i) they show how
consumer behavior can be predicted without sophisticated
feature engineering by using RNNs.
• (ii) they provide an empirical comparison of prediction
performance on real-world e-commerce data.
• (iii) they demonstrate how RNNs are helpful in explaining
the predictions for individual consumers.
Editor's Notes
E-commerce, also known as electronic commerce or internet commerce, refers to the buying and selling of goods or services using the internet, and the transfer of money and data to execute these transactions.
The history of e-commerce begins with the first ever online sale: on August 11, 1994, a man sold a CD by Sting to his friend through his website NetMarket, an American retail platform.
In the United States alone, for fiscal year 2001, total retail sales were 3.50 trillion dollars, while e-commerce retail sales were 32.57 billion dollars.
For example, if a user buys a new mobile phone, he might purchase accessories for it in the near future. If a user buys a book, he might be interested in books by the same author or in the same genre.
The researcher answers questions in this thesis such as:
1- How can embeddings be used to improve performance when using an RNN to model sequences?
2- Which methods for creating item embeddings lead to better prediction accuracy?
The researcher used the Santander product recommendation dataset, which covers the period from January 2015 to May 2016.
The products in this data include savings accounts, mortgages, and funds.
- The researcher does not use the customer's age or country of residence and focuses instead on the users' interactions.
- The goal is to predict which products the user will purchase in May 2016.
Because a user can purchase several products in the same month, they treat this problem as multi-label classification.
When a product's value goes from 0 to 1, the customer has purchased that product. When the value goes from 1 to 0, the customer has removed that product.
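The 0-to-1 and 1-to-0 transitions can be turned into multi-label targets by comparing two consecutive monthly ownership vectors. This is a minimal sketch under the assumption that each month is stored as a list of 0/1 flags, one per product. The function names and the four-product example are hypothetical and not taken from the thesis.

```python
def added_products(prev_month, curr_month):
    """Mark products whose flag went from 0 to 1 (a purchase)."""
    if len(prev_month) != len(curr_month):
        raise ValueError("both months must cover the same products")
    return [int(p == 0 and c == 1) for p, c in zip(prev_month, curr_month)]


def removed_products(prev_month, curr_month):
    """Mark products whose flag went from 1 to 0 (a removal)."""
    if len(prev_month) != len(curr_month):
        raise ValueError("both months must cover the same products")
    return [int(p == 1 and c == 0) for p, c in zip(prev_month, curr_month)]


# Hypothetical customer with four products instead of the dataset's 24.
april = [1, 0, 0, 1]
may = [1, 1, 0, 0]
added = added_products(april, may)
removed = removed_products(april, may)
print(added, removed)
```

Several entries of `added` can be 1 at once, which is why the task is multi-label rather than single-label.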
We observe that RNN-Att-HS-Lin obtained the best performance, with results very similar to the RNN-Baseline.
The Logistic Regression baseline also performed similarly to these models. Finally, the RNN-Att-HS-Nonlin performed poorly compared with the others.
The model RNN-Att-HS-Nonlin achieves the highest sps@10 and R-Precision. In contrast to the Santander dataset,
the RNN-Baseline achieves the lowest sps@10 and R-Precision, and its performance is
worse than the Frequency Baseline.
They observe that the results are much lower than on the Santander dataset.
The goal is to classify customer behavior into final decision categories, comparing LSTM RNNs with Markov chain models.
The researchers aim to classify each sequence as PURCHASE, if the sequence leads to an item purchase;
ABANDON, if an item was left in the shopping cart, but there was no purchase; and
BROWSING-ONLY, when the shopping cart was not used.
Although they tested 16 different RNN parameter combinations, the results were so similar that only one of them is reported.
Results are given when considering 25%, 50%, 75%, and 100% of the total length of the sequence.
When splitting at 50%, the Markov chain model can predict a
PURCHASE with 0.42 precision and 0.11 recall, resulting in an overall F1-measure
of 0.17. Under the same conditions, the RNN-based model reaches a precision of 0.82 with 0.71 recall and an F1-measure of 0.76.
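The F1-measures quoted above follow from the standard definition of F1 as the harmonic mean of precision and recall. The short check below is a sketch that recomputes them from the reported precision and recall. The function name `f1_score` is an illustrative choice, not code from the study.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for the 50% split.
markov_f1 = f1_score(0.42, 0.11)
rnn_f1 = f1_score(0.82, 0.71)
print(round(markov_f1, 2), round(rnn_f1, 2))
```

Because the harmonic mean is pulled toward the smaller value, the Markov chain model's low recall dominates its F1-measure despite its moderate precision.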
1- The availability of big transaction data has provided a great opportunity to predict customer behavior.
2- With deep learning and machine learning, researchers can analyze consumer behavior in e-commerce.
4- These variables give some understanding of the customer’s behavior and try to answer
the following questions: “How recently did the customer
purchase?”, “How often do they purchase?”, and “How much do they spend?”
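The three RFM questions map directly onto three aggregates over a customer's transactions. The sketch below assumes transactions arrive as `(customer_id, purchase_date, amount)` tuples and that monetary value means total spend. Both are assumptions made for illustration, since the slides do not specify the exact definitions, and the sample data is invented.

```python
from datetime import date


def rfm(transactions, reference_date):
    """Compute recency (days), frequency (count), and monetary (total) per customer."""
    summary = {}
    for customer_id, purchase_date, amount in transactions:
        last, freq, total = summary.get(customer_id, (purchase_date, 0, 0.0))
        summary[customer_id] = (max(last, purchase_date), freq + 1, total + amount)
    return {
        customer_id: {
            "recency_days": (reference_date - last).days,  # how recently
            "frequency": freq,                             # how often
            "monetary": total,                             # how much
        }
        for customer_id, (last, freq, total) in summary.items()
    }


# Invented sample data for two customers.
sample = [
    ("a", date(2024, 1, 5), 20.0),
    ("a", date(2024, 1, 20), 30.0),
    ("b", date(2024, 1, 10), 15.0),
]
scores = rfm(sample, date(2024, 1, 31))
print(scores["a"])
```

A sequence model would then be fed these values per time window rather than a single snapshot.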
The researchers propose a new model for RFM prediction of
customers based on recurrent neural networks (RNNs) with the rectified linear unit activation function.
The model utilizes an auto-encoder to represent the features of input parameters.
The data is divided into training (50%), validation (25%), and test (25%) partitions. In order to have the R, F, and M values on the same scale, the data is normalized.
The experiments are cross-validated 10 times.
The hyper-parameters of the models are selected based on the cross validation.
The regularization coefficient is set to 0.0001. Each binary representation has 8 bits.
Therefore, the number of targets is 3 × 8 = 24. Each input parameter is represented with 20 features in the auto-encoder.
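The count of 24 targets comes from concatenating three 8-bit codes, one each for R, F, and M. The sketch below assumes each value has already been mapped to an integer between 0 and 255. How the original work quantizes the normalized values is not stated in these notes, so that step is left out, and the function names are hypothetical.

```python
BITS = 8  # bits per variable, as stated in the notes


def to_bits(value, bits=BITS):
    """Return the big-endian binary representation of a non-negative integer."""
    if not 0 <= value < 2 ** bits:
        raise ValueError(f"value must fit in {bits} bits")
    return [(value >> shift) & 1 for shift in range(bits - 1, -1, -1)]


def encode_rfm(r, f, m):
    """Concatenate the R, F, and M codes into one target vector."""
    return to_bits(r) + to_bits(f) + to_bits(m)


# Illustrative integer levels, not values from the dataset.
target = encode_rfm(5, 3, 200)
print(len(target), target)
```

With this encoding the network predicts 24 binary outputs per customer instead of three real-valued ones.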
The performance results on the test dataset in this table show that the RNN models have
competitive performance as an RFM recommender system.
Time-LSTM 2 and Time-LSTM 3 perform better than Time-LSTM 1,
which demonstrates the effectiveness of using two time gates instead of one time gate.
We briefly experimented with other units than GRU. We found both the classic RNN unit and LSTM
to perform worse.
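For reference, the way GRU gates decide when and by how much to update the hidden state can be sketched with a single scalar unit. This is a simplified illustration under stated assumptions, not the network used in the study: real GRUs use weight matrices and vectors, the parameter names in `params` are made up for the example, and the update convention shown is one common formulation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h_prev, params):
    """One GRU update for a scalar input and a scalar hidden state."""
    # Update gate: how much of the candidate replaces the old state.
    z = sigmoid(params["w_z"] * x + params["u_z"] * h_prev + params["b_z"])
    # Reset gate: how much of the old state feeds the candidate.
    r = sigmoid(params["w_r"] * x + params["u_r"] * h_prev + params["b_r"])
    # Candidate hidden state.
    h_cand = math.tanh(params["w_h"] * x + params["u_h"] * (r * h_prev) + params["b_h"])
    # Interpolate between the old state and the candidate.
    return (1.0 - z) * h_prev + z * h_cand


names = ["w_z", "u_z", "b_z", "w_r", "u_r", "b_r", "w_h", "u_h", "b_h"]
zero_params = {name: 0.0 for name in names}
half_update = gru_step(1.0, 0.8, zero_params)

# A strongly negative update-gate bias keeps the old state almost unchanged.
closed_params = dict(zero_params, b_z=-50.0)
kept_state = gru_step(1.0, 0.8, closed_params)
print(half_update, kept_state)
```

When the update gate stays near zero the state is copied forward almost unchanged, which is how the unit mitigates vanishing gradients over long sessions.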
1- Most of the popular machine learning methods used
in e-commerce, including logistic regression, neural networks,
and random forests, employ vector-based models.