Boosting conversion rates on ecommerce using deep learning algorithms
Armando Vieira (Armando@dataai.uk)
31 Oct 2014
Objective
Predict the probability that a user will buy a product from an online shop based
on past interactions within the shop website.
Approach
This problem will be analysed in two stages: first using off-the-shelf
classification algorithms, and second using a stacked auto-encoder to reduce the
dimensionality of the problem.
Data description
The data consists of one week of records of user interactions with an ecommerce
site. Events have a userId, a timestamp, an event type (5 categories: pageview,
basketview, buy, adclick and adview) and a productId (around 25 000 categories).
For buy and basketview events we also have information on the price. We ignore
adview and adclick events.
Only about 1% of products (around 250) have a full category
identification. However, these correspond to about 85% of pageviews and 92%
of buys. In this section we only consider interactions with these products and
exclude the others.
The data is about 10 GB and cannot be loaded into my laptop memory, so
we first took a subsample of the first 100 000 events just to have a snapshot of
the interactions. We found:
• 78 360 pageview events (~78.4% of total events) from 13 342 unique
users.
• 16 409 basketview events (~16.4%) from 3 091 unique users.
• 2 430 sales events (~2.4%) from 2 014 unique users (around 1.2 sales per
user).
If we restrict to the 257 labelled product categories, we find 39 561 pageviews
from 7 469 distinct users, which is about half of the population.
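Counts like the ones above can be obtained in a single streaming pass over the event log, without loading the full file into memory. The following is a minimal sketch in Python, not the author's code: the tuple layout (userId, timestamp, event type, productId) and the function name summarize_events are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import islice


def summarize_events(events, limit=100_000):
    """Count events and unique users per event type over the first `limit` events.

    `events` is any iterable of (user_id, timestamp, event_type, product_id)
    tuples; this layout is assumed for illustration.
    """
    counts = Counter()
    users = defaultdict(set)
    for user_id, _ts, event_type, _product_id in islice(events, limit):
        counts[event_type] += 1
        users[event_type].add(user_id)
    # Map each event type to (number of events, number of unique users).
    return {etype: (n, len(users[etype])) for etype, n in counts.items()}


# Toy stream: three pageviews from two users and one buy.
sample = [
    ("u1", 1, "pageview", "p1"),
    ("u1", 2, "pageview", "p2"),
    ("u2", 3, "pageview", "p1"),
    ("u2", 4, "buy", "p1"),
]
summary = summarize_events(sample)
print(summary)
```

Because the function only consumes an iterator, it can be fed directly from a file reader, so memory use depends on the number of distinct users rather than on the file size.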
We found an average of 6 interactions per user. The distribution is very
skewed, following a power law (see next figure): most users make a single
interaction, while very few engage in a very large number of interactions.
In terms of interactions with products, we also found that a few products receive
a very large number of interactions (pageviews) while others receive just a few
(see next figure).
Data for training the classifiers
To build the data set we restrict ourselves, for the moment, to the set of 257
product categories (which account for half of the pageviews); we will deal with
all categories in the future (see last section). Data was aggregated per product
category at the week level and at the semi-week level (two time buckets). In this
first iteration we do not add basketview events, as most of them occur in the
same session or day as the sales events, and the objective is to predict sales
with at least one day of delay. We will consider them in the next iteration.
All data sets were balanced: same number of sales events and non-sales
events. Because of the large size of the data, we essentially study the
importance of sample size. We excluded pageview events from the same day as the
sale event or the day before.
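The construction described above (pageview counts per category and time bucket, exclusion of pageviews close to the sale, balanced classes) can be sketched as follows. This is an illustrative reading of the text, not the author's pipeline: the event tuple layout, Unix-second timestamps, labelling a user as positive if they have at least one buy, and balancing by downsampling are all assumptions.

```python
import random
from collections import defaultdict

DAY = 24 * 3600  # seconds; timestamps are assumed to be Unix seconds


def build_dataset(events, categories, week_start, n_buckets=1, seed=0):
    """Return (X, y): per-user pageview counts per (time bucket, category).

    events: iterable of (user_id, timestamp, event_type, category) tuples.
    A user is labelled 1 if they have at least one "buy" event (assumption).
    For buyers, pageviews from the day before the first sale onwards are
    dropped. Classes are balanced by downsampling the larger one.
    """
    index = {c: i for i, c in enumerate(categories)}
    bucket_len = 7 * DAY / n_buckets
    views = defaultdict(list)
    first_buy = {}
    for user, ts, etype, cat in events:
        if etype == "pageview" and cat in index:
            views[user].append((ts, cat))
        elif etype == "buy":
            first_buy[user] = min(ts, first_buy.get(user, ts))

    rows = {0: [], 1: []}
    for user, user_views in views.items():
        x = [0] * (n_buckets * len(categories))
        buy_ts = first_buy.get(user)
        for ts, cat in user_views:
            if buy_ts is not None and ts // DAY >= buy_ts // DAY - 1:
                continue  # day before the sale, day of the sale, or later
            bucket = int((ts - week_start) // bucket_len)
            bucket = min(max(bucket, 0), n_buckets - 1)
            x[bucket * len(categories) + index[cat]] += 1
        rows[int(buy_ts is not None)].append(x)

    n = min(len(rows[0]), len(rows[1]))
    rng = random.Random(seed)
    X = rng.sample(rows[0], n) + rng.sample(rows[1], n)
    y = [0] * n + [1] * n
    return X, y


toy = [
    ("a", 0 * DAY + 10, "pageview", "c1"),  # day 0: kept
    ("a", 4 * DAY + 10, "pageview", "c2"),  # day before the sale: dropped
    ("a", 5 * DAY + 10, "buy", "c2"),
    ("b", 1 * DAY, "pageview", "c1"),       # first half of the week
    ("b", 5 * DAY, "pageview", "c1"),       # second half of the week
]
X, y = build_dataset(toy, ["c1", "c2"], week_start=0, n_buckets=2)
print(X, y)
```

With n_buckets=1 this corresponds to the weekly aggregate, and with n_buckets=2 to the semi-week aggregate, which doubles the number of input features.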
The next table describes the tests done with the 5 data sets considered:
Data set   Size     Comments
Data 1     3 000    Only page views; 257 categories; weekly aggregate
Data 2     10 000   Same as Data 1 but more data
Data 3     30 000   Same as Data 1 but more data
Data 4     10 000   Same as Data 2 but semi-week aggregation
Data 5     3 000    Same as Data 1 but including top 2000 categories
Feature selection with Non-Negative Matrix Factorization (NMF)
In order to test the impact of not including all product categories, we considered
a new data set (Data 5) containing the top 2000 most visited product categories.
Since this is a very high-dimensional search space, we applied Non-Negative Matrix
Factorization (NMF) to reduce dimensionality.
Non-negative Matrix Factorization (NMF) is a class of unsupervised
learning algorithms, like Principal Component Analysis (PCA) or learning
vector quantization (LVQ), that factorize a data matrix subject to constraints.
Although PCA is a widely used algorithm, it has some drawbacks, like its linearity
and poor performance on factors. Furthermore, it enforces only a weak
orthogonality constraint. LVQ uses a winner-take-all constraint that results in
clustering the data into mutually exclusive prototypes, but it performs poorly on
high-dimensional correlated data.
Non-negativity is a more robust constraint for matrix factorization [5].
Given a non-negative matrix V (containing the training data), NMF finds
non-negative matrix factors W and H such that:
V \approx W H
Each data vector (a column of V) can be approximated by a linear
combination of the columns of W, weighted by the corresponding column of the
patterns matrix H. Therefore, W can be regarded as containing a basis that is
optimized for the linear approximation of the data in V. Since relatively few
basis vectors are used to represent many data vectors, a good approximation can
only be achieved if the basis vectors discover the structure that is latent in
the data.
NMF has been successfully applied to high-dimensional problems with sparse
data, like image recognition and text analysis. In our case we used NMF to
compress data into a 10 feature subset. The major issue with NMF is the lack of
an optimal method to compute the factor matrices and of a stopping criterion to
find the ideal number of features to be selected.
In our case we apply NMF to reduce the dimensionality of the search
space to 100, 200 and 300 (Data 5 - 100, 200 and 300).
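As an illustration of the factorization V ≈ WH, here is a minimal NumPy sketch using multiplicative update rules for the squared reconstruction error. This is one common way to compute NMF and is not necessarily the method used in this report; the function name nmf, the iteration count and the random initialization are assumptions. Data vectors are stored as columns of V, so column j of H is the reduced k-dimensional representation of data vector j.

```python
import numpy as np


def nmf(V, k, n_iter=200, eps=1e-9, seed=0):
    """Factorize a non-negative V (features x samples) as W (features x k) @ H (k x samples).

    Uses multiplicative updates, which keep W and H non-negative as long as
    they start non-negative.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update the weights
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update the basis
    return W, H


# Toy data: 20 features, 30 samples, built from 2 non-negative factors.
rng = np.random.default_rng(1)
V = rng.random((20, 2)) @ rng.random((2, 30))
W, H = nmf(V, k=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(W.shape, H.shape, rel_err)
```

The number of components k plays the role of the 100, 200 or 300 features mentioned above; the sketch does not address how to choose it.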
Running the Classifiers
Based on these data sets, we test the performance of two classifiers: Logistic
Regression and Random Forest. The first is an industry standard and serves as a
baseline; the second is more robust and in general produces better results, but
has the disadvantage that its predictions are not easy to understand (black box).
We used the algorithms without any optimization of the parameters (number of
trees, number of variables to consider in each split, split level, etc.).
As a KPI to measure performance we use the standard Area Under the ROC
Curve (AUC). An AUC of 0.5 means a random (useless) classifier and 1 a perfect
one. For all runs we used 10-fold cross-validation. The results are presented in
the next table:
Data set       Logistic   Random Forest
Data 1         0.67       0.71
Data 2         0.69       0.76
Data 3         0.70       0.80
Data 4         0.68       0.82
Data 5 - 100   0.62       0.67
Data 5 - 200   0.64       0.69
Data 5 - 300   0.64       0.72
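For reference, the AUC reported in the table above can be computed directly from labels and scores as the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. The sketch below uses only the standard library; the function names auc and kfold_indices are illustrative, and the interleaved fold assignment is an assumption, since the report does not say how the 10 folds were drawn.

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Equals the probability that a random positive gets a higher score than
    a random negative, counting ties as one half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def kfold_indices(n, k=10):
    """Yield (train, test) index lists for k-fold cross-validation."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


# Three of the four positive/negative pairs are ranked correctly.
example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(example_auc)
```

The pairwise loop is quadratic in the number of examples, which is fine for a sketch; a sort-based version would be preferable for data sets of the sizes used here.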
We conclude that sample size is an important factor in the performance of
the classifier, though Logistic Regression does not show the same gains as the
Random Forest (RF) algorithm. Clearly RF performs much better than
logistic regression.
From data set 4 we also conclude that the time of events is an important
factor to take into account: although we increase the dimensionality of the
search space, we still have a net gain even when using fewer training examples.
From applying the algorithms to data set 5, we conclude that the NMF
algorithm is doing some compression of the data, but not in a very efficient way
(only the data set with 300 features improved on the accuracy obtained with the
initial subset of products). In the next section we suggest using auto-encoders
to reduce the dimensionality of the data for all the 25 000 categories.
The polarity of the variables is presented in appendix 1. The most important
variables are the ones corresponding to the products with the highest purchase
rate, which makes sense, as they correspond to the categories where most
buys are made.
Table 1: Confusion matrix for the dataset 1 with classifier .
       1      0
1    0.89   0.11
0    0.07   0.93
Confusion Matrix, ROC curves, variable importance and polarity: To Be Delivered
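Each row of Table 1 sums to 1, which suggests a row-normalised confusion matrix. The sketch below shows how such a matrix can be computed, under the assumption (not stated in the source) that rows are actual classes and columns are predicted classes; the function name is illustrative.

```python
def normalized_confusion(actual, predicted, classes=(1, 0)):
    """Row-normalised confusion matrix.

    Entry [i][j] is the share of examples whose actual class is classes[i]
    that were predicted as classes[j], so each non-empty row sums to 1.
    """
    counts = {a: {p: 0 for p in classes} for a in classes}
    for a, p in zip(actual, predicted):
        counts[a][p] += 1
    matrix = []
    for a in classes:
        total = sum(counts[a].values())
        matrix.append([counts[a][p] / total if total else 0.0 for p in classes])
    return matrix


actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]
cm = normalized_confusion(actual, predicted)
print(cm)
```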
Work to be performed
Stacked auto-encoders
Auto-encoders are neural networks for unsupervised feature learning and
classification that belong to the category now called deep learning
neural networks. They are especially suited to hard problems involving very
high-dimensional data where we have a large number of training examples but most
of them are unlabeled, as in text analysis or bioinformatics.
In its simplest form, an auto-encoder can be seen as a special neural
network with three layers: the input layer, the latent (hidden) layer, and the
reconstruction layer (as shown in Figure 1 below).
Figure 1: schematic representation of an auto-encoder. The blue points correspond to raw data and
the red to labelled data used for fine-tuning supervision.
An auto-encoder contains two parts. (1) The encoder maps an input x_0 to the
latent representation (feature) x_1 via a deterministic mapping f_e:
x_1 = f_e(x_0) = s_e(W_1^T x_0 + b_1)
where s_e is the activation function of the encoder, whose input is called the
activation of the latent layer, and {W_1, b_1} is the parameter set, with a
weight matrix W_1 and a bias vector b_1. (2) The decoder maps the latent
representation x_1 back to a reconstruction x_2 via another mapping function f_d:
x_2 = f_d(x_1) = s_d(W_2^T x_1 + b_2)
The input of s_d is called the activation of the reconstruction layer. Parameters are
learned through back-propagation by minimizing the loss function L(x_0, x_2):
L(x_0, x_2) = L_r(x_0, x_2) + 0.5 \lambda (\|W_1\|_2^2 + \|W_2\|_2^2)
which consists of the reconstruction error L_r(x_0, x_2) and the L2 regularization of
W_1 and W_2. By minimizing the reconstruction error, we require that the latent
features be able to reconstruct the original input as closely as possible. In this
way, the latent features preserve regularities of the original data. The squared
Euclidean distance is often used for L_r(x_0, x_2). Other loss functions, such as
negative log-likelihood and cross-entropy, are also used. The L2 regularization term
is a weight decay that is added to the objective function to penalize large weights
and reduce over-fitting. The coefficient \lambda is the weight decay cost, which is
usually a small number.
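To make the encoder, decoder and loss concrete, the following NumPy sketch evaluates them for a single input vector. It is a hedged illustration rather than the author's implementation: sigmoid activations and the squared Euclidean reconstruction error are chosen because the text mentions them as typical, and the name lam for the weight decay cost and its default value are assumptions.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def autoencoder_loss(x0, W1, b1, W2, b2, lam=1e-3):
    """Return (loss, x1, x2) for one input vector x0.

    x1 = s_e(W1^T x0 + b1) and x2 = s_d(W2^T x1 + b2), with sigmoid
    activations and squared Euclidean reconstruction error.
    """
    x1 = sigmoid(W1.T @ x0 + b1)      # latent representation
    x2 = sigmoid(W2.T @ x1 + b2)      # reconstruction
    recon = np.sum((x0 - x2) ** 2)    # reconstruction error L_r(x0, x2)
    decay = 0.5 * lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2))  # L2 weight decay
    return recon + decay, x1, x2


# With all-zero parameters every unit outputs 0.5, so for an input of six
# ones the loss is 6 * (1 - 0.5)^2 = 1.5.
d0, d1 = 6, 3
x0 = np.ones(d0)
loss, x1, x2 = autoencoder_loss(
    x0, np.zeros((d0, d1)), np.zeros(d1), np.zeros((d1, d0)), np.zeros(d0)
)
print(loss, x1.shape, x2.shape)
```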
A stacked auto-encoder (SAE) is a neural network with multiple layers of
auto-encoders. It has been widely used as a deep learning method for dimensionality
reduction and feature learning.
Figure 2: schematic representation of a stacked auto-encoder.
As illustrated in Figure 2, there are h auto-encoders, which are trained in a bottom-up
and layer-wise manner. The input vectors (blue in the figure) are fed to the
bottom auto-encoder. After the bottom auto-encoder has been trained, its output
latent representations are propagated to the next layer. The sigmoid or tanh
function is typically used for the activation functions s_e and s_d.
The same procedure is repeated until all the auto-encoders are trained. After
this pre-training stage, the whole neural network is fine-tuned based on a
pre-defined objective. The latent layer of the top auto-encoder is the output of the
stacked auto-encoder, which can be further fed into other applications, such as an
SVM for classification. The unsupervised pre-training can automatically exploit large
amounts of unlabeled data to obtain a better weight initialization for the neural
network than traditional random initialization.
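The bottom-up, layer-wise pre-training described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the configuration used for the results in this report: full-batch gradient descent, sigmoid activations, squared reconstruction error, and the hyper-parameter values and function names are all chosen for the example. The supervised fine-tuning stage is not included.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_autoencoder(X, n_hidden, lam=1e-4, lr=0.5, epochs=200, seed=0):
    """Train one auto-encoder on the rows of X by batch gradient descent.

    Returns the encoder parameters (W1, b1) and the final mean loss.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)             # latent codes, one row per example
        R = sigmoid(H @ W2 + b2)             # reconstructions
        dR = 2 * (R - X) * R * (1 - R) / n   # gradient at decoder pre-activation
        dH = (dR @ W2.T) * H * (1 - H)       # back-propagated to the encoder
        W2 -= lr * (H.T @ dR + lam * W2)
        b2 -= lr * dR.sum(axis=0)
        W1 -= lr * (X.T @ dH + lam * W1)
        b1 -= lr * dH.sum(axis=0)
    H = sigmoid(X @ W1 + b1)
    R = sigmoid(H @ W2 + b2)
    loss = np.mean(np.sum((R - X) ** 2, axis=1))
    loss += 0.5 * lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
    return W1, b1, loss


def pretrain_stack(X, layer_sizes, **kwargs):
    """Greedy layer-wise pre-training: each auto-encoder is trained on the
    latent codes produced by the one below it."""
    params, inputs = [], X
    for size in layer_sizes:
        W, b, _ = train_autoencoder(inputs, size, **kwargs)
        params.append((W, b))
        inputs = sigmoid(inputs @ W + b)
    return params, inputs


# Toy data: 50 binary vectors with 8 features, compressed to 5 and then 3.
rng = np.random.default_rng(1)
X = (rng.random((50, 8)) > 0.5).astype(float)
params, codes = pretrain_stack(X, [5, 3])
print(codes.shape)
```

The returned codes are the output of the top latent layer and could be passed to a separate classifier, as the text suggests.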
Stacked auto-encoders have been used in problems with very sparse, high-dimensional
data of up to 100 000 input variables and billions of rows. Contrary to
shallow learning machines, like support vector machines (SVM) and traditional neural
networks, these architectures can take advantage of large quantities of data and
continuously improve performance as new training examples are added.
Their only downside is the large computational effort needed to train
them (typically tens of hours or days on regular computers); in some cases we are
working with 100 million parameters that have to be learned. This can be alleviated
by using computation based on GPUs and a cluster of machines (like the Amazon
cloud), which can reduce the training time to a couple of hours or minutes.
Results
We used two approaches: Stacked Auto-Encoders and Deep Belief Networks (DBN).
DBN with several architectures, with N inputs and M outputs (in this case M = 1).
Stopping criteria. Learning rate.
Data set   Architecture   AUC
1          N-100-200-M    0.88
1          N-200-100-M    0.85
2          N-100-200-M    0.91
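To give a sense of the size of the architectures in the table above, the number of trainable parameters of a fully connected network can be counted layer by layer. The sketch below is illustrative only: N = 257 assumes one input per labelled product category (weekly aggregate), which the report does not state explicitly for this table, and N = 25 000 corresponds to using all product categories.

```python
def count_parameters(layer_sizes):
    """Number of weights plus biases in a fully connected network."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# N-100-200-M with M = 1, for two assumed input sizes N.
small = count_parameters([257, 100, 200, 1])
large = count_parameters([25_000, 100, 200, 1])
print(small, large)
```

With these assumptions the first hidden layer dominates the count as soon as N is large, which is why the input dimensionality drives the training cost.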
Boosting conversion rates on ecommerce using deep learning algorithms

More Related Content

What's hot

IRJET- Customer Buying Prediction using Machine-Learning Techniques: A Survey
IRJET- Customer Buying Prediction using Machine-Learning Techniques: A SurveyIRJET- Customer Buying Prediction using Machine-Learning Techniques: A Survey
IRJET- Customer Buying Prediction using Machine-Learning Techniques: A SurveyIRJET Journal
 
Machine-Learning: Customer Segmentation and Analysis.
Machine-Learning: Customer Segmentation and Analysis.Machine-Learning: Customer Segmentation and Analysis.
Machine-Learning: Customer Segmentation and Analysis.Siddhanth Chaurasiya
 
Business Analytics in Retail E-Commerce
Business Analytics in Retail E-CommerceBusiness Analytics in Retail E-Commerce
Business Analytics in Retail E-CommerceAnand Narayanan
 
FHCC: A SOFT HIERARCHICAL CLUSTERING APPROACH FOR COLLABORATIVE FILTERING REC...
FHCC: A SOFT HIERARCHICAL CLUSTERING APPROACH FOR COLLABORATIVE FILTERING REC...FHCC: A SOFT HIERARCHICAL CLUSTERING APPROACH FOR COLLABORATIVE FILTERING REC...
FHCC: A SOFT HIERARCHICAL CLUSTERING APPROACH FOR COLLABORATIVE FILTERING REC...IJDKP
 
Association Rule based Recommendation System using Big Data
Association Rule based Recommendation System using Big DataAssociation Rule based Recommendation System using Big Data
Association Rule based Recommendation System using Big DataIRJET Journal
 
Introduction to Dimesional Modelling
Introduction to Dimesional ModellingIntroduction to Dimesional Modelling
Introduction to Dimesional ModellingAshish Chandwani
 
A Comparative Study of Techniques to Predict Customer Churn in Telecommunicat...
A Comparative Study of Techniques to Predict Customer Churn in Telecommunicat...A Comparative Study of Techniques to Predict Customer Churn in Telecommunicat...
A Comparative Study of Techniques to Predict Customer Churn in Telecommunicat...IRJET Journal
 
An impact of knowledge mining on satisfaction of consumers in super bazaars
An impact of knowledge mining on satisfaction of consumers in super bazaarsAn impact of knowledge mining on satisfaction of consumers in super bazaars
An impact of knowledge mining on satisfaction of consumers in super bazaarsIAEME Publication
 
Customer analytics software - Quiterian
Customer analytics software - QuiterianCustomer analytics software - Quiterian
Customer analytics software - QuiterianJosep Arroyo
 
Marketing Analytics with R Lifting Campaign Success Rates
Marketing Analytics with R Lifting Campaign Success RatesMarketing Analytics with R Lifting Campaign Success Rates
Marketing Analytics with R Lifting Campaign Success RatesRevolution Analytics
 
Sales analysis using product rating in data mining techniques
Sales analysis using product rating in data mining techniquesSales analysis using product rating in data mining techniques
Sales analysis using product rating in data mining techniqueseSAT Journals
 
Clustering customer data dr sankar rajagopal
Clustering customer data   dr sankar rajagopalClustering customer data   dr sankar rajagopal
Clustering customer data dr sankar rajagopalDr.Sankar Rajagopal
 
Threshold Secure B2B Model
Threshold Secure B2B ModelThreshold Secure B2B Model
Threshold Secure B2B ModelIOSR Journals
 
Predictive Modelling & Market-Basket Analysis.
Predictive Modelling & Market-Basket Analysis.Predictive Modelling & Market-Basket Analysis.
Predictive Modelling & Market-Basket Analysis.Siddhanth Chaurasiya
 
DW DIMENSN MODELNG
DW DIMENSN MODELNGDW DIMENSN MODELNG
DW DIMENSN MODELNGDivya Tadi
 
Paper id 212014126
Paper id 212014126Paper id 212014126
Paper id 212014126IJRAT
 
Presentation Title
Presentation TitlePresentation Title
Presentation Titlebutest
 
Predictive analytics km chicago
Predictive analytics km chicagoPredictive analytics km chicago
Predictive analytics km chicagoKM Chicago
 
Retail Design
Retail DesignRetail Design
Retail Designjagishar
 

What's hot (20)

IRJET- Customer Buying Prediction using Machine-Learning Techniques: A Survey
IRJET- Customer Buying Prediction using Machine-Learning Techniques: A SurveyIRJET- Customer Buying Prediction using Machine-Learning Techniques: A Survey
IRJET- Customer Buying Prediction using Machine-Learning Techniques: A Survey
 
Machine-Learning: Customer Segmentation and Analysis.
Machine-Learning: Customer Segmentation and Analysis.Machine-Learning: Customer Segmentation and Analysis.
Machine-Learning: Customer Segmentation and Analysis.
 
Business Analytics in Retail E-Commerce
Business Analytics in Retail E-CommerceBusiness Analytics in Retail E-Commerce
Business Analytics in Retail E-Commerce
 
FHCC: A SOFT HIERARCHICAL CLUSTERING APPROACH FOR COLLABORATIVE FILTERING REC...
FHCC: A SOFT HIERARCHICAL CLUSTERING APPROACH FOR COLLABORATIVE FILTERING REC...FHCC: A SOFT HIERARCHICAL CLUSTERING APPROACH FOR COLLABORATIVE FILTERING REC...
FHCC: A SOFT HIERARCHICAL CLUSTERING APPROACH FOR COLLABORATIVE FILTERING REC...
 
Association Rule based Recommendation System using Big Data
Association Rule based Recommendation System using Big DataAssociation Rule based Recommendation System using Big Data
Association Rule based Recommendation System using Big Data
 
Introduction to Dimesional Modelling
Introduction to Dimesional ModellingIntroduction to Dimesional Modelling
Introduction to Dimesional Modelling
 
A Comparative Study of Techniques to Predict Customer Churn in Telecommunicat...
A Comparative Study of Techniques to Predict Customer Churn in Telecommunicat...A Comparative Study of Techniques to Predict Customer Churn in Telecommunicat...
A Comparative Study of Techniques to Predict Customer Churn in Telecommunicat...
 
An impact of knowledge mining on satisfaction of consumers in super bazaars
An impact of knowledge mining on satisfaction of consumers in super bazaarsAn impact of knowledge mining on satisfaction of consumers in super bazaars
An impact of knowledge mining on satisfaction of consumers in super bazaars
 
Customer analytics software - Quiterian
Customer analytics software - QuiterianCustomer analytics software - Quiterian
Customer analytics software - Quiterian
 
Marketing Analytics with R Lifting Campaign Success Rates
Marketing Analytics with R Lifting Campaign Success RatesMarketing Analytics with R Lifting Campaign Success Rates
Marketing Analytics with R Lifting Campaign Success Rates
 
Sales analysis using product rating in data mining techniques
Sales analysis using product rating in data mining techniquesSales analysis using product rating in data mining techniques
Sales analysis using product rating in data mining techniques
 
Bank marketing mini-project
Bank marketing mini-projectBank marketing mini-project
Bank marketing mini-project
 
Clustering customer data dr sankar rajagopal
Clustering customer data   dr sankar rajagopalClustering customer data   dr sankar rajagopal
Clustering customer data dr sankar rajagopal
 
Threshold Secure B2B Model
Threshold Secure B2B ModelThreshold Secure B2B Model
Threshold Secure B2B Model
 
Predictive Modelling & Market-Basket Analysis.
Predictive Modelling & Market-Basket Analysis.Predictive Modelling & Market-Basket Analysis.
Predictive Modelling & Market-Basket Analysis.
 
DW DIMENSN MODELNG
DW DIMENSN MODELNGDW DIMENSN MODELNG
DW DIMENSN MODELNG
 
Paper id 212014126
Paper id 212014126Paper id 212014126
Paper id 212014126
 
Presentation Title
Presentation TitlePresentation Title
Presentation Title
 
Predictive analytics km chicago
Predictive analytics km chicagoPredictive analytics km chicago
Predictive analytics km chicago
 
Retail Design
Retail DesignRetail Design
Retail Design
 

Viewers also liked

E-commerce product classification with deep learning
E-commerce product classification with deep learning E-commerce product classification with deep learning
E-commerce product classification with deep learning Christopher Bonnett Ph.D
 
Applying machine learning to product categorization
Applying machine learning to product categorizationApplying machine learning to product categorization
Applying machine learning to product categorizationSushant Shankar
 
How Data Science can increase Ecommerce profits
How Data Science can increase Ecommerce profitsHow Data Science can increase Ecommerce profits
How Data Science can increase Ecommerce profitsRomexsoft
 
Data Science for e-commerce
Data Science for e-commerceData Science for e-commerce
Data Science for e-commerceInfoFarm
 
Data mining with Google analytics
Data mining with Google analyticsData mining with Google analytics
Data mining with Google analyticsGreg Bray
 
Machine Learning
Machine LearningMachine Learning
Machine Learningbutest
 
The Evolution of Digital Ecommerce
The Evolution of Digital EcommerceThe Evolution of Digital Ecommerce
The Evolution of Digital EcommerceIncubeta NMPi
 
Supervised Classifcation Portland Metro
Supervised Classifcation Portland MetroSupervised Classifcation Portland Metro
Supervised Classifcation Portland MetroDonnych Diaz
 
Practical Predictive Analytics Models and Methods
Practical Predictive Analytics Models and MethodsPractical Predictive Analytics Models and Methods
Practical Predictive Analytics Models and MethodsZhipeng Liang
 
Webinar: Maximize Keyword Profits & Conversions with Data Science
Webinar: Maximize Keyword Profits & Conversions with Data ScienceWebinar: Maximize Keyword Profits & Conversions with Data Science
Webinar: Maximize Keyword Profits & Conversions with Data ScienceQuanticMind
 
An ad words ad performance analysis by r
An ad words ad performance analysis by rAn ad words ad performance analysis by r
An ad words ad performance analysis by rSimonChen888
 
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn..."Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...Edge AI and Vision Alliance
 
Machine Learning in Ecommerce
Machine Learning in EcommerceMachine Learning in Ecommerce
Machine Learning in EcommerceDavid Jones
 
Digital analytics with R - Sydney Users of R Forum - May 2015
Digital analytics with R - Sydney Users of R Forum - May 2015Digital analytics with R - Sydney Users of R Forum - May 2015
Digital analytics with R - Sydney Users of R Forum - May 2015Johann de Boer
 
Interactively querying Google Analytics reports from R using ganalytics
Interactively querying Google Analytics reports from R using ganalyticsInteractively querying Google Analytics reports from R using ganalytics
Interactively querying Google Analytics reports from R using ganalyticsJohann de Boer
 
Locality Sensitive Hashing By Spark
Locality Sensitive Hashing By SparkLocality Sensitive Hashing By Spark
Locality Sensitive Hashing By SparkSpark Summit
 
Web data from R
Web data from RWeb data from R
Web data from Rschamber
 
Data Science and Machine Learning for eCommerce and Retail
Data Science and Machine Learning for eCommerce and RetailData Science and Machine Learning for eCommerce and Retail
Data Science and Machine Learning for eCommerce and RetailAndrei Lopatenko
 
Tapping the Data Deluge with R
Tapping the Data Deluge with RTapping the Data Deluge with R
Tapping the Data Deluge with RJeffrey Breen
 

Viewers also liked (20)

E-commerce product classification with deep learning
E-commerce product classification with deep learning E-commerce product classification with deep learning
E-commerce product classification with deep learning
 
Applying machine learning to product categorization
Applying machine learning to product categorizationApplying machine learning to product categorization
Applying machine learning to product categorization
 
How Data Science can increase Ecommerce profits
How Data Science can increase Ecommerce profitsHow Data Science can increase Ecommerce profits
How Data Science can increase Ecommerce profits
 
Data Science for e-commerce
Data Science for e-commerceData Science for e-commerce
Data Science for e-commerce
 
Data mining with Google analytics
Data mining with Google analyticsData mining with Google analytics
Data mining with Google analytics
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
The Evolution of Digital Ecommerce
The Evolution of Digital EcommerceThe Evolution of Digital Ecommerce
The Evolution of Digital Ecommerce
 
Supervised Classifcation Portland Metro
Supervised Classifcation Portland MetroSupervised Classifcation Portland Metro
Supervised Classifcation Portland Metro
 
Practical Predictive Analytics Models and Methods
Practical Predictive Analytics Models and MethodsPractical Predictive Analytics Models and Methods
Practical Predictive Analytics Models and Methods
 
Webinar: Maximize Keyword Profits & Conversions with Data Science
Webinar: Maximize Keyword Profits & Conversions with Data ScienceWebinar: Maximize Keyword Profits & Conversions with Data Science
Webinar: Maximize Keyword Profits & Conversions with Data Science
 
An ad words ad performance analysis by r
An ad words ad performance analysis by rAn ad words ad performance analysis by r
An ad words ad performance analysis by r
 
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn..."Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
 
Machine Learning in Ecommerce
Machine Learning in EcommerceMachine Learning in Ecommerce
Machine Learning in Ecommerce
 
Digital analytics with R - Sydney Users of R Forum - May 2015
Digital analytics with R - Sydney Users of R Forum - May 2015Digital analytics with R - Sydney Users of R Forum - May 2015
Digital analytics with R - Sydney Users of R Forum - May 2015
 
Interactively querying Google Analytics reports from R using ganalytics
Interactively querying Google Analytics reports from R using ganalyticsInteractively querying Google Analytics reports from R using ganalytics
Interactively querying Google Analytics reports from R using ganalytics
 
Locality Sensitive Hashing By Spark
Locality Sensitive Hashing By SparkLocality Sensitive Hashing By Spark
Locality Sensitive Hashing By Spark
 
Web data from R
Web data from RWeb data from R
Web data from R
 
Data Science and Machine Learning for eCommerce and Retail
Data Science and Machine Learning for eCommerce and RetailData Science and Machine Learning for eCommerce and Retail
Data Science and Machine Learning for eCommerce and Retail
 
Using R with Hadoop
Using R with HadoopUsing R with Hadoop
Using R with Hadoop
 
Tapping the Data Deluge with R
Tapping the Data Deluge with RTapping the Data Deluge with R
Tapping the Data Deluge with R
 

Similar to Boosting conversion rates on ecommerce using deep learning algorithms

IRJET-Handwritten Digit Classification using Machine Learning Models
IRJET-Handwritten Digit Classification using Machine Learning ModelsIRJET-Handwritten Digit Classification using Machine Learning Models
IRJET-Handwritten Digit Classification using Machine Learning ModelsIRJET Journal
 
Open06
Open06Open06
Open06butest
 
Working with the data for Machine Learning
Working with the data for Machine LearningWorking with the data for Machine Learning
Working with the data for Machine LearningMehwish690898
 
AMAZON STOCK PRICE PREDICTION BY USING SMLT
AMAZON STOCK PRICE PREDICTION BY USING SMLTAMAZON STOCK PRICE PREDICTION BY USING SMLT
AMAZON STOCK PRICE PREDICTION BY USING SMLTIRJET Journal
 
A02610104
A02610104A02610104
A02610104theijes
 
Paper-Allstate-Claim-Severity
Paper-Allstate-Claim-SeverityPaper-Allstate-Claim-Severity
Paper-Allstate-Claim-SeverityGon-soo Moon
 
Predicting Employee Attrition
Predicting Employee AttritionPredicting Employee Attrition
Predicting Employee AttritionShruti Mohan
 
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...IRJET Journal
 
data warehousing & minining 1st unit
data warehousing & minining 1st unitdata warehousing & minining 1st unit
data warehousing & minining 1st unitbhagathk
 
Higgs Boson Challenge
Higgs Boson ChallengeHiggs Boson Challenge
Higgs Boson ChallengeRaouf KESKES
 
A tour of the top 10 algorithms for machine learning newbies
A tour of the top 10 algorithms for machine learning newbiesA tour of the top 10 algorithms for machine learning newbies
A tour of the top 10 algorithms for machine learning newbiesVimal Gupta
 
2018 p 2019-ee-a2
2018 p 2019-ee-a22018 p 2019-ee-a2
2018 p 2019-ee-a2uetian12
 
1.6.data preprocessing
1.6.data preprocessing1.6.data preprocessing
1.6.data preprocessingKrish_ver2
 
House Price Estimation as a Function Fitting Problem with using ANN Approach
House Price Estimation as a Function Fitting Problem with using ANN ApproachHouse Price Estimation as a Function Fitting Problem with using ANN Approach
House Price Estimation as a Function Fitting Problem with using ANN ApproachYusuf Uzun
 
Ijariie1117 volume 1-issue 1-page-25-27
Ijariie1117 volume 1-issue 1-page-25-27Ijariie1117 volume 1-issue 1-page-25-27
Ijariie1117 volume 1-issue 1-page-25-27IJARIIE JOURNAL
 
Presentation_BigData_NenaMarin
Presentation_BigData_NenaMarinPresentation_BigData_NenaMarin
Presentation_BigData_NenaMarinn5712036
 
IRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification AlgorithmsIRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification AlgorithmsIRJET Journal
 
IRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification AlgorithmsIRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification AlgorithmsIRJET Journal
 
Weka_Manual_Sagar
Weka_Manual_SagarWeka_Manual_Sagar
Weka_Manual_SagarSagar Kumar
 

In terms of interactions with products, we also found that a few products receive a very large number of interactions (pageviews) while others receive just a few (see next figure).

Data for training the classifiers
To build the data set we restrict ourselves, for the moment, to the set of 257 product categories (which account for half of the pageviews); we will deal with all categories in the future (see last section). Data was aggregated at the week level per product category and at the semi-week level (two time buckets). In this first iteration we do not add basketview events, as most of them occur in the same session or day as the sales events and the objective is to predict sales with at least one day of delay. We will consider them in the next iteration.
All data sets were balanced: same number of sales events and non-sales events. Because of the large size of the data, we essentially study the importance of sample size. We excluded pageview events from the same day as the sale event or the day before it.
The next table describes the tests done with the 5 data sets considered:

Data set   Size     Comments
Data 1     3 000    Only page views; 257 categories; weekly aggregate
Data 2     10 000   Same as Data 1 but more data
Data 3     30 000   Same as Data 1 but more data
Data 4     10 000   Same as Data 2 but semi-week aggregation
Data 5     3 000    Same as Data 1 but including top 2000 categories
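The two preparation steps described above (dropping pageviews from the sale day and the day before, then balancing the classes) can be sketched as follows. This is a minimal illustration and not the code used for the report: the function names, the integer day index and the choice of random undersampling to balance the classes are all assumptions.

```python
import random
from collections import Counter


def build_features(pageviews, sale_day, categories):
    """Count pageviews per category for one user.

    pageviews  -- list of (day, category) tuples (hypothetical layout)
    sale_day   -- integer day of the sale, or None for a non-buyer
    categories -- ordered list of category names defining the columns

    Views on the sale day or the day before it are dropped, so only
    views at least two days older than the sale are counted.
    """
    counts = Counter(
        category
        for day, category in pageviews
        if sale_day is None or day < sale_day - 1
    )
    return [counts.get(category, 0) for category in categories]


def balance(rows, labels, seed=0):
    """Undersample so that sales (1) and non-sales (0) are equally frequent."""
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(positives), len(negatives))
    keep = sorted(rng.sample(positives, n) + rng.sample(negatives, n))
    return [rows[i] for i in keep], [labels[i] for i in keep]
```

For a buyer with views on days 1 to 4 and a sale on day 4, only the views from days 1 and 2 contribute to the feature vector.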
Feature selection with Non-Negative Matrix Factorization (NMF)
To test the impact of not including all product categories, we considered a new data set (Data 5) containing the top 2000 most visited product categories. Since this is a very high-dimensional search space, we applied Non-Negative Matrix Factorization (NMF) to reduce dimensionality.
NMF is a class of unsupervised learning algorithms, like Principal Components Analysis (PCA) or learning vector quantization (LVQ), that factorize a data matrix subject to constraints. Although PCA is widely used, it has some drawbacks, such as its linearity and poor performance on factors. Furthermore, it enforces a weak orthogonality constraint. LVQ uses a winner-take-all constraint that clusters the data into mutually exclusive prototypes, but it performs poorly on high-dimensional correlated data. Non-negativity is a more robust constraint for matrix factorization [5].
Given a non-negative matrix $V$ (containing the training data), NMF finds non-negative matrix factors $W$ and $H$ such that:
$$V \approx WH$$
Each data vector in $V$ (a data entry) can be approximated by a linear combination of the columns of $W$, weighted by the corresponding entries of the patterns matrix $H$. Therefore, $W$ can be regarded as containing a basis that is optimized for the linear approximation of the data in $V$. Since relatively few basis vectors are used to represent many data vectors, a good approximation can only be achieved if the basis vectors discover the structure that is latent in the data.
NMF has been successfully applied to high-dimensional problems with sparse data, like image recognition and text analysis. In our case we used NMF to compress data into a 10 feature subset. The major issue with NMF is the lack of an optimal method to compute the factor matrices and of a stopping criterion to find the ideal number of features to select.
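To make the factorization concrete, here is a minimal sketch of NMF using the multiplicative update rules of Lee and Seung, which are one common way to compute the factors. The report does not say which algorithm or library was used, so this choice, the function names, and the fixed iteration count (used in place of a stopping criterion) are assumptions. Data vectors are taken as the columns of V, so column j of H is the compressed representation of sample j. A numerical library would normally replace the plain-list helpers.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - WH, the reconstruction error."""
    wh = matmul(w, h)
    return sum(
        (x - y) ** 2 for row_v, row_wh in zip(v, wh) for x, y in zip(row_v, row_wh)
    ) ** 0.5


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative n x m matrix V into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h
```

Because every update multiplies a non-negative entry by a non-negative ratio, W and H stay non-negative throughout, which is the constraint the text refers to.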
In our case we apply NMF to reduce the dimensionality of the search space to 100, 200 and 300 (Data 5 - 100, 200 and 300).

Running the Classifiers
Based on the data sets, we test the performance of two classifiers: Logistic Regression and Random Forest. The first is a standard in industry and serves as a baseline; the second is more robust and in general produces better results. Its disadvantage is that its predictions are not easy to interpret (black box). We used the algorithms without any optimization of the parameters (number of trees, number of variables to consider in each split, split level, etc.).
As a KPI to measure performance we use the standard Area Under the ROC Curve (AUC). An AUC of 0.5 means a random (useless) classifier and an AUC of 1 a perfect one. For all runs we used 10-fold cross-validation.
The results are presented in the next table:

Data set       Logistic   Random Forest
Data 1         0.67       0.71
Data 2         0.69       0.76
Data 3         0.70       0.80
Data 4         0.68       0.82
Data 5 - 100   0.62       0.67
Data 5 - 200   0.64       0.69
Data 5 - 300   0.64       0.72

We conclude that sample size is an important factor in the performance of the classifier, though Logistic Regression does not gain as much as the Random Forest (RF) algorithm. Clearly RF performs much better than Logistic Regression.
From data set 4 we also conclude that the time of events is an important factor to take into account: although we increase the dimensionality of the search space, we still have a net gain even using fewer training examples.
From applying the algorithms to data set 5, we conclude that the NMF algorithm is compressing the data, but not very efficiently (only the data with 300 features improved the accuracy over the initial subset of products). In the next section we suggest using auto-encoders to reduce the dimensionality of the data for all the 25 000 categories.
Polarity of variables is presented in appendix 1. The most important variables are the ones corresponding to the products with the highest purchase rate, which makes sense, as they correspond to the categories where most buys are made.

Table 1: Confusion matrix for the dataset 1 with classifier .

       1     0
1     .89   .11
0     .07   .93

Confusion Matrix, ROC curves, variable importance and polarity: To Be Delivered
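The two evaluation quantities used above can be computed from scratch as a sanity check. The sketch below is illustrative only; the function names are hypothetical. The AUC is computed as the fraction of (buyer, non-buyer) pairs in which the buyer gets the higher score, with ties counted as half. The confusion matrix is normalized per row, on the assumption that the rows of Table 1 are the true classes, since each of its rows sums to 1.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Equals the probability that a randomly chosen positive example is
    scored above a randomly chosen negative one (ties count as 0.5).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def row_normalized_confusion(labels, predictions):
    """Confusion matrix with each true-class row scaled to sum to 1."""
    counts = {1: {1: 0, 0: 0}, 0: {1: 0, 0: 0}}
    for y, p in zip(labels, predictions):
        counts[y][p] += 1
    matrix = {}
    for y, row in counts.items():
        total = sum(row.values())
        matrix[y] = {p: (c / total if total else 0.0) for p, c in row.items()}
    return matrix
```

A classifier that gives every user the same score has an AUC of 0.5 under this definition, matching the "random classifier" reading in the text.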
Work to be performed

Stacked auto-encoders
Auto-encoders are unsupervised feature learning and classification neural networks that belong to the category now called deep learning. They are especially suited to hard problems involving very high-dimensional data where we have a large number of training examples but most of them are unlabeled, as in text analysis or bioinformatics.
In its simplest form, an auto-encoder can be seen as a special neural network with three layers: the input layer, the latent (hidden) layer, and the reconstruction layer (as shown in Figure 1 below). An auto-encoder contains two parts:
(1) The encoder maps an input to the latent representation (feature) via a deterministic mapping $f_e$:
$$x_1 = f_e(x_0) = s_e(W_1^T x_0 + b_1)$$
where $s_e$ is the activation function of the encoder, whose input is called the activation of the latent layer, and $\{W_1, b_1\}$ is the parameter set with a weight matrix $W_1$ and a bias vector $b_1$.
(2) The decoder maps the latent representation $x_1$ back to a reconstruction via another mapping function $f_d$:
$$x_2 = f_d(x_1) = s_d(W_2^T x_1 + b_2)$$
The input of $s_d$ is called the activation of the reconstruction layer.

Figure 1: Schematic representation of an auto-encoder. The blue points correspond to raw data and the red to labeled data used for fine-tuning supervision.

Parameters are learned through back-propagation by minimizing the loss function $L(x_0, x_2)$:
$$L(x_0, x_2) = L_r(x_0, x_2) + 0.5\,\lambda\,(\|W_1\|_2^2 + \|W_2\|_2^2)$$
which consists of the reconstruction error $L_r(x_0, x_2)$ and the L2 regularization of $W_1$ and $W_2$. By minimizing the reconstruction error, we require that the latent features be able to reconstruct the original input as closely as possible. In this way, the latent features preserve regularities of the original data. The squared Euclidean distance is often used for $L_r(x_0, x_2)$. Other loss functions such as negative log likelihood and cross-entropy are also used.
The L2 regularization term is a weight decay that is added to the objective function to penalize large weights and reduce over-fitting. The term $\lambda$ is the weight decay cost, which is usually a small number.
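The encoder, decoder and loss above translate almost line by line into code. The following is a minimal sketch, assuming a sigmoid activation for both s_e and s_d and the squared Euclidean distance for the reconstruction error; the function names and the parameter name `lam` (the weight decay cost) are hypothetical. W1 is stored with one row per input unit and W2 with one row per latent unit, so that W^T x is well defined.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine_t(w, x, b):
    """Compute W^T x + b, with W stored as len(x) rows of len(b) entries."""
    return [
        sum(w[i][j] * x[i] for i in range(len(x))) + b[j]
        for j in range(len(b))
    ]


def forward(x0, w1, b1, w2, b2):
    """Return the latent code x1 and the reconstruction x2."""
    x1 = [sigmoid(z) for z in affine_t(w1, x0, b1)]  # encoder
    x2 = [sigmoid(z) for z in affine_t(w2, x1, b2)]  # decoder
    return x1, x2


def squared_norm(w):
    return sum(v * v for row in w for v in row)


def loss(x0, x2, w1, w2, lam):
    """Squared reconstruction error plus 0.5 * lam * (|W1|^2 + |W2|^2)."""
    reconstruction = sum((a - b) ** 2 for a, b in zip(x0, x2))
    return reconstruction + 0.5 * lam * (squared_norm(w1) + squared_norm(w2))
```

With all weights and biases at zero, every unit outputs 0.5 and the regularization term vanishes, so the loss is just the squared distance between the input and a vector of 0.5 values.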
The stacked auto-encoder (SAE) is a neural network with multiple layers of auto-encoders. It has been widely used as a deep learning method for dimensionality reduction and feature learning.

Figure 2: Schematic representation of a stacked auto-encoder.

As illustrated in Figure 2, there are h auto-encoders, which are trained in a bottom-up, layer-wise manner. The input vectors (blue in the figure) are fed to the bottom auto-encoder. After the bottom auto-encoder has been trained, its output latent representations are propagated to the next layer up. The sigmoid or tanh function is typically used for the activation functions $s_e$ and $s_d$. The same procedure is repeated until all the auto-encoders are trained. After this pre-training stage, the whole neural network is fine-tuned based on a pre-defined objective. The latent layer of the top auto-encoder is the output of the stacked auto-encoder, which can be fed into other applications, such as an SVM for classification. The unsupervised pre-training can automatically exploit large amounts of unlabeled data to obtain a better weight initialization for the neural network than traditional random initialization.
Stacked auto-encoders have been used in problems with very sparse, high-dimensional data of up to 100 000 input variables and billions of rows. Contrary to shallow learning machines, like support vector machines (SVM) and traditional neural networks, these architectures can take advantage of large quantities of data and continuously improve performance as new training examples are added. Their only downside is the large computational effort needed to train them (typically tens of hours or days on regular computers); in some cases we are working with 100 million parameters that have to be learned. This can be alleviated by using GPU-based computation and a cluster of machines (like the Amazon cloud), which can reduce the training time to a couple of hours or minutes.
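The bottom-up, layer-wise pre-training described above can be sketched in a few dozen lines. This is a toy illustration under several assumptions, not the setup used in the report: sigmoid activations, squared reconstruction error without weight decay, plain stochastic gradient descent, and hypothetical function names and hyperparameters. The supervised fine-tuning stage is left out.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def encode(x, w, b):
    """Apply sigmoid(W^T x + b), with W stored as len(x) rows of len(b) entries."""
    return [
        sigmoid(sum(w[p][i] * x[p] for p in range(len(x))) + b[i])
        for i in range(len(b))
    ]


def train_autoencoder(data, hidden, epochs=200, lr=0.5, seed=0):
    """Train one auto-encoder by SGD and return its encoder parameters."""
    rng = random.Random(seed)
    d = len(data[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(d)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(hidden)]
    b1, b2 = [0.0] * hidden, [0.0] * d
    for _ in range(epochs):
        for x in data:
            h = encode(x, w1, b1)   # latent layer
            r = encode(h, w2, b2)   # reconstruction layer
            # Back-propagate the error of the loss 0.5 * |r - x|^2.
            d2 = [(r[j] - x[j]) * r[j] * (1 - r[j]) for j in range(d)]
            d1 = [
                sum(w2[i][j] * d2[j] for j in range(d)) * h[i] * (1 - h[i])
                for i in range(hidden)
            ]
            for i in range(hidden):
                for j in range(d):
                    w2[i][j] -= lr * h[i] * d2[j]
                    w1[j][i] -= lr * x[j] * d1[i]
            b2 = [b2[j] - lr * d2[j] for j in range(d)]
            b1 = [b1[i] - lr * d1[i] for i in range(hidden)]
    return w1, b1


def pretrain_stack(data, layer_sizes, **train_args):
    """Greedy layer-wise pre-training: each layer learns from the codes below it."""
    layers, current = [], data
    for size in layer_sizes:
        w, b = train_autoencoder(current, size, **train_args)
        layers.append((w, b))
        current = [encode(x, w, b) for x in current]
    return layers, current
```

Calling `pretrain_stack(data, [100, 200])` would correspond to the hidden part of an N-100-200-M architecture; the returned codes are the top latent layer that a classifier would consume.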
Results
We used two approaches: Stacked Auto-Encoders and Deep Belief Networks (DBN). DBN with several architectures, with N inputs, M outputs (in this case M=1). Stopping criteria. Learning rate.

Data set   Architecture   AUC
1          N-100-200-M    0.88
1          N-200-100-M    0.85
2          N-100-200-M    0.91