An Exploration of Sephora’s Winning Formula
Ke Li, Yuyan Wang, Xinyue Yan
November 30, 2018
Abstract
Sephora is the largest multinational chain of personal care and beauty e-commerce, and lip make-up is one of the most popular categories among cosmetics users. Whenever a user logs in to Sephora, the site shows a section of recommended products. Good recommendations provide customers with highly relevant, personalized service, which drives revenue for Sephora and keeps customers on the platform. Our first predictive task is therefore to recommend lip make-up products that a user is likely to purchase. For this task we build a model that recommends the most relevant products to each user, applying user-based collaborative filtering.
Looking at the purchase process, customers rarely check out with whatever has been recommended; they usually take a second look at reviews to learn more about the product. Since each product receives around 400 reviews on average, Sephora presents reviews ordered by a 'helpfulness' score between 0 and 1, with higher-scoring reviews listed first. The score is determined by the number of "Helpful" and "Not Helpful" clicks from readers of the review. Helpful reviews can make a product appealing, while uninformative reviews may lead users to close the window. Exploring the data set, we found that at least 50 percent of reviews have never received a vote, even though they might actually be helpful enough to stimulate a purchase. We therefore built a model combining LDA and linear regression to predict a review's helpfulness from its text, so that reviews can further support the recommendations and attract users.
Keywords: Machine Learning, Text Mining, Natural Language Processing, Collaborative Filtering, TF-IDF, LDA
1 Dataset
Since no suitable data set already exists, we began our analysis by crawling the product information and reviews for all lip make-up products on the Sephora website (https://www.sephora.com/shop/lips-makeup). We then conducted exploratory analysis to better understand the characteristics of users, products, and reviews, which informs the design of our models in the following sections.
1.1 Data Format
Our final data set includes 252,317 reviews from 175,434 users, covering 5,318 lip make-up products. Reviews and product information are stored as JSON files organized by page/record, and each record has two parts: (1) product attributes encapsulated in Includes, including the product id, name, brand, image URL, color id, the number of reviews the product has received, and other identifying information; (2) review attributes encapsulated in Results, which contain the reviewer id, nickname, submission time, personal features, and the review text with detailed information. To simplify the analysis and keep it efficient, we extract only the subset of features listed in Table 1.
Table 1: Data format
name description
products_id id of each product; reviews with the same id belong to the same product
color_id id of the lipstick's colors; each product has at least one color id
category_id id of each category; each product belongs to exactly one category
description description of the product
review_statistics statistical values related to reviews (see below)
_recommended_count number of reviews recommending the product
_average_overall_rating average overall rating given by users
_total_review_count total number of reviews for the product
_not_recommended_count number of reviews not recommending the product
_helpful_vote_count number of helpful votes received across all reviews of the product
brand_id & name id and name of each brand
author_id unique id for the user/reviewer
results list of reviews
rating rating number given by the user (range from 1 to 5)
review_text text of the reviews
context_data_values attributes of reviewers, including age, skin type, skin tone, hair color, eye color
helpfulness feedback from other users (1 denotes helpful, 0 denotes unhelpful)
user_nickname name of the user who submitted the review
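The mapping from the raw crawled records to the features in Table 1 can be sketched as follows. This is a minimal illustration only; the field names inside Includes and Results are assumptions based on the description above, not the actual payload keys.

```python
import json

def parse_page(path):
    """Extract product info and review features from one crawled JSON page.

    The key names used here are assumed for illustration; adjust them to
    match the actual crawled payload.
    """
    with open(path) as f:
        page = json.load(f)

    products = page.get("Includes", {}).get("Products", {})
    reviews = []
    for r in page.get("Results", []):
        reviews.append({
            "products_id": r.get("ProductId"),
            "author_id": r.get("AuthorId"),
            "rating": r.get("Rating"),
            "review_text": r.get("ReviewText"),
            "helpfulness": r.get("Helpfulness"),
            "user_nickname": r.get("UserNickname"),
            "context_data_values": r.get("ContextDataValues", {}),
        })
    return products, reviews
```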
Among all these features, the most valuable are those related to the review text, for two main reasons. First, customers express their preferences directly through the comments they submit, with either a positive or negative attitude towards the product. Second, beyond serving as feedback from existing users, reviews are a critical reference for new users deciding whether to purchase. It is therefore reasonable to pay particular attention to these attributes in order to build a more accurate recommender system.
1.2 Exploratory Data Analysis
1.2.1 Description of user data
Based on the user data we collected for customers who have submitted reviews, we conduct a descriptive analysis of basic user features, including age, skin type, skin tone, eye color and hair color, to gain a general understanding of user characteristics.
According to the figure below, the target users of Sephora lipstick products range from 18 to 54 years old, accounting for 89.1% of all records. Given the age group a newly created user belongs to, we can recommend items purchased by other users in the same age group and adjust the frequency of advertisements accordingly.
Figure 1: Age distribution of users
Furthermore, to improve the performance of the recommender system, it is worth taking a closer look at users' appearance-related features, which serve as the basis for user clustering and filtering. The statistical results are shown below.
Figure 2: Eye Colors of users
Figure 3: Skin Types of users
Figure 4: Hair Colors of users
1.2.2 Description of product data
Based on the same records, we also conduct a descriptive analysis of products, which supports the text mining and LDA work that follows.
As a first step, we divide the 620 products into 7 groups according to the number of reviews attached to each one. Note that we identify products by product_id rather than color_id, because many review texts lack the explanation or description needed to link them to a specific color (for example, color ids 1983931 and 2012706). With each lipstick product receiving an average of 479 reviews, we draw the bar plot of the distribution.
Figure 5: Distribution of Number of Reviews
Besides the number of reviews, brand popularity should also be considered a critical feature when building the recommendation model. For all brands in the dataset, we create a word cloud in which popularity is measured by the number of reviews users submitted.
Figure 6: Popularity of Brands
Furthermore, we examine the relationship between the length of a review and its helpfulness, which is relevant to the helpfulness prediction task. The results, however, do not indicate a strong positive relationship. Although helpfulness tends to increase with review length, the spread of the data, reflected in the standard deviation, increases as well (as shown in Figure 7), which calls for further adjustment if the length feature is used in the model.
Figure 7: Relationship between Review Length
and Helpfulness
We also explore the relationship between users and items, using reviewers as the connection. The figure below depicts the distribution of customers submitting reviews across the 18 product categories.
Figure 8: Distribution of Reviews in Categories
Considering that there are 620 products in total, we conclude that there is substantial overlap in the products reviewed by different customers, which provides the foundation for our similarity calculation and recommendation metrics.
2 Recommend Products
Sephora differentiates itself from other beauty retailers by looking for what is most important and relevant to each customer, and it constantly recommends products to users through sliding carousels on its website. For this task we build a recommendation model that recommends lipsticks to users. We split the users' purchase history into a 60% training set, a 20% validation set for tuning parameters, and a 20% test set for evaluating model performance.
2.1 Model Baseline
To get an idea of what to expect from the system, we first try the following baseline: popular items. In the cosmetics industry, it is natural for customers to follow trends and buy popular items. This baseline recommends the most popular lipsticks, i.e. the lipsticks with the largest number of sales, so every user receives exactly the same set of recommendations.
Figure 9: Prediction accuracy vs number of items
recommended on baseline model
From the figure above we can see that as we increase the number of recommended items, the accuracy of purchase prediction increases. When we recommend the 600 most popular items to each user, 90% of users in the test data buy a product we recommend. This does not mean the baseline is a good recommendation strategy, however: increasing the number of recommended items brings extra cost, which we discuss in detail in the evaluation section.
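A minimal sketch of this popularity baseline, assuming the purchase history is available as a list of (user, item) pairs; the helper names are ours.

```python
from collections import Counter

def popular_baseline(train_pairs, k):
    """Recommend the k most frequently purchased items to every user."""
    counts = Counter(item for _, item in train_pairs)
    return [item for item, _ in counts.most_common(k)]

def hit_rate(recommended, test_pairs):
    """Fraction of test users who bought at least one recommended item."""
    bought = {}
    for user, item in test_pairs:
        bought.setdefault(user, set()).add(item)
    rec = set(recommended)
    return sum(1 for items in bought.values() if items & rec) / len(bought)
```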
2.2 User-based Collaborative Filtering
In this model we recommend products to a user based on the fact that those products were liked by similar users. For example, if users A and B have liked the same lipsticks, and a new lipstick comes out that A likes, we can recommend that lipstick to B, because A and B seem to like the same products.
2.2.1 Utility Matrix
Our recommendation system focuses on two entities, users and items, and on the records of whether a user bought a certain item. The data is represented as a utility matrix: for a user-item pair, a value of 1 means the user bought the item and a value of 0 means they did not. The matrix is sparse, since for most pairs we have no explicit information about the user's purchase behavior. The goal of the recommendation system is to predict whether each "?" entry should be 1 or 0.
$$
R =
\begin{pmatrix}
1 & 0 & \dots \\
0 & ? & \dots \\
\vdots & \vdots & \ddots \\
1 & ? & \dots
\end{pmatrix}
$$

where rows correspond to users and columns to items.
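Concretely, the observed part of the utility matrix can be stored as a sparse 0/1 matrix. The sketch below assumes the purchase history is available as a list of (user, item) pairs; the helper name is ours.

```python
import numpy as np
from scipy.sparse import csr_matrix

def build_utility_matrix(pairs):
    """Build a sparse user-item matrix with 1 for every observed purchase."""
    pairs = set(pairs)  # drop duplicate purchases of the same item
    users = {u: i for i, u in enumerate(sorted({u for u, _ in pairs}))}
    items = {it: j for j, it in enumerate(sorted({it for _, it in pairs}))}
    rows = [users[u] for u, _ in pairs]
    cols = [items[it] for _, it in pairs]
    data = np.ones(len(pairs), dtype=np.int8)
    R = csr_matrix((data, (rows, cols)), shape=(len(users), len(items)))
    return R, users, items
```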
2.2.2 Jaccard Similarity
Jaccard similarity measures the overlap between the item sets of two users: two users are more similar when they share more purchased items.
$$\mathrm{Jaccard}(U_i, U_j) = \frac{|U_i \cap U_j|}{|U_i \cup U_j|}$$
Figure 10: Distribution of Jaccard coefficient
The distribution of Jaccard similarity values above shows that the coefficients are mostly large, with many above 0.9. This means that in our data users have very similar purchase behavior on lipsticks: people tend to like the products that others like. Our recommendation system therefore combines Jaccard similarity with popular items to find the most relevant products for each user. The system first recommends items bought by the users with the highest Jaccard similarity to the target user, and then falls back to popular items.
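A minimal sketch of this user-based strategy, assuming each user's purchase history is stored as a set of item ids; the back-off to popular items mirrors the description above, and the helper names are ours.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def recommend(user, purchases, popular_items, k=10):
    """Recommend k items: first from the most similar users, then popular ones."""
    own = purchases.get(user, set())
    # Rank the other users by Jaccard similarity with the target user.
    neighbours = sorted(
        (u for u in purchases if u != user),
        key=lambda u: jaccard(own, purchases[u]),
        reverse=True,
    )
    recs = []
    for u in neighbours:
        for item in purchases[u]:
            if item not in own and item not in recs:
                recs.append(item)
                if len(recs) == k:
                    return recs
    # Back off to globally popular items if the neighbours do not fill the list.
    for item in popular_items:
        if item not in own and item not in recs:
            recs.append(item)
            if len(recs) == k:
                break
    return recs
```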
2.3 Model Evaluation
As mentioned earlier, if we keep adding products to the recommendation list, users are increasingly likely to buy something we recommend, even under the trivial baseline. But recommending many products increases cost and decreases efficiency: people may grow tired of reading so many recommendations, and the recommendations may not contribute to their purchase behavior at all.
We therefore use precision to measure how accurately the system predicts which items users will purchase, and recall to measure how completely the system recommends the items a user will like.
$$\mathrm{precision} = \frac{|\{\text{returned items}\} \cap \{\text{relevant items}\}|}{|\{\text{returned items}\}|}$$

$$\mathrm{recall} = \frac{|\{\text{returned items}\} \cap \{\text{relevant items}\}|}{|\{\text{relevant items}\}|}$$
To weight precision and recall equally in our evaluation, we use the F1 metric:
$$F_1 = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$$
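Per user, these metrics are straightforward to compute from the returned and relevant item sets; the helper below is a minimal sketch.

```python
def precision_recall_f1(returned, relevant):
    """Precision, recall and F1 for one user's recommendation list."""
    returned, relevant = set(returned), set(relevant)
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```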
On the validation set we ran the model while recommending from 1 to 40 items to each user. As the plot below shows, recall keeps increasing as we recommend more items: each user has a fixed set of relevant items, so as we keep recommending we eventually hit some of them. Precision keeps decreasing because its denominator grows, and we find relevant items much more slowly than we add returned items. The F1 score, which depends on both recall and precision, first increases rapidly and then decreases gradually.
Figure 11: F1, Precision, Recall vs Number of
items we recommend
To keep the model efficient while balancing precision and recall, we return 10 items per recommendation, the value that gives the highest F1 score on the validation set. The F1 score of the final model on the test data is 0.278, improving on the baseline F1 score by 15% and on the baseline recall by 5%.
3 Predict Helpfulness
This task aims at an automatic mechanism for scoring review helpfulness, so that more helpful reviews can be presented to future customers. We extracted two attributes from the original data set: the review text and the helpfulness score. As stated in the abstract, the majority of reviews do not carry a helpfulness score, so we filter out the records with a "null" helpfulness value. This still leaves more than 100,000 records, which is enough data for further processing.
We split the filtered data set into a 50% training set, a 20% validation set for tuning parameters, and a 30% test set. MSE is used as the evaluation metric.
3.1 Model Baseline
Since we observed a scattered but still positive trend between review length and helpfulness score, the feature used in the baseline is the length of the review text. We used the validation set to find the optimal threshold; the result is shown in the following figure:
Figure 12: Find Optimal Threshold for Baseline
The optimal threshold for our baseline model is
0.004, thus the baseline expression is written as:
$$\mathrm{ReviewHelpfulness} = \mathrm{ReviewLength} \times 0.004 \tag{1}$$
This model serves as the reference point for evaluating the models developed in the following sections.
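A minimal sketch of this baseline, including the sweep over candidate thresholds (the coefficient multiplying review length) on the validation set; the grid of candidate values is an assumption for illustration.

```python
import numpy as np

def length_baseline(lengths, coef):
    """Predict helpfulness as coef * review length, clipped to [0, 1]."""
    return np.clip(coef * np.asarray(lengths, dtype=float), 0.0, 1.0)

def tune_threshold(val_lengths, val_scores, grid=np.arange(0.001, 0.011, 0.001)):
    """Pick the coefficient with the lowest MSE on the validation set."""
    val_scores = np.asarray(val_scores, dtype=float)
    mses = [np.mean((length_baseline(val_lengths, c) - val_scores) ** 2) for c in grid]
    return float(grid[int(np.argmin(mses))])
```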
3.2 TF-IDF Model
3.2.1 Model Design
Reviews containing words that are important across the whole corpus may receive higher helpfulness scores, so we approximate 'helpfulness' by how 'important' the words in a review are, as measured by TF-IDF. TF-IDF is a numerical statistic widely used in information retrieval. It is composed of two elements: Term Frequency (TF), which measures how often a word appears in a text/document, and Inverse Document Frequency (IDF), which offsets term frequency by measuring how many texts/documents contain the word. The expressions used in our model are:
$$\mathrm{TF}(w, r) = \frac{\#\text{ of occurrences of } w \text{ in review } r}{\text{total } \#\text{ of words in review } r} \tag{2}$$

$$\mathrm{IDF}(w) = \log_e \frac{\#\text{ of reviews}}{|\{r \in \text{reviews} : w \in r\}|} \tag{3}$$
Before creating the frequency matrix, we dropped stop words, stemmed the remaining words, and transformed them into the TF-IDF representation. For the implementation we used the TfidfVectorizer from the sklearn library for feature extraction. Note that we also constrain max_features, so that only the top max_features terms, ordered by term frequency across the training set, are kept. Too many features could lead to overfitting, while too few features could not differentiate the reviews effectively, resulting in an underfitting model; the number of features therefore needs to be tuned.
3.2.2 Training and Validation
The training stage is straightforward: we build a sparse matrix of TF-IDF features for the training set and fit a linear regression model against helpfulness. Two details require care. First, the predicted helpfulness score must lie within the range [0, 1.0], so we use the following expression for prediction:
$$\mathrm{Score} = \max(\min(1, \text{linear regression output}), 0) \tag{4}$$
Second, the same pre-processing steps (dropping stop words, stemming, vectorizing) need to be applied to the validation and test sets before applying the linear regression.
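A minimal sketch of this pipeline with scikit-learn, assuming the stemmed review texts and helpfulness scores are already split into training and validation arrays; stemming itself is omitted here.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LinearRegression

def fit_tfidf_regression(train_texts, train_scores, max_features):
    """Fit TF-IDF features (English stop words removed) plus linear regression."""
    vec = TfidfVectorizer(stop_words="english", max_features=max_features)
    X_train = vec.fit_transform(train_texts)
    reg = LinearRegression().fit(X_train, train_scores)
    return vec, reg

def predict_helpfulness(vec, reg, texts):
    """Predict helpfulness and clip the output to the valid [0, 1] range."""
    raw = reg.predict(vec.transform(texts))
    return np.clip(raw, 0.0, 1.0)
```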
Once we have the linear model, we validate it on the validation set and find the optimal number of features to use for the test data. The following figure shows how the MSE varies with the max_features value:
Figure 13: Find Optimal Number of Features for
TF-IDF
With approximately 800 features (words), the TF-IDF model achieves the best performance on the validation set.
3.2.3 Evaluation
The performance of the TF-IDF model and the baseline model on the validation set is shown in Table 2:
Table 2: Results of TF-IDF on the validation set
Model MSE
Baseline Model 0.15002893
TF-IDF + Linear_Regression 0.11272753
The model using TF-IDF and linear regression has reduced the baseline MSE by 24.8%.
3.3 LDA Model
The second way to view the helpfulness of a review is through the topic it discusses. We use Latent Dirichlet Allocation (LDA) to recover the latent topics behind the review text. LDA is a Bayesian learning method: it assumes a latent random variable determines the review's topic with distribution P(topic), and that, given the topic, the words in the review are generated according to the conditional probability P(word | topic). The task of the LDA model is to learn the prior P(topic) and the conditional probability table P(word | topic) from the review text.
After establishing the model structure, we explored whether the topic matrix gives better predictive performance on helpfulness than both the baseline and the TF-IDF model. Compared to TF-IDF, the LDA model's features capture the topic discussed in the text. Intuitively, the topic can directly affect how helpful customers find a review: a review that mostly discusses how wonderful the chat with a beauty adviser was, rather than how much the reviewer likes the lipstick for its moisturizing effect, is likely to be considered less helpful.
Similarly, before creating the topic matrix, we tokenized the review text into words and used CountVectorizer to transform it into a term-count matrix.
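A minimal sketch of the topic-feature pipeline with scikit-learn; the parameter values are illustrative rather than the ones used in the paper.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

def lda_features(train_texts, n_topics):
    """Turn review texts into per-document topic distributions."""
    vec = CountVectorizer(stop_words="english")
    counts = vec.fit_transform(train_texts)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    topics = lda.fit_transform(counts)  # shape: (n_reviews, n_topics)
    return vec, lda, topics
```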
3.3.1 Training and Validation
We started with a 15-topic model and computed the average helpfulness for each topic:
Figure 14: Relationship between topic and helpfulness
The figure shows that not all topics are equally helpful. If the dominant topic of a given review is topic 12, the review is likely to have a lower helpfulness score than one dominated by topic 2 or topic 19. We are therefore looking for a model whose topics differ the most in helpfulness.
The number of topics is the key parameter that decides whether the topics of an LDA model make sense. We therefore ran a grid search from 15 to 25 topics with a step of 1, training LDA on the transformed matrix. Given the LDA output, we fit a linear regression from the topic distributions to the helpfulness score. Table 3 shows the parameters of the different LDA models and the MSE on the validation set.
Table 3: MSE of predicting usefulness using different number of topics
Number of Topic MSE on Validation Set Log Likelihood Model Perplexity
15 0.131865 -5839527.734060916 423.2268901699496
16 0.129842 -5862523.743103024 433.42771242741054
17 0.127493 -5871409.85324313 437.43504911060467
18 0.122465 -5871373.935690109 437.4187771560674
19 0.113974 -5874825.186300991 438.9850874537416
20 0.110873 -5891637.942804745 446.6959467805993
21 0.110842 -5912014.697301019 456.22314533000224
22 0.110797 -5911136.048998294 457.80816999900003
23 0.110731 -5936159.312419161 459.43154819404225
24 0.110727 -59371523.23856239 467.7375391299828
25 0.110706 -5976159.232312414 477.34532429124632
There are three criteria to consider when choosing an LDA model: we prefer a model with the minimum MSE, but also a high log likelihood and low perplexity. Taking all of these into account on the validation set, the model with 20 topics has the best trade-off, because MSE barely decreases beyond 20 topics while perplexity keeps increasing. We therefore used 20 topics in our final LDA model.
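The grid search over the number of topics can be sketched as follows. It relies on the hypothetical lda_features helper from the previous sketch and reports the validation MSE together with scikit-learn's approximate log likelihood (score) and perplexity.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def search_n_topics(train_texts, train_scores, val_texts, val_scores,
                    candidates=range(15, 26)):
    """Compare topic counts by validation MSE, log likelihood and perplexity."""
    results = []
    for n in candidates:
        vec, lda, train_topics = lda_features(train_texts, n)
        reg = LinearRegression().fit(train_topics, train_scores)
        val_counts = vec.transform(val_texts)
        val_topics = lda.transform(val_counts)
        preds = np.clip(reg.predict(val_topics), 0.0, 1.0)
        mse = float(np.mean((preds - np.asarray(val_scores)) ** 2))
        results.append((n, mse, lda.score(val_counts), lda.perplexity(val_counts)))
    return results
```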
Table 4 lists the top 10 words for each of the 20 topics:
Table 4: Top 10 Words in 20-Topic model
Top 10 Words in LDA Topic Model
0 pencil touch lasted ago box month incredibly wore play drugstore
1 love colour lip absolutely power staying color look doe beautiful
2 product price worth size packaging great high totally scrub good
3 lipstick color formula liquid shade dry like drying doe transfer
4 wa sephora try store did color went bought tried looked
5 gloss lip sticky color love like just look shine glossy
6 review brand wish sephora cute kat soon von does better
7 lip product use dry balm day like time using work
8 long lip lasting liner recommend pigmented highly product love creamy
9 color perfect love shade lipstick nude skin tone wear red
10 packaging texture small warm case great smooth purse hand color
11 lip color stain doe apply dry hour like just need
12 natural look tint happy drinking looking lip eating color getting
13 sheer pigment hydrating rich color shimmer expected summer able job
14 lip dry soft year help winter skin literally greasy having
15 brand sephora store went buy kat cute bought von multiple
16 wa color did try looked just like got really thought
17 review high price product 10 did application minute slight opinion
18 lip balm product use ve used day using time tried
19 love color stay lipstick lip doe pencil great day long
In the 20-topic model, the top words of each topic make sense when read together. For example, in topic 5 the word "sticky" is highly weighted, suggesting this topic describes the texture of lip glosses, while topic 17 discusses the review and the rating of the product.
3.3.2 Evaluation
The model using LDA and linear regression has a validation MSE of 0.110842, reducing the baseline MSE by 26.1%.
3.4 Model Optimization
Since we’re projecting both the TF-IDF and LDA
transformed matrix to helpfulness using linear re-
gression, we’d want to regularize the regression
output.
Table 5: Influence of Review Length on Helpfulness
Review Length Score Mean Score Std
0 to 200 0.238 0.0340
200 to 400 0.207 0.0244
400 to 600 0.189 0.0395
600 to 800 0.183 0.0719
800 to 1000 0.176 0.104
1000 to 1500 0.174 0.119
> 1500 0.186 0.176
Notice that when review length increases, the
standard deviation of helpfulness also increases.
So we tried to design a regularization/penalty
term based on length:
$$\mathrm{Reg} = -\lambda \times 10^{-5} \times \log(\mathrm{ReviewLength}) \tag{5}$$
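Applied to the clipped regression output, the penalty can be sketched as below; lambda is the value tuned in Figures 15 and 16, and the helper name is ours.

```python
import numpy as np

def penalized_score(raw_scores, review_lengths, lam):
    """Subtract the length penalty lam * 1e-5 * log(length) and re-clip to [0, 1]."""
    penalty = lam * 1e-5 * np.log(np.asarray(review_lengths, dtype=float))
    return np.clip(np.asarray(raw_scores, dtype=float) - penalty, 0.0, 1.0)
```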
We then applied this penalty term to both models; the results are shown below:
Figure 15: Optimization of TF-IDF Model
Figure 16: Optimization of LDA Model
The optimal lambda for the TF-IDF model is 7, and the optimal lambda for the LDA model is 4.
3.5 Model Evaluation
The performance (MSE) of all models on the validation and test sets is listed below:
Table 6: Model Comparison
Model Validation Set Test Set
Baseline 0.15002893 0.19213458
TF-IDF 0.11148997 0.13033192
LDA 0.11061232 0.12134558
4 Related Studies
With a similar objective, the author of Collaborative Embeddings for Lipstick Recommendations applied the GloVe algorithm to build a recommender system, casting the problem as matrix factorisation. Taking users' browsing sessions with product contexts as input, the author decomposed the log of the product co-occurrence matrix into an embedding matrix E and a bias vector b. Learned by mini-batch stochastic gradient descent, the parameters E and b reveal latent relationships between items/products and support a fully collaborative item-based recommender system for predicting users' purchase behavior. In particular, the author used embedding algebra to avoid possible bias caused by brand attributes and to promote better product discovery.
In Improved Collaborative Filtering Algorithm Based on Multifactor Fusion and User Features Clustering, the authors propose an improved algorithm based on multifactor fusion and user-features clustering, computing user similarity from rating similarity and item-category preference. Meanwhile, Marko and Yoav compared pure collaborative and pure content-based systems and discussed the cold-start problem in Content-based, collaborative recommendation, which guided our choice of similarity function.
In Margaret Fu's Recommendation System for E-commerce Services, the author combines classification and collaborative filtering to predict the category of movie that a user will like. The system is based on a user community graph built from past watching history; similarity between users is determined by the edges connecting them. Different neighbours of a user receive different weights, so the opinion of a highly weighted neighbour counts more than that of a lowly weighted one. Using this methodology, the paper builds a Naive Bayes learner to estimate the probability that each user will like a certain movie genre. All of these methods are adaptable and the framework is easy to extend, which informed the user-based recommendation we built for Sephora.
5 Summary
In this project we first crawled data from Sephora's website. Since the site uses dynamic loading, retrieving all the reviews in JSON format was challenging. After gathering the data, we explored the properties of the overall data set.
Next, we built two models for our two research questions: recommending products to users from their past activity, and predicting the helpfulness of reviews so that more helpful reviews can be shown to future users. After refining features and models, we reach the following conclusions:
• To recommend lip make-up products, we built a recommendation system that efficiently predicts users' purchase behavior using user-based collaborative filtering. The model analyzes users' past purchases and compares purchases across users to predict future behavior and generate recommendations.
• To predict review helpfulness, we extracted the topics behind the review text using LDA, then projected the output onto the helpfulness score with a linear model penalized by review length. Compared to the baseline, the final model successfully decreased MSE by 36.8%.
9

More Related Content

What's hot

Impact of Brand Image, Trust and Affect on Consumer Brand Extension Attitude:...
Impact of Brand Image, Trust and Affect on Consumer Brand Extension Attitude:...Impact of Brand Image, Trust and Affect on Consumer Brand Extension Attitude:...
Impact of Brand Image, Trust and Affect on Consumer Brand Extension Attitude:...fahidsohail
 
THE IMPACT OF SERVICE QUALITY AND BRAND AWARENESS ON BRAND LOYALTY: (A STUDY ...
THE IMPACT OF SERVICE QUALITY AND BRAND AWARENESS ON BRAND LOYALTY: (A STUDY ...THE IMPACT OF SERVICE QUALITY AND BRAND AWARENESS ON BRAND LOYALTY: (A STUDY ...
THE IMPACT OF SERVICE QUALITY AND BRAND AWARENESS ON BRAND LOYALTY: (A STUDY ...paperpublications3
 
Perceieved Price
Perceieved PricePerceieved Price
Perceieved PriceJawad Ali
 
research
researchresearch
researchhanif69
 
A presentation on Spinnimng mantra
A presentation on Spinnimng mantraA presentation on Spinnimng mantra
A presentation on Spinnimng mantraBhavik Parmar
 
Effect of branding on consumer buying behaviour
Effect of branding on consumer buying behaviourEffect of branding on consumer buying behaviour
Effect of branding on consumer buying behaviourShashank Srivastav
 
The Customers’ Brand Identification with Luxury Hotels: A Social Identity Per...
The Customers’ Brand Identification with Luxury Hotels: A Social Identity Per...The Customers’ Brand Identification with Luxury Hotels: A Social Identity Per...
The Customers’ Brand Identification with Luxury Hotels: A Social Identity Per...Mark Anthony Camilleri
 
Comscore CPG white paper
Comscore CPG white paperComscore CPG white paper
Comscore CPG white paperBrian Crotty
 
Mobile phone brand loyalty and repurchase intention
Mobile phone brand loyalty and repurchase intentionMobile phone brand loyalty and repurchase intention
Mobile phone brand loyalty and repurchase intentionAlexander Decker
 
Importance of Perceived Brand Ranking for B2B Customers in Making High Risk P...
Importance of Perceived Brand Ranking for B2B Customers in Making High Risk P...Importance of Perceived Brand Ranking for B2B Customers in Making High Risk P...
Importance of Perceived Brand Ranking for B2B Customers in Making High Risk P...IOSRJBM
 
MODELING THE BRAND IMAGE OF COSMETICS AND ITS IMPACT ON CUSTOMER SATISFACTION...
MODELING THE BRAND IMAGE OF COSMETICS AND ITS IMPACT ON CUSTOMER SATISFACTION...MODELING THE BRAND IMAGE OF COSMETICS AND ITS IMPACT ON CUSTOMER SATISFACTION...
MODELING THE BRAND IMAGE OF COSMETICS AND ITS IMPACT ON CUSTOMER SATISFACTION...IAEME Publication
 
Amazon e service quality
Amazon e service qualityAmazon e service quality
Amazon e service qualityAmareshNayak12
 
The Effects of Consumer Experience towards Behavioral Intention of Loyalty th...
The Effects of Consumer Experience towards Behavioral Intention of Loyalty th...The Effects of Consumer Experience towards Behavioral Intention of Loyalty th...
The Effects of Consumer Experience towards Behavioral Intention of Loyalty th...ijtsrd
 
Determinants of Brand Equity in two Wheeler Industry A Study with Special Ref...
Determinants of Brand Equity in two Wheeler Industry A Study with Special Ref...Determinants of Brand Equity in two Wheeler Industry A Study with Special Ref...
Determinants of Brand Equity in two Wheeler Industry A Study with Special Ref...ijtsrd
 
Effect of Brand Awareness, Brand Association, Perceived Quality and Brand Loy...
Effect of Brand Awareness, Brand Association, Perceived Quality and Brand Loy...Effect of Brand Awareness, Brand Association, Perceived Quality and Brand Loy...
Effect of Brand Awareness, Brand Association, Perceived Quality and Brand Loy...IJAEMSJORNAL
 
The impact of brand extension strategy on the brand equity of fast moving con...
The impact of brand extension strategy on the brand equity of fast moving con...The impact of brand extension strategy on the brand equity of fast moving con...
The impact of brand extension strategy on the brand equity of fast moving con...Alexander Decker
 
Brand Image, Customer Satisfaction And Brand Loyalty Of Blackberry Mobile Phone
Brand Image, Customer Satisfaction And Brand Loyalty Of Blackberry Mobile PhoneBrand Image, Customer Satisfaction And Brand Loyalty Of Blackberry Mobile Phone
Brand Image, Customer Satisfaction And Brand Loyalty Of Blackberry Mobile Phoneinventionjournals
 
Omni-Channel CUstomer Care
Omni-Channel CUstomer CareOmni-Channel CUstomer Care
Omni-Channel CUstomer Careelcontact.com
 

What's hot (20)

Research method
Research methodResearch method
Research method
 
Impact of Brand Image, Trust and Affect on Consumer Brand Extension Attitude:...
Impact of Brand Image, Trust and Affect on Consumer Brand Extension Attitude:...Impact of Brand Image, Trust and Affect on Consumer Brand Extension Attitude:...
Impact of Brand Image, Trust and Affect on Consumer Brand Extension Attitude:...
 
THE IMPACT OF SERVICE QUALITY AND BRAND AWARENESS ON BRAND LOYALTY: (A STUDY ...
THE IMPACT OF SERVICE QUALITY AND BRAND AWARENESS ON BRAND LOYALTY: (A STUDY ...THE IMPACT OF SERVICE QUALITY AND BRAND AWARENESS ON BRAND LOYALTY: (A STUDY ...
THE IMPACT OF SERVICE QUALITY AND BRAND AWARENESS ON BRAND LOYALTY: (A STUDY ...
 
Perceieved Price
Perceieved PricePerceieved Price
Perceieved Price
 
research
researchresearch
research
 
A presentation on Spinnimng mantra
A presentation on Spinnimng mantraA presentation on Spinnimng mantra
A presentation on Spinnimng mantra
 
Effect of branding on consumer buying behaviour
Effect of branding on consumer buying behaviourEffect of branding on consumer buying behaviour
Effect of branding on consumer buying behaviour
 
The Customers’ Brand Identification with Luxury Hotels: A Social Identity Per...
The Customers’ Brand Identification with Luxury Hotels: A Social Identity Per...The Customers’ Brand Identification with Luxury Hotels: A Social Identity Per...
The Customers’ Brand Identification with Luxury Hotels: A Social Identity Per...
 
Comscore CPG white paper
Comscore CPG white paperComscore CPG white paper
Comscore CPG white paper
 
Mobile phone brand loyalty and repurchase intention
Mobile phone brand loyalty and repurchase intentionMobile phone brand loyalty and repurchase intention
Mobile phone brand loyalty and repurchase intention
 
Importance of Perceived Brand Ranking for B2B Customers in Making High Risk P...
Importance of Perceived Brand Ranking for B2B Customers in Making High Risk P...Importance of Perceived Brand Ranking for B2B Customers in Making High Risk P...
Importance of Perceived Brand Ranking for B2B Customers in Making High Risk P...
 
MODELING THE BRAND IMAGE OF COSMETICS AND ITS IMPACT ON CUSTOMER SATISFACTION...
MODELING THE BRAND IMAGE OF COSMETICS AND ITS IMPACT ON CUSTOMER SATISFACTION...MODELING THE BRAND IMAGE OF COSMETICS AND ITS IMPACT ON CUSTOMER SATISFACTION...
MODELING THE BRAND IMAGE OF COSMETICS AND ITS IMPACT ON CUSTOMER SATISFACTION...
 
Amazon e service quality
Amazon e service qualityAmazon e service quality
Amazon e service quality
 
Research methods
Research methodsResearch methods
Research methods
 
The Effects of Consumer Experience towards Behavioral Intention of Loyalty th...
The Effects of Consumer Experience towards Behavioral Intention of Loyalty th...The Effects of Consumer Experience towards Behavioral Intention of Loyalty th...
The Effects of Consumer Experience towards Behavioral Intention of Loyalty th...
 
Determinants of Brand Equity in two Wheeler Industry A Study with Special Ref...
Determinants of Brand Equity in two Wheeler Industry A Study with Special Ref...Determinants of Brand Equity in two Wheeler Industry A Study with Special Ref...
Determinants of Brand Equity in two Wheeler Industry A Study with Special Ref...
 
Effect of Brand Awareness, Brand Association, Perceived Quality and Brand Loy...
Effect of Brand Awareness, Brand Association, Perceived Quality and Brand Loy...Effect of Brand Awareness, Brand Association, Perceived Quality and Brand Loy...
Effect of Brand Awareness, Brand Association, Perceived Quality and Brand Loy...
 
The impact of brand extension strategy on the brand equity of fast moving con...
The impact of brand extension strategy on the brand equity of fast moving con...The impact of brand extension strategy on the brand equity of fast moving con...
The impact of brand extension strategy on the brand equity of fast moving con...
 
Brand Image, Customer Satisfaction And Brand Loyalty Of Blackberry Mobile Phone
Brand Image, Customer Satisfaction And Brand Loyalty Of Blackberry Mobile PhoneBrand Image, Customer Satisfaction And Brand Loyalty Of Blackberry Mobile Phone
Brand Image, Customer Satisfaction And Brand Loyalty Of Blackberry Mobile Phone
 
Omni-Channel CUstomer Care
Omni-Channel CUstomer CareOmni-Channel CUstomer Care
Omni-Channel CUstomer Care
 

Similar to Sephora's Winning Formula: An Exploration of Product Recommendations and Review Helpfulness Predictions

Customer Value Proposition
Customer Value Proposition Customer Value Proposition
Customer Value Proposition Reza Hashemi
 
Turn Online Reviews into Data Driven Business Decisions-2
Turn Online Reviews into Data Driven Business Decisions-2Turn Online Reviews into Data Driven Business Decisions-2
Turn Online Reviews into Data Driven Business Decisions-2Jon LeMire
 
Brand Keys Overview - A Guide to Predictive Brand Equity and Consumer Loyalty
Brand Keys Overview - A Guide to Predictive Brand Equity and Consumer LoyaltyBrand Keys Overview - A Guide to Predictive Brand Equity and Consumer Loyalty
Brand Keys Overview - A Guide to Predictive Brand Equity and Consumer LoyaltyBrand Keys
 
Explain logistics tasks involved in one service supply chain s.docx
Explain logistics tasks involved in one service supply chain s.docxExplain logistics tasks involved in one service supply chain s.docx
Explain logistics tasks involved in one service supply chain s.docxpauline234567
 
Ppt of review and rating
Ppt of review and ratingPpt of review and rating
Ppt of review and ratingsafalta thakur
 
Com 621 final paper Waiters 12-15-15
Com 621 final paper Waiters 12-15-15Com 621 final paper Waiters 12-15-15
Com 621 final paper Waiters 12-15-15Allen K. Waiters
 
download_business model metrics.pdf
download_business model metrics.pdfdownload_business model metrics.pdf
download_business model metrics.pdfTrnMinhThun12
 
Loyalty_Driver_Analysis_V13b
Loyalty_Driver_Analysis_V13bLoyalty_Driver_Analysis_V13b
Loyalty_Driver_Analysis_V13bBayesia USA
 
Download Brandequity
Download BrandequityDownload Brandequity
Download Brandequityb4nkb4nk
 
Brand Mind Territory
Brand Mind TerritoryBrand Mind Territory
Brand Mind Territorydoron_bs
 
A marketers guide to social media tools
A marketers guide to social media toolsA marketers guide to social media tools
A marketers guide to social media toolsPaola Caceres Oma
 
T3_UsefulBrand_Report
T3_UsefulBrand_ReportT3_UsefulBrand_Report
T3_UsefulBrand_ReportJames Lanyon
 
Customer_Analysis.docx
Customer_Analysis.docxCustomer_Analysis.docx
Customer_Analysis.docxKevalKabariya
 
Sentiment Analysis - A Definitive Guide
Sentiment Analysis - A Definitive GuideSentiment Analysis - A Definitive Guide
Sentiment Analysis - A Definitive GuideBytesview
 
leewayhertz.com-How to build an AI-powered recommendation system.pdf
leewayhertz.com-How to build an AI-powered recommendation system.pdfleewayhertz.com-How to build an AI-powered recommendation system.pdf
leewayhertz.com-How to build an AI-powered recommendation system.pdfrobertsamuel23
 
Branding garima ahuja,isha singh,gobind raj,mantaj sidhu, 30 oct. 2013
Branding garima ahuja,isha singh,gobind raj,mantaj sidhu, 30 oct. 2013Branding garima ahuja,isha singh,gobind raj,mantaj sidhu, 30 oct. 2013
Branding garima ahuja,isha singh,gobind raj,mantaj sidhu, 30 oct. 2013Gobind Raj Aulakh
 
BRIDGEi2i Whitepaper - The Science of Customer Experience Management
BRIDGEi2i Whitepaper - The Science of Customer Experience ManagementBRIDGEi2i Whitepaper - The Science of Customer Experience Management
BRIDGEi2i Whitepaper - The Science of Customer Experience ManagementBRIDGEi2i Analytics Solutions
 
Loyalty programs & New Opportunities
Loyalty programs & New OpportunitiesLoyalty programs & New Opportunities
Loyalty programs & New OpportunitiesGaurav Laddha
 
Brand building and category expansion_ITC Interrobang Case Competition FMS_Delhi
Brand building and category expansion_ITC Interrobang Case Competition FMS_DelhiBrand building and category expansion_ITC Interrobang Case Competition FMS_Delhi
Brand building and category expansion_ITC Interrobang Case Competition FMS_DelhiSukesh Chandra Gain
 

Similar to Sephora's Winning Formula: An Exploration of Product Recommendations and Review Helpfulness Predictions (20)

Customer Value Proposition
Customer Value Proposition Customer Value Proposition
Customer Value Proposition
 
Turn Online Reviews into Data Driven Business Decisions-2
Turn Online Reviews into Data Driven Business Decisions-2Turn Online Reviews into Data Driven Business Decisions-2
Turn Online Reviews into Data Driven Business Decisions-2
 
Brand Keys Overview - A Guide to Predictive Brand Equity and Consumer Loyalty
Brand Keys Overview - A Guide to Predictive Brand Equity and Consumer LoyaltyBrand Keys Overview - A Guide to Predictive Brand Equity and Consumer Loyalty
Brand Keys Overview - A Guide to Predictive Brand Equity and Consumer Loyalty
 
Explain logistics tasks involved in one service supply chain s.docx
Explain logistics tasks involved in one service supply chain s.docxExplain logistics tasks involved in one service supply chain s.docx
Explain logistics tasks involved in one service supply chain s.docx
 
Ppt of review and rating
Ppt of review and ratingPpt of review and rating
Ppt of review and rating
 
Com 621 final paper Waiters 12-15-15
Com 621 final paper Waiters 12-15-15Com 621 final paper Waiters 12-15-15
Com 621 final paper Waiters 12-15-15
 
download_business model metrics.pdf
download_business model metrics.pdfdownload_business model metrics.pdf
download_business model metrics.pdf
 
Loyalty_Driver_Analysis_V13b
Loyalty_Driver_Analysis_V13bLoyalty_Driver_Analysis_V13b
Loyalty_Driver_Analysis_V13b
 
Recommender system
Recommender systemRecommender system
Recommender system
 
Download Brandequity
Download BrandequityDownload Brandequity
Download Brandequity
 
Brand Mind Territory
Brand Mind TerritoryBrand Mind Territory
Brand Mind Territory
 
A marketers guide to social media tools
A marketers guide to social media toolsA marketers guide to social media tools
A marketers guide to social media tools
 
T3_UsefulBrand_Report
T3_UsefulBrand_ReportT3_UsefulBrand_Report
T3_UsefulBrand_Report
 
Customer_Analysis.docx
Customer_Analysis.docxCustomer_Analysis.docx
Customer_Analysis.docx
 
Sentiment Analysis - A Definitive Guide
Sentiment Analysis - A Definitive GuideSentiment Analysis - A Definitive Guide
Sentiment Analysis - A Definitive Guide
 
leewayhertz.com-How to build an AI-powered recommendation system.pdf
leewayhertz.com-How to build an AI-powered recommendation system.pdfleewayhertz.com-How to build an AI-powered recommendation system.pdf
leewayhertz.com-How to build an AI-powered recommendation system.pdf
 
Branding garima ahuja,isha singh,gobind raj,mantaj sidhu, 30 oct. 2013
Branding garima ahuja,isha singh,gobind raj,mantaj sidhu, 30 oct. 2013Branding garima ahuja,isha singh,gobind raj,mantaj sidhu, 30 oct. 2013
Branding garima ahuja,isha singh,gobind raj,mantaj sidhu, 30 oct. 2013
 
BRIDGEi2i Whitepaper - The Science of Customer Experience Management
BRIDGEi2i Whitepaper - The Science of Customer Experience ManagementBRIDGEi2i Whitepaper - The Science of Customer Experience Management
BRIDGEi2i Whitepaper - The Science of Customer Experience Management
 
Loyalty programs & New Opportunities
Loyalty programs & New OpportunitiesLoyalty programs & New Opportunities
Loyalty programs & New Opportunities
 
Brand building and category expansion_ITC Interrobang Case Competition FMS_Delhi
Brand building and category expansion_ITC Interrobang Case Competition FMS_DelhiBrand building and category expansion_ITC Interrobang Case Competition FMS_Delhi
Brand building and category expansion_ITC Interrobang Case Competition FMS_Delhi
 

Recently uploaded

定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一ffjhghh
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 

Recently uploaded (20)

定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 

Sephora's Winning Formula: An Exploration of Product Recommendations and Review Helpfulness Predictions

  • 1. An Exploration of Sephora’s Winning Formula Ke Li, Yuyan Wang, Xinyue Yan November 30, 2018 Abstract Speaking of make up, Sephora is the biggest multinational chain of personal care and beauty e-commerce, and lip make-ups are undoubtedly the hottest item that every cosmetic users passion for. It’s not hard to notice that whenever we log in with Sephora, there will always be a section for recommended products. Good recommendations provide customers with highly relevant personal- ized services which brings profit to Sephora and makes customers stay with Sephora. So we set our first predictive task as recommending lip make-up product to users that they’re likely to purchase. For this task we build a model to recommend most important, relevant products to users. We apply user-based Collaborative Filtering methodology in finalizing the model. Looking back on customer’s purchase process, rather than check out with whatever has been rec- ommended, they usually will take a second thought on reviews that can help them to know more the product. Since each product normally receives an average of 400 reviews, Sephora presents reviews to users based on ‘helpfulness’, scoring from 0 to 1, and reviews with higher scores are listed in the front. The score is decided by the number of ”Helpful” clicks and the number of ”Not Helpful” clicks by those who reviewed the text. Helpful reviews can make the product appealing, while less meaningful reviews may lead users close the window. Through exploring the data set we found that at least 50 percent of reviews have never been clicked, though they might actually be helpfulness to stimulate a purchase. So we built a model with LDA and linear regression to predict the helpful- ness of reviews given the text, such that reviews can further assist recommendations, to attract users. Keywords: Machine Learning, Text Mining, Natural Language Processing, Collaborative Fil- tering, TF-IDF, LDA 1 Dataset In our case, there’s no existing data set thus we started our analysis with crawling all the product information and reviews for all lip make-up products on Sephora website (https://www.sephora.com/shop/lips-makeup). We conducted exploratory analysis about the characteristics in order to better understand the features of users, products, and reviews, to fur- ther assist the design of our model in the follow- ing sections. 1.1 Data Format Our final data set includes 252,317 reviews from 175,434 users, about 5,318 lip make-up products. Reviews and product information from json files are embedded by page/record, and each record has features involving two parts, (1) product attributes encapsulated in Includes, including id, name, brand, URL of image, color id of the product, the number of comments certain prod- uct has along with other distinct information for identification; (2) user attributes encapsulated in Results, which contains reviewer id, nickname, time of the submission of reviews, personal fea- tures, text of reviews with detailed information. To simplify the process of analysis and maximize the efficiency, we tend to extract part of features for further exploration, which has been given in Table 1. 1
  • 2. Table 1: Data formula name description products_id id of each product, reviews with the same id are of the same product color_id id of colors of the lipstick, one product owns at least one color id category_id id of each category, one product could only be categorized into one type description description of the product review_statistics statistical values related to reviews (see below) _recommended_count count of numbers that the product recommended by certain user _average_overall_rating average number of overall ratings given by users _total_review_count total numbers of reviews related to the product _not_recommended_count count of not recommend reviews received by the product _helpful_vote_count count of helpful labels received by overall reviews of the product brand_id & name id and name of each brand author_id unique id for the user/reviewer results list of reviews rating rating number given by the user (range from 1 to 5) review_text text of the reviews context_data_values attributes of reviewers, including age, skin type, skin tone, hair color, eye color helpfuluess feedbacks from other user (1 denotes helpful, 0 denotes unhelpful) user_nickname name of the user who submitted the review Among all these features, the most valuable ones should be those related to review text for two main reasons. First of all, the preferences of customers are expressed directly through com- ments submitted by users, with either positive or negative attitude towards the products. Apart from feedback from original users, reviews can also be considered as critical reference as new users making decision of purchase. Thus it is reasonable to attach great importance and pay more attention to those attributes for better per- formance of recommending system with higher accuracy. 1.2 Exploratory Data Analysis 1.2.1 Description of user data Based on the user data we collected aimed at the customers who have submitted reviews, we are able to conduct descriptive analysis on basic features of users, including age, skin color, skin tone, eye color and hair color, to gain a general understanding of user characteristics. According to the figure below, we are able to conclude that the target users of sephora lipstick products are ranging from 18 to 54 years old, ex- plaining for 89.1% of total records. Given the age group information a newly created user belonged to, we are able to recommend items purchased by other users in the same age group, and adjust the frequency of advertisements accordingly. Figure 1: Age distribution of users Furthermore, to enhance the performance of recommender system, it is necessary to take a close look at user features of appearance as the basis of user clustering and filtering. The statis- tical results are shown below. Figure 2: Eye Colors of users Figure 3: Skin Types of users Figure 4: Hair Colors of users 2
1.2.2 Description of product data

Based on the same records, we also conducted a descriptive analysis of the products, to facilitate the text mining and LDA that follow.

As a first step, we divide the 620 products into 7 groups based on the number of reviews attached to each. In particular, we identify products by their distinct product_id rather than by color_id, because it is difficult to relate the latter to review content: many color ids (for example 1983931 and 2012706) lack the necessary explanation or description of the colors. With each lipstick product receiving 479 reviews on average, we draw the bar plot of the distribution.

Figure 5: Distribution of Number of Reviews

Apart from the number of reviews, the popularity of brands should also be considered a critical feature when creating the recommendation model. For all brands in the dataset, we created a word cloud, measuring the degree of popularity by the number of reviews users submitted.

Figure 6: Popularity of Brands

Furthermore, we tried to determine the relationship between the length of the text and the helpfulness of a review, for the helpfulness-prediction task. The results, however, do not indicate a significant positive relationship: although the helpfulness value increases as the text gets longer, the spread of the data, reflected in the standard deviation, increases accordingly (as shown in Figure 7), which requires further adjustment if the length feature is to be used in the model.

Figure 7: Relationship between Review Length and Helpfulness

We also explored the relationship between users and items, using reviewers as the connection. The figure below depicts the distribution of customers submitting comments on products across the 18 categories.

Figure 8: Distribution of Reviews in Categories

Considering that the total number of products is 620, we can conclude that there is overlap in the products reviewed by customers, which provides the foundation for our similarity calculation and recommendation metrics.
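As a small worked example of the exploration behind Figure 7 (and Table 5 later on), the snippet below bins reviews by text length and reports the mean and standard deviation of helpfulness per bin. It assumes the flattened reviews_df sketched earlier, with review_text and helpfulness columns; the bin edges mirror those of Table 5.

    import pandas as pd

    def length_vs_helpfulness(reviews_df: pd.DataFrame) -> pd.DataFrame:
        """Mean and standard deviation of helpfulness per review-length bin."""
        df = reviews_df.dropna(subset=["helpfulness"]).copy()
        df["review_length"] = df["review_text"].str.len()
        bins = [0, 200, 400, 600, 800, 1000, 1500, float("inf")]
        labels = ["0 to 200", "200 to 400", "400 to 600", "600 to 800",
                  "800 to 1000", "1000 to 1500", "> 1500"]
        df["length_bin"] = pd.cut(df["review_length"], bins=bins, labels=labels)
        return (df.groupby("length_bin", observed=True)["helpfulness"]
                  .agg(score_mean="mean", score_std="std"))

    print(length_vs_helpfulness(reviews_df))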
2 Recommend Products

Sephora differentiates itself from other beauty retailers by looking for what is most important and relevant to a customer, so it constantly recommends products to users through sliding windows on its website. For this task we build a recommendation model that recommends lipsticks to users. We split the whole data set of users' purchase history into a 60% training set, a 20% validation set for tuning parameters, and a 20% test set for evaluating model performance.

2.1 Model Baseline

To get an idea of what to expect from the system, we try the following baseline method: popular item. In the cosmetics industry it is natural for customers to follow trends and buy popular items. This baseline recommends to every user the most popular lipsticks, i.e. the lipsticks with the largest sales, so all users receive exactly the same set of recommendations.

Figure 9: Prediction accuracy vs number of items recommended on baseline model

From the figure above we can see that as we increase the number of recommended items, the accuracy of the purchase prediction increases; when we recommend 600 popular items to each user, 90% of the users in the test data buy a product we recommend. But this does not mean the baseline is a good recommendation strategy: increasing the number of recommended items brings more cost, which we discuss in detail in the evaluation part later.

2.2 User-based Collaborative Filtering

In this model we recommend to a user products that have been liked by users similar to them. For example, if users A and B like the same lipsticks and a new lipstick comes out that A likes, we can recommend that lipstick to B, because A and B seem to like the same products.

2.2.1 Utility Matrix

Our recommendation system mainly focuses on two entities, users and items, with a record of whether a user bought a certain item or not. The data is represented as a utility matrix giving, for each user-item pair, a value of 1 if the user bought the corresponding item and 0 if he or she did not:

R = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ ? & & & \vdots \\ & \ddots & & \\ & & 1 & ? \end{pmatrix}, with rows indexed by users and columns by items.

We assume the matrix is sparse: for many pairs we have no explicit information about the user's purchase behavior on the item. The goal of our recommendation system is to predict whether each "?" entry should be 1 or 0.

2.2.2 Jaccard Similarity

Jaccard similarity takes into account the number of preferences two users have in common: two users are more similar when they share more related items.

Jaccard(U_i, U_j) = |U_i ∩ U_j| / |U_i ∪ U_j|

Figure 10: Distribution of Jaccard coefficient

From the distribution of Jaccard similarity values above, the coefficients are mostly large, with many above 0.9. This means that in our data users have very similar purchase behavior on lipsticks: people tend to like products that others like as well. A recommendation system based on Jaccard similarity and popular items can therefore find the most relevant products for each user. The system first recommends items bought by the users with the highest Jaccard similarity, then fills the list with popular items, as sketched below.
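The following sketch shows this recommendation step: score candidate items by how many of a user's most Jaccard-similar neighbours bought them, then back-fill with globally popular items. It assumes purchases is a dict mapping each user id to the set of product ids that user bought (built from the utility matrix); the neighbourhood size of 50 is an illustrative choice, not a tuned value.

    from collections import Counter

    def jaccard(a: set, b: set) -> float:
        union = len(a | b)
        return len(a & b) / union if union else 0.0

    def recommend(user_id, purchases, n_items=10, n_neighbors=50):
        """purchases: dict mapping user id -> set of purchased product ids."""
        target = purchases[user_id]
        # Rank the other users by Jaccard similarity of their purchase sets.
        neighbors = sorted(
            (u for u in purchases if u != user_id),
            key=lambda u: jaccard(target, purchases[u]),
            reverse=True,
        )[:n_neighbors]
        # Score candidate items by how many similar users bought them.
        scores = Counter()
        for u in neighbors:
            for item in purchases[u] - target:
                scores[item] += 1
        recs = [item for item, _ in scores.most_common(n_items)]
        # Back-fill with globally popular items if the list is still short.
        if len(recs) < n_items:
            popularity = Counter(i for items in purchases.values() for i in items)
            for item, _ in popularity.most_common():
                if item not in recs and item not in target:
                    recs.append(item)
                if len(recs) == n_items:
                    break
        return recs

Working with plain Python sets keeps the Jaccard computation easy to follow; for the full user base a sparse-matrix implementation would be preferable.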
2.3 Model Evaluation

As mentioned earlier, if we keep adding products to the recommendation list, users will very likely buy something we recommend even under the trivial baseline model. But recommending many products increases cost and decreases system efficiency: people may get tired of reading so many recommendations, and the recommendations may not contribute to their purchase behavior at all.

We therefore choose precision to measure how accurately the system predicts whether users will purchase an item, and recall to measure how efficiently the system recommends an item that the user will like.

precision = |{returned items} ∩ {relevant items}| / |{returned items}|

recall = |{returned items} ∩ {relevant items}| / |{relevant items}|

To weight precision and recall equally in our evaluation, we use the F1 metric:

F1 = 2 · precision · recall / (precision + recall)

On the validation set, we ran our model recommending from 1 to 40 items to each user. From the plot below, recall keeps increasing with the number of recommended items, as expected: each user has a fixed set of relevant items, so longer lists eventually cover some of them. Precision keeps decreasing because its denominator grows, and the rate at which we find relevant items is much slower than the rate at which returned items increase. The F1 score, influenced by both recall and precision, increases rapidly at first and then decreases gradually.

Figure 11: F1, Precision, Recall vs Number of items we recommend

To keep the model efficient and to balance precision and recall, we return 10 items each time in our recommendation model, which gives the highest F1 score in validation. The F1 score of the final model on the test data is 0.278, improving the baseline F1 score by 15% and the baseline recall by 5%.

3 Predict Helpfulness

This task is designed to generate an automatic mechanism for scoring review helpfulness, in order to present more helpful reviews to future customers. We extracted two attributes from the original data set: the actual review text and the helpfulness score. As stated in the abstract, the majority of reviews do not have a helpfulness score, so we filter out records with a null helpfulness value. We are still left with more than 100,000 records, enough for further processing.

We split the filtered data set into a 50% training set, a 20% validation set for tuning parameters, and 30% for test. MSE is used as the measurement for model evaluation.

3.1 Model Baseline

Since we observed a scattered but still positive trend between review length and helpfulness, the feature used in the baseline is the length of the review text. We used the validation set to find the optimal coefficient; the result is shown in the following figure:

Figure 12: Find Optimal Threshold for Baseline

The optimal coefficient for our baseline model is 0.004, so the baseline is written as:

ReviewHelpfulness = ReviewLength × 0.004    (1)

This model serves as the reference for evaluating model performance while searching for the final model in the following sections.
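A minimal sketch of the baseline search behind Figure 12: sweep candidate length coefficients on the validation split and keep the one with the lowest MSE. The variable names (val_texts, val_scores), the candidate grid of 0.001 to 0.010, and the clipping of predictions into [0, 1] are illustrative assumptions; the paper only reports that 0.004 comes out best.

    import numpy as np

    def baseline_mse(lengths, scores, coef):
        """MSE of the length-only baseline: helpfulness ~ coef * review length."""
        pred = np.clip(lengths * coef, 0.0, 1.0)  # clipping is our assumption here
        return float(np.mean((pred - scores) ** 2))

    val_lengths = np.array([len(t) for t in val_texts])
    val_scores = np.asarray(val_scores)
    candidates = np.arange(0.001, 0.011, 0.001)
    best = min(candidates, key=lambda c: baseline_mse(val_lengths, val_scores, c))
    print(best, baseline_mse(val_lengths, val_scores, best))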
3.2 TF-IDF Model

3.2.1 Model Design

Reviews containing words that are more important across all reviews may receive higher helpfulness scores, so we try to capture 'helpfulness' through how 'important' the words in a review are, using TF-IDF, a numerical statistic widely used in information retrieval. TF-IDF is composed of two elements: term frequency (TF), which measures how often a single word appears in a text/document, and inverse document frequency (IDF), which offsets term frequency by measuring in how many texts/documents the word appears. The expressions used in our model are:

TF = (# of occurrences of the word in the review) / (total # of words in the review)    (2)

IDF = log_e ( |reviews| / |{review ∈ reviews : word ∈ review}| )    (3)

Before creating the frequency matrix, we first drop stop words, stem the remaining words, and transform them into the TF-IDF representation. For implementation we use TfidfVectorizer from the sklearn library for feature extraction. We also constrain max_features, so that only the top max_features vocabulary terms, ordered by term frequency across the training set, are considered. A large number of features could lead to overfitting, while too few features may not be expressive enough to differentiate the reviews, resulting in underfitting; we therefore tune the number of features to obtain a better model.

3.2.2 Training and Validation

The training stage is straightforward: we build a sparse matrix from the TF-IDF features of the training set and fit a linear regression model to helpfulness. Two things require care:

1. The predicted helpfulness score should lie within the range [0, 1.0]. We use the following expression for prediction:

Score = max(min(1, linear_regression_output), 0)    (4)

2. The same pre-processing steps (dropping stop words, stemming, vectorizing) need to be applied to the validation and test sets before applying the linear regression.

Once we have the linear model, we validate it on the validation set and find the optimal number of features to use on the test data. The following figure shows how the MSE varies with different max_features values:

Figure 13: Find Optimal Number of Features for TF-IDF

With approximately 800 features (words), the TF-IDF model performs best on the validation set.

3.2.3 Evaluation

The performance of the TF-IDF model and the baseline model on the validation data set is shown in Table 2:

Table 2: Results of TF-IDF on Test Set

Model                        MSE
Baseline Model               0.15002893
TF-IDF + Linear Regression   0.11272753

The model using TF-IDF and linear regression reduces the baseline MSE by 24.8%.
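A compact sketch of the TF-IDF pipeline described above, assuming train_texts/train_scores and val_texts/val_scores hold the pre-split review texts and helpfulness scores. Stop-word removal and max_features mirror Section 3.2.1; the stemming step from the paper is omitted here for brevity, and predictions are clipped into [0, 1] as in Equation (4).

    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_squared_error

    def fit_tfidf_model(train_texts, train_scores, max_features=800):
        # Keep only the top `max_features` terms across the training corpus.
        vectorizer = TfidfVectorizer(stop_words="english", max_features=max_features)
        X_train = vectorizer.fit_transform(train_texts)
        reg = LinearRegression().fit(X_train, train_scores)
        return vectorizer, reg

    def predict_helpfulness(vectorizer, reg, texts):
        # Clip the regression output into [0, 1], as in Equation (4).
        return np.clip(reg.predict(vectorizer.transform(texts)), 0.0, 1.0)

    vec, reg = fit_tfidf_model(train_texts, train_scores)
    val_pred = predict_helpfulness(vec, reg, val_texts)
    print(mean_squared_error(val_scores, val_pred))

Sweeping max_features over a grid (for example 200 to 2000) and keeping the value with the lowest validation MSE reproduces the tuning behind Figure 13.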
3.3 LDA Model

A second perspective from which to view the helpfulness of a review is the topic that the review discusses. Here we use Latent Dirichlet Allocation (LDA) to retrieve the latent topics behind the review text. LDA is a Bayesian learning method: it assumes a latent random variable decides the review's topic with distribution P(topic), and, given the topic, the words in the review text are drawn with conditional probability P(word | topic). The task of the LDA model is to learn the prior probability P(topic) and the conditional probability table P(word | topic) from the review text.

After establishing the structure of the model, we explore whether the topic matrix gives better predictive performance on helpfulness than both the baseline and the TF-IDF model. Compared with TF-IDF, the LDA model's features concern the topic discussed in the text. Intuitively, the topic can have a direct impact on how helpful customers find a review: a review that mostly discusses how wonderful the chat with a beauty adviser was, rather than how much the reviewer likes the lipstick for its moisturizing effect, will be considered less helpful.
As before, prior to creating the topic matrix we vectorize the review text into words, using CountVectorizer to transform it into a matrix representation.

3.3.1 Training and Validation

We start with a 15-topic model and compute the average helpfulness for each topic:

Figure 14: Relationship between topic and helpfulness

From the figure we observe that not all topics are equally helpful: if the dominant topic of a given review is topic 12, the review is likely to have a lower helpfulness score than reviews dominated by topic 2 or topic 19. A model that makes the topics differ as much as possible in helpfulness is therefore what we are looking for.

The number of topics in an LDA model is the key to whether the topics make sense. We therefore perform a grid search from 15 to 25 topics with a step of 1, training an LDA model on the transformed matrix for each setting (a sketch of this search is given below). After obtaining the LDA output, we train a linear regression from the LDA output to the helpfulness score. Table 3 shows the parameters of the different LDA models and their MSE on the validation set.

Table 3: MSE of predicting helpfulness using different numbers of topics

Number of Topics   MSE on Validation Set   Log Likelihood        Model Perplexity
15                 0.131865                -5839527.734060916    423.2268901699496
16                 0.129842                -5862523.743103024    433.42771242741054
17                 0.127493                -5871409.85324313     437.43504911060467
18                 0.122465                -5871373.935690109    437.4187771560674
19                 0.113974                -5874825.186300991    438.9850874537416
20                 0.110873                -5891637.942804745    446.6959467805993
21                 0.110842                -5912014.697301019    456.22314533000224
22                 0.110797                -5911136.048998294    457.80816999900003
23                 0.110731                -5936159.312419161    459.43154819404225
24                 0.110727                -59371523.23856239    467.7375391299828
25                 0.110706                -5976159.232312414    477.34532429124632

There are three quantities to consider when choosing an LDA model: we prefer the model with the minimum MSE, but also a high log likelihood and a low model perplexity. Taking all of these into account on our validation set, the model with 20 topics gives the best trade-off, because the MSE does not decrease much beyond 20 topics while the model perplexity keeps going up. We therefore use this number of topics in our final LDA model.
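The grid search of Table 3 can be sketched as follows: transform the reviews with CountVectorizer, fit an LDA model for each topic count from 15 to 25, regress the per-review topic distributions onto helpfulness, and record the validation MSE together with the log likelihood and perplexity. The names train_texts/val_texts and train_scores/val_scores and the 5,000-term vocabulary cap are illustrative assumptions.

    import numpy as np
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_squared_error

    counts = CountVectorizer(stop_words="english", max_features=5000)
    X_train = counts.fit_transform(train_texts)
    X_val = counts.transform(val_texts)

    results = []
    for n_topics in range(15, 26):  # grid search over 15..25 topics
        lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
        T_train = lda.fit_transform(X_train)      # per-review topic distributions
        T_val = lda.transform(X_val)
        reg = LinearRegression().fit(T_train, train_scores)
        pred = np.clip(reg.predict(T_val), 0.0, 1.0)
        results.append((n_topics,
                        mean_squared_error(val_scores, pred),
                        lda.score(X_train),        # log likelihood
                        lda.perplexity(X_train)))  # model perplexity

    for row in results:
        print(row)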
Table 4 lists the top 10 words for each of the 20 topics (tokens such as "wa" and "doe" are artifacts of stemming):

Table 4: Top 10 Words in 20-Topic Model

Topic   Top 10 Words
0       pencil touch lasted ago box month incredibly wore play drugstore
1       love colour lip absolutely power staying color look doe beautiful
2       product price worth size packaging great high totally scrub good
3       lipstick color formula liquid shade dry like drying doe transfer
4       wa sephora try store did color went bought tried looked
5       gloss lip sticky color love like just look shine glossy
6       review brand wish sephora cute kat soon von does better
7       lip product use dry balm day like time using work
8       long lip lasting liner recommend pigmented highly product love creamy
9       color perfect love shade lipstick nude skin tone wear red
10      packaging texture small warm case great smooth purse hand color
11      lip color stain doe apply dry hour like just need
12      natural look tint happy drinking looking lip eating color getting
13      sheer pigment hydrating rich color shimmer expected summer able job
14      lip dry soft year help winter skin literally greasy having
15      brand sephora store went buy kat cute bought von multiple
16      wa color did try looked just like got really thought
17      review high price product 10 did application minute slight opinion
18      lip balm product use ve used day using time tried
19      love color stay lipstick lip doe pencil great day long

In the 20-topic model, the top words within each topic make sense when put together. For example, in topic 5 the word "sticky" is highly weighted, and together with "gloss", "shine" and "glossy" this topic describes the texture of lip glosses; topic 17, in turn, discusses the review and the rating of the product.

3.3.2 Evaluation

The model using LDA and linear regression has a validation MSE of 0.110842, which reduces the baseline MSE by 26.1%.

3.4 Model Optimization

Since we project both the TF-IDF and the LDA transformed matrices to helpfulness using linear regression, we want to regularize the regression output.

Table 5: Influence of Review Length on Helpfulness

Review Length   Score Mean   Score Std
0 to 200        0.238        0.0340
200 to 400      0.207        0.0244
400 to 600      0.189        0.0395
600 to 800      0.183        0.0719
800 to 1000     0.176        0.104
1000 to 1500    0.174        0.119
> 1500          0.186        0.176

Notice that as review length increases, the standard deviation of helpfulness also increases. We therefore design a regularization/penalty term based on length:

Reg = -λ · 10^(-5) · log(Review_Length)    (5)

We then apply this regularization term to both models; the results are shown below (a sketch of the search over λ follows).

Figure 15: Optimization of TF-IDF Model
Figure 16: Optimization of LDA Model
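A sketch of the length-penalty search summarized in Figures 15 and 16. How exactly the penalty of Equation (5) enters the prediction is not spelled out above; here we assume it is subtracted from the clipped regression output, and we sweep λ over 0 to 10 on the validation set (val_pred is the unpenalized prediction from either model, val_texts and val_scores as before).

    import numpy as np
    from sklearn.metrics import mean_squared_error

    def penalized_prediction(raw_pred, lengths, lam):
        """Equation (5): subtract lambda * 1e-5 * log(review length), then clip to [0, 1]."""
        penalty = lam * 1e-5 * np.log(np.maximum(lengths, 1))
        return np.clip(raw_pred - penalty, 0.0, 1.0)

    val_lengths = np.array([len(t) for t in val_texts])
    best_lam = min(
        range(0, 11),
        key=lambda lam: mean_squared_error(
            val_scores, penalized_prediction(val_pred, val_lengths, lam)),
    )
    print(best_lam)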
The optimal λ for the TF-IDF model is 7, and the optimal λ for the LDA model is 4.

3.5 Model Evaluation

The performance (MSE) of all models on the validation set and the test set is listed below:

Table 6: Model Comparison

Model      Validation Set   Test Set
Baseline   0.15002893       0.19213458
TF-IDF     0.11148997       0.13033192
LDA        0.11061232       0.12134558

4 Related Studies

For a similar objective, the author of "Collaborative Embeddings for Lipstick Recommendations" applied the GloVe algorithm to build a recommender system, interpreting the problem as matrix factorisation. Taking users' browsing sessions with product contexts as input data, the author decomposed the log of the product co-occurrence matrix into an embedding matrix E and a bias vector b. Learned by mini-batch stochastic gradient descent, the parameters E and b reveal potential relationships between items/products, supporting a fully collaborative item-based recommender system that predicts users' purchase behavior. In particular, the author provides embedding algebra to avoid possible bias caused by brand attributes and to promote better product discovery.

In the paper "Improved Collaborative Filtering Algorithm Based on Multifactor Fusion and User Features Clustering", the authors put forward an improved algorithm based on multifactor fusion and user-feature clustering, calculating user similarity from user rating similarity and item-category preference. Meanwhile, Marko and Yoav compared pure collaborative systems with pure content-based systems and discussed the cold-start problem in "Content-based, collaborative recommendation", which informed our choice of similarity function.

Also, in Margaret Fu's "Recommendation System for E-commerce Services", the author combines classification and collaborative filtering to predict the category of movie a user will like. The system is based on a community graph of users built from their past watching history: the model determines similarity between users by the edges between them, gives different weights to different neighbours of a user, and counts opinions from highly weighted neighbours more than those from low-weighted ones. With this methodology they build a Naive Bayes learning algorithm to estimate the probability that each user will like a certain movie genre. All of these methods are adaptable, and the framework is easy to extend, facilitating Sephora's user-based recommendation.

5 Summary

In this project we first crawled data from Sephora's website. Since the site uses dynamic loading techniques, it was quite challenging to retrieve all the reviews in json format. After getting all the data, we explored the properties of the overall data set. Next, we built two models: one for recommending products to users based on their past activities, and one for predicting the helpfulness of reviews in order to present more helpful reviews to future users. After refining features and models, we arrive at our final conclusions:

• To recommend lip make-up products, we built a recommendation system that predicts users' purchase behavior efficiently using user-based collaborative filtering. The model analyzes users' past purchase behavior and compares purchases between users to predict future purchases, and on that basis gives recommendations.
• To predict review helpfulness, we extract the topics behind the review text using LDA, then project the output to a helpfulness score with a linear model penalized by review length. Compared to the baseline, the final model successfully decreased the MSE by 36.8%.