Search is central to e-commerce platforms. Diversification of search
results is essential to cater to the diverse preferences of the customers. One of the primary metrics of e-commerce businesses is
revenue. On the other hand, the prices of the products shown
influence customer preferences. Hence, diversifying e-commerce
search results requires learning the diverse price preferences of the
customers and simultaneously maximizing the revenue without
hurting the relevance of the results. In this paper, we introduce
the learning to diversify problem for e-commerce search. We also
show that diversification improves the median customer lifetime
value (CLV), which is a critical long-term business metric for an
e-commerce business. We design three algorithms for the task. The
first two are modifications of algorithms previously developed for
the diversification problem in web search. The third is a novel
approximate-knapsack-based semi-bandit algorithm. We derive the
regret and pay-off bounds of all three algorithms and conduct
experiments with synthetic data and simulation to validate and
compare them. In our simulation, we compute revenue, median CLV, and
purchase-based mean reciprocal rank (PMRR) under various scenarios,
such as user preferences that change over time, to compare the
performance of the algorithms. We show that our proposed third
algorithm is more practical and efficient than the first two and can
produce higher revenue while maintaining a better median CLV and
PMRR.
1. SIGIR ECOM 2019
Learning to Diversify for E-commerce Search with
Multi-Armed Bandit
Anjan Goswami (UC Davis), Chengxiang Zhai (UIUC), Prasant
Mohapatra (UC Davis)
July 24, 2019
Agenda of this Presentation
The Problem
Contribution
Algorithms
Evaluation and Results
Future Work
Diversity problem
Figure: Query: “Sunglasses for Men”, Site: Amazon, Evaluation: Only
the cheaper sunglasses are shown at the top, but a user may be
interested in an expensive one that Amazon carries.
Diversity problem
Figure: Query: “Sunglasses for Men”, Site: Walmart, Evaluation: It
even shows two sunglasses from the same brand, but a user may want to
explore samples from multiple brands to understand the diversity of
the selection available at Walmart.
Why yet another learning to diversify problem for
e-commerce?
Learn the diverse (price) preferences of the customers from
the data.
Aim to maximize the revenue.
Not hurt the relevance of the search results.
Our contribution
Defining the learning to diversify problem for e-commerce.
A novel semi-bandit optimization algorithm for learning to
diversify (KPBA).
A simulation-based evaluation methodology (similar to
counterfactual learning [3]).
Learning to Diversify Algorithms
Revenue Ranked Explore and Commit (RREC) [2]
Revenue Ranked Bandits Algorithm (RRBA) [2]
Knapsack based bandit algorithm (KPBA)
Revenue Ranked Explore and Commit (RREC)
Baseline greedy algorithm.
It shows all the products iteratively to estimate the demand.
Eventually maximizes the revenue.
Can have arbitrarily poor performance.
Cannot learn further once it has committed to a ranking.
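The explore-then-commit pattern behind RREC can be sketched as follows. This is an illustrative Python sketch under assumed interfaces (a `prices` map and a simulated `user_buys` feedback function), not the paper's exact routine:

```python
import random

def rrec(prices, exploration_rounds, user_buys, k, seed=0):
    """Revenue Ranked Explore and Commit (sketch).

    prices: dict product -> price
    user_buys: callable(product, rng) -> bool, simulated purchase feedback
    Returns the committed top-k ranking by estimated revenue.
    """
    rng = random.Random(seed)
    purchases = {p: 0 for p in prices}
    impressions = {p: 0 for p in prices}
    # Exploration: show every product iteratively to estimate demand.
    for _ in range(exploration_rounds):
        for p in prices:
            impressions[p] += 1
            purchases[p] += user_buys(p, rng)
    # Commit: rank by estimated revenue = purchase rate x price,
    # and never update the estimates again.
    est_rev = {p: purchases[p] / impressions[p] * prices[p] for p in prices}
    return sorted(prices, key=est_rev.get, reverse=True)[:k]
```

The commit step is where the weakness noted above lives: after committing, the estimates are frozen, so the algorithm cannot adapt if demand shifts.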
Revenue Ranked Bandits Algorithm (RRBA)
A straightforward modification of the algorithm proposed in [2].
Uses k bandits for k positions.
Each product can be an arm.
A product can be part of several MABs.
Does not optimize all the MABs simultaneously.
Complex to realize in practice.
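One round of the position-wise selection can be sketched as below: a hedged illustration of the ranked-bandits idea from [2] with UCB1 per position, products as arms, and duplicates skipped top-down (the `stats` layout is an assumption, not the paper's data structure):

```python
import math

def rrba_round(stats, n_products, k, t):
    """One round of RRBA (sketch): one UCB1 bandit per position, each
    product is an arm, and a product already placed at a higher
    position is skipped so the list has no duplicates.

    stats[pos][arm] = (pulls, cumulative_reward)
    """
    ranking = []
    for pos in range(k):
        best, best_ucb = None, float("-inf")
        for arm in range(n_products):
            if arm in ranking:
                continue  # each product appears at most once in the list
            pulls, reward = stats[pos][arm]
            if pulls == 0:
                best = arm
                break  # unexplored arms are pulled first
            ucb = reward / pulls + math.sqrt(2 * math.log(t) / pulls)
            if ucb > best_ucb:
                best, best_ucb = arm, ucb
        ranking.append(best)
    return ranking
```

The k separate bandit states, plus the duplicate-skipping coupling between them, is what makes the realization complex: the positions share products but do not share statistics.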
Knapsack based bandit algorithm (KPBA)
The main algorithm proposed in this paper.
Semi-bandit optimization.
Each product can be an arm.
Selects k out of n arms in every iteration.
Simple to realize in practice.
Algorithms for Diversity: KPBA
At each round t = 1, . . . , T, choose the top-k ranking solving
\[
\max \sum_{j=1}^{k} v^{\mathrm{UCB}}_{jt}
\quad \text{subject to} \quad
\sum_{j=1}^{k} s_j \ge B
\tag{1}
\]
\[
v_{r_j} = \frac{p_{r_j}}{i_{r_j}} \times \rho_j \times Z
  + \alpha \sqrt{\frac{2 \ln t}{i_{r_j}}}
\]
p: purchases, i: impressions, ρ: price, s: relevance score, B: relevance
threshold, Z: normalization.
Algorithms for Diversity: KPBA
\[
\max_{x_{1t}, \dots, x_{nt}} \sum_{i=1}^{n} x_{it} \times v^{\mathrm{UCB}}_{it}
\quad \text{subject to} \quad
\sum_{i=1}^{n} x_{it}\,\hat{s}_i \le \hat{B},
\qquad
\sum_{i=1}^{n} x_{it} = k
\tag{2}
\]
Algorithms for Diversity: KPBA properties
KPBA computes a 1/2-approximate solution for E-kKP in O(n)
time.
No need for k MABs for k positions.
A semi-bandit algorithm that is optimal for ranking.
Regret matches RRBA (MAB-based): O(√(nT lg T))
(proven).
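A greedy sketch of the knapsack selection, for intuition only: this is the classic density-greedy 1/2-approximation for a budgeted variant with at most k items, not the paper's exact E-kKP routine (which selects exactly k arms), and positive weights are assumed:

```python
def kp_greedy(values, weights, budget, k):
    """Greedy knapsack sketch: pick up to k arms maximizing total UCB
    value subject to the total weight staying within budget.
    Assumes all weights are positive."""
    n = len(values)
    # Greedily take items in decreasing value-density order.
    order = sorted(range(n), key=lambda i: values[i] / weights[i], reverse=True)
    chosen, total_w, total_v = [], 0.0, 0.0
    for i in order:
        if len(chosen) < k and total_w + weights[i] <= budget:
            chosen.append(i)
            total_w += weights[i]
            total_v += values[i]
    # Classic safeguard for the 1/2 guarantee: compare against the
    # single most valuable feasible item.
    singles = [i for i in range(n) if weights[i] <= budget]
    if singles:
        best = max(singles, key=lambda i: values[i])
        if values[best] > total_v:
            chosen = [best]
    return chosen
```

The single linear scan after sorting is what keeps the selection cheap; the paper's O(n) claim refers to its own E-kKP routine, which avoids even the sort.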
Algorithms for Diversity: Evaluation Metrics
Average Revenue per Query: ARQ
Median Customer Lifetime Value: MCV
Purchase-based Mean Reciprocal Rank: PMRR
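PMRR can be computed as in this small sketch (assuming each session records the 1-indexed positions at which purchases happened; sessions with no purchase contribute zero):

```python
def pmrr(sessions):
    """Purchase-based mean reciprocal rank: the reciprocal of the
    best (highest) purchased position per session, averaged over all
    sessions. Sessions without a purchase contribute 0."""
    total = 0.0
    for purchased_positions in sessions:
        if purchased_positions:
            total += 1.0 / min(purchased_positions)
    return total / len(sessions)
```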
Algorithms for Diversity: Evaluation based on Simulation
Synthetically generate a product data set.
Assign a demand to each product based on a realistic
distribution for each query.
Assign a utility (relevance) score to each product for each
query.
Build a user model.
Simulate user search sessions with a specific ranking function.
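A minimal user-model sketch for one simulated session; the functional form here (geometric position bias, purchase probability equal to relevance) is an assumption for illustration, and the paper's simulator is richer:

```python
import random

def simulate_session(ranking, relevance, prices, rng, pos_bias=0.85):
    """One simulated search session: the user scans top-down, examines
    each position with geometrically decaying probability, and buys an
    examined product with probability equal to its relevance score.
    Returns the revenue earned in this session (0.0 if no purchase)."""
    examine = 1.0
    for product in ranking:
        if rng.random() < examine and rng.random() < relevance[product]:
            return prices[product]  # a purchase ends the session
        examine *= pos_bias  # position bias: lower slots seen less often
    return 0.0
```

Running many such sessions against each ranking function yields the ARQ, MCV, and PMRR numbers compared in the next slides.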
Algorithms for Diversity: Simulation
Figure: Price histograms with their corresponding relevance scores and
purchase rates. Note that the plot shows the correlation between
relevance score and purchase rate with a fitted line.
Algorithms for Diversity: Results
Figure: Note that the red curves represent RREC metrics, the blue
curves RRBA, and the green curves KPBA. The revenue metric uses a log
scale.
Algorithms for Diversity: Results with position bias
Figure: Note that the red curves represent RREC metrics, the blue
curves RRBA, and the green curves KPBA. The revenue metric uses a log
scale.
Algorithms for Diversity: with changing customer
preference
Figure: Note that the red curves represent RREC metrics, the blue
curves RRBA, and the green curves KPBA. The revenue metric uses a log
scale.
Possible extensions
Learn more complex functions of customer preferences by
incorporating multiple product attributes such as brand.
Combine the online learning framework with traditional
learning-to-rank functions [1].
References I
[1] Tie-Yan Liu. Learning to rank for information retrieval.
Foundations and Trends in Information Retrieval, 3(3):225–331, 2009.
[2] Filip Radlinski, Robert Kleinberg, and Thorsten Joachims.
Learning diverse rankings with multi-armed bandits. In Proceedings of
the 25th International Conference on Machine Learning, ICML ’08, 2008.
[3] Adith Swaminathan and Thorsten Joachims. Counterfactual risk
minimization: Learning from logged bandit feedback. In ICML, pages
814–823, 2015.