This document summarizes a reading session of a paper on markdown price prediction and optimization in e-commerce.
The paper is from Alibaba and proposes a two-stage approach: 1) Using machine learning to predict demand based on product and category features, and 2) Formulating the pricing problem as a Markov decision process to find the optimal multi-period pricing policy via the Bellman equation.
The approach jointly optimizes prices across stores; according to the authors, the goal is to increase gross merchandise volume by over 20%.
22. Two-stage structure (x→d)
1. 4.4 Counterfactual Demand Prediction
a. 4.2 Basic Sales = Intercept Prediction (β for Items)
b. 4.3 Slope Prediction (α for Categories)
2. 5.2 Two-stage Algorithm (Dynamic Programming)
a. 5.2.1 Update by Greedy Policy by Bellman Equation
b. 5.2.2 Joint Optimization of Q function
[Figure: pipeline diagram. Inputs x and L → β, α → Demand Y → MDP / Bellman equation (Beq) → Discount d]
23. Demand(Y|α, β) Prediction
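Predict Y/Y_normal from the features x and L.
Target variable Y
● Y/Y_normal = ratio of sales at the marked-down price to sales at the regular price (>1)
● Y_i: Y of product i
Explanatory variables
● x: feature vector ∈ R^n (historical sales of products, shops, holidays, ...)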
● L_i: 3-hot product category vector
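The variables above can be made concrete with a small sketch. The category names and sales figures below are invented for illustration, and reading "3-hot" as "one one-hot block per level of a three-level category hierarchy" is an assumption, not something the slides state.

```python
# Hypothetical three-level category vocabulary (assumption for illustration only).
LEVELS = [
    ["fresh", "grocery"],                     # level 1
    ["fruit", "vegetable", "snack"],          # level 2
    ["apple", "banana", "spinach", "chips"],  # level 3
]

def three_hot(path):
    """Concatenate one one-hot block per category level, giving three 1s in total."""
    vec = []
    for level, name in zip(LEVELS, path):
        vec.extend(1 if name == candidate else 0 for candidate in level)
    return vec

def sales_ratio(y_markdown, y_normal):
    """Target Y / Y_normal: sales under markdown relative to regular-price sales."""
    if y_normal <= 0:
        raise ValueError("y_normal must be positive")
    return y_markdown / y_normal

# Toy values, not taken from the paper.
L_i = three_hot(["fresh", "fruit", "banana"])
ratio = sales_ratio(y_markdown=180.0, y_normal=120.0)
```

Here `L_i` has exactly three non-zero entries, and `ratio` exceeds 1 whenever the markdown lifts sales above the regular-price level.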
24. Demand(Y): Base Sales Prediction (β:Boosting Tree)
d_0: note that this is d_0, not d; it is the average of historical discounts.
x_i: feature vector ∈ R^n (historical sales of products, shops, holidays, ...)
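As a rough illustration of what "β: Boosting Tree" could look like, here is a minimal gradient-boosting sketch with decision stumps in plain Python. It is not the authors' model: the feature layout `[d_0, historical sales feature, holiday flag]`, the toy data, and the choice of squared loss are all assumptions made for the example.

```python
def fit_stump(X, residuals):
    """Find the single-feature threshold split that minimizes squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue  # skip splits that leave one side empty
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:] if best is not None else None

def fit_boosting(X, y, n_rounds=50, lr=0.3):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        j, t, lm, rm = stump
        stumps.append(stump)
        pred = [p + lr * (lm if row[j] <= t else rm) for row, p in zip(X, pred)]
    return base, lr, stumps

def predict(model, row):
    """Evaluate h(d_0, x_i): the base value plus the shrunken stump outputs."""
    base, lr, stumps = model
    return base + sum(lr * (lm if row[j] <= t else rm) for j, t, lm, rm in stumps)

# Toy rows: [d_0, historical sales feature, holiday flag]; toy targets (assumed).
X = [[0.1, 1.0, 0], [0.2, 1.5, 0], [0.3, 0.5, 1],
     [0.4, 2.0, 1], [0.2, 1.0, 1], [0.3, 1.8, 0]]
y = [0.10, 0.25, 0.50, 0.70, 0.35, 0.40]

model = fit_boosting(X, y)
beta_hat = predict(model, [0.25, 1.2, 0])  # intercept estimate for a new item
```

In practice a library implementation of gradient-boosted trees would replace the hand-written stumps; the point here is only that this stage takes d_0 and x_i as inputs and returns a per-item base level.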
25. Demand(Y): Base Sales Prediction (β:Boosting Tree)
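Since this stage simply predicts each item's average sales (β), the historical sales (Y_i_normal_t) inside x_i seem to be required.
I asked about this →
“h doesn’t learn the relationship between price and sales.”
In other words, the slope (α) is not learned from x.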
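The split between intercept and slope can be sketched as follows. The functional form is an assumed simplification, log(Y/Y_normal) = β_i + α_i · (d − d_0), where β_i comes from h(d_0, x_i) and α_i is a linear function of the category vector L_i. The parameter vector `theta`, the numbers, and the convention that a larger d means a deeper markdown are all hypothetical.

```python
import math

def slope_from_category(theta, L_i):
    """Slope alpha_i depends only on the category vector L_i, not on x_i."""
    return sum(t * l for t, l in zip(theta, L_i))

def predict_ratio(beta_i, alpha_i, d, d_0):
    """Counterfactual Y/Y_normal at discount d under the assumed log-linear form."""
    return math.exp(beta_i + alpha_i * (d - d_0))

# Hypothetical per-category parameters, one weight per entry of the 3-hot vector.
theta = [0.5, 1.0, 0.8, 1.2, 0.3, 0.6, 0.9, 0.2, 0.4]
L_i = [1, 0, 1, 0, 0, 0, 1, 0, 0]  # example 3-hot category vector
alpha_i = slope_from_category(theta, L_i)

beta_i = 0.2   # stand-in for the boosting-tree output h(d_0, x_i)
d_0 = 0.2      # average historical discount for this item (toy value)
at_history = predict_ratio(beta_i, alpha_i, d=d_0, d_0=d_0)
deeper_cut = predict_ratio(beta_i, alpha_i, d=0.4, d_0=d_0)
```

At d = d_0 the slope term vanishes and the prediction reduces to exp(β_i), which matches the note that h alone carries no information about how sales respond to price.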
29. Two-stage structure (x→d)
1. 4.4 Counterfactual Demand Prediction
a. 4.2 Basic Sales = Intercept Prediction (β for Items)
b. 4.3 Slope Prediction (α for Categories)
2. 5.2 Two-stage Algorithm (Dynamic Programming)
a. 5.2.1 Update by Greedy Policy by Bellman Equation
b. 5.2.2 Joint Optimization of Q function