4. Method
• Based on matrix factorization
[Figure: the score matrix S (users u × items d) factorized into U (m × u) and V (m × d)]
• Proposed method: assume T m-dimensional vectors on the user side
[Figure: the same factorization of S, with the user side U extended to T m-dimensional vectors per user]
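The difference between the two layouts can be sketched in a few lines of plain Python. This is only an illustration with toy sizes and random values; the names `n_users`, `n_items`, `rand_vec`, and `dot` are assumptions made for the example and do not come from the slides.

```python
import random

random.seed(0)
m, n_users, n_items, T = 3, 2, 4, 2   # toy sizes, chosen only for illustration

def rand_vec(dim):
    """Random dim-dimensional vector (illustrative initialization)."""
    return [random.gauss(0.0, 1.0) for _ in range(dim)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Baseline: one m-dimensional vector per user (U) and per item (V).
U = [rand_vec(m) for _ in range(n_users)]
V = [rand_vec(m) for _ in range(n_items)]

# Score matrix S (users x items): S[u][d] = U_u . V_d
S = [[dot(U[u], V[d]) for d in range(n_items)] for u in range(n_users)]

# Proposed: T m-dimensional vectors per user, stored as U_hat[u][i]
# for interest i of user u (an m x |U| x T tensor overall).
U_hat = [[rand_vec(m) for _ in range(T)] for _ in range(n_users)]

print(len(S), len(S[0]))                              # 2 4
print(len(U_hat), len(U_hat[0]), len(U_hat[0][0]))    # 2 2 3
```

The item side is untouched; only the user side grows by a factor of T.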
5. Details
The key idea of the proposed model is to define T interest vectors per user, where the user part of the model is written as \hat{U}, which is an m × |U| × T tensor. Hence, we also write \hat{U}_{iu} ∈ R^m as the m-dimensional vector that represents the i-th of T possible interests for user u. The item part of the model is the same as in the classical user-item factorization models, and is still denoted as an m × |D| matrix V. The new scoring model is defined as:
f(u, d) = \max_{i=1,\dots,T} \hat{U}_{iu}^\top V_d    (2)
For any given item, we are now computing T dot products, rather than one, when comparing it to the user, and taking the maximum.
• Because the user side becomes a tensor, the scoring function changes slightly
Stochastic Gradient Descent Training. Now that we have described the model, the next step is to describe how to train it. We could learn such a model using a regression (least squares) approach such as in SVD, but in this work we focus on learning to rank as it has been observed to perform well on several recommendation tasks previously [6, 10]. Our starting point is the objective of the linear factorization model, Wsabie [9], which learns the model parameters by minimizing:
\sum_{u \in U} \sum_{d \in D_u} \sum_{\bar{d} \notin D_u} L(rank_d(u)) \max(0, 1 + f(u, \bar{d}) - f(u, d)).
• The objective function minimizes the expression above
Here D_u denotes the set of items that the user has purchased / watched / listened to (depending on the context), which we refer to as positive items, i.e. we are in a binary rating setting rather than the real-valued rating setting. rank_d(u) is the rank of the positive item d relative to all the negative items:
rank_d(u) = \sum_{\bar{d} \notin D_u} I(f(u, d) < 1 + f(u, \bar{d})),
and L(η) converts the rank to a weight. Choosing L(η) = Cη for any positive constant C optimizes the mean rank.
• The rank checks whether the rated items were placed above the unrated items
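The scoring function (2) and the rank-weighted objective can be sketched for a single user as follows. This is a minimal illustration, not the authors' code: the helper names (`score`, `margin_rank`, `user_loss`) and the toy vectors are assumptions, and the rank is read here as the number of negative items that come within a margin of 1 of the positive item, so that it agrees with the hinge term.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def score(U_hat_u, V_d):
    """Equation (2): best match between item d and any of the user's T interests."""
    return max(dot(U_i, V_d) for U_i in U_hat_u)

def margin_rank(U_hat_u, V, d, negatives):
    """Number of negative items within margin 1 of positive item d (assumed reading)."""
    f_pos = score(U_hat_u, V[d])
    return sum(1 for n in negatives if f_pos < 1.0 + score(U_hat_u, V[n]))

def user_loss(U_hat_u, V, positives, C=1.0):
    """Rank-weighted hinge loss for one user, with L(eta) = C * eta."""
    negatives = [d for d in range(len(V)) if d not in positives]
    total = 0.0
    for d in positives:
        weight = C * margin_rank(U_hat_u, V, d, negatives)
        f_pos = score(U_hat_u, V[d])
        for n in negatives:
            total += weight * max(0.0, 1.0 + score(U_hat_u, V[n]) - f_pos)
    return total

# Toy user with T = 2 interests in m = 2 dimensions, and four items.
U_hat_u = [[1.0, 0.0], [0.0, 1.0]]
V = [[2.0, 0.0], [0.0, 3.0], [0.5, 0.5], [-1.0, -1.0]]

print(score(U_hat_u, V[1]))            # 3.0
print(user_loss(U_hat_u, V, {0, 2}))   # 5.5
```

The full objective would sum `user_loss` over all users; that outer loop is omitted here to keep the sketch short.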
6. MapReduce
For each user, the set of items is partitioned into T subsets, where the partitioning can vary across users. For each partition a different scoring function is applied.
Algorithm 2 MaxMF MapReduce algorithm.
Initialize V randomly (mean 0, standard deviation 1/\sqrt{m}).
Define model f_1(u, d) = \frac{1}{|D_u|} \sum_{i \in D_u} V_i^\top V_d.
Train f_1(u, d) using Algorithm 1.
Define f_2(u, d) = \max_i \hat{U}_{iu}^\top V^*_d, where V^* = V from f_1.
for each user u (in parallel) do
  Train \hat{U}_u, but keep V^* fixed, i.e. run Algorithm 1 but only invoke the gradient updates (3)-(4) and not (5)-(6).
end for
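The two models named in Algorithm 2 can be written down directly. The sketch below covers only the initialization and the definitions of f_1 and f_2; training is left out because Algorithm 1 is not shown on these slides. The function names and the toy item vectors are assumptions for illustration.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def init_items(n_items, m, seed=0):
    """Initialize V with mean 0 and standard deviation 1/sqrt(m)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0 / math.sqrt(m)) for _ in range(m)]
            for _ in range(n_items)]

def f1(V, D_u, d):
    """Stage-1 model: average dot product between item d and the user's items D_u."""
    return sum(dot(V[i], V[d]) for i in D_u) / len(D_u)

def f2(U_hat_u, V_star, d):
    """Stage-2 model: max over the user's interest vectors, with V_star held fixed."""
    return max(dot(U_i, V_star[d]) for U_i in U_hat_u)

# Hand-picked toy item vectors so the results are easy to check.
V_toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(f1(V_toy, [0, 1], 2))                        # 1.0
print(f2([[1.0, 0.0], [0.0, 2.0]], V_toy, 2))      # 2.0
```

Note that f_1 needs no user parameters at all: a user is represented by the items in D_u, which is what lets V be trained before any per-user work starts.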
• This does not run on large-scale data, so MapReduce is used
• First compute V, the operation that maps the item side down to m dimensions
• Then, with V fixed, run the per-user tensor computation in parallel
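Because V is fixed in the second stage, each user's interest vectors can be fitted independently. The sketch below illustrates that structure with a thread pool standing in for MapReduce workers. The per-user trainer `train_user` is hypothetical: it takes simple subgradient steps on a hinge loss and is not the update rule (3)-(4) referred to in Algorithm 2, which is not shown on these slides. The step count, learning rate, and toy data are likewise assumptions.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def train_user(job):
    """Hypothetical per-user trainer: updates only this user's interest
    vectors; the shared item matrix V_star is read but never modified."""
    u, positives, V_star, T, steps, lr = job
    rng = random.Random(u)            # per-user seed keeps runs reproducible
    m = len(V_star[0])
    U_hat_u = [[rng.gauss(0.0, 0.1) for _ in range(m)] for _ in range(T)]
    pos_list = sorted(positives)
    negatives = [d for d in range(len(V_star)) if d not in positives]
    for _ in range(steps):
        d, n = rng.choice(pos_list), rng.choice(negatives)
        # Interest vectors that attain the max for the positive and the negative item.
        i_pos = max(range(T), key=lambda i: dot(U_hat_u[i], V_star[d]))
        i_neg = max(range(T), key=lambda i: dot(U_hat_u[i], V_star[n]))
        f_pos = dot(U_hat_u[i_pos], V_star[d])
        f_neg = dot(U_hat_u[i_neg], V_star[n])
        if 1.0 + f_neg - f_pos > 0.0:  # margin violated: take a subgradient step
            for k in range(m):
                U_hat_u[i_pos][k] += lr * V_star[d][k]
                U_hat_u[i_neg][k] -= lr * V_star[n][k]
    return u, U_hat_u

# Toy fixed item vectors and per-user positive item sets (illustrative only).
V_star = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
user_items = {0: {0}, 1: {1}, 2: {0, 1}}
jobs = [(u, items, V_star, 2, 200, 0.1) for u, items in user_items.items()]

with ThreadPoolExecutor(max_workers=3) as pool:
    U_hat = dict(pool.map(train_user, jobs))

print(sorted(U_hat))   # [0, 1, 2]
```

Each job touches only its own `U_hat_u`, so no locking is needed; in a real MapReduce setting each job would be a map task keyed by user.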