# Nonlinear latent factorization by embedding multiple user interests(Recsys 2013)

1. Nonlinear Latent Factorization by Embedding Multiple User Interests. Jason Weston, et al. RecSys 2013 reading group, @y_benjo
2. Agenda • Objective • Method • Experiments • Summary
3. Objective • "Aren't a user's interests more complex than the interests that can be formed from items alone?" • On Amazon, interest in household goods and interest in DVDs may not correspond to a single latent variable. • We want to model complex user interests while keeping the number of latent variables small!
4. Method • Based on matrix factorization [diagram: user matrix U (m × |U|) and item matrix V (m × |D|) combine into the score S] • Proposed method: assume T m-dimensional vectors on the user side [diagram: user tensor U (m × |U| × T) and item matrix V (m × |D|) combine into the score S]; a shape sketch follows below.
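To make the shape change concrete, here is a minimal NumPy sketch (my own illustration; the toy sizes, random initialization, and function names are assumptions, not from the slides). Standard MF scores a user-item pair with one dot product; the proposed model keeps T interest vectors per user and scores with the max over T dot products, as equation (2) on the next slide defines.

```python
import numpy as np

rng = np.random.default_rng(0)

m, n_users, n_items, T = 32, 1000, 5000, 3  # toy sizes, not from the paper

# Standard MF: one m-dimensional vector per user and per item.
U = rng.normal(0, 1 / np.sqrt(m), size=(n_users, m))
V = rng.normal(0, 1 / np.sqrt(m), size=(n_items, m))

def score_mf(u, d):
    """Classical factorization score: a single dot product."""
    return U[u] @ V[d]

# Proposed MaxMF: T interest vectors per user (an m x |U| x T tensor).
U_hat = rng.normal(0, 1 / np.sqrt(m), size=(n_users, T, m))

def score_maxmf(u, d):
    """MaxMF score: max over the T interest/item dot products (Eq. 2)."""
    return np.max(U_hat[u] @ V[d])
```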
5. Details • Making the user side a tensor changes the scoring function slightly. The key idea of the proposed model is to define T interest vectors per user: the user part of the model is written as $\hat{U}$, an $m \times |\mathcal{U}| \times T$ tensor, and $\hat{U}_{iu} \in \mathbb{R}^m$ is the vector representing the $i$-th of the $T$ possible interests of user $u$. The item part of the model is the same as in classical user-item factorization and is still an $m \times |\mathcal{D}|$ matrix $V$. The new scoring model is

   $$f(u, d) = \max_{i=1,\dots,T} \hat{U}_{iu}^\top V_d \qquad (2)$$

   For any given item we now compute $T$ dot products, rather than one, when comparing it to the user, and take the max. • Stochastic Gradient Descent training: rather than a regression (least squares) approach as in SVD, the paper focuses on learning to rank, which has been observed to perform well on several recommendation tasks [6, 10]. The starting point is the objective of the linear factorization model Wsabie [9]; the objective function minimizes

   $$\sum_{u \in \mathcal{U}} \sum_{d \in \mathcal{D}_u} \sum_{\bar{d} \notin \mathcal{D}_u} L(\mathrm{rank}_d(u)) \, \max\big(0,\ 1 + f(u, \bar{d}) - f(u, d)\big)$$

   i.e. whether each rated item comes above the unrated items. Here $\mathcal{D}_u$ denotes the set of items the user has purchased / watched / listened to (depending on the context), referred to as positive items; this is a binary rating setting rather than a real-valued rating setting. $\mathrm{rank}_d(u)$ denotes the rank of the positive item $d$ relative to all the negative items,

   $$\mathrm{rank}_d(u) = \sum_{\bar{d} \notin \mathcal{D}_u} I\big(1 + f(u, \bar{d}) \ge f(u, d)\big),$$

   and $L(\eta)$ converts the rank to a weight; choosing $L(\eta) = C\eta$ for any positive constant $C$ optimizes the mean rank. A training-step sketch follows this slide.
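The slide quotes the objective but not the paper's Algorithm 1, so the following is only a hedged sketch of one WARP/Wsabie-style SGD step consistent with that objective: sample negatives until the margin is violated, estimate $\mathrm{rank}_d(u)$ from the number of draws, and update only the "winning" interest vectors selected by the max in equation (2). The rank-weight choice, learning rate, and function name are my assumptions.

```python
import numpy as np

def warp_sgd_step(U_hat, V, u, d_pos, pos_items, lr=0.05, max_trials=50):
    """One SGD step on the ranking objective above (WARP-style sketch).

    Draws negatives until 1 + f(u, d_neg) > f(u, d_pos); the number of
    draws gives the usual sampled estimate of rank_d(u).  Only the
    interest vectors selected by the max in Eq. (2) receive gradients.
    """
    n_items = V.shape[0]
    i_pos = int(np.argmax(U_hat[u] @ V[d_pos]))   # active interest for d_pos
    f_pos = U_hat[u, i_pos] @ V[d_pos]

    for trials in range(1, max_trials + 1):
        d_neg = np.random.randint(n_items)
        if d_neg in pos_items:                    # skip positive items
            continue
        i_neg = int(np.argmax(U_hat[u] @ V[d_neg]))
        f_neg = U_hat[u, i_neg] @ V[d_neg]
        if 1 + f_neg > f_pos:                     # hinge margin violated
            rank_est = (n_items - 1) // trials    # sampled rank_d(u)
            # L(rank) weight: log1p approximates the harmonic sum often
            # used with Wsabie; the slide itself only states that
            # L(eta) = C * eta optimizes the mean rank.
            w = lr * np.log1p(rank_est)
            u_pos = U_hat[u, i_pos].copy()        # pre-update copies so the
            u_neg = U_hat[u, i_neg].copy()        # item updates use old values
            U_hat[u, i_pos] += w * V[d_pos]       # pull positive item closer
            U_hat[u, i_neg] -= w * V[d_neg]       # push negative item away
            V[d_pos] += w * u_pos
            V[d_neg] -= w * u_neg
            return True                           # an update was made
    return False                                  # no violator found
```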
6. MapReduce • Put differently, for each user the set of items is partitioned into T subsets, where the partitioning can vary across users; for each partition a different scoring function is applied. • Training does not finish on large-scale data, so the paper uses MapReduce: first compute $V$, the map that embeds items into $m$ dimensions; then fix $V$ and run the per-user tensor training in parallel (see the sketch after this slide).

   Algorithm 2: MaxMF MapReduce algorithm
   Initialize $V$ randomly (mean 0, standard deviation $\frac{1}{\sqrt{m}}$).
   Define model $f_1(u, d) = \frac{1}{|\mathcal{D}_u|} \sum_{i \in \mathcal{D}_u} V_i^\top V_d$.
   Train $f_1(u, d)$ using Algorithm 1.
   Define $f_2(u, d) = \max_i \hat{U}_{iu}^\top V_d^*$, where $V^* = V$ from $f_1$.
   for each user $u$ (in parallel) do
     Train $\hat{U}_u$, but keep $V^*$ fixed, i.e. run Algorithm 1 but only invoke the gradient updates (3)-(4) and not (5)-(6).
   end for
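A compact end-to-end sketch of the two-stage scheme in Algorithm 2, with plain Python loops standing in for MapReduce (the function name, step counts, and the simplified stage-1 gradient are my assumptions). Stage 1 learns $V$ with the user-parameter-free model $f_1$; stage 2 freezes $V^* = V$ and fits each user's T interest vectors independently, which is why the user loop is embarrassingly parallel.

```python
import numpy as np

def train_two_stage(pos, n_users, n_items, m=16, T=3, steps=2000, lr=0.05, seed=0):
    """Two-stage MaxMF training in the spirit of Algorithm 2.

    `pos` is a list of sets: pos[u] holds user u's positive item ids.
    Stage 1 trains V with f1(u, d) = mean_{i in D_u} V_i . V_d, so the
    item embedding is learned once.  Stage 2 fits each user's interest
    vectors with V frozen; each iteration of the user loop is
    independent and could be a MapReduce task as in the paper.
    """
    rng = np.random.default_rng(seed)
    V = rng.normal(0, 1 / np.sqrt(m), (n_items, m))

    # --- Stage 1: pairwise ranking SGD on f1 (simplified: no rank weight).
    for _ in range(steps):
        u = rng.integers(n_users)
        if not pos[u]:
            continue
        d_pos = rng.choice(list(pos[u]))
        d_neg = rng.integers(n_items)
        if d_neg in pos[u]:
            continue
        p = V[list(pos[u])].mean(axis=0)          # user profile under f1
        if 1 + p @ V[d_neg] > p @ V[d_pos]:       # hinge violated
            V[d_pos] += lr * p
            V[d_neg] -= lr * p

    # --- Stage 2: per-user fits with V fixed ("for each user, in parallel").
    U_hat = rng.normal(0, 1 / np.sqrt(m), (n_users, T, m))
    for u in range(n_users):
        for _ in range(50):
            if not pos[u]:
                break
            d_pos = rng.choice(list(pos[u]))
            d_neg = rng.integers(n_items)
            if d_neg in pos[u]:
                continue
            i_p = int(np.argmax(U_hat[u] @ V[d_pos]))
            i_n = int(np.argmax(U_hat[u] @ V[d_neg]))
            if 1 + U_hat[u, i_n] @ V[d_neg] > U_hat[u, i_p] @ V[d_pos]:
                U_hat[u, i_p] += lr * V[d_pos]    # user-side updates only;
                U_hat[u, i_n] -= lr * V[d_neg]    # V stays frozen
    return U_hat, V
```

With $V$ trained once, adding or refitting a user only requires the cheap stage-2 fit, which is the point of freezing $V^*$ before the parallel per-user step.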