
Recent advances in deep recommender systems

In this lecture, I will first cover recent advances in neural recommender systems, such as autoencoder-based and MLP-based models. Then, I will introduce recent achievements in automatic playlist continuation for music recommendation.


  1. 1. Recent Advances in Recommender Systems using Deep Learning August 20, 2019 Sungkyunkwan University (SKKU) Prof. Jongwuk Lee
  2. 2. 2 Recommender Systems Basics
  3. 3. What are Recommender Systems? 3 Items → Recommendation: products, movies, news, …
  4. 4. Search vs. Recommendation  How can we help users get access to relevant data?  Pull mode (search engines)  Users take the initiative.  Ad-hoc information need  Push mode (recommender systems)  Systems take the initiative.  Stable information need, or the system already holds the user’s information need. 4
  5. 5. Product Recommendation 5 Amazon product recommendation
  6. 6. Movie & Music Recommendation 6 Netflix movie recommendation Spotify music recommendation
  7. 7. News Recommendation 7 Naver AiRs Kakao Rubics
  8. 8. Various Recommendations 8 Social/Friend recommendation Restaurant recommendation Tag recommendation App recommendation
  9. 9. What is Collaborative Filtering?  Given a target user, Alice, find a set of users whose preference patterns are similar to hers.  Predict a list of items that Alice is likely to prefer. 9 Target user: Alice ① Inferring Alice’s preference ② Finding a set of users with preferences similar to Alice’s ③ Recommending a list of items that the user group prefers Top-N Recommendation
  10. 10. User-Item Rating Matrix  A user-item rating matrix R of the target user, Alice, and other users is given:  R: a user-item rating matrix (m × n matrix)  Determine whether she would like or dislike the movies she has not seen yet. 10 3 3 ? 2 ? ? 4 1 5 4 ? ? 3 ? ? 3 … … … … … … Alice
  11. 11. Latent Factor Models  How to model user-item interactions?  U: latent user matrix (m × k matrix)  Each user is represented by a latent vector (1 × k vector).  V: latent item matrix (n × k matrix)  Each item is represented by a latent vector (1 × k vector). 11 User-item interaction: f(U, V) = ?
  12. 12. Factorizing Two Latent Matrices  The user-item rating matrix R can be approximated by the product of two latent matrices U and Vᵀ.  R: user-item rating matrix (m × n matrix)  U: latent user matrix (m × k matrix)  V: latent item matrix (n × k matrix)  k: # of latent features 12 [figure: R ≈ U Vᵀ with k latent features] Yehuda Koren, “Factorization meets the neighborhood: a multifaceted collaborative filtering model,” KDD 2008
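To make the factorization concrete, here is a minimal NumPy sketch (not from the slides; shapes and values are illustrative) showing that a predicted rating is just the dot product of a user's and an item's latent vectors:

```python
import numpy as np

m, n, k = 6, 6, 3                      # users, items, latent factors
rng = np.random.default_rng(0)
U = rng.normal(size=(m, k))            # latent user matrix (m x k)
V = rng.normal(size=(n, k))            # latent item matrix (n x k)

# The full rating matrix is approximated by the product of the two factors.
R_hat = U @ V.T                        # (m x n) predicted ratings

# A single prediction is the dot product of two latent vectors.
alice, movie = 0, 4
print(np.isclose(R_hat[alice, movie], U[alice] @ V[movie]))  # True
```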
  13. 13. Limitation of Existing Models  Existing models are mainly based on a linear user-item interaction.  However, the user-item interaction may be non-linear and non-trivial. 13 [figure: R ≈ U Vᵀ with k latent features]
  14. 14. 14 Deep Recommender Systems
  15. 15. Statistics of RecSys Models using DNNs  The number of papers has grown rapidly over the last five years.  SIGIR, WWW, RecSys, KDD, AAAI, WSDM, NIPS, … 15 Shuai Zhang et al., “Deep Learning based Recommender System: A Survey and New Perspectives,” 2017
  16. 16. Categories of RecSys Models using DNNs 16 [figure: taxonomy. Deep Learning based Recommender System → Neural Network Model (recommendation relying solely on DL): models using a single DL technique (MLP, AE, CNN, RNN, DSSM, RBM, NADE, GAN) and deep composite models; Integration Model (integrating DL with traditional RS): loosely coupled and tightly coupled models]
  17. 17. Categories of RecSys Models using DNNs 17 (same taxonomy diagram as the previous slide)
  18. 18. AutoRec: Autoencoder-based Model  For each item, reconstruct its rating vector.  Only observed ratings are used to update the model. 18 [figure: rating vector r → hidden h(r) via encoder (W, b) → reconstruction r̂ via decoder (W′, b′)] Suvash Sedhain et al. “AutoRec: Autoencoders Meet Collaborative Filtering,” WWW 2015
  19. 19. Denoising Autoencoder (DAE)  Learn to reconstruct a user’s favorite set r̃ of items from randomly sampled subsets, i.e., a denoising autoencoder. 19 [figure: corrupted input r̃ → hidden h(r̃) via encoder (W, b) → reconstruction r̂ via decoder (W′, b′)] Yao Wu et al., “Collaborative Denoising Auto-Encoders for Top-N Recommender Systems,” WSDM 2016
  20. 20. Collaborative Denoising Autoencoder  Learn to reconstruct a user’s favorite set r̃ of items from randomly sampled subsets, i.e., a denoising autoencoder.  Train over all users, with variables shared across items and one specific variable per user, i.e., a k × 1 vector. 20 [figure: corrupted input r̃ plus a one-hot user vector mapped through V to a k × 1 user vector → hidden h → reconstruction r̂]
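A minimal PyTorch sketch of the CDAE idea described above, assuming implicit 0/1 feedback; the layer sizes, the dropout-based corruption, and the loss choice are illustrative rather than the paper's exact setup:

```python
import torch
import torch.nn as nn

class CDAE(nn.Module):
    """Collaborative Denoising Autoencoder sketch: item weights are shared
    across users, and each user contributes one extra k-dim vector."""
    def __init__(self, num_users, num_items, k=64, corrupt=0.5):
        super().__init__()
        self.encoder = nn.Linear(num_items, k)        # W, b
        self.user_vec = nn.Embedding(num_users, k)    # user-specific k x 1 vector
        self.decoder = nn.Linear(k, num_items)        # W', b'
        self.corrupt = corrupt

    def forward(self, r, user_id):
        # Dropout stands in for the random corruption of positive inputs.
        r_tilde = nn.functional.dropout(r, p=self.corrupt, training=self.training)
        h = torch.sigmoid(self.encoder(r_tilde) + self.user_vec(user_id))
        return self.decoder(h)                        # reconstruction r_hat

# Loss is computed on observed entries only (assumption: 0/1 feedback).
model = CDAE(num_users=100, num_items=500)
r = torch.zeros(1, 500); r[0, [3, 42, 7]] = 1.0       # one user's rating vector
r_hat = model(r, torch.tensor([0]))
loss = nn.functional.binary_cross_entropy_with_logits(r_hat[r > 0], r[r > 0])
```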
  21. 21. Generalized Matrix Factorization (GMF)  Embeddings are used to learn latent user and item features.  Input: one-hot feature vectors for user u and item i  Output: predicted score ŷ_ui  With an element-wise product, it is the same as the existing MF model. 21 [figure: one-hot user/item vectors → user and item embeddings → latent vectors → element-wise product → fully connected layer w/o bias → ŷ_ui] Xiangnan He et al., “Neural Collaborative Filtering,” WWW 2017
  22. 22. Step 1: Embedding Users and Items  User embedding  Represents a latent feature vector for each user.  Item embedding  Similarly, it represents a latent feature vector for each item. 22 [0 1 0 … 0] × (m × k embedding matrix) = [0.6 0.4 … 1.2]: a 1 × m one-hot vector times an m × k matrix selects a 1 × k latent vector.
  23. 23. Step 1: Embedding Users and Items 23 [figure: one-hot user/item vectors → embeddings → latent user vector 0.6 … 1.2 and latent item vector 1.2 … 0.3, i.e., rows of U and of V]
  24. 24. Step 2: Element-wise Product  This corresponds to the linear interaction of U and Vᵀ. 24 [figure: element-wise product of 0.6 … 1.2 and 1.2 … 0.3 gives 0.72 … 0.36]
  25. 25. Step 3: Passing the Fully Connected Layer 25 [figure: element-wise product 0.72 … 0.36 → fully connected layer w/o bias → ŷ_ui] If the weights are all equal, it is the same as the existing MF model: R̂ = w(U ⊗ Vᵀ).
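The three GMF steps above fit in a few lines of PyTorch; nn.Embedding stands in for the one-hot-times-matrix lookup of Step 1, and all names and sizes here are illustrative:

```python
import torch
import torch.nn as nn

class GMF(nn.Module):
    """Generalized Matrix Factorization sketch (after He et al., WWW 2017)."""
    def __init__(self, num_users, num_items, k=8):
        super().__init__()
        self.user_emb = nn.Embedding(num_users, k)   # step 1: embedding lookup
        self.item_emb = nn.Embedding(num_items, k)
        self.out = nn.Linear(k, 1, bias=False)       # step 3: FC layer w/o bias

    def forward(self, u, i):
        z = self.user_emb(u) * self.item_emb(i)      # step 2: element-wise product
        return self.out(z).squeeze(-1)               # predicted score y_hat_ui

# With all output weights fixed to 1, this reduces to the MF inner product.
model = GMF(100, 500)
y_hat = model(torch.tensor([0]), torch.tensor([42]))
```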
  26. 26. MLP-based Matrix Factorization  Instead of the element-wise product, concatenate the latent user and item vectors. 26 [figure: one-hot user/item vectors → embeddings → latent vectors → concatenation → Layer 1 … Layer X → fully connected w/o bias → ŷ_ui]
  27. 27. MLP-based Matrix Factorization  Learning non-trivial interactions between users and items 27 [figure: latent vectors 0.6 … 1.2 and 1.2 … 0.3 concatenated → Layer 1 … Layer X → ŷ_ui]
  28. 28. Fusing Two Solutions  Neural collaborative filtering (NeuMF): fuse the GMF and MLP branches. 28 [figure: MF user/item vectors → element-wise product (GMF layer); MLP user/item vectors → concatenation → MLP Layer 1 … Layer X; both outputs concatenated in the NeuMF layer → fully connected w/o bias → ŷ_ui]
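A hedged PyTorch sketch of the NeuMF fusion, with separate embeddings for the GMF and MLP branches as on the slide; the hidden-layer sizes are arbitrary:

```python
import torch
import torch.nn as nn

class NeuMF(nn.Module):
    """NeuMF sketch: a GMF branch and an MLP branch are concatenated
    before the final prediction layer."""
    def __init__(self, num_users, num_items, k=8, hidden=(32, 16)):
        super().__init__()
        self.gmf_u = nn.Embedding(num_users, k)
        self.gmf_i = nn.Embedding(num_items, k)
        self.mlp_u = nn.Embedding(num_users, k)
        self.mlp_i = nn.Embedding(num_items, k)
        layers, d = [], 2 * k
        for h in hidden:                             # MLP Layer 1 .. Layer X
            layers += [nn.Linear(d, h), nn.ReLU()]
            d = h
        self.mlp = nn.Sequential(*layers)
        self.out = nn.Linear(k + d, 1, bias=False)   # fully connected w/o bias

    def forward(self, u, i):
        gmf = self.gmf_u(u) * self.gmf_i(i)          # GMF branch
        mlp = self.mlp(torch.cat([self.mlp_u(u), self.mlp_i(i)], dim=-1))
        return self.out(torch.cat([gmf, mlp], dim=-1)).squeeze(-1)

model = NeuMF(100, 500)
y_hat = model(torch.tensor([0]), torch.tensor([42]))
```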
  29. 29. Example: Product Recommendation 29 https://www.bloter.net/archives/288812
  30. 30. Distributional Hypothesis  “You shall know a word by the company it keeps.” (J.R. Firth, 1957)  Words with similar contexts share similar meanings.  One of the most successful ideas of modern NLP  What is Tejuino?  A cup of Tejuino is on the table.  A woman likes Tejuino.  Tejuino makes you drunk.  I usually drink Tejuino every morning. 30
  31. 31. Prod2Vec using Word Embeddings  How to perform product embedding?  Adopt the skip-gram model for products.  Input: the i-th product purchased by the user  Output: the other products purchased by the user 31 A set of words → A set of products purchased by the user Mihajlo Grbovic et al., “E-commerce in Your Inbox: Product Recommendations at Scale,” KDD 2015
  32. 32. Prod2Vec using Word Embeddings  Imagine that the existing user-item matrix.  Word  Movie, A set of words  User  The window size is ignored. 32 3 3 ? 2 ? ? 4 1 5 4 ? ? 3 ? ? 3 … … … … … … i-th movie watched by the user Projection Softmax Transform The other movie watched by the user
  33. 33. Possible Models of Prod2Vec 33 Prod2Vec (skip-gram model): the i-th product purchased by the user → projection → softmax → the other products purchased by the user. User embedding + Prod2Vec: the i-th product purchased by the user → projection + averaging → softmax → the user’s products except for the i-th purchased product.
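Since Prod2Vec reuses the skip-gram objective, an off-the-shelf word2vec implementation can stand in for it; the gensim sketch below treats each user's purchase list as a "sentence" (the product names are made up, and the large window approximates ignoring the window entirely, as the slide suggests):

```python
from gensim.models import Word2Vec

# Each "sentence" is one user's purchase sequence; products act as words.
purchases = [
    ["iphone", "airpods", "case"],
    ["galaxy", "buds", "case"],
    ["iphone", "case", "charger"],
]

# sg=1 selects the skip-gram objective; a very large window effectively
# treats the whole purchase set as context.
model = Word2Vec(purchases, vector_size=32, sg=1, window=100, min_count=1)
print(model.wv.most_similar("case", topn=2))
```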
  34. 34. 34 About Time (2013) Drama/Fantasy/Coming-of-age: If you could live a moment over again, could you achieve perfect love? On the day he comes of age, Tim, who has never had a girlfriend, hears a startling family secret from his father: he can travel back in time! Tim moves to London to fulfill his dream of finding a girlfriend and falls for Mary at first sight... Silver Linings Playbook (2012) Drama/Comedy; Secret Life of Walter Mitty, The (2013) Fantasy/Drama; Perks of Being a Wallflower, The (2012) Drama/Coming-of-age; The Theory of Everything (2014) Drama/Romance; What If (2013) Romance/Comedy; Man Up (2015) Drama/Romance; Love, Rosie (2014) Romance/Comedy; Two Night Stand (2014) Romance/Comedy [columns: GloVe | Skip-gram]
  35. 35. Mulan (1998) Animation: A Disney masterpiece steeped in an Eastern atmosphere! Mulan, the only daughter of the Fa family, is rejected at every matchmaking meeting because of her tomboyish personality. When the Huns invade and a conscription order is issued, she disguises herself as a man to serve in place of her aging father... Jungle Book, The (1967) Animation; Antz (1998) Animation; Lady and the Tramp (1955) Animation; Peter Pan (1953) Animation; Thumbelina (1994) Animation; A Dinosaur's Story (1993) Animation; Quest for Camelot (1998) Animation; Return to Never Land (2002) Animation 35 [columns: GloVe | Skip-gram]
  36. 36. Machine, The (2013) Fantasy/SF (robot): The boundary between human and robot disappears! Signal, The (2014) Thriller/SF (computer); Zero Theorem, The (2013) Fantasy/Drama (computer); Autómata (2014) Thriller/Action (robot); Cargo (2009) Thriller/Mystery (space). In a new Cold War era, Ava, a killing machine built from human brain data, gradually comes to feel human emotions, and with her at the center the machines declare a final war on humankind... The East (2013) Thriller/Action (spy); Signal, The (2014) Thriller/SF (computer); Autómata (2014) Thriller/Action (robot); Cargo (2009) Thriller/Mystery (space) 36 [columns: GloVe | Skip-gram]
  37. 37. Session-based Recommendation  Sequential behavior (global information)  Last click (local information) 37 Session input Recommendation
  38. 38. GRU4Rec: RNN-based Model 38 Item2Vec RNN Layer Balazs Hidasi et al., “Session-based Recommendations with Recurrent Neural Networks”, RecSys Workshop 2015
  39. 39. NARM: Attention-based Model  Recommend the next item in a given session.  Combine global and local information.  Both are computed with RNN encoders. 39 Jing Li et al., “Neural Attentive Session-based Recommendation,” CIKM 2017
  40. 40. Step 1: Global Encoder in NARM  Capturing the user’s sequential behavior  h_t = (1 − z_t) ⊙ h_{t−1} + z_t ⊙ ĥ_t  z_t = σ(W_z x_t + U_z h_{t−1}), where z_t is the update gate  ĥ_t = tanh(W x_t + U(r_t ⊙ h_{t−1}))  r_t = σ(W_r x_t + U_r h_{t−1}), where r_t is the reset gate 40
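The four GRU equations above translate directly into NumPy; this sketch uses randomly initialized weights just to show the data flow:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, P):
    """One GRU step following the slide's equations (P holds the weights)."""
    z_t = sigmoid(P["Wz"] @ x_t + P["Uz"] @ h_prev)           # update gate
    r_t = sigmoid(P["Wr"] @ x_t + P["Ur"] @ h_prev)           # reset gate
    h_cand = np.tanh(P["W"] @ x_t + P["U"] @ (r_t * h_prev))  # candidate state
    return (1.0 - z_t) * h_prev + z_t * h_cand                # new hidden state

d, k = 16, 32                                      # input and hidden sizes
rng = np.random.default_rng(0)
P = {n: rng.normal(scale=0.1, size=(k, d if n.startswith("W") else k))
     for n in ["Wz", "Uz", "Wr", "Ur", "W", "U"]}
h = np.zeros(k)
for x in rng.normal(size=(5, d)):                  # run over a 5-item session
    h = gru_step(x, h, P)
```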
  41. 41. Step 2: Local Encoder in NARM  Capturing the user’s main purpose  α_tj = q(h_t, h_j), where h_t is the latent vector for the last item  c_t^l = Σ_{j=1}^{t} α_tj h_j  q is an attention scoring function for h_t and h_j:  q(h_t, h_j) = vᵀ σ(A_1 h_t + A_2 h_j) 41
  42. 42. Step 3: Decoder in NARM  Concatenated vector c_t = [c_t^g ; c_t^l] = [h_t^g ; Σ_{j=1}^{t} α_tj h_j^l]  Use an alternative bi-linear similarity function:  S_i = emb_iᵀ B c_t, where B is a |D| × |H| matrix and |D| is the embedding dimension of each item. 42
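Putting steps 2 and 3 together, here is a small PyTorch sketch of the attention scoring and the bi-linear decoder; the random weights and dimensions are illustrative, and the attention weights are left unnormalized as in the slide's formulas:

```python
import torch

def narm_attention(h_seq, A1, A2, v):
    """NARM local encoder sketch: score each hidden state against the last.
    h_seq: (t, k) RNN hidden states; A1, A2: (k, k); v: (k,)."""
    h_t = h_seq[-1]                                   # latent vector of last item
    scores = torch.sigmoid(h_seq @ A2.T + h_t @ A1.T) @ v   # q(h_t, h_j)
    alpha = scores                       # raw q(.) weights, no softmax shown
    c_local = (alpha.unsqueeze(-1) * h_seq).sum(dim=0)
    return torch.cat([h_t, c_local])                  # c_t = [c_g ; c_l]

t, k, n_items = 5, 32, 1000
h_seq = torch.randn(t, k)
A1, A2, v = torch.randn(k, k), torch.randn(k, k), torch.randn(k)
c_t = narm_attention(h_seq, A1, A2, v)                # (2k,)
emb = torch.randn(n_items, k)                         # item embeddings
B = torch.randn(k, 2 * k)                             # bi-linear matrix
scores = emb @ (B @ c_t)                              # S_i = emb_i^T B c_t
```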
  43. 43. Combining Attention and Memory  m_s represents the average vector of the session items:  m_s = (1/t) Σ_{i=1}^{t} x_i  m_t is the vector of the last item. 43 Qiao Liu et al., “Short-Term Attention/Memory Priority Model for Session-based Recommendation,” KDD 2018
  44. 44. STAMP: Attention/Memory Priority  m_a is the sum of attention coefficients multiplied by embedding vectors:  m_a = Σ_{i=1}^{t} α_i x_i  Attention coefficient: α_i = W_0 σ(W_1 x_i + W_2 m_s + W_3 x_t + b) 44
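The STAMP attention can be sketched analogously; m_s, x_t, and the coefficient formula follow the slides, while the sizes and random weights are illustrative:

```python
import torch

def stamp_attention(x_seq, W0, W1, W2, W3, b):
    """STAMP attention sketch following the slide's coefficient formula.
    x_seq: (t, k) item embeddings of the session."""
    m_s = x_seq.mean(dim=0)                         # average of session items
    x_t = x_seq[-1]                                 # memory of the last item
    a = torch.sigmoid(x_seq @ W1.T + m_s @ W2.T + x_t @ W3.T + b) @ W0  # (t,)
    m_a = (a.unsqueeze(-1) * x_seq).sum(dim=0)      # attention-weighted sum
    return m_a, m_s, x_t

t, k = 5, 32
x_seq = torch.randn(t, k)
W1, W2, W3 = (torch.randn(k, k) for _ in range(3))
W0, b = torch.randn(k), torch.randn(k)
m_a, m_s, m_t = stamp_attention(x_seq, W0, W1, W2, W3, b)
```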
  45. 45. MMCF: Multimodal Collaborative Filtering for Automatic Playlist Continuation RecSys Challenge 2018 Team ‘hello world!’ (2nd place), main track Hojin Yang, Yoonki Jeong, Minjin Choi, and Jongwuk Lee Sungkyunkwan University, Republic of Korea
  46. 46. Automatic Playlist Continuation  Million Playlist Dataset (MPD) 46 Playlist title Tracks in the playlist Metadata of tracks (artist, album)
  47. 47. Challenge Set 47
      Subset            1      2      3      4      5      6      7      8      9         10
      # of tracks       0      1      5      10     5      10     25     100    25        100
      Title available   Yes    Yes    Yes    Yes    No     No     Yes    Yes    Yes       Yes
      Track order       Seq    Seq    Seq    Seq    Seq    Seq    Seq    Seq    Shuffled  Shuffled
      # of playlists    1,000 in each subset
      Callouts: few tracks in the first part; many tracks in the first part; many tracks in random positions
  48. 48. Challenge Set 48 (same table as above): subset 1 has no tracks in the playlist. How to deal with this edge case?
  49. 49. Challenge Set 49 (same table as above): scarce information. How to treat playlists with scarce information?
  50. 50. Challenge Set 50 (same table as above): various types of input. How to deal with the various types of input?
  51. 51. Overview of the Proposed Model  An ensemble method with two components.  Autoencoder for tracks and track metadata.  CharCNN for playlist titles. 51
  52. 52. Overview of the Proposed Model  An ensemble method with two components.  Autoencoder for tracks and track metadata.  CharCNN for playlist titles. 52
  53. 53. Autoencoder-based Model 53  Learn a latent representation of a given playlist consisting of a set of tracks. [figure: 0/1 input vector over tracks (Hey Jude, Rehab, Yesterday, Dancing Queen, Mamma Mia, Viva la Vida) → encoder → decoder → output scores 0.9, 0.01, 0.78, 0.9, 0.6, 0.8; ✔ top-1 recommendation]
  54. 54. Denoising Autoencoder 54  Training with denoising.  Some positive input values are corrupted (set to zero). [figure: input 1 0 1 1 0 1 → corrupted 1 0 1 0 0 0 → encoder → decoder → reconstruction 0.9, 0.01, 0.78, 0.9, 0.6, 0.8]
  55. 55. Denoising Autoencoder 55  Training with denoising.  Some positive input values are corrupted (set to zero).  How to utilize metadata such as artists and albums? [figure: same denoising pipeline as above]
  56. 56. Utilizing Metadata 56  Concatenate an artist vector with the corresponding track vector of the playlist. [figure: track vector 1 0 1 1 0 1 concatenated with artist vector 0 1 1 0 → encoder → decoder → scores 0.9, 0.01, 0.78, 0.9, 0.6, 0.8 and 0.2, 0.98, 0.9, 0.6]
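A sketch of the metadata-aware denoising autoencoder: the artist vector is concatenated to the track vector before encoding, and both parts are reconstructed. The vocabulary sizes, dropout-based corruption, and layer sizes are assumptions, not the MMCF implementation:

```python
import torch
import torch.nn as nn

class PlaylistDAE(nn.Module):
    """Denoising autoencoder over a track vector concatenated with an
    artist vector (dimensions and names are illustrative)."""
    def __init__(self, num_tracks, num_artists, k=128, corrupt=0.5):
        super().__init__()
        self.enc = nn.Linear(num_tracks + num_artists, k)
        self.dec = nn.Linear(k, num_tracks + num_artists)
        self.corrupt = corrupt

    def forward(self, tracks, artists):
        x = torch.cat([tracks, artists], dim=-1)        # concatenated input
        x = nn.functional.dropout(x, self.corrupt, self.training)  # denoising
        return self.dec(torch.sigmoid(self.enc(x)))     # reconstruct both parts

model = PlaylistDAE(num_tracks=2000, num_artists=300)
tracks = torch.zeros(1, 2000); tracks[0, [5, 17, 99]] = 1.0
artists = torch.zeros(1, 300); artists[0, [2, 8]] = 1.0
recon = model(tracks, artists)          # scores for tracks and artists
```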
  57. 57. Training Strategy: Hide-and-Seek  Randomly choose either the playlist or its artists as input. 57
  58. 58. Training Strategy: Hide-and-Seek  Randomly choose either the playlist or its artists as input. 58
  59. 59. Training Strategy: Hide-and-Seek  Randomly choose either the playlist or its artists as input. 59
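One plausible reading of the hide-and-seek strategy as code; the exact sampling probabilities and corruption scheme used by the team are not specified on the slides:

```python
import random
import torch

def hide_and_seek(tracks, artists):
    """Randomly feed only the track part or only the artist part,
    zeroing out the other (a sketch; MMCF's exact scheme may differ)."""
    mode = random.choice(["both", "tracks_only", "artists_only"])
    if mode == "tracks_only":
        artists = torch.zeros_like(artists)
    elif mode == "artists_only":
        tracks = torch.zeros_like(tracks)
    return tracks, artists

tracks, artists = hide_and_seek(torch.ones(1, 2000), torch.ones(1, 300))
```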
  60. 60. CharCNN for Playlist Titles  An ensemble method with two components.  Autoencoder for tracks and track metadata.  CharCNN for playlist titles. 60
  61. 61. Word-level CNN for NLP  Effective for capturing the spatial locality of a sequence of text 61 [figure: “I like this song very much” → k-dimension embeddings → filter (3 × k) convolution → conv layer output 2.2, 2.3, −1.3, 0.9 → max pooling → feature 2.3]
  62. 62. Word-level CNN for NLP  Effective for capturing the spatial locality of a sequence of text 62 [figure: same sentence with multiple (3 × k) filters → conv layers → max pooling → feature vector 2.3, 1.2, 2.4]
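A minimal PyTorch sketch of the word-level CNN on the slides: 3-word filters convolve over k-dimensional embeddings, and max-pooling keeps one feature per filter (the token ids and sizes are made up):

```python
import torch
import torch.nn as nn

# Word-level text CNN: 3-word filters slide over k-dim embeddings,
# max-pooling keeps the strongest response per filter.
k, vocab, num_filters = 8, 100, 4
emb = nn.Embedding(vocab, k)
conv = nn.Conv1d(in_channels=k, out_channels=num_filters, kernel_size=3)

tokens = torch.tensor([[1, 5, 9, 2, 7, 3]])     # "I like this song very much"
x = emb(tokens).transpose(1, 2)                 # (batch, k, seq_len)
features = conv(x)                              # (batch, filters, seq_len - 2)
pooled = features.max(dim=2).values             # one feature per filter
print(pooled.shape)                             # torch.Size([1, 4])
```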
  63. 63. CharCNN for Playlist Titles  A playlist title is a short text that gives an abstract description of the playlist.  Use character-level embedding. 63 [figure: conv layers → feature vector]
  64. 64. Combining Two Models  Simplest method: w_item = 0.5 and w_title = 0.5 64
  65. 65. Combining Two Models  The accuracy of the AE highly relies on the number of tracks within a playlist.  Dynamic: set weights according to the number of items. 65 [figure: items + playlist title “Chill songs” → AE scores 0.7, 0.4, 0.9, 0.1, 0.2 and CNN scores 0.1, 0.2, 0.3, 0.7, 0.1 → combined 0.6, 0.4, 0.7, 0.2, 0.2 with w_item = 5, w_title = 1]
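A sketch of the dynamic weighting in plain Python; the thresholds and the 5:1 ratio mirror the slide's example, but the actual MMCF weights were presumably tuned:

```python
def combine_scores(ae_scores, cnn_scores, num_tracks):
    """Trust the autoencoder more when the playlist already contains many
    tracks; fall back to the title CNN when tracks are missing."""
    if num_tracks == 0:                 # title-only playlist: CNN only
        w_item, w_title = 0.0, 1.0
    elif num_tracks >= 5:               # enough tracks: lean on the AE
        w_item, w_title = 5.0, 1.0
    else:                               # scarce tracks: balanced weights
        w_item, w_title = 0.5, 0.5
    total = w_item + w_title
    return [(w_item * a + w_title * c) / total
            for a, c in zip(ae_scores, cnn_scores)]

scores = combine_scores([0.7, 0.4, 0.9], [0.2, 0.3, 0.7], num_tracks=5)
```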
  66. 66. RecSys Challenge 2018 66
  67. 67. Recent Progress in Our Lab  “Dual Neural Personalized Ranking,” WWW 2019  “Characterization and Early Detection of Evergreen News Articles,” ECML/PKDD 2019 (To appear)  “Collaborative Distillation for Top-N Recommendation,” ICDM 2019 (To appear) 67
  68. 68. Q&A  https://jongwuklee.weebly.com/ 68
  69. 69. SR-GNN using Graph Neural Nets  Each session graph is processed one by one, and node vectors are obtained through a gated graph neural network.  Each session is represented as the combination of the global preference and the current interest of the session using an attention net. 69 Shu Wu et al., “Session-based Recommendation with Graph Neural Networks,” AAAI 2019
  70. 70. RecSys Challenge 2018 70
