SESSION-BASED RECOMMENDATIONS WITH RECURRENT NEURAL NETWORKS
ICLR 2016
Author
Balázs Hidasi
Gravity R&D, Head of Data Mining and Research
Ph.D.: Context-aware factorization methods for implicit feedback based recommendation problems (2016)
Alexandros Karatzoglou
Google, Staff Research Scientist (since Sep 2019)
Telefonica Research, Scientific Director
Joint work
● Parallel Recurrent Neural Network Architectures for Feature-rich Session-based Recommendations (RecSys '16)
● Recurrent Neural Networks with Top-k Gains for Session-based Recommendations (CIKM '18)
About us
Gravity is a personalization engine vendor offering a product
portfolio called Yusp, filling multiple needs using the same
underlying technology.
Mission
To enable businesses to be relevant and not forgotten.
Gravity R&D Inc.
Source: https://www.yusp.com/
Source: https://ko.wikipedia.org/wiki/%ED%85%94%EB%A0%88%ED%8F%AC%EB%8B%88%EC%B9%B4
● In e-commerce systems, it is hard to track users
● even when it is possible,
 - most users have only one or two sessions,
 - which should be handled independently
● MF is hard to apply here and not accurate
● item-to-item similarity is used instead
 - effective in practice,
 - but it takes only the user's last click into account, ignoring the past clicks
● Even if the user browsed cosmetics -> vacuum cleaner -> milk -> clothes,
● the earlier clicks are ignored and only clothes-related items are recommended
Related Work: Co-Occurrence, Item2Vec
Source: https://brunch.co.kr/@goodvc78/16
Extended version of the General Factorization Framework (GFF)
- two kinds of latent representations for an item:
 - a representation of the item itself
 - a representation of the item as part of a session
- does not consider ordering within the session
🤔
Source: https://medium.com/snipfeed/time-dependent-recommender-systems-part-2-92e8dfaf1199
Contribution
- First to apply an RNN to this task (earlier attempts to bring in deep learning existed)
- Session-parallel mini-batches
- Sampling on the output
- Ranking loss
Model Architecture
Model I/O
Input: an item click (1-of-N encoding) or an event (weighted sum of items)
Output: the item of the next event
Model I/O
Without an embedding layer, 1-of-N encoding always performed better.
[Figure: 1-of-N encoding vs. weighted sum of items as input]
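As a rough illustration of this I/O scheme, here is a minimal PyTorch sketch (my own, not the authors' Theano code); `n_items` and the hidden size are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GRU4RecSketch(nn.Module):
    """Minimal sketch: 1-of-N item input -> GRU -> a score per item."""
    def __init__(self, n_items, hidden_size=100):
        super().__init__()
        self.n_items = n_items
        self.gru = nn.GRU(n_items, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, n_items)

    def forward(self, item_ids, hidden=None):
        # 1-of-N encoding of the clicked items: (batch, seq, n_items)
        x = F.one_hot(item_ids, self.n_items).float()
        h, hidden = self.gru(x, hidden)
        # scores for the next item at every step
        return self.out(h), hidden
```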
Model Architecture : GRU(Gated Recurrent Unit)
Basic RNN structure
Source: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21
Model Architecture : GRU(Gated Recurrent Unit)
In the experiments, the GRU beat both the classic RNN and the LSTM
Source: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21
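For reference, the standard GRU update (the same formulation the paper uses):

```latex
z_t = \sigma(W_z x_t + U_z h_{t-1}) \qquad \text{(update gate)} \\
r_t = \sigma(W_r x_t + U_r h_{t-1}) \qquad \text{(reset gate)} \\
\hat{h}_t = \tanh\big(W x_t + U (r_t \odot h_{t-1})\big) \\
h_t = (1 - z_t)\, h_{t-1} + z_t\, \hat{h}_t
```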
SESSION-PARALLEL MINI-BATCHES
1) Session lengths vary even more than sentence lengths;
some sessions consist of only 2 events
2) We want to capture how a session evolves over time
Sessions are assumed to be independent, so the appropriate hidden state is reset whenever a session switch occurs (see the sketch below)
-> good luck implementing this in code...
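A minimal NumPy sketch of the idea (a hypothetical helper, not the authors' implementation): each of the `batch_size` slots carries one active session; when a session runs out of events, the next session takes over that slot and the reset mask tells the training loop to zero that slot's hidden state. Assumes every session has at least 2 events and there are at least `batch_size` sessions.

```python
import numpy as np

def session_parallel_batches(sessions, batch_size=3):
    """Yield (inputs, targets, reset_mask) for session-parallel training.

    sessions: list of sessions, each a list of item ids (length >= 2).
    reset_mask[i] is True when slot i just switched to a new session,
    i.e. that slot's hidden state must be reset to zero.
    """
    next_session = batch_size
    sess = list(range(batch_size))   # active session per slot
    pos = [0] * batch_size           # position inside each session
    reset = [True] * batch_size      # fresh sessions: reset everything
    while True:
        inputs = np.array([sessions[sess[i]][pos[i]] for i in range(batch_size)])
        targets = np.array([sessions[sess[i]][pos[i] + 1] for i in range(batch_size)])
        yield inputs, targets, np.array(reset)
        reset = [False] * batch_size
        for i in range(batch_size):
            pos[i] += 1
            # no next target left in this session -> load a new one
            if pos[i] + 1 >= len(sessions[sess[i]]):
                if next_session >= len(sessions):
                    return
                sess[i], pos[i], reset[i] = next_session, 0, True
                next_session += 1
```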
Sampling on the output
Source: Deep Neural Networks for YouTube Recommendations
Why?
● Calculating a score for every item at each step is infeasible
● We have to sample items when computing scores
How?
● Sample based on popularity (sketch below)
● With implicit data,
● an unseen item may mean dislike, or that the user simply didn't know about it
● e.g. IU's new album: no interest, or just unaware?
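In the paper the negative samples come almost for free: for each example in the mini-batch, the target items of the other examples serve as negatives, and since popular items land in mini-batches more often, this approximates popularity-based sampling. A small PyTorch sketch:

```python
import torch

def batch_sampled_scores(all_scores, targets):
    """all_scores: (batch, n_items) network outputs;
    targets: (batch,) target item ids.

    Returns a (batch, batch) matrix whose diagonal holds each example's
    positive score and whose off-diagonal entries are the scores of the
    other examples' targets, used as popularity-sampled negatives."""
    return all_scores[:, targets]
```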
Sampling on the output
Source: https://instagram-engineering.com/powered-by-ai-instagrams-explore-recommender-system-7ca901d2a882
Ranking Loss
Input: Killing Eve -> Before Sunset -> Elle -> One Fine Spring Day -> Be Melodramatic -> ??
Output ground truth: When Harry Met Sally
Candidates: [Believer, When Harry Met Sally, The Yellow Sea, Joker, Before Sunrise]
Predictions: [0.1, 0.8, 0.2, 0.5, 0.9]
After sorting: [Before Sunrise, When Harry Met Sally, Joker, The Yellow Sea, Believer]
Loss
Pointwise: push When Harry Met Sally to 1 and all the others to 0.
Pairwise: widen the score gaps for (When Harry Met Sally, Believer), (When Harry Met Sally, The Yellow Sea), (When Harry Met Sally, Joker), (When Harry Met Sally, Before Sunrise)! (toy computation below)
Listwise: turn [Before Sunrise, When Harry Met Sally, Joker, The Yellow Sea, Believer] into [When Harry Met Sally, Before Sunrise, Joker, The Yellow Sea, Believer]!
→ requires sorting, O(N log N) → Nope!
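As a toy check of the pairwise option on the slide's numbers, here are the paper's two ranking losses (BPR and TOP1) computed with NumPy:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# scores for [Believer, When Harry Met Sally, The Yellow Sea, Joker, Before Sunrise]
scores = np.array([0.1, 0.8, 0.2, 0.5, 0.9])
pos = 1  # When Harry Met Sally is the ground truth
neg = np.delete(scores, pos)

# BPR: push the positive score above every sampled negative
bpr = -np.mean(np.log(sigmoid(scores[pos] - neg)))
# TOP1: the paper's other loss, with a regularization term on the negative scores
top1 = np.mean(sigmoid(neg - scores[pos]) + sigmoid(neg ** 2))
print(f"BPR: {bpr:.4f}  TOP1: {top1:.4f}")
```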
Bayesian Personalized Ranking
UAI 2009
Source: https://arxiv.org/pdf/1205.2618.pdf
- When dealing with implicit data,
- seen items should be ranked higher than unseen items
- we don't know the ranking between seen items,
- nor between unseen items (objective below)
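The objective from the BPR paper: for triples where user $u$ saw item $i$ but not item $j$, maximize

```latex
\text{BPR-Opt} = \sum_{(u,i,j) \in D_S} \ln \sigma\!\left(\hat{x}_{ui} - \hat{x}_{uj}\right) \;-\; \lambda_\Theta \lVert \Theta \rVert^2
```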
BPR-max
Recurrent Neural Networks with Top-k Gains for Session-based Recommendations
Source: https://www.slideshare.net/balazshidasi/gru4rec-v2-recurrent-neural-networks-with-topk-gains-for-sessionbased-recommendations
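If I recall the follow-up paper correctly, BPR-max weights the pairwise terms with the softmax of the negative sample scores and adds a score regularizer; treat this as a sketch of the idea, not a verbatim quote:

```latex
L_{\text{BPR-max}} = -\log \sum_{j=1}^{N_S} s_j\, \sigma(r_i - r_j) \;+\; \lambda \sum_{j=1}^{N_S} s_j\, r_j^2,
\qquad s_j = \operatorname{softmax}(r_j)
```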
Experiments - Datasets (RSC15)
● Click and purchase logs of users on an e-commerce site
● The paper uses only the click data
● Sessions of length 1 are removed; 6 months of data are used
● Train set: 7,966,257 sessions of 31,637,239 clicks on 37,483 items
→ 3.97 clicks / session
● The data of the days subsequent to the training sessions form the test set
● Items in the test set that do not appear in the train set are removed
● Test set: 15,324 sessions of 71,222 events
→ 4.64 clicks / session
Experiments - Datasets (YouTube-like OTT)
● A YouTube-like service
● Similar to RSC15
● An item-to-item recommender was already running, so it probably influenced user behavior
→ selection bias?
● Very long sessions are suspected to be bots
Metric: Recall@20 and MRR@20 (sketch below)
Sources: https://ko.wikipedia.org/wiki/%EC%A0%95%EB%B0%80%EB%8F%84%EC%99%80_%EC%9E%AC%ED%98%84%EC%9C%A8
https://www.blabladata.com/2014/10/26/evaluating-recommender-systems/
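A minimal NumPy sketch of these two metrics (my own helper, names are illustrative):

```python
import numpy as np

def recall_mrr_at_k(scores, targets, k=20):
    """scores: (n_events, n_items) predicted scores for each test event;
    targets: (n_events,) the true next-item ids.

    Recall@k: fraction of events whose target lands in the top-k.
    MRR@k: mean reciprocal rank of the target, 0 if outside the top-k."""
    target_scores = scores[np.arange(len(targets)), targets]
    # rank = number of items scored strictly higher than the target, plus one
    ranks = (scores > target_scores[:, None]).sum(axis=1) + 1
    hit = ranks <= k
    return hit.mean(), np.where(hit, 1.0 / ranks, 0.0).mean()
```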
Baseline Experiments
Item-KNN is by far the strongest → chosen as the baseline model
GRU Model Experiments
● Weight initialization: uniform in [-x, x]
● Adagrad worked better than RMSProp
● GRU beat both LSTM and the classic RNN
● A single GRU layer is enough (probably because sessions are relatively short)
● Increasing the GRU size helps
● tanh works best as the activation
● Little difference in training/inference speed between 100 and 1000 units
● Being able to retrain frequently matters for a recommender system
● Much better than Item-KNN
● Cross-entropy is not a pairwise loss
● With cross-entropy, accuracy drops as the number of units grows
● Little difference between TOP1 and BPR
Author Code
The authors' published code (Theano): https://github.com/hidasib/GRU4Rec
A short review of that code: https://www.notion.so/zmin/Code-Review-Session-Based-RecSys-7edaa306e3424426aa6956e903fdcc2a
A Keras re-implementation of the code: https://github.com/pcerdam/KerasGRU4Rec
From RNN to Transformer
Source: Behavior Sequence Transformer for E-commerce Recommendation in Alibaba
From RNN to CNN
