KRED: Knowledge-Aware Document Representation
for News Recommendations
펀디멘탈팀
김창연, 김동희, 김준호, 송헌, 이민경, 이재윤
1
Introduction
2
Introduction
Online news now accounts for a large share of overall news consumption.
3
Source: Korea Press Foundation, Media Audience Survey
Introduction
A real-world news recommendation system
4
Source: Media Daum (미디어다음)
Introduction
How news recommendation differs from general recommendation systems
1. News articles are highly time-sensitive.
- More than 90% of news articles expire within two days of publication.
- Most readers do not read the same article more than once.
5
출처: DKN (Deep Knowledge-Aware Network for News Recommendation)
Introduction
How news recommendation differs from general recommendation systems
2. News articles possess accuracy, brevity, and clarity.
- Compared with other recommendation domains, this makes content-based (CB) recommendation easier.
- In particular, document representation built with NLU (Natural Language Understanding) works well.
6
Introduction
How news recommendation differs from general recommendation systems
3. A news article may contain several entities.
- A news article can be analyzed through the position, frequency, and relationships of its entities.
7
8
Introduction
How news recommendation differs from general recommendation systems
3. A news article may contain several entities.
- A news article can be analyzed through the position, frequency, and relationships of its entities.
- Coupling a knowledge graph with the news recommendation system gives access to comprehensive external
information about each entity.
9
Q & A
10
Proposed Method
11
Model Architecture
12
(Architecture diagram; the four numbered components are listed on the next slide.)
Proposed Method
1. Entity Representation Layer
2. Context Embedding Layer
3. Information Distillation Layer
4. Multi-task Learning
13
Knowledge Graph G = {(h, r, t) | h, t ∈ E, r ∈ R}
Entity Embedding:
Knowledge graph → a low-dimensional representation vector for each entity and relation.
The paper trains the entity embeddings with TransE (link); a minimal sketch follows below.
Entity Representation Layer
14
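The slide only names TransE and links to it. Below is a minimal sketch of the TransE scoring function and its margin ranking loss, assuming PyTorch and integer triplet indices; the class, embedding dimension, and margin are illustrative choices, not values taken from the paper.

```python
import torch
import torch.nn as nn

class TransE(nn.Module):
    """Minimal TransE sketch: entities and relations live in one embedding space,
    and a triplet (h, r, t) is scored by the distance || e_h + e_r - e_t ||."""
    def __init__(self, n_entities, n_relations, dim=90):
        super().__init__()
        self.ent = nn.Embedding(n_entities, dim)
        self.rel = nn.Embedding(n_relations, dim)

    def score(self, h, r, t):
        # lower distance = more plausible triplet
        return (self.ent(h) + self.rel(r) - self.ent(t)).norm(p=2, dim=-1)

def margin_loss(model, pos, neg, margin=1.0):
    """Margin ranking loss between a true triplet and a corrupted one."""
    d_pos = model.score(*pos)   # pos / neg are (h, r, t) index tensors
    d_neg = model.score(*neg)
    return torch.clamp(margin + d_pos - d_neg, min=0).mean()
```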
Entity Representation Layer
15
Entity Embedding → Entity Representation
KGAT: Knowledge Graph Attention Network for Recommendation
“An entity is not only represented by its own embedding, but also
partially represented by its neighbors.”
(KGAT figure: the CF part and the KG part)
Entity Representation Layer
The embedding that actually represents entity h combines its own embedding with an attention-weighted sum over its neighbors:
e_{N_h} = Σ_{(h, r, t) ∈ N_h} π(h, r, t) · e_t
(N_h: the set of triplets in which h is the head entity)
16
π(h, r, t): attention weight
Only the one-hop neighborhood is aggregated.
→ This can be viewed as a GCN that takes each entity e as a node and attends over its relations (a simplified sketch follows below).
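A simplified PyTorch sketch of this one-hop attention aggregation. In KGAT the weight π(h, r, t) is computed with relation-space projections and a tanh nonlinearity; here it is replaced by an illustrative learned scorer, and the neighborhood summary is simply added to the entity's own embedding, which is an assumption rather than the paper's exact combination.

```python
import torch
import torch.nn.functional as F

def entity_representation(e_h, neighbor_t, neighbor_r, w):
    """One-hop neighbor aggregation in the spirit of KGAT.

    e_h        : (dim,)             embedding of the head entity h
    neighbor_t : (n_neighbors, dim) tail-entity embeddings of triplets (h, r, t) in N_h
    neighbor_r : (n_neighbors, dim) relation embeddings of those triplets
    w          : (2 * dim, 1)       parameters of an illustrative attention scorer
    """
    # attention logit per triplet, standing in for pi(h, r, t)
    logits = torch.cat([neighbor_r, neighbor_t], dim=-1) @ w   # (n_neighbors, 1)
    pi = F.softmax(logits, dim=0)                              # normalized attention weights
    e_neighborhood = (pi * neighbor_t).sum(dim=0)              # weighted sum of tail embeddings
    # combine the entity's own embedding with its neighborhood summary
    return e_h + e_neighborhood
```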
Context Embedding Layer
- To reduce computation, the model avoids feeding the whole article text as input.
- Dynamic context (position, frequency, category) determines an entity's relevance and importance
within the article.
C(1): Position Encoding → a position bias vector for whether the entity appears in the title or the body.
C(2): Frequency Encoding → the occurrence count, discretized and capped at 20.
C(3): Category Encoding → a bias per category the entity belongs to (lookup-table sketch below).
17
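A lookup-table sketch of the three context encodings, assuming PyTorch embeddings and that the biases are simply added to the entity representation; the two-way title/body position split, the category vocabulary size, and the additive combination are assumptions for illustration, not the paper's exact parameterization.

```python
import torch
import torch.nn as nn

class ContextEmbedding(nn.Module):
    """Sketch of the three context bias vectors C(1)-C(3); sizes are illustrative."""
    def __init__(self, dim, n_categories, max_freq=20):
        super().__init__()
        self.position = nn.Embedding(2, dim)              # 0 = title, 1 = body
        self.frequency = nn.Embedding(max_freq + 1, dim)  # occurrence count, capped at 20
        self.category = nn.Embedding(n_categories, dim)   # category of the entity

    def forward(self, entity_vec, pos_id, freq, cat_id):
        freq = freq.clamp(max=20)
        # the context biases are added to the entity representation
        return (entity_vec
                + self.position(pos_id)
                + self.frequency(freq)
                + self.category(cat_id))
```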
Information Distillation Layer
An attentive mechanism merges an article's entities into a single representation vector.
18
: the original document vector of document d, produced by the NLU model
: the set of entities in document v
Information Distillation Layer
19
Finally, the pooled entity vector is concatenated with the DV and passed through an FFN.
v_k: KDV (Knowledge-based Document Vector); a sketch of this layer follows below.
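A PyTorch sketch of the attentive pooling and the final concat + FFN that produces the KDV. The attention scorer, hidden sizes, and names are illustrative assumptions rather than the paper's exact layers.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InformationDistillation(nn.Module):
    """Sketch: attentively pool entity vectors into one vector, then merge with the DV."""
    def __init__(self, ent_dim, doc_dim, hidden=128):
        super().__init__()
        self.att = nn.Linear(ent_dim + doc_dim, 1)           # attention scorer
        self.ffn = nn.Sequential(
            nn.Linear(ent_dim + doc_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, doc_dim),
        )

    def forward(self, entity_vecs, doc_vec):
        # entity_vecs: (n_entities, ent_dim), doc_vec: (doc_dim,)
        ctx = doc_vec.expand(entity_vecs.size(0), -1)
        weights = F.softmax(self.att(torch.cat([entity_vecs, ctx], dim=-1)), dim=0)
        pooled = (weights * entity_vecs).sum(dim=0)           # one vector for all entities
        # knowledge-enhanced document vector v_k (KDV)
        return self.ffn(torch.cat([pooled, doc_vec], dim=-1))
```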
- Real-world recommendation services span multiple recommendation domains.
(user2item, item2item, news popularity prediction, news category classification, local news detection)
- These domains differ in detail but share similar patterns, so they can complement one another.
- A domain with little data can be helped by other domains with larger datasets.
Multi-task Learning
20
Multi-task Learning
User2item
- UV (User Vector): attentive pooling over the vectors of the documents in the user's click history.
- The UV is then concatenated with the KDV and passed through an FFN to produce the prediction.
Otherwise
- Only the KDV is used.
- It is fed into an FFN that outputs a task-specific probability and is trained against the label.
(A sketch of both heads follows below.)
21
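A PyTorch sketch of the two kinds of prediction heads described above. The attention scorer used for the user vector and the hidden sizes are illustrative assumptions, not the paper's exact layers.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class User2ItemHead(nn.Module):
    """User2item head: attentive pooling of clicked-document KDVs,
    then concatenation with the candidate KDV and an FFN that outputs a score."""
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.att = nn.Linear(dim, 1)
        self.ffn = nn.Sequential(nn.Linear(2 * dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, clicked_kdvs, candidate_kdv):
        # clicked_kdvs: (n_clicked, dim) KDVs of the user's click history
        weights = F.softmax(self.att(clicked_kdvs), dim=0)
        user_vec = (weights * clicked_kdvs).sum(dim=0)        # UV: user vector
        return self.ffn(torch.cat([user_vec, candidate_kdv], dim=-1))

class SingleVectorHead(nn.Module):
    """Other tasks (e.g. category classification) use only the KDV followed by an FFN."""
    def __init__(self, dim, n_classes, hidden=128):
        super().__init__()
        self.ffn = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, n_classes))

    def forward(self, kdv):
        return self.ffn(kdv)   # logits; softmax/sigmoid is applied in the loss
```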
Multi-task Learning
Loss functions
- User2item, Item2item
- One positive item against five randomly sampled negative items (see the sketch below).
- Otherwise
- Cross-Entropy Loss
22
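A sketch of the two losses, assuming PyTorch. The positive-vs-five-negatives objective is written here as a softmax (cross-entropy) over the six candidate scores, which is one common formulation and may differ in detail from the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def ranking_loss(pos_score, neg_scores):
    """User2item / item2item: one positive against five sampled negatives,
    trained as a softmax over the 6 candidate scores."""
    logits = torch.cat([pos_score.view(1), neg_scores.view(-1)])   # (1 + 5,)
    target = torch.zeros(1, dtype=torch.long)                      # index 0 = the positive item
    return F.cross_entropy(logits.unsqueeze(0), target)

def classification_loss(logits, labels):
    """Other tasks (e.g. category classification) use a plain cross-entropy loss."""
    return F.cross_entropy(logits, labels)
```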
Q & A
23
Experiments
24
Dataset and Settings
1. Dataset
- A real-world industrial news dataset from Microsoft News (the MIND dataset).
- A set of impression logs ranging from Jan 15, 2019 to Jan 28, 2019.
- For personalized recommendation, the first week is used for training & validation and the second week as the test
set.
- After several cleaning steps: 665,034 users, 24,542 news articles, 1,590,092 interactions.
2. Knowledge graph construction
- Built with Satori, a Microsoft internal tool (related paper, Satori tutorial).
- The knowledge graph covers 3,314,628 entities, 1,006 relations, and 71,727,874 triples.
25
Experiment Settings
1. Comparison with document representation models that do not use the knowledge graph.
- DV_BERT, DV_LDA+DSSM, DV_LDA+DSSM + entity, DV_BERT + entity, NAML
2. Comparison with other knowledge-aware recommendation models.
- DKN, STCKA, ERNIE
- KRED_BERT, KRED_LDA+DSSM
- KRED_BERT single-task, KRED_LDA+DSSM single-task
3. Comparison with FM and Wide & Deep, representative recommendation-system baselines.
Dataset and Settings
26
Result
1. User-to-item Recommendation
- NLU-based models comfortably outperform FM and Wide & Deep.
- Knowledge entity information contributes substantially to recommendation accuracy (DV vs. KRED).
- KRED outperforms the existing knowledge-based models.
- Among the NLU models, using BERT as the DV gives the best performance.
- Multi-task learning improves performance over single-task training.
27
Result
2. Item-to-item Recommendation
- Similar items are retrieved with ANN (Approximate Nearest Neighbor) search (illustrative sketch below).
- KRED > BERT, ERNIE > knowledge-based models >> LDA
28
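An illustrative brute-force cosine-similarity retrieval as a stand-in for the ANN search mentioned above; a production system would use an ANN library instead, and the function and variable names here are assumptions.

```python
import numpy as np

def top_k_similar(item_vecs, query_vec, k=10):
    """Retrieve the k most similar articles by cosine similarity over their KDVs.
    This exhaustive search stands in for the approximate search used at scale."""
    item_norm = item_vecs / np.linalg.norm(item_vecs, axis=1, keepdims=True)
    query_norm = query_vec / np.linalg.norm(query_vec)
    scores = item_norm @ query_norm
    return np.argsort(-scores)[:k]   # indices of the k most similar articles
```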
Result
3. Document Classification and Popularity Detection
- When multi-task learning can draw on the larger datasets (user2item, item2item), performance improves
markedly.
29
Result
4. Ablation Study
Each step (KGAT, Context, Distillation) plays an important role.
30
Result
5. Visualization of Document Embeddings
- A higher C-H (Calinski-Harabasz) score indicates better-defined clusters (usage sketch below).
31
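How the C-H score can be computed with scikit-learn. Whether the paper scores against predicted clusters or against the news category labels is not stated on the slide, so both options appear here as an assumption.

```python
from sklearn.cluster import KMeans
from sklearn.metrics import calinski_harabasz_score

def ch_score(doc_vecs, labels=None, n_clusters=10):
    """Calinski-Harabasz score of document embeddings; higher means more compact,
    better-separated clusters. If no labels are given, cluster with KMeans first."""
    if labels is None:
        labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(doc_vecs)
    return calinski_harabasz_score(doc_vecs, labels)
```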
Result
6. Case Study
- The same word receives different attention scores depending on its context, position, and frequency.
32
Q & A
33
Contribution
1. KRED → fully flexible; it can enhance an arbitrary document vector with knowledge
entities.
⇒ Unlike DKN (limited to Kim-CNN), any language model (LDA + DSSM, BERT) can be plugged in.
2. A three-stage layer design; ablation studies confirm that all three stages matter for
performance.
⇒ efficient and effective knowledge fusion
3. Emphasizes the need for multi-task learning in news recommendation.
34
35
