When Relevance is not Enough: Promoting Diversity and Freshness in Personalized Question Recommendation
IDAN SZPEKTOR, YOELLE MAAREK, DAN PELLEG

YAHOO! RESEARCH
ABSTRACT
A good question recommendation system should be:
1. Designed around answerers, rather than exclusively for askers
2. Scalable to many questions and users, and fast enough
3. Relevant to each user's interests
4. Diverse
INTRODUCTION
Common approach: recommend questions only to the best possible answerers (“experts”)
Here: recommend questions to all potential answerers
INTRODUCTION
relevance: to what degree the question matches the user’s tastes
diversity and freshness needs
Three requirements:
1. questions need to be recommended for all types of users
2. questions have to be diverse
3. recommendations need to be fresh and be served fast
a) serve questions as recommendations immediately
b) adapt instantly to changes in users’ tastes
RELATED WORK
Limitations:
1. real-time ranking
2. the needs of new users with very little historical data are not addressed well
3. focus only on relevance
Framework
Question profile:
1. LDA model
2. Lexical model
3. Category model

User profile:
Question recommendation
Matching question and user profiles
Proactive diversification
Recommendation merging
QUESTION PROFILE
Split the questions according to the 26 top categories in Yahoo! Answers
Two advantages:
1. represent disjoint users’ interests
2. word sense disambiguation

Two sources:
1. question textual content (title and body)
2. category
QUESTION PROFILE
Build a profile, which is represented by three vectors:
1. a Latent Dirichlet Allocation (LDA) topic vector
2. a lexical vector
3. a category vector
LDA Model
1. Initial training: a random sample of up to 2 million resolved questions
2. Incremental learning: a random sample of up to half a million questions per top category
3. Inference: topics with at least 10% of the probability mass
Lexical Model
A unigram bag-of-words representation of a question
tf·idf scores, L1-normalized into a probability distribution
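
The lexical model can be illustrated with a short sketch. The slides do not give the exact tf·idf formula, so the smoothed idf below, the function name `lexical_vector`, and the toy document frequencies are all assumptions made for illustration.

```python
import math
from collections import Counter

def lexical_vector(question_tokens, doc_freq, num_docs):
    """Return an L1-normalized tf-idf distribution over the question's unigrams."""
    tf = Counter(question_tokens)
    # Smoothed idf: an assumption, the slides do not state the exact formula.
    weights = {
        term: count * math.log((1 + num_docs) / (1 + doc_freq.get(term, 0)))
        for term, count in tf.items()
    }
    total = sum(weights.values())
    if total == 0:
        return {}
    # Dividing by the L1 norm turns the scores into a probability distribution.
    return {term: w / total for term, w in weights.items()}

# Hypothetical toy statistics for a 1000-question collection.
example = lexical_vector(
    ["python", "list", "sort", "python"],
    doc_freq={"python": 50, "list": 200, "sort": 20},
    num_docs=1000,
)
```

Because the weights sum to 1, the lexical vector can be treated like the LDA and category vectors when the three are combined.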

Category Model
Assigns a probability of 1 to the category in which the question was posted
USER PROFILE
Built from the questions the user answered in the past
The user representation is generated by aggregating signals over these questions
User profile: a probability tree
1. Aggregate the profiles of the questions the user answered
2. Update
The first and third tree levels:
apply a decaying factor on past questions

The second level:
1. Measure the similarity between the feature distribution of each model in the question and the corresponding feature distribution in the user profile
2. Normalize to a probability distribution
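
As a rough sketch of the decaying update on the first and third tree levels: the slides only say that a decaying factor is applied to past questions, so the multiplicative form, the value `decay=0.9`, and the function name `decayed_update` are hypothetical.

```python
def decayed_update(profile, question_dist, decay=0.9):
    """Blend a newly answered question into one node's distribution.

    Older mass is scaled by `decay` (hypothetical value), the question's
    distribution is added, and the result is renormalized to sum to 1.
    """
    merged = {k: decay * v for k, v in profile.items()}
    for feature, p in question_dist.items():
        merged[feature] = merged.get(feature, 0.0) + p
    total = sum(merged.values())
    return {k: v / total for k, v in merged.items()} if total else {}

# Toy root-level distribution over top categories.
profile = {"Sports": 0.5, "Health": 0.5}
updated = decayed_update(profile, {"Sports": 1.0})
```

With this form, recently answered questions weigh more than older ones, which is one way a profile could follow changes in a user's taste.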
QUESTION RECOMMENDATION
Matching Question and User Profiles
A list of open questions ranked by a relevance score, which is calculated for the pair {question profile, user profile}

For question profiles:
1. Turn the three vectors forming the question profile into a single vector: multiply the probability of each feature by 1/3 before storing it in the index
2. Index every question vector and build an inverted index
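
The two indexing steps above could look roughly like this. The `(model, feature)` key scheme and the function names are assumptions for illustration; the slides do not describe the actual index layout.

```python
from collections import defaultdict

def combine_profile(lda, lexical, category):
    """Merge three probability vectors into one, weighting each by 1/3."""
    combined = {}
    for prefix, vec in (("lda", lda), ("lex", lexical), ("cat", category)):
        for feature, p in vec.items():
            # Prefix keys so features from different models never collide.
            combined[(prefix, feature)] = p / 3.0
    return combined

def build_inverted_index(question_vectors):
    """Map each feature to the (question_id, weight) pairs that contain it."""
    index = defaultdict(list)
    for qid, vec in question_vectors.items():
        for feature, weight in vec.items():
            index[feature].append((qid, weight))
    return index

# Two hypothetical open questions.
q1 = combine_profile({"t7": 1.0}, {"guitar": 0.6, "chord": 0.4}, {"Music": 1.0})
q2 = combine_profile({"t3": 1.0}, {"guitar": 1.0}, {"Shopping": 1.0})
index = build_inverted_index({"q1": q1, "q2": q2})
```

Since each of the three input vectors sums to 1, the combined vector also sums to 1 after the 1/3 weighting.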
QUESTION RECOMMENDATION
For user profiles:
Associate with each user feature a score that consists of the product of each probability score on the tree path that led to this feature

Ranking:
Similarity: a simple dot-product
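
A minimal sketch of the user-side scoring and the dot-product ranking, assuming the probability tree is stored as nested dictionaries (category, then model, then feature). That layout, the key tuples, and the toy numbers are hypothetical.

```python
def flatten_user_tree(tree):
    """Score each leaf feature by the product of probabilities on its path."""
    scores = {}
    for category, (p_cat, models) in tree.items():
        for model, (p_model, features) in models.items():
            for feature, p_feat in features.items():
                scores[(category, model, feature)] = p_cat * p_model * p_feat
    return scores

def relevance(user_scores, question_vector):
    """Dot product over the features the two sparse vectors share."""
    return sum(w * question_vector.get(f, 0.0) for f, w in user_scores.items())

# Toy tree: each node is (probability, children).
tree = {
    "Music": (0.8, {
        "lda": (0.5, {"t7": 1.0}),
        "lex": (0.5, {"guitar": 0.7, "chord": 0.3}),
    }),
    "Sports": (0.2, {"lex": (1.0, {"tennis": 1.0})}),
}
user_scores = flatten_user_tree(tree)
question = {("Music", "lex", "guitar"): 1 / 3, ("Music", "lda", "t7"): 1 / 3}
score = relevance(user_scores, question)
```

In a real system the dot product would be evaluated through the inverted index rather than by scanning every question.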
QUESTION RECOMMENDATION
Proactive Diversification
Thematic sampling:
1. For each user vector u, we generate N query vectors u_1, u_2, …, u_N
2. Retrieve N ranked lists, one per query vector
3. Blending them together results in a final diverse list

Two types of thematic constraints:
1. Specific top category: randomly select top categories as constraints by sampling without repetition, based on their distribution in the root node of the user’s probability tree
2. Specific LDA topic: randomly sample LDA topics without repetition from the user profile by traversing the probability tree
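
Sampling constraints without repetition according to a distribution can be sketched as follows. The function name and the toy root distribution are assumptions; the slides do not specify how the sampling is implemented.

```python
import random

def sample_without_repetition(distribution, n, rng=None):
    """Draw up to n distinct keys, each draw weighted by remaining probability."""
    rng = rng or random.Random()
    remaining = dict(distribution)
    chosen = []
    while remaining and len(chosen) < n:
        keys = list(remaining)
        pick = rng.choices(keys, weights=[remaining[k] for k in keys], k=1)[0]
        chosen.append(pick)
        del remaining[pick]  # a picked theme cannot be drawn again
    return chosen

# Hypothetical root-node distribution over top categories.
root = {"Music": 0.6, "Sports": 0.3, "Pets": 0.1}
constraints = sample_without_repetition(root, 2, rng=random.Random(0))
```

Each sampled theme would then restrict one of the N query vectors, so less dominant interests still get a chance to produce a ranked list.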
QUESTION RECOMMENDATION
Recommendation Merging
Blending algorithm:
1. Associate each list with a probability score
2. Sample an intermediate list, based on the assigned probabilities
3. Remove one recommendation from the sampled list and add it at the end of the final list
4. Repeat
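
The four blending steps can be sketched as below. Taking the top item of the sampled list and skipping questions already in the final list are assumptions; the slides only say that one recommendation is removed from the sampled list.

```python
import random

def blend(ranked_lists, probabilities, rng=None):
    """Merge ranked lists by repeatedly sampling a list and popping its top item."""
    rng = rng or random.Random()
    queues = [list(lst) for lst in ranked_lists]
    final, seen = [], set()
    while any(queues):
        # Only lists that still hold recommendations can be sampled.
        live = [i for i, q in enumerate(queues) if q]
        i = rng.choices(live, weights=[probabilities[j] for j in live], k=1)[0]
        item = queues[i].pop(0)
        if item not in seen:  # skipping duplicates is an assumption
            seen.add(item)
            final.append(item)
    return final

# Two hypothetical ranked lists with list-level probabilities 0.7 and 0.3.
merged = blend([["q1", "q2", "q3"], ["q4", "q2"]], [0.7, 0.3], rng=random.Random(1))
```

The relative order inside each source list is preserved, while the list probabilities control how often each source contributes the next slot.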
QUESTION RECOMMENDATION
Non-Thematic LDA Topics
116 topics, 23 top categories
34% non-thematic topics
A logistic regression classifier
EXPERIMENTS
Offline Experiment
8 different top categories
Active users: at least 21 questions as of January 2011
New users: at least two questions as of January 2011
EXPERIMENTS
Online Experiment
A/B test
Control bucket, CTL (n = 25093)
Relevance bucket, R (n = 5359)
Freshness bucket, F (n = 46228): 50% recent; 20% thematic sampling
Diversity bucket, D (n = 42041): 20% recent; 50% thematic sampling
CONCLUSIONS
Recommendation is driven by relevance, but also by freshness and diversity
Several relevance models
“Question retrieval engine”
Diversity: thematic sampling
Content: different factors/models/levels

Writing: clearly layered, progressive structure
