Machine Learning for Q&A Sites:
The Quora Example
Xavier Amatriain (@xamat)
04/11/2016
“To share and grow the world’s
knowledge”
• Millions of questions & answers
• Millions of users
• Thousands of topics
• ...
Demand
Quality
Relevance
Data
Machine Learning
Applications for Q&A
Sites
Answer Ranking
Goal
• Given a question and n
answers, come up with the
ideal ranking of those n
answers
What is a good Quora answer?
• truthful
• reusable
• provides explanation
• well formatted
• ...
How are those dimensions translated
into features?
• Features that relate to the text
quality itself
• Interaction features
(upvotes/downvotes, clicks,
comments…)
• User features (e.g. expertise in topic)
Feed Ranking
• Goal: Present most interesting stories for
a user at a given time
• Interesting = topical relevance +
social relevance + timeliness
• Stories = questions + answers
• ML: Personalized learning-to-rank approach
• Relevance-ordered vs time-ordered = big
gains in engagement
• Challenges:
• potentially many candidate stories
• real-time ranking
• optimize for relevance
Feed dataset: impression logs
(Figure: logged user actions per impression, e.g. click, upvote, downvote, expand, share, answer, pass, follow)
● Value of showing a story to a user, e.g. weighted sum of actions:
$v = \sum_a v_a \, \mathbb{1}\{y_a = 1\}$
● Goal: predict this value for new stories. 2 possible approaches:
○ predict value directly
v_pred = f(x)
■ pros: single regression model
■ cons: can be ambiguous, coupled
○ predict probabilities for each action, then compute expected value:
$v_{pred} = E[V \mid x] = \sum_a v_a \, p(a \mid x)$
■ pros: better use of supervised signal, decouples action models from action values
■ cons: more costly, one classifier per action
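The two quantities above can be sketched in a few lines of Python. This is only an illustration: the action names and the weights in `ACTION_VALUES` are hypothetical (the slides give no actual values), and the per-action probabilities are assumed to come from one classifier per action.

```python
from typing import Dict

# Hypothetical action values v_a; the slides do not give actual weights.
ACTION_VALUES: Dict[str, float] = {
    "click": 1.0,
    "upvote": 5.0,
    "share": 10.0,
    "downvote": -8.0,
}


def observed_value(actions: Dict[str, bool]) -> float:
    """v = sum_a v_a * 1{y_a = 1} for one logged impression."""
    return sum(ACTION_VALUES[a] for a, happened in actions.items() if happened)


def expected_value(action_probs: Dict[str, float]) -> float:
    """v_pred = sum_a v_a * p(a | x), given per-action probabilities."""
    return sum(ACTION_VALUES[a] * p for a, p in action_probs.items())


# One logged impression: the user clicked and upvoted.
logged = {"click": True, "upvote": True, "share": False, "downvote": False}
print(observed_value(logged))  # 6.0

# Assumed outputs of four per-action classifiers for a new story.
probs = {"click": 0.5, "upvote": 0.1, "share": 0.0, "downvote": 0.0}
print(expected_value(probs))
```

Keeping the weights in a separate table is what the "decouples action models from action values" point refers to: the weights can change without retraining the classifiers.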
What is relevance?
● Essential for getting good rankings
● Better if updated in real-time (more reactive)
● Main sets of features:
○ user (e.g. age, country, recent activity)
○ story (e.g. popularity, trendiness, quality)
○ interactions between the two (e.g. topic or author affinity)
Feature engineering
● Linear
○ simple, fast to train
○ manual, non-linear transforms for richer
representation (buckets, ngrams)
● Decision trees
○ learn non-linear representations
● Tree ensembles
○ Random forests
○ Gradient boosted decision trees
● In-house C++ training code, third-party
libraries for prototyping new models
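As an illustration of the manual, non-linear transforms a linear model needs, the sketch below bucketizes a numeric feature into a one-hot vector. The feature (an upvote count) and the bucket boundaries are assumptions made for the example, not values from the talk.

```python
from bisect import bisect_right
from typing import List, Sequence


def bucketize(value: float, boundaries: Sequence[float]) -> List[int]:
    """One-hot encode `value` by the bucket it falls in.

    `boundaries` must be sorted; n boundaries give n + 1 buckets.
    """
    one_hot = [0] * (len(boundaries) + 1)
    one_hot[bisect_right(boundaries, value)] = 1
    return one_hot


# Hypothetical feature: number of upvotes, with made-up bucket edges.
UPVOTE_BOUNDARIES = [1, 10, 100]

print(bucketize(0, UPVOTE_BOUNDARIES))     # [1, 0, 0, 0]
print(bucketize(10, UPVOTE_BOUNDARIES))    # [0, 0, 1, 0]
print(bucketize(5000, UPVOTE_BOUNDARIES))  # [0, 0, 0, 1]
```

With one weight per bucket, a linear model can fit a step-wise, non-linear response to the raw count. Tree-based models learn such splits on their own.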
Models
Ask2Answer
● Given a question and a viewer, rank all
other users based on how “well-suited”
they are.
○ “Well-suited” = likelihood of viewer sending a
request + likelihood of the candidate adding a
good answer.
● A2A = extension of CTR-prediction
○ We care not only about the viewer’s probability
of sending a request, but also about the
recipient’s probability of writing a good answer
A2A
● Example labels:
○ Binary label: 0 if no request was sent or no
answer was added and 1 if a request was sent
and yielded an answer with a goodness score
above some threshold.
○ Continuous label:
$w_1 \cdot \text{had\_request} + w_2 \cdot \text{had\_answer} + w_3 \cdot \text{answer\_goodness} + \cdots$
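The two example labels can be sketched as follows. The weights, the threshold, and the assumption that the goodness score lies in [0, 1] are hypothetical. The continuous label here includes only the three terms that the slide names. The slide also does not say how to label a request that yields an answer below the threshold; this sketch treats that case as 0.

```python
from dataclasses import dataclass


@dataclass
class A2AOutcome:
    had_request: bool
    had_answer: bool
    answer_goodness: float  # assumed to lie in [0, 1]


def binary_label(o: A2AOutcome, threshold: float = 0.5) -> int:
    """1 only if a request was sent and yielded a sufficiently good answer."""
    return int(o.had_request and o.had_answer and o.answer_goodness > threshold)


def continuous_label(o: A2AOutcome, w1: float = 0.2, w2: float = 0.3,
                     w3: float = 0.5) -> float:
    """w1*had_request + w2*had_answer + w3*answer_goodness (weights made up)."""
    return w1 * o.had_request + w2 * o.had_answer + w3 * o.answer_goodness


good = A2AOutcome(had_request=True, had_answer=True, answer_goodness=0.8)
ignored = A2AOutcome(had_request=True, had_answer=False, answer_goodness=0.0)

print(binary_label(good), binary_label(ignored))  # 1 0
print(continuous_label(good), continuous_label(ignored))
```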
A2A
● Features
○ Based on what the viewer or candidate has
done in the past.
○ Historical features that encapsulate the
relationship of the viewer to the candidate.
○ In addition to historical features, other features
can be devised (e.g. a binary feature saying
whether the viewer follows the candidate)
● Many more features are possible.
Feature engineering is a crucial
component of any ML system.
A2A
Topics & Users
Recommendations
Goal: Recommend new topics for the
user to follow
● Based on
○ Other topics followed
○ Users followed
○ User interactions
○ Topic-related features
○ ...
Goal: Recommend new users to follow
● Based on:
○ Other users followed
○ Topics followed
○ User interactions
○ User-related features
○ ...
Related Questions
● Given interest in question A (source) what other
questions will be interesting?
● Not only about similarity, but also “interestingness”
● Features such as:
○ Textual
○ Co-visit
○ Topics
○ …
● Important for logged-out use case
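A co-visit feature can be computed from browsing sessions. The sketch below is a minimal illustration that assumes a session is simply the list of question IDs one visitor viewed. The session data and function names are hypothetical, and it covers only the co-visit signal, not textual or topic features or any notion of "interestingness".

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List, Tuple


def covisit_counts(sessions: Iterable[List[str]]) -> Counter:
    """Count how often each unordered pair of questions shares a session."""
    counts: Counter = Counter()
    for session in sessions:
        # Deduplicate and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(session)), 2):
            counts[(a, b)] += 1
    return counts


def related(source: str, counts: Counter, k: int = 3) -> List[Tuple[str, int]]:
    """Top-k questions most often co-visited with `source`."""
    scores: Counter = Counter()
    for (a, b), n in counts.items():
        if a == source:
            scores[b] += n
        elif b == source:
            scores[a] += n
    return scores.most_common(k)


sessions = [["q1", "q2", "q3"], ["q1", "q2"], ["q2", "q4"]]
counts = covisit_counts(sessions)
print(related("q1", counts))  # [('q2', 2), ('q3', 1)]
```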
Duplicate Questions
● Important issue for Q&A Sites
○ Want to make sure we don’t disperse
knowledge across duplicates of the same question
● Solution: binary classifier trained with
labelled data
● Features
○ Textual vector space models
○ Usage-based features
○ ...
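One simple textual vector space feature is the cosine similarity between bag-of-words vectors of two questions. The sketch below illustrates that idea only: the tokenizer and the example questions are assumptions, and in the setup described above such a score would be one input to the binary classifier, not the classifier itself.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


q1 = bag_of_words("How do I learn machine learning?")
q2 = bag_of_words("How can I learn machine learning quickly?")
q3 = bag_of_words("What is the capital of France?")

# The near-duplicate pair scores higher than the unrelated pair.
print(cosine_similarity(q1, q2) > cosine_similarity(q1, q3))  # True
```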
User Trust
Goal: Infer user’s trustworthiness in relation
to a given topic
● We take into account:
○ Answers written on topic
○ Upvotes/downvotes received
○ Endorsements
○ ...
● Trust/expertise propagates through the network
● Must be taken into account by other algorithms
Trending Topics
Goal: Highlight current events that are interesting
for the user
● We take into account:
○ Global “Trendiness”
○ Social “Trendiness”
○ User’s interest
○ ...
● Trending topics are a great discovery mechanism
Moderation
● Very important for Quora to keep quality of content
● Pure manual approaches do not scale
● Hard to get algorithms 100% right
● ML algorithms detect content/user issues
○ Output of the algorithms feeds manually
curated moderation queues
Content Creation
Prediction
● Quora’s algorithms do not only optimize for
probability of reading
● Important to predict probability of a user
answering a question
● Parts of our system completely rely on
that prediction
○ E.g. A2A (ask to answer) suggestions
Models
● Logistic Regression
● Elastic Nets
● Gradient Boosted Decision
Trees
● Random Forests
● (Deep) Neural Networks
● LambdaMART
● Matrix Factorization
● LDA
● ...
Experimentation
⚫ Extensive A/B testing, data-driven decision-making
⚫ Separate, orthogonal “layers” for different parts of
the system
⚫ Experiment framework showing comparisons for
various metrics
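The slides do not say how the orthogonal layers are implemented. One common way to get independent assignments per layer, shown here purely as an assumption, is to hash the user ID together with a layer-specific salt. The layer names and the 50/50 split below are hypothetical.

```python
import hashlib


def bucket(user_id: str, layer: str, num_buckets: int = 100) -> int:
    """Deterministically map a user to a bucket, salted by the layer name."""
    digest = hashlib.sha256(f"{layer}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


def variant(user_id: str, layer: str, treatment_share: float = 0.5) -> str:
    """Assign 'treatment' to roughly `treatment_share` of users in a layer."""
    cutoff = treatment_share * 100
    return "treatment" if bucket(user_id, layer) < cutoff else "control"


# The same user gets a stable assignment within each layer, and the
# different salts make assignments across layers effectively independent.
for layer in ("feed_ranking", "answer_ranking"):
    print(layer, variant("user_42", layer))
```

Because each layer uses its own salt, an experiment on one part of the system does not systematically line up with an experiment on another part.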
Conclusions
• Q&A sites have not only Big, but also “rich” data
• Algorithms need to understand and optimize complex
aspects such as quality, interestingness, or user
expertise
• ML is one of the keys to success
• Many interesting problems, and many unsolved
challenges
Questions?
Xavier Amatriain
 
10 more lessons learned from building Machine Learning systems
10 more lessons learned from building Machine Learning systems10 more lessons learned from building Machine Learning systems
10 more lessons learned from building Machine Learning systems
Xavier Amatriain
 
MMDS 2014 Talk - Distributing ML Algorithms: from GPUs to the Cloud
MMDS 2014 Talk - Distributing ML Algorithms: from GPUs to the CloudMMDS 2014 Talk - Distributing ML Algorithms: from GPUs to the Cloud
MMDS 2014 Talk - Distributing ML Algorithms: from GPUs to the Cloud
Xavier Amatriain
 
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix ScaleQcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
Xavier Amatriain
 

More from Xavier Amatriain (15)

Data/AI driven product development: from video streaming to telehealth
Data/AI driven product development: from video streaming to telehealthData/AI driven product development: from video streaming to telehealth
Data/AI driven product development: from video streaming to telehealth
 
AI-driven product innovation: from Recommender Systems to COVID-19
AI-driven product innovation: from Recommender Systems to COVID-19AI-driven product innovation: from Recommender Systems to COVID-19
AI-driven product innovation: from Recommender Systems to COVID-19
 
AI for COVID-19 - Q42020 update
AI for COVID-19 - Q42020 updateAI for COVID-19 - Q42020 update
AI for COVID-19 - Q42020 update
 
AI for COVID-19: An online virtual care approach
AI for COVID-19: An online virtual care approachAI for COVID-19: An online virtual care approach
AI for COVID-19: An online virtual care approach
 
Lessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systemsLessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systems
 
AI for healthcare: Scaling Access and Quality of Care for Everyone
AI for healthcare: Scaling Access and Quality of Care for EveryoneAI for healthcare: Scaling Access and Quality of Care for Everyone
AI for healthcare: Scaling Access and Quality of Care for Everyone
 
Towards online universal quality healthcare through AI
Towards online universal quality healthcare through AITowards online universal quality healthcare through AI
Towards online universal quality healthcare through AI
 
From one to zero: Going smaller as a growth strategy
From one to zero: Going smaller as a growth strategyFrom one to zero: Going smaller as a growth strategy
From one to zero: Going smaller as a growth strategy
 
Learning to speak medicine
Learning to speak medicineLearning to speak medicine
Learning to speak medicine
 
ML to cure the world
ML to cure the worldML to cure the world
ML to cure the world
 
Medical advice as a Recommender System
Medical advice as a Recommender SystemMedical advice as a Recommender System
Medical advice as a Recommender System
 
10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf
 
10 more lessons learned from building Machine Learning systems
10 more lessons learned from building Machine Learning systems10 more lessons learned from building Machine Learning systems
10 more lessons learned from building Machine Learning systems
 
MMDS 2014 Talk - Distributing ML Algorithms: from GPUs to the Cloud
MMDS 2014 Talk - Distributing ML Algorithms: from GPUs to the CloudMMDS 2014 Talk - Distributing ML Algorithms: from GPUs to the Cloud
MMDS 2014 Talk - Distributing ML Algorithms: from GPUs to the Cloud
 
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix ScaleQcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
Qcon SF 2013 - Machine Learning & Recommender Systems @ Netflix Scale
 

Recently uploaded

Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
SOFTTECHHUB
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
IndexBug
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
Edge AI and Vision Alliance
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
Claudio Di Ciccio
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
Zilliz
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
Tomaz Bratanic
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 

Recently uploaded (20)

Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 

Machine Learning for Q&A Sites: The Quora Example

  • 1. Machine Learning for Q&A Sites: The Quora Example Xavier Amatriain (@xamat) 04/11/2016
  • 2. “To share and grow the world’s knowledge” • Millions of questions & answers • Millions of users • Thousands of topics • ...
  • 12. Goal • Given a question and n answers, come up with the ideal ranking of those n answers
  • 13. What is a good Quora answer? • truthful • reusable • provides explanation • well formatted • ...
  • 14. How are those dimensions translated into features? • Features that relate to the text quality itself • Interaction features (upvotes/downvotes, clicks, comments…) • User features (e.g. expertise in topic)
  • 16. • Goal: Present most interesting stories for a user at a given time • Interesting = topical relevance + social relevance + timeliness • Stories = questions + answers • ML: Personalized learning-to-rank approach • Relevance-ordered vs time-ordered = big gains in engagement • Challenges: • potentially many candidate stories • real-time ranking • optimize for relevance
  • 17. Feed dataset: impression logs (logged actions shown: click, upvote, downvote, expand, share, answer, pass, follow)
  • 18. What is relevance? ● Value of showing a story to a user, e.g. weighted sum of actions: $v = \sum_a v_a \, \mathbb{1}\{y_a = 1\}$ ● Goal: predict this value for new stories. 2 possible approaches: ○ predict value directly: $v_{\text{pred}} = f(x)$ ■ pros: single regression model ■ cons: can be ambiguous, coupled ○ predict probabilities for each action, then compute expected value: $v_{\text{pred}} = E[V \mid x] = \sum_a v_a \, p(a \mid x)$ ■ pros: better use of supervised signal, decouples action models from action values ■ cons: more costly, one classifier per action
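The second approach on this slide (per-action probabilities combined into an expected value) can be sketched in a few lines. The action names, the weights in `ACTION_VALUES`, and the probabilities are made-up illustrations, not values from the talk; in a real system each probability would come from a per-action classifier.

```python
# Hypothetical action values v_a (illustrative numbers, not from the talk).
ACTION_VALUES = {"click": 1.0, "upvote": 5.0, "share": 10.0, "downvote": -8.0}


def expected_value(action_probs, action_values=ACTION_VALUES):
    """Return v_pred = sum_a v_a * p(a | x) for one (user, story) pair."""
    return sum(action_values[a] * p for a, p in action_probs.items())


def rank_stories(stories):
    """Sort (story_id, action_probs) pairs by expected value, best first."""
    return sorted(stories, key=lambda s: expected_value(s[1]), reverse=True)


# Assumed classifier outputs p(a | x) for two candidate stories.
stories = [
    ("s1", {"click": 0.30, "upvote": 0.02, "share": 0.00, "downvote": 0.01}),
    ("s2", {"click": 0.10, "upvote": 0.10, "share": 0.01, "downvote": 0.00}),
]
ranked_ids = [story_id for story_id, _ in rank_stories(stories)]
```

Because the weights live outside the classifiers, changing how much an upvote is worth only requires editing `ACTION_VALUES`, not retraining any model. This is the decoupling the slide lists as an advantage.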
  • 19. Feature engineering ● Essential for getting good rankings ● Better if updated in real-time (more reactive) ● Main sets of features: ○ user (e.g. age, country, recent activity) ○ story (e.g. popularity, trendiness, quality) ○ interactions between the two (e.g. topic or author affinity)
  • 20. Models ● Linear ○ simple, fast to train ○ manual, non-linear transforms for richer representation (buckets, ngrams) ● Decision trees ○ learn non-linear representations ● Tree ensembles ○ Random forests ○ Gradient boosted decision trees ● In-house C++ training code, third-party libraries for prototyping new models
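As an illustration of the "buckets" transform mentioned for linear models, the sketch below one-hot encodes a numeric feature by bucket, so that a linear model can learn a separate weight per range. The function name `bucketize` and the boundaries are assumptions chosen for the example, not details from the talk.

```python
import bisect


def bucketize(value, boundaries):
    """One-hot encode `value` by bucket.

    `boundaries` must be sorted ascending; n boundaries define n + 1 buckets,
    and a value equal to a boundary falls into the bucket to its right.
    """
    index = bisect.bisect_right(boundaries, value)
    one_hot = [0] * (len(boundaries) + 1)
    one_hot[index] = 1
    return one_hot


# Hypothetical feature: number of upvotes, bucketed on a rough log scale.
upvote_boundaries = [1, 10, 100]
encoded = bucketize(5, upvote_boundaries)
```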
  • 22. A2A ● Given a question and a viewer, rank all other users based on how “well-suited” they are. ○ “Well-suited” = likelihood of viewer sending a request + likelihood of the candidate adding a good answer. ● A2A = extension of CTR-prediction ○ Not only care about the viewer’s probability of sending a request, but also the recipient’s probability of writing a good answer
  • 23. A2A ● Example labels: ○ Binary label: 0 if no request was sent or no answer was added and 1 if a request was sent and yielded an answer with a goodness score above some threshold. ○ Continuous label: $w_1 \cdot \text{had\_request} + w_2 \cdot \text{had\_answer} + w_3 \cdot \text{answer\_goodness} + \cdots$
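The two example labels on this slide could be computed along these lines. The threshold and the weights `w1`, `w2`, `w3` are placeholder values picked for illustration; the talk does not give concrete numbers, and the continuous label on the slide has further terms that are left out here.

```python
def binary_label(had_request, had_answer, answer_goodness, threshold=0.5):
    """Return 1 only if a request was sent and it yielded an answer whose
    goodness score is above the (assumed) threshold, else 0."""
    return int(bool(had_request) and bool(had_answer) and answer_goodness > threshold)


def continuous_label(had_request, had_answer, answer_goodness,
                     w1=0.2, w2=0.3, w3=0.5):
    """Weighted sum of the first three terms of the continuous label.
    The weights are illustrative assumptions."""
    return w1 * float(had_request) + w2 * float(had_answer) + w3 * answer_goodness


# A request that produced a high-quality answer.
good_case = binary_label(True, True, 0.9)
# A request that was sent but never answered.
unanswered_case = continuous_label(True, False, 0.0)
```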
  • 24. A2A ● Features ○ Based on what the viewer or candidate has done in the past. ○ Historical features that encapsulate the relationship of the viewer to the candidate. ○ In addition to historical features, other features can be devised (e.g. a binary feature saying whether the viewer follows the candidate) ● Many more features are possible. Feature engineering is a crucial component of any ML system.
  • 26. Goal: Recommend new topics for the user to follow ● Based on ○ Other topics followed ○ Users followed ○ User interactions ○ Topic-related features ○ ...
  • 27. Goal: Recommend new users to follow ● Based on: ○ Other users followed ○ Topics followed ○ User interactions ○ User-related features ○ ...
  • 29. ● Given interest in question A (source) what other questions will be interesting? ● Not only about similarity, but also “interestingness” ● Features such as: ○ Textual ○ Co-visit ○ Topics ○ … ● Important for logged-out use case
  • 31. ● Important issue for Q&A Sites ○ Want to avoid scattering knowledge across several copies of the same question ● Solution: binary classifier trained with labelled data ● Features ○ Textual vector space models ○ Usage-based features ○ ...
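One simple instance of a "textual vector space" feature is the cosine similarity between bag-of-words vectors of two questions. The sketch below is only an assumed example of such a feature (the talk does not say which vector space model is used); its output would be one input among several to the binary duplicate classifier.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = (math.sqrt(sum(c * c for c in a.values()))
            * math.sqrt(sum(c * c for c in b.values())))
    return dot / norm if norm else 0.0


def textual_feature(question_1, question_2):
    """Similarity in [0, 1] between two question texts."""
    return cosine_similarity(bag_of_words(question_1), bag_of_words(question_2))


similarity = textual_feature("How do I learn Python?",
                             "How can I learn Python fast?")
```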
  • 33. Goal: Infer user’s trustworthiness in relation to a given topic ● We take into account: ○ Answers written on topic ○ Upvotes/downvotes received ○ Endorsements ○ ... ● Trust/expertise propagates through the network ● Must be taken into account by other algorithms
  • 35. Goal: Highlight current events that are interesting for the user ● We take into account: ○ Global “Trendiness” ○ Social “Trendiness” ○ User’s interest ○ ... ● Trending topics are a great discovery mechanism
  • 37. ● Very important for Quora to keep quality of content ● Pure manual approaches do not scale ● Hard to get algorithms 100% right ● ML algorithms detect content/user issues ○ Output of the algorithms feeds manually curated moderation queues
  • 39. ● Quora’s algorithms not only optimize for probability of reading ● Important to predict probability of a user answering a question ● Parts of our system completely rely on that prediction ○ E.g. A2A (ask to answer) suggestions
  • 41. ● Logistic Regression ● Elastic Nets ● Gradient Boosted Decision Trees ● Random Forests ● (Deep) Neural Networks ● LambdaMART ● Matrix Factorization ● LDA ● ...
  • 43. ⚫ Extensive A/B testing, data-driven decision-making ⚫ Separate, orthogonal “layers” for different parts of the system ⚫ Experiment framework showing comparisons for various metrics
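A common way to get separate, orthogonal experiment layers is to hash the user ID together with a per-layer name, so that a user's assignment in one layer is statistically independent of their assignment in another. The sketch below shows that general technique under assumed names (`assign_variant`, the layer names, the variant lists); the talk does not describe how Quora's framework is implemented.

```python
import hashlib


def assign_variant(user_id, layer, variants):
    """Deterministically map a user to one variant within a layer.

    Including the layer name in the hashed key means the same user gets
    unrelated bucket assignments in different layers.
    """
    key = f"{layer}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return variants[int(digest, 16) % len(variants)]


# Hypothetical layers for two independent parts of the system.
feed_variant = assign_variant(42, "feed_ranking", ["control", "treatment"])
a2a_variant = assign_variant(42, "a2a_suggestions", ["control", "treatment"])
```

The assignment is stateless: the same user always lands in the same variant of a given layer, with no lookup table to store.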
  • 45. • Q&A sites have not only Big, but also “rich” data • Algorithms need to understand and optimize complex aspects such as quality, interestingness, or user expertise • ML is one of the keys to success • Many interesting problems, and many unsolved challenges