Machine Learning for Q&A Sites:
The Quora Example
Xavier Amatriain (@xamat)
04/11/2016
“To share and grow the world’s
knowledge”
• Millions of questions & answers
• Millions of users
• Thousands of topics
• ...
Demand · Quality · Relevance · Data
Machine Learning Applications for Q&A Sites
Answer Ranking
Goal
• Given a question and n
answers, come up with the
ideal ranking of those n
answers
What is a good Quora answer?
• truthful
• reusable
• provides explanation
• well formatted
• ...
How are those dimensions translated
into features?
• Features that relate to the text
quality itself
• Interaction features
(upvotes/downvotes, clicks,
comments…)
• User features (e.g. expertise in topic)
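As an illustration of how these three feature families might feed a ranker, here is a toy pointwise scorer; the feature names and weights are invented for the example, not Quora's actual model:

```python
# Pointwise answer scoring: combine text-quality, interaction, and user
# features into a single score, then rank answers by it (illustrative only).

def score_answer(features, weights):
    """Weighted sum of feature values; a real system learns the weights."""
    return sum(weights[name] * value for name, value in features.items())

def rank_answers(answers, weights):
    """Return answers sorted from best to worst by predicted score."""
    return sorted(answers,
                  key=lambda a: score_answer(a["features"], weights),
                  reverse=True)

# Hypothetical feature weights.
WEIGHTS = {"text_quality": 2.0, "upvotes": 1.0, "author_topic_expertise": 1.5}

answers = [
    {"id": "a1", "features": {"text_quality": 0.4, "upvotes": 0.9,
                              "author_topic_expertise": 0.1}},
    {"id": "a2", "features": {"text_quality": 0.9, "upvotes": 0.5,
                              "author_topic_expertise": 0.8}},
]
ranked = rank_answers(answers, WEIGHTS)  # a2 outranks a1
```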
Feed Ranking
• Goal: Present most interesting stories for
a user at a given time
• Interesting = topical relevance +
social relevance + timeliness
• Stories = questions + answers
• ML: Personalized learning-to-rank approach
• Relevance-ordered vs time-ordered = big
gains in engagement
• Challenges:
• potentially many candidate stories
• real-time ranking
• optimize for relevance
Feed dataset: impression logs
● Actions logged per impression, e.g.: click, upvote, downvote, expand, share, answer, pass, follow
● Value of showing a story to a user, e.g. weighted sum of actions:
v = ∑_a v_a · 1{y_a = 1}
● Goal: predict this value for new stories. 2 possible approaches:
○ predict value directly
v_pred = f(x)
■ pros: single regression model
■ cons: can be ambiguous, coupled
○ predict probabilities for each action, then compute expected value:
v_pred = E[V | x] = ∑_a v_a · p(a | x)
■ pros: better use of supervised signal, decouples action models from action values
■ cons: more costly, one classifier per action
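The second approach can be sketched in a few lines; the action values below are made-up placeholders, not Quora's weights:

```python
# Expected-value scoring of a story: v_pred = E[V | x] = sum_a v_a * p(a | x),
# where one classifier per action produces p(a | x). Values are illustrative.

ACTION_VALUES = {"click": 1.0, "upvote": 3.0, "share": 5.0, "downvote": -5.0}

def expected_value(action_probs, action_values=ACTION_VALUES):
    """Expected value of showing a story, given p(a | x) for each action."""
    return sum(action_values[a] * p for a, p in action_probs.items())

# Per-action classifiers would produce probabilities like these:
probs = {"click": 0.30, "upvote": 0.10, "share": 0.02, "downvote": 0.01}
v_pred = expected_value(probs)  # 0.30*1 + 0.10*3 + 0.02*5 - 0.01*5 ≈ 0.65
```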
What is relevance?
● Essential for getting good rankings
● Better if updated in real-time (more reactive)
● Main sets of features:
○ user (e.g. age, country, recent activity)
○ story (e.g. popularity, trendiness, quality)
○ interactions between the two (e.g. topic or author affinity)
Feature engineering
● Linear
○ simple, fast to train
○ manual, non-linear transforms for richer
representation (buckets, ngrams)
● Decision trees
○ learn non-linear representations
● Tree ensembles
○ Random forests
○ Gradient boosted decision trees
● In-house C++ training code, third-party
libraries for prototyping new models
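The "buckets" transform mentioned above — a manual non-linear feature for a linear model — can be sketched as follows (the bucket boundaries are illustrative):

```python
# Bucketize a continuous feature into one-hot indicator features, so a
# linear model can learn a different weight per value range.

def bucketize(value, boundaries):
    """Return a one-hot vector marking which bucket `value` falls into."""
    one_hot = [0] * (len(boundaries) + 1)
    idx = sum(1 for b in boundaries if value >= b)
    one_hot[idx] = 1
    return one_hot

# e.g. a user's recent-activity count split at 1, 10, 100:
bucketize(0, [1, 10, 100])   # [1, 0, 0, 0]
bucketize(42, [1, 10, 100])  # [0, 0, 1, 0]
```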
Models
Ask2Answer
● Given a question and a viewer, rank all
other users based on how “well-suited”
they are.
○ “Well-suited” = likelihood of viewer sending a
request + likelihood of the candidate adding a
good answer.
● A2A = extension of CTR-prediction
○ We care not only about the viewer’s probability
of sending a request, but also the recipient’s
probability of writing a good answer
A2A
● Example labels:
○ Binary label: 0 if no request was sent or no
answer was added and 1 if a request was sent
and yielded an answer with a goodness score
above some threshold.
○ Continuous label:
w1·had_request + w2·had_answer + w3·answer_goodness + ⋯
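Both label variants can be sketched directly from these definitions; the weights w1..w3 and the goodness threshold are placeholder values:

```python
# A2A training labels, following the slide's two definitions.
# Threshold and weights are illustrative placeholders.

GOODNESS_THRESHOLD = 0.5

def binary_label(had_request, had_answer, answer_goodness):
    """1 iff a request was sent AND yielded a good-enough answer."""
    return int(had_request and had_answer
               and answer_goodness > GOODNESS_THRESHOLD)

def continuous_label(had_request, had_answer, answer_goodness,
                     w1=1.0, w2=2.0, w3=3.0):
    """w1*had_request + w2*had_answer + w3*answer_goodness + ..."""
    return w1 * had_request + w2 * had_answer + w3 * answer_goodness

binary_label(True, True, 0.8)   # 1
binary_label(True, False, 0.0)  # 0
continuous_label(1, 1, 0.8)     # 1.0 + 2.0 + 2.4 ≈ 5.4
```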
A2A
● Features
○ Based on what the viewer or candidate has
done in the past.
○ Historical features that encapsulate the
relationship of the viewer to the candidate.
○ In addition to historical features, other features
can be devised (e.g. a binary feature saying
whether the viewer follows the candidate)
● Many more features are possible.
Feature engineering is a crucial
component of any ML system.
A2A
Topics & Users
Recommendations
Goal: Recommend new topics for the
user to follow
● Based on
○ Other topics followed
○ Users followed
○ User interactions
○ Topic-related features
○ ...
Goal: Recommend new users to follow
● Based on:
○ Other users followed
○ Topics followed
○ User interactions
○ User-related features
○ ...
Related Questions
● Given interest in question A (source), what other
questions will be interesting?
● Not only about similarity, but also “interestingness”
● Features such as:
○ Textual
○ Co-visit
○ Topics
○ …
● Important for logged-out use case
Duplicate Questions
● Important issue for Q&A Sites
○ Want to make sure we don’t disperse
knowledge across duplicates of the same question
● Solution: binary classifier trained with
labelled data
● Features
○ Textual vector space models
○ Usage-based features
○ ...
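A minimal sketch of the textual vector-space side, assuming a plain bag-of-words cosine similarity; a real classifier would combine this with usage-based features:

```python
# Cosine similarity between bag-of-words vectors of two question titles,
# one candidate "textual vector space" feature for duplicate detection.
from collections import Counter
import math

def cosine_similarity(text_a, text_b):
    va, vb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

q1 = "How do I learn machine learning?"
q2 = "How do I learn machine learning fast?"
q3 = "What is the capital of France?"
cosine_similarity(q1, q2) > cosine_similarity(q1, q3)  # True
```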
User Trust
Goal: Infer user’s trustworthiness in relation
to a given topic
● We take into account:
○ Answers written on topic
○ Upvotes/downvotes received
○ Endorsements
○ ...
● Trust/expertise propagates through the network
● Must be taken into account by other algorithms
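One way trust propagation through the network could look, sketched as a damped, PageRank-style iteration; the damping factor and the toy graph are invented for the example:

```python
# Trust propagation sketch: a user's topic trust mixes their own signal
# with the trust of users who endorse them, iterated to a fixed point.

def propagate_trust(base_trust, endorsers, damping=0.5, iterations=20):
    """base_trust: user -> own-signal score.
    endorsers: user -> list of users endorsing them."""
    trust = dict(base_trust)
    for _ in range(iterations):
        trust = {
            u: (1 - damping) * base_trust[u]
               + damping * sum(trust[e] for e in endorsers.get(u, []))
               / max(len(endorsers.get(u, [])), 1)
            for u in base_trust
        }
    return trust

base = {"alice": 0.9, "bob": 0.2, "carol": 0.1}
endorse = {"carol": ["alice", "bob"]}  # carol endorsed by alice and bob
trust = propagate_trust(base, endorse)
# carol's trust rises above her own signal via trusted endorsers
```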
Trending Topics
Goal: Highlight current events that are interesting
for the user
● We take into account:
○ Global “Trendiness”
○ Social “Trendiness”
○ User’s interest
○ ...
● Trending topics are a great discovery mechanism
Moderation
● Very important for Quora to keep quality of content
● Purely manual approaches do not scale
● Hard to get algorithms 100% right
● ML algorithms detect content/user issues
○ Output of the algorithms feed manually
curated moderation queues
Content Creation
Prediction
● Quora’s algorithms optimize not only for the
probability of reading
● Important to predict probability of a user
answering a question
● Parts of our system completely rely on
that prediction
○ E.g. A2A (ask to answer) suggestions
Models
● Logistic Regression
● Elastic Nets
● Gradient Boosted Decision
Trees
● Random Forests
● (Deep) Neural Networks
● LambdaMART
● Matrix Factorization
● LDA
● ...
Experimentation
● Extensive A/B testing, data-driven decision-making
● Separate, orthogonal “layers” for different parts of
the system
● Experiment framework showing comparisons for
various metrics
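Orthogonal layers are commonly implemented by hashing users with a per-layer salt, so bucket assignments in one layer are independent of those in another; a minimal sketch (salt names and bucket counts are illustrative):

```python
# Hash-based experiment bucketing: each layer uses its own salt, so the
# same user can land in control in one layer and treatment in another.
import hashlib

def bucket(user_id, layer_salt, num_buckets=2):
    """Deterministic bucket for a user within one experiment layer."""
    digest = hashlib.md5(f"{layer_salt}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % num_buckets

# Assignments are stable per layer but independent across layers:
bucket("user_42", "feed_ranking_layer")
bucket("user_42", "a2a_layer")
```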
Conclusions
• Q&A sites have not only big, but also “rich” data
• Algorithms need to understand and optimize complex
aspects such as quality, interestingness, or user
expertise
• ML is one of the keys to success
• Many interesting problems, and many unsolved
challenges
Questions?