Scaling Recommendations at Quora (RecSys talk 9/16/2016)

Scaling Recommendations at Quora
Nikhil Dandekar
@nikhilbd
9/16/2016

Quora’s Mission
“To share and grow the world’s knowledge”
● Millions of questions & answers
● Millions of users
● Over a million topics
● Growing exponentially...

Lots of high-quality textual information

Agenda
● Scaling the home page feed
● Scaling the Machine Learning environment
● Pragmatism: aka don’t chase every new, shiny object

Recommendations at Quora
● Home feed
● Digest emails
● Topics to follow
● Users to follow
● Related Questions
● Related Topics (topic → topic)
● Trending topics
● …..

Home feed
● Goal: personalized, engaging experience for reading/writing
● Show a ranked list of stories (questions/answers)
● ML model predicts an interestingness score for each story
● Training data:
○ impression logs from the past
○ x: features about user/story/interactions
○ y: score based on actions (answer/follow, upvote/click)
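
A minimal sketch of how one (x, y) training example could be built from an impression log entry. The action weights, the feature-name prefixes, and the function names are assumptions made for illustration; the talk only says that y is a score based on actions such as answer/follow and upvote/click.

```python
# Hypothetical weights: the talk does not give actual values.
ACTION_WEIGHTS = {
    "answer": 4.0,
    "follow": 4.0,
    "upvote": 2.0,
    "click": 1.0,
}


def label_score(actions):
    """Return y for one impression: the summed weight of the distinct actions taken."""
    return sum(ACTION_WEIGHTS.get(action, 0.0) for action in set(actions))


def build_example(user_features, story_features, actions):
    """Return (x, y), where x merges user and story features under prefixed names."""
    x = {f"user.{name}": value for name, value in user_features.items()}
    x.update({f"story.{name}": value for name, value in story_features.items()})
    return x, label_score(actions)
```

An impression with no actions gets y = 0, so the same logs supply both positive and negative examples.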

What is interestingness?
[Figure: user actions on a feed story: click, upvote, downvote, expand, share, answer, pass, follow]

Performance and Cost
[Diagram: Millions of questions and answers → Personalized Ranking → The best 20 questions and answers (x millions of users)]
Scaling challenge:
● Content growing exponentially
○ Time spent per ranking request growing exponentially
● Users growing exponentially
○ Number of ranking requests growing exponentially
● Computational resources spent on ranking growing quadratically with respect to user growth
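
One way to read the quadratic claim, as a rough sketch: assume the time per ranking request T is proportional to the number of candidate stories N, that N grows in proportion to the number of users U, and that the number of requests R is also proportional to U. The symbols R, T, N and U are introduced here for illustration and do not appear in the talk.

```latex
\text{total cost} \;=\; R \cdot T, \qquad R \propto U, \quad T \propto N, \quad N \propto U
\;\;\Longrightarrow\;\; \text{total cost} \;\propto\; U \cdot U \;=\; U^{2}
```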

Performance and Cost
● Solution: Multi-phase ranking!
● Use an unpersonalized model to reduce the number of candidates for the personalized model
● Cache the computed score in storage
[Diagram: Millions of questions and answers → Unpersonalized ranking (1p) → Thousands of questions and answers → Personalized ranking (2p) → The best 20 questions and answers (x millions of users)]
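
The two-phase idea can be sketched as follows. This is an illustration under stated assumptions, not Quora's implementation: the class name, the scoring callables, and the in-memory dict standing in for "storage" are all hypothetical.

```python
import heapq


class TwoPhaseRanker:
    """Unpersonalized first phase (cached per story), personalized second phase."""

    def __init__(self, unpersonalized_score, personalized_score,
                 first_phase_size=1000, top_k=20):
        self.unpersonalized_score = unpersonalized_score  # story_id -> float
        self.personalized_score = personalized_score      # (user_id, story_id) -> float
        self.first_phase_size = first_phase_size
        self.top_k = top_k
        self._cache = {}  # story_id -> first-phase score, shared by all users

    def _first_phase_score(self, story_id):
        # The score does not depend on the user, so compute it once and reuse it.
        if story_id not in self._cache:
            self._cache[story_id] = self.unpersonalized_score(story_id)
        return self._cache[story_id]

    def rank(self, user_id, story_ids):
        # Phase 1: keep only the best candidates according to the cheap, cached score.
        candidates = heapq.nlargest(
            self.first_phase_size, story_ids, key=self._first_phase_score)
        # Phase 2: run the expensive personalized model on the survivors only.
        return heapq.nlargest(
            self.top_k, candidates,
            key=lambda story_id: self.personalized_score(user_id, story_id))
```

With this split, the personalized model is evaluated on at most `first_phase_size` stories per request, and the first-phase work is paid once per story rather than once per user.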

Feed backend system
[Diagram: Requests from Web (python) → Aggregators (Aggregator 1, Aggregator 2, Aggregator 3, ...) → Leaves (Leaf 1, Leaf 2, Leaf 3, ...); labels: user_id, object_id]

Scaling the Machine Learning Environment

Machine Learning environment
ML applications
● Feed / digest
● Search
● Answer ranking / Answer collapsing
● User-user, user-topic recommendations
● Related questions
● Duplicate questions
● Question-topics
● Question quality
● Spam users / content
● ...and a lot more
ML Models
● Logistic Regression
● Gradient Boosted Decision Trees
● LambdaMART
● Random Forest
● Matrix Factorization
● Deep Neural Networks
● LDA
● k-means
● k-NNs
● ...and others

● Productionizing ML training
○ Continuous retraining of models to adapt to new data
○ Use Luigi to keep track of task dependencies
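
To illustrate what tracking task dependencies buys during retraining, here is a small sketch that uses Python's standard-library `graphlib` rather than Luigi itself, so it shows the idea and not Luigi's API. The task names and the edges between them are assumptions chosen for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: task name -> set of tasks it depends on.
DEPENDENCIES = {
    "populate_data": set(),
    "train_model_1": {"populate_data"},
    "train_model_2": {"train_model_1"},
    "train_model_3": {"train_model_1"},
}


def retrain_all(dependencies, run_task):
    """Run every task once, each only after all of its dependencies; return the order."""
    order = list(TopologicalSorter(dependencies).static_order())
    for task in order:
        run_task(task)
    return order
```

A scheduler such as Luigi adds to this ordering the bookkeeping of which task outputs already exist, so that a retraining run only redoes the work that is missing.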

● Use Amazon EC2 spot instances for training tasks
○ Usually much cheaper than the on-demand price
○ Can spawn multiple boxes at once and shut them down after training is complete

● Extremely important to have automatic monitoring of each task’s input/output
○ Data can change in unexpected ways
○ Don’t want bugs in upstream models to affect downstream models
[Diagram: task pipeline with boxes Data populator, Training model 1, Training model 2, Training model 3, annotated with two checks: Verify data (counts, class proportions, ...) and Verify metrics (MSE, R2, AUC, ...)]
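
A minimal sketch of the two kinds of checks named above, assuming binary labels and metric thresholds that a team would choose for itself. All function names, parameters, and thresholds here are hypothetical.

```python
from collections import Counter


class VerificationError(Exception):
    """Raised to stop a pipeline before bad output reaches downstream tasks."""


def verify_data(labels, min_rows, expected_positive_rate, tolerance):
    """Check row count and class proportion; return a list of problems (empty if OK)."""
    problems = []
    if len(labels) < min_rows:
        problems.append(f"row count {len(labels)} is below {min_rows}")
    if labels:
        positive_rate = Counter(labels)[1] / len(labels)
        if abs(positive_rate - expected_positive_rate) > tolerance:
            problems.append(f"positive rate {positive_rate:.3f} is out of range")
    return problems


def verify_metrics(metrics, floors=None, ceilings=None):
    """Check higher-is-better metrics (AUC, R2) and lower-is-better ones (MSE)."""
    problems = []
    for name, floor in (floors or {}).items():
        if name not in metrics or metrics[name] < floor:
            problems.append(f"{name} is missing or below {floor}")
    for name, ceiling in (ceilings or {}).items():
        if name not in metrics or metrics[name] > ceiling:
            problems.append(f"{name} is missing or above {ceiling}")
    return problems


def gate(problems):
    """Raise if any check failed, so downstream training never sees the output."""
    if problems:
        raise VerificationError("; ".join(problems))
```

Calling `gate(verify_data(...))` after the data task and `gate(verify_metrics(...))` after each training task keeps an upstream failure from silently propagating.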

Machine Learning platform goals
● Need an ML platform that is
○ Easy to ramp up on
○ Easy to iterate on
○ Fast
○ Reliable
○ Reusable
○ Production-ready

Machine Learning platform
● Have a centralized ML platform that is shared across teams
○ Write training scripts in C++/Python and run them on remote boxes
○ Provide Python wrappers with IPython integration
○ Store data on Redshift/S3 and have training boxes communicate with them directly
[Diagram: Dev laptop, Storage services (Redshift, S3…), Training boxes (CPU/GPU)]

Lego ML platform
● In an IPython notebook

Alchemy Feature Engineering Framework
● Single way to define and add ML features
● Features are reusable
○ Different ML applications do not define / calculate them separately
● Available both offline (training time) and online (prediction time)
● Single point for logging, monitoring, documentation etc.
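
The properties listed for the feature framework can be illustrated with a tiny registry. This is a sketch of the general pattern only: the decorator, the registry, and the two example features are hypothetical and are not Alchemy's actual API.

```python
FEATURES = {}  # feature name -> (function, documentation string)


def feature(name, doc):
    """Register a feature once so every ML application reuses the same definition."""
    def register(fn):
        if name in FEATURES:
            raise ValueError(f"feature {name!r} is already defined")
        FEATURES[name] = (fn, doc)
        return fn
    return register


@feature("answer_length", doc="Number of characters in the answer text")
def answer_length(user, story):
    return len(story["text"])


@feature("follows_topic", doc="1 if the user follows the story's topic, else 0")
def follows_topic(user, story):
    return int(story["topic"] in user["topics"])


def compute_features(names, user, story):
    """Single code path used both offline (training) and online (prediction)."""
    return {name: FEATURES[name][0](user, story) for name in names}
```

Because training jobs and the serving path both call `compute_features`, a feature cannot drift between its offline and online definitions, and logging or monitoring can be added in that one function.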

What all matters for your ML algorithm:
● Relevance
● Speed: Fast prediction, (relatively) fast training
● Fast development and iteration time
● Reliability / Robustness
● Cost
● Debuggability
● Low technical debt

Occam’s razor for Machine Learning
● Given two models that perform more or less equally, you should always prefer the less complex one
● E.g. a Deep Learning model:
○ +1% in accuracy
○ 10x training time
○ 1.5x prediction time
○ Costly to store and maintain
● Look at all the factors, not just relevance
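
The rule of thumb can be written down as a small decision function. The 2-point accuracy threshold, the class, and the example numbers other than the slide's ratios (+1% accuracy, 10x training time, 1.5x prediction time) are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    accuracy: float      # offline relevance metric, higher is better
    train_hours: float   # cost of one training run
    predict_ms: float    # latency of one prediction


def prefer_simpler(simple, complex_model, min_gain=0.02):
    """Pick the complex model only if its accuracy gain exceeds min_gain."""
    gain = complex_model.accuracy - simple.accuracy
    return complex_model if gain > min_gain else simple


# The slide's example: +1% accuracy for 10x training time and 1.5x prediction time.
baseline = Candidate("baseline", accuracy=0.80, train_hours=1.0, predict_ms=10.0)
deep = Candidate("deep", accuracy=0.81, train_hours=10.0, predict_ms=15.0)
chosen = prefer_simpler(baseline, deep)  # the gain is too small, so baseline is kept
```

In practice the threshold would itself depend on the other factors on the slide, such as how much the extra training and serving cost matters for the application.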

Distributing ML training
● Distributed ML training helps you scale with data
● But most of what people do in practice can fit into a single, multi-core machine
● Trade-offs:
○ Relevance gains
○ Training speed
○ Development and iteration time
○ Costs
● Use what works best given these factors, with an eye out for the future

In summary
● Figure out how to scale up your data and your models
● But scaling is not just about data and the models
○ Think about your ML environment too
● Be Pragmatic
○ Don’t chase every new, shiny object

We are hiring!
● https://www.quora.com/careers
● Technical Lead - Machine Learning
● Software Engineer - Machine Learning
● Software Engineer - NLP
● Engineering Manager - Machine Learning
