Ewa Dominowska, Engineering Manager, Facebook at MLconf SEA - 5/20/16

Generating a Billion Personal News Feeds
Lessons Learned from News Feed Ranking
Ewa Dominowska

1 News Feed Overview
2 Machine Learning in News Feed
3 Measurement

What is Facebook News Feed?
A way to connect with the stories that matter most to you
Connect, Inform and Entertain
News Feed is the constantly updating list of stories in the middle of your home
page or mobile app. News Feed includes status updates, photos, videos,
links, app activity and likes from people, pages and groups that you follow
on Facebook.
Feed Ranking is just a tool: you are in control of what you see in your News
Feed and can adjust your settings.

Basic Stats
▪ Over 1,000,000,000 daily users
▪ Hundreds of billions of stories seen per day
▪ Trillions of stories ranked per day
▪ Publish -> In Feed < 1s
▪ Retrieval + rank time < 200ms

Deliver everything that matters to people
and nothing that doesn't
▪ Don’t miss any important stories
▪ New stories should show up within seconds
▪ Put the best content at the top
▪ People notice/interact with
content at the top first.
▪ Better content at top means
better experience, less good
content missed.
▪ It’s not just winner takes all:
ordering of content matters

How Does News Feed Work?
Figure: an example feed of seven stories: Story 1 (photo), Story 2 (friend post), Story 3 (video), Story 4 (link share), Story 5 (photo), Story 6 (link share), Story 7 (friend post)
▪ Goal is to put the best content on top
▪ Solution: every time you visit, we
rank all new content and put it on top
▪ Anything you haven’t seen
is new to you
▪ A new friend shares a link you’ve already seen
▪ Unseen old stories
▪ Seen stories with new
comments
▪ For frequent users ranking
is almost chronological
▪ Diversity of content matters
Figure: the example feed re-ranked on visits at 9:00 AM, 10:00 AM, 10:10 AM and 12:00 PM. Each story carries a score (from 1.9 down to 0.1 in the example); newly arrived stories (Story 8 through Story 13) and Story 3, which gained a friend comment, are scored and placed above the stories already seen.
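The visit-time behavior described on this slide can be sketched roughly as follows. This is a minimal illustration, not Facebook's implementation: it assumes every candidate story already has a score, and the names `Story` and `rank_feed`, the fields, and the example scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Story:
    story_id: int
    kind: str
    score: float                # illustrative relevance score
    seen: bool = False          # viewer has already seen this story
    new_comments: bool = False  # a seen story that gained new comments


def is_new_to_viewer(story):
    # "Anything you haven't seen is new to you", plus seen stories with new comments.
    return (not story.seen) or story.new_comments


def rank_feed(stories):
    """Place everything that is new to the viewer on top, best score first."""
    fresh = sorted(
        (s for s in stories if is_new_to_viewer(s)),
        key=lambda s: s.score,
        reverse=True,
    )
    older = [s for s in stories if not is_new_to_viewer(s)]  # keep prior order
    return fresh + older


feed = [
    Story(4, "link share", 0.6, seen=True),
    Story(3, "video", 1.3, seen=True, new_comments=True),
    Story(8, "page post", 0.9),
    Story(9, "video", 0.1),
]
print([s.story_id for s in rank_feed(feed)])  # [3, 8, 9, 4]
```

Because only unseen or updated stories are re-ranked, a viewer who visits often sees few new candidates per visit, which is one way to read the remark that ranking becomes almost chronological for frequent users.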

Scoring-Based Ranking
▪ Given a potential feed story, how
good is it?
▪ Express as probability of click, like,
comment, etc.
▪ Assign different weights to different
events, according to significance
▪ Example: close coworker feels
earthquake
▪ Highest chance of click
▪ Decent chance of like/comment
Event      Probability   Value*
Click      5.1%             1
Like       2.9%             5
Comment    0.55%           20
Share      0.00005%        40
Friend     0.00003%        50
Hide       0.00002%      -100
Total                   0.306
*Example, not real values
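As a worked check of the example table, the total is the sum of each event's probability times its value. The sketch below uses only the slide's illustrative numbers (explicitly not real values); the names `EVENTS` and `story_score` are hypothetical.

```python
# (probability in percent, value) per event, copied from the example table.
EVENTS = {
    "click":   (5.1,      1),
    "like":    (2.9,      5),
    "comment": (0.55,    20),
    "share":   (0.00005, 40),
    "friend":  (0.00003, 50),
    "hide":    (0.00002, -100),
}


def story_score(events):
    """Weighted sum of event probabilities: sum of p(event) * value(event)."""
    return sum((pct / 100.0) * value for pct, value in events.values())


print(round(story_score(EVENTS), 3))  # 0.306
```

Clicks, likes and comments dominate the total here; the rare events (share, friend, hide) contribute almost nothing at these probabilities despite their large values.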

Learnings
▪ Why this structure:
▪ Uses Machine Learning to predict true, measurable behavior
▪ Models train on their own data
▪ Allows fast iteration
▪ Allows distributed development
▪ Allows for easy ranking of heterogeneous content
▪ Allows for value to be adjusted independently
▪ WORKS WELL IN PRACTICE

Role of Network Structure
▪ News Feed delivers content from
friends along social network
▪ Understanding the network is
key to defining quality
▪ Who are your close friends?
▪ Whose photos do you always like?
▪ Whose links are the most interesting
to you?
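         Click    Like     Comment   Weighted Sum
Joe      0.012    0.0042   0.00082   0.0494
Susan    0.023    0.02     0.0082    0.287
Li       0.012    0.0037   0.001     0.0505
(Weighted sum = 1 × click + 5 × like + 20 × comment, which reproduces the values shown.)
Figure: Joe, Susan and Li labeled with their weighted sums (0.0494, 0.287 and 0.0505)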

Feature Selection (BDTs)
▪ Start with >100K potential (dense) features and all historical activity
▪ First, prune these to top ~2K
▪ Training time is proportional to number of examples * number of features
▪ Under-sample negative examples (impressions, no action) to help with # of examples
▪ Start with 100K features, max rows, keep most important 10K, train 10x rows
▪ Do this for each feed event type: train many forests
▪ Historical counts and propensity are some of the strongest features
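The under-sampling step can be sketched as below. This is an illustration under stated assumptions, not the method from the talk: the function name `undersample_negatives` and the `keep_rate` parameter are hypothetical, and attaching a weight of `1 / keep_rate` to each kept negative is a common correction added here as an assumption (the slides do not say whether or how examples are re-weighted).

```python
import random


def undersample_negatives(examples, keep_rate, seed=0):
    """Keep every positive and a random fraction of the negatives.

    examples: iterable of (features, label), where label 1 means the viewer
    acted and label 0 means an impression with no action.
    Returns (features, label, weight) triples. Kept negatives get weight
    1 / keep_rate (an assumption, see the lead-in) so that weighted counts
    still approximate the original class balance.
    """
    rng = random.Random(seed)
    sampled = []
    for features, label in examples:
        if label == 1:
            sampled.append((features, label, 1.0))
        elif rng.random() < keep_rate:
            sampled.append((features, label, 1.0 / keep_rate))
    return sampled


# Toy data: 10 actions among 1,010 impressions.
data = [({"id": i}, 1) for i in range(10)] + [({"id": i}, 0) for i in range(1000)]
train = undersample_negatives(data, keep_rate=0.1)
print(len(train), "of", len(data), "examples kept")
```

Fewer rows per training run is what frees the budget to train on more features or, after pruning, on more rows, since training time scales with examples times features.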

Model Training (Logistic regression)
▪ We need to react quickly and incorporate new content - use a simple model
▪ Logistic regression is simple, fast and easy to distribute
▪ Treat the trees as feature transforms, each one turning the input features into
a set of categorical features, one per tree.
▪ Use logistic regression for online learning to quickly re-learn leaf weights
Figure: two example boosted trees that split on features F1, F2 and F3, with leaf weights such as -0.1, 0.3, 0.2, -0.5 and -0.05
Throw out the boosted trees' leaf weights and use only the transforms
Input: (F1, F2, F3)
Output: (T1, T2), where T1 ∈ {leaves of tree 1}, and likewise for T2

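The trees-as-transforms idea above can be sketched in plain Python. Everything here is a toy under stated assumptions: the two hand-written trees, the thresholds, the learning rate, and the names `leaf_index`, `transform` and `OnlineLogisticRegression` are hypothetical stand-ins for a trained boosted forest and a production online learner.

```python
import math

# A tree is ("leaf", leaf_index) or ("split", feature_idx, threshold, left, right).
# Hypothetical toy trees; real ones would come from a trained boosted forest.
TREE_1 = ("split", 2, 0.5,
          ("leaf", 0),
          ("split", 0, 0.0, ("leaf", 1), ("leaf", 2)))
TREE_2 = ("split", 1, 1.0,
          ("leaf", 0),
          ("split", 2, 0.2, ("leaf", 1), ("leaf", 2)))
FOREST = [(TREE_1, 3), (TREE_2, 3)]  # (tree, number of leaves)


def leaf_index(tree, x):
    """Walk one tree and return the index of the leaf that x lands in."""
    while tree[0] == "split":
        _, feat, thresh, left, right = tree
        tree = left if x[feat] <= thresh else right
    return tree[1]


def transform(x):
    """Turn raw features into one active one-hot position per tree."""
    active, offset = [], 0
    for tree, n_leaves in FOREST:
        active.append(offset + leaf_index(tree, x))
        offset += n_leaves
    return active


class OnlineLogisticRegression:
    """Logistic regression over sparse Boolean features, updated one example at a time."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, active):
        z = self.b + sum(self.w[i] for i in active)
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, active, label):
        grad = self.predict(active) - label  # d(log loss) / dz
        self.b -= self.lr * grad
        for i in active:
            self.w[i] -= self.lr * grad


x = (0.3, 2.0, 0.9)            # (F1, F2, F3)
print(transform(x))            # [2, 5]: leaf 2 of tree 1, leaf 2 of tree 2
model = OnlineLogisticRegression(n_features=6)
model.update(transform(x), label=1)
print(model.predict(transform(x)) > 0.5)  # True
```

The tree structure stays fixed while only the per-leaf weights in the logistic regression are re-learned, which is what makes the online update cheap.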
Stacking: Combined Tree + LR Model
▪ Main advantage: tree application is computationally expensive and slow,
so it pays to apply the trees once and share the result
▪ Reuse click tree to predict likes, comments, etc.
▪ Only slightly more expensive than independent models; better prediction
performance – transfer learning
Figure: ~thousands of raw features -> thousands of tree transforms -> sparse Boolean features + non-tree raw features -> one logistic regression per event (Click, Like, Comment, Share, Friend, Outbound click, Follow, Hide)

Other models + sparse features
▪ Train Neural nets to predict events
▪ Discard the final layer, use the last hidden layer's outputs as features
▪ Add sparse features such as text or content ID
Figure: raw features feed a forest and a neural network; their outputs, together with sparse features, feed logistic regression, which predicts Like, Comment, Share, Hide, Outbound click, Fan | Follow, Friend and Click

Learnings
▪ Data freshness matters: simple models allow for online learning and
fast (twitch) response
▪ Feature generation is part of the modeling process
▪ Stacking
▪ supports plugging in new algorithms and features easily
▪ works very well in practice
▪ Use skewed sampling to manage high data volumes
▪ Historical counters provide highly predictive features that are easy
to update online

Measurement
▪ Selecting the right objective function
▪ Defining metrics
▪ Implicit: Engagement, e.g. Click
▪ Longitudinal metrics, e.g. Abandonment
▪ Explicit: Quality, e.g. Survey Score

Why are implicit metrics limited?
▪ Some important stories don’t
get that much engagement
▪ E.g. sad stories and world news
▪ Some lower quality stories get lots of
engagement because of social expectations
▪ Relative importance: Is a comment
always more important than a like?
▪ Goal is to align ranking with
personalized relevance
▪ Solution -> Ask users directly:
Collect explicit signals from survey
data

Pairwise Comparison Survey
▪ Pairwise comparison
between two stories
from same feed.
▪ Pro: Real user preference
on two stories from same
query.
▪ Con: Don’t really know if
they are just comparing
two good stories, or two
poor stories or one of
each.

In or Out survey
▪ Single Story survey, “do
you want to see it or
not?”
▪ Pro: Fun, simple, absolute
▪ Con: People do not really
know the consequence of
the action; the limited
resolution does not help
with ranking

Rating Survey
▪ 5-point rating scale, “how
much do you want to
see the story in your
feed?”.
▪ Pro: Absolute metrics,
good participation rate.
▪ Con: Out of context, raters
might not be truthful,
harder to do for users.

In-Context Survey
▪ 5-point rating scale, “how
much do you want to
see the story in your
feed?”.
▪ Pro: In context
▪ Con: Can distract, lead to
abandonment, takes up
valuable real-estate from
the feed

Absolute vs. relative ratings
▪ Relative
▪ Easier
▪ Infinite precision
▪ More self-consistent
▪ More calibrated across people
▪ Absolute
▪ Gives amount of delta
▪ No intransitivity issues
▪ Clear definition of best
▪ Which one do we choose?
▪ Solution: Do both

Start to better understand what matters for each
individual
Figure: distribution (0% to 100%) of the five survey ratings ("Definitely do NOT want to see this in Feed", "Do not want to see in Feed", "Don't mind seeing in Feed but wouldn't mind missing either", "Want to see in Feed", "Definitely want to see in feed") across five groups of story authors ("Someone you really don't care about", "Someone you don't care about", "Someone you somewhat care about", "Someone you care about", "Someone you really care about")

Improving Feed based on rating results
▪ Not enough data to do large
scale, personalized machine
learning
▪ Look for other insights
▪ E.g. how much should we value a
comment vs. a like?
▪ Ranking by
α p(like) + β p(comment)
▪ Optimal α, β depends on content
type, content creator, context
▪ Passive consumption (dwell time)
prediction improves relevance
▪ Response means different things in
different contexts and for different
people
▪ E.g. a ‘like’ is harder to come by on public
content than on friend content, and hence tends
to indicate higher quality there.
Figure: Rating vs. p(like) and p(outbound click) (axes: p(like), p(outbound click), average rating)
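The ranking formula α p(like) + β p(comment) with context-dependent weights can be sketched as a simple lookup. The slides give no fitted values, so the weights below are purely hypothetical, chosen only to illustrate the point that a like on public content may count for more than a like on friend content; `WEIGHTS` and `rank_score` are assumed names.

```python
# Hypothetical (alpha, beta) per content type. Illustrative numbers only;
# in the talk these would be informed by survey ratings.
WEIGHTS = {
    "friend": (1.0, 4.0),
    "public": (2.0, 4.0),  # likes are rarer on public content, so alpha is larger
}


def rank_score(content_type, p_like, p_comment):
    """Score = alpha * p(like) + beta * p(comment), with weights chosen by context."""
    alpha, beta = WEIGHTS[content_type]
    return alpha * p_like + beta * p_comment


friend_score = rank_score("friend", p_like=0.03, p_comment=0.005)
public_score = rank_score("public", p_like=0.03, p_comment=0.005)
print(public_score > friend_score)  # True
```

The same predicted probabilities produce different scores depending on context, which is the sense in which a response means different things for different content and people.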

Learnings
▪ Both implicit and explicit signals are important and can be used
together
▪ Multiple survey types can be used simultaneously to get different
advantages
▪ Metrics are important - the right metrics are needed to define an
objective function for our models and to measure model performance
▪ ML and Metrics are tools that let us CONNECT, INFORM and
ENTERTAIN Facebook users.

We are far from done…
Come join us to help solve the next big
challenge
