Sonya Liberman leads the Personalization team @ Outbrain's Recommendations group, developing large-scale machine learning algorithms for Outbrain's content recommendations platform, which serves tens of billions of real-time recommendations a day. She specializes in Information Retrieval, Machine Learning, and Computational Linguistics. Before joining Outbrain, she led Research and Algorithms @ ConvertMedia (acquired by Taboola). She holds an MSc in Computer Science and a BSc in Computer Science and Computational Biology.
This invited talk was given at the Recommender Systems Workshop 2017, University of Haifa.
14. The Lighthouse
Help people discover content they can trust to be interesting, relevant, and timely for them
Main Players - Publishers, Marketeers, Users
[Diagram: User, Publisher, Marketeer]
15. Challenges
• Personalization
• A Jungle of Market Rules
  Geo targeting, publisher blacklisting of sites, URLs, titles
• Scale
  35K requests/sec, 50 ms latency, millions of potential content recommendations
27. Translate articles to searchable documents in the same feature space as user interests and market rules
Example article: "Breakup: What’s Next?"
site: www.brad.com
Is about: Celebrities
Text: "Brad's acting career continues to flourish while he films a new …"
28. Representing Users and Content in the Same Feature Space
In classical search, queries and documents are represented in the feature space of ‘terms’
In a content recommender system, users and content are represented in a semantic feature space
29. What is a Document About?
Semantic Features (extracted with NLP)
Categories: Entertainment/Television
Topics: Story, Murder, Television
Entities: Dolores, Westworld, HBO
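To make the shared feature space concrete, here is a minimal sketch in which both a user profile and a document are sparse weight maps over the same semantic features, so relevance reduces to a dot product. The key format ("category:…", "entity:…"), the weights, and the function name `dot_score` are illustrative assumptions, not the representation actually used at Outbrain.

```python
def dot_score(user_profile, doc_features):
    """Dot product of two sparse feature -> weight dicts."""
    # Features missing from the document contribute zero to the score.
    return sum(w * doc_features.get(f, 0.0) for f, w in user_profile.items())

# Hypothetical user profile and document, both keyed by semantic features.
user = {
    "category:entertainment/television": 0.8,
    "entity:hbo": 0.5,
    "topic:sports": 0.1,
}
doc = {
    "category:entertainment/television": 1.0,
    "topic:murder": 1.0,
    "entity:westworld": 1.0,
    "entity:hbo": 1.0,
}

score = dot_score(user, doc)
print(score)
```

Only the features the user and the document have in common (the category and the HBO entity here) contribute to the score.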
43. From Market Rules to Elasticsearch Filters
Geo Targeting
"Music World – everything on NY Music Scene"
Targeting "US" users only
44. Index Geo Field in the Document
{
  "title": "Music World – everything on NY Music Scene",
  "categories": ["music"],
  "entities": ["aerosmith", "ny"],
  "geo": ["us"]
}
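With the geo value indexed on the document, the market rule can become a filter clause at query time. The sketch below builds a plausible Elasticsearch request body using the standard `bool`, `term`, and `terms` queries. The function name `build_query` and the choice to match interests on the `categories` field are assumptions made for illustration; the talk does not show the actual query.

```python
import json

def build_query(user_geo, user_interests):
    """Build a bool query: interests contribute to the score, geo only filters."""
    return {
        "query": {
            "bool": {
                # Scoring clause: documents matching the user's interests rank higher.
                "should": [
                    {"terms": {"categories": user_interests}},
                ],
                # Filter clause: documents not targeted at the user's geo are
                # excluded, and the clause does not affect the score.
                "filter": [
                    {"term": {"geo": user_geo}},
                ],
            }
        }
    }

body = build_query("us", ["music"])
print(json.dumps(body, indent=2))
```

Putting the rule in `filter` rather than `should` keeps it a hard constraint: a document without "us" in its `geo` field is never returned, however well it matches the user's interests.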
59. Supervised Learning in an Environment of Dynamic Signals
Some signals are highly dynamic
Offline signals must match online signals
Example: the user profile changes with every user interaction
65. Preventing Data Leakage
User’s interactions affect user’s profile
[Timeline diagram: Music, Music, Tech, Learn Model, Photography]
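One reading of the timeline is that a profile rebuilt at learning time already contains interactions that happened at or after the event being learned from. The sketch below contrasts that leaky profile with a point-in-time profile. The event list, the timestamps, and the helper `profile_at` are hypothetical and only illustrate the idea.

```python
from collections import Counter

def profile_at(events, t):
    """Profile built only from interactions strictly before time t."""
    return Counter(topic for ts, topic in events if ts < t)

# (timestamp, topic the user interacted with), in time order.
events = [(1, "music"), (2, "music"), (3, "tech"), (4, "photography")]

# Training example: the interaction with "tech" at time 3.
leaky = profile_at(events, t=5)    # profile as of learning time
correct = profile_at(events, t=3)  # profile as it was when the item was served

print("leaky:  ", dict(leaky))
print("correct:", dict(correct))
```

The leaky profile already contains the "tech" interaction the model is supposed to predict, so an offline model trained on it would see a signal that was not available online.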
68. Solution - Log And Learn Framework
[Diagram components: Online Serving (Request, Response), Models, Signals Snapshots, Serving-time Logged Data, Offline Model Learning Framework]
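The log-and-learn idea can be sketched as follows: at serving time, the exact signals used to score a request are copied into a log, and offline learning reads those snapshots instead of recomputing signals later. Every name here (`serve`, `training_rows`, `serving_log`, the toy model) is an assumption made for illustration and does not describe the production framework.

```python
import copy

serving_log = []  # stands in for the serving-time logged data store

def serve(request_id, user_profile, candidates, model):
    """Score candidates and log a snapshot of the signals used for this request."""
    scores = {doc: model(user_profile, doc) for doc in candidates}
    best = max(scores, key=scores.get)
    serving_log.append({
        "request_id": request_id,
        "signals": copy.deepcopy(user_profile),  # snapshot, not a live reference
        "served": best,
    })
    return best

def training_rows(log, clicked_requests):
    """Join logged snapshots with outcomes: (signals, served doc, label)."""
    return [(e["signals"], e["served"], int(e["request_id"] in clicked_requests))
            for e in log]

def toy_model(profile, doc):
    """Toy model: score a document by the profile weight of its topic."""
    return profile.get(doc, 0.0)

profile = {"music": 2.0, "tech": 1.0}
served = serve("r1", profile, ["music", "tech"], toy_model)
profile["music"] += 1.0  # the live profile moves on after the interaction
rows = training_rows(serving_log, clicked_requests={"r1"})
print(rows)
```

Because the log holds a deep copy, the training row keeps the profile exactly as it was when the recommendation was served, even though the live profile has since changed.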
69. Supervised Learning from Explorative Data
Regression Models
Collaborative Filtering Models
Factorization Machines
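As an illustration of the last model family, here is a minimal sketch of how a standard second-order factorization machine scores a feature vector, using the usual linear-time form of the pairwise term. The parameter values are made up, and treating `x` as a concatenation of user and content features is an assumption; the talk does not give the model's details.

```python
def fm_predict(x, w0, w, V):
    """Second-order factorization machine score.

    x: feature values (length n), w0: bias, w: linear weights (length n),
    V: n x k matrix of latent factors, one row per feature.
    """
    n, k = len(x), len(V[0])
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        # sum over i<j of <V_i, V_j> x_i x_j, computed per latent factor as
        # 0.5 * ((sum_i V_if x_i)^2 - sum_i (V_if x_i)^2)
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise

x = [1.0, 1.0, 0.0]   # two active features, one inactive
w0 = 0.5
w = [1.0, 2.0, 3.0]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

y = fm_predict(x, w0, w, V)
print(y)
```

With only the first two features active, the pairwise term reduces to the inner product of their latent vectors (1·3 + 2·4 = 11), giving 0.5 + 3 + 11 = 14.5.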
70. Elasticsearch Custom Scoring Functions
Writing our own scoring functions in native Java via the Elasticsearch plugin mechanism
Passing parameters to Elasticsearch
Applying machine-learned models at serving time
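A rough sketch of how parameters might be passed to a custom scoring function: a `function_score` query whose `script_score` carries a `params` map, such as model weights or a user profile. The script language name `native_model`, the script name `model_scorer`, and the parameter layout are hypothetical placeholders. The exact fields of the script object also vary across Elasticsearch versions, so treat this as an outline of the request shape rather than the talk's actual API usage.

```python
import json

def scored_query(base_query, model_params):
    """Wrap a query so that a plugin-provided script computes the final score."""
    return {
        "query": {
            "function_score": {
                "query": base_query,
                "script_score": {
                    "script": {
                        "lang": "native_model",    # hypothetical plugin script engine
                        "source": "model_scorer",  # hypothetical script name
                        "params": model_params,    # values the scoring code reads
                    }
                },
                # Use the script's score instead of combining it with the query score.
                "boost_mode": "replace",
            }
        }
    }

base = {"bool": {"filter": [{"term": {"geo": "us"}}]}}
body = scored_query(base, {"weights": {"category:music": 0.7}})
print(json.dumps(body, indent=2))
```

The filter still decides which documents are eligible; the script only ranks the documents that pass it.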