Many web properties make extensive use of item-based collaborative filtering, which showcases relationships between pairs of items based on the wisdom of the crowd. This paper presents LinkedIn’s horizontal collaborative filtering infrastructure, known as browsemaps. The platform enables rapid development, deployment, and computation of collaborative filtering recommendations for almost any use case on LinkedIn. In addition, it provides centralized management of scaling, monitoring, and other operational tasks for online serving. We also present case studies on how LinkedIn uses this platform in various recommendation products, as well as lessons learned in the field over the several years this system has been in production.
Browsemap: Collaborative Filtering at LinkedIn
1. Browsemap:
Collaborative Filtering At LinkedIn
Lili Wu, Sam Shah, Sean Choi, Mitul Tiwari, Christian Posse
RSWeb 2014
with RecSys
3. Collaborative filtering for member profiles
Profile Browsemap: People who viewed this profile also viewed…
Count co-views
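Co-view counting can be illustrated with a small sketch. It assumes a hypothetical log of `(viewer_id, item_id)` pairs and counts, for each pair of items, how many distinct viewers saw both. The function names and the log format are illustrative assumptions, not the platform's actual interface.

```python
from collections import Counter, defaultdict
from itertools import combinations


def count_coviews(view_log):
    """Count, for every pair of items, how many viewers saw both.

    view_log: iterable of (viewer_id, item_id) pairs (hypothetical format).
    Returns {item: Counter({other_item: shared_viewer_count})}.
    """
    items_by_viewer = defaultdict(set)
    for viewer, item in view_log:
        items_by_viewer[viewer].add(item)

    coviews = defaultdict(Counter)
    for items in items_by_viewer.values():
        # Every unordered pair of items seen by the same viewer is one co-view.
        for a, b in combinations(sorted(items), 2):
            coviews[a][b] += 1
            coviews[b][a] += 1
    return coviews


def browsemap(coviews, item, k=3):
    """Return up to k items most often co-viewed with `item`."""
    return [other for other, _ in coviews[item].most_common(k)]


view_log = [
    ("u1", "alice"), ("u1", "bob"),
    ("u2", "alice"), ("u2", "bob"), ("u2", "carol"),
    ("u3", "alice"), ("u3", "carol"),
    ("u4", "bob"),
]
coviews = count_coviews(view_log)
print(browsemap(coviews, "bob", k=1))
```

Counting distinct viewers rather than raw page views is a design choice made here for simplicity; a production system would likely also weight or filter the events.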
4. Collaborative filtering for job pages
Job Browsemap: People who viewed this job also viewed…
Count co-views
5. … many CF-based recommenders
group, company, portfolio
6. Challenges
• Many different entities
• Similar problems with different requirements
• Fast product development cycle
• Hybrid recommender systems
• Handle LinkedIn data volume and traffic
7. Challenges
→ Horizontal Platform
9. Browsemap Platform
• Scalability
➢ Online/offline architecture
➢ Hundreds of millions of entities, billions of monthly page views
• Browsemap Domain Specific Language (DSL)
➢ Code reuse through modular components
➢ Flexible computation workflow construction
• Data are used by hybrid recommenders
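10. Browsemap Architecture
[Architecture diagram. Components: Frontend Services; User Activity Data; HDFS; Hadoop running the Browsemap Engine with the Browsemap DSL; Voldemort key-value storage; Online Query API exchanging Queries and Results with the frontend services.]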
11. Browsemap Architecture
[Architecture diagram with the same components, with a "High Throughput" callout.]
12. Browsemap Architecture
[Architecture diagram with the same components, with a "Low Latency" callout.]
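The split between a batch computation and low-latency serving can be sketched as follows. This is a minimal illustration, assuming the batch job precomputes a top-k list per entity and the serving path only performs a key lookup. `InMemoryStore` is a hypothetical stand-in for a key-value store and does not reproduce Voldemort's API; `offline_job` and `online_query` are likewise illustrative names.

```python
class InMemoryStore:
    """Hypothetical stand-in for a read-only key-value store."""

    def __init__(self):
        self._snapshot = {}

    def swap(self, new_snapshot):
        # Replace the whole data set at once, as a batch-built store might.
        self._snapshot = dict(new_snapshot)

    def get(self, key, default=None):
        return self._snapshot.get(key, default)


def offline_job(coview_counts, k):
    """Batch step: keep only the top-k co-viewed items per entity."""
    return {
        entity: [
            other
            for other, _ in sorted(
                counts.items(), key=lambda kv: (-kv[1], kv[0])
            )[:k]
        ]
        for entity, counts in coview_counts.items()
    }


def online_query(store, entity_id):
    """Serving step: a single key lookup, no computation at request time."""
    return store.get(entity_id, [])


store = InMemoryStore()
coview_counts = {
    "job:1": {"job:2": 5, "job:3": 2, "job:4": 9},
    "job:2": {"job:1": 5},
}
store.swap(offline_job(coview_counts, k=2))
print(online_query(store, "job:1"))
```

Keeping all ranking work in the batch step is what lets the request path stay a constant-time lookup, at the cost of results being only as fresh as the last batch run.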
13. Browsemap Domain Specific Language (DSL)
Module collection: Expired Job Filtering, Spam User Filtering, Co-view counting, Cold-start techniques, …
Job browsemap: Job → Expired Job Filtering → … → Co-view counting → … → Cold-start techniques
Company browsemap: Company → Spam User Filtering → … → Co-view counting → … → Cold-start techniques
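The idea of assembling different browsemaps from a shared module collection can be sketched in plain Python. This is not the Browsemap DSL itself, whose syntax is not shown in the slides; the helper names (`workflow`, `spam_user_filter`, `expired_item_filter`, `coview_counting`) are hypothetical, and the cold-start step is omitted for brevity.

```python
from collections import Counter, defaultdict
from itertools import combinations


def spam_user_filter(spam_users):
    """Module: drop events produced by known spam users."""
    def step(events):
        return [(u, i) for u, i in events if u not in spam_users]
    return step


def expired_item_filter(expired_items):
    """Module: drop events that refer to expired items."""
    def step(events):
        return [(u, i) for u, i in events if i not in expired_items]
    return step


def coview_counting(events):
    """Module: count how many users viewed each pair of items."""
    by_user = defaultdict(set)
    for user, item in events:
        by_user[user].add(item)
    counts = defaultdict(Counter)
    for items in by_user.values():
        for a, b in combinations(sorted(items), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def workflow(*steps):
    """Chain modules so each step consumes the previous step's output."""
    def run(data):
        for step in steps:
            data = step(data)
        return data
    return run


# Two workflows reuse the same counting module with different filters.
job_browsemap = workflow(expired_item_filter({"job:old"}), coview_counting)
company_browsemap = workflow(spam_user_filter({"bot"}), coview_counting)

job_events = [
    ("u1", "job:a"), ("u1", "job:b"), ("u1", "job:old"),
    ("u2", "job:a"), ("u2", "job:b"),
]
company_events = [
    ("u1", "co:x"), ("u1", "co:y"),
    ("bot", "co:x"), ("bot", "co:z"),
]
job_counts = job_browsemap(job_events)
company_counts = company_browsemap(company_events)
```

Each entity type only declares which modules it needs and in what order, which is the code-reuse property the slide attributes to the DSL.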
14. Recap
• Support all entity types
• Adjust to each product's requirements
• Scale
18. Applications – Hybrid Recommender Systems
[Slide image text: Suggested Profile Update]
Goal: for each member, find companies they may want to follow
19. Applications – Hybrid Recommender Systems
Member info:
• Content-based features: title, industry, location, …
• Collaborative filtering feature: companies the member may be interested in, derived from the companies the member follows
Co-follow Browsemaps: People who follow this company also follow these companies
Example: member followed companies Google, Cisco, … → companies the member may be interested in: LinkedIn, Facebook (from Google); Juniper, Arista (from Cisco); …
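Deriving candidate companies from co-follow browsemaps can be sketched as below. The function name, the data layout, and the vote-based ordering are assumptions made for illustration; the company names are the ones shown on the slide.

```python
from collections import Counter


def cofollow_candidates(followed, cofollow_browsemaps):
    """Collect companies related to those the member already follows.

    followed: set of company names the member follows.
    cofollow_browsemaps: {company: list of co-followed companies}.
    """
    votes = Counter()
    for company in followed:
        for related in cofollow_browsemaps.get(company, []):
            if related not in followed:
                votes[related] += 1
    # Most-supported candidates first; ties broken alphabetically.
    return [c for c, _ in sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))]


cofollow_browsemaps = {
    "Google": ["LinkedIn", "Facebook"],
    "Cisco": ["Juniper", "Arista"],
}
candidates = cofollow_candidates({"Google", "Cisco"}, cofollow_browsemaps)
print(candidates)
```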
20. Applications – Hybrid Recommender Systems
Question: For a company C, will member M like it?
Approach: Logistic regression
Features:
• Does the member location match the company location? 1 if yes, 0 if no
• Is the company in the list of the co-follow browsemaps? 1 if yes, 0 if no
• …
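A minimal sketch of how the two binary features could feed a logistic regression score follows. The weights and bias are hypothetical values chosen by hand for illustration, not trained or published parameters, and the dictionary fields are assumed names.

```python
import math


def features(member, company, cofollow_list):
    """Build the two binary features named on the slide."""
    same_location = 1.0 if member["location"] == company["location"] else 0.0
    in_cofollow = 1.0 if company["name"] in cofollow_list else 0.0
    return [same_location, in_cofollow]


def predict(weights, bias, x):
    """Logistic regression: sigmoid of the weighted feature sum."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights, chosen by hand for illustration; not trained values.
WEIGHTS = [0.5, 2.0]
BIAS = -1.5

member = {"location": "San Francisco Bay Area"}
company = {"name": "Juniper", "location": "San Francisco Bay Area"}
x = features(member, company, {"Juniper", "Arista"})
score = predict(WEIGHTS, BIAS, x)
print(round(score, 3))
```

In this sketch the collaborative filtering signal enters the model as just another feature next to the content-based ones, which is what makes the recommender hybrid.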
21. Applications – Hybrid Recommender Systems
Collaborative filtering is important:
• Surfaces implicit connections between companies
• Based on members’ preferences
26. Lesson 1: Tall oaks grow from little acorns
A generic horizontal platform is essential
27. Lesson 2: One hand washes the other
Job Browsemap (collaborative filtering): “Follower audience”
Similar Jobs (content-based): “Leader audience”
28. Lesson 3: You can’t get blood out of a stone
Need to handle the cold start problem
• Leverage browsing history
• Personalized backfill
[Diagram labels: Job 1, Job 2, Job 3 (new); view time; merge]
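One possible reading of backfilling from browsing history is sketched below: when a new item has no co-view data, recommendations are filled in from the browsemaps of items the member viewed earlier. The slide does not spell out the merge logic, so the ordering rule, the function name, and the data layout are all assumptions.

```python
def recommend_with_backfill(item, history, browsemaps, k=3):
    """Return up to k recommendations for `item`, backfilling if needed.

    history: items the member viewed earlier, most recent first.
    browsemaps: {item: ranked list of related items}.
    """
    results = list(browsemaps.get(item, []))[:k]
    seen = set(results) | {item}
    for past_item in history:
        if len(results) >= k:
            break
        # Merge in related items from earlier views, skipping duplicates.
        for related in browsemaps.get(past_item, []):
            if related not in seen:
                results.append(related)
                seen.add(related)
                if len(results) >= k:
                    break
    return results


browsemaps = {
    "job1": ["job4", "job5"],
    "job2": ["job5", "job6"],
    # "job3" is new and has no co-view data yet.
}
recs = recommend_with_backfill("job3", ["job2", "job1"], browsemaps, k=3)
print(recs)
```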
29. Lesson 4: A chain is only as strong as its weakest link
CF relies solely on user activities, so good data is crucial
▪ Mistakes can be hard to detect and debug
▪ Simple mistakes can have a big impact, e.g. “jobid” → “id”
▪ Need prevention mechanisms
➢ Improve tracking
➢ Unit tests
➢ Browsemap platform data checks: input volume, coverage/metrics analysis
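Input-volume and coverage checks of the kind listed above could look like the following sketch. The thresholds, function names, and inputs are hypothetical; the slides do not describe how the platform implements its data checks.

```python
def check_input_volume(today_count, recent_counts, max_drop=0.5):
    """Return True if today's input is not far below the recent average."""
    if not recent_counts:
        return True  # nothing to compare against yet
    baseline = sum(recent_counts) / len(recent_counts)
    return today_count >= baseline * (1.0 - max_drop)


def coverage(browsemaps, all_entities):
    """Fraction of entities that have at least one recommendation."""
    if not all_entities:
        return 0.0
    covered = sum(1 for entity in all_entities if browsemaps.get(entity))
    return covered / len(all_entities)


# A sudden drop in input records fails the volume check.
ok_normal = check_input_volume(1000, [980, 1020, 1000])
ok_dropped = check_input_volume(10, [980, 1020, 1000])
cov = coverage({"a": ["b"], "b": []}, ["a", "b", "c", "d"])
print(ok_normal, ok_dropped, cov)
```

A check like this would plausibly catch a broken tracking field, since events that no longer parse would show up as a sharp fall in input volume or coverage.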
30. Lesson 5: User experience matters
50% CTR
500% more applications
• Put recommendations in the user’s flow
31. Conclusion
▪ Collaborative filtering is important for LinkedIn
▪ Browsemap has been in production for 3+ years
▪ A horizontal platform is crucial