Beyond ratings and followers (RecSys 2012)

Beyond Ratings & Followers
Anmol Bhasin
Sr. Manager, Analytics Engineering
www.linkedin.com

The Recommender Ecosystem
Similar Profiles

Connections

Network updates
Events You May Be Interested In

News


LinkedIn Recommendation Engine

Recommendation Entities: Jobs, Groups, People, Ads, Companies, Searches, …

Products: Jobs You May Be Interested In, Similar Jobs, Jobs Browse Map, Similar Groups, Groups Browse Map, GYML, Similar Profiles, People Browse Map, TalentMatch, Referral Center, News, Events, … and more

A/B
API
Recommendation Types: Behavior Analysis, Collaborative Filtering, Popularity, User Feedback
Shared, Dynamic, Unified Core Service: (R-T) Feature Extraction, Entity Resolution & Enrichment, (R-T) matching computations, Offline data munging (hadoop)

Possible Approaches

 Naïve K Nearest Neighbor solution
 Complexity is O(n^2)

 Clustering
 Latent Factor Models like PLSI or LDA
 Hierarchical Agglomerative clustering

 Self Organizing Maps

 Item based Collaborative Filtering
 Find pairs of Users viewed in the same session
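The item-based collaborative filtering idea above (pairs of profiles viewed in the same session) can be sketched as a co-view count. This is a minimal illustration, not the production pipeline: the session data, the function names `co_view_counts` and `similar_profiles`, and the use of raw counts as the similarity signal are all assumptions.

```python
from collections import Counter
from itertools import combinations


def co_view_counts(sessions):
    """Count how often two profiles were viewed in the same session."""
    counts = Counter()
    for viewed in sessions:
        # Deduplicate and sort so each unordered pair has one canonical key.
        for a, b in combinations(sorted(set(viewed)), 2):
            counts[(a, b)] += 1
    return counts


def similar_profiles(counts, profile, top_n=3):
    """Rank the profiles most often co-viewed with `profile`."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == profile:
            scores[b] += n
        elif b == profile:
            scores[a] += n
    return [p for p, _ in scores.most_common(top_n)]


# Hypothetical browsing sessions: each list holds the profiles viewed together.
sessions = [
    ["alice", "bob", "carol"],
    ["alice", "bob"],
    ["bob", "dave"],
]
counts = co_view_counts(sessions)
alice_neighbors = similar_profiles(counts, "alice")
```

Because only pairs that actually co-occur in a session are counted, the work scales with session sizes rather than with all n^2 profile pairs.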

Challenges
 Scale
 175+ M profiles

 Dimensionality
 ~2M companies
 ~200K schools
 ~147 industries
 ~200 countries
 ~25K titles
 ~40K Skills
 ~200 Job Functions

 Similar means different things to different people
 Similar Behavior doesn’t mean you can replace me at my job
 Accuracy vs Relevance (me & my boss.. )

 Realtime..
 It’s a problem of accuracy.. Not recall..

Approach
 Focus attention only on pairs likely to be similar

 Filter out the possibly dissimilar pairs

 Run similarity functions on the filtered-in pairs

Filter → Cluster → Rank

Locality Sensitive Hashing
 LSH function family for Cosine Distance
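One well-known LSH family for cosine similarity is random-hyperplane hashing: each hash bit records which side of a random hyperplane a vector falls on, and vectors separated by a small angle agree on most bits. The sketch below assumes this family; the slide does not say which variant was used, and the names `make_hyperplanes`, `signature` and `candidate_pairs` are illustrative.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw `num_bits` random Gaussian hyperplane normals of dimension `dim`."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(h_i * v_i for h_i, v_i in zip(h, vector)) >= 0 else 0
        for h in hyperplanes
    )


def candidate_pairs(vectors, hyperplanes):
    """Bucket vectors by signature; only same-bucket pairs become candidates."""
    buckets = defaultdict(list)
    for key, vec in vectors.items():
        buckets[signature(vec, hyperplanes)].append(key)
    pairs = set()
    for members in buckets.values():
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                pairs.add(tuple(sorted((members[i], members[j]))))
    return pairs


# Hypothetical feature vectors: "a_scaled" points the same way as "a",
# while "opposite" points the other way.
vectors = {
    "a": [1.0, 2.0, 3.0],
    "a_scaled": [2.0, 4.0, 6.0],
    "opposite": [-1.0, -2.0, -3.0],
}
hyperplanes = make_hyperplanes(num_bits=16, dim=3)
pairs = candidate_pairs(vectors, hyperplanes)
```

In practice several independent hash tables (bands) are usually combined so that near, but not identical, directions still collide in at least one table; this sketch uses a single table for brevity.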

Similarity Functions

 Different bands of attributes
 Boolean, Jaccard or Cosine similarities across attribute pairs.

 Logistic Regression with Elastic Net penalty

 Learn model params on a set of hand labeled data points
 Predicted value interpreted as score
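As a rough sketch of the scoring step above: per-attribute similarities (Boolean, Jaccard, cosine) become features, and a logistic model turns them into a score. The attribute names, weights and bias below are hypothetical placeholders; in the approach described, the parameters would be learned from hand-labeled pairs with a regularized logistic regression rather than set by hand.

```python
import math


def jaccard(a, b):
    """Jaccard similarity between two sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine(u, v):
    """Cosine similarity between two sparse vectors given as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pair_features(p, q):
    """One similarity per attribute band (band choice is an assumption)."""
    return [
        1.0 if p["industry"] == q["industry"] else 0.0,  # Boolean match
        jaccard(p["skills"], q["skills"]),               # set overlap
        cosine(p["title_terms"], q["title_terms"]),      # weighted terms
    ]


def score(features, weights, bias):
    """Logistic function of the weighted features, read as a similarity score."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical profiles and hand-set parameters, for illustration only.
p = {"industry": "Internet", "skills": {"ML", "RecSys"},
     "title_terms": {"eng": 1.0, "mgr": 1.0}}
q = {"industry": "Internet", "skills": {"ML", "RecSys", "DM"},
     "title_terms": {"eng": 1.0, "dir": 1.0}}
r = {"industry": "Banking", "skills": {"Sales"},
     "title_terms": {"account": 1.0, "exec": 1.0}}
weights, bias = [0.5, 2.0, 1.5], -2.0

score_pq = score(pair_features(p, q), weights, bias)
score_pr = score(pair_features(p, r), weights, bias)
```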

Ad Ranking
 Given
$U_j,\ \{(c_0, b_0), (c_1, b_1), (c_2, b_2), (c_3, b_3), \dots, (c_n, b_n)\},\ H$
 Objective

$\arg\max_{i \in C} \left( \mathrm{pCTR}_i \cdot \mathrm{bid}_i \right)$

 Goal:
 Increase revenue
 Respect daily budgets of Advertisers
 Good user experience
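The objective above can be sketched as ranking candidates by expected value, pCTR times bid, after dropping ads whose budget is exhausted. This reads each (c, b) pair as a creative and its bid, which is an interpretation of the slide's notation. The budget check is a deliberately simple stand-in for "respect daily budgets", and `rank_ads` and the sample numbers are hypothetical.

```python
def rank_ads(candidates, pctr, remaining_budget):
    """Order eligible ads by expected value pCTR_i * bid_i, best first.

    candidates: list of (creative_id, bid)
    pctr: dict creative_id -> predicted click-through rate
    remaining_budget: dict creative_id -> budget left today (assumed per creative)
    """
    # Skip ads that could not pay for a click out of their remaining budget.
    eligible = [(c, b) for c, b in candidates if remaining_budget.get(c, 0.0) >= b]
    return sorted(eligible, key=lambda cb: pctr[cb[0]] * cb[1], reverse=True)


# Hypothetical candidates for one member.
candidates = [("c0", 2.0), ("c1", 1.0), ("c2", 5.0)]
pctr = {"c0": 0.01, "c1": 0.03, "c2": 0.02}
remaining_budget = {"c0": 10.0, "c1": 10.0, "c2": 1.0}

ranking = rank_ads(candidates, pctr, remaining_budget)
best = ranking[0][0] if ranking else None
```

Here "c2" has the highest pCTR times bid but is filtered out because its remaining budget is below its bid, so "c1" wins.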

Virtual Profiling

Title : Eng Mgr
Company : LinkedIn
Location : CA,USA
Skills : ML, RecSys

Title : Vice President
Company : Twitter
Location : CA,USA
Skills : DM, ML, RecSys
……………….

Virtual Profiling
Member profiles:

Title : Eng Mgr
Company : LinkedIn
Location : CA,USA
Skills : ML, RecSys

Title : Sr. SE
Company : Google
Location : PA, USA
Skills : ML, DM

Title : Eng Dir
Company : LinkedIn
Location : PA, USA
Skills : ML, Stats, DM

Virtual profile (feature<count>):

Title : Sr. SE<1>, Eng Mgr<1>, Eng Dir<1>
Company : LinkedIn<2>, Google<1>
Location : CA,USA<2>, PA, USA<1>
Skills : ML<2>, RecSys<1>, Stats<1>, DM<1>
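A virtual profile of this kind can be sketched as per-field counts of feature values over a set of member profiles. The example below uses the two profiles from the first Virtual Profiling slide; the function name `build_virtual_profile` and the dict-based profile layout are assumptions for illustration.

```python
from collections import Counter


def build_virtual_profile(member_profiles):
    """Aggregate member profiles into per-field feature counts."""
    virtual = {}
    for profile in member_profiles:
        for field, value in profile.items():
            # Multi-valued fields (e.g. skills) contribute one count per value.
            values = value if isinstance(value, (list, tuple, set)) else [value]
            virtual.setdefault(field, Counter()).update(values)
    return virtual


member_profiles = [
    {"title": "Eng Mgr", "company": "LinkedIn", "location": "CA,USA",
     "skills": ["ML", "RecSys"]},
    {"title": "Vice President", "company": "Twitter", "location": "CA,USA",
     "skills": ["DM", "ML", "RecSys"]},
]
virtual_profile = build_virtual_profile(member_profiles)
```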

Virtual Profiling

Information Gain

 Pick the top K overrepresented features from the clicker distribution vs. the target segment

A representative projection of the item in the member feature space
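One simple way to score over-representation is each feature's pointwise contribution to the KL divergence between the clicker distribution and the target-segment distribution. The slide does not give its exact information-gain formula, so the scoring function, the additive smoothing and the sample counts below are assumptions.

```python
import math
from collections import Counter


def top_k_overrepresented(clicker_counts, segment_counts, k=2, smoothing=1.0):
    """Return the k features most overrepresented among clickers."""
    features = set(clicker_counts) | set(segment_counts)
    n_click = sum(clicker_counts.values()) + smoothing * len(features)
    n_seg = sum(segment_counts.values()) + smoothing * len(features)
    scores = {}
    for f in features:
        p = (clicker_counts.get(f, 0) + smoothing) / n_click  # clicker share
        q = (segment_counts.get(f, 0) + smoothing) / n_seg    # segment share
        # Pointwise KL term: large when f is common among clickers
        # but comparatively rare in the target segment.
        scores[f] = p * math.log(p / q)
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical skill counts for an ad's clickers and its target segment.
clickers = Counter({"ML": 8, "RecSys": 6, "Sales": 1})
segment = Counter({"ML": 30, "RecSys": 5, "Sales": 65})
top_features = top_k_overrepresented(clickers, segment, k=2)
```

With these counts "RecSys" ranks first: it is rare in the segment but common among clickers, while "Sales" dominates the segment and scores negatively.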

CTR Prediction – CF Similarity

[Diagram: member features, the ad creative's virtual profile and creative features feed a Ranker that outputs pCTRi, followed by a score-to-pCTR correction]

 L2 regularized Logistic Regression (Liblinear, VW, Mahout, ADMM)

 For new ad creatives, back off to the advertiser / ad category nodes until they reach critical impression/click volume (explore/exploit)
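The back-off for new creatives can be sketched as a walk up a creative → advertiser → ad category hierarchy until a node has enough impressions to trust its click-through rate. The threshold, node names and statistics below are hypothetical, and the sketch covers only the back-off, not the explore/exploit policy.

```python
def backoff_ctr(creative, stats, parents, min_impressions=1000):
    """Return (node, ctr) from the first node with enough impressions.

    stats: dict node -> (impressions, clicks)
    parents: dict node -> parent node (creative -> advertiser -> category)
    """
    node = creative
    while node is not None:
        impressions, clicks = stats.get(node, (0, 0))
        if impressions >= min_impressions:
            return node, clicks / impressions
        node = parents.get(node)  # climb one level; None ends the walk
    return None, None


# Hypothetical hierarchy and counts.
parents = {"creative_42": "advertiser_7", "advertiser_7": "category_tech"}
stats = {
    "creative_42": (120, 3),          # too few impressions to trust
    "advertiser_7": (5000, 60),
    "category_tech": (200000, 1800),
}
node_used, ctr = backoff_ctr("creative_42", stats, parents)
```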

Feature Engineering – Entity Resolution

 Companies
‘IBM’ has 8000+ variations
- ibm – ireland
- ibm research
- T J Watson Labs
- International Bus. Machines
- Deep Blue
K-Ambiguous

 Huge impact on the business and UE
 Ad targeting
 TalentMatch
 Referrals
ASONAM’11, KDD’11


Feature Engineering – Sticky Locations
 Open to relocation ?
 Region similarity based on profiles or network
 Region transition probability

 Predict an individual's propensity to migrate and the most likely migration target
 Impact on job recommendations
 20% lift in views/viewers/applications/applicants
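Region transition probabilities can be estimated from observed (origin, destination) pairs, and from them a propensity to migrate and a most likely target. The sketch below assumes that staying put is recorded as a move from a region to itself; the data and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(moves):
    """Estimate P(destination | origin) from observed (origin, destination) pairs."""
    counts = defaultdict(Counter)
    for origin, destination in moves:
        counts[origin][destination] += 1
    return {
        origin: {dest: n / sum(row.values()) for dest, n in row.items()}
        for origin, row in counts.items()
    }


def propensity_and_target(probs, region):
    """Return (probability of leaving `region`, most likely destination)."""
    row = probs.get(region)
    if not row:
        return None, None  # no observations for this region
    stay = row.get(region, 0.0)
    movers = {dest: p for dest, p in row.items() if dest != region}
    target = max(movers, key=movers.get) if movers else None
    return 1.0 - stay, target


# Hypothetical history: 6 members stayed in SF, 3 moved to NYC, 1 to Seattle.
moves = [("SF", "SF")] * 6 + [("SF", "NYC")] * 3 + [("SF", "Seattle")]
probs = transition_probabilities(moves)
propensity, target = propensity_and_target(probs, "SF")
```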

What should you transition to, and when?
[Figure: probability of switch vs. months since graduation]

Social Referral

LinkedIn Group: Text Analytics
From: Deepak Agarwal – Engineering Director, LinkedIn

I found this group interesting, and I think you will too

Deepak

Result: > 2X conversion

Mohammad Amin, Baoshi Yan, Sripad Sriram, Anmol Bhasin, Christian Posse.
Social Referral : Using network connections to deliver recommendations. To appear in
Proceedings of the Sixth ACM conference on Recommender systems (RecSys '12)

Beware of some A/B testing pitfalls

1. Novelty effect
 E.g., new job recommendation algorithms have a week-long novelty effect that shows lifts twice the stationary (real) one
[Figure: job views per 5% bucket range, 6/5/11 vs. 6/19/11; panels: 1-week lifts, 2-week lifts]

2. Cannibalization
 Zero-sum game or real lift?
3. Random sampling destroys network effect

Open Source Technologies

Bobo
Zoie

Voldemort
Kafka

http://data.linkedin.com

Credits

Engineering : Abhishek Gupta, Adam Smyczek, Adil Aijaz,
Alan Li, Baoshi Yan, Bee-Chung Chen, Deepak Agarwal,
Ethan Zhang, Haishan Liu, Igor Perisic, Jonathan
Traupman, Liang Zhang, Lokesh Bajaj, Mario Rodriguez,
Mitul Tiwari, Mohammad Amin, Monica Rogati, Parul
Jain, Paul Ogilvie, Sam Shah, Sanjay Dubey, Tarun Kumar,
Trevor Walker, Utku Irmak

Product : Andrew Hill, Christian Posse, Gyanda Sachdeva,
Mike Grishaver, Parker Barrile, Sachit Kamat

Alphabetically sorted

A Recommendation for you..

Picture yourself with this New Job:

You: Applied Researcher / Research Engineer

Contact:
abhasin@linkedin.com

http://data.linkedin.com/
