Efficient Top-N Recommendation by Linear Regression

Slides from the Large Scale Recommender Systems workshop at ACM RecSys, 13 October 2013.


  1. Efficient Top-N Recommendation by Linear Regression (Mark Levy, Mendeley)
  2–5. What and why?
     ● Social network products @Mendeley
     ● ERASM Eurostars project
     ● Make newsfeed more interesting
     ● First task: who to follow
  6. What and why?
     ● Multiple datasets
       – 2M users, many active
       – ~100M documents
       – author keywords, social tags
       – 50M physical PDFs
       – 15M <user,document> pairs per month
       – user-document matrix currently ~250M non-zeros
  7. What and why?
     ● Approach as an item similarity problem
       – with a constrained target item set
     ● Possible state of the art:
       – pimped old-skool neighbourhood method
       – matrix factorization and then neighbourhood
       – something dynamic (?)
       – SLIM
     ● Use some side data
  8. SLIM
     ● Learn sparse item similarity weights
     ● No explicit neighbourhood or metric
     ● L1 regularization gives sparsity
     ● Bound-constrained least squares:
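The objective itself was an image in the original deck; reconstructed from [1] (r_j is column j of the user-item matrix R, w_j the weight vector being learned), the problem on this slide is:

```latex
\min_{w_j}\; \tfrac{1}{2}\,\lVert r_j - R\,w_j \rVert_2^2
           \;+\; \tfrac{\beta}{2}\,\lVert w_j \rVert_2^2
           \;+\; \lambda\,\lVert w_j \rVert_1
\qquad \text{subject to}\;\; w_j \ge 0,\; w_{jj} = 0
```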
  9. SLIM
     [diagram: users × items matrix R times sparse weight matrix W ≈ R; each column r_j is approximated as R w_j]
     model.fit(R, rj)
     wj = model.coef_
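A minimal sketch of the per-column fit pictured above, using scikit-learn's ElasticNet with positive=True to get both the L1-induced sparsity and the bound constraint. The matrix sizes and the alpha/l1_ratio values are placeholders, not the settings used in the talk:

```python
import numpy as np
import scipy.sparse as sp
from sklearn.linear_model import ElasticNet

def fit_slim_column(R, j, alpha=0.01, l1_ratio=0.5):
    """Learn the sparse non-negative similarity weights w_j for item j."""
    rj = R[:, j].toarray().ravel()      # target column r_j
    Rp = R.tolil()
    Rp[:, j] = 0                        # zero column j so item j cannot predict itself
    model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio,
                       positive=True,   # the bound constraint: w_j >= 0
                       fit_intercept=False)
    model.fit(Rp.tocsc(), rj)
    return model.coef_                  # mostly zeros thanks to the L1 term

R = sp.random(1000, 200, density=0.01, format='csc')   # toy users x items matrix
w3 = fit_slim_column(R, 3)
```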
  10. SLIM
      Good:
        – outperforms MF methods on implicit ratings data [1]
        – easy extensions to include side data [2]
      Not so good:
        – reported to be slow beyond small datasets [1]
      [1] X. Ning and G. Karypis, "SLIM: Sparse Linear Methods for Top-N Recommender Systems", Proc. IEEE ICDM, 2011.
      [2] X. Ning and G. Karypis, "Sparse Linear Methods with Side Information for Top-N Recommendations", Proc. ACM RecSys, 2012.
  11. From SLIM to regression
      ● Avoid constraints
      ● Regularized regression, learn with SGD
      ● Easy to implement
      ● Faster on large datasets
  12. From SLIM to regression
      [diagram: r_j ≈ R′ w_j, where R′ is R with column j zeroed out]
      model.fit(R′, rj)
      wj = model.coef_
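A matching sketch of the regression variant: drop the constraints, zero column j out of the training matrix (the R′ in the diagram), and learn w_j by SGD with an elastic-net penalty. Hyperparameters are again illustrative:

```python
from sklearn.linear_model import SGDRegressor

def fit_column_sgd(R, j, alpha=1e-4, l1_ratio=0.15):
    """Learn w_j by SGD-trained elastic-net regression (no bound constraint)."""
    rj = R[:, j].toarray().ravel()
    Rp = R.tolil()
    Rp[:, j] = 0                                  # R' = R with column j zeroed
    model = SGDRegressor(penalty='elasticnet',    # combined L1 + L2 regularization
                         alpha=alpha, l1_ratio=l1_ratio,
                         fit_intercept=False)
    model.fit(Rp.tocsr(), rj)
    return model.coef_
```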
  13–15. Results on reference datasets [charts]
  16. Results on Mendeley data
      ● Stacked readership counts, keyword counts
      ● 5M docs/keywords, 1M users, 140M non-zeros
      ● Constrained to ~100k target users
      ● Python implementation on top of scikit-learn
        – trivially parallelized with IPython
        – eats CPU but easy on AWS
      ● 10% improvement over nearest neighbours
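The talk parallelized these per-item fits with IPython; as a stand-in, here is the same fan-out using only the standard library and the fit_column_sgd sketch above (worker count is arbitrary):

```python
from multiprocessing import Pool

def fit_many_columns(R, item_ids, processes=4):
    """Fit one weight vector per target item, one task per item."""
    with Pool(processes) as pool:
        return pool.starmap(fit_column_sgd, [(R, j) for j in item_ids])
```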
  17. Our use case
      [diagram: R′ × W ≈ R; R′ stacks readership counts and keyword counts, with columns indexed by all users j; the output r_j is only evaluated for the target users]
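One plausible way to assemble the stacked R′ in that diagram with scipy, assuming readership counts and keyword counts sit in two sparse matrices whose columns are the same documents (all names and sizes here are illustrative):

```python
import scipy.sparse as sp

readership = sp.random(10000, 500, density=0.01, format='csr')  # users x documents
keywords   = sp.random(2000,  500, density=0.01, format='csr')  # keywords x documents
R_prime = sp.vstack([readership, keywords]).tocsc()             # one column per document
```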
  18. Tuning regularization constants
      ● Relate directly to business logic
        – want the sparsest similarity lists that are not too sparse
        – "too sparse" = number of items with fewer than k similar items
      ● Grid search with a small sample of items
      ● Empirically corresponds well to optimising recommendation accuracy on a validation set
        – but faster and easier
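A sketch of that recipe, reusing fit_column_sgd from above: grid-search the constants on a small item sample and keep the sparsest setting for which only a tolerable number of items end up with fewer than k similar items. The grid, k, and the tolerance are all placeholders:

```python
import numpy as np

def tune_regularization(R, sample_items, k=10, max_too_sparse=5,
                        alphas=(1e-5, 1e-4, 1e-3),
                        l1_ratios=(0.1, 0.5, 0.9)):
    best = None
    for alpha in alphas:
        for l1_ratio in l1_ratios:
            ws = [fit_column_sgd(R, j, alpha, l1_ratio) for j in sample_items]
            sizes = [np.count_nonzero(w) for w in ws]
            if sum(s < k for s in sizes) <= max_too_sparse:  # not "too sparse"
                mean_size = float(np.mean(sizes))
                if best is None or mean_size < best[0]:      # prefer sparser lists
                    best = (mean_size, alpha, l1_ratio)
    return best   # (mean list size, alpha, l1_ratio), or None if nothing qualified
```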
  19. Software release: mrec
      ● Wrote our own small framework
      ● Includes parallelized SLIM
      ● Support for evaluation
      ● BSD licence
      ● https://github.com/mendeley/mrec
      ● Please use it or contribute!
  20. Thanks for listening
      mark.levy@mendeley.com
      @gamboviol
      https://github.com/gamboviol
      https://github.com/mendeley/mrec
  21. Real World Requirements
      ● Cheap to compute
      ● Explicable
      ● Easy to tweak + combine with business rules
      ● Not a black box
      ● Work without ratings
      ● Handle multiple data sources
      ● Offer something to anonymous users
  22. Cold Start?
      ● Can't work miracles for new items
      ● Serve new users ASAP
      ● Fine to run a special system for either case
      ● Most content leaks
      ● … or is leaked
      ● All serious commercial content is annotated
  23. Degrees of Freedom for k-NN
      ● Input numbers from mining logs
      ● Temporal "modelling" (e.g. fake users)
      ● Data pruning (scalability, popularity bias, quality)
      ● Preprocessing (tf-idf, log/sqrt, …)
      ● Hand-crafted similarity metric
      ● Hand-crafted aggregation formula
      ● Postprocessing (popularity matching)
      ● Diversification
      ● Attention profile
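To make the preprocessing knob concrete, a toy example of one common combination, tf-idf reweighting followed by item-item cosine similarity (sizes and density are illustrative):

```python
import scipy.sparse as sp
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.preprocessing import normalize

R = sp.random(1000, 200, density=0.02, format='csr')  # user-item interaction counts
X = TfidfTransformer().fit_transform(R)               # damp very popular items
X = normalize(X, axis=0)                              # unit-norm item columns
item_sims = (X.T @ X).toarray()                       # item-item cosine similarities
```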
