3. Data Set
Total ratings: 100,000
Number of users: 943
Number of items: 1,682
Density (ratings / possible user-item pairs): 0.0630
4. Evaluation Methods
• We use the hold-out cross-validation method for the experiments.
• We randomly select 5% of the ratings for testing and 5% for validation.
• We repeat this process 3 times and average the results.
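The hold-out protocol above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the toy data are hypothetical, and the split simply shuffles the rating triples and carves off the test and validation fractions.

```python
import random

def holdout_split(ratings, test_frac=0.05, val_frac=0.05, seed=0):
    """Randomly partition (user, item, rating) triples into
    train / validation / test sets (hold-out protocol)."""
    rng = random.Random(seed)
    shuffled = ratings[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

# Toy data: 100 synthetic ratings (10 users x 10 items).
ratings = [(u, i, 3.0) for u in range(10) for i in range(10)]
train, val, test = holdout_split(ratings)
```

Repeating the call with seeds 0, 1, 2 and averaging the resulting MAE values reproduces the "repeat 3 times and average" step.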
5. User-based
Neighbors can have different levels of similarity.
w_uv: similarity of users u and v.
r_vi: rating of user v for item i.
N_i(u): set of neighbors of u who have rated item i.
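Using these symbols, a user-based prediction can be sketched as below. The slides do not show the exact formula, so this assumes a common form: the target user's mean plus a similarity-weighted average of mean-centered neighbor ratings over N_i(u). The similarity values in the toy example are hypothetical.

```python
def predict_user_based(u, i, ratings, sim, k=30):
    """Predict r_ui from the k most similar users who rated item i,
    i.e. the set N_i(u), using mean-centered weighted averaging
    (assumed form; the slides only define the symbols)."""
    mean = lambda v: sum(ratings[v].values()) / len(ratings[v])
    # N_i(u): up to k most similar users v (by w_uv) with a rating r_vi.
    neighbors = sorted((v for v in ratings if v != u and i in ratings[v]),
                       key=lambda v: sim(u, v), reverse=True)[:k]
    num = den = 0.0
    for v in neighbors:
        w = sim(u, v)                       # w_uv
        num += w * (ratings[v][i] - mean(v))
        den += abs(w)
    return mean(u) if den == 0 else mean(u) + num / den

# Toy example with a hand-made similarity table (hypothetical values).
ratings = {'u1': {'a': 4, 'b': 2},
           'u2': {'a': 5, 'b': 1, 'c': 4},
           'u3': {'a': 1, 'c': 2}}
w = {frozenset(('u1', 'u2')): 0.9, frozenset(('u1', 'u3')): 0.1,
     frozenset(('u2', 'u3')): 0.0}
sim = lambda a, b: w[frozenset((a, b))]
pred = predict_user_based('u1', 'c', ratings, sim)
```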
6. Item-based
r_uj: rating of user u for item j.
N_u(i): the items rated by user u that are most similar to item i.
w_ij: similarity of items i and j.
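The item-based counterpart can be sketched the same way. Again the exact weighting is not shown on the slides, so this assumes the standard similarity-weighted average of the user's own ratings r_uj over N_u(i); the toy similarities are hypothetical.

```python
def predict_item_based(u, i, ratings, item_sim, k=30):
    """Predict r_ui as a weighted average of u's ratings r_uj over
    N_u(i): the k items rated by u most similar to i
    (assumed standard form; the slides only define the symbols)."""
    # N_u(i): up to k items j already rated by u, ranked by w_ij.
    neighbors = sorted((j for j in ratings[u] if j != i),
                       key=lambda j: item_sim(i, j), reverse=True)[:k]
    num = den = 0.0
    for j in neighbors:
        w = item_sim(i, j)                  # w_ij
        num += w * ratings[u][j]            # w_ij * r_uj
        den += abs(w)
    return num / den if den else 0.0

# Toy example with hypothetical item similarities.
ratings = {'u': {'a': 4, 'b': 2}}
s = {frozenset(('c', 'a')): 0.8, frozenset(('c', 'b')): 0.2}
item_sim = lambda i, j: s[frozenset((i, j))]
pred = predict_item_based('u', 'c', ratings, item_sim)
```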
9. LSH for Prediction
L : number of hash tables (bands)
C_vi(t): the set of candidate users retrieved from hash table t who have rated item i.
r_vi: rating of user v (in C) on item i.
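A minimal sketch of the LSH candidate retrieval described above, assuming MinHash signatures over each user's rated-item set (the slides do not specify the hash family, so the hashing scheme, parameters, and names here are illustrative): the signature is split into L bands, one hash table per band, and a user's candidates are the union of its bucket mates across the L tables.

```python
import random
from collections import defaultdict

def build_lsh(user_items, n_hashes=12, n_tables=3, seed=0):
    """MinHash each user's rated-item set, split the signature into
    L = n_tables bands, and bucket users per band (one hash table
    per band)."""
    rng = random.Random(seed)
    masks = [rng.getrandbits(32) for _ in range(n_hashes)]
    sigs = {u: tuple(min(hash(i) ^ m for i in items) for m in masks)
            for u, items in user_items.items()}
    rows = n_hashes // n_tables          # signature rows per band
    tables = [defaultdict(set) for _ in range(n_tables)]
    for u, sig in sigs.items():
        for t in range(n_tables):
            tables[t][sig[t * rows:(t + 1) * rows]].add(u)
    return sigs, tables, rows

def candidates(u, sigs, tables, rows):
    """Union over the L tables of u's bucket mates: users sharing at
    least one full band with u. Restricting this set to users who
    rated item i gives C_vi(t) aggregated over t."""
    cand = set()
    for t, table in enumerate(tables):
        cand |= table[sigs[u][t * rows:(t + 1) * rows]]
    cand.discard(u)
    return cand

# Users with identical item sets collide in every band; disjoint
# sets here produce no collisions with this seed.
user_items = {'a': {1, 2, 3}, 'b': {1, 2, 3}, 'c': {9, 10, 11}}
sigs, tables, rows = build_lsh(user_items)
cand_a = candidates('a', sigs, tables, rows)
```

Prediction then proceeds as in the user-based method, but only over the retrieved candidates instead of all users, which is the source of the speed-up.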
10. Computational Complexity
|U|: user set size
|I|: item set size
k: number of neighbors used in the predictions
p: maximum number of ratings per user
q: maximum number of ratings per item
15. Results
User-based
With the optimal k = 30 and Y = 7:
• Average MAE: 0.79527
• Average running time: 9.437 seconds
We compare these results against the LSH method.
20. Conclusion
• LSH tremendously improved scalability.
• Accuracy decreased, but stayed within an acceptable range.
• Running time improved substantially.
• LSH must be configured to balance MAE against running time, according to the expectations from the system.