
LSH for
 Prediction Problem in Recommendation

Using LSH to predict user ratings on items.


  1. 1. LSH for Prediction Problem in Recommendation. Maruf Aytekin, PhD Student, Computer Engineering Department, Bahcesehir University. May 5, 2015
  2. 2. Outline • User-based • Item-based • LSH • Parameters • Model Build Performance • Accuracy Performance • LSH Parameters
  3. 3. Data Set • Total ratings: 100,000 • Number of users: 943 • Number of items: 1,682 • Sparsity = 0.0630, computed as the fraction of user-item pairs that have a rating: 100,000 / (943 × 1,682)
  4. 4. Evaluation Methods • We use the hold-out cross-validation method for the experiments. • We randomly select 5% of the data for testing and 5% for validation. • We repeat this process 3 times and average the results.
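
The hold-out procedure on this slide can be sketched as follows. This is a minimal illustration, not the author's code: the function name `holdout_split`, the seed handling, and the toy rating triples are all assumptions made for the example.

```python
import random

def holdout_split(ratings, test_frac=0.05, valid_frac=0.05, seed=0):
    """Randomly split ratings into train, validation, and test lists."""
    rng = random.Random(seed)
    shuffled = list(ratings)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_valid = int(len(shuffled) * valid_frac)
    test = shuffled[:n_test]
    valid = shuffled[n_test:n_test + n_valid]
    train = shuffled[n_test + n_valid:]
    return train, valid, test

# Toy data (hypothetical): 100 (user, item, rating) triples.
ratings = [(u, i, (u + i) % 5 + 1) for u in range(10) for i in range(10)]

# Three repetitions with different random splits; in an experiment the
# metric would be computed on each split and then averaged.
splits = [holdout_split(ratings, seed=s) for s in range(3)]
```

Using a different seed per repetition gives three independent random splits, which is one plausible reading of "repeat this process 3 times".
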
  5. 5. User-based • Neighbors can have different levels of similarity. • Wuv: similarity of users u and v. • rvi: rating of user v for item i. • Ni(u): set of neighbors of user u who have rated item i.
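
The prediction formula for this slide is not part of the transcript. As a minimal sketch, assuming the common similarity-weighted average over the neighbors in Ni(u), a user-based prediction could look like this. The function name, the dictionary layout, and the toy values are hypothetical; the similarities Wuv are taken as precomputed.

```python
def predict_user_based(u, i, ratings, sim, k=30):
    """Predict the rating of user u for item i.

    ratings: dict user -> dict item -> rating (rvi)
    sim:     dict (u, v) -> similarity (Wuv), assumed precomputed
    k:       number of neighbors used in the prediction
    """
    # Ni(u): the k users most similar to u among those who rated item i.
    raters = [v for v in ratings
              if v != u and i in ratings[v] and (u, v) in sim]
    neighbors = sorted(raters, key=lambda v: sim[(u, v)], reverse=True)[:k]

    # Assumed formula: sum(Wuv * rvi) / sum(|Wuv|) over the neighbors.
    num = sum(sim[(u, v)] * ratings[v][i] for v in neighbors)
    den = sum(abs(sim[(u, v)]) for v in neighbors)
    return num / den if den else None

# Toy data (hypothetical).
ratings = {
    "u":  {"a": 4},
    "v1": {"a": 5, "b": 4},
    "v2": {"b": 2},
    "v3": {"a": 1},
}
sim = {("u", "v1"): 0.9, ("u", "v2"): 0.1, ("u", "v3"): 0.5}

prediction = predict_user_based("u", "b", ratings, sim)
```

Here only v1 and v2 have rated item "b", so the more similar neighbor v1 dominates the prediction.
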
  6. 6. Item-based • ruj: rating of user u for item j. • Nu(i): the items rated by user u that are most similar to item i. • Wij: similarity of items i and j.
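
The item-based formula is likewise missing from the transcript. A minimal sketch, assuming the usual similarity-weighted average over Nu(i), is shown below. The function name, data layout, and toy values are hypothetical, and the item similarities Wij are taken as precomputed.

```python
def predict_item_based(u, i, ratings, item_sim, k):
    """Predict the rating of user u for item i from u's own ratings.

    ratings:  dict user -> dict item -> rating (ruj)
    item_sim: dict (i, j) -> similarity (Wij), assumed precomputed
    k:        number of neighbor items used in the prediction
    """
    # Nu(i): the k items rated by u that are most similar to item i.
    rated = [j for j in ratings.get(u, {})
             if j != i and (i, j) in item_sim]
    neighbors = sorted(rated, key=lambda j: item_sim[(i, j)], reverse=True)[:k]

    # Assumed formula: sum(Wij * ruj) / sum(|Wij|) over the neighbor items.
    num = sum(item_sim[(i, j)] * ratings[u][j] for j in neighbors)
    den = sum(abs(item_sim[(i, j)]) for j in neighbors)
    return num / den if den else None

# Toy data (hypothetical).
ratings = {"u": {"a": 5, "b": 3}}
item_sim = {("c", "a"): 0.75, ("c", "b"): 0.25}

prediction = predict_item_based("u", "c", ratings, item_sim, k=2)
```
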
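  7. 7. Locality Sensitive Hashing: AND-Construction [Figure: users U1, U2, U3, ..., Um are passed through hash functions H1, H2, ..., each producing a value in [0,1]; the values are concatenated into a bucket key. Example buckets: bucket 1 (key: 0101), bucket 2 (key: 1110), bucket 3 (key: 1101), bucket 4 (key: 1001). Each bucket holds a group of users, such as U7 U11 U10, U13 U39 Um, U1 U3 U9, and U2 U5 U6.]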
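
The AND-construction on this slide concatenates the bits from several hash functions into one bucket key. The transcript does not say which hash family is used, so the sketch below assumes sign random projections (one random hyperplane per hash function), a common choice for cosine similarity. The function names and the toy rating vector are hypothetical.

```python
import random

def make_hash_functions(dim, K, seed=0):
    """Create K random hyperplanes; each one acts as a one-bit hash function."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(K)]

def bucket_key(vector, hyperplanes):
    """AND-construction: concatenate the K bits into a single key string."""
    bits = []
    for plane in hyperplanes:
        dot = sum(x * w for x, w in zip(vector, plane))
        bits.append("1" if dot >= 0 else "0")
    return "".join(bits)

# Toy rating vector over 6 items (hypothetical); K = 4 gives 4-bit keys.
hyperplanes = make_hash_functions(dim=6, K=4)
user_vector = [5, 0, 3, 0, 1, 4]
key = bucket_key(user_vector, hyperplanes)
```

Two users land in the same bucket only if all K bits agree, which is why a larger K makes buckets smaller and more selective.
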
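  8. 8. Hash Tables [Figure: L = 2 hash tables (t = 1 and t = 2), each with K = 4 hash functions. The candidate set for U5, C(U5), is gathered from the hash tables; the users shown are U2, U6, U1, U3, ...]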
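
A candidate set such as C(U5) can be gathered by looking up the target user's bucket in each of the L tables and taking the union. The sketch below is one way to do this under the same sign-random-projection assumption; the helper names `sign_key`, `build_tables`, and `candidate_set` and the toy vectors are hypothetical.

```python
import random
from collections import defaultdict

def sign_key(vec, planes):
    """K-bit bucket key from K random hyperplanes (assumed hash family)."""
    return "".join(
        "1" if sum(x * w for x, w in zip(vec, p)) >= 0 else "0"
        for p in planes
    )

def build_tables(user_vectors, L=2, K=4, seed=0):
    """Build L hash tables, each with its own K hash functions."""
    rng = random.Random(seed)
    dim = len(next(iter(user_vectors.values())))
    tables = []
    for _ in range(L):
        planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(K)]
        buckets = defaultdict(set)
        for user, vec in user_vectors.items():
            buckets[sign_key(vec, planes)].add(user)
        tables.append((planes, buckets))
    return tables

def candidate_set(user, user_vectors, tables):
    """Union of the users sharing a bucket with `user` in any table."""
    candidates = set()
    for planes, buckets in tables:
        candidates |= buckets[sign_key(user_vectors[user], planes)]
    candidates.discard(user)
    return candidates

# Toy vectors (hypothetical): U2 rates exactly like U1, U3 is the opposite.
user_vectors = {
    "U1": [1.0, 2.0, -1.0],
    "U2": [1.0, 2.0, -1.0],
    "U3": [-1.0, -2.0, 1.0],
}
tables = build_tables(user_vectors, L=2, K=4)
candidates = candidate_set("U1", user_vectors, tables)
```

Adding more tables (larger L) can only grow the candidate set, which trades extra lookups for a better chance of retrieving similar users.
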
  9. 9. LSH for Prediction • L: number of hash tables (bands). • Cvi(t): the set of candidate users retrieved from hash table t who have rated item i. • rvi: rating of user v (in C) on item i.
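
The prediction formula itself is not in the transcript. One reading consistent with the definitions on this slide is a plain average of rvi over the candidates from all L tables, counting a user once per table in which they appear. The sketch below implements that assumed reading; the function name and the toy data are hypothetical.

```python
def predict_lsh(u, i, ratings, candidates_per_table):
    """Average the ratings of item i over the candidates from every table.

    ratings:              dict user -> dict item -> rating (rvi)
    candidates_per_table: list of L sets, the candidates of u in table t
    """
    total, count = 0.0, 0
    for candidates in candidates_per_table:  # t = 1 .. L
        for v in candidates:
            # Cvi(t): keep only candidates who have rated item i.
            if v != u and i in ratings.get(v, {}):
                total += ratings[v][i]
                count += 1
    return total / count if count else None

# Toy data (hypothetical): L = 2 tables of candidates for user U5.
ratings = {"U1": {"a": 4}, "U2": {"a": 2}, "U3": {"b": 5}}
candidates_per_table = [{"U1", "U2"}, {"U1", "U3"}]

prediction = predict_lsh("U5", "a", ratings, candidates_per_table)
```

Under this assumption no pairwise similarity is computed at prediction time; a user who collides with the target in several tables simply contributes more to the average.
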
  10. 10. Computational Complexity • |U|: user set size • |I|: item set size • k: number of neighbors used in the predictions • p: maximum number of ratings per user • q: maximum number of ratings per item
  11. 11. Parameters (CF)
  12. 12. LSH Parameters
  13. 13. LSH Parameters
  14. 14. Model Build Time
  15. 15. Results
 User-based With the optimum k = 30 and Y = 7: • Average MAE: 0.79527 • Average running time: 9.437 seconds. We compare these results with the LSH method.
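
MAE (mean absolute error) is the average absolute difference between the actual and the predicted ratings. A minimal sketch of the computation follows; the function name and the toy values are illustrative and are not the experiment's data.

```python
def mean_absolute_error(actual, predicted):
    """MAE: mean of |actual - predicted| over paired ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Toy values (hypothetical).
actual = [4, 3, 5, 2]
predicted = [3.5, 3.0, 4.0, 2.5]
mae = mean_absolute_error(actual, predicted)
```
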
  16. 16. LSH & User-based
 Hash Functions
  17. 17. LSH & User-based
 Hash Functions
  18. 18. LSH & User-based
 Hash Tables
  19. 19. LSH & User-based
 Hash Tables
  20. 20. Conclusion • LSH greatly improved scalability. • Accuracy decreased, but within an acceptable range. • Running-time performance improved substantially. • LSH needs to be configured to balance MAE and performance according to what is expected from the system.
  21. 21. Source Code User-based Prediction:
  22. 22. Source Code LSH Prediction:
  23. 23. Q&A
