Benchmarking NODE against Collaborative Filtering

Albert Azout, Boris Chen, Giri Iyengar, PhD
Sociocast Networks LLC, New York, New York
albert.azout@sociocast.com, boris.chen@sociocast.com, giri.iyengar@sociocast.com

January 8, 2013

Abstract

We have benchmarked Sociocast's proprietary NODE algorithm against the popular collaborative filtering algorithm for the task of predicting social bookmarking activity of internet users. Our results indicate that NODE performs between 4 and 10 times better in precision, recall, and F1 score compared to collaborative filtering. This performance holds across varying levels of the prediction window.

1 Introduction

Recommender systems have become widely used across many industries - e.g. Netflix for movies, Pandora for music, and Amazon for consumer products. This technology has also been studied in the statistics and machine learning research communities. The underlying problem is to form product recommendations based on previously recorded data [1, 11]. Better recommendations can improve customer loyalty by helping customers find products of interest they were previously unaware of [2].

Ansari, Essegaier, and Kohli (2000) categorize recommendation systems into two types: collaborative filtering and content-based approaches [3]. In this short study, we benchmark the performance of Sociocast's NODE algorithm against collaborative filtering. NODE incorporates the time dimension into its core similarity function between users. In collaborative filtering, by contrast, temporal dynamics are usually introduced through additional parameters; the resulting explosion in the number of parameters makes the algorithm unscalable and impractical for most use cases (cf. the Netflix prize winning algorithm [12]).
2 Testing Methodology

2.1 Delicious Dataset

We use the Delicious (a Yahoo! company) dataset that is publicly available. This dataset represents bookmarking activity by 210,000 users on the www.delicious.com website over a period of 10 days (Sept 5th, 2009 - Sept 14th, 2009). The first eight days are given to both algorithms for training, and the last two days are withheld as the ground truth for testing. We restrict the dataset to only those users who had at least 10 bookmarks in this period. This leaves 14,337 users with 600,752 bookmarks over the 8-day training period and another 136,164 bookmarks over the test period.

2.2 Bookmark Classification

Each user-provided bookmark corresponds to a live URL. We classify each URL into a space of 434 classes using a proprietary machine-learning-based classifier trained on a custom-curated corpus. These classes correspond to the 2nd level of the IAB standard taxonomy. An example classification of a URL could be "Sports, Basketball" or "Technology Products, Laptops". Each URL is allowed up to three classifications. The prediction task can then be thought of as predicting which classes or topics each user will bookmark next, based on their previous bookmarking activity.

2.3 Collaborative Filtering

User-based collaborative filtering [4, 5, 6] is a memory-based algorithm which mimics word-of-mouth behavior for rating data. The intuition is that users with similar preferences will rate items similarly. Missing ratings for a user can be predicted by finding a neighborhood of similar users and then aggregating the ratings of these users to form a prediction. A neighborhood of similar users can be defined with either the Pearson correlation coefficient or cosine similarity:

    sim_{Pearson}(x, y) = \frac{\sum_{i \in I} (x_i - \bar{x})(y_i - \bar{y})}{(|I| - 1)\, sd(x)\, sd(y)}    (1)

    sim_{cosine}(x, y) = \frac{x \cdot y}{\|x\| \, \|y\|}    (2)

where I is the set of items, x and y are the row vectors in the rating matrix R corresponding to two users' profile vectors, sd(·) is the standard deviation, and ||·|| is the l2 norm of a vector.
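The two similarity measures above can be sketched directly on dense rating vectors. This is an illustrative sketch, not code from the study; the function names are our own:

```python
import numpy as np

def sim_pearson(x, y):
    """Pearson correlation between two users' rating vectors (Eq. 1).

    Uses the sample standard deviation (ddof=1), matching the
    (|I| - 1) normalization in the denominator.
    """
    num = np.sum((x - x.mean()) * (y - y.mean()))
    den = (len(x) - 1) * x.std(ddof=1) * y.std(ddof=1)
    return num / den

def sim_cosine(x, y):
    """Cosine similarity between two users' rating vectors (Eq. 2)."""
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
```

For perfectly linearly related profiles both measures return 1.0; in a real rating matrix, missing entries would first have to be handled, e.g. by restricting each pair to co-rated items.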
Once the users in a neighborhood of an active user, N(a) ⊂ U, are found by taking a threshold on the similarity or by taking the k nearest neighbors, the easiest way to form predicted ratings is to average the ratings in the neighborhood:

    \hat{r}_{aj} = \frac{1}{\sum_{i \in N(a)} s_{ai}} \sum_{i \in N(a)} s_{ai} \, r_{ij}    (3)

where s_{ai} is the similarity between the active user u_a and user u_i in the neighborhood.
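Equation (3) is a similarity-weighted average over the neighborhood. A minimal sketch for a single active user and item, with illustrative names:

```python
import numpy as np

def predict_rating(neighbor_sims, neighbor_ratings):
    """Predicted rating r_aj as the s_ai-weighted average of the
    neighbors' ratings r_ij on item j (Eq. 3)."""
    s = np.asarray(neighbor_sims, dtype=float)
    r = np.asarray(neighbor_ratings, dtype=float)
    return np.sum(s * r) / np.sum(s)
```

For example, `predict_rating([0.9, 0.3], [4, 2])` weights the more similar neighbor's rating more heavily, yielding 3.5 rather than the unweighted mean of 3.0.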
In some data sets where numeric ratings are not appropriate or only binary data is available, a version of CF using 0-1 data is available [7, 8]. The Delicious dataset is best represented this way, where each rating r_{jk} ∈ {0, 1} can be defined as:

    r_{jk} = \begin{cases} 1 & \text{if user } u_j \text{ bookmarked item } i_k \\ 0 & \text{otherwise} \end{cases}

A similarity measure which only focuses on matching ones and avoids the ambiguity of zeroes representing either missing ratings or negative examples is the Jaccard index:

    sim_{Jaccard}(\mathcal{X}, \mathcal{Y}) = \frac{|\mathcal{X} \cap \mathcal{Y}|}{|\mathcal{X} \cup \mathcal{Y}|}    (4)

where X and Y are the sets of items with a 1 in the profiles of users u_a and u_b, respectively.

3 Evaluation and Results

We ask each algorithm to generate the top-N recommended items for each user (where N can vary), based on the training period. Each recommended item can then be checked for whether or not it appears in the withheld ground truth period. The results can be summarized with the classical binary classification confusion matrix. Precision, recall, and F1 are popular metrics used in information retrieval [9, 10]:

    Precision = \frac{\text{correctly recommended items}}{\text{total recommended items}}

    Recall = \frac{\text{correctly recommended items}}{\text{total useful recommendations}}

    F1 = 2 \cdot \frac{Precision \cdot Recall}{Precision + Recall}

The tables below summarize the performance of the two algorithms for different levels of N, where N is the number of recommendations each algorithm is forced to make for each user. Each recommendation is then evaluated against the ground truth set, then tallied using precision, recall, and F1.

Precision
 N    NODE     CF      Factor of Improvement
 1    35.31%   4.22%   8.37
 2    31.01%   3.11%   9.93
 5    23.50%   3.66%   6.42
 10   18.19%   3.03%   6.01
 15   15.05%   3.87%   3.89

Recall
 N    NODE     CF      Factor of Improvement
 1     5.43%   0.65%   8.37
 2     9.53%   0.96%   9.93
 5    18.06%   2.81%   6.42
 10   27.97%   4.65%   6.01
 15   34.68%   8.92%   3.89
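The Jaccard similarity of Eq. (4) and the per-user top-N scoring described above can be sketched as follows. The function names are illustrative; this is not the evaluation code used in the study:

```python
def sim_jaccard(X, Y):
    """Jaccard index between two users' sets of bookmarked items (Eq. 4)."""
    X, Y = set(X), set(Y)
    union = X | Y
    return len(X & Y) / len(union) if union else 0.0

def top_n_metrics(recommended, ground_truth):
    """Precision, recall, and F1 for one user's top-N recommendation list,
    scored against the withheld ground-truth items."""
    hits = len(set(recommended) & set(ground_truth))
    precision = hits / len(recommended)
    recall = hits / len(ground_truth)
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1
```

Averaging these per-user scores over all users would give table entries of the kind reported below; note that for a fixed N, precision and recall differ only in their denominators.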
F1 score
 N    NODE     CF      Factor of Improvement
 1     9.41%   1.12%   8.37
 2    14.58%   1.46%   9.93
 5    20.43%   3.18%   6.42
 10   22.04%   3.67%   6.01
 15   20.98%   5.39%   3.89

NODE consistently outperforms CF by a factor of 3.89 to 9.93 in both precision and recall. Note that the factor of improvement is consistent across all metrics, since both algorithms are forced to make the same number of predictions, and the ground truth set is also the same for both algorithms.

References

[1] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl. Analysis of recommendation algorithms for e-commerce. In EC '00: Proceedings of the 2nd ACM Conference on Electronic Commerce, pages 158-167. ACM, 2000.

[2] J. B. Schafer, J. A. Konstan, and J. Riedl. E-commerce recommendation applications. Data Mining and Knowledge Discovery, 5(1/2):115-153, 2001.

[3] A. Ansari, S. Essegaier, and R. Kohli. Internet recommendation systems. Journal of Marketing Research, 37:363-375, 2000.

[4] D. Goldberg, D. Nichols, B. M. Oki, and D. Terry. Using collaborative filtering to weave an information tapestry. Communications of the ACM, 35(12):61-70, 1992.

[5] P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl. GroupLens: an open architecture for collaborative filtering of netnews. In CSCW '94: Proceedings of the 1994 ACM Conference on Computer Supported Cooperative Work, pages 175-186. ACM, 1994.

[6] U. Shardanand and P. Maes. Social information filtering: Algorithms for automating 'word of mouth'. In Conference Proceedings on Human Factors in Computing Systems (CHI '95), pages 210-217, Denver, CO, May 1995. ACM Press/Addison-Wesley Publishing Co.

[7] A. Mild and T. Reutterer. An improved collaborative filtering approach for predicting cross-category purchases based on binary market basket data. Journal of Retailing and Consumer Services, 10(3):123-133, 2003.

[8] J.-S. Lee, C.-H. Jun, J. Lee, and S. Kim. Classification-based collaborative filtering using market basket data. Expert Systems with Applications, 29(3):700-704, October 2005.

[9] G. Salton and M. McGill. Introduction to Modern Information Retrieval. McGraw-Hill, New York, 1983.

[10] C. van Rijsbergen. Information Retrieval. Butterworth, London, 1979.
[11] M. Hahsler. recommenderlab: A Framework for Developing and Testing Recommendation Algorithms. 2011.

[12] Y. Koren. Collaborative filtering with temporal dynamics. In KDD '09: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, 2009. ACM.