Recommendation system based on association rules applied to consistent behavior over time


International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367 (Print), ISSN 0976-6375 (Online), Volume 4, Issue 4, July-August (2013), pp. 412-421, © IAEME: www.iaeme.com/ijcet.asp. Journal Impact Factor (2013): 6.1302 (calculated by GISI), www.jifactor.com

RECOMMENDATION SYSTEM BASED ON ASSOCIATION RULES APPLIED TO CONSISTENT BEHAVIOR OVER TIME

Paulo J. G. Lisboa (1), Huda Naji Nawaf (2), Wesam S. Bhaya (3)
(1) Liverpool John Moores University, (2) University of Babylon, (3) University of Babylon

ABSTRACT

People's behavior and desires change with time, so taking this factor into account may enhance the quality of the recommendations made by recommender systems. We propose a method to improve recommendation systems by taking into consideration changes in the behavior of users over time, through the analysis of consistency in user behavior over two different time intervals. A sequence alignment tool is used to create two similarity networks of users, one for each time period. A single similarity network is then obtained by multiplying the two edge values of the similarity networks derived for the different time periods, which enhances the similarity between users whose behavior is consistently similar in both periods. Communities of users are then found using spectral algorithms, and recommendations are made within each community with Association Rules modeling. The results are benchmarked against a comparable method in which the two time intervals are merged into a single time period, and against the association rules method applied to a single population model. The performance of predictions made with association rules is also compared against the performance of predictions made with the Naïve Bayes algorithm. The two algorithms are evaluated on the HetRec 2011 dataset using standard accuracy metrics, e.g. recall and precision.

Our experimental evaluation shows that the proposed association rules-based recommender system, which measures consistency of behavior over time and segments the population of users into communities, gives marginal but consistent improvements over standard personalization methods that do not take account of behavioral consistency over time, and over single population models. Moreover, recommendations made with Association Rules consistently improve on Naïve Bayes models in terms of accuracy and F-values.

Keywords: Recommendation System, Association Rule, Sequence Alignment, Similarity Networks, Communities of Users.

1. INTRODUCTION

Recommender systems developed largely through their wide use in well-known commercial systems (e.g. Amazon.com, Netflix.com), which try to attract customers by offering them the products they are likely to prefer.
There are three approaches to recommendation systems: collaborative filtering, content-based filtering, and hybrid approaches. Collaborative filtering recommends items to a user depending on the tastes of the user's neighborhood. Content-based systems recommend items based on the item's neighborhood. Hybrid approaches exploit both content-based and collaborative filtering facilities [1].

Most researchers in the collaborative filtering field have studied the behavior of users in order to develop these systems, because their success depends heavily on knowledge of user behavior, which may change from one period to another. This change may be reflected in the affiliation of a user to communities: a user may belong to one community at one time and to another community at another time. This changes the structure of the network, which in turn affects the quality of the recommendations.

In this paper, we suggest a technique to improve collaborative filtering. We focus on enhancing recommendations by analyzing the behavior of users in two different time periods, modeling the behavior of each user as a chronologically arranged sequence of actions. A sequence alignment tool is used to align these sequences of users in order to create similarity networks. The key steps of this work are twofold: (i) first, we build similarity networks to model the stability of user behavior over time; (ii) second, we use the most recent behavior of users for evaluation purposes only, i.e. this behavior is not subject to analysis.

The evaluation of a recommendation system is a very important step. The evaluation tells us to what extent the system's predictions reflect the real world. Most recommendation methodologies are tested by splitting the dataset into a training set and a test set.
In this paper, each user's sequence is split into three periods; the last one is the test part. Because each sequence is arranged chronologically, this part of the sequence reflects the future behavior of the user. In our work we therefore try to predict the future behavior of the user, which makes the testing more realistic.

The rest of this paper is organized as follows. Section 2 presents the dataset and how the sequences of users are extracted from it. Section 3 describes the neighborhood models. In section 4, we discuss association rules as the prediction algorithm. Performance measures for the proposed system are presented in section 5. Experimental results are given in section 6 and discussed in section 7.

2. DATASET AND EXTRACTION SEQUENCES OF USERS

In our study, the HetRec 2011 data sets (MovieLens rating data with IMDb and Rotten Tomatoes links) have been used. The datasets were generated by the Information Retrieval Group at Autónoma University of Madrid (http://ir.ii.uam.es) and contain about 856,000 ratings from 2113 users. The dataset is associated with IMDb (http://www.imdb.com/genre/) to create 19 genres for movies. The data was collected between September 1997 and December 2009. The sequence for each user is extracted from the dataset as follows:

i. Collect only the ids of movies and the time stamps associated with an individual user.
ii. Arrange the movies chronologically according to the time stamp to create a sequence of movie viewing actions.
iii. Divide the sequence into three periods:
   a- 1st period from 1-Sep-1997 to 1-Oct-2001.
   b- 2nd period from 2-Oct-2001 to 1-Nov-2005.
   c- 3rd period from 2-Nov-2005 to 30-Dec-2009.
iv. Map each movie id to its genre according to the movie taxonomy.

Figure 1 shows the activity of users over time in the dataset. Activity increases roughly linearly until 2007; users were least active in the first period and most active in the third (evaluation) period. Figure 2, which shows the number of active users in each period, follows the same curve. Only users involved in all three periods can be used, so the number of usable users is small, because users do not all sustain the same level of activity.

Figure 1: Activities of users over time

Figure 2: The active users in each period

3. NEIGHBORHOOD MODELS

Neighborhood models are created from the similarity relationships among either users or items. We are interested in finding neighborhoods based on the similarity relationships among users. Finding a neighborhood requires two steps:

i. Create similarity networks.
ii. Analyze these similarity networks to detect communities (neighborhoods).

3.1. Create Similarity Networks

Sequence alignment tools can be used to build similarity networks of users. Sequence alignment is a sequence comparison method originally applied in bioinformatics and modified here for user behavior. If we can create sequences from the movies viewed by all users, then we can compare these sequences in the same way as in bioinformatics. Any dataset can be subjected to this comparison if the following hold [2]:

1- The activities can be placed into meaningful collections.
2- There is a temporal dimension for each activity.
3- There is a dimension along which each activity can be compared with another to obtain a similarity score.
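The genre sequences of Section 2 satisfy these three conditions: genres are the meaningful collections, time stamps give the temporal dimension, and equality of genres gives a comparison score. As a minimal sketch of extraction steps i to iv, the Python code below builds the three period sequences for each user. The input layout (a list of `(user, movie, timestamp)` tuples), the `movie_genre` dictionary that maps each movie to a single genre, and all function names are assumptions made for illustration; they are not taken from the HetRec files or from the original implementation.

```python
from collections import defaultdict
from datetime import datetime

# Last day of each period, as listed in Section 2.
PERIOD_ENDS = [datetime(2001, 10, 1), datetime(2005, 11, 1), datetime(2009, 12, 30)]


def period_index(ts):
    """Return 0, 1 or 2 for the period containing ts, or None if ts is later."""
    for idx, end in enumerate(PERIOD_ENDS):
        if ts.date() <= end.date():
            return idx
    return None


def build_sequences(ratings, movie_genre):
    """Steps i-iv: collect, sort by time, split into periods, map to genres.

    ratings     -- iterable of (user, movie, timestamp) tuples (assumed layout)
    movie_genre -- dict mapping a movie id to one genre label (assumed)
    """
    per_user = defaultdict(list)
    for user, movie, ts in ratings:
        per_user[user].append((ts, movie))

    sequences = {}
    for user, events in per_user.items():
        events.sort(key=lambda event: event[0])  # chronological order
        periods = [[], [], []]
        for ts, movie in events:
            idx = period_index(ts)
            if idx is not None and movie in movie_genre:
                periods[idx].append(movie_genre[movie])
        sequences[user] = periods
    return sequences


def users_in_all_periods(sequences):
    """Keep only users with at least one action in each of the three periods."""
    return {user: p for user, p in sequences.items() if all(p)}
```

Filtering with `users_in_all_periods` mirrors the restriction to users involved in all three periods; the first two sequences of each user would feed the alignment step and the third would be held out for evaluation.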
Optimal Matching is a global alignment algorithm applied in the social sciences to analyze non-biological sequences, analogous to the Needleman-Wunsch algorithm applied in bioinformatics [3]. As mentioned in section 2, each sequence is split into three periods, and the first two periods are subjected to analysis. Optimal matching is therefore applied to the first two periods of all user sequences to create two similarity networks. By multiplying the two similarity networks, a single similarity network is obtained which shows the users with similar behavior over both periods.

Optimal Matching aligns two sequences using three operations, namely replacement, insertion and deletion. A combined insertion and deletion operation is called an indel [3]. For example, given two sequences, the best alignment between them is shown in figure 3:

Figure 3: The optimal alignment between two sequences. Vertical lines indicate matching items in the two sequences.

The resulting alignment has 2 mismatches, 6 matches, and 4 gaps. In principle, a weight is assigned to each replacement, insertion or deletion [3]. In classical optimal matching these weights are the 'costs' of making the alignment, and their total is a measure of the dissimilarity between the two sequences. In our experiments, the best results were obtained with the following weights, which reward matches so that a higher score means greater similarity:

Gap and mismatch = 0, Match = 2

So, the final score for this alignment is (6*2)+(4*0)+(2*0)=12.

The difference in length between sequences is sometimes a problem. Abbott and others dealt with this problem by dividing the final alignment score by the length of the longer sequence of the pair [3]. In our work, we deal with the problem by dividing the final alignment score by the length of the resulting alignment, as in K. Duraiswamy and V. Valli Mayil [4].

Some other papers also deal with sequences of actions, in the fields of web traffic and e-commerce. In particular, P. Liu and L. Hai propose a Sequence Alignment Technique (SAT) collaborative filtering (CF) recommendation methodology using Product Taxonomy (SAT-PT) to address the sparsity and scalability problems of current CF-based recommender systems. The authors used a simple hierarchical clustering algorithm to form neighborhoods [5]. In contrast, A. Banerjee and J. Ghosh used weighted longest common subsequences for clustering web users [6]. The purpose of clustering users is to find groups of users with similar interests based on their clickstreams, taking into account both the trajectory taken through a website and the time spent at each page. Paper [4] focuses on a method to find session similarity by sequence alignment using dynamic programming, and proposes a similarity matrix as a model for representing session similarity measures.

3.2. Detect Communities (Neighborhoods)

The spectral clustering algorithm can be used to cluster data using the eigenvectors of a similarity/affinity matrix derived from a dataset.
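In this work the similarity matrix comes from the alignment step of Section 3.1. As a minimal sketch under stated assumptions, the Python code below scores a global alignment with the weights given there (match = 2, mismatch = gap = 0), divides the score by the alignment length, multiplies the two period networks edge by edge, and evaluates the modularity score that this section uses to judge a candidate split into two groups. Several details are assumptions of the sketch rather than the authors' implementation: ties between equally scored alignments are broken in favor of the shortest alignment, node degrees in the weighted network are taken as row sums, and the function names are hypothetical. Finding the split itself (e.g. from the leading eigenvector of the modularity matrix, as in [7]) is not shown.

```python
def align(a, b, match=2, mismatch=0, gap=0):
    """Global alignment of sequences a and b; returns (score, alignment length).

    Among equally scored alignments the shortest is kept; this tie-break
    is an assumption of the sketch.
    """
    # Each cell holds (score, -length), so max() prefers a higher score
    # first and a shorter alignment second.
    prev = [(j * gap, -j) for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [(i * gap, -i)]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(
                (prev[j - 1][0] + sub, prev[j - 1][1] - 1),  # match or replacement
                (prev[j][0] + gap, prev[j][1] - 1),          # a[i-1] aligned to a gap
                (cur[j - 1][0] + gap, cur[j - 1][1] - 1),    # b[j-1] aligned to a gap
            ))
        prev = cur
    score, neg_length = prev[-1]
    return score, -neg_length


def similarity(a, b):
    """Alignment score divided by the length of the resulting alignment."""
    score, length = align(a, b)
    return score / length if length else 0.0


def consistent_network(users, period1, period2):
    """Edge-wise product of the similarity networks of the two periods."""
    n = len(users)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w = (similarity(period1[users[i]], period1[users[j]])
                 * similarity(period2[users[i]], period2[users[j]]))
            A[i][j] = A[j][i] = w
    return A


def modularity(A, s):
    """Q = (1/4m) * sum_ij (A_ij - k_i k_j / 2m) s_i s_j, with s_i in {+1, -1}."""
    k = [sum(row) for row in A]   # (weighted) degree of each node
    two_m = sum(k)                # 2m: every edge is counted twice
    if two_m == 0:
        return 0.0
    n = len(A)
    total = sum((A[i][j] - k[i] * k[j] / two_m) * s[i] * s[j]
                for i in range(n) for j in range(n))
    return total / (2 * two_m)    # 1/(4m) = 1/(2 * 2m)
```

Because the two period similarities are multiplied, a pair of users keeps a strong edge only if it scores well in both periods, which is the consistency effect the method relies on. A split is accepted only if it increases the modularity score.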
The key step of the algorithm is the calculation of a modularity score, which decides whether a community can be split into two communities or not. A good division of a network into communities is one in which there are fewer edges between communities than expected or, equivalently, significantly more edges within communities than expected [7]. The modularity matrix therefore holds the difference between the number of edges between a pair of nodes, A_ij, and the expected number when edges are placed at random, which is derived from the degrees k_i and k_j of the two nodes (i.e. the number of edges attached to each node). The community structure is derived by a hierarchical process in which a set of nodes is partitioned into two groups at a time, depending on the following modularity score [7]:

$$Q = \frac{1}{4m} \sum_{ij} \left( A_{ij} - \frac{k_i k_j}{2m} \right) s_i s_j = \frac{1}{4m} s^{T} B s$$

Here s_i = +1 if node i is assigned to the first group and s_i = -1 if it is assigned to the second, and B is the modularity matrix with elements B_ij = A_ij - k_i k_j / 2m. For a partition to take place, it must increase this modularity score. The similarity network A_ij used here results from the alignment of the users' sequences: each element of the matrix is the similarity score between users i and j. k_i and k_j are the degrees of vertices i and j respectively, so k_i k_j / 2m is the expected number of edges between i and j, and m is the total number of edges [7].

4. RECOMMENDER USING SIMPLE ASSOCIATION RULES

The recommendation list for customers in the same community is predicted using association rule learning as the prediction algorithm. The problem of association rule mining is defined as in [8]: Let $I = \{i_1, i_2, \ldots, i_n\}$ be a set of binary attributes called items. Let $D = \{t_1, t_2, \ldots, t_m\}$ be a set of transactions called the database. Each transaction in D has a unique transaction ID and contains a subset of the items in I. A rule is defined as an implication of the form x → y, where x, y ⊂ I and x ∩ y = ∅.
The sets of items x and y are called the antecedent and the consequent of the rule respectively. To select interesting rules from the set of all possible rules, constraints on various measures of significance and interest can be used. The best-known constraints are minimum thresholds on support and confidence. The support supp(x) of an item set x is defined as the proportion of transactions in the data set which contain the item set. The confidence of a rule is defined as

$$\mathrm{conf}(x \rightarrow y) = \frac{\mathrm{supp}(x \cup y)}{\mathrm{supp}(x)}$$

where supp(x ∪ y) is the proportion of transactions that contain every item of both x and y. In this paper, we use a single item in the antecedent and in the consequent of each rule, so for each item in the antecedent of a rule, the item in the consequent is considered as a recommendation.

5. PERFORMANCE MEASURES OF RECOMMENDATION ALGORITHM

In the field of machine learning, a confusion matrix is a specific table layout that allows visualization of the performance of an algorithm [10]. Table 1 sets the observations, which in our case are the films actually viewed in the third (evaluation) period, against the recommendations predicted by our model. Each row of the matrix represents the films that have been automatically predicted by the association rules algorithm, while each column represents the actual films that have been seen in the evaluation period.
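The predicted films in this comparison come from the single-item rules of Section 4. As a minimal sketch, the Python code below mines rules x → y with one item on each side, using the support and confidence definitions of that section, and then turns the rules into a top-N recommendation list. The transaction layout (one set of items per user within a community), the threshold values, the ranking by confidence and the function names are assumptions made for illustration; the paper does not state them.

```python
from collections import Counter
from itertools import permutations


def mine_single_item_rules(transactions, min_support=0.1, min_confidence=0.5):
    """Return {x: [(confidence, y), ...]} for rules x -> y with single items.

    transactions -- list of item collections, e.g. one per user (assumed layout)
    The two thresholds are illustrative values, not the paper's settings.
    """
    n = len(transactions)
    item_count = Counter()
    pair_count = Counter()
    for t in transactions:
        items = set(t)
        item_count.update(items)
        pair_count.update(permutations(items, 2))  # ordered pairs (x, y), x != y

    rules = {}
    for (x, y), both in pair_count.items():
        support = both / n                # supp({x, y})
        confidence = both / item_count[x]  # supp({x, y}) / supp({x})
        if support >= min_support and confidence >= min_confidence:
            rules.setdefault(x, []).append((confidence, y))
    return rules


def recommend(rules, seen, top_n=5):
    """Recommend up to top_n unseen consequents, ranked by best confidence."""
    best = {}
    for x in seen:
        for confidence, y in rules.get(x, []):
            if y not in seen:
                best[y] = max(confidence, best.get(y, 0.0))
    # Ties are broken by item label, so the items must be mutually comparable.
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return [y for y, _ in ranked[:top_n]]
```

A user for whom fewer than `top_n` consequents survive the thresholds receives a shorter list, which is one way a fixed-length recommender can end up excluding users from an evaluation.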
Table 1. The confusion matrix compares the predictions with the observations.

Predicted \ Observed              Positive: film viewed    Negative: film not viewed
Positive: film recommended        a                        b
Negative: film not recommended    c                        d

Several standard terms have been defined for the two-class matrix. The accuracy (AC) is the proportion of the total number of predictions that were correct. It is determined using the equation:

$$AC = \frac{a + d}{a + b + c + d} \qquad (2)$$

The recall or true positive rate (TP) is the proportion of positive cases that were correctly identified, as calculated using the equation:

$$TP = \frac{a}{a + c} \qquad (3)$$

Finally, precision (P) is the proportion of the predicted positive cases that were correct, as calculated using the equation:

$$P = \frac{a}{a + b} \qquad (4)$$

The F-measure considers both precision and recall, providing a single measurement for a system:

$$F = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall} \qquad (5)$$

In order to give different weights to precision and recall, the F-measure is generalized so that F_β measures the effectiveness of retrieval with respect to a user who attaches β times as much importance to recall as to precision. The parameter β generally takes the values 0.5, 1 and 2, with higher values of β placing greater emphasis on recall, i.e. on retrieving the films that were actually viewed.

$$F_{\beta} = \frac{(1 + \beta^{2}) \cdot (Precision \cdot Recall)}{\beta^{2} \cdot Precision + Recall} \qquad (6)$$

F1 is the traditional F-measure or balanced F-score, because recall and precision are evenly weighted. Values of β < 1 emphasize precision, while values of β > 1 emphasize recall [9].

6. EXPERIMENTAL RESULTS

In the following experiments, the proposed system based on association rules is benchmarked against a comparable method in which the two time intervals are merged into a single time period, and against the association rules method applied to a single population model. The performance of the methods based on association rules is also compared with the system proposed in [11].
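The measures of Section 5 follow directly from the four cells of Table 1. As a minimal sketch, the Python code below counts a, b, c and d for one user from a recommendation list and the films actually viewed in the evaluation period, and then evaluates equations (2) to (6). The `catalog` argument (the full set of items that could be recommended) and the function names are assumptions made for illustration; how results are averaged over users is not specified here.

```python
def confusion_counts(recommended, viewed, catalog):
    """Cells of Table 1 for one user.

    catalog is the assumed set of all recommendable items; it is only
    needed for d (not recommended and not viewed).
    """
    recommended, viewed = set(recommended), set(viewed)
    a = len(recommended & viewed)                    # recommended and viewed
    b = len(recommended - viewed)                    # recommended, not viewed
    c = len(viewed - recommended)                    # viewed, not recommended
    d = len(set(catalog) - recommended - viewed)     # neither
    return a, b, c, d


def accuracy(a, b, c, d):
    total = a + b + c + d
    return (a + d) / total if total else 0.0         # equation (2)


def recall(a, c):
    return a / (a + c) if a + c else 0.0             # equation (3)


def precision(a, b):
    return a / (a + b) if a + b else 0.0             # equation (4)


def f_beta(p, r, beta=1.0):
    """Equation (6); beta = 1 gives the balanced F-measure of equation (5)."""
    denominator = beta ** 2 * p + r
    return (1 + beta ** 2) * p * r / denominator if denominator else 0.0
```

With β = 2 the score moves toward recall and with β = 0.5 toward precision, which corresponds to the F2 and F0.5 columns reported in the result tables.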
6.1. First Experiment

The predictive performance of the association rules system is presented in table 2. The system recommended the top 3 and top 5 items for 196 users.

Table 2: F-Measure and accuracy for predictive performance of association rules

Methods                                            N-top   F1    F2    F0.5   Accuracy
Single population Model                            5       62%   69%   57%    62%
                                                   3       57%   55%   59%    65%
Personalized Model, One period (communities)       5       61%   68%   56%    61%
                                                   3       57%   55%   60%    66%
Personalized Model, Two periods (communities)      5       63%   70%   57%    62%
                                                   3       59%   56%   61%    67%

6.2. Second experiment

The predictive performance of the Naïve Bayes system [11] is presented in this experiment, for comparison with the performance of the association rules prediction algorithm. The system recommended the top 5 and top 3 items for only 134 and 192 users respectively. Table 3 shows the performance of the Naïve Bayes prediction system.

Table 3: F-Measure and accuracy for predictive performance of Naïve Bayes

Methods                                            N-top   F1    F2    F0.5   Accuracy
Single Population Model                            5       59%   65%   55%    60%
                                                   3       57%   54%   59%    65%
Personalized Model, One Period (communities)       5       59%   65%   55%    60%
                                                   3       58%   56%   61%    66%
Personalized Model, Two Periods (communities)      5       60%   66%   56%    61%
                                                   3       58%   56%   61%    67%

6.3. Third experiment

In this experiment, the association rules prediction algorithm is applied to the same 134 users as in the second experiment when N-top=5. For N-top=3, the two algorithms made recommendations for the same 192 users. Table 4 shows the performance of the association rules algorithm on the 134 and 192 users. Figures 4, 5 and 6 compare the Naïve Bayes algorithm and the association rules algorithm for the single population model, the personalized model (one period), and the personalized model (two periods) when N-top=5.

Table 4: F-Measure and accuracy for predictive performance of association rules

Methods                                            N-top   F1    F2    F0.5   Accuracy
Single population Model                            5       61%   67%   56%    61%
                                                   3       57%   54%   59%    65%
Personalized Model, One period (communities)       5       63%   69%   58%    62%
                                                   3       57%   55%   59%    66%
Personalized Model, Two periods (communities)      5       64%   70%   58%    64%
                                                   3       58%   56%   61%    67%
Figure 4: Comparison between the two algorithms in terms of non-personalized model

Figure 5: Comparison between the two algorithms in terms of personalized model (one period)

Figure 6: Comparison between the two algorithms in terms of personalized model (two periods)

Figure 7: Precision-Recall curve of personalized model (two periods) using association-rules, and Naïve-Bayes algorithms

7. DISCUSSION AND CONCLUSION

A total of 205 users have actions in both periods of the modeling data and a non-empty set of films viewed during the evaluation period. These users were the subjects of the study. In all experiments the system removes some users, because they either have no new items in the evaluation period or have fewer than the required number of items (N) in their recommendation list.

The first experiment concerned the performance of Association Rules as the prediction algorithm on the 205 users, where the system recommended a fixed number (N) of items. It recommended 5 and 3 items for 196 users. Table 2 compares the performance of our algorithm with the single population model and the personalized model (one period). When N=3, the single population algorithm (baseline) is outperformed by the other two systems, as expected, and our system outperforms the personalized model (one period). When N=5, the single population model still performs below the proposed system in terms of F1 and F2, but is superior to the personalized model (one period) in terms of F-values and accuracy.

In the second experiment, we applied the Naïve Bayes based recommendation system to the same dataset. The system recommended 5 items for only 134 users, and 3 items for 192 users. The single population model ranks lowest when N=3, as expected, while it is in line with the personalized model (one period) when N=5. Again, the proposed system slightly outperforms both when N=3 and when N=5.

The two algorithms work on the same dataset, but the conditions of each algorithm may impose a different number of users. We therefore subjected both algorithms to the conditions of the algorithm that accepts the smaller number of users, to make the comparison between their performance more accurate. The association rules based recommendation system was thus applied to the same 134 and 192 users when N=5 and N=3 respectively, as described in experiment 3.

Tables 3 and 4 show some interesting things. First, we compare the performance of Association Rules and Naïve Bayes. When N=3, the performance of the two algorithms is comparable in terms of F-values and accuracy for the single population model and the personalized model (two periods). For the personalized model (one period), Naïve Bayes performs slightly better than Association Rules. The two algorithms differ more when N=5, where Association Rules noticeably outperforms Naïve Bayes for the single population model and for both personalized models (one period and two periods). Second, we can compare the performance of the two personalized models, one period and two periods, when N=3.
As shown in table 3, the two methods are comparable, except that the accuracy of two periods is slightly better than that of one period. In table 4, by contrast, the personalized (two periods) model slightly outperforms the one period model in terms of F-values and accuracy. The same comparison has been made when N=5: the performance of the personalized (two periods) model is better than that of the personalized (one period) model for both algorithms (tables 3 and 4).

In general, comparing the performance of the algorithms with and without communities shows that the single population model ranks below the community-based (personalized) models for both algorithms, whatever the value of N. It is worth mentioning that the recommendation system based on association rules is more sensitive to the existence of communities than the one based on Naïve Bayes.

Figures 4, 5 and 6 report the performance of the Association Rules and Naïve Bayes algorithms. Generally, the performance of Association Rules with the personalized (two periods) model is the best, as shown in the precision-recall curve (see figure 7). It improves on Naïve Bayes by 2 to 4 percentage points in F-values. Neither algorithm could recommend items for a significant number of users when N=8 and N=10.

8. REFERENCES

[1] O. Kirmemis and A. Birturk, "A Content-Based User Model Generation and Optimization Approach for Movie Recommendation", 6th Workshop on Intelligent Techniques for Web Personalization & Recommender Systems, AAAI Press, 2008.
[2] A. Godwin, "Time Web: Comparing University-Spaced Time Sequences Using Social Network Analysis of Local Alignment Pairs", M.S. Thesis, University of North Carolina at Charlotte, 2008.
[3] A. Abbott and A. Tsay, "Sequence analysis and optimal matching methods in sociology", Sociological Methods & Research, vol. 29, no. 1, pp. 3-33, 2000.
[4] K. Duraiswamy and V. Mayil, "Similarity matrix based session clustering by sequence alignment using dynamic programming", Journal of Computer and Information Science, vol. 1, no. 3, pp. 66-72, 2008.
[5] P. Liu and L. Hai, "Application of sequence alignment technique to collaborative recommendations in e-commerce", 2010 International Conference on E-Product E-Service and E-Entertainment (ICEEE), vol. 1, pp. 1-3, 2010.
[6] A. Banerjee and J. Ghosh, "Click stream clustering using weighted longest common subsequences", Proceedings of the Web Mining Workshop at the 1st SIAM Conference on Data Mining, pp. 33-40, 2001.
[7] M. Newman, "Modularity and community structure in networks", Proc. Natl. Acad. Sci. USA, vol. 103, pp. 8577-8582, 2006.
[8] A. Geyer-Schulz and M. Hahsler, "Evaluation of Recommender Algorithms for an Internet Information Broker based on Simple Association Rules and on the Repeat-Buying Theory", Proceedings of the Fourth WebKDD Workshop: Web Mining for Usage Patterns & User Profiles, pp. 100-114, 2002.
[9] C. Manning, P. Raghavan and H. Schütze, "An Introduction to Information Retrieval", Cambridge University Press, 2008.
[10] V. A. Narayana, P. Premchand and A. Govardhan, "Performance and Comparative Analysis of the Two Contrary Approaches for Detecting Near Duplicate Web Documents in Web Crawling", International Journal of Electrical and Computer Engineering (IJECE), vol. 2, no. 6, pp. 819-830, 2012.
[11] P. Lisboa, H. Nawaf and W. Bhaya, "Improving Recommendation Systems by Modeling the Stability of Implicit Behaviour", The Post Graduate Network Symposium (PGNet2013), Liverpool, UK, 2013.
[12] Madhubala Myneni and M. Seetha, "Feature Integration for Image Information Retrieval Using Image Mining Techniques", International Journal of Computer Engineering & Technology (IJCET), Volume 3, Issue 3, 2012, pp. 273-281, ISSN Print: 0976-6367, ISSN Online: 0976-6375.
