KIM et al.: AUTOMATIC RECOMMENDATION SCHEME OF TV PROGRAM CONTENTS FOR (IP)TV PERSONALIZATION 675

clustering are compared in this paper: demographic clustering and K-means clustering. Finally, CF-based recommendation is performed with a novel ranking model, which extends the Best Match (BM) model, to rank the candidate TV program contents for recommendation. The proposed rank model is designed to give easy access to preferred TV program contents. For the recommendation of popular or newly broadcast TV program contents, the popular TV program contents can be identified via CF from similar user groups. On the other hand, newly broadcast TV program contents are favored by excluding TV program contents that fall outside a sliding time window over the history data of watched TV program contents.

This paper is organized as follows: Section II reviews the previous related works for recommendation; Section III introduces the overall system architecture of our proposed automatic recommendation scheme for TV program contents and describes the data used for experiments; Section IV describes the components of the proposed recommendation scheme in detail: user profile reasoning, similar user clustering, and ranking of candidate TV program contents for recommendation; in Section V, the proposed rank model is explained in detail; the experimental results are presented in Section VI; finally, Section VII concludes this work.

II. RELATED WORKS

PTV adopted a hybrid method of CF and CBF to mitigate the item ramp-up problem of CF and the user ramp-up problem of CBF. It requires users to provide their preference information on contents while enrolling. Based on this preference information provided by the users, it creates and manages user profiles with explicit ratings by users. However, users generally do not want to offer their personal information, or sometimes do not faithfully express their interests in the items with explicit ratings. Pazzani et al. reported that only 15% of people respond to requests for relevance feedback on their preferences. Therefore, requiring users to rate items explicitly is one of the main causes of rating sparsity problems.

Deshpande M. et al. proposed an item based top-N recommendation algorithm and J. Wang et al. proposed an extension to the "relevance model" from language modeling, both of which utilize CF with user-profile and item matching for item recommendation. The item based top-N recommendation algorithm uses an item-to-item matrix for item recommendation, which is computed based on a user-to-item matrix, and the recommendation performance is further improved by extending a language model to a relevance model.

In item recommendation for e-commerce, recommender systems tend to suggest new items to the users because they are not likely to repurchase the same items or similar kinds after they have bought them. However, this may not be appropriate in TV environments, where TV viewers are expected to watch (consume) the TV program contents (items) that they have been accustomed to watching. In general, TV viewers tend to watch popular TV program contents as their similar taste users do, or specific TV program contents according to their individual preferences. So, the previous two models are insufficient in that the TV program contents more frequently watched by an active user are recommended in lower ranks. The proposed rank model in this paper tries to remedy this weak point of the previous rank models.

In CF systems with a large number of users, the process of grouping similar users and recommending items raises a computational complexity issue. To relieve the system overload of clustering similar taste users for a large number of users, G.R. Xue suggests a two-step clustering method that uses offline K-means clustering and online PCC (Pearson correlation coefficient) clustering for the item ratings. In this method, the K-means clustering is performed offline for a large number of users just once for a given K value. Then more similar users are extracted online for active users, based on PCC values, from their respective clusters to predict the rating values for the unrated items. However, this method needs to know an appropriate cluster number a priori.

For user clustering in this paper, we compare two clustering methods: demographic clustering and K-means clustering. The demographic clustering is very simple because it uses only the users' demographic information, such as gender and age, for clustering. For the K-means clustering, we use the feature vectors of user preference values of 8 genres and 47 subgenres for TV program contents. The former is computationally very simple and can be a solution for the cold-start problem, in which it takes time to learn about users, but it requires a priori knowledge of the demographic information. On the other hand, the latter does not require such demographic information but implicitly clusters similar users based on the genre preference derived from the users' watched history data of TV program contents. For the K-means clustering, an appropriate number for K can be found by searching a range based on the dendrogram of a hierarchical clustering. We then determine a K value based on the smallest sum of squared errors in this paper. The details of finding a right K value are described in Section IV. As a rank model for ordered recommendation of TV program contents, we propose a novel rank model based on the BM model. The proposed rank model is described in Section V in detail.

We can summarize the contribution points of our personalized automatic recommendation scheme for TV program contents as follows: (1) it is more appropriate for TV program recommendation since it infers user preference on TV program contents implicitly from the watched TV program history data, which does not require users to explicitly rate their watched TV program contents; (2) it takes into account not only the group preferences but also the individual user's preferences on TV program contents for recommendation; and (3) the proposed rank model refines collaborative filtering by considering the relative lengths of watching times for TV program contents, not just by simply counting the number of users who have watched them.

III. ARCHITECTURE OF THE PROPOSED RECOMMENDATION SYSTEM AND EXPERIMENT DATA

A. Proposed Recommendation System Scheme

In this paper, it is assumed that TV terminals are connected to the content servers of TV programs via back channels so that
676 IEEE TRANSACTIONS ON BROADCASTING, VOL. 57, NO. 3, SEPTEMBER 2011

the usage (or watching) history of (IP)TV program contents can be collected at the server side. In IPTV environments, TV program contents are streamed over IP networks and the responsible content providers at the head-end side can collect the usage history of TV programs watched by the users via back channels. Fig. 1 shows the architecture of our proposed automatic recommendation system for TV program contents. The automatic recommendation scheme consists of three agents: (1) the user profile reasoning agent computes user preferences on genres and TV programs by analyzing the user's watching history of TV program contents, so this agent collects TV usage history from local repositories of TV terminals for user profile reasoning; (2) the user clustering agent clusters the users (TV viewers) into similar preference user groups; (3) the recommendation agent recommends to each active user a list of his/her preferred TV program contents. Here, an active user means the user who logs into the TV terminal and is ready to receive a recommended TV program list.

Fig. 1. Architecture of the proposed recommendation system for TV program content.

For recommendation, a list of candidate TV program contents is extracted based on CF, and our proposed rank model calculates the respective scores of the candidate TV program contents for ranked recommendation. Then the TV program contents with the highest scores are presented to the active user in descending rank order as the result of recommendation. Notice that in this paper the terms users and items are used interchangeably with TV viewers and TV program contents, respectively.

B. Description of Usage History Data Set for Watched TV Program Contents

For the experiments to test the effectiveness of the proposed recommendation scheme, Nielsen Korea's TV usage history data set of 2,005 people is used, which was collected on 6 terrestrial TV channels for 6 months from Dec. 1, 2002 to May 31, 2003. Table I shows the data fields of the usage history data set for watched TV program contents. The TV program contents in the history data set have 8 main genres, which are further divided into 47 subgenres in total. For the data set, the total number of TV program titles amounts to 924 and the total number of subtitles is 1,855. We use 795 TV program contents for training, corresponding to the first 4 months, and 629 TV program contents for testing, corresponding to the last 2 months. Notice that the sum (1,424) of the TV program contents for training and testing exceeds the total number (924) of the titles because 500 watched TV program contents are titles that were repeatedly broadcast. Table I shows the information attached to the broadcast TV program contents.

TABLE I
FIELDS OF TV USAGE HISTORY DATABASE

IV. PROPOSED RECOMMENDATION SCHEME

A. User Profile Reasoning

In this paper, a user is characterized in terms of his/her profile, which consists of two kinds of preferences: on items (TV program contents) and on genres. First of all, we remove from the usage history data set all the TV program contents that have not been watched for more than 10% of their respective total lengths. The preference on a TV program content is defined as the ratio of the watched time to the total time length. Reruns of a TV program content are all considered the same title (item). The preference on an item by a user is defined by (1), which involves the number of times the item was broadcast and a per-item term defined by (2) from the watched time length of the item by the user and the total length of the item. It must be pointed out that the item preference in (1) might be inaccurately computed for inattentively watched TV program contents. Their treatment is out of the scope of this paper.
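The item-preference computation of Section IV-A can be sketched as follows. This is a minimal illustration under one possible reading of (1) and (2): each viewing contributes the ratio of watched time to total length, viewings at or below 10% of the length are discarded as noise, and the ratios are averaged over the number of times the title was broadcast. The function name, the record layout and the constant name are hypothetical and do not come from the paper.

```python
from collections import defaultdict

# Assumed threshold: viewings at or below 10% of the program length are dropped.
MIN_WATCH_RATIO = 0.10


def item_preferences(viewings, broadcast_counts):
    """Return {(user, title): preference} from raw viewing records.

    viewings: iterable of (user, title, watched_seconds, total_seconds),
              one record per broadcast of a title that the user tuned in to.
    broadcast_counts: {title: number of times the title was broadcast}.
    Reruns share the same title, so their ratios accumulate on one item.
    """
    ratio_sums = defaultdict(float)
    for user, title, watched, total in viewings:
        ratio = watched / total
        if ratio <= MIN_WATCH_RATIO:
            continue  # treated as noise and removed
        ratio_sums[(user, title)] += min(ratio, 1.0)
    # Assumption: average over the number of broadcasts of the title.
    return {
        (user, title): total_ratio / broadcast_counts[title]
        for (user, title), total_ratio in ratio_sums.items()
    }


viewings = [
    ("u1", "news", 1500, 3000),   # ratio 0.5
    ("u1", "news", 3000, 3000),   # rerun of the same title, ratio 1.0
    ("u1", "drama", 100, 3600),   # below 10% of the length: dropped
    ("u2", "drama", 1800, 3600),  # ratio 0.5
]
prefs = item_preferences(viewings, {"news": 2, "drama": 1})
```

In this toy input, the two viewings of the rerun title are merged into a single item preference, while the very short viewing produces no preference entry at all.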
Since the popularity or recency of TV program contents often diminishes with time and the user's interest in TV program contents varies over time, it is more appropriate to reflect the recently watched TV program contents for recommendation. Therefore, a time window function is defined by (3), whose control parameter for the window size is set to a two-month or four-month length in this paper.

The average of the user preference on items by a user is given by (4), which is taken over the total number of items in the watched TV program list of the user.

In order to perform similar user clustering efficiently in a low dimension, genre preference is used, which can reflect the similarity of users' content consumption for TV program contents. Genre preference is computed by accumulating the item preference values for the genre and is then normalized by the total genre preference values for all genres. Given the total number of genres, the genre preference on a genre by a user is defined by (5).

B. User Clustering

For computational efficiency and effectiveness of collaborative filtering, TV users are clustered into similar user groups. After clustering, each user has a membership in one of the user groups. Therefore, the CF operation for an active user is performed within the user group to which the user belongs, not over all users.

For similar user grouping, two clustering approaches are compared: demographic clustering based on gender and age, and K-means clustering based on the genre preference described in (5). For K-means clustering, two feature vectors are compared, with 8 preferences on the main genres and 47 preferences on the subgenres, respectively.

The demographic clustering is computationally very simple but can only be used if demographic information such as gender and age is available. The demographic clustering can avoid the cold-start clustering problem, in which it usually takes time to learn about the users. In our demographic clustering, there are 26 combinations of genders and ages. The genders are divided into two classes, male and female, and the ages are divided into 13 classes, the last of which covers ages 66 and higher.

As an alternative, K-means clustering can be used, which does not require the demographic information. The essential prerequisite for K-means clustering is to know an appropriate number K as the number of clusters. In order to find a right K and reveal the characteristics of cluster distributions, we take a two-step approach: an unsupervised hierarchical clustering is first run to construct a dendrogram, from which a range of K values is found by cutting its branches at the large jumps in a distance criterion; the final K value is then determined within the range by repeatedly performing K-means clustering. To determine the final K value, K-means clustering is repeated a fixed number of times, with the centroids of the clusters randomly initialized each time. When the clustering yields the same clustering result a required number of times for a given K value, that result becomes the final set of clusters with that K value. When no K value yields the same clustering result the required number of times, the K value that yields the same clustering result most often is selected as the final value. In this paper, the number of repetitions and the required number of identical results are set to 1,000 and 5, respectively. Table II shows the ranges and the finally selected K values for the feature vectors of 47 subgenres and 8 main genres, for the TV program contents watched by the users who have watched at least 33% of the average number of watched TV program contents during the training period. For this experiment, the open-source code Cluster 3.0 was used. The K-means clustering, which is the most time-consuming task in our scheme, takes less than one minute on a PC with an Intel Core 2 Quad CPU at 2.4 GHz and 2 GB of memory.

TABLE II
SELECTION RESULTS OF CLUSTER NUMBERS, K

C. Recommendation of TV Program Contents

In order to recommend the preferred TV program contents to an active user, the recommendation process consists of three steps: extracting similar preference users from the cluster (similar user group) to which the active user belongs; selecting candidate items for recommendation; and ranking the candidate items. The rank model in particular will be explained in detail in Section V.

1) Selecting Similar Preference Users of an Active User in a Group: As described in Section IV-B, clustering the similar preference users is done offline. For recommendation, more relevant users are further extracted from the user group to which an active user belongs in order to construct a set of candidate TV program contents based on CF. By doing so, the computational complexity is lowered by reducing the number of users considered from all users to the similar peer users in the similar user group to which the active user belongs. Based on the proximity measure, the peer users with the most similar preferences are extracted for the active user. For the proximity measure, the normalized correlation is computed by subtracting the average preference value from all the preference values.

On the basis of the consumed (watched) item (TV program contents) list of an active user, the similarity between the active user and each peer user is measured as the proximity between
the normalized preference values on items for the two users in the similar preference group. The similarity is defined by (6), which is taken over the items belonging to the active user's profile and uses the preference value on each item consumed by a user as in (1) and the averaged item preference value of the user as in (4).

Only the users whose similarity satisfies the threshold condition are regarded as relevant users to the active user. Then CF is performed for the item lists between the active user and each of the relevant users. Since the number of similar preference users affects precision performance, we need to find an optimal number of peer users based on the average precision accuracy, which is explained in Section VI.

2) Filtering Candidate Items With EPG Information: After selecting the relevant users for an active user, their preferred items become the candidate items for recommendation. But some items may no longer be available on TV channels because broadcasting of the TV program contents has ended. In the case of linear TV broadcasting services, Electronic Program Guide (EPG) information can be used to filter out the candidate TV program contents that are not available.

3) Ranking Items: After a set of candidate TV program contents for recommendation is determined, they are ordered by a rank model. Finally, the recommended TV program contents are presented to active users in descending order of rank scores. The proposed rank model is described in the following section.

V. PROPOSED RANK MODEL

A. Related Work: BM Model

Our proposed rank model extends the Best Match (BM) model. The BM is a ranking function used by retrieval engines to rank matching documents according to their relevance to a given query. The BM model is given by (7), with the weight given by (8). The quantities in (8) are the total number of documents, the number of documents including a specific query term, the number of documents related to a specific topic, and the number of documents that include the specific query term and are related to the specific topic. The quantities in (7) are the term frequency in documents and the term frequency in the query.

The BM model originates from the two-Poisson model, in which the term frequency is independent of relevant and irrelevant documents. Based on this idea, a simple formulation is suggested under the following two conditions: one is that the weight is independent of term frequency; and the other is that the weight is linear with term frequency. Each condition is satisfied in a limiting case, but the second condition is not always satisfied. To remedy this, a scaling factor is added in the numerator. This is taken into account in the BM11, BM15 and BM25 models.

The BM25 includes a condition that is inappropriate for TV program recommendation, since by the scope hypothesis it gives a higher weight to short documents than to long documents. Therefore, the proposed rank model in this paper extends the BM15 model, which does not incorporate the scope hypothesis into its rank model, so that the TV program contents that were broadcast fewer times are prevented from being ranked higher.

B. Proposed Rank Model

An extension to the BM15 is made by taking into account the collaborative filtering concept, which accounts for the watching times of users, in the rank model for recommendation of TV program contents. Furthermore, we add to the rank model a weight based on the correlation between candidate items for recommendation and the items watched by the active user. We score the filtered candidates of TV program contents in a ranked order. The relations between the candidate document and the query in BM15 are translated into the relations between the candidate TV program contents for recommendation and the active user.

To make the BM15 applicable to recommendation of TV program contents, we make the following assumptions: (1) the watched TV program list represents the active user; (2) the term frequency is transformed, by applying the CF concept, into the relative watching frequency among similar preference users of both a candidate TV program content and a TV program content watched by the active user; (3) the query term frequency is regarded as the relative watching frequency of a watched TV program content by the active user; (4) the similarity between the candidate and the watched TV program content is further taken into account as a weight. The matching score between a candidate TV program content and the active user is defined as our proposed rank model by (9), in which two parameters are used to balance the term frequency and the query term frequency. Robertson et al. analyzed this way of weighting in detail. The two parameters are set to 200 and 0.2, respectively, empirically in this paper. The relative watching frequency in (9) is the ratio of the total number of watching times of both the candidate program and the watched program over the total number of watching times of all the candidate TV program contents by the peer users. Therefore, the relative watching frequency is calculated as (10).
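For reference, the standard Okapi BM weighting reviewed in Section V-A can be sketched as below. The sketch follows the commonly published Robertson-Sparck Jones relevance weight and the BM25 term-frequency saturation, in which setting b = 0 gives BM15 (no document-length normalization) and b = 1 gives BM11. It is not the paper's equation (9); the parameter names k1, k3 and b, their default values, and the function names are assumptions taken from the general Okapi literature rather than from the text above.

```python
import math


def rsj_weight(N, n, R=0, r=0):
    """Robertson-Sparck Jones relevance weight.

    N: total number of documents
    n: number of documents containing the query term
    R: number of documents relevant to the topic
    r: number of relevant documents containing the query term
    With R = r = 0 (no relevance information) this is an IDF-like weight.
    """
    return math.log(((r + 0.5) * (N - n - R + r + 0.5)) /
                    ((n - r + 0.5) * (R - r + 0.5)))


def bm_term_score(tf, qtf, w1, k1=1.2, k3=8.0, b=0.75, dl=1.0, avdl=1.0):
    """Contribution of one query term to the BM score of one document.

    tf: term frequency in the document, qtf: term frequency in the query,
    w1: relevance weight of the term (for example from rsj_weight),
    dl / avdl: document length and average document length.
    b = 0 reproduces BM15, b = 1 reproduces BM11, 0 < b < 1 is BM25.
    """
    K = k1 * ((1.0 - b) + b * dl / avdl)          # length normalization
    doc_part = (k1 + 1.0) * tf / (K + tf)         # saturating in tf
    query_part = (k3 + 1.0) * qtf / (k3 + qtf)    # saturating in qtf
    return w1 * doc_part * query_part


w1 = rsj_weight(N=1000, n=10)
short_bm25 = bm_term_score(tf=3, qtf=1, w1=w1, b=0.75, dl=50, avdl=100)
long_bm25 = bm_term_score(tf=3, qtf=1, w1=w1, b=0.75, dl=500, avdl=100)
short_bm15 = bm_term_score(tf=3, qtf=1, w1=w1, b=0.0, dl=50, avdl=100)
long_bm15 = bm_term_score(tf=3, qtf=1, w1=w1, b=0.0, dl=500, avdl=100)
```

With length normalization switched on (BM25), the short document scores higher than the long one for the same term frequency, whereas with b = 0 (BM15) the two scores coincide. This is the property that motivates building on BM15 when the "document length" corresponds to how often a TV program content was broadcast.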
The active user's relative watching frequency in (9) is the ratio of the total number of watching times of a watched TV program content over the total number of watching times for all the TV program contents watched by the active user, and is given by (11).

Eq. (9) can be explained intuitively as follows: the peer-side frequency term is regarded as the peer users' preferences on TV program contents in the same user group to which the active user belongs, and the active-user-side frequency term is referred to as the active user's preference on TV program contents. The two terms are in a mutually supplemental relation as they are multiplied together.

The two weights in (9) are given by (12) and (13). The quantities in (12) are the total number of broadcast times for all items and the number of broadcast times of each item. The weight in (12) reflects the inverse document frequency, with an independence assumption between the documents with and without the terms. In this paper, it is assumed that the document for retrieval is the candidate TV program content and the specific query term is a watched TV program content from the active user profile. The weight in (13) is added for the similarity between the candidate and the watched TV program content, which is calculated as the vector cosine correlation (VCC) similarity between the feature vectors of user preference on the two programs. This weight puts more emphasis on the active user's personal preference on TV program contents, which is not reflected in the original BM.

In order to see the effectiveness of the weight in (13), Fig. 2 illustrates an example of similarity measures between two items. In the first case, two users have watched a pair of items (TV program contents), and the similarity (VCC) value between the two items is 0.643. In the second case, two users have watched another pair of items, and the VCC value between those items is 0.4. So, if we set a VCC value of 0.5 as a threshold for the similarity between items, then the first pair of items is considered "similar", but the second pair is not. So, the weight of (13) in (9) can improve the rank model by taking into account the relation between the active user and the candidate items in the score calculation. The effect of this weight on precision performance will be shown in Fig. 5 in Section VI.

Fig. 2. Illustration for significance on weights w.

VI. EXPERIMENTAL RESULTS

For the usage history of watched TV program contents explained in Section III-B, we use the usage history data of four months for training and the remaining two months for testing. In this section, we measure the performance of our recommendation scheme in terms of both precision/recall and Average Normalized Modified Retrieval Rank (ANMRR), which considers the rank orders in retrieval.

A. Performance Measure of Rank Models

Precision and Recall: The performance in information retrieval is usually measured in terms of precision and recall. The precision is defined as the ratio of how many watched TV program contents (relevant documents) are contained in the recommendation list (retrieved documents) of TV program contents for an active user. The recall is defined as the ratio of how many recommended TV program contents (retrieved documents) are actually included in the watched TV programs (relevant documents) for the active user. The precision and recall are defined as

Precision = (number of watched TV program contents in the recommended list) / (number of recommended TV program contents)   (14)

Recall = (number of recommended TV program contents in the watched list) / (number of watched TV program contents)   (15)

For the recommendation of TV program contents, the precision is a more appropriate metric for performance evaluation than the recall, since the recall increases as the number of recommended TV program contents increases. In this regard, recommending a larger number of TV program contents increases false positives. So, in this paper, we use precision accuracy for performance evaluation. However, if the number of ground truth items increases, the precision also becomes higher. So, the performance of the rank models is measured in terms of both precision and recall.

ANMRR: In addition to the precision measure, another performance measure, ANMRR, is considered, which has been developed to measure the image retrieval performance in
MPEG-7. The ANMRR indicates not only how many correct items are recommended but also how highly the more relevant items are ranked among the recommended items. For ANMRR, the Normalized Modified Retrieval Rank (NMRR) is defined by (16), which uses the number of recommended TV program contents that the active user has really watched for longer than the average watching time of his/her preferred TV program contents during the test period. The allowable maximum rank is computed from this number and its maximum value. The rank used in (16) is revised by (17), which is based on the rank ordered by the score values of the proposed rank model in this paper. Finally, ANMRR is written as (18).

B. Clustered Data Analysis

As explained before, two clustering methods are compared: the demographic clustering and K-means clustering. A cluster's preference on a specific genre is computed by accumulating the preferences on the specific genre by all users in the same cluster, and then it is normalized by the total number of users in the cluster. Similarly, the normalized preference on a channel can also be computed for each cluster. The preferences on genre and channel for a cluster are calculated by (19) and (20), which use the total number of users in the cluster and the total numbers of genres and channels, respectively.

Fig. 3 shows the profiles of clusters' preferences on genres and channels of TV programs. As shown in Fig. 3, the genre preferences are not significantly distinguished among different groups by demographic clustering (DM). On the other hand, the groups by K-means clustering (KM) show somewhat different patterns of genre preferences among them. This is similarly observed for the channel preferences, except for group4 and group5 by DM.

Fig. 3. Preferences on genres and channels for groups: Demographic Clustering (DM) vs. K-means clustering (KM). (a) Genre preferences of groups. (b) Channel preferences of groups.

Table III shows the average precision performance for different numbers of groups by DM and KM. Although the preferences on genres and channels are better distinguished by KM than by DM for different groups, the performance difference of average precision between DM and KM is very slight. In this experiment, the KM turns out to be effective for recommendation
although it does not utilize the demographic information for clustering.

TABLE III
COMPARISONS OF AVERAGE PRECISION BETWEEN DM AND KM
Notes: outlier criteria (33%), refer to Table V; the number of peer users is 5, refer to Fig. 8.

Table IV shows the average precision performance on Top-N recommendations for different numbers of clusters (groups) by DM. Increasing the number of clusters does not enhance the precision performance because we only use the most similar users to an active user within his/her group (the average precision performance according to the number of peer users will be shown in Fig. 8).

TABLE IV
AVERAGE PRECISION PERFORMANCE FOR THE NUMBER OF CLUSTERS
Notes: outlier criteria (33%), refer to Table V; the number of peer users is 5, refer to Fig. 8.

C. Exclusion of Noisy Items and Outliers of Users

For the experiments, two kinds of outliers are removed to obtain reliable recommendation: firstly, the TV program contents that were watched for less than 10% of their respective whole lengths are removed as noise; secondly, the users who have watched fewer TV program contents than a predefined number are excluded.

Fig. 4 shows the number of users versus the relative watching lengths for two TV program contents: "Hometown at 6 o'clock" and "Let's marry". For both TV program contents, there are relatively large numbers of users who have watched them for less than 10% or for more than 95% of the total TV program lengths. This pattern is similarly observed in other TV program contents. So, we set 10% of the total length of TV program contents as a threshold for outlier removal.

Fig. 4. Number of users versus relative watching lengths of TV program contents.

Table V shows the average precision performance for different thresholds of outlier removal in the second case. With the exclusion of users who watched fewer than 33% of the average number of watched TV program contents, we obtain 76.6% average precision accuracy for the Top-5 recommendation. The threshold values in Table V indicate the ratios of the number of TV program contents watched by each user over the average number of TV program contents watched by all users during the training period of 4 months. The average numbers of TV program contents watched by all users are 124 and 90 during the training period of 4 months and the testing period of 2 months, respectively.

TABLE V
PRECISION ACCURACY WITH OUTLIER REMOVALS
Notes: the number of clusters is 26 by DM, refer to Table IV; the number of peer users is 5, refer to Fig. 8.

D. Effect of the Weight w of (13) on Precision Performance of the Proposed Rank Model

Fig. 5 shows the performance comparison in terms of average precision accuracy for the recommended TV program contents with and without the weight of (13) in (9).

Fig. 5. Performance comparison with/without w.

The average precision accuracies with the weight in the proposed rank model are higher than those without it. The average precision in this experiment is measured with 67 active users.

E. Performance Comparison Between Proposed Rank Model and Linear Model

For performance comparison in precision and ANMRR between the proposed rank model and the linear model, 90
682 IEEE TRANSACTIONS ON BROADCASTING, VOL. 57, NO. 3, SEPTEMBER 2011Fig. 6. Performance comparison in precision and recall. Fig. 8. Precision accuracies versus different numbers of similar peer users. (a) Precision accuracies with Top-5 recommendations. (b) Precision accuracies with four clusters.Fig. 7. Performance comparison in ANMRR. 3) It takes into account the number of watching times for both TV program contents and by peer users in the scoreusers are randomly selected. Figs. 6 and 7 show the perfor- calculation for ranking. This more elaborates collaborativemance comparisons of the proposed model and linear model in ﬁltering. On the other hand, the linear rank model precision-recall and ANMRR, respectively. The proposed rank simply counts the number of peer users who have watchedmodel outperforms the linear model in both precision-recall and both and .ANMR. Notice that the smaller the ANMRR value is, the better F. Performance Analysis for Proposed Recommendationthe recommendation performance is. Ideally, the case of Scheme is achieved when the ranked order of the rec- We investigate the performance of the proposed recommen-ommended TV program contents is perfectly matched with dation in terms of the number of clusters, the number of sim-the order of the watched TV program contents by the active ilar peer users and the number of TV program contents for ﬁnaluser during the test period. Therefore the recommended TV recommendation. Fig. 8 shows precision performance for dif-program contents by the proposed rank model are also better ferent numbers of similar peer users given Top-5 recommenda-matched in ranked orders than the linear model. tions and 4 clusters. The superiority of our proposed rank model comes from the In Fig. 8(a), the average precision performance slightly de-facts that: creases as the number of similar peer users increases for dif- 1) The proposed rank model deﬁnes the weight in (12) ferent numbers of clusters. 
This is because a smaller number such that more frequently broadcast TV program contents of similar peer users yields more correlation between the ac- are put in lower ranks; tive user and peer users so that the resulting recommendation 2) For recommendation of TV program contents, the tradi- precision is usually enhanced. When the number of clusters in- tional models usually intensify the preference of the peer creases, the resulting recommendation precision seldom varies users but relatively reduce the preference of an active user, for different numbers of recommended items, because the larger which might be appropriate to recommend unpurchased the number of clusters, the more correlate the clustered users items to active users in e-commerce environments. How- are. In Fig. 8(b), the average precision performance of the pro- ever, in TV environments, users often tend to watch the posed recommendation scheme becomes lowered as the number TV program contents that they used to watch. Therefore, it of recommended TV program contents increases. is reasonable to take into consideration the preferences of Table VI shows the 19 recommended TV program contents both similar users and an active user for recommendation. by the proposed rank model for the corresponding ground truth The proposed rank model actually considers both; items out of 67 for an active user with . As
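The Top-N precision discussed for Fig. 8 can be illustrated with a minimal sketch. It assumes the common definition, namely the fraction of the Top-N recommended items that the active user actually watched during the test period; the exact definition used in the experiments is given elsewhere in the paper. The function names, user IDs and program titles below are hypothetical and serve only as an illustration.

```python
from typing import Dict, Sequence, Set


def precision_at_n(recommended: Sequence[str], watched: Set[str], n: int) -> float:
    """Fraction of the Top-n recommended items found in the ground truth.

    Assumed (common) definition, not necessarily the paper's exact formula.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    top_n = recommended[:n]
    hits = sum(1 for item in top_n if item in watched)
    # Dividing by n (not len(top_n)) penalizes lists shorter than n.
    return hits / n


def average_precision_at_n(
    recs_by_user: Dict[str, Sequence[str]],
    watched_by_user: Dict[str, Set[str]],
    n: int,
) -> float:
    """Mean of precision_at_n over all active users with a recommendation list."""
    if not recs_by_user:
        return 0.0
    scores = [
        precision_at_n(recs, watched_by_user.get(user, set()), n)
        for user, recs in recs_by_user.items()
    ]
    return sum(scores) / len(scores)


# Hypothetical ranked recommendations and test-period ground truth.
RECS = {
    "u1": ["news", "drama", "quiz", "sports", "movie"],
    "u2": ["drama", "news", "talk", "music", "quiz"],
}
WATCHED = {
    "u1": {"news", "drama", "movie"},
    "u2": {"drama", "cooking"},
}

if __name__ == "__main__":
    for n in (1, 3, 5):
        print(n, round(average_precision_at_n(RECS, WATCHED, n), 3))
```

With this toy data the average precision is 1.0, 0.5 and 0.4 for Top-1, Top-3 and Top-5, which mirrors the downward trend reported for Fig. 8(b): lower-ranked items are less likely to be in the ground truth, so a longer recommendation list dilutes precision.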
TABLE VI
RECOMMENDATION RESULTS AND GROUND TRUTH FOR AN ACTIVE USER WITH ID = 213039903

# recommendation order, ## preference order of the active user.

As aforementioned, the more frequently watched TV program contents such as daily news, daily soap operas and weekly regular dramas appear higher-ranked. This helps active users easily access the TV program contents they watch frequently. On the other hand, the low-ranked items by the proposed rank model are the TV program contents that were seldom or never watched by the active user but frequently watched by his/her peer users, which results from the incorporation of collaborative filtering into recommendation.

VII. CONCLUSION

In this paper, we propose an automatic recommendation scheme of (IP)TV program contents for TV personalization. Unlike traditional recommendation in document retrieval or e-commerce, the proposed scheme does not require explicit ratings on watched TV program contents; instead, it implicitly reasons about user preference on TV program contents from the usage history data of watched TV program contents. The rank model in the proposed scheme takes into account not only the group preferences but also the active user's preferences on TV program contents. Furthermore, the proposed rank model refines collaborative filtering by considering the relative lengths of watching times for TV program contents, not just by simply counting the number of users who have watched them. The effectiveness of our proposed recommendation scheme is shown with rich experimental results on a real usage history dataset of watched TV program contents.

REFERENCES

[1] M. Montaner, B. Lopez, and J. L. de la Rosa, “A taxonomy of recommender agents on the Internet,” AI Rev., vol. 19, pp. 285–330, 2003.
[2] R. Burke, “Hybrid recommender systems: Survey and experiments,” User Modeling and User-Adapted Interaction, vol. 12, no. 4, pp. 331–370, Nov. 2002.
[3] G. Adomavicius and A. Tuzhilin, “Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions,” IEEE Trans. Knowl. Data Eng., vol. 17, no. 6, pp. 734–749, Jun. 2005.
[4] P. Cotter and B. Smyth, “PTV: Intelligent personalized TV guides,” Amer. Assoc. AI, pp. 957–964, 2000.
[5] M. Pazzani and D. Billsus, “Learning and revising user profiles: The identification of interesting web sites,” Machine Learning, vol. 27, pp. 313–331, 1997.
[6] N. Good, J. B. Schafer, J. A. Konstan, A. Borchers, B. Sarwar, J. Herlocker, and J. Riedl, “GroupLens research project, combining collaborative filtering with personal agents for better recommendations,” Amer. Assoc. AI, 1999.
[7] S. E. Robertson, S. Walker, M. Beaulieu, M. Gatford, and A. Payne, “Okapi at TREC-4,” in 4th Text Retrieval Conf. (TREC-4), 1995, pp. 73–96.
[8] S. E. Robertson and K. Sparck Jones, “Relevance weighting of search terms,” J. Amer. Soc. Inf. Sci., vol. 27, pp. 129–146, 1976.
[9] S. E. Robertson and S. Walker, Some Simple Effective Approximations to the 2-Poisson Model for Probabilistic Weighted Retrieval. New York: Springer-Verlag, 1994, pp. 232–241.
[10] P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl, “GroupLens: An open architecture for collaborative filtering of netnews,” in ACM Conf. Comput. Supported Cooperative Work, 1994, pp. 175–186.
[11] M. Deshpande and G. Karypis, “Item-based top-N recommendation algorithms,” ACM Trans. Inf. Syst., vol. 22, no. 1, pp. 143–177, Jan. 2004.
[12] J. Wang, J. Pouwelse, J. Fokker, A. de Vries, and M. Reinders, “Personalization on a peer-to-peer television system,” Multimedia Tools Appl., vol. 36, no. 1/2, pp. 89–103, 2007.
[13] J. Lafferty and C. Zhai, “Probabilistic relevance models based on document and query generation,” Language Modeling Inf. Retrieval, 2002.
[14] M. J. L. de Hoon, S. Imoto, J. Nolan, and S. Miyano, “Open source clustering software,” Bioinformatics, p. 781, 2004.
[15] D. P. Vetrov and L. I. Kuncheva, “Evaluation of stability of k-means cluster ensembles with respect to random initialization,” IEEE Trans. PAMI, vol. 28, no. 11, pp. 1798–1808, 2006.
[16] G. Xue, C. Lin, Q. Yang, W. Xi, H. Zeng, Y. Yu, and Z. Chen, “Scalable collaborative filtering using cluster-based smoothing,” in ACM SIGIR, Aug. 2005, pp. 114–121.
[17] S. Theodoridis and K. Koutroumbas, Pattern Recognition, 3rd ed. New York: Academic Press, 2006, pp. 572–582.
[18] R. Duda, P. Hart, and D. Stork, Pattern Classification, 2nd ed. New York: Wiley-Interscience, 2001, pp. 542–559.
[19] B. S. Manjunath, J.-R. Ohm, V. V. Vasudevan, and A. Yamada, “Color and texture descriptors,” IEEE Trans. Circuits Syst. Video Technol., vol. 11, no. 6, pp. 703–715, Jun. 2001.
[20] P. Ndjiki-Nya, J. Restat, T. Meiers, J.-R. Ohm, A. Seyferth, and R. Sniehotta, “Subjective evaluation of the MPEG-7 retrieval accuracy measure (ANMRR),” in ISO/WG11 MPEG Meeting, Geneva, Switzerland, May 2000, Doc. M6029.
[21] K.-M. Wong and L.-M. Po, “MPEG-7 dominant color descriptor based relevance feedback using merged palette histogram,” in IEEE Int. Conf. Acoust., Speech, Signal Process., May 2004, vol. 3, pp. 433–436.
[22] C. D. Manning, P. Raghavan, and H. Schütze, Introduction to Information Retrieval. Cambridge, U.K.: Cambridge Univ. Press, 2008, pp. 151–175.

EunHui Kim received the B.E. degree in information and communications engineering from Chungnam National University in 2000 and the M.Sc. degree in information and communications engineering from Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea, in 2009. She is currently pursuing the Ph.D. degree in the Department of Electrical Engineering at KAIST. She worked for Samsung Electronics as an Assistant Engineer in the Software team of the Visual Display Division during 2000–2003 in Suwon, Korea, and as an Associate Engineer in the Architecture team of the Digital Solution Center during 2003–2007 in Seoul, Korea. Her research interests include personalization in connected TV, data clustering, collaborative filtering, and recommendation modeling with AI for smart TV interaction.
Shinjee Pyo received the B.E. degree and the M.Sc. degree in information and communications engineering from KAIST, Daejeon, Korea, in 2007 and 2009, respectively. She is currently pursuing the Ph.D. degree in information and communications engineering at KAIST. Her research interests include personalization in connected TV, sequential pattern mining for TV personalization and pattern recognition.

Eunkyung Park received the B.E. degree in information and communications engineering and the M.Sc. degree in electrical engineering from KAIST, Daejeon, Korea, in 2009 and 2011, respectively. She is now with NAVER, the first and largest search portal in Korea, where she works on business and planning for web portal services. Her research interests are statistical learning theory, social networking, and machine learning.

Munchurl Kim (M’07) received the B.E. degree in electronics from Kyungpook National University, Korea, in 1989, and the M.E. and Ph.D. degrees in electrical and computer engineering from University of Florida, Gainesville, Florida, in 1992 and 1996, respectively. After his graduation, he joined Electronics and Telecommunications Research Institute (ETRI), where he led the Broadcasting Media Research Team and the Realistic Broadcasting Research Team and worked in the MPEG-4/7 standardization related research areas. In 2001, he joined the Information and Communications University (ICU) in Taejon, Korea, as Assistant Professor in the School of Engineering. Since 2009, he has been Associate Professor in the Department of Electrical Engineering at KAIST, Daejeon, Korea. His research areas of interest include 2D/3D video coding, 3D video quality assessment, pattern recognition and machine learning, and video analysis and understanding.