A model of recommender system for a digital library


Recommender System
Collaborative filtering
Content-based

Published in: Technology
A MODEL OF RECOMMENDER SYSTEM FOR A DIGITAL LIBRARY
Hangsapholyna Sar1, Saokosal Oum2
1 Email: lyna_it_eng73@yahoo.com Tel: (+855) 16 506 873
2 Email: oumsaokosal@gmail.com Tel: (+855) 12 252 752

ARTICLE INFO
Recommender System; Collaborative filtering; Content-based; Personalizing; Digital library

ABSTRACT
The number of digital contents and books in a university-size digital library is enormous and larger than ever. Readers find it more and more difficult to locate their favorite books. Even when they do find a favorite book, finding another book similar to it is like looking for a nail in the sea, because the second favorite book might sit at the very end of a long tail. A recommender system is therefore often a requirement in a digital library, needed to make such searches simpler. This research proposes and describes a service that provides a model of recommendation, as part of a networked digital library project whose principal goal is to develop technologies for supporting digital services. The researcher uses collaborative filtering, content-based, and personalizing approaches for a model of a recommender system for a digital library. Data was collected from users and librarians at Norton University, Cambodia, who had experience with recommender systems in a digital library. The goal of this research is to propose the Model of Recommender System for a Digital Library. Finally, a series of experiments is performed, and the results indicate that the proposed methodology produces high-quality recommendations.

I. INTRODUCTION
Digital libraries gain more and more importance in the modern world, although the concept behind digital libraries existed before the term was introduced. There is no clear consensus on the definition of digital libraries, but, in general, they can be defined as collections of information that have associated services delivered to users using a variety of technologies (Callan, et al. 2003). The collections of information can be scientific, business or personal data and can be represented as digital text, image, audio, video or other media. The information can be digitized paper or born-digital material, and the services offered on such information can be varied, ranging from content operations to rights management, and can be offered to individuals or users (Variou, 2001). Moreover, internet access has resulted in digital libraries that are increasingly used by different users for diverse purposes, and in which sharing and collaboration have become important social elements. As digital libraries become commonplace, as their contents and services become more varied, and as their customers become more experienced with computer technology, users expect more complicated services from their digital libraries (Callan & Smeaton, 2001). A traditional search function is normally an integral part of any digital library, but users' frustrations with it increase as their needs become more complex and as the volume of information managed by digital libraries increases. Thus digital libraries must move from being passive, with little adaptation to individual users, to being more proactive in offering and tailoring information for individuals and in supporting efforts to capture, structure, and share knowledge. Digital libraries that are not personalized for individuals will be seen as defaulting on their obligation to offer the best service possible. Just as people patronize stores in which they and their preferences are known, and their needs anticipated, so too will they patronize digital libraries that remember them and anticipate their needs.

II. REVIEW OF RELATED LITERATURE AND STUDIES
2.1 Recommender System
Recommender systems are part of information retrieval systems that attempt to present information that would be of interest to the user. A recommender system compares the user's profile with his or her previous history, or with the profiles of other users with similar interests or background, and provides top references that are relevant to the current item of interest. Recommender systems attempt to reduce information overload and retain customers by selecting a subset of items from a universal set based on user preferences. A preference may reflect an individual mental state concerning a subset of items from the universe of alternatives. Individuals form preferences based on their experience with the relevant items, such as books, articles, music, etc. (Palani, A., et al 2009).
Recommender systems are a particular type of personalization that learn about a person's needs and then proactively identify and recommend information that meets those needs. Recommender systems are especially useful when they identify information a person was previously unaware of (Callan & Smeaton, 2001).

2.2 Challenges in Recommender Systems
Interface Design: A recommender system should have an interface design which provides a good user experience. The user interface should be modeled in such a way that the user does not get tired of providing explicit feedback, and the recommendation list should be displayed in an uncluttered manner which not only captures the attention of the user but also does so in a nonintrusive fashion (Linden et al., 2003).
Amount of Data: One of the issues facing recommender systems is that they need a lot of data to make recommendations effectively. The industry leaders in recommendations, like Google, Amazon, Netflix, etc., are those with a lot of consumer user data. A good recommender system initially needs item data (from a catalog or other form), then it captures and analyzes user data (behavioral events), and then the appropriate algorithm is carried out. The more item and user data a recommender system has to work with, the stronger the chances of getting good recommendations (Palani, A., et al 2009).
Unpredictable Items: There are some items that
people either love or hate in equally strong terms. There are books that the puritans rubbish but the commoners love. These types of items are difficult to make recommendations on, because the user reaction to them tends to be diverse and unpredictable. Music especially has a lot of cases like this, where the same users like both soft rock and heavy metal bands (MacManus, R. 2009).
Dynamically Changing Data: Recommender systems mostly do long-term profiling of users and hence are biased towards the old and have difficulty showing the new. The past behavior of users is not always a good guide because trends are always changing. Hence a simple algorithmic approach would find it difficult to keep up with current trends in fast-changing domains such as fashion.
Scalability: A recommender system may need to make millions of recommendations to millions of users across the globe, especially in the case of collaborative filtering algorithms, which might need to compute the K nearest neighbors at runtime. Hence recommender systems should be scalable across various sizes and types of data and users (Karatzoglou, A., Smola, A., & Weimer, M. 2010).
Changing User Preferences: While a user may have a particular intention when browsing a portal like amazon.com, the next day the user might have a different intention. A classic example is that one day the user might be searching for books, but the next day the same user could be searching for household appliances. Hence a recommender system should not base all decisions on prior content and should also be able to make an intelligent choice based on the current context (MacManus, R. 2009).
Shilling Attacks: An underhanded and cheap way to increase recommendation frequency is to manipulate or trick the system into doing so. This can be done by having a group of users (human or agent) use the recommender system and provide specially crafted opinions that cause it to make the desired recommendation more often. For example, it has been shown that a number of book reviews published on Amazon.com are actually written by the author of the book being reviewed. A consumer trying to decide which book to purchase could be misled by such reviews into believing that the book is better than it really is. This is known as a shilling attack, and recommender systems should protect against these attacks (Lam, S., Frankowski, D., & Riedl, J. 2006). Two simple types of shilling attack are the Random Bot and the Average Bot.
  • A Random Bot is a filterbot that randomly rates items outside of the target item-set, and rates the target items with either the minimum rating (for a nuke attack) or the maximum rating (for a push attack).
  • An Average Bot is a filterbot whose rating for each item is drawn from a normal distribution with a mean equal to the average rating for that item.
Privacy Issues: A very important aspect that cannot be ignored is the fact that users are always under control, in the sense that all actions taken are monitored and registered. This might seem a very invasive setup which harms user privacy and is, therefore, undesirable. Nevertheless, there are several remarkable facts that need to be clarified:
  • Users know in advance that, in a virtual e-learning environment (or any other web-based environment), all actions are logged;
  • The recommendation system must be designed in a non-intrusive manner and be user-friendly, including the possibility of disconnecting it or minimizing its participation in the browsing or searching activities; and
  • The participation of each individual user in the final recommendation system is completely anonymous (Enric, M., & Julia, M. 2005).
In sum, a recommender system must protect the individual's right to privacy and protect him or her against malicious identity hackers.

2.3 Standards and Policies in Digital libraries
Information is a main component of a government. As digital information becomes more popular, and as competition over the velocity of information dissemination sometimes overshadows integrity, the regulations and policies that govern the circulation of information have also become more complex.
Hence, an information professional's role is to provide accurate information in a timely manner and ensure that the integrity of that information is not compromised. Copyright laws, such as the American Copyright Revision Act of 1976 and the Software Copyright Act of 1980, are in place to protect software.
As extensive databases and storage systems are being developed to preserve digital information, standards and policies should be enforced to protect the security and privacy of users. The digital revolution has made information more accessible; it has also affected society's morals by altering the perception of standards and policies for retrieving information or downloading media from the Internet. Doing so is sometimes perceived as the same as borrowing electronic devices from a friend. Information seekers are often found to put laws and moral implications aside when they "choose" to pirate or rectify information for their own illicit purposes based on their own "ethical" standards. For instance, 1997 records confirm instances of commercial software being copied not only by customers but by the programmers themselves. According to Seadle (2004), even "a reasonable fair use in ethical terms could still be an infringement in strict legal judgment". Hence, there is a need to have an enforcement mechanism in place, instead of the current trend where ethical judgments driven by peer pressure enable or discourage intellectual property infringements. That is why the guidelines set by law on how digital criminal behaviour shall be dealt with need to be enforced. Many governments have created protection stipulations in the form of copyrights, but such stipulations protect only an intellectual thought, leaving information in physical and digital forms as victims of chance.
The information superhighway, epitomized by the World Wide Web and digital libraries, is constantly faced with security issues, such as identity theft, data corruption, illegal downloads, and piracy. These moral issues could be seen as the result of easy access to vast information, which opens easy opportunities for slack security, which consequently invites the possibility of wrongdoing. The Information Age is generating more reasons for security-consciousness with the Internet and digital libraries. This
is essential for the access and control of digital information in order to prevent all from being lost to our own plunders as we raid the information superhighway.

2.4 Important Issues on Digital libraries
The main innovation in the field of digital libraries is apparent in the fact that most resources are in electronic format. In this format, there is no need for physical resources linked to loan, access, and reservation. Resources are ideally held in a distributed database which should be accessed over the Internet. The user, instead of taking a hard copy of the document in the library, downloads a new copy. The quality of the data can become more difficult to assess if the Internet is used not only as a client access to the library but also as a repository of information. Following some researchers' definitions of the term digital library, which advocate that the Internet can be viewed as a huge digital library, the problem of data quality erupts (Arms, 2001). As a consequence, there should be concerns about data accuracy, originator, and integrity, since they are not as easily measured as they are in traditional libraries. There are difficulties in determining how to charge for the library services and, especially, how to guarantee copyrights on the data which are downloaded by users. Moreover, security is a great issue with the increasing influx of new viruses and hacker attacks. This issue will be addressed later in this chapter. Social and psychological aspects must be taken into consideration in the move to a digital format, given that everything is now accessible via a computer system, and less human interaction is therefore required. This can result in difficulties in making effective use of the library, so usability issues must be thoroughly addressed. Further, multi-language interfaces and facilities, such as thesauri and translators, should also be provided. On the other hand, a reservation service becomes useless, since unlimited soft copies can be made from a document.
Security is a major problem in digital libraries, particularly with reference to unauthorized use of library resources. The usual security approach that has been adopted is to establish access control to the library resources. Under this arrangement, data consumers should have a registration record with their contact information, and should be given a login name for authorization and a password for authentication. A security log recording all access made should exist in order to enable effective auditing. Ethical policies should be explained to all users in order to make sure they use the library appropriately.
Copyright is another important issue in digital libraries, as governments have not yet agreed on a method by which to effectively establish copyright laws for digital data (Onsrud & Lopez, 1998). The problem of copyright legislation is more evident now that data can be downloaded, and each country may have its own specific legislation. Guaranteeing that the user will not alter data and resell them is a high priority. Spatial data is usually very expensive to capture and generate, so it is highly important that intellectual property rights be imposed and obeyed. Moreover, users are usually interested in a specific part of the spatial data set. Copyright is related to the use, replication and update of data and usually lasts for a certain period of time. Aslesen (1998) has classified the former as usage rights, and the latter two as marketing rights (which include selling and distribution processes).

2.5 Type of Recommender System
2.5.1 Collaborative Approach
The first type of recommendation technique, the collaborative approach (sometimes called the social-based approach), takes into account the given user's interest profile and the profiles of other users with similar interests (Shardanand & Maes, 1995). The collaborative approach looks for relevance among users by observing the ratings they assign to products in a training set of limited size. The "nearest neighbour" users are those that exhibit the strongest similarity to the target user. These users then act as "recommendation partners" for the target user, and collaborative approaches recommend to the target user items that appear in the profiles of these recommendation partners (but not in the target user's profile). It has been observed in several practical settings that the collaborative approach generally achieves more effective recommendations than its content-based counterpart (Alspector et al., 1998; Breese et al., 1998; Mooney and Roy, 2000; Pazzani, 1999).

2.5.2 Content-Based Approach
Another type of recommendation technique is the content-based approach (Loeb & Terry, 1992). A content-based approach characterizes recommendable items by a set of content features and represents a user's interests by a similar feature set. The relevance of a given content item to the user is then measured as the similarity of this item to the user's interest profile. Content-based approaches select recommendable items that have a high degree of similarity to the user's interest profile. Systems implementing a content-based recommendation approach analyze a set of documents and descriptions of items previously rated by a user, and build a model or profile of user interests based on the features of the objects rated by that user (Mladenic, D. 1999).
The profile is a structured representation of user interests, adopted to recommend new interesting items. The recommendation process basically consists of matching up the attributes of the user profile against the attributes of a content object. The result is a relevance judgment that represents the user's level of interest in that object. If a profile accurately reflects user preferences, it is of tremendous advantage for the effectiveness of an information access process. For instance, it could be used to filter search results by deciding whether a user is interested in a specific Web page or not and, in the negative case, preventing it from being displayed.

2.5.3 Personalized Recommendation
Personalization is about building customer loyalty by creating a meaningful one-to-one relationship; by understanding the needs of each individual and helping satisfy a goal that efficiently and knowledgeably addresses each individual's need in a given context (Riecken, 2000). The key element of a personalized environment is the user model. A user model is a data structure that represents user interests, goals and behaviors. The more information a user model has, the better the content and presentation
will be adapted for each individual user. A user model is created through a user modeling process in which unobservable information about a user is inferred from observable information about that user; for instance, from the interactions with the system (Zukerman, et al., 1999). User models can be created using a user-guided approach, in which the models are directly created using the information provided by each user, or an automatic approach, in which the process of creating a user model is hidden from the user. The hypermedia systems constructed using the user-guided approach are called adaptable, while the ones produced using an automatic approach are called adaptive (Fink, et al., 1997; Brusilovsky & Schwarz, 1997). Within the context of digital libraries, up to now, user modeling has been implemented mainly using user-guided approaches, which has produced adaptable digital libraries. However, user modeling in a digital library can be readily implemented using an automatic approach, because a typical user exhibits patterns when accessing digital libraries and the information containing these patterns is usually already stored in databases. For this purpose, machine learning techniques can be applied to recognize such regularities in order to integrate them as part of the user model. Machine learning encompasses techniques where a machine acquires knowledge from its previous experience (Witten & Frank, 1999). The output of a machine learning technique is a structural description of what has been learned that can be used to explain the original data and to make predictions. From this perspective, machine learning techniques make it possible to automatically create user models for the implementation of personalized digital library services.
Personalization has become an important topic for digital libraries that want to take a more active role in dynamically tailoring their information and service offers to individuals in order to better meet their needs (Callan & Smeaton, 2003). Most of the work on personalized information access focuses on the use of machine learning algorithms for the automated induction of a structured model of a user's interests, referred to as a user profile, from labeled text documents (Mladenic, 1999). Keyword-based user profiles suffer from problems of polysemy and synonymy. The result is that, due to synonymy, relevant information can be missed if the profile does not contain the exact keywords occurring in the documents and, due to polysemy, wrong documents could be deemed relevant. This work explores a possible solution for this kind of issue: the adoption of semantic user profiles that capture key concepts representing users' interests from relevant documents. Semantic profiles contain references to concepts defined in lexicons like WordNet (Miller, 1995) or ontologies. The solution is implemented in the item recommender (ITR) system, which induces semantic user profiles from documents represented by WordNet (Degemmis, Lops, & Semeraro, 2007).

III. ARCHITECTURE AND COMPONENTS
In this section the researcher discusses the overall architecture of the proposed digital library system.

Figure 1 Architecture of the proposed digital library (blocks shown: Collect User Information with explicit, implicit and legacy data; User Profiles; Create Meta Data; Filtering with simple, content-based and collaborative techniques; Library Rules; Admin; Content Server; Cached Content; select content and make recommendation; retrieve content and assemble page for display)

The architecture as shown in Figure 1 consists of the following main components:
  • User
The user concept covers the various actors (whether human or machine) entitled to interact with digital libraries.
  • Collecting User Information
The objective of collecting visitor information is to develop a profile that describes a site user's interests, role in an organization, entitlements, purchases, or some other set of descriptors important to the site owner. The most common techniques are explicit profiling, implicit profiling, and using legacy data:
Explicit profiling asks each user to fill out information or questionnaires.
Implicit profiling tracks the user's behavior. This technique is generally transparent to the user. Browsing and buying patterns are the behaviors most often assessed.
Using legacy data accesses legacy data for valuable profile information, such as credit applications and previous purchases. For existing customers and known users, legacy data often provides the richest source of profile information.
  • Analyzing User Profiles
When the profile is available, the next step is to analyze the profile information in order to present or recommend documents, purchases, or actions specific to the user. Making such recommendations is the most challenging step. Many techniques for presenting content and making recommendations are in use or under development.

Figure 2 Analyzing User profiles (steps shown: identify site user; retrieve user's profile with ID, password, interests, role, etc.; select content that matches user's preferences, such as books, articles, or journals; retrieve content and assemble page for display to user)

  • Filtering Techniques: Filtering techniques employ algorithms to analyze meta-data and drive presentation and recommendations. The three most common filtering
techniques, simple filtering, content-based filtering, and collaborative filtering, are introduced below.
Simple filtering relies on predefined groups, or classes, of users to determine what content is displayed or what service is provided.
Content-based filtering works by analyzing the content of the objects to form a representation of the user's interests.

\[ \mathrm{Overlap}(X, Y) = \frac{|X \cap Y|}{\min(|X|, |Y|)} \]

where X is the set of keywords extracted from the book or article and Y is the set of keywords in the user's profile. The coefficient, Overlap(X, Y), is not influenced by the sizes of X and Y, which is desirable as the number of book or article keywords could be much larger than the number of keywords in the user's explicit keyword list or much smaller than the number of keywords in a user's implicit keyword list (Dietmar, J., Markus, Z., Alexander, F., & Gerhard, F., 2011).
If set X is a subset of Y, or the converse, then the overlap coefficient is equal to one. The value of Overlap(X, Y) ranges between 0 and 1.
Collaborative filtering collects users' opinions on a set of objects, using either explicit or implicit ratings, to form like-minded peer groups, and then learns from the peer groups to predict a particular visitor's interest in an item. Instead of finding objects similar to those a user liked in the past, as in content-based filtering, collaborative filtering develops recommendations by finding users with similar tastes. The researcher builds upon published work on pure collaborative filtering algorithms that compute similarities between users using a Pearson Correlation Coefficient (Dietmar, J., Markus, Z., Alexander, F., & Gerhard, F., 2011). Predictions for an item are then computed as the weighted average of the ratings for the item from those users which are similar, where the weight is the computed coefficient. The general formula for a prediction for an item p for user u is:

\[ \mathrm{pred}(u, p) = \bar{r}_u + \frac{\sum_{i=1}^{n} \mathrm{sim}(u, i)\,(r_{i,p} - \bar{r}_i)}{\sum_{i=1}^{n} \mathrm{sim}(u, i)} \]

where \bar{r}_u is the mean rating for the user in question, sim(u, i) is the Pearson correlation coefficient of user i with the user for whom the prediction is being computed, r_{i,p} represents the rating submitted by user i for the article for which the prediction is being computed, \bar{r}_i is the average rating (the average of the user ratings for the books or articles in common) for user i, and n is the total number of users in the system that have some correlation with the user and have rated the item.

\[ \mathrm{sim}(a, b) = \frac{\sum_{p \in P} (r_{a,p} - \bar{r}_a)(r_{b,p} - \bar{r}_b)}{\sqrt{\sum_{p \in P} (r_{a,p} - \bar{r}_a)^2}\,\sqrt{\sum_{p \in P} (r_{b,p} - \bar{r}_b)^2}} \]

a, b : users
r_{a,p} : rating of user a for item p
P : set of items rated both by user a and user b
\bar{r}_a, \bar{r}_b : the users' average ratings
Possible similarity values lie between -1 and +1.
  • Content Caching: Providing personalization for real-time applications, such as dynamically constructing web pages based on the visitor's profile, affects system performance.
  • Content Server: The content concept encompasses the data and information that the digital library handles and makes available to its users. It is composed of a set of information objects organized in collections.

IV. RESULTS
Analysis of the Findings
According to the survey conducted, about 95% of the questionnaires were taken from students and 5% from librarians. This study therefore collected preference data from 200 students at Norton University who were in year three or year four at the time of the survey. In addition, the top ten books were considered: the researcher chose only the top 5 books out of the top 10, based on the most-read books in the library since September 2012.
Book 1: PHP, MySQL, JavaScript, and CSS
Book 2: Professional Java E-Commerce
Book 3: Cloud Computing
Book 4: Oracle Database 11g & MySQL 5.6 Developer Handbook
Book 5: Object-Oriented and Classical Software Engineering
These were included as items to be ranked by users, as in a leading book store. Each user is provided with a ranked list of all items, where ties are allowed.

Table 1 Ratings Database
           Book 1   Book 2   Book 3   Book 4   Book 5
Group 1      86       33       60      146       ?
Group 2      85       62       69      139      124
Group 3     110       90      125      118       90
Group 4      62      146      142      124      184
Group 5      97      136      170       45      135

Analysis of Collaborative Filtering
Table 2 Ratings Database of similarity
           Book 1   Book 2   Book 3   Book 4   Book 5    Sim
Group 1      86       33       60      146       ?
Group 2      85       62       69      139      124     0.98
Group 3     110       90      125      118       90     0.53
Group 4      62      146      142      124      184    -0.31
Group 5      97      136      170       45      135    -0.87

Figure 3 Comparing Group 1 with two other Groups (chart of the ratings of Groups 1, 2 and 3 for Books 1 to 4; vertical axis 0 to 160)
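The similarity and prediction formulas of this section can be checked against Table 2 with a short sketch. The Python below is illustrative only and is not the authors' implementation: the function names (`overlap`, `pearson`, `predict`), the `RATINGS` layout, and the choice to return 0 for empty keyword sets or a zero denominator are assumptions made for the example. It applies the overlap coefficient to keyword sets and the Pearson coefficient to the items two groups have both rated. The similarities it computes for Group 1 round to the Sim column of Table 2, and with Group 2 as the only neighbour the prediction formula evaluates to about 116.5 for Book 5.

```python
from math import sqrt

# Ratings copied from Table 1; None marks the rating to be predicted.
RATINGS = {
    "Group 1": [86, 33, 60, 146, None],
    "Group 2": [85, 62, 69, 139, 124],
    "Group 3": [110, 90, 125, 118, 90],
    "Group 4": [62, 146, 142, 124, 184],
    "Group 5": [97, 136, 170, 45, 135],
}


def overlap(x, y):
    """Overlap coefficient |X n Y| / min(|X|, |Y|) of two keyword sets."""
    x, y = set(x), set(y)
    if not x or not y:
        return 0.0  # assumption: an empty keyword set overlaps with nothing
    return len(x & y) / min(len(x), len(y))


def co_rated(a, b):
    """Rating pairs for the items that both users have rated (the set P)."""
    return [(ra, rb) for ra, rb in zip(a, b) if ra is not None and rb is not None]


def pearson(a, b):
    """Pearson correlation of two users over their co-rated items."""
    pairs = co_rated(a, b)
    mean_a = sum(ra for ra, _ in pairs) / len(pairs)
    mean_b = sum(rb for _, rb in pairs) / len(pairs)
    num = sum((ra - mean_a) * (rb - mean_b) for ra, rb in pairs)
    den = sqrt(sum((ra - mean_a) ** 2 for ra, _ in pairs)) * sqrt(
        sum((rb - mean_b) ** 2 for _, rb in pairs)
    )
    return num / den if den else 0.0  # assumption: no variance means sim = 0


def predict(target, neighbours, item):
    """pred = mean(target) + sum(sim * (r_i - mean_i)) / sum(sim)."""
    rated = [r for r in target if r is not None]
    target_mean = sum(rated) / len(rated)
    num = den = 0.0
    for other in neighbours:
        sim = pearson(target, other)
        # Neighbour mean taken over the items in common with the target.
        common = [ro for ro, _ in co_rated(other, target)]
        num += sim * (other[item] - sum(common) / len(common))
        den += sim
    # Assumption: fall back to the target's own mean if the weights cancel.
    return target_mean + num / den if den else target_mean


if __name__ == "__main__":
    g1 = RATINGS["Group 1"]
    for name, row in RATINGS.items():
        if name != "Group 1":
            print(name, round(pearson(g1, row), 2))
    print(predict(g1, [RATINGS["Group 2"]], item=4))
    print(overlap({"php", "mysql"}, {"php", "mysql", "css"}))
```

The denominator follows the formula as written in the text, a plain sum of similarities. Many implementations divide by the sum of absolute similarities instead, which matters once neighbours with negative correlation, such as Groups 4 and 5, are included.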
Table 3 Prediction of Group 1 and Group 2 by Pearson Correlation Coefficient
           Group 1 (X)   Group 2 (Y)    Y - \bar{Y}    Sim(X, Y) * (Y - \bar{Y})
Book 1         86            85           -3.75             -3.674
Book 2         33            62          -26.75            -26.205
Book 3         60            69          -19.75            -19.347
Book 4        146           139           50.25             49.226
Book 5          ?           124           35.25             34.531
Total                                                        34.531

\bar{X} = mean of Group 1 over Books 1 to 4 = 81.25
\bar{Y} = mean of Group 2 over Books 1 to 4 = 88.75

\[ \mathrm{pred}(u, p) = \bar{r}_u + \frac{\sum_{i=1}^{n} \mathrm{sim}(u, i)\,(r_{i,p} - \bar{r}_i)}{\sum_{i=1}^{n} \mathrm{sim}(u, i)} \]

Analysis of Content-Based
Table 4 Ratings Database of Overlap
           Book 1   Book 2   Book 3   Book 4   Book 5   Overlap
Group 1      86       33       60      146       ?
Group 2      85       62       69      139      124       1
Group 3     110       90      125      118       90      0.9
Group 4      62      146      142      124      184      0.8
Group 5      97      136      170       45      135      0.7

Figure 4 Overlap of Group 1 and Group 2 (chart of the ratings of Groups 1 and 2 for Books 1 to 5; vertical axis 0 to 160)

Figure 6 Scatter Plot with Line of best fit (axes 0 to 6 and 0 to 200)

V. CONCLUSION
As shown in the statement of the problem, it is clear that finding an article or book which interests the reader in the digital age has become almost impossible, since the number of available sources is enormous. Even though good search engines have been built, this type of tool is not going to tell you which books suit you the most.
The main goal of this research, hence, is to find a model of a recommender system for a digital library, with Norton University's digital library chosen as a case study. The specific objectives of this research are 1) to identify the types of recommender system available for digital libraries, 2) to illustrate how the 'collaborative filtering', 'content-based', and 'personalizing' approaches are used in a digital library, and 3) to introduce a model of a recommender system for a digital library.
This research also introduces what a recommender system is, as well as the challenges in recommender systems and the importance of digital libraries. The chosen techniques of the recommender system were evaluated. A survey technique was used. 200 random samples were collected for this research from third-year and fourth-year students studying Computer Science, Information Management, Software Management, and Software Engineering.
Based on the findings shown in chapter IV, the three techniques chosen worked well and produced considerable recommendation improvement, as shown in Figure 6. Hence the newly proposed model answered the research questions.
In summary, this research has contributed in two ways. Firstly, a model of a recommender system for a digital library is proposed. Secondly, the 'collaborative filtering', 'content-based' and 'personalized' techniques are shown to work well together.
The implications of the research can be beneficial for practitioners, both readers and librarians. By applying this model to the existing digital library system at Norton University, students would find their favorite books or articles, and a series of books of similar taste, more easily and faster. They would spend less time searching and gain more time for reading. The University's Internet bandwidth would no longer be wasted. The librarians and the management team would also understand the students' tastes better.

Recommendations
Even though this research has developed a model of a recommender system for a digital library and found significant results from a combination of the three chosen techniques, it is limited to the research scope. That is, an actual web-based system needs to be built to test actual data. Actual data may differ significantly from this small sample. In a real system, new techniques may be discovered and may be needed to assist the existing techniques in order to improve the quality of recommendation. Future research should work with actual data collected from a real system.

Limitations
The body of work aimed at empirically studying the determinants of the intention to participate in a model of a recommender system for a digital library. A survey technique was used to collect data. First, a pilot study on recommender system for digital libraries users and
  7. 7. librarian was run to find out any different type of second DELOS network of excellence workshop onrecommender system. And then pre-test included eight Personalization and Recommender Systems in Digitaluser and two librarians who were experienced in Libraries. ERCIM Workshop proceedings No01/W03.recommender system for digital libraries. Dublin City University. According to scope and limitation of this research  Degemmis, M., Lops, P., & Semeraro, G. (2007). Athat have been stated, there are four major in Computer content-collaborative recommender that exploitsScience, Information and Communication Technology, WordNet-based user profiles for neighborhood formation. User Modeling and User-AdaptedNetwork and Security, and Software Development were Interaction: The Journal of Personalization Re-search,selected to study on this research. To be correctly and 17(3), 217-255.effectively study, students of Computer Science and  Fink, J., Kobsa, A., & Nill, A. (1997). Adaptable andInformation Management that are totally around 195 Adaptive Information Access for All Users, Includingstudents and 05 librarian of Norton University were the Disabled and the Elderly. A. Jamesson, C. Paris andchose to complete the surveys from 15th to 30th of C. Tasso (Eds.), User Modeling: Proceedings of the SixthSeptember 2012 so that recommender system is International Conference, UM97, pp.171-173.applicable.  Loeb, S. & Terry, D. (1992). Information filtering,Acknowledgements Communications of the ACM, Special Issue onI would like to pay my highly appreciation and thankful Information Filtering, Vol. 35 No. 12, pp. 26-8.for those people who have helped and contributed so  Linden G., Smith B. & York J. (2003). Amazon.commany useful resource, ideas, and time toward the Recommendations: Item-to-Item Collaborative Filtering,completion of this thesis. Without their help, I could not IEEE Internet Computing, v.7 n.1, pages 76-80, Januarybe able to finish it. 
Firstly, I would like to pay my highly 2003respect to my parents who have been supporting me in  Miller, G. A. (1995). WordNet: A lexical data-base forevery way they can in order to ease my study. Without English. Communications of the ACM, 38(11), 39-41.their help and guidance, I would not have finished my  Mladenic, D .(1999). Text-learning and Relatedmaster degree. Secondly, I would like pay my highly Intelligent Agents, A Survey. IEEE Intelligent Systemsappreciation to my advisor, Prof. Oum Saokosal, who 14(4), pp.44–54.  Mooney, R. & Roy, L. (2000). Content-based bookhave supported me in generating good ideas as well as recommending using learning for text categorization ,provided some critical insight related to the thesis so that Proceedings of the 5th ACM Conference on DigitalI could did it smoothly into the right direction. Without Libraries, San Antonio, ACM Press, New York, NY, pp.his guidance and patience, the thesis will not be done 195-204.correctly. Thirdly, I would like to give my appreciation to  Onsrud, H., & Lopez, X. (1998). Intellectual propertyall teachers and lecturers at Norton University who have rights in disseminating digital geo-graphic data,been actively support to all 2nd year students including products and services: Conflicts and commonalitiesme in our thesis writing. Last but not least, I would like to among EU and U.S. approaches. In P. Burrough, & I.thanks to all the students at Norton University who Masser (Eds.), European Geographic Infrastructures:helped one another both in terms of knowledge sharing Opportunities and Pitfalls, GISDATA 5 (pp. 127–135).and direction pointing toward a successful thesis. Taylor & Francis.Without these mentioned people, this thesis would not  Palani, A., Fox, E., Yang, S., & Ganesan, V. (2009).have been existed. So, I would like to pay my highly Digital Library/ Recommender Systems Curriculumappreciation to these people. May all the best things come Development.to all of us.  Riecken, D. 
(2000). Personalized Views of References Personalization. Communications of the ACM, 43 (8), Arms, W. (2001). Digital libraries (2nd ed.). MIT Press. pp.27-28. Aslesen, L. (1998). Intellectual property and map-ping:  Shardanand, U.,& Maes, P.(1995).Social Information A European perspective. In P. Burrough, & I. Masser Filtering: Algorithms for Automating “Word of Mouth”. (Eds.), European Geographic Infra-structures: In: Proceedings of ACM CHI‟95 Conference on Human Opportunities and Pitfalls, GISDATA 5 (pp. 127–135). Factors in Computing Systems, vol. 1, pp. 210–217 Taylor & Francis.  Various. (2001). Special issue on the theme digital Breese, J., Heckerman, D. & Kadie, C. (1998), Empirical libraries. Communications of the ACM, 44(5). analysis of predictive algorithms for collaborative  Witten, I.H., & Frank E. (1999). Data Mining. Practical filtering , Technical Report MSR-TR-98-12, Microsoft Machine Learning Tools and Techniques with JAVA Research, Seattle, CA. Implementations. Morgan Kaufman Publishers. Brusilovsky, P., & Schwarz, E. (1997). User as Student:  Zukerman, I., Albrecht, D.W., & Nicholson, A.E. (1999). Towards an Adaptive Interface for Advanced Web- Predicting Users Request on the WWW. Proceedings of Based Applications. A. Jamesson, C. Paris and C. Tasso the 7th International Conference on User Modeling, (Eds.), User Modeling: Proceedings of the Sixth UM99, pp.275-284. International Conference, UM97, pp.177-188. Callan, J., Smeaton, A., Beaulieu, M., Borlund, P., Brusilovsky, P., Chalmers, M., Lynch, C., Riedl, J., Smyth, B., Straccia, U., & Toms E. (2003). Personalization and Recommender Systems in Digital Libraries, Joint NSF-EU DELOS Working Group Report. Callan, J., & Smeaton, A. (2001). Proceedings of the 7
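The figures in Table 3 can be checked with a short calculation. The sketch below is only an illustration, not the authors' implementation: it assumes the similarity is the Pearson correlation of Group 1 and Group 2 over the four books both groups rated, and that the missing Group 1 rating for Book 5 is predicted with the usual mean-centred, similarity-weighted rule. The function name `pearson` and all variable names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long rating lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Ratings for Books 1 to 4, taken from Table 3 / Table 4.
group1 = [86, 33, 60, 146]
group2 = [85, 62, 69, 139]
group2_book5 = 124  # Group 2's rating of Book 5; Group 1's is unknown.

sim = pearson(group1, group2)
mean1 = sum(group1) / len(group1)
mean2 = sum(group2) / len(group2)

# Last column of Table 3 for Book 5: Sim(x, y) * (Y - mean of Y).
weighted_dev = sim * (group2_book5 - mean2)

# Assumed prediction rule (single neighbour): Group 1's mean plus the
# similarity-weighted deviation, normalised by |Sim(x, y)|.
predicted_book5 = mean1 + weighted_dev / abs(sim)

print(f"Sim(x, y) = {sim:.4f}")
print(f"Sim(x, y) * (Y - mean) for Book 5 = {weighted_dev:.3f}")
print(f"Predicted Group 1 rating for Book 5 = {predicted_book5:.2f}")
```

With a single neighbour the normalisation cancels the similarity, so the prediction reduces to Group 1's mean plus Group 2's raw deviation for Book 5. With several neighbouring groups, the same rule would weight each group's deviation by its similarity.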
Questionnaire for Student

I. Students General Information:

1. Age:
   [ ] A. 18-21
   [ ] B. 22-25
   [ ] C. 26-29
   [ ] D. 30-33
   [ ] E. more than 33
2. Sex:
   [ ] A. M
   [ ] B. F
3. In the College of Science, which major are you in?
   [ ] A. Information and Communication Technology
   [ ] B. Computer Science
   [ ] C. Software Development
   [ ] D. Network and Security
   [ ] E. Other ..............
4. Do you have any online shopping experience (like Amazon.com)?
   [ ] A. Yes
   [ ] B. No
5. Do you have any experience finding books, articles, and journals in an E-library?
   [ ] A. Yes
   [ ] B. No
   If yes, how does it work? ..............
   If no, why? ..............
6. Does Norton University have a recommender system for its digital library?
   [ ] A. Yes
       If yes, what kind of recommender system is used? ..............
   [ ] B. No
       If no, do you want to have one? Like: www.yahoo.com, www.youtube.com, and www.amazon.com.

II. Students Recommender System Technique:

Answer each statement by ticking each answer box. Please choose 5 of the 10 books below that you like to read and use for research, and score them using these ratings as a guide:

1 = It is useless
2 = Not very useful
3 = Neutral
4 = Nice to have
5 = Excellent

Statement   Title                                                                              1  2  3  4  5
1.          Cisco CCNA in 60 Days
2.          Practical Database Programming with Visual Basic.NET
3.          PHP, MySQL, JavaScript, and CSS
4.          Professional Java E-Commerce
5.          Network Analysis, Architecture, and Design
6.          Oracle Programming with Visual Basic
7.          Microsoft Visual Basic 2010 for Windows, Web, Office, and Database Applications
8.          Cloud Computing
9.          Oracle Database 11g & MySQL 5.6 Developer Handbook
10.         Object-Oriented and Classical Software Engineering

Please add below any other comments:
------------------------------------------------------------
------------------------------------------------------------

Thank you for your time.
Questionnaire for Librarian

I. Librarian General Information:

1) Age:
   [ ] A. 18-21
   [ ] B. 22-25
   [ ] C. 26-29
   [ ] D. 30-33
   [ ] E. more than 33
2) Sex:
   [ ] A. M
   [ ] B. F
3) What kind of library does Norton University have?
   [ ] A. Library
       If yes, how does it work? ..............
       If no, why? ..............
   [ ] B. E-library
       If yes, how does it work? ..............
       If no, why? ..............
   [ ] C. Library Managements
       If yes, how does it work? ..............
       If no, why? ..............
   [ ] E. Other ..............
4) Does Norton University have a recommender system for its digital library?
   [ ] A. Yes
       If yes, what kind of recommender system is used? ..............
   [ ] B. No
       If no, do you want to have the tools and services of a recommender system for the digital library of Norton University? Like: www.yahoo.com, www.youtube.com, and www.amazon.com.

Please add below any other comments:
------------------------------------------------------------
------------------------------------------------------------

Thank you for your time.