3 05564736


Published on

Published in: Technology
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

3 05564736

  1. 1. Research of Data Mining Based on E-Commerce Liu Xiao-liang Li Yong-hong, Dept. of Business Hebei University of Economics and Business Dept. of Account Hebei University of Economics and Shijiazhuang, Hebei Business lx1130225@163.com Shijiazhuang, Hebei liyonghong 1966@126.comAbs1lllcf-this paper describes the Web data mining teehnology,Web data mining process, data sources and uses, designs apersonalized recommendation system based on Web usemining for e-commerce, and discusses in detail the function ofeach module of the system and implementation techniques.Through the mining of the historical log data of Web server, lCb mining Web usage miningthe system can access to the user browsing patterns to providepersonalized service for users. KI!)H01fIs-IYeIJ tiotfl lIIining; e-ctJlIII1II!1Ce; recollllllt!lllltllllSJSIelll Figure I. Web mining classification I. INTRODUCTION With the increasing popularity of Internet and the rapid J) Web content miningdevelopment of e-commerce, internet-based business Web It refers to the process of mining for the Web pagesite is facing increasing competition. E-commerce sites content and back-office database, obtaining the usefulgenerate large amounts of data each day, and the data knowledge from the Web docwnents, which also can carryincludes conswner-related potential information which is out mining for the Web structure and link relationship tovaluable for market analysis and prediction [I] . So how to obtain useful knowledge from artificial link structure.effectively organize and use such business information and 2) Web structure mininghow to understand customer interests and value preferences Web structure includes the hyperlinks between differentas much as possible to optimize website design and provide pages and the tree structure within the page which can beusers with personalized service, becomes the issues of the expressed by HTML, XML, as well as the directory pathe-commerce to be solved urgently. structures in docwnent URL. Hyperlink structure between Web data mining is the application of data mining the Web pages contains much useful information [2] .technology in the Web environment, is the process to find 3) Web usage miningthe implicit, unknown potential value, non-trivial model The object of Web content mining, Web structure miningfrom a large nwnber of Web docwnents and the data of is the original data in the network, but the Web usage miningusers visiting the website. Web data mining can play a role is different from the first two, which faces the second-handin many areas, and e-commerce provides a wealth of data data extracted from the process of user and networksources and new research topic for the data mining. interactions. Web usage mining is to find the mode of users to access the Web pages through mining the Web log files II. WEB DATA MINING and the related data. It also can identify e-commerceA. Web data mining overview potential customers through analyzing and exploring the Web mining is a technology to combine the data mining principles in the Web log records, to strengthen the qualitytechnology and the Internet, simply said, Web mining is to and delivery of end-user Internet information services, andextract the interesting, potential useful patterns and hidden improve the performance of the Web server system. B.information from the Web docwnents and Web activities. Web data mining application in e-commerceOn Web there are a mass of data, so how to carry out The emergence of e-commerce has changed thecomplex application of these data becomes the research traditional business models, also changed the relationshiphotspot of current database technology. Data mining is just between the vendors and customers. The expanded customerto find the implicit content with regularity from large choice makes them pay more attention to the value of goods,amounts of data to resolve data quality problems. but unlike the previous first consider the brand and At present, according to the direction of mining, Web geographical factors. Therefore, in terms of the seller to asdata mining is mainly divided into three categories: Web much as possible understand customers tastes and values cancontent mining, Web structure mining and Web usage remain invincible in the competition. Data miningmining, as shown in figure 1. technology can effectively help sellers to understand978-1-4244-5540-9/10/$26.00 ©2010 IEEE 719
  2. 2. customer behaviors and improve the efficiency of site, thus variety of mmmg techniques such as association rules,having been widely used in the e-commerce design, service clustering and use clustering.customer relationship management, Internet marketing, and Online recommendation module sets the recommendationother fields. engine in the Web server front-end, which combines the current browse activities of the user and the mined page III. E-COMMERCE RECOMMENDATION SYSTEM DESIGN recommended set together to generate the corresponding AND IMPLEMENTATION recommendation set. The task of e-commerce recommendation system is to Model application module is to add the page links of theobserve the users access request online, identify the user recommended set in the latest user request page, and thenseSSIOn set 0f each current on1· me user [4] , to automatlca11y through the web server to deliver to the client browser [6] , to . .recommend the pages they may be interested and have not provide users real-time personalized service; at the same time,visited according to the user access pattern library the recommended results will be sent to the site managementestablished in the pattern mining module. center to adjust the site design, optimize the site structure, thus to improve the site efficiency.A. System overall design The system applies the data mining ideas and methods to The system consists of five modules: data acquisition the Web server logs for Web usage mining, to extract the dU(module, data preprocessing module, offline mining module, users access law, and then comes the online access usersonline recommendation module and model application down to a certain class, according to the law of such usersmO [ ill fiT 2 i [ i] access to carry out Web page recommendation, thus to provide users with personalized access. B. System specific module design and implementation 1) O.lfline module Web Server Offline module mainly consists of data pre-processing, Web mining and other modules, to provide support to the 3E _rr-- ] recommended engine, the system structure as shown in figure 3. VL- Rule data Offline module N- --- -- -� Recommendation (7 algorithm Rule model -- lr -- Accumulation of data --- fF Figure 2. E-commerce recommendation system based on Web mining (7 Recommendation The main function of the data acquisition module is to � Online modulecollect the original log data to form data acquisition libraries engineand prepare for future mining. !--v Data pre-processing module is mainly to pre-process the Figure 3. Offline module structuredata in the data acquisition library to obtain a usertransaction database, and a user transaction is the set of a Data pre-processing and Web mining have relativelygroup of meaningful browser pages of a user in a period of large time cost, can not meet the real time requirement oftime. personalized recommendation services, so the two operations Offline mining module is mainly composed by the must be carried out offline, while the mining results can bemining engine, mining algorithm library, pattern discovery directly referred by the online recommendation engine.sub-module and pattern analysis sub-module. In which the Data preprocessing is to complete the work of two parts:mining engine uses the data mining technology in the delete the data irrelevant to the data source (Web log file),mining algorithm library to discover user patterns, through generate the user session documents and transaction databasethe pattern analysis to analyze and interpret it to obtain useful for the mining stage and generate site data files for themeaningful page recommendation set, this system provides a relevant documents in site. Pattern mining task is to use data 720
  3. 3. mmmg algorithms to carry out mmmg for the resulttransaction database of data preprocessing, to generate Client Clientfrequent user access page group, that is, user browse modelibrary [3] . The offline part needs periodical mining to updatethe user browse mode library. In general, the offline part of the recommended system Internetmainly aims to registered users, is to clean up therecommended set according to the key information usersprovided, to display the precise information of user interestin the recommend page set. Web Server 2) Online module The online part mode is finally to provide customerswith the best quality of recommendation services when the Data interfacecustomer online browsing, so the design of the online part iscrucial for the whole recommendation system. Online part is composed by the pattern analysis tasks andthe data source acquisition tasks. In which the patternanalysis tasks use the offline part to generate results and then Recommendedonline recommend the next step operation for users result set Recommendation engineaccording to the current access behaviors of users, whichconsists of personalized recommendation engine and Webservers, personalization recommendation engine is to findthe users interest through analyzing the users current accessoperation, and generate recommendation results through the Web databaserecommendation algorithms, thus to achieve personalizedservices. The acquisition task of the data source is to collectthe Web log files on Web server, and this process is Figure 4. shows the structure diagram of the system online partautomatically completed by configuring the serverscorresponding parameters. From the above figure it can be seen, when users visit the Online mode is divided into two situations, one is for Web site the web page request will be first sent to the Webregistered user login, second is for random non-registered server to be submitted to the recommendation engine module,users. Registered users can change any interest, while the and then the latter will use the pattern library the offlinerecommendation system based on the users choice will form module generated and Web database to calculate, generatea recommended set and display the precise recommendation recommendation result set to be returned to the Web serverpages, and if the changed interest term does not include the to return to the client through Http protocol, finally displayedkey information of the registered users, then the in the browser. Thus the whole recommendation process isrecommended set will be generated from the original achieved. And from this process it can be seen that the entirerecommended set, so that the recommendation page users calculation process is completed by the recommendationobtained will be more accurate. engine. For non-registered users, the recommended page isclosely related to the user interest, while the precise extent IV. CONCLUSIONlargely depends on the user interest item selection, that is, At present, Web data mining has gradually become a hotthe more the user interest item constraints the more accurate issue in network research, data mining, knowledge discovery,the recommended page [5] , but this has the cost of the user software agent, and other fields. Log mining research fortime consumption. The first page of random users includes Web site optimization, e-commerce, distant education,all the categories and types. And after a series of choices, the information retrieval and other fields has great significance.recommendation system will finally generate a However, how to complete these technologies and as soon asrecommended page, and the more the selected interest items, possible use them in a variety of Internet applications is athe more precise the recommended page. new topic we faced. This paper studies the e-commerce recommendation system models based on Web data mining, designs a personalized recommendation system based on Web use mining for e-commerce, and discusses in detail the function of each module of the system and implementation techniques. The system can access to the user browsing patterns to provide personalized service for users through the mining of the historical log data of Web server. 721
  4. 4. REFERENCES[I]. [Mobasher, B. Dai, Improving the Effectiveness of Collaborative Filtering on Anonymous Web Usage Data, ITWPOI, 2001.[2]. Zhen Liu, Minyi Guo. A Proposal of Integrating Data Mining and On-line Analytical Processing in Data Warehouse, In Proceeding of international conference on mechatronics and information technology, 2001:146-51.[3]. Sarwar B.Karypis, G.Konstan, J., and Riedl, J. Analysis of Recommendation Algorithms for E-Commerce, ACM Conference on Electronic Commerce, 2000.[4]. N.Colossi, W.Malloy, B. Reinwald Relational extensions For OALP, IBM Systems Journal [J].2002, 41(4):31-35.[5]. CHEN Yu-ru, HUNG Ming-chuan, Don-lin YANG, Using Data Mining to Construct an Intelligent Web Search System [1], International Journal of Computer Processing of Oriental Languages, 2003.[6]. S.Foucaud, A.Zanichelli, B.Garilli, M. Scodeggio and P.Franzetti. VIPGI and Elise3D: Reducing VIMOS-IFU Data and Searching for Emission Line Sources in Data Cubes. New Astronomy Reviews, In Press, Corrected Proof, and Available online.19 April 2006. 722