SlideShare a Scribd company logo
1 of 5
Download to read offline
IOSR Journal of Computer Engineering (IOSR-JCE)
e-ISSN: 2278-0661, p- ISSN: 2278-8727Volume 11, Issue 3 (May. - Jun. 2013), PP 57-61
www.iosrjournals.org
www.iosrjournals.org 57 | Page
“A new approach for user identification in web usage mining
preprocessing”
1.Arvind kumar Dangi, 2.Asst. Prof. Sunita Sangwan
1.(Software Engineering ,PDM College of engineering ,India) akdangi@gmail.com
2.(Computer Science & Engineering ,PDM College of engineering ,India) sunita2009@gmail.com
Abstract : Web usage mining is a subset of data mining. In order to huge amount of data but the data is less
appropriates “quantity and quality” of the web data is opposite to each other this is the main problem. Web
data usage Mining is a variation of this field is untapped source of richly offered free textual information. The
importance of web data usage mining is mounting along with the immense volumes of data generated in web
habitual existence data always arrives in a various, continuous, rapid and time varying flow. Web data usage
mining taking out procedures are important in extracting useful streaming on-line sources. As throughout the
globe no. of web users are continuously and rapidly growing, it is necessary for the web usage miners to utilize
efficient tools in order to discover, extort, clean and assess the desired information. The data pre-processing
stage is the most important phase in the preprocessing for investigation of the web user & his usage behavior.
To fulfill this requirement the navigations are recorded in web log file as well as the IP address of the website,
session of usage & visited web link. In order to improve the performance & quality of data preprocessing in
order to identify unique users and user sessions. We propose a new method for web data preprocessing in which
it has three phases. “In the first phase some websites are selected and by different locations access these
website & by applying the (java) tools & methods then find out the IP address of that websites, session usage
time & navigations, in the final phase combine them i.e.(web link navigation + IP address of website + session
of usage ). This framework helps to investigate the web user usage behavior efficiently.
Keywords - :-weblink, navigation, threading, networking, diminution, web logs, session, Data clearout, Data
makeover.
I. INTRODUCTION
Data pre-processing is an important and required phase in Web usage mining. Different methods &
techniques are employed in web usage mining to remove irrelevant items and identify IP address, sessions of
usage and navigation along with the extraction information. There is no of steps define for preprocessing asData
clear out also known as the routines work to “clean” the data by satisfying in missing values, smoothing noisy
data, identifying or removing outliers, and resolving inconsistencies.If users believe the data is noisy, they are
improbable to trust the results of any data mining that has been applied to it. Also, dirty data can cause
uncertainty for the mining procedure, resulting in unreliable output. But, they are not always robust. a new
method for web data preprocessing in which it has three phases. “In the first phase some websites are selected
and by different locations access these website & by applying the (java) tools & methods then find out the IP
address of that websites, session usage time & navigations, in the final phase combine them i.e.(web link
navigation + IP address of website + session of usage ). This framework helps to investigate the web user usage
behavior efficiently.
1. Introduction
2. Indentation and equations
2.1 Data clear out
2.2 IP address identification
2.3 Session of usage identification
2.4 Data integration
2.5 data makeover
2.6 Data diminution
2.6 Data usage mining
3. Figures & tables
3.1Framework for research Activities
3.2 Representing research work
3.3 Sequential steps which followed for the research work
4. Conclusion & future work
References
“A new approach for user identification in web usage mining preprocessing”
www.iosrjournals.org 58 | Page
II. INDENTATIONS AND EQUATIONS
2.1 Data clear out (cleaning)
Data clear out also known as the routines work to “clean” the data by satisfying in missing values,
smoothing noisy data, identifying or removing outliers, and resolving inconsistencies.
2.2 IP address identification & Web link visited identification
Using the following function & code we Identify the IP address & web link visited by the end user using the
java language.
try
{
Statement st=con.createStatement();
int a=st.executeUpdate("INSERT INTO cdass.tracing (userid, `IP`, visited_page) n" +"VALUES
('"+userid+"', '"+ip+"', '"+page.toString()+"');n" +"");
}catch(Exception e)
{
System.out.println("Error During tracing.............."+e);
}
finally
{ try {
con.close();
} catch (SQLException ex) {
Logger.getLogger(track.class.getName()).log(Level.SEVERE, null, ex);
}
}
2.3Session of usage identification
Identify the session of usage by the user at particular web site or the web page.
long d=session.getCreationTime();
long dd=session.getLastAccessedTime();
out.print(new Date(dd).getSeconds()-new Date(d).getSeconds());
session.removeAttribute("userid");
Session Tracing
<%
tracing.track.setTracking((String)session.getAttribute("userid"),request.getRemoteAddr(),request.getRequestUR
L());
%>
<%
long d=session.getCreationTime();
long dd=session.getLastAccessedTime();
// out.print(new Date(d).getSeconds());
out.print(new Date(dd).getSeconds()-new Date(d).getSeconds());
// out.println((dd-d)/60);
// System.out.println(session.getLastAccessedTime());
session.removeAttribute("userid");
// response.sendRedirect("login_form.jsp");
%>
Packages used for achieving the above tasks
import java.sql.Connection;
The JDBC ( Java Database Connectivity) API defines interfaces and classes for writing database
applications in Java by making database connections. Using JDBC you can send SQL, PL/SQL statements to
almost any relational database. JDBC is a Java API for executing SQL statements and supports basic SQL
functionality. It provides RDBMS access by allowing you to embed SQL inside Java code. Because Java can
run on a thin client, applets embedded in Web pages can contain downloadable JDBC code to enable remote
database access.
import java.sql.Statement;
To execute a SQL statement on your table, you set up a Statement object. So add this import line to the
top of your code:
import java.sql.Statement;
“A new approach for user identification in web usage mining preprocessing”
www.iosrjournals.org 59 | Page
In the try part of the try … catch block add the following line (add it just below your Connection line):
Statement stmt = con.createStatement( );
Here, we're creating a Statement object called stmt. The Statement object needs a Connection object, with
the createStatment method.
We also need a SQL Statement for the Statement object to execute. So add this line to your code:
String SQL = "SELECT * FROM Workers";
The above statement selects all the records from the database table called Workers.
We can pass this SQL query to a method of the Statement object called executeQuery. The Statement object
will then go to work gathering all the records that match our query.
import java.util.logging.Level;
The java.util.logging package provides the logging capabilities via the Logger class.
To create a logger in your Java coding you can use the following snippet.
Import java.util.logging.logger
Private final static logger logger = logger.getlogger(myclass.class.getname())
The Logger you create a actually a hierarchy of Loggers, and a . (dot) in the hierarchy indicates a level in the
hierarchy. So if you get a Logger for the com.example key this Logger is a child of the com Logger and
the com Logger is child of the Logger for the empty String. You can configure the main logger and this affects
all its children.
SIGN IN TIME SIGN OUT TIME BROWSER AGENT URL IP
2.4 Data integration
Integration we mean that a dataset can be retrieved and added to other datasets to create greater, more
robust and useful datasets.The discipline of data integration comprises the practices, architectural techniques
and tools for achieving the consistent access and delivery of data across the spectrum of data subject areas and
data structure types in the enterprise to meet the data consumption requirements of all applications and business
processes.
2.5 Data makeover
Data makeover process applies on the data which is collected from different sources of the data. In the
make over process it organized the data minimizes redundancy and dependency of the data.
2.6 Data diminution
These are the technique that can be applied to get a reduced representation of the data set that is much smaller in
volume maintain the data integrity of the original data.
2.7 Usage mining
Web Usage Mining process is a user session file that gives an exact account about the user accessed the
Web site, what pages were visited and in what order, and how much time each page was visited. A user session
is the set of the page accesses that occur during a single visit to a Web site. the information contained in a raw
Web server log represent a user session file before data preprocessing
III. FIGURES AND TABLES
3.1Framework of activities
“A new approach for user identification in web usage mining preprocessing”
www.iosrjournals.org 60 | Page
3.2Research work representation
3.3 Sequential steps which followed for the research work
user
DATA CLEAR OUT
IP ADDRESS IDENTIFICATION
SESSION OF USAGE IDENTIFICATION
WEB LINK VISITED IDENTIFICATION
DATA INTEGRATION
DATA MAKEOVER
DATA DIMINUTION
USAGE MINING
“A new approach for user identification in web usage mining preprocessing”
www.iosrjournals.org 61 | Page
IV. CONCLUSION
There are several data preprocessing techniques. Data cleaning can be applied to remove noise and
correct inconsistencies in data. Data integration merges data from multiple sources into a logical data store such
as a data warehouse. Data diminution can reduce data size by, for instance, aggregating, eliminating redundant
features, or clustering. Usage of data mining method and data extraction on the web is now highlighting of the
boosting number of researchers. web usage mining is a type of data mining technique that can be useful in
suggesting the web usage type with the identification of user session, this research also help in identification of
the end user behavior with the help of the identification of the user (IP address + web link navigation +
session usage) and combine them together for identification user behavior more accurately for preprocessing
data. This research mainly concentrates on providing another approach for preprocessing the data more
accurately. Experimental results suggest the significance of the proposed approach.
REFERENCES
[1] Dr. M. Giri and Dr. Akash Kumar, “An Efficient Web Content Mining using elevance Analysis Approach”, International Journal of
Multidisciplinary Research in Advanced Engineering, pages. 201-210, 2012.
[2] Dr. M. Giri and Dr. Akash Kumar, “An Efficient Web Content Mining using Divide and Conquer Approach”, International journal
of Computational Intelligence Research, pages. 201-210, 2012.
[3] Dr. M. Giri and Dr. Akash Kumar, “An Efficient Web Content Mining using Multi Threading Approach”, International Journal of
Systems, Algorithms and Applications, pages. 1-4, 2012
[4] Yan LI, Boqin FENG and Qinjiao MAO, “Research on Path Completion Technique In Web Usage Mining”, IEEE International
Symposium On Computer Science and Computational Technology, pp. 554-559, 2008.
[5] JING Chang-bin and Chen Li, “ Web Log Data Preprocessing Based On Collaborative Filtering ”, IEEE 2nd International
Workshop On Education Technology and Computer Science, pp.118-121, 2010.
[6] Huiping Peng, “Discovery of Interesting Association Rules Based On Web Usage Mining”, IEEE Coference, pp.272-275, 2010.
[7] Tasawar Hussain, Dr. Sohail Asghar and Nayyer Masood, “Hierarchical Sessionization at Preprocessing Level of WUM Based on
Swarm Intelligence ”, 6th International Conference on Emerging Technologies (ICET) IEEE, pp. 21- 26, 2010.

More Related Content

Viewers also liked

Discovering Frequent Patterns with New Mining Procedure
Discovering Frequent Patterns with New Mining ProcedureDiscovering Frequent Patterns with New Mining Procedure
Discovering Frequent Patterns with New Mining ProcedureIOSR Journals
 
Area Optimized Implementation For Mips Processor
Area Optimized Implementation For Mips ProcessorArea Optimized Implementation For Mips Processor
Area Optimized Implementation For Mips ProcessorIOSR Journals
 
Designing the Workflow of a Language Interpretation Device Using Artificial I...
Designing the Workflow of a Language Interpretation Device Using Artificial I...Designing the Workflow of a Language Interpretation Device Using Artificial I...
Designing the Workflow of a Language Interpretation Device Using Artificial I...IOSR Journals
 
B01120813 copy
B01120813   copyB01120813   copy
B01120813 copyIOSR Journals
 
Performance analysis of High Speed ADC using SR F/F
Performance analysis of High Speed ADC using SR F/FPerformance analysis of High Speed ADC using SR F/F
Performance analysis of High Speed ADC using SR F/FIOSR Journals
 
Fuzzy Control of Multicell Converter
Fuzzy Control of Multicell ConverterFuzzy Control of Multicell Converter
Fuzzy Control of Multicell ConverterIOSR Journals
 
Service Based Content Sharing in the Environment of Mobile Ad-hoc Networks
Service Based Content Sharing in the Environment of Mobile Ad-hoc NetworksService Based Content Sharing in the Environment of Mobile Ad-hoc Networks
Service Based Content Sharing in the Environment of Mobile Ad-hoc NetworksIOSR Journals
 
An Efficient Approach for Outlier Detection in Wireless Sensor Network
An Efficient Approach for Outlier Detection in Wireless Sensor NetworkAn Efficient Approach for Outlier Detection in Wireless Sensor Network
An Efficient Approach for Outlier Detection in Wireless Sensor NetworkIOSR Journals
 
Computer Based Human Gesture Recognition With Study Of Algorithms
Computer Based Human Gesture Recognition With Study Of AlgorithmsComputer Based Human Gesture Recognition With Study Of Algorithms
Computer Based Human Gesture Recognition With Study Of AlgorithmsIOSR Journals
 
Interrelation between Climate Change and Lightning and its Impacts on Power S...
Interrelation between Climate Change and Lightning and its Impacts on Power S...Interrelation between Climate Change and Lightning and its Impacts on Power S...
Interrelation between Climate Change and Lightning and its Impacts on Power S...IOSR Journals
 
Procuring the Anomaly Packets and Accountability Detection in the Network
Procuring the Anomaly Packets and Accountability Detection in the NetworkProcuring the Anomaly Packets and Accountability Detection in the Network
Procuring the Anomaly Packets and Accountability Detection in the NetworkIOSR Journals
 
Effect Of Nitrogen And Potassium On The Yield And Quality Of Ginger In The De...
Effect Of Nitrogen And Potassium On The Yield And Quality Of Ginger In The De...Effect Of Nitrogen And Potassium On The Yield And Quality Of Ginger In The De...
Effect Of Nitrogen And Potassium On The Yield And Quality Of Ginger In The De...IOSR Journals
 
2-Dimensional and 3-Dimesional Electromagnetic Fields Using Finite element me...
2-Dimensional and 3-Dimesional Electromagnetic Fields Using Finite element me...2-Dimensional and 3-Dimesional Electromagnetic Fields Using Finite element me...
2-Dimensional and 3-Dimesional Electromagnetic Fields Using Finite element me...IOSR Journals
 
Selfish Node Detection in Replica Allocation over MANETs
Selfish Node Detection in Replica Allocation over MANETsSelfish Node Detection in Replica Allocation over MANETs
Selfish Node Detection in Replica Allocation over MANETsIOSR Journals
 

Viewers also liked (20)

F013114350
F013114350F013114350
F013114350
 
K017357681
K017357681K017357681
K017357681
 
Discovering Frequent Patterns with New Mining Procedure
Discovering Frequent Patterns with New Mining ProcedureDiscovering Frequent Patterns with New Mining Procedure
Discovering Frequent Patterns with New Mining Procedure
 
Area Optimized Implementation For Mips Processor
Area Optimized Implementation For Mips ProcessorArea Optimized Implementation For Mips Processor
Area Optimized Implementation For Mips Processor
 
Designing the Workflow of a Language Interpretation Device Using Artificial I...
Designing the Workflow of a Language Interpretation Device Using Artificial I...Designing the Workflow of a Language Interpretation Device Using Artificial I...
Designing the Workflow of a Language Interpretation Device Using Artificial I...
 
B01120813 copy
B01120813   copyB01120813   copy
B01120813 copy
 
D017152832
D017152832D017152832
D017152832
 
Performance analysis of High Speed ADC using SR F/F
Performance analysis of High Speed ADC using SR F/FPerformance analysis of High Speed ADC using SR F/F
Performance analysis of High Speed ADC using SR F/F
 
Fuzzy Control of Multicell Converter
Fuzzy Control of Multicell ConverterFuzzy Control of Multicell Converter
Fuzzy Control of Multicell Converter
 
Service Based Content Sharing in the Environment of Mobile Ad-hoc Networks
Service Based Content Sharing in the Environment of Mobile Ad-hoc NetworksService Based Content Sharing in the Environment of Mobile Ad-hoc Networks
Service Based Content Sharing in the Environment of Mobile Ad-hoc Networks
 
An Efficient Approach for Outlier Detection in Wireless Sensor Network
An Efficient Approach for Outlier Detection in Wireless Sensor NetworkAn Efficient Approach for Outlier Detection in Wireless Sensor Network
An Efficient Approach for Outlier Detection in Wireless Sensor Network
 
Computer Based Human Gesture Recognition With Study Of Algorithms
Computer Based Human Gesture Recognition With Study Of AlgorithmsComputer Based Human Gesture Recognition With Study Of Algorithms
Computer Based Human Gesture Recognition With Study Of Algorithms
 
Interrelation between Climate Change and Lightning and its Impacts on Power S...
Interrelation between Climate Change and Lightning and its Impacts on Power S...Interrelation between Climate Change and Lightning and its Impacts on Power S...
Interrelation between Climate Change and Lightning and its Impacts on Power S...
 
Procuring the Anomaly Packets and Accountability Detection in the Network
Procuring the Anomaly Packets and Accountability Detection in the NetworkProcuring the Anomaly Packets and Accountability Detection in the Network
Procuring the Anomaly Packets and Accountability Detection in the Network
 
Effect Of Nitrogen And Potassium On The Yield And Quality Of Ginger In The De...
Effect Of Nitrogen And Potassium On The Yield And Quality Of Ginger In The De...Effect Of Nitrogen And Potassium On The Yield And Quality Of Ginger In The De...
Effect Of Nitrogen And Potassium On The Yield And Quality Of Ginger In The De...
 
2-Dimensional and 3-Dimesional Electromagnetic Fields Using Finite element me...
2-Dimensional and 3-Dimesional Electromagnetic Fields Using Finite element me...2-Dimensional and 3-Dimesional Electromagnetic Fields Using Finite element me...
2-Dimensional and 3-Dimesional Electromagnetic Fields Using Finite element me...
 
D012631423
D012631423D012631423
D012631423
 
Selfish Node Detection in Replica Allocation over MANETs
Selfish Node Detection in Replica Allocation over MANETsSelfish Node Detection in Replica Allocation over MANETs
Selfish Node Detection in Replica Allocation over MANETs
 
K1803046067
K1803046067K1803046067
K1803046067
 
G1803044045
G1803044045G1803044045
G1803044045
 

Similar to A new approach for user identification in web usage mining preprocessing

a novel technique to pre-process web log data using sql server management studio
a novel technique to pre-process web log data using sql server management studioa novel technique to pre-process web log data using sql server management studio
a novel technique to pre-process web log data using sql server management studioINFOGAIN PUBLICATION
 
A Novel Method for Data Cleaning and User- Session Identification for Web Mining
A Novel Method for Data Cleaning and User- Session Identification for Web MiningA Novel Method for Data Cleaning and User- Session Identification for Web Mining
A Novel Method for Data Cleaning and User- Session Identification for Web MiningIJMER
 
Ijarcet vol-2-issue-7-2341-2343
Ijarcet vol-2-issue-7-2341-2343Ijarcet vol-2-issue-7-2341-2343
Ijarcet vol-2-issue-7-2341-2343Editor IJARCET
 
Ijarcet vol-2-issue-7-2341-2343
Ijarcet vol-2-issue-7-2341-2343Ijarcet vol-2-issue-7-2341-2343
Ijarcet vol-2-issue-7-2341-2343Editor IJARCET
 
WEB LOG PREPROCESSING BASED ON PARTIAL ANCESTRAL GRAPH TECHNIQUE FOR SESSION ...
WEB LOG PREPROCESSING BASED ON PARTIAL ANCESTRAL GRAPH TECHNIQUE FOR SESSION ...WEB LOG PREPROCESSING BASED ON PARTIAL ANCESTRAL GRAPH TECHNIQUE FOR SESSION ...
WEB LOG PREPROCESSING BASED ON PARTIAL ANCESTRAL GRAPH TECHNIQUE FOR SESSION ...cscpconf
 
Web personalization using clustering of web usage data
Web personalization using clustering of web usage dataWeb personalization using clustering of web usage data
Web personalization using clustering of web usage dataijfcstjournal
 
AN EXTENSIVE LITERATURE SURVEY ON COMPREHENSIVE RESEARCH ACTIVITIES OF WEB US...
AN EXTENSIVE LITERATURE SURVEY ON COMPREHENSIVE RESEARCH ACTIVITIES OF WEB US...AN EXTENSIVE LITERATURE SURVEY ON COMPREHENSIVE RESEARCH ACTIVITIES OF WEB US...
AN EXTENSIVE LITERATURE SURVEY ON COMPREHENSIVE RESEARCH ACTIVITIES OF WEB US...James Heller
 
Web Data mining-A Research area in Web usage mining
Web Data mining-A Research area in Web usage miningWeb Data mining-A Research area in Web usage mining
Web Data mining-A Research area in Web usage miningIOSR Journals
 
Applying web mining application for user behavior understanding
Applying web mining application for user behavior understandingApplying web mining application for user behavior understanding
Applying web mining application for user behavior understandingZakaria Zubi
 
Implementation of Intelligent Web Server Monitoring
Implementation of Intelligent Web Server MonitoringImplementation of Intelligent Web Server Monitoring
Implementation of Intelligent Web Server Monitoringiosrjce
 
Integrated Web Recommendation Model with Improved Weighted Association Rule M...
Integrated Web Recommendation Model with Improved Weighted Association Rule M...Integrated Web Recommendation Model with Improved Weighted Association Rule M...
Integrated Web Recommendation Model with Improved Weighted Association Rule M...ijdkp
 
Comparison of decision and random tree algorithms on
Comparison of decision and random tree algorithms onComparison of decision and random tree algorithms on
Comparison of decision and random tree algorithms oneSAT Publishing House
 
A Survey of Issues and Techniques of Web Usage Mining
A Survey of Issues and Techniques of Web Usage MiningA Survey of Issues and Techniques of Web Usage Mining
A Survey of Issues and Techniques of Web Usage MiningIRJET Journal
 
A Comparative Study of Recommendation System Using Web Usage Mining
A Comparative Study of Recommendation System Using Web Usage Mining A Comparative Study of Recommendation System Using Web Usage Mining
A Comparative Study of Recommendation System Using Web Usage Mining Editor IJMTER
 
Volume 2-issue-6-2056-2060
Volume 2-issue-6-2056-2060Volume 2-issue-6-2056-2060
Volume 2-issue-6-2056-2060Editor IJARCET
 
Volume 2-issue-6-2056-2060
Volume 2-issue-6-2056-2060Volume 2-issue-6-2056-2060
Volume 2-issue-6-2056-2060Editor IJARCET
 

Similar to A new approach for user identification in web usage mining preprocessing (20)

a novel technique to pre-process web log data using sql server management studio
a novel technique to pre-process web log data using sql server management studioa novel technique to pre-process web log data using sql server management studio
a novel technique to pre-process web log data using sql server management studio
 
A Novel Method for Data Cleaning and User- Session Identification for Web Mining
A Novel Method for Data Cleaning and User- Session Identification for Web MiningA Novel Method for Data Cleaning and User- Session Identification for Web Mining
A Novel Method for Data Cleaning and User- Session Identification for Web Mining
 
Ijarcet vol-2-issue-7-2341-2343
Ijarcet vol-2-issue-7-2341-2343Ijarcet vol-2-issue-7-2341-2343
Ijarcet vol-2-issue-7-2341-2343
 
Ijarcet vol-2-issue-7-2341-2343
Ijarcet vol-2-issue-7-2341-2343Ijarcet vol-2-issue-7-2341-2343
Ijarcet vol-2-issue-7-2341-2343
 
WEB LOG PREPROCESSING BASED ON PARTIAL ANCESTRAL GRAPH TECHNIQUE FOR SESSION ...
WEB LOG PREPROCESSING BASED ON PARTIAL ANCESTRAL GRAPH TECHNIQUE FOR SESSION ...WEB LOG PREPROCESSING BASED ON PARTIAL ANCESTRAL GRAPH TECHNIQUE FOR SESSION ...
WEB LOG PREPROCESSING BASED ON PARTIAL ANCESTRAL GRAPH TECHNIQUE FOR SESSION ...
 
Web personalization using clustering of web usage data
Web personalization using clustering of web usage dataWeb personalization using clustering of web usage data
Web personalization using clustering of web usage data
 
AN EXTENSIVE LITERATURE SURVEY ON COMPREHENSIVE RESEARCH ACTIVITIES OF WEB US...
AN EXTENSIVE LITERATURE SURVEY ON COMPREHENSIVE RESEARCH ACTIVITIES OF WEB US...AN EXTENSIVE LITERATURE SURVEY ON COMPREHENSIVE RESEARCH ACTIVITIES OF WEB US...
AN EXTENSIVE LITERATURE SURVEY ON COMPREHENSIVE RESEARCH ACTIVITIES OF WEB US...
 
Web Data mining-A Research area in Web usage mining
Web Data mining-A Research area in Web usage miningWeb Data mining-A Research area in Web usage mining
Web Data mining-A Research area in Web usage mining
 
Applying web mining application for user behavior understanding
Applying web mining application for user behavior understandingApplying web mining application for user behavior understanding
Applying web mining application for user behavior understanding
 
Implementation of Intelligent Web Server Monitoring
Implementation of Intelligent Web Server MonitoringImplementation of Intelligent Web Server Monitoring
Implementation of Intelligent Web Server Monitoring
 
C017231726
C017231726C017231726
C017231726
 
Integrated Web Recommendation Model with Improved Weighted Association Rule M...
Integrated Web Recommendation Model with Improved Weighted Association Rule M...Integrated Web Recommendation Model with Improved Weighted Association Rule M...
Integrated Web Recommendation Model with Improved Weighted Association Rule M...
 
L017418893
L017418893L017418893
L017418893
 
625 634
625 634625 634
625 634
 
50120140505007
5012014050500750120140505007
50120140505007
 
Comparison of decision and random tree algorithms on
Comparison of decision and random tree algorithms onComparison of decision and random tree algorithms on
Comparison of decision and random tree algorithms on
 
A Survey of Issues and Techniques of Web Usage Mining
A Survey of Issues and Techniques of Web Usage MiningA Survey of Issues and Techniques of Web Usage Mining
A Survey of Issues and Techniques of Web Usage Mining
 
A Comparative Study of Recommendation System Using Web Usage Mining
A Comparative Study of Recommendation System Using Web Usage Mining A Comparative Study of Recommendation System Using Web Usage Mining
A Comparative Study of Recommendation System Using Web Usage Mining
 
Volume 2-issue-6-2056-2060
Volume 2-issue-6-2056-2060Volume 2-issue-6-2056-2060
Volume 2-issue-6-2056-2060
 
Volume 2-issue-6-2056-2060
Volume 2-issue-6-2056-2060Volume 2-issue-6-2056-2060
Volume 2-issue-6-2056-2060
 

More from IOSR Journals (20)

A011140104
A011140104A011140104
A011140104
 
M0111397100
M0111397100M0111397100
M0111397100
 
L011138596
L011138596L011138596
L011138596
 
K011138084
K011138084K011138084
K011138084
 
J011137479
J011137479J011137479
J011137479
 
I011136673
I011136673I011136673
I011136673
 
G011134454
G011134454G011134454
G011134454
 
H011135565
H011135565H011135565
H011135565
 
F011134043
F011134043F011134043
F011134043
 
E011133639
E011133639E011133639
E011133639
 
D011132635
D011132635D011132635
D011132635
 
C011131925
C011131925C011131925
C011131925
 
B011130918
B011130918B011130918
B011130918
 
A011130108
A011130108A011130108
A011130108
 
I011125160
I011125160I011125160
I011125160
 
H011124050
H011124050H011124050
H011124050
 
G011123539
G011123539G011123539
G011123539
 
F011123134
F011123134F011123134
F011123134
 
E011122530
E011122530E011122530
E011122530
 
D011121524
D011121524D011121524
D011121524
 

Recently uploaded

main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidNikhilNagaraju
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleAlluxio, Inc.
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEroselinkalist12
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...asadnawaz62
 
8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitterShivangiSharma879191
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvLewisJB
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoĂŁo Esperancinha
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxKartikeyaDwivedi3
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHC Sai Kiran
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)dollysharma2066
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catcherssdickerson1
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxbritheesh05
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...srsj9000
 

Recently uploaded (20)

main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at Scale
 
Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...
 
8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter
 
young call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Serviceyoung call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Service
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvv
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptx
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECH
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptx
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 

A new approach for user identification in web usage mining preprocessing

  • 1. IOSR Journal of Computer Engineering (IOSR-JCE) e-ISSN: 2278-0661, p- ISSN: 2278-8727Volume 11, Issue 3 (May. - Jun. 2013), PP 57-61 www.iosrjournals.org www.iosrjournals.org 57 | Page “A new approach for user identification in web usage mining preprocessing” 1.Arvind kumar Dangi, 2.Asst. Prof. Sunita Sangwan 1.(Software Engineering ,PDM College of engineering ,India) akdangi@gmail.com 2.(Computer Science & Engineering ,PDM College of engineering ,India) sunita2009@gmail.com Abstract : Web usage mining is a subset of data mining. In order to huge amount of data but the data is less appropriates “quantity and quality” of the web data is opposite to each other this is the main problem. Web data usage Mining is a variation of this field is untapped source of richly offered free textual information. The importance of web data usage mining is mounting along with the immense volumes of data generated in web habitual existence data always arrives in a various, continuous, rapid and time varying flow. Web data usage mining taking out procedures are important in extracting useful streaming on-line sources. As throughout the globe no. of web users are continuously and rapidly growing, it is necessary for the web usage miners to utilize efficient tools in order to discover, extort, clean and assess the desired information. The data pre-processing stage is the most important phase in the preprocessing for investigation of the web user & his usage behavior. To fulfill this requirement the navigations are recorded in web log file as well as the IP address of the website, session of usage & visited web link. In order to improve the performance & quality of data preprocessing in order to identify unique users and user sessions. We propose a new method for web data preprocessing in which it has three phases. “In the first phase some websites are selected and by different locations access these website & by applying the (java) tools & methods then find out the IP address of that websites, session usage time & navigations, in the final phase combine them i.e.(web link navigation + IP address of website + session of usage ). This framework helps to investigate the web user usage behavior efficiently. Keywords - :-weblink, navigation, threading, networking, diminution, web logs, session, Data clearout, Data makeover. I. INTRODUCTION Data pre-processing is an important and required phase in Web usage mining. Different methods & techniques are employed in web usage mining to remove irrelevant items and identify IP address, sessions of usage and navigation along with the extraction information. There is no of steps define for preprocessing asData clear out also known as the routines work to “clean” the data by satisfying in missing values, smoothing noisy data, identifying or removing outliers, and resolving inconsistencies.If users believe the data is noisy, they are improbable to trust the results of any data mining that has been applied to it. Also, dirty data can cause uncertainty for the mining procedure, resulting in unreliable output. But, they are not always robust. a new method for web data preprocessing in which it has three phases. “In the first phase some websites are selected and by different locations access these website & by applying the (java) tools & methods then find out the IP address of that websites, session usage time & navigations, in the final phase combine them i.e.(web link navigation + IP address of website + session of usage ). This framework helps to investigate the web user usage behavior efficiently. 1. Introduction 2. Indentation and equations 2.1 Data clear out 2.2 IP address identification 2.3 Session of usage identification 2.4 Data integration 2.5 data makeover 2.6 Data diminution 2.6 Data usage mining 3. Figures & tables 3.1Framework for research Activities 3.2 Representing research work 3.3 Sequential steps which followed for the research work 4. Conclusion & future work References
  • 2. “A new approach for user identification in web usage mining preprocessing” www.iosrjournals.org 58 | Page II. INDENTATIONS AND EQUATIONS 2.1 Data clear out (cleaning) Data clear out also known as the routines work to “clean” the data by satisfying in missing values, smoothing noisy data, identifying or removing outliers, and resolving inconsistencies. 2.2 IP address identification & Web link visited identification Using the following function & code we Identify the IP address & web link visited by the end user using the java language. try { Statement st=con.createStatement(); int a=st.executeUpdate("INSERT INTO cdass.tracing (userid, `IP`, visited_page) n" +"VALUES ('"+userid+"', '"+ip+"', '"+page.toString()+"');n" +""); }catch(Exception e) { System.out.println("Error During tracing.............."+e); } finally { try { con.close(); } catch (SQLException ex) { Logger.getLogger(track.class.getName()).log(Level.SEVERE, null, ex); } } 2.3Session of usage identification Identify the session of usage by the user at particular web site or the web page. long d=session.getCreationTime(); long dd=session.getLastAccessedTime(); out.print(new Date(dd).getSeconds()-new Date(d).getSeconds()); session.removeAttribute("userid"); Session Tracing <% tracing.track.setTracking((String)session.getAttribute("userid"),request.getRemoteAddr(),request.getRequestUR L()); %> <% long d=session.getCreationTime(); long dd=session.getLastAccessedTime(); // out.print(new Date(d).getSeconds()); out.print(new Date(dd).getSeconds()-new Date(d).getSeconds()); // out.println((dd-d)/60); // System.out.println(session.getLastAccessedTime()); session.removeAttribute("userid"); // response.sendRedirect("login_form.jsp"); %> Packages used for achieving the above tasks import java.sql.Connection; The JDBC ( Java Database Connectivity) API defines interfaces and classes for writing database applications in Java by making database connections. Using JDBC you can send SQL, PL/SQL statements to almost any relational database. JDBC is a Java API for executing SQL statements and supports basic SQL functionality. It provides RDBMS access by allowing you to embed SQL inside Java code. Because Java can run on a thin client, applets embedded in Web pages can contain downloadable JDBC code to enable remote database access. import java.sql.Statement; To execute a SQL statement on your table, you set up a Statement object. So add this import line to the top of your code: import java.sql.Statement;
  • 3. “A new approach for user identification in web usage mining preprocessing” www.iosrjournals.org 59 | Page In the try part of the try … catch block add the following line (add it just below your Connection line): Statement stmt = con.createStatement( ); Here, we're creating a Statement object called stmt. The Statement object needs a Connection object, with the createStatment method. We also need a SQL Statement for the Statement object to execute. So add this line to your code: String SQL = "SELECT * FROM Workers"; The above statement selects all the records from the database table called Workers. We can pass this SQL query to a method of the Statement object called executeQuery. The Statement object will then go to work gathering all the records that match our query. import java.util.logging.Level; The java.util.logging package provides the logging capabilities via the Logger class. To create a logger in your Java coding you can use the following snippet. Import java.util.logging.logger Private final static logger logger = logger.getlogger(myclass.class.getname()) The Logger you create a actually a hierarchy of Loggers, and a . (dot) in the hierarchy indicates a level in the hierarchy. So if you get a Logger for the com.example key this Logger is a child of the com Logger and the com Logger is child of the Logger for the empty String. You can configure the main logger and this affects all its children. SIGN IN TIME SIGN OUT TIME BROWSER AGENT URL IP 2.4 Data integration Integration we mean that a dataset can be retrieved and added to other datasets to create greater, more robust and useful datasets.The discipline of data integration comprises the practices, architectural techniques and tools for achieving the consistent access and delivery of data across the spectrum of data subject areas and data structure types in the enterprise to meet the data consumption requirements of all applications and business processes. 2.5 Data makeover Data makeover process applies on the data which is collected from different sources of the data. In the make over process it organized the data minimizes redundancy and dependency of the data. 2.6 Data diminution These are the technique that can be applied to get a reduced representation of the data set that is much smaller in volume maintain the data integrity of the original data. 2.7 Usage mining Web Usage Mining process is a user session file that gives an exact account about the user accessed the Web site, what pages were visited and in what order, and how much time each page was visited. A user session is the set of the page accesses that occur during a single visit to a Web site. the information contained in a raw Web server log represent a user session file before data preprocessing III. FIGURES AND TABLES 3.1Framework of activities
  • 4. “A new approach for user identification in web usage mining preprocessing” www.iosrjournals.org 60 | Page 3.2Research work representation 3.3 Sequential steps which followed for the research work user DATA CLEAR OUT IP ADDRESS IDENTIFICATION SESSION OF USAGE IDENTIFICATION WEB LINK VISITED IDENTIFICATION DATA INTEGRATION DATA MAKEOVER DATA DIMINUTION USAGE MINING
  • 5. “A new approach for user identification in web usage mining preprocessing” www.iosrjournals.org 61 | Page IV. CONCLUSION There are several data preprocessing techniques. Data cleaning can be applied to remove noise and correct inconsistencies in data. Data integration merges data from multiple sources into a logical data store such as a data warehouse. Data diminution can reduce data size by, for instance, aggregating, eliminating redundant features, or clustering. Usage of data mining method and data extraction on the web is now highlighting of the boosting number of researchers. web usage mining is a type of data mining technique that can be useful in suggesting the web usage type with the identification of user session, this research also help in identification of the end user behavior with the help of the identification of the user (IP address + web link navigation + session usage) and combine them together for identification user behavior more accurately for preprocessing data. This research mainly concentrates on providing another approach for preprocessing the data more accurately. Experimental results suggest the significance of the proposed approach. REFERENCES [1] Dr. M. Giri and Dr. Akash Kumar, “An Efficient Web Content Mining using elevance Analysis Approach”, International Journal of Multidisciplinary Research in Advanced Engineering, pages. 201-210, 2012. [2] Dr. M. Giri and Dr. Akash Kumar, “An Efficient Web Content Mining using Divide and Conquer Approach”, International journal of Computational Intelligence Research, pages. 201-210, 2012. [3] Dr. M. Giri and Dr. Akash Kumar, “An Efficient Web Content Mining using Multi Threading Approach”, International Journal of Systems, Algorithms and Applications, pages. 1-4, 2012 [4] Yan LI, Boqin FENG and Qinjiao MAO, “Research on Path Completion Technique In Web Usage Mining”, IEEE International Symposium On Computer Science and Computational Technology, pp. 554-559, 2008. [5] JING Chang-bin and Chen Li, “ Web Log Data Preprocessing Based On Collaborative Filtering ”, IEEE 2nd International Workshop On Education Technology and Computer Science, pp.118-121, 2010. [6] Huiping Peng, “Discovery of Interesting Association Rules Based On Web Usage Mining”, IEEE Coference, pp.272-275, 2010. [7] Tasawar Hussain, Dr. Sohail Asghar and Nayyer Masood, “Hierarchical Sessionization at Preprocessing Level of WUM Based on Swarm Intelligence ”, 6th International Conference on Emerging Technologies (ICET) IEEE, pp. 21- 26, 2010.