The Internet is a boon of the modern era, as every organization uses it for information
dissemination and e-commerce applications. People in an organization sometimes experience delays
while accessing the Internet in spite of adequate bandwidth. A prediction model for web caching
and prefetching is an ideal solution to this delay problem. The prediction model analyses the
browsing history of Internet users from raw server log files, determines the future sequence of
web objects, and places those objects nearer to the user, so that access latency is reduced and
the delay problem is alleviated. To determine the sequence of future web objects, it is necessary
to measure the proximity of one web object to another by identifying a suitable distance metric
technique for web caching and prefetching. This paper studies different distance metric techniques
and concludes that bioinformatics-based distance metrics are ideal in the context of web caching
and web prefetching.
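As a minimal sketch of what a bioinformatics-style distance between web objects can look like (the page names and sequences below are invented, not from the paper): the Levenshtein edit distance treats two users' page-access sequences like biological sequences, so similar browsing sessions get a small distance.

```python
# Illustrative only: Levenshtein edit distance between two access sequences.
def edit_distance(a, b):
    # prev[j] holds the distance between a[:i-1] and b[:j]
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]

s1 = ["index", "products", "cart", "checkout"]
s2 = ["index", "products", "search", "cart"]
print(edit_distance(s1, s2))  # → 2: the sessions are close
```

A prefetcher could use such distances to find past sessions similar to the current one and fetch their next objects in advance.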
Privacy Preserving Reputation Calculation in P2P Systems with Homomorphic Enc... (IJCNCJournal)
In this paper, we consider the problem of calculating node reputation in a Peer-to-Peer (P2P) system from fragments of partial knowledge about the trustfulness of nodes, which are subjectively given by each node (i.e., evaluator) participating in the system. We are particularly interested in the distributed calculation of reputation scores while preserving the privacy of evaluators. The basic idea of the proposed method is to extend the EigenTrust reputation management system with a homomorphic cryptosystem. More specifically, it calculates the main eigenvector of a linear system that models the trustfulness of the users (nodes) in the P2P system in a distributed manner, in such a way that: 1) it blocks access to the trust values by the nodes that hold the secret key used for decryption, 2) it improves the efficiency of the calculation by offloading part of the task to the participating nodes, and 3) it uses different public keys during the calculation to improve robustness against nodes leaving. The performance of the proposed method is evaluated through numerical calculations.
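The EigenTrust core that the paper extends is a principal-eigenvector computation. A plaintext sketch (no encryption, invented trust values) is power iteration on a column-stochastic local-trust matrix C, where column j holds peer j's normalised trust in the others:

```python
# Plaintext EigenTrust-style sketch: iterate t <- C t to the main eigenvector.
def power_iteration(C, iters=100):
    n = len(C)
    t = [1.0 / n] * n
    for _ in range(iters):
        t = [sum(C[i][j] * t[j] for j in range(n)) for i in range(n)]
        s = sum(t)
        t = [x / s for x in t]  # renormalise each step
    return t

# 3 peers; each column sums to 1 (made-up values)
C = [[0.0, 0.5, 0.4],
     [0.7, 0.0, 0.6],
     [0.3, 0.5, 0.0]]
t = power_iteration(C)
print(t)  # peer 1 ends up with the highest global reputation here
```

The proposed method performs essentially this iteration, but over encrypted trust values so that evaluators' opinions stay private.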
SECURING BGP BY HANDLING DYNAMIC NETWORK BEHAVIOR AND UNBALANCED DATASETS (IJCNCJournal)
The Border Gateway Protocol (BGP) provides crucial routing information for the Internet infrastructure. Abnormal routing behavior affects the stability and connectivity of the global Internet. The biggest hurdles in detecting BGP attacks are the extremely unbalanced class distribution of the datasets and the dynamic nature of the network, both of which degrade classifier performance. In this paper, we propose an efficient approach to managing these problems. The proposed approach tackles the unbalanced classification of datasets by turning the binary classification problem into a multiclass classification problem. This is achieved by splitting the majority-class samples evenly into multiple segments using Affinity Propagation, where the number of segments is chosen so that the number of samples in any segment closely matches that of the minority class. These segments of the dataset, together with the minority class, are then viewed as different classes and used to train an Extreme Learning Machine (ELM). The RIPE and BCNET datasets are used to evaluate the performance of the proposed technique. When no feature selection is used, the proposed technique improves the F1 score by 1.9% compared to state-of-the-art techniques. With the Fisher feature selection algorithm, the proposed algorithm achieved the highest F1 score of 76.3%, a 1.7% improvement over the compared techniques. Additionally, the MIQ feature selection technique improves the accuracy by 3.5%. For the BCNET dataset, the proposed technique improves the F1 score by 1.8% with the Fisher feature selection technique. The experimental findings confirm the substantial performance improvement of the new technique over previous approaches.
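The rebalancing idea can be sketched in a few lines. Note this toy replaces Affinity Propagation with simple even chunking purely for illustration (the paper uses AP to form the segments); sample counts are invented:

```python
# Toy rebalancing sketch: split the majority class into k roughly
# minority-sized segments and treat each segment as its own class.
def rebalance(majority, minority):
    k = max(1, round(len(majority) / len(minority)))  # segments ~ minority size
    size = -(-len(majority) // k)                     # ceiling division
    segments = [majority[i:i + size] for i in range(0, len(majority), size)]
    X, y = [], []
    for label, seg in enumerate(segments):            # labels 0..k-1
        X += seg; y += [label] * len(seg)
    X += minority; y += [len(segments)] * len(minority)  # minority = last class
    return X, y

X, y = rebalance(majority=list(range(100)), minority=list(range(100, 120)))
# 100 majority samples -> 5 segments of 20 each, matching 20 minority samples
```

The multiclass labels y would then be fed to the ELM; at prediction time, any of the former-majority labels maps back to "normal" and the last label to "anomalous".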
UTILIZING XAI TECHNIQUE TO IMPROVE AUTOENCODER BASED MODEL FOR COMPUTER NETWO... (IJCNCJournal)
Machine learning (ML) and Deep Learning (DL) methods are being adopted rapidly, especially in computer network security, for tasks such as fraud detection, network anomaly detection, and intrusion detection. However, the lack of transparency of ML and DL based models is a major obstacle to their deployment, and they are criticized for their black-box nature even when they produce excellent results. Explainable Artificial Intelligence (XAI) is a promising area that can improve the trustworthiness of these models by explaining and interpreting their output. If the internal workings of ML and DL based models are understandable, this can further help to improve their performance. The objective of this paper is to show how XAI can be used to interpret the results of a DL model, an autoencoder in this case, and, based on the interpretation, to improve its performance for computer network anomaly detection. The kernel SHAP method, which is based on Shapley values, is used as a novel feature selection technique to identify only those features that actually cause the anomalous behaviour of a set of attack/anomaly instances. These feature sets are then used to train and validate the autoencoder, but on benign data only. The resulting SHAP_Model outperformed the other two models proposed on the basis of the feature selection method. The whole experiment is conducted on a subset of the latest CICIDS2017 network dataset. The overall accuracy and AUC of SHAP_Model are 94% and 0.969, respectively.
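Kernel SHAP approximates Shapley values for models with many features. For intuition, here is an exact Shapley computation for a tiny 3-feature toy model (the model and numbers are invented; real use would go through a SHAP library explainer on the autoencoder):

```python
# Exact Shapley attribution by subset enumeration (feasible only for tiny n).
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    n = len(x)
    phi = [0.0] * n
    def eval_subset(S):
        # features outside S are replaced by their baseline value
        z = [x[i] if i in S else baseline[i] for i in range(n)]
        return f(z)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(len(others) + 1):
            for S in combinations(others, r):
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[i] += w * (eval_subset(set(S) | {i}) - eval_subset(set(S)))
    return phi

# For an additive model, each attribution recovers that feature's term exactly
f = lambda z: z[0] + 2 * z[1] + 3 * z[2]
print(shapley_values(f, [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]))  # → [1.0, 2.0, 3.0]
```

Features with consistently large attributions on attack instances are the ones the paper keeps for retraining the autoencoder.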
FUZZY LOGIC-BASED EFFICIENT MESSAGE ROUTE SELECTION METHOD TO PROLONG THE NET... (IJCNCJournal)
Recently, sensor networks have been used in a wide range of applications, and interest in sensor
node performance has increased. A sensor network is composed of tiny nodes with limited resources,
which communicate in a configured network through self-organization. Energy-efficient hierarchical
security protocols with various advantages have been proposed to prolong the network lifetime of
sensor networks. However, due to structural problems in traditional protocols, nodes located
upstream tend to consume relatively more energy than other nodes. A network protocol should
provide minimal security and an efficient allocation of energy consumption across nodes to
increase the network lifetime. In this paper, we introduce a solution to the bottleneck problem
through an efficient message route selection method. The proposed method selects an efficient
messaging path using a genetic algorithm (GA) and fuzzy logic composed of multiple rules. Message
route selection plays an important role in controlling the load balancing of nodes. A principal
benefit of the proposed scheme is its potential portability to clustering-based protocols. In
addition, the proposed method is updated through the genetic algorithm to find the optimal path in
response to various environments. We demonstrated the effectiveness of the proposed method through
an experiment in which it is applied to a probabilistic voting-based filtering scheme, one of the
cluster-based security schemes.
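The fuzzy part of such a route selector can be sketched with a toy rule base (the membership functions, inputs, and route values below are our own invention, not the paper's rules): score each candidate route by fuzzy memberships for residual energy and hop count, combine with a fuzzy AND, and pick the best.

```python
# Toy fuzzy route scoring: higher residual energy and fewer hops score better.
def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def route_score(energy, hops):
    high_energy = tri(energy, 0.3, 1.0, 1.7)  # peak at full residual energy
    short_path = tri(hops, -6.0, 0.0, 6.0)    # peak at zero hops
    return min(high_energy, short_path)        # fuzzy AND via min

routes = {"A": (0.9, 3), "B": (0.5, 2), "C": (0.8, 5)}  # (energy, hops)
best = max(routes, key=lambda r: route_score(*routes[r]))
print(best)  # → A: good energy and a reasonably short path
```

In the paper's scheme the GA then evolves such rule parameters to suit the deployment environment.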
A COOPERATIVE LOCALIZATION METHOD BASED ON V2I COMMUNICATION AND DISTANCE INF... (IJCNCJournal)
Relative positioning is a recent solution for overcoming the limited accuracy of GPS in urban
environments. Vehicle positions obtained using V2I communication are more accurate because the
known roadside unit (RSU) locations help predict measurement errors over time. The accuracy of
vehicle positions depends strongly on the number of RSUs; however, the high installation cost
limits the use of this approach. It also depends on the nonlinear nature of localization, which
several previous studies neglected; in those studies, the accumulated errors increased with time
due to the linear treatment of the localization problem. In the present study, a cooperative
localization method based on V2I communication and distance information in vehicular networks is
proposed for improving the estimates of vehicles' initial positions. The method assumes that
virtual RSUs based on mobility measurements help reduce installation costs and facilitate handling
of fault environments. The extended Kalman filter algorithm is a well-known estimator for
nonlinear problems, but it requires a good initial vehicle position vector and adaptive
measurement noise. Using the proposed method, vehicles' initial positions can be estimated
accurately. The experimental results confirm that the proposed method has superior accuracy to
existing methods, giving a root mean square error of approximately 1 m. In addition, it is shown
that virtual RSUs can assist in estimating initial positions in fault environments.
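The nonlinearity comes from the range measurement h(x) = distance to a known RSU. A hedged sketch of one EKF measurement update for this model (RSU positions, covariances, and noise values are invented for the example; the paper's filter is more elaborate):

```python
# EKF range-measurement update for a 2-D position state x = [px, py].
import math

def ekf_range_update(x, P, rsu, z, R=0.01):
    dx, dy = x[0] - rsu[0], x[1] - rsu[1]
    d = math.hypot(dx, dy)                 # predicted range h(x)
    H = [dx / d, dy / d]                   # Jacobian of h at x
    PHt = [P[0][0] * H[0] + P[0][1] * H[1],
           P[1][0] * H[0] + P[1][1] * H[1]]
    S = H[0] * PHt[0] + H[1] * PHt[1] + R  # innovation variance
    K = [PHt[0] / S, PHt[1] / S]           # Kalman gain
    r = z - d                              # innovation
    x = [x[0] + K[0] * r, x[1] + K[1] * r]
    P = [[P[0][0] - K[0] * PHt[0], P[0][1] - K[0] * PHt[1]],
         [P[1][0] - K[1] * PHt[0], P[1][1] - K[1] * PHt[1]]]
    return x, P

true_pos = (4.0, 3.0)
rsus = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
x, P = [1.0, 1.0], [[10.0, 0.0], [0.0, 10.0]]  # rough initial guess
for _ in range(10):                             # sweep noise-free ranges
    for rsu in rsus:
        z = math.hypot(true_pos[0] - rsu[0], true_pos[1] - rsu[1])
        x, P = ekf_range_update(x, P, rsu, z)
print(x)  # converges close to (4, 3)
```

With a poor initial guess the linearisation point is wrong, which is exactly why the paper focuses on estimating good initial positions before running the filter.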
MEKDA: Multi-Level ECC based Key Distribution and Authentication in Internet ... (IJCNCJournal)
The Internet of Things (IoT) is an extensive, rapidly growing system of networks and connected devices with minimal human interaction. The constraints of the system and the limitations of the devices pose several challenges, including security; billions of devices must therefore be protected from attacks and compromises. The resource-constrained nature of IoT devices amplifies security challenges, so standard data communication and security measures are inefficient in the IoT environment. The ubiquity of IoT devices and their deployment in sensitive applications mean that security breaches can put lives at risk. Hence, IoT-related security challenges are of great concern, and authentication is the remedy for the vulnerability posed by a malicious device in the IoT environment. The proposed Multi-level Elliptic Curve Cryptography based Key Distribution and Authentication in IoT enhances security through multi-level authentication when devices enter or exit a cluster in an IoT system. The decreased computation time and energy consumption achieved by generating and distributing keys using elliptic curve cryptography extend the availability of the IoT devices. The performance analysis shows an improvement over the Fast Authentication and Data Transfer method.
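ECC is attractive for constrained devices because short keys give strong security. As a sketch of the underlying key-agreement arithmetic, here is ECDH on the classic textbook curve y² = x³ + 2x + 2 (mod 17); real deployments use standardised curves such as secp256r1, and this tiny curve is for illustration only:

```python
# Toy ECDH on a textbook curve (illustration only; never use in practice).
P, A, B = 17, 2, 2
G = (5, 1)  # base point on the curve

def inv(v):
    return pow(v, P - 2, P)  # modular inverse, valid since P is prime

def add(p1, p2):
    if p1 is None: return p2
    if p2 is None: return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None                               # point at infinity
    if p1 == p2:
        m = (3 * x1 * x1 + A) * inv(2 * y1) % P   # tangent slope
    else:
        m = (y2 - y1) * inv(x2 - x1) % P          # chord slope
    x3 = (m * m - x1 - x2) % P
    return (x3, (m * (x1 - x3) - y1) % P)

def mul(k, pt):
    out = None                                    # double-and-add
    while k:
        if k & 1:
            out = add(out, pt)
        pt = add(pt, pt)
        k >>= 1
    return out

alice_priv, bob_priv = 5, 7
alice_pub, bob_pub = mul(alice_priv, G), mul(bob_priv, G)
shared_a = mul(alice_priv, bob_pub)  # Alice's view of the shared point
shared_b = mul(bob_priv, alice_pub)  # Bob's view
print(shared_a == shared_b)  # → True: both derive the same shared secret
```

Cluster heads could derive pairwise keys this way, which is the computational step whose cost the paper measures.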
Chaos Image Encryption Methods: A Survey Study (journal BEEI)
With increasing dependence on communications over the Internet and networks, secure data transmission is coming under threat. One of the best solutions for ensuring secure data transmission is encryption. Multiple forms of data, such as text, audio, image, and video, can be transmitted digitally, with images nowadays being the most popular; older encryption techniques such as AES, DES, and RSA show a low security level when used for image encryption. This problem is addressed by chaos-based encryption, an acceptable form of encryption for image data. The sensitivity to initial conditions and control parameters makes chaos encryption suitable for image applications. This study discusses various chaos encryption techniques.
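A minimal sketch of the chaos idea (parameters and the tiny "image" are invented; surveyed schemes add permutation stages and stronger maps): iterate the logistic map x ← r·x·(1−x), quantise each iterate to a keystream byte, and XOR it with the pixel bytes. The sensitivity to x0 and r acts as the key.

```python
# Toy logistic-map stream cipher for pixel bytes (illustration only).
def logistic_keystream(x0, r, n):
    x, out = x0, []
    for _ in range(n):
        x = r * x * (1 - x)             # chaotic iteration
        out.append(int(x * 256) % 256)  # quantise to one byte
    return out

def xor_cipher(pixels, x0=0.61, r=3.99):
    ks = logistic_keystream(x0, r, len(pixels))
    return [p ^ k for p, k in zip(pixels, ks)]

image = [12, 200, 37, 37, 37, 254]  # a tiny "image" as raw bytes
cipher = xor_cipher(image)
plain = xor_cipher(cipher)          # same (x0, r) regenerates the keystream
print(plain == image)  # → True
```

Because XOR is an involution, decryption is just re-encryption with the same chaotic parameters.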
Probability Density Functions of the Packet Length for Computer Networks With... (IJCNCJournal)
Research on Internet traffic classification and identification, with applications in the
prevention of attacks and intrusions, has increased considerably in recent years. Strategies based
on statistical characteristics of Internet traffic, which use parameters such as packet length
(size) and inter-arrival time and their probability density functions, are popular. This paper
presents a new statistical model for packet length, which shows that it can be modeled using a
probability density function involving a normal or a beta distribution, according to the traffic
generated by the users. The proposed functions have parameters that depend on the type of traffic
and can be used as part of an Internet traffic classification and identification strategy. The
models can be used to compare, simulate, and estimate computer network traffic, as well as to
generate synthetic traffic and estimate the packet processing capacity of Internet routers.
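For the normal case, fitting such a model amounts to estimating the mean and standard deviation of observed packet lengths and evaluating the Gaussian density (the sample lengths below are invented; the paper also uses a beta distribution for other traffic types):

```python
# Fit a normal PDF to observed packet lengths (bytes) and evaluate it.
import math
import statistics

def normal_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

lengths = [1500, 1480, 1420, 1500, 1350, 1460, 1500, 1390]  # sampled packets
mu, sigma = statistics.mean(lengths), statistics.stdev(lengths)
print(mu, sigma, normal_pdf(mu, mu, sigma))  # density peaks at the mean
```

A classifier can then compare a flow's empirical packet-length distribution against per-application fitted densities.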
IDENTIFICATION AND INVESTIGATION OF THE USER SESSION FOR LAN CONNECTIVITY VIA... (ijcseit)
This paper presents some technical discussion on the identification and analysis of LAN
user-sessions. The identification of a user-session is non-trivial. Classical approaches rely on
threshold-based mechanisms, which are very sensitive to the value chosen for the threshold, and
this value may be difficult to set correctly. Clustering techniques are used to define a novel
methodology for identifying LAN user-sessions without requiring an a priori definition of
threshold values. We define the clustering-based approach in detail, discuss its advantages and
disadvantages, and apply it to real traffic traces. The proposed methodology is applied to
artificially generated traces to evaluate its benefits over traditional threshold-based
approaches. We also analyze the characteristics of user-sessions extracted by the clustering
methodology from real traces and study their statistical properties.
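The contrast with fixed thresholds can be sketched as follows (timestamps are invented, and this tiny 1-D 2-means stands in for the paper's clustering method): instead of hand-picking an idle threshold, cluster the inter-arrival gaps into "within-session" and "between-session" groups and cut sessions at the data-driven boundary.

```python
# Derive the session cut from the data instead of a hand-picked threshold.
def split_sessions(timestamps, iters=20):
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    c_small, c_big = min(gaps), max(gaps)      # initial 1-D centroids
    for _ in range(iters):                     # tiny 2-means on the gaps
        small = [g for g in gaps if abs(g - c_small) <= abs(g - c_big)]
        big = [g for g in gaps if abs(g - c_small) > abs(g - c_big)]
        c_small = sum(small) / len(small)
        c_big = sum(big) / len(big)
    cut = (c_small + c_big) / 2                # derived, not pre-set
    sessions, cur = [], [timestamps[0]]
    for prev, t in zip(timestamps, timestamps[1:]):
        if t - prev > cut:
            sessions.append(cur)
            cur = []
        cur.append(t)
    sessions.append(cur)
    return sessions

ts = [0, 1, 3, 4, 120, 121, 123, 300, 302]  # packet times in seconds
print(len(split_sessions(ts)))  # → 3 sessions
```

Any fixed threshold between 2 and 116 seconds would give the same answer here, but the clustered cut adapts when traffic characteristics change.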
New kinds of intrusions cause deviations in the normal behaviour of traffic flows in
computer networks every day. This study focused on enhancing the learning capabilities of an IDS
to detect anomalies in network traffic flows by comparing the k-means approach of data mining for
intrusion detection with an outlier detection approach. The k-means approach uses clustering to
group the traffic flow data into normal and abnormal clusters. Outlier detection calculates an
outlier score (the neighbourhood outlier factor, NOF) for each flow record, whose value decides
whether a traffic flow is normal or abnormal. The two methods were then compared in terms of
various performance metrics and the amount of computer resources they consumed. Overall, k-means
was more accurate and precise and had a better classification rate than outlier detection for
intrusion detection using traffic flows. This will help systems administrators in their choice
of IDS.
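The outlier-score idea can be illustrated with a simplified neighbourhood-based score (the paper's NOF is more elaborate, and the flow features and k below are invented): a flow's score is its mean distance to its k nearest neighbours, so isolated flows score high.

```python
# Toy neighbourhood outlier score: mean distance to the k nearest neighbours.
def outlier_score(point, data, k=3):
    dists = sorted(sum((a - b) ** 2 for a, b in zip(point, q)) ** 0.5
                   for q in data if q != point)
    return sum(dists[:k]) / k

# 2-D flow features; the last flow sits far from the normal cluster
flows = [(1, 1), (1, 2), (2, 1), (2, 2), (1.5, 1.5), (9, 9)]
scores = {f: outlier_score(f, flows) for f in flows}
print(max(scores, key=scores.get))  # → (9, 9), flagged as anomalous
```

A threshold on this score then separates normal from abnormal flows, mirroring the decision rule described above.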
Map as a Service: A Framework for Visualising and Maximising Information Retu... (M H)
This paper presents a distributed information extraction and visualisation service, called the mapping service, for maximising information return from large-scale wireless sensor networks. Such a service would greatly simplify the production of higher-level, information-rich representations suitable for informing other network services and the delivery of field information visualisations. The mapping service utilises a blend of inductive and deductive models to map sense data accurately using externally available knowledge. It utilises the special characteristics of the application domain to render visualisations in a map format that are a precise reflection of the concrete reality. This service is suitable for visualising an arbitrary number of sense modalities. It is capable of visualising multiple independent types of sense data, overcoming the limitations of generating visualisations from a single sense modality. Furthermore, the mapping service responds dynamically to environmental changes that may affect visualisation performance, by continuously updating the application domain model in a distributed manner. Finally, a distributed self-adaptation function is proposed with the goal of saving more power and generating more accurate data visualisations. We conduct comprehensive experimentation to evaluate the performance of our mapping service and show that it achieves low communication overhead, produces maps of high fidelity, and further minimises the mapping predictive error dynamically by integrating the application domain model into the mapping service.
SURVEY PAPER ON PRIVACY IN LOCATION BASED SEARCH QUERIES (ijiert bestjournal)
Due to the tremendous growth in mobile phones, the market for Location Based Services is growing fast. Many mobile phone applications use location based services, such as nearest store finders and car navigation systems. Location Based Services provide services to mobile device users based on location information as well as the data profile of the users. Using these services, mobile users retrieve information about the nearest POI. This exposes the location and data profile of the user to misuse. Many solutions have been offered to protect users' private information, but most of them address only snapshot queries, with no support for continuous queries and MQMO. Some papers address MQMO but fail to provide privacy. This paper focuses on MQMO and also protects users' private information using PIR (private information retrieval).
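The PIR building block can be sketched with the classic two-server XOR scheme (the database contents are invented, and real location-privacy systems use more efficient PIR variants): the client sends each server a random-looking selection mask, the two masks differ only at the wanted index, and XORing the two answers reveals exactly that record while each server alone learns nothing about the query.

```python
# Two-server XOR-based PIR sketch (illustration only).
import secrets

def pir_query(db_len, index):
    mask1 = [secrets.randbelow(2) for _ in range(db_len)]
    mask2 = list(mask1)
    mask2[index] ^= 1  # masks differ only at the wanted index
    return mask1, mask2

def pir_answer(db, mask):
    out = 0            # each server XORs its selected records
    for rec, bit in zip(db, mask):
        if bit:
            out ^= rec
    return out

db = [0x11, 0x22, 0x33, 0x44]  # e.g. encoded POI records
m1, m2 = pir_query(len(db), index=2)
result = pir_answer(db, m1) ^ pir_answer(db, m2)
print(hex(result))  # → 0x33: the wanted record, hidden from both servers
```

Every record other than index 2 is selected by both masks or by neither, so it cancels out in the final XOR.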
Online stream mining approach for clustering network traffic (eSAT Journals)
Abstract: A large number of research works have been proposed on intrusion detection systems, leading to the implementation of agent-based intelligent IDS (IIDS), non-intelligent IDS (NIDS), signature-based IDS, etc. While building such IDS models, learning algorithms over flows of network traffic play a crucial role in the accuracy of IDS systems. The proposed work focuses on implementing a novel method to cluster network traffic that eliminates the limitations of existing online clustering algorithms, and proves its robustness and accuracy over large streams of network traffic arriving at extremely high rates. We compare the existing algorithm with the novel methods to analyse accuracy and complexity. Keywords: NIDS, Data Stream Mining, Online Clustering, RAH algorithm, Online Efficient Incremental Clustering algorithm.
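The single-pass constraint of stream clustering can be sketched as follows (the distance cutoff and the flow values are invented; the paper's RAH and incremental methods are more sophisticated): each arriving flow either joins its nearest centroid, updating it incrementally, or starts a new cluster if it is too far away.

```python
# Online single-pass clustering sketch with incremental centroid updates.
def online_cluster(stream, cutoff=2.0):
    centroids, counts = [], []
    for x in stream:
        if centroids:
            i = min(range(len(centroids)), key=lambda j: abs(x - centroids[j]))
            if abs(x - centroids[i]) <= cutoff:
                counts[i] += 1
                centroids[i] += (x - centroids[i]) / counts[i]  # running mean
                continue
        centroids.append(float(x))  # open a new cluster for far-away flows
        counts.append(1)
    return centroids

print(online_cluster([1.0, 1.2, 0.8, 10.0, 10.4, 9.6]))  # two clusters emerge
```

Nothing is stored per point, only centroids and counts, which is what makes such methods viable at high arrival rates.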
MINIMIZING LOCALIZATION ERROR AND ENSURE SECURITY OF DVHOP APPROACH (ijceronline)
In a wireless sensor network there exists the problem of determining which nodes are symmetrical to each other. The nodes which are symmetrical and at a lesser distance are selected for data transfer. This identification of the distance between nodes is known as localization. The proposed paper works on DV-Hop. DV-Hop is a distance-vector-routing-based protocol which is used to indicate whether there exists a path from source to destination or not. A malicious node may also be present, which can take over an actual node and cause problems in the transfer process. The most common attack resulting from this is the DDoS attack, which duplicates information and causes traffic jamming. DV-Hop with random keys is used to handle the DDoS attack.
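The distance-estimation step of DV-Hop can be sketched as follows (anchor positions and hop counts are invented): each anchor derives an average hop size from its known distances and hop counts to the other anchors, and an unknown node estimates its distance to an anchor as hops × hop size.

```python
# DV-Hop distance estimation sketch (illustrative values only).
import math

anchors = {"A1": (0.0, 0.0), "A2": (30.0, 0.0), "A3": (0.0, 40.0)}
hops_between = {("A1", "A2"): 3, ("A1", "A3"): 4, ("A2", "A3"): 5}

def hop_size(anchor):
    dists, hops = 0.0, 0
    for (u, v), h in hops_between.items():
        if anchor in (u, v):
            dists += math.dist(anchors[u], anchors[v])  # known anchor distance
            hops += h
    return dists / hops  # average metres per hop seen by this anchor

node_hops_to_A1 = 2
print(node_hops_to_A1 * hop_size("A1"))  # → 20.0 estimated metres to A1
```

With such estimated distances to three or more anchors, the node's position follows by multilateration; a forged hop count from a malicious node corrupts exactly this step, which motivates securing the protocol.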
Handwriting identification using deep convolutional neural network method (TELKOMNIKA JOURNAL)
Handwriting is a unique thing that is produced differently by each person. Handwriting has characteristics that remain the same for a single writer, so handwriting can be used as a variable in biometric systems. Each person has a different handwriting style, with only a small possibility that the same characters have something in common. We propose a handwriting identification method using sentence-segmented handwriting forms. The sentence form is used to capture more complete handwriting characteristics than single characters or words. The dataset used is divided into three categories of images, binary, grayscale, and inverted binary. All datasets have the same images, differing only in color, and consist of 100 classes. The transfer learning used in this paper is the pre-trained VGG19 model. Training was conducted over 100 epochs. The best result is on grayscale images, with a genuine acceptance rate of 92.3% and an equal error rate of 7.7%.
An Overview and Classification of Approaches to Information Extraction in Wir... (M H)
Recent advances in wireless communication have made it possible to develop low-cost, low-power Wireless Sensor Networks (WSN). WSNs can be used in several application areas (e.g., habitat monitoring, forest fire detection, and health care). WSN Information Extraction (IE) techniques can be classified into four categories depending on the factors that drive data acquisition: event-driven, time-driven, query-based, and hybrid. This paper presents a survey of the state-of-the-art IE techniques in WSNs. The benefits and shortcomings of different IE approaches are presented as motivation for future work on automatic hybridisation and adaptation of IE mechanisms.
A new approach to broadcast in wormhole-routed three-dimensional networks is proposed. Broadcast is one of the most important operations in communication and parallel computing: a message is sent from one source to all destinations in the network, corresponding to one-to-all communication. Wormhole routing is a fundamental routing mechanism in modern parallel computers, characterized by low communication latency; we show how to apply this approach to 3-D meshes. Wormhole routing divides each packet into a set of flits (flow control digits): the first flit of the packet (the header flit) contains the destination address, and all subsequent flits follow the route taken by the header flit. We consider efficient broadcasting on an all-port wormhole-routed 3D mesh of arbitrary size and introduce an efficient algorithm, Y-Hamiltonian Layers Broadcast (Y-HLB). Compared with previous results, our paradigm reduces broadcast latency and is simpler; our simulation results show the average improvement of the proposed algorithm over the other algorithms presented.
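The flit mechanism described in the abstract can be sketched as follows. The tuple encoding and field names here are illustrative assumptions, not the paper's actual wire format; the point is only that a header flit carries the destination and body flits follow its path.

```python
def to_flits(dest, payload, flit_size):
    """Split a packet into flits for wormhole routing: a header flit
    carrying the destination address, followed by body flits that routers
    forward along the path established by the header."""
    header = ("H", dest)  # header flit: tagged with the destination
    body = [("B", payload[i:i + flit_size])          # body flits: fixed-size
            for i in range(0, len(payload), flit_size)]  # payload slices
    return [header] + body
```

Splitting a 10-character payload into 4-character flits yields a header plus three body flits whose concatenation restores the payload.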
Adaptive Routing in Wireless Sensor Networks: QoS Optimisation for Enhanced A...M H
One of the key challenges for research in wireless sensor networks is the development of routing protocols that provide application-specific service guarantees. This paper presents a new cluster-based Route Optimisation and Load-balancing protocol, called ROL, that uses various quality of service (QoS) metrics to meet application requirements. ROL combines several application requirements; specifically, it attempts to provide an inclusive solution to prolong network life, provide timely message delivery and improve network robustness. It uses a combination of routing metrics that can be configured according to the priorities of user-level applications to improve overall network performance. To this end, an optimisation tool for balancing the communication resources against the constraints and priorities of user applications has been developed, and Nutrient-flow-based Distributed Clustering (NDC), an algorithm for load balancing, is proposed. NDC works seamlessly with any clustering algorithm to equalise, as far as possible, the diameter and the membership of clusters. This paper presents simulation results to show that ROL/NDC gives a higher network lifetime than other similar schemes, such as Mires++. In simulation, ROL/NDC maintains a maximum of 7% variation from the optimal cluster population, reduces the total number of set-up messages by up to 60%, reduces the end-to-end delay by up to 56%, and enhances the data delivery ratio by up to 0.98% compared to Mires++.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
AN EFFECTIVE SEMANTIC ENCRYPTED RELATIONAL DATA USING K-NN MODELijsptm
Data exchange and data publishing are becoming an important part of business and academic practice. Data owners need to maintain the rights over the datasets they share. A rights-protection mechanism can establish ownership of shared data without compromising its usefulness for a wide range of machine learning and mining tasks. The approach provides two algorithms: one preserves Nearest-Neighbor (NN) relations and the other preserves the Minimum Spanning Tree (MST). The k-NN protocol guarantees that relations between objects remain unaltered, and both algorithms provide rights protection as well as utility preservation. The rights-protection scheme is based on watermarking; the watermarking methodology preserves the distance relationships.
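The MST-preservation property can be checked directly: build the MST of the original and of the watermarked dataset and compare edge sets. This is a sketch under the assumption of small 2-D point sets and Prim's algorithm; the abstract does not specify the paper's actual verification procedure.

```python
import math

def mst_edges(points):
    """Prim's algorithm; returns the MST edge set over point indices."""
    n = len(points)
    dist = lambda i, j: math.dist(points[i], points[j])
    in_tree, edges = {0}, set()
    while len(in_tree) < n:
        # Cheapest edge crossing from the tree to a node outside it.
        i, j = min(((i, j) for i in in_tree
                    for j in range(n) if j not in in_tree),
                   key=lambda e: dist(*e))
        in_tree.add(j)
        edges.add(frozenset((i, j)))
    return edges

def preserves_mst(original, modified):
    """True if the modified (e.g. watermarked) dataset keeps the same MST."""
    return mst_edges(original) == mst_edges(modified)
```

A small perturbation (a watermark-sized shift) leaves the MST intact, while a large distortion changes it, which is the utility property the abstract describes.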
Two level data security using steganography and 2 d cellular automataeSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Information extraction from sensor networks using the Watershed transform alg...M H
Wireless sensor networks are an effective tool to provide fine-resolution monitoring of the physical environment. Sensors generate continuous streams of data, which leads to several computational challenges. As sensor nodes become increasingly active devices, with more processing and communication resources, various methods of distributed data processing and sharing become feasible. The challenge is to extract information from the gathered sensory data with a specified level of accuracy in a timely and power-efficient manner. This paper presents a new solution to distributed information extraction that makes use of the morphological Watershed algorithm. The Watershed algorithm dynamically groups sensor nodes into homogeneous network segments with respect to their topological relationships and their sensing states. This setting allows network programmers to manipulate groups of spatially distributed data streams instead of individual nodes. This is achieved by using network segments as programming abstractions on which various query processes can be executed. To this end, we present a reformulation of the global Watershed algorithm. The modified Watershed algorithm is fully asynchronous, where sensor nodes can autonomously process their local data in parallel and in collaboration with neighbouring nodes. Experimental evaluation shows that the presented solution is able to considerably reduce query resolution cost without sacrificing the quality of the returned results. When compared to similar-purpose schemes, such as “Logical Neighborhood”, the proposed approach reduces the total query resolution overhead by up to 57.5%, reduces the number of nodes involved in query resolution by up to 59%, and reduces the setup convergence time by up to 65.1%.
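The grouping idea (merge neighbouring nodes whose sensing states are homogeneous) can be illustrated with a flood fill over the node adjacency graph. This is a much-simplified, centralised stand-in for the distributed Watershed-style segmentation the paper proposes; the data structures and tolerance parameter are assumptions for the example.

```python
def segment_nodes(readings, adjacency, tol):
    """Group sensor nodes into homogeneous segments: neighbouring nodes
    are merged when their readings differ by at most `tol`.

    readings  : {node: sensed value}
    adjacency : {node: [neighbour nodes]}
    """
    seen, segments = set(), []
    for start in readings:
        if start in seen:
            continue
        stack, segment = [start], set()
        while stack:  # flood fill from the seed node
            n = stack.pop()
            if n in segment:
                continue
            segment.add(n)
            for m in adjacency[n]:
                if m not in segment and abs(readings[m] - readings[n]) <= tol:
                    stack.append(m)
        seen |= segment
        segments.append(segment)
    return segments
```

Three nodes reading about 20 degrees form one segment, while a neighbouring node reading 35 degrees ends up in its own segment, mirroring the "homogeneous network segments" abstraction.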
Approximation of regression-based fault minimization for network trafficTELKOMNIKA JOURNAL
This research compares three distinct approaches to computer network traffic prediction: traditional stochastic gradient descent (SGD), which uses a few random samples instead of the complete dataset for each iterative calculation; the gradient descent algorithm (GDA), a well-known optimization approach in deep learning; and the proposed method. The network traffic is computed from the traffic load (data and multimedia) of the computer network nodes via the Internet. SGD uses a modest iteration but can settle on suboptimal solutions. GDA is more complicated and can be more accurate than SGD, but its parameters, such as the learning rate, the dataset granularity, and the loss function, are difficult to tune. Network traffic estimation helps improve performance and lower costs for various applications, such as adaptive rate control, load balancing, quality of service (QoS), fair bandwidth allocation, and anomaly detection. The proposed method confirms optimal parameter values using simulation to compute the minimum of the specified loss function in each iteration.
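The contrast between the two baselines can be made concrete on a toy one-parameter regression. This is a generic sketch of full-batch gradient descent versus per-sample SGD on squared loss, not the paper's traffic model; the learning rate and epoch count are illustrative assumptions.

```python
def gd_fit(xs, ys, lr=0.01, epochs=500):
    """Full-batch gradient descent on squared loss for y ≈ w * x:
    one update per epoch, using the gradient over the whole dataset."""
    w, n = 0.0, len(xs)
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w

def sgd_fit(xs, ys, lr=0.01, epochs=500):
    """Stochastic variant: update after each individual sample
    (cyclic order here so the run is reproducible)."""
    w = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            w -= lr * 2 * (w * x - y) * x
    return w
```

On consistent data such as y = 2x both converge to w = 2; on noisy data SGD's per-sample updates are cheaper but noisier, which is the trade-off the abstract describes.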
On Using Network Science in Mining Developers Collaboration in Software Engin...IJDKP
Background: Network science is the set of mathematical frameworks, models, and measures that are used to understand a complex system modeled as a network composed of nodes and edges. The nodes of a network represent entities and the edges represent relationships between these entities. Network science has been used in many research works for mining human interaction during different phases of software engineering (SE). Objective: The goal of this study is to identify, review, and analyze the published research works that used network analysis as a tool for understanding the human collaboration on different levels of software development. This study and its findings are expected to be of benefit for software engineering practitioners and researchers who are mining software repositories using tools from network science field. Method: We conducted a systematic literature review, in which we analyzed a number of selected papers from different digital libraries based on inclusion and exclusion criteria. Results: We identified 35 primary studies (PSs) from four digital libraries, then we extracted data from each PS according to a predefined data extraction sheet. The results of our data analysis showed that not all of the constructed networks used in the PSs were valid as the edges of these networks did not reflect a real relationship between the entities of the network. Additionally, the used measures in the PSs were in many cases not suitable for the used networks. Also, the reported analysis results by the PSs were not, in most cases, validated using any statistical model. Finally, many of the PSs did not provide lessons or guidelines for software practitioners that can improve the software engineering practices. Conclusion: Although employing network analysis in mining developers’ collaboration showed some satisfactory results in some of the PSs, the application of network analysis needs to be conducted more carefully. 
That said, the constructed network should be representative and meaningful, the measures used need to be suitable for the context, and validation of the results should be considered. Over and above this, we identify some research gaps in which network science can be applied, with pointers to recent advances that can be used to mine collaboration networks.
IDENTIFICATION AND INVESTIGATION OF THE USER SESSION FOR LAN CONNECTIVITY VIA...ijcseit
This paper presents technical discussion on the identification and analysis of LAN user-sessions. Identifying a user-session is non-trivial. Classical approaches rely on threshold-based mechanisms, which are very sensitive to the chosen threshold value, and that value may be difficult to set correctly. Clustering techniques are instead used to define a novel methodology that identifies LAN user-sessions without requiring an a priori definition of threshold values. We define the clustering-based approach in detail, discuss its strengths and weaknesses, and apply it to real traffic traces. The proposed methodology is also applied to artificially generated traces to evaluate its benefits against traditional threshold-based approaches. Finally, we analyze the characteristics of user-sessions extracted by the clustering methodology from real traces and study their statistical properties.
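The contrast between the two families of approaches can be sketched in a few lines: a classical fixed-threshold splitter, and a clustering-flavoured alternative that derives the cut automatically from the inter-arrival gaps. This is a toy illustration (1-D 2-means on gaps), not the paper's actual clustering methodology.

```python
def split_sessions(timestamps, gap_threshold):
    """Classical threshold splitting: a new session starts whenever the
    idle gap between consecutive events exceeds the threshold."""
    sessions, current = [], [timestamps[0]]
    for prev, t in zip(timestamps, timestamps[1:]):
        if t - prev > gap_threshold:
            sessions.append(current)
            current = []
        current.append(t)
    sessions.append(current)
    return sessions

def auto_threshold(timestamps):
    """Clustering-flavoured alternative: 2-means on the inter-arrival gaps
    separates 'intra-session' from 'inter-session' gaps, so no manual
    threshold is needed."""
    gaps = sorted(t2 - t1 for t1, t2 in zip(timestamps, timestamps[1:]))
    lo, hi = gaps[0], gaps[-1]
    for _ in range(20):  # Lloyd iterations in one dimension
        small = [g for g in gaps if abs(g - lo) <= abs(g - hi)]
        large = [g for g in gaps if abs(g - lo) > abs(g - hi)]
        lo = sum(small) / len(small)
        hi = sum(large) / len(large) if large else hi
    return (lo + hi) / 2  # cut midway between the two gap centroids
```

On a trace with bursts at t = 0..2, 50..51 and 120, the derived threshold separates the three sessions without any hand-set value.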
New kinds of intrusions cause deviations in the normal behaviour of traffic flow in computer networks every day. This study focused on enhancing the learning capabilities of IDSs to detect anomalies in network traffic flow by comparing the k-means data mining approach for intrusion detection with an outlier detection approach. The k-means approach uses clustering mechanisms to group the traffic flow data into normal and abnormal clusters. Outlier detection calculates an outlier score (the neighbourhood outlier factor, NOF) for each flow record, whose value decides whether a traffic flow is normal or abnormal. The two methods were then compared in terms of various performance metrics and the amount of computer resources they consumed. Overall, k-means was more accurate and precise and had a better classification rate than outlier detection for intrusion detection using traffic flows. This will help system administrators in their choice of IDS.
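The outlier-scoring side of the comparison can be illustrated with a neighbourhood-based score: flows far from their nearest neighbours in feature space get high scores and are flagged as abnormal. This is a deliberate simplification (mean k-NN distance), not the exact NOF formula used in the study.

```python
import math

def knn_outlier_scores(points, k=2):
    """Toy neighbourhood-based outlier score: the mean distance from each
    point to its k nearest neighbours. Flow records with large scores
    would be treated as abnormal traffic."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q)
                       for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores
```

For four tightly clustered flows and one distant flow, the distant flow receives by far the largest score.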
Map as a Service: A Framework for Visualising and Maximising Information Retu...M H
This paper presents a distributed information extraction and visualisation service, called the mapping service, for maximising information return from large-scale wireless sensor networks. Such a service would greatly simplify the production of higher-level, information-rich, representations suitable for informing other network services and the delivery of field information visualisations. The mapping service utilises a blend of inductive and deductive models to map sense data accurately using externally available knowledge. It utilises the special characteristics of the application domain to render visualisations in a map format that are a precise reflection of the concrete reality. This service is suitable for visualising an arbitrary number of sense modalities. It is capable of visualising from multiple independent types of the sense data to overcome the limitations of generating visualisations from a single type of sense modality. Furthermore, the mapping service responds dynamically to changes in the environmental conditions, which may affect the visualisation performance by continuously updating the application domain model in a distributed manner. Finally, a distributed self-adaptation function is proposed with the goal of saving more power and generating more accurate data visualisation. We conduct comprehensive experimentation to evaluate the performance of our mapping service and show that it achieves low communication overhead, produces maps of high fidelity, and further minimises the mapping predictive error dynamically through integrating the application domain model in the mapping service.
SURVEY PAPER ON PRIVACY IN LOCATION BASED SEARCH QUERIES.ijiert bestjournal
Due to the tremendous growth in mobile phones, the market for location-based services is growing fast. Many mobile phone applications use location-based services, such as nearest-store finders and car navigation systems. Location-based services serve mobile device users based on their location information as well as their data profile, and mobile users employ them to retrieve information about nearby points of interest (POI). This exposes the user's location and data profile to misuse. Many solutions have been offered to protect users' private information, but most address only snapshot queries, with no support for continuous queries or MQMO. Some papers address MQMO but fail to provide privacy. This paper focuses on MQMO and also protects users' private information using PIR (private information retrieval).
Online stream mining approach for clustering network trafficeSAT Journals
Abstract: A large amount of research has been proposed on intrusion detection systems, leading to the implementation of agent-based intelligent IDS (IIDS), non-intelligent IDS (NIDS), signature-based IDS, and so on. When building such IDS models, the learning algorithms applied to the flow of network traffic play a crucial role in the accuracy of the IDS. The proposed work focuses on implementing a novel method to cluster network traffic that eliminates the limitations of existing online clustering algorithms and proves its robustness and accuracy over large streams of network traffic arriving at extremely high rates. We compare the existing algorithm with the novel methods to analyse accuracy and complexity. Keywords: NIDS, Data Stream Mining, Online Clustering, RAH algorithm, Online Efficient Incremental Clustering algorithm
Automatic signature verification with chain code using weighted distance and ...eSAT Journals
Abstract: Signature forgery can be restricted by either online or offline signature verification techniques. Online verification matches the signature dynamically against a pre-processed sample by detecting the motion of the stylus during signing, while offline verification performs the match using a two-dimensional scanned image of the signature. This paper studies the various techniques available for offline signature verification, along with their shortcomings.
Keywords: Signature Verification, Weighted Distance, High Pressure Factor, Normalization, Threshold Value
Investigating the Effect of BD-CRAFT to Text Detection Algorithmsgerogepatton
With the rise and development of deep learning, computer vision and document analysis have influenced the area of text detection. Despite significant efforts to improve text detection performance, it remains challenging, as evidenced by the series of Robust Reading Competitions. This study investigates the impact of applying BD-CRAFT, a variant of CRAFT that automatically classifies images using a Laplacian operator and further preprocesses the classified blurry images using blind deconvolution, to the top-ranked algorithms SenseTime and TextFuseNet. Results revealed that the proposed method significantly enhanced the detection performance of both algorithms. TextFuseNet + BD-CRAFT achieved an outstanding h-mean of 93.55% and an improvement of over 4% in precision, yielding 95.71%, while SenseTime + BD-CRAFT placed first with a remarkable 95.22% h-mean and likewise exhibited a precision improvement of over 4%.
Analysis and assessment software for multi-user collaborative cognitive radi...IJECEIAES
Computer simulations are without a doubt a useful methodology that allows to explore research queries and develop prototypes at lower costs and timeframes than those required in hardware processes. The simulation tools used in cognitive radio networks (CRN) are undergoing an active process. Currently, there is no stable simulator that enables to characterize every element of the cognitive cycle and the available tools are a framework for discrete-event software. This work presents the spectral mobility simulator in CRN called “App MultiColl-DCRN”, developed with MATLAB’s app designer. In contrast with other frameworks, the simulator uses real spectral occupancy data and simultaneously analyzes features regarding spectral mobility, decision-making, multi-user access, collaborative scenarios and decentralized architectures. Performance metrics include bandwidth, throughput level, number of failed handoffs, number of total handoffs, number of handoffs with interference, number of anticipated handoffs and number of perfect handoffs. The assessment of the simulator involves three scenarios: the first and second scenarios present a collaborative structure using the multi-criteria optimization and compromise solution (VIKOR) decision-making model and the naïve Bayes prediction technique respectively. The third scenario presents a multi-user structure and uses simple additive weighting (SAW) as a decision-making technique. The present development represents a contribution in the cognitive radio network field since there is currently no software with the same features.
Handwritten digit recognition using quantum convolution neural networkIAESIJAI
The recognition of handwritten digits holds a significant place in the field of information processing. Recognizing such characters accurately from images is a complex task because of the vast differences in people's writing styles. Furthermore, the presence of various image artifacts such as blurring, intensity variations, and noise adds to the complexity of this process. The existing algorithm, convolution neural network (CNN) is one of the prominent algorithms in deep learning to handle the above problems. But there is a difficulty in handling input data that differs significantly from the training data, leading to decreased accuracy and performance. In this work, a method is proposed to overcome the aforementioned limitations by incorporating a quantum convolutional neural network algorithm (QCNN). QCNN is capable of performing more complex operations than classical CNNs. It can achieve higher levels of accuracy than classical CNNs, especially when working with noisy or incomplete data. It has the potential to scale more efficiently and effectively than classical CNNs, making them better suited for large-scale applications. The effectiveness of the proposed model is demonstrated on the modified national institute of standards and technology (MNIST) dataset and achieved an average accuracy of 91.08%.
As we know, fingerprints are unique to every living being, and it is quite difficult to recover prints. Forensic examiners usually use fine powder and duct tape to lift the prints of a living being. As the powder is exceptionally messy, such particles can cause loss of information; after examination, the information is matched against the system. The proposed system consists of an embedded device containing an ultra light to reveal fingerprint details. After that we can detect the fingerprint and analyze it; it is checked against the database, and the output is returned after matching. A matching algorithm is used for matching and analysis of the fingerprint.
A simplified predictive framework for cost evaluation to fault assessment usi...IJECEIAES
Software engineering is an integral part of any software development scheme, which frequently encounters bugs, errors, and faults. Predictive evaluation of software faults contributes towards mitigating this challenge to a large extent; however, no benchmarked framework has been reported for this yet. Therefore, this paper introduces a computational framework of the cost evaluation method to facilitate a better form of predictive assessment of software faults. Based on lines of code, the proposed scheme adopts a machine-learning approach to perform predictive analysis of faults. The proposed scheme presents an analytical framework of the correlation-based cost model integrated with multiple standard machine learning (ML) models, e.g., linear regression, support vector regression, and artificial neural networks (ANN). These learning models are executed and trained to predict software faults with higher accuracy. The study assesses the outcomes based on error-based performance metrics in detail to determine how well each learning model performs and how accurately it learns. It also examines the factors contributing to the training loss of neural networks. The validation results demonstrate that, compared to logistic regression and support vector regression, the neural network achieves a significantly lower error score for software fault prediction.
Online learning algorithms often have to operate in the presence of concept drift. A recent study revealed that different diversity levels in an ensemble of learning machines are required in order to maintain high generalization on both old and new concepts. Inspired by this study, and based on a further study of diversity with different strategies to deal with drifts, we propose a new online ensemble learning approach called Diversity for Dealing with Drifts (DDD). DDD maintains ensembles with different diversity levels and is able to attain better accuracy than other approaches. Furthermore, it is very robust, outperforming other drift handling approaches in terms of accuracy when there are false positive drift detections. It performs at least as well as other drift handling approaches under various conditions, with very few exceptions. The paper presents an analysis of low- and high-diversity ensembles combined with different strategies to deal with concept drift, and proposes a new approach (DDD) to handle drifts.
Concept drift and machine learning model for detecting fraudulent transaction...IJECEIAES
In a streaming environment, data is continuously generated and processed in an ongoing manner, and it is necessary to detect fraudulent transactions quickly to prevent significant financial losses. Hence, this paper proposes a machine learning-based approach for detecting fraudulent transactions in a streaming environment, with a focus on addressing concept drift. The approach utilizes the extreme gradient boosting (XGBoost) algorithm. Additionally, the approach employs four algorithms for detecting continuous stream drift. To evaluate the effectiveness of the approach, two datasets are used: a credit card dataset and a Twitter dataset containing financial fraud-related social media data. The approach is evaluated using cross-validation and the results demonstrate that it outperforms traditional machine learning models in terms of accuracy, precision, and recall, and is more robust to concept drift. The proposed approach can be utilized as a real-time fraud detection system in various industries, including finance, insurance, and e-commerce.
Advance Clustering Technique Based on Markov Chain for Predicting Next User M...idescitation
According to surveys, India is one of the leading countries in the world for technical and management education, with the number of students increasing at a growth rate of 45% per annum. Advancement in technology has a special effect on the education system and helps in upgrading higher education; some universities and colleges use these technologies, and the weblog is one of them. The main aim of this paper is to represent web logs using a clustering technique for predicting the next user movement and for user behavior analysis. The paper centers on a web log clustering technique based on Markov chain results, presenting an approach to web clustering (clustering web site users) and predicting their behavior for the next visit. Methodology: for generating effective results, web usage data from approximately 14 engineering colleges is used, and an advanced clustering approach is presented after optimizing the other clustering approaches. Results: user behavior is predicted with the help of the advanced clustering approach based on FPCM and k-means, and the proposed algorithm is used to mine and predict users' preferred paths. Existing approaches have been used to predict user behavior, but they are not sufficient because of their sensitivity to noise; with the help of ACM, noise is reduced, providing more accurate results for predicting user behavior. Implementation: the algorithm was implemented in MATLAB, DTRG, and Java. The experimental results have validated the method's effectiveness in predicting user behavior in comparison with some previous studies.
A Review on Traffic Classification Methods in WSNIJARIIT
In a wireless network it is very important to provide network security and quality of service. To achieve these, there must be proper traffic classification in the network. Many algorithms are used for this, such as port number and deep packet inspection as the earlier methods, and nowadays KISS, the nearest cluster based classifier (NCC), and SVM methods, to classify the traffic and improve the network security and quality of service of a network.
Enhancement of Single Moving Average Time Series Model Using Rough k-Means fo...IJERA Editor
In the last decade, real-time audio and video services have gained much popularity and now occupy a large portion of the total network traffic on the Internet. As real-time services become mainstream, the demand for Quality of Service (QoS) is greater than ever before, and it is necessary to use network resources to the fullest to satisfy it. To solve this issue, we need to apply a prediction model for network traffic as the basis of network management tasks such as congestion control and bandwidth allocation. In this paper, we propose an integrated model that combines Rough K-Means (RKM) clustering with a Single Moving Average (SMA) time series model to improve the prediction of packet loading in network traffic. The single moving average time series prediction model is used to predict the packet loading volume in real network traffic. Further, clustering granules obtained using rough k-means are used to analyze the network data of each year separately. The proposed model integrates the prediction results obtained from the conventional single moving average prediction model with the centroids of clusters obtained from rough k-means clustering. The model is evaluated using online network traffic data collected from the WIDE backbone network; MSE, RMSE, and MAPE metrics are used to examine the results of the integrated model. The experimental results show that the integrated model can be an effective way to improve prediction accuracy, achieved with the help of rough k-means clustering. A comparative result between the conventional prediction model and our integrated model is presented.
Similar to STUDY OF DISTANCE MEASUREMENT TECHNIQUES IN CONTEXT TO PREDICTION MODEL OF WEB CACHING AND WEB PREFETCHING (20)
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
STUDY OF DISTANCE MEASUREMENT TECHNIQUES IN CONTEXT TO PREDICTION MODEL OF WEB CACHING AND WEB PREFETCHING
International Journal on Soft Computing, Artificial Intelligence and Applications (IJSCAI), Vol.5, No.1, February 2016
DOI: 10.5121/ijscai.2016.5101
STUDY OF DISTANCE MEASUREMENT TECHNIQUES IN CONTEXT TO PREDICTION MODEL OF WEB CACHING AND WEB PREFETCHING
Dharmendra Patel
Smt. Chadaben Mohanbhai Patel Institute of Computer Applications, Charotar University of Science and Technology, Changa, Gujarat, India
ABSTRACT
The Internet is a boon of the modern era, as every organization uses it for the dissemination of information and e-commerce related applications. People in organizations sometimes experience delay while accessing the Internet in spite of adequate bandwidth. A prediction model combining web caching and prefetching is an ideal solution to this delay problem. The prediction model analyses the history of Internet users from raw server log files, determines the future sequence of web objects, and places those web objects nearer to the user, so that access latency is reduced to some extent and the delay problem is solved. To determine the sequence of future web objects, it is necessary to determine the proximity of one web object to another by identifying a proper distance metric technique suited to web caching and prefetching. This paper studies different distance metric techniques and concludes that bioinformatics-based distance metric techniques are ideal in the context of Web Caching and Web Prefetching.
KEYWORDS
Web Caching, Web Prefetching, Distance Metric, Access Latency, Bioinformatics
1. INTRODUCTION
The WWW is an information hub consisting of an enormous number of web objects in the form of web pages, audio, video, images, etc. The common problem for Internet users is delay while accessing web objects in spite of adequate bandwidth. The integration of Web Caching and Web Prefetching concepts [1] is the software solution to this delay problem. The main purpose of Web Prefetching is to generate patterns from historical data. The following figure describes the main approaches to pattern discovery.
The Dependency Graph increases network traffic and its prediction accuracy is very low [2]. The Markov Model is a well-known model that predicts the next page by matching the user's current access sequence against the user's historical web access sequences, but its main limitations are that, in the lowest-order Markov model, prediction is not accurate, while in higher-order Markov models convergence is slow and complexity is high [3]. The Cost Function is less popular; it uses measures such as popularity and lifetime to generate patterns [4]. Generating new patterns is quite impossible and very complex with all of the dependency graph, Markov Model, and Cost Function techniques. Data Mining [5] is the ultimate Web Prefetching technique, as it generates new patterns from large data sets. The techniques of data mining applied to web data are known as Web Mining [6]. Web Mining has three main categories: (i) Web Content, (ii) Web Structure, and (iii) Web Usage. Web Caching and Prefetching based prediction models use the data of raw server log files to generate meaningful sequences of patterns, which means the Web Usage Mining category of Web Mining is applicable here. Web Usage Mining has three main steps: (a) Data Preprocessing, (b) Pattern Discovery, and (c) Pattern Analysis [7]. A number of studies have addressed the data preprocessing stage [8][9][10][11][12][13], but very few address pattern discovery and analysis. This paper focuses on pattern discovery and analysis in an exhaustive way. For the pattern discovery and analysis stage it is necessary to identify the sessions of each particular user.
A number of sessionization techniques [14][15][16] are identified in the literature, and the technique combining IP, agent, and a sessionization heuristic is ideal in the context of the prediction model, since it imposes no additional overhead to determine sessions and the data regarding IP and agent are always available. Once sessions are formed, it is a challenge to determine the proximity of one session to another in order to generate interesting, new, hidden, and valuable sequences of patterns in the context of the user. To determine the proximity among objects, a number of distance metric techniques [17][18][19] are identified in the literature. A detailed study and the identification of an appropriate distance measure are vital for the generation of sequences of patterns. Section 2 describes a detailed study of distance metric techniques in the context of web caching and prefetching. Section 3 provides conclusions about all the distance measurement techniques in the context of Web Caching and Web Prefetching.
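To make the IP + agent + heuristic sessionization step concrete, the following minimal Python sketch groups raw log entries into sessions. The log entries, the field layout, and the 30-minute inactivity timeout are illustrative assumptions, not details taken from the cited techniques:

```python
from collections import defaultdict

# Hypothetical log entries: (ip, user_agent, url, timestamp_in_seconds).
LOG = [
    ("10.0.0.1", "Mozilla", "/home", 0),
    ("10.0.0.1", "Mozilla", "/products", 120),
    ("10.0.0.2", "Chrome", "/home", 130),
    ("10.0.0.1", "Mozilla", "/contact", 4000),  # gap > timeout -> new session
]

def sessionize(log, timeout=1800):
    """Group requests by (IP, agent); start a new session when the gap
    between consecutive requests exceeds the timeout (30 minutes here)."""
    by_user = defaultdict(list)
    for ip, agent, url, ts in sorted(log, key=lambda e: e[3]):
        by_user[(ip, agent)].append((url, ts))
    sessions = []
    for hits in by_user.values():
        current = [hits[0][0]]
        for (_, prev_ts), (url, ts) in zip(hits, hits[1:]):
            if ts - prev_ts > timeout:
                sessions.append(current)
                current = []
            current.append(url)
        sessions.append(current)
    return sessions

print(sessionize(LOG))  # [['/home', '/products'], ['/contact'], ['/home']]
```

Each resulting session is a sequence of URLs, which is exactly the kind of string sequence whose pairwise proximity the distance metrics below must measure.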
2. STUDY OF DISTANCE MEASUREMENT TECHNIQUES
The most popular and commonly used distance metric technique studied in the literature is the Euclidean distance [20][21][22]. It is the ordinary distance between two points that would be measured with a ruler, and it is derived from the Pythagorean formula. The distance between two data points in the plane with coordinates (p, q) and (r, s) is:

DIST((p, q), (r, s)) = sqrt((p - r)^2 + (q - s)^2)
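As a small illustration of these "ruler" distances, the following Python sketch implements the Euclidean distance, along with the Manhattan and Minkowski variants discussed later in this section; the function names are chosen here for illustration only:

```python
def minkowski(p_point, q_point, order):
    """Minkowski distance of the given order;
    order = 1 gives Manhattan, order = 2 gives Euclidean."""
    return sum(abs(a - b) ** order for a, b in zip(p_point, q_point)) ** (1 / order)

def euclidean(p_point, q_point):
    # Straight-line distance (Pythagorean formula).
    return minkowski(p_point, q_point, 2)

def manhattan(p_point, q_point):
    # Sum of absolute coordinate differences (grid-like path).
    return minkowski(p_point, q_point, 1)

print(euclidean((0, 0), (3, 4)))   # 5.0
print(manhattan((0, 0), (3, 4)))   # 7.0
```

Note that all three operate on numeric coordinates, which is why, as the text observes, they are unsuitable for comparing URL strings within web sessions.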
The usefulness of the Euclidean distance depends on the circumstances. For points in a plane it provides pretty good results, but with slow speed, and it is not useful for string distance measurement. The Euclidean distance can be extended to any number of dimensions. Another popular distance metric technique is the Manhattan distance, which computes the distance from one data point to another when a grid-like path is followed. The Manhattan distance between two items is the sum of the absolute differences of their corresponding elements [23]. The distance between two data points with coordinates (p1, q1) and (p2, q2) is:

DIST((p1, q1), (p2, q2)) = |p1 - p2| + |q1 - q2|
It is the summation of the horizontal and vertical components, whereas the diagonal distance would be computed using the Pythagorean formula. Being related to the Euclidean distance, it exhibits similar characteristics. It is generally useful in gaming applications like chess, to determine the distance from one square to another along the grid. Minkowski is another famous distance measurement technique that can be considered a generalization of both the Euclidean and Manhattan distances. Several studies [24][25][26] have used the Minkowski distance measurement technique to determine similarity among objects. The Minkowski distance of order p between two points P(x1, x2, ..., xn) and Q(y1, y2, ..., yn) in R^n is defined as:

DIST(P, Q) = (|x1 - y1|^p + |x2 - y2|^p + ... + |xn - yn|^p)^(1/p)
If p equals one, the Minkowski distance is known as the Manhattan distance, while for p = 2 it becomes the Euclidean distance. All the distances discussed so far are ordinary distances measured with a ruler; they are not suitable for measuring the similarity of two strings, so they are not appropriate in the context of the proposed research, since a web session consists of a number of strings in the form of URLs. The Hamming distance is a popular similarity measure for strings that determines similarity by counting the number of positions at which the corresponding characters differ. More formally, the Hamming distance between two strings P and Q of equal length is the number of indices i for which Pi differs from Qi. Hamming distance theory is used widely in several applications, such as the quantification of information, the study of the properties of codes, and secure communication. The main limitation of the Hamming distance is that it is only applicable to strings of the same length.
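A minimal Python sketch of the Hamming distance as just defined, enforcing the equal-length requirement noted above:

```python
def hamming(p, q):
    """Count the positions at which corresponding characters differ;
    defined only for strings of equal length."""
    if len(p) != len(q):
        raise ValueError("Hamming distance requires equal-length strings")
    return sum(a != b for a, b in zip(p, q))

print(hamming("karolin", "kathrin"))  # 3
```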
The Hamming distance is widely used in the field of error-free communication [27,28], because the byte length remains the same for both parties involved in the communication. The Hamming distance is not very useful in the context of web usage mining, since web sessions do not all have the same length. The Levenshtein distance, or edit distance, is a more sophisticated distance measurement technique for string similarity. It is the key distance in several fields, such as optical character recognition, text processing, computational biology, fraud detection, and cryptography, and was studied extensively by many
authors [29,30,31]. The Levenshtein distance between two strings S1 and S2 is given by lev(|S1|, |S2|), where

lev(i, j) = max(i, j)  if min(i, j) = 0,

and otherwise

lev(i, j) = min( lev(i-1, j) + 1,
                 lev(i, j-1) + 1,
                 lev(i-1, j-1) + [S1i != S2j] )

Here [S1i != S2j] equals 1 when the i-th character of S1 differs from the j-th character of S2, and 0 otherwise.
The Levenshtein distance measurement technique is ideal in the web session context, since it is applicable to strings of unequal size.
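The Levenshtein recurrence translates directly into a dynamic-programming implementation; the following Python sketch is illustrative, not the paper's own code:

```python
def levenshtein(s1, s2):
    """Edit distance between s1 and s2 via dynamic programming."""
    m, n = len(s1), len(s2)
    # lev[i][j] = distance between the prefixes s1[:i] and s2[:j].
    lev = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        lev[i][0] = i          # base case: max(i, j) when min(i, j) == 0
    for j in range(n + 1):
        lev[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            lev[i][j] = min(lev[i - 1][j] + 1,         # deletion
                            lev[i][j - 1] + 1,         # insertion
                            lev[i - 1][j - 1] + cost)  # substitution
    return lev[m][n]

print(levenshtein("kitten", "sitting"))  # 3
```

Because the two inputs may have different lengths, the same function could compare two web sessions encoded as strings of page identifiers.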
Several bioinformatics distance measurement techniques used to align protein or nucleotide sequences can be applied from a web mining perspective to cluster web sessions of unequal size. One of the most important techniques in this category was invented by Saul B. Needleman and Christian D. Wunsch [32] to align protein sequences of unequal size. The technique uses dynamic programming, meaning that it solves a complex problem by breaking it down into simpler subproblems. It is a global alignment technique, for which closely related sequences of similar length are most appropriate. Alignment is carried out from the beginning to the end of the sequences to find the best possible alignment. The technique uses a scoring system: a positive or higher value is assigned for a match, and a negative or lower value for a mismatch. It uses gap penalties to maximize the meaning of the sequence; the gap penalty is subtracted for each gap that is introduced. There are two main types of gap penalties, opening and extension. The opening penalty is applied at the start of a gap, and the gaps following it are given a gap extension penalty, which is smaller than the opening penalty. Typical values are -12 for gap opening and -4 for gap extension. According to the Needleman-Wunsch algorithm, an initial matrix of dimension N * M is created, where N, the number of rows, equals the number of characters of the first string plus one, and M, the number of columns, equals the number of characters of the second string plus one. The extra row and column are used to align with a gap. After that, a
scoring scheme is introduced, which can be user defined with specific scores. The simple basic scoring scheme is: if the characters at the i-th and j-th positions of the two sequences are the same, the match score is 1 (S(i,j) = 1); if they are not the same, the mismatch score is assumed to be -1 (S(i,j) = -1). The gap penalty is assumed to be -1. When any operation such as insertion or deletion is performed, the dynamic programming matrix is defined through three different steps:
deletion, the dynamic programming matrix is defined with three different steps:
1. Initialization Phase: - In initialization phase the gap score can be added to previous cell of
the row and column.
2. Matrix Filling Phase: - It is most crucial phase and matrix filling starting from the upper
left hand corner of the matrix. It is required to know the diagonal, left and right score of the
current position in order to find maximum score of each cell. From the assumed values, add
match or mismatch score to diagonal value. Same way repeat the process for left and right value.
Take the maximum of three values (i.e. diagonal, right and left) and fill ith
and jth
positions with
obtained score. The equation to calculate the maximum score is as under:
M(i,j) = max[ M(i-1,j-1) + S(i,j), M(i,j-1) + W, M(i-1,j) + W ]

where i and j denote the row and column, M(i,j) is the matrix value of the required cell, S(i,j) is the match/mismatch score of the required cell, and W is the gap penalty.
3. Alignment through Trace Back: this is the final step of the Needleman-Wunsch algorithm, tracing back through the matrix to recover the best possible alignment. The best alignment among the candidates is identified by the maximum alignment score.
The Needleman-Wunsch distance measurement technique is ideal for string similarity, so this technique is also considered in the proposed research context.
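The initialization and matrix-filling phases described above can be sketched in Python as follows. For simplicity this sketch uses a single linear gap penalty of -1 rather than the separate opening/extension penalties mentioned earlier, and it returns only the alignment score, omitting the trace-back step:

```python
def needleman_wunsch(s1, s2, match=1, mismatch=-1, gap=-1):
    """Global alignment score with the text's scoring scheme
    (match = 1, mismatch = -1, gap = -1)."""
    m, n = len(s1), len(s2)
    M = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        M[i][0] = i * gap      # initialization: accumulate gap penalties
    for j in range(n + 1):
        M[0][j] = j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if s1[i - 1] == s2[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1] + s,   # diagonal: match/mismatch
                          M[i][j - 1] + gap,     # left: gap
                          M[i - 1][j] + gap)     # upper: gap
    return M[m][n]

print(needleman_wunsch("ABC", "AC"))  # 1  (one gap between two matches)
```

For web sessions, s1 and s2 would be sequences of page identifiers rather than characters; the recurrence is unchanged.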
Smith-Waterman is an important bioinformatics technique for aligning different strings. It compares segments of all possible lengths and optimizes the measure of similarity. Temple F. Smith and Michael S. Waterman [33] were the founders of this technique. The main difference in comparison with Needleman-Wunsch is that negative scoring matrix cells are set to zero, which makes local alignment visible. The technique compares segments of varying lengths instead of looking at the entire sequence at once. The main advantages of the Smith-Waterman technique are:
To give the conserved regions between the two sequences.
To align partially overlapping sequences.
To align a subsequence of a sequence to itself.
Like Needleman-Wunsch, this technique also uses a scoring matrix system. For the scoring system and gap analysis, the same concepts used in Needleman-Wunsch are applicable here in Smith-Waterman. It also uses the same steps of initialization, matrix filling, and alignment through trace back. The equation to calculate the maximum score is the same as in Needleman-Wunsch. The main differences between Needleman-Wunsch and Smith-Waterman are:
Needleman-Wunsch does global alignment while Smith-Waterman focuses on local alignment.
In Needleman-Wunsch a cell's score may be positive, negative, or zero, while in Smith-Waterman negative cells are set to zero, so every score is >= 0.
Needleman-Wunsch does not require a gap penalty for processing, while Smith-Waterman requires a gap penalty to work effectively.
In Needleman-Wunsch the score cannot decrease between two cells of a pathway, while in Smith-Waterman the score can increase, decrease, or remain the same between two cells of a pathway.
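The zero-floor difference can be seen directly in code. Below is a minimal sketch of the Smith-Waterman scoring phase: the recurrence is the same as in Needleman-Wunsch except that a fourth option, zero, clamps negative cells, and the result is the maximum cell anywhere in the matrix rather than the bottom-right corner.

```python
def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-1):
    """Local alignment score: Needleman-Wunsch recurrence with a zero floor."""
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]  # first row and column stay 0
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,                      # zero floor: restarts the local alignment
                          H[i - 1][j - 1] + s,    # diagonal: match/mismatch
                          H[i][j - 1] + gap,      # left: gap
                          H[i - 1][j] + gap)      # above: gap
            best = max(best, H[i][j])             # best local alignment ends anywhere
    return best
```

For example, "abcde" and "zbcdz" share the local segment "bcd", so the score is 3, while two strings with no characters in common score 0 rather than a negative value.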
Table 1 describes the comparison of different distance metric techniques in the context of the proposed research.
Table 1. Comparison of distance metric techniques
Sr.No | Technique | Description | Advantages | Disadvantages
1 | Euclidean Distance | It describes the distance between two points that would be measured with a ruler, calculated using the Pythagorean theorem. | (1) It is fast for determination of correlation among points. (2) It is a fair measure because it compares data points based on actual ratings. | (1) It is not suitable for ordinal data like strings. (2) It requires actual data, not ranks.
2 | Levenshtein | It is a string metric for measuring the difference between two strings. | It is fast and best suited for string similarity. | It does not consider the order of the sequence of characters while comparing.
3 | Needleman-Wunsch | It is a bioinformatics algorithm and provides global alignment between strings while comparing. | It is best for string comparison because it considers the ordering of the sequence of characters. | It requires strings of the same length while comparing.
4 | Smith-Waterman | It is a bioinformatics algorithm and provides local alignment between strings while comparing. | It is best for string comparison because it considers the ordering of the sequence of characters, and it is applicable for strings of similar or dissimilar lengths. | It is more complex than any global alignment technique.
From the table above it can be concluded that Euclidean distance is not suitable for the proposed research because web sessions consist of sequences of web objects, which are in string format. Levenshtein distance is a very good technique for string sequence similarity, but for a prediction model of web caching and prefetching the ordering of web objects is an important aspect that is ignored by this distance metric, so it is also not an appropriate choice in the proposed research context. Both Needleman-Wunsch and Smith-Waterman consider the ordering of the sequence for string matching, so they are ideal for this context. Web sessions are not always of the same length, so the Needleman-Wunsch algorithm is not a perfect fit for the formation of web session clusters, as it only provides global alignment. The Smith-Waterman algorithm is applicable to sequences of the same length as well as sequences of dissimilar lengths, so it is the ideal algorithm for the formation of clusters in this proposed research.
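As a small worked example of this argument, local alignment can score web sessions of unequal length directly. The sessions and page identifiers below are hypothetical, chosen only for illustration; the scoring function is a compact Smith-Waterman sketch over lists of page names rather than characters.

```python
def local_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman score between two web sessions (lists of page identifiers)."""
    H = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,                      # zero floor for local alignment
                          H[i - 1][j - 1] + s,
                          H[i][j - 1] + gap,
                          H[i - 1][j] + gap)
            best = max(best, H[i][j])
    return best

# Hypothetical web sessions of different lengths.
s1 = ["home", "products", "cart", "checkout"]
s2 = ["home", "products", "cart"]
s3 = ["blog", "about"]

sim_12 = local_alignment_score(s1, s2)  # shares an ordered run of three pages
sim_13 = local_alignment_score(s1, s3)  # no pages in common
```

Here `sim_12` is 3 and `sim_13` is 0, so s1 and s2 would fall into the same cluster even though their lengths differ, which is exactly the property a global alignment cannot offer.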
3. CONCLUSION
In the era of the internet, reducing user access latency is a most challenging job. To reduce access latency, a Web Caching and Prefetching based approach is ideal, but its success depends on the accuracy of patterns as well as the capacity to generate new patterns. This paper described the most efficient techniques to generate more accurate patterns. It dealt with distance measurement techniques in order to determine proximity among web sessions. The challenge is to identify the best technique in the context of Web Caching and Prefetching. This paper studied distance measurement techniques in depth and identified theoretically which techniques are suitable in the context of the application of Web Caching and Prefetching.
REFERENCES
[1] Sarina Sulaiman, Siti, Ajith Abraham, Shahida Sulaiman, “Web Caching and Prefetching: What,
Why, and How? “, IEEE, Information Technology,2008,Volume-4,pages: 1-8.
[2] A. Nanopoulos, D. Katsaros, and Y. Manolopoulos, “A data mining algorithm for generalized Web
Prefetching”, IEEE Trans. on Knowledge and Data Engineering, 15(5), (2003), pp. 1155-1169.
[3] Dharmendra T.Patel, Dr.Kalpesh Parikh, “Quantitative Study of Markov Model for Prediction of User
Behavior for Web Caching and Prefetching Purpose”, International Journal of Computer Applications
(0975- 8887) , Volume 65 No.15 , March 2013,PP. 39-49.
[4] E. P. Markatos and C. E. Chronaki : A Top-10 approach to prefetching on the Web “, Proceedings of
INET'98 Geneva, Switzerland, (1998), pp. 276-290.
[5] Xindong Wu, Vipin Kumar, J. Ross Quinlan, “Top 10 Algorithms in Data Mining”, Springer,
Knowledge Information System,2008,PP. 1-37.
[6] Cooley, R., Mobasher, B., Srivastava, J., “Web Mining: Information and Pattern Discovery on the
World Wide Web”, Tools with Artificial Intelligence, 1997, Proceedings, Ninth IEEE International
Conference, 3-8 Nov 1997, pp. 558-567.
[7] Xinjin Li, Sujing Zhang,” Application of Web Usage Mining in e-learning Platform”, IEE 2010
International Conference on E-Business and E-Government, May 2010,PP-1391-1394.
[8] Jaideep Srivastava, Robert Cooley, Mukund Deshpande, Pang-Ning Tan,” Web Usage Mining:
Discovery and Applications of Usage Patterns from Web Data”, SIGKDD Explorations,2000,
Vol.1(2) ,PP 1-12.
[9] Mohd Helmy Abd Wahab, Mohd Norzali Haji Mohd, Hafizul Fahri Hanafi, Mohamad Farhan
Mohamad Mohsin- “ Data Pre-processing on Web Server Logs for Generalized Association Rules
Mining Algorithm”, World Academy of Science, Engineering and Technology 48 2008.
[10] Tanasa D., Trousse B.. Advanced data preprocessing for intersites Web usage mining.
IntelligentSystems, IEEE,2004(19), PP 59 – 65.
[11] Yuan, F., L.-J. Wang, et al.,” Study on Data Preprocessing Algorithm in Web Log Mining”,
Proceedings of the Second International Conference on Machine Learning and Cybernetics, Wan, 2-5
November 2003.
[12] Pabarskaite, Z,” Implementing Advanced Cleaning and End-User Interpretability Technologies in
Web Log Mining”. 24th Int. Conf. information Technology Interfaces /TI 2002, June 24-27, 2002,
Cavtat, Croatia.
[13] Li Chaofeng, “ Research and Development of Data Preprocessing in Web Usage Mining”,
International Conference on Management Science and Engineering , 2006.
[14] Robert.Cooley,Bamshed Mobasher, and Jaideep Srinivastava, “ Web mining:Information and Pattern
Discovery on the World Wide Web”, In International conference on Tools with Artificial Intelligence,
pages 558-567, Newport Beach, IEEE,1997.
[15] Robert F.Dell ,Pablo E.Roman, and Juan D.Velasquez, “Web User Session Reconstruction Using
Integer Programming,” , IEEE/ACM International Conference on Web Intelligence and Intelligent
Agent,2008.
[16] Spilipoulou M.and Mobasher B, Berendt B.,”A framework for the Evaluation of Session
Reconstruction Heuristics in Web Usage Analysis”, INFORMS Journal on Computing Spring ,2003.
[17] Frank Y. Shih, Yi-Ta Wu, “Fast Euclidean distance transformation in two scans using a 3×3
neighborhood”, Computer Vision and Image Understanding, Elsevier, 2004, pp. 195-205.
[18] L. Kari, S. Konstantinidis, S. Perron, G. Wozniak, J. Xu, “Computing the Hamming distance of a
regular language in quadratic time,” WSEAS Transactions on Information Science &
Applications,vol. 1, pp. 445–449, 2004.
[19] Stavros Konstantinidis, “Computing the Levenshtein Distance of a Regular Language”, In the Proc. of
IEEE ISOC ITW2005 on Coding and Complexity; editor M.J. Dinneen; co-chairs U. Speidel and D.
Taylor; PP 113-116.
[20] A. Alfakih, A. Khandani, and H. Wolkowicz ”Solving Euclidean distance matrix completion
problems via semidenite programming”, Comput. Optim. (1999), PP.13-30.
[21] M. Bakonyi and C. R. Johnson, “ The Euclidean distance matrix completion problem”, SIAM J.
Matrix Anal. (1995), PP 645-654.
[22] H. X. Huang, Z. A. Liang, and P. M Pardalos,”Some properties for the Euclidean distancematrix and
positive semidenite matrix completion problems”, J. Global Optim.,(2003),PP 3-21.
[23] Sung-Hyuk Cha, “ Comprehensive Survey on Distance/Similarity Measures between Probability
Density Functions”, International Journal of Mathematical Models and Methods in Applied Sciences,
Issue 4, Volume 1, (2007).
[24] Agrawal P. K., Flato E., Halperin D. : Polygon decomposition for efficient construction of minkowski
sums. Comput. Geom. Theory Appl. 21 , 1 (2002), 39–61.
[25] Fogel E., Halperin D.: Exact and efficient construction of Minkowski sums of convex polyhedra with
applications. In Proc. ALENEX 2006 (2006).
[26] Gritzmann P., Sturmfels B.: Minkowski addition of polytopes: Computational complexity and
applications to Gröbner bases. SIAM J. Discrete Math. 6, 2 (1993), 246-269.
[27] A. Ambainis, W. Gasarch, A. Srinavasan, A. Utis: Lower bounds on the deterministic and quantum
communication complexity of hamming distance, cs.CC/0411076, (2004).
[28] Wei Huang, Yaoyun Shi, Shengyu Zhang, Yufan Zhu: The communication complexity of the Hamming
distance problem, Elsevier, Information Processing Letters, (2006), pp. 149-153.
[29] Navarro, G.: A guided tour to approximate string matching, ACM Comput. Surv., 33(1):31-88, (2001).
[30] A. Andoni and R. Krauthgamer: The computational hardness of estimating edit distance. In Proceedings
of the Symposium on Foundations of Computer Science, (2007).
[31] G. Navarro, R. Baeza-Yates, E. Sutinen, and J. Tarhio: Indexing methods for approximate string
matching, IEEE Data Engineering Bulletin, 24(4):19–27, (2001). Special issue on Text and Databases
Invited paper.
[32] Needleman Saul B and Wunsch Christian D. :A general method applicable to the search for
similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48 (3): 443– 53,
1970.
[33] Smith, Temple F.; and Waterman, Michael S.: Identification of Common Molecular Subsequences,
Journal of Molecular Biology, 147(1), pages 195-197, 1981.
AUTHOR
He received his Ph.D. degree from Kadi Sarva Vishwavidyalaya, Gandhinagar, in the field of Web
Mining. Currently he is working as an associate professor at Smt. Chandaben Mohanbhai Patel
Institute of Computer Applications, CHARUSAT, Changa, Gujarat, India. He has published 12
research papers in international journals of repute. He is associated with many journals of repute
as an editorial board/reviewer board member. He is a member of several professional bodies.