Multi-agent systems (also called autonomous agents, or simply agents) and knowledge discovery (data mining) are two active areas in information technology. Bringing these two communities together has revealed tremendous potential for new opportunities and wider applications through the synergy of agents and data mining. Multi-agent systems (MAS) often address complex applications that require distributed problem solving. In many applications, the individual and collective behavior of the agents depends on data observed from distributed sources. Data mining technology emerged to identify patterns and trends in large quantities of data. The increasing demand to scale up to massive data sets, inherently distributed over networks with limited bandwidth and computational resources, motivated the development of distributed data mining (DDM). DDM originated from the need to mine decentralized data sources: it performs partial analysis of the data at individual sites and then sends the outcomes, as partial results, to other sites, where they are aggregated into a global result.
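The partial-then-global pattern described above can be sketched as follows. This is a minimal illustration, not any specific DDM algorithm; the site data and support threshold are invented for the example.

```python
from collections import Counter

def mine_local(transactions):
    """Partial analysis at one site: count item frequencies locally."""
    counts = Counter()
    for t in transactions:
        counts.update(set(t))
    return counts

def aggregate(partials, min_support):
    """Global step: merge the partial results and keep globally frequent items."""
    total = Counter()
    for p in partials:
        total.update(p)
    return {item for item, c in total.items() if c >= min_support}

# Two hypothetical sites mine their own data; only the counts travel.
site_a = [["milk", "bread"], ["milk"]]
site_b = [["bread"], ["milk", "bread"]]
frequent = aggregate([mine_local(site_a), mine_local(site_b)], min_support=3)
```

Only the compact partial results cross the network, which is the point of DDM under limited bandwidth.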
An Overview of Security in Distributed Database Management System (IJSRD)
Databases are extremely common in businesses. The typical database is held on a central server, and people log in to the system to query or update it. However, there is another type of database, the distributed database, that offers advantages for some kinds of organization. This paper examines the underlying features of the distributed database system and its security. Understanding the tasks of a distributed database management system leads to a successful design. Developing a successful distributed database system requires addressing the security issues that may arise and possibly compromise the access control and integrity of the system. This paper proposes solutions for security aspects such as multilevel access control, confidentiality, reliability, integrity, and recovery as they pertain to a distributed database system.
DISTRIBUTED AND BIG DATA STORAGE MANAGEMENT IN GRID COMPUTING (ijgca)
Big data storage management is one of the most challenging issues in Grid computing environments, since data-intensive applications frequently involve a high degree of data access locality, and Grid applications typically deal with large amounts of data. Traditional high-performance computing approaches rely on dedicated servers for data storage and replication. In this paper we present a new mechanism for distributed big data storage and resource discovery services, proposing an architecture named Dynamic and Scalable Storage Management (DSSM) for grid environments. It allows grid computing to share not only computational cycles but also storage space. The storage can be transparently accessed from any grid machine, allowing easy data sharing among grid users and applications. The concept of virtual ids, which allows the creation of virtual spaces, is introduced and used. DSSM divides all Grid Oriented Storage devices (nodes) into multiple geographically distributed domains to facilitate locality and simplify intra-domain storage management. Grid-service-based storage resources are adopted so that simple modular services can be stacked piece by piece as demand grows. To this end, the paper proceeds along four axes: describing the DSSM architecture and algorithms; turning storage resources and resource discovery into Grid services; evaluating a prototype system for dynamics, scalability, and bandwidth; and discussing the results. Bottom- and upper-level algorithms for standardized dynamic and scalable storage management, along with higher bandwidths, have been designed.
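The domain partitioning and virtual-id ideas in the DSSM abstract can be sketched as below. This is an assumption-laden illustration, not the paper's actual scheme: the domain names, the SHA-1-based virtual id, and the hash-modulo placement rule are all invented for the sketch.

```python
import hashlib

DOMAINS = ["asia", "europe", "america"]  # hypothetical geographic domains

def virtual_id(node_name):
    """Derive a stable virtual id for a storage node (assumed: SHA-1 of its name)."""
    return hashlib.sha1(node_name.encode()).hexdigest()

def assign_domain(node_name, domains=DOMAINS):
    """Place a node in one domain by hashing, so lookups stay intra-domain."""
    return domains[int(virtual_id(node_name), 16) % len(domains)]

# Build a per-domain registry of Grid Oriented Storage nodes.
registry = {}
for node in ["gos-node-1", "gos-node-2", "gos-node-3"]:
    registry.setdefault(assign_domain(node), []).append(node)
```

A lookup then only needs to consult the registry of the relevant domain rather than every node in the grid.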
Dynamic Resource Provisioning with Authentication in Distributed Database (Editor IJCATR)
Data centers are among the largest consumers of energy in shared power infrastructures, and public cloud workloads carry different priorities and performance requirements across applications [4]. Cloud data centers are capable of sensing opportunities to present different programs. The proposed construction applies a security level to a distributed cloud system so that privacy leakage rarely occurs; handling its persistent characteristics yields a substantial increase in information that can be used to augment profit, reduce overhead, or both. Data mining is the process of analyzing data from different perspectives and summarizing it into useful information. Three empirical algorithms are proposed to estimate the assignment ratios; they are analyzed theoretically and compared using real Internet latency data to test the methods.
BIG DATA NETWORKING: REQUIREMENTS, ARCHITECTURE AND ISSUES (ijwmn)
A flexible, efficient, and secure networking architecture is required in order to process big data. However, existing network architectures are mostly unable to handle big data: as big data pushes network resources to their limits, it causes network congestion, poor performance, and detrimental user experiences. This paper presents the current state-of-the-art research challenges and possible solutions in big data networking. More specifically, we present the state of networking issues for big data related to capacity, management, and data processing. We also present the architectures of the MapReduce and Hadoop paradigm, with their research challenges, as well as the fabric networks and software-defined networks (SDN) used to handle today's rapidly growing digital world, and compare and contrast them to identify relevant problems and solutions.
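The MapReduce paradigm mentioned above can be illustrated with the classic word-count example. This is a single-process sketch of the programming model only, not Hadoop's distributed runtime; the input documents are invented.

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit (word, 1) pairs from each input split."""
    for doc in documents:
        for word in doc.split():
            yield word, 1

def reduce_phase(pairs):
    """Shuffle + reduce: group pairs by key and sum the counts."""
    groups = defaultdict(int)
    for word, count in pairs:
        groups[word] += count
    return dict(groups)

counts = reduce_phase(map_phase(["big data big network", "data"]))
```

In a real cluster the map tasks run on the nodes holding each split and only the intermediate pairs cross the network, which is exactly where the congestion issues discussed in the abstract arise.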
A study of a secure multi-authentication based data classification model in cloud ... (IJAAS Team)
Abstract: Cloud computing is the most popular term among enterprises and in the news. The concept has come true because of fast internet bandwidth and advanced cooperation technology. Resources on the cloud can be accessed through the internet without self-built infrastructure. Cloud computing effectively manages security in cloud applications. Data classification is a machine learning technique used to predict the class of unclassified data. Data mining uses different tools to discover unknown, valid patterns and relationships in a dataset; these tools include mathematical algorithms, statistical models, and Machine Learning (ML) algorithms. In this paper the authors use an improved Bayesian technique to classify the data and encrypt the sensitive data using hybrid steganography. The encrypted and non-encrypted sensitive data is sent to the cloud environment, and the parameters are evaluated with different encryption algorithms.
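The Bayesian classification step can be sketched as a plain naive Bayes classifier with Laplace smoothing; the paper's "improved" variant is not specified, so this shows only the standard technique, and the toy feature/label data is invented.

```python
from collections import Counter, defaultdict
from math import log

def train(samples):
    """samples: list of (feature_tuple, label). Collect counts for naive Bayes."""
    label_counts = Counter()
    feature_counts = defaultdict(Counter)
    for features, label in samples:
        label_counts[label] += 1
        feature_counts[label].update(features)
    return label_counts, feature_counts

def classify(features, label_counts, feature_counts):
    """Pick the label maximising log P(label) + sum log P(feature|label)."""
    total = sum(label_counts.values())
    vocab = {f for c in feature_counts.values() for f in c}
    best, best_score = None, float("-inf")
    for lbl, n in label_counts.items():
        score = log(n / total)
        denom = sum(feature_counts[lbl].values()) + len(vocab)  # Laplace smoothing
        for f in features:
            score += log((feature_counts[lbl][f] + 1) / denom)
        if score > best_score:
            best, best_score = lbl, score
    return best

data = [(("cloud", "secure"), "sensitive"), (("cloud", "public"), "normal"),
        (("secure", "key"), "sensitive"), (("public", "open"), "normal")]
lc, fc = train(data)
label = classify(("secure",), lc, fc)
```

Records classified "sensitive" would then be routed to the encryption/steganography path before upload.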
Review of access control models for cloud computing (csandit)
The relationship between users and resources is dynamic in the cloud, and service providers and users are typically not in the same security domain. Identity-based security (e.g., discretionary or mandatory access control models) cannot be used in an open cloud computing environment, where resource nodes may not be familiar with, or may not even know, each other. Users are normally identified by their attributes or characteristics rather than by predefined identities, so there is often a need for a dynamic access control mechanism to achieve cross-domain authentication. In this paper, we focus on the following three broad categories of access control models for cloud computing: (1) role-based models, (2) attribute-based encryption models, and (3) multi-tenancy models. We review the existing literature on each of these access control models and their variants (technical approaches, characteristics, applicability, pros and cons), and identify future research directions for developing access control models for cloud computing environments.
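The attribute-based idea, granting access on what a user *is* rather than *who* they are, can be sketched as a simple policy check. This is a minimal illustration of attribute matching only, not attribute-based encryption; the policy rules and attribute names are invented.

```python
# Hypothetical policy: each rule maps a required attribute set to an action.
POLICY = [
    ({"role:auditor", "clearance:high"}, "read-logs"),
    ({"role:tenant-admin"}, "manage-tenant"),
]

def authorize(user_attributes, action, policy=POLICY):
    """Grant the action if the user's attributes cover some rule for it."""
    return any(required <= user_attributes
               for required, allowed in policy if allowed == action)

ok = authorize({"role:auditor", "clearance:high", "dept:it"}, "read-logs")
denied = authorize({"role:auditor"}, "read-logs")
```

Because the check depends only on attributes presented at request time, it works across security domains where identities are not pre-registered.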
BIG DATA IN CLOUD COMPUTING REVIEW AND OPPORTUNITIES (ijcsit)
Big Data is used in decision-making processes to gain useful insights hidden in the data for business and engineering. At the same time it presents processing challenges; cloud computing has helped advance big data by providing computational, networking, and storage capacity. This paper presents a review of the opportunities and challenges of transforming big data using cloud computing resources.
Data Partitioning Technique In Cloud: A Survey On Limitation And Benefits (IJERA Editor)
In recent years, growth in the popularity of cloud services has led enterprises to an increased capability to handle, store, and retrieve critical data. This technology accesses a shared pool of configurable computing resources: servers, storage, and applications. Cloud computing is a next-generation IT enterprise architecture which moves application software and databases to large data hubs. Data security and storage are essential functionality of cloud services, allowing data to be stored on the cloud server efficiently and without worry. Cloud services offer on-demand service, broad web access, measured service, single-click ease of use, pay-per-use pricing, and location independence; all of these features pose many security challenges. Data partitioning techniques are used in the literature for privacy preservation and data security, using a third-party auditor (TPA). The objective of the current work is to review the partitioning techniques available in the literature and analyze them; through this work the authors compare and identify the limitations and benefits of the available, widely used partitioning techniques.
AUTHENTICATION SCHEME FOR DATABASE AS A SERVICE (DBaaS) (ijccsa)
IT companies have shifted their resources to the cloud at a rapidly increasing rate. As part of this trend, companies are migrating business-critical and sensitive data stored in databases to cloud-hosted, Database-as-a-Service (DBaaS) solutions. Of all that has been written about cloud computing, precious little attention has been paid to authentication in the cloud. In this paper we design a new, effective authentication scheme for cloud Database as a Service (DBaaS); a user can change his or her password whenever demanded. Furthermore, security analysis confirms the feasibility and efficiency of the proposed model for DBaaS. The proposed solution is based mainly on an improved Needham-Schroeder protocol to prove a user's identity and determine whether that user is authorized. The results show that the scheme is very strong and difficult to break.
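The core idea of proving identity without sending the secret can be sketched as a nonce-based challenge-response. This is a deliberately simplified HMAC sketch, not the paper's improved Needham-Schroeder protocol (which involves a trusted third party and session-key exchange); the key and message names are invented.

```python
import hashlib
import hmac
import os

def make_challenge():
    """Server side: issue a fresh random nonce so replays are useless."""
    return os.urandom(16)

def respond(shared_key, nonce):
    """Client side: prove knowledge of the shared key without sending it."""
    return hmac.new(shared_key, nonce, hashlib.sha256).digest()

def verify(shared_key, nonce, response):
    """Server side: recompute the tag and compare in constant time."""
    expected = hmac.new(shared_key, nonce, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)

key = b"shared-secret"   # in a real DBaaS this would be provisioned per user
nonce = make_challenge()
accepted = verify(key, nonce, respond(key, nonce))
```

A password change then amounts to re-provisioning `key` on both sides; in-flight sessions keyed on old nonces simply fail verification.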
Cooperative caching for efficient data access in disruption tolerant networks (LeMeniz Infotech)
Disruption-tolerant networks (DTNs) are characterized by low node density, unpredictable node mobility, and a lack of global network information. Most current research efforts in DTNs focus on data forwarding; only limited work has been done on providing efficient data access to mobile users. A novel approach is proposed to support cooperative caching in DTNs, which enables the sharing and coordination of cached data among multiple nodes and reduces data access delay.
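The cooperative lookup can be sketched as below: serve from the local cache when possible, otherwise ask currently reachable neighbors and cache the answer on the way back. This is a minimal illustration of the caching idea only, not the paper's DTN protocol; node names and data are invented.

```python
class Node:
    """A DTN node with a small local cache of data items."""

    def __init__(self, name):
        self.name = name
        self.cache = {}

    def get(self, key, neighbors=()):
        # Serve locally when possible; otherwise ask cooperating neighbors.
        if key in self.cache:
            return self.cache[key], self.name
        for n in neighbors:
            if key in n.cache:
                self.cache[key] = n.cache[key]   # cache on the way back
                return n.cache[key], n.name
        return None, None

a, b = Node("a"), Node("b")
b.cache["weather"] = "sunny"
value, served_by = a.get("weather", neighbors=[b])
```

After the first fetch the item sits in `a`'s own cache, so later requests succeed even when `b` is unreachable, which is what makes caching attractive under intermittent DTN connectivity.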
The advent of Big Data has seen the emergence of new processing and storage challenges, which are often solved by distributed processing. Distributed systems are inherently dynamic and unstable, so it is realistic to expect that some resources will fail during use. Load balancing and task scheduling are an important step in determining the performance of parallel applications; hence the need to design load balancing algorithms adapted to grid computing. In this paper, we propose a dynamic, hierarchical load balancing strategy at two levels: intra-scheduler load balancing, to avoid use of the large-scale communication network, and inter-scheduler load balancing, for load regulation of the whole system. The strategy improves the average response time of CLOAK-Reduce application tasks with minimal communication. We focus first on three performance indicators, namely response time, process latency, and running time of MapReduce tasks.
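The two-level idea can be sketched as below: balance cheaply inside one scheduler first, and only migrate across schedulers (over the expensive wide-area network) when a whole scheduler remains overloaded. This is an invented toy model, not the CLOAK-Reduce strategy itself; schedulers are plain dicts of task queues.

```python
def intra_balance(scheduler):
    """Level 1: move tasks from the busiest local queue to the idlest one."""
    queues = scheduler["queues"]
    busiest = max(queues, key=len)
    idlest = min(queues, key=len)
    while len(busiest) - len(idlest) > 1:
        idlest.append(busiest.pop())

def inter_balance(schedulers):
    """Level 2: only if load stays skewed, shift one task across schedulers."""
    loads = [sum(len(q) for q in s["queues"]) for s in schedulers]
    src = schedulers[loads.index(max(loads))]
    dst = schedulers[loads.index(min(loads))]
    if max(loads) - min(loads) > 1:
        task = max(src["queues"], key=len).pop()
        min(dst["queues"], key=len).append(task)

s1 = {"queues": [[1, 2, 3, 4], []]}   # overloaded scheduler
s2 = {"queues": [[], []]}             # idle scheduler
intra_balance(s1)          # local queues evened out, no network traffic
inter_balance([s1, s2])    # one task migrates across the wide-area link
```

Keeping most corrections at level 1 is what minimizes communication over the large-scale network.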
CYBER INFRASTRUCTURE AS A SERVICE TO EMPOWER MULTIDISCIPLINARY, DATA-DRIVEN S... (ijcsit)
In supporting its large-scale, multidisciplinary scientific research efforts across all the university campuses and by research personnel spread over literally every corner of the state, the state of Nevada needs to build and leverage its own cyberinfrastructure. Following the well-established as-a-service model, this state-wide cyberinfrastructure, which consists of data acquisition, data storage, advanced instruments, visualization, computing and information processing systems, and people, all seamlessly linked through a high-speed network, is designed and operated to deliver the benefits of Cyberinfrastructure-as-a-Service (CaaS). There are three major service groups in this CaaS, namely (i) supporting infrastructural services that comprise sensors, computing/storage/networking hardware, operating systems, management tools, virtualization, and the message passing interface (MPI); (ii) data transmission and storage services that provide connectivity to various big data sources, as well as cached and stored datasets in a distributed storage backend; and (iii) processing and visualization services that give users access to the rich processing and visualization tools and packages essential to various scientific research workflows. Built on commodity hardware and open-source software packages, the Southern Nevada Research Cloud (SNRC) and a data repository in a separate location constitute a low-cost solution for delivering all these services around CaaS. The service-oriented architecture and implementation of the SNRC are geared to encapsulate as much of the detail of big data processing and cloud computing as possible away from end users; scientists need only learn and access an interactive web-based interface to conduct their collaborative, multidisciplinary, data-intensive research. The capability and easy-to-use features of the SNRC are demonstrated through a use case that derives a solar radiation model from a large data set by regression analysis.
The huge volume of text documents available on the internet has made it difficult to find valuable information for specific users; efficient applications to extract knowledge of interest from textual documents are therefore vitally important. This paper addresses the problem of responding to user queries by fetching the most relevant documents from a clustered set of documents. For this purpose, a cluster-based information retrieval framework is proposed in order to design and develop a system for analysing and extracting useful patterns from text documents. In this approach, a preprocessing step is first performed to find frequent and high-utility patterns in the data set, and a Vector Space Model (VSM) is then used to represent the dataset. The system was implemented in two main phases: in phase 1, a clustering analysis process groups documents into several clusters, while in phase 2 an information retrieval process ranks the clusters against user queries in order to retrieve the relevant documents from the specific clusters deemed relevant to each query. The results are then evaluated according to the Recall and Precision (P@5, P@10) of the retrieved results: P@5 was 0.660 and P@10 was 0.655.
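The VSM ranking and the P@k evaluation used above can be sketched as follows. This uses raw term frequencies and cosine similarity as the simplest VSM instance; the documents, query, and relevance judgements are invented, and the paper's actual weighting scheme may differ.

```python
import math
from collections import Counter

def vectorize(text):
    """Represent a document as a raw term-frequency vector (simplest VSM form)."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0

def rank(query, docs):
    """Return document indices sorted by similarity to the query."""
    q = vectorize(query)
    return sorted(range(len(docs)),
                  key=lambda i: cosine(q, vectorize(docs[i])), reverse=True)

docs = ["text mining of documents", "cooking pasta recipes", "mining frequent patterns"]
order = rank("text mining", docs)

relevant = {0, 2}   # assumed relevance judgements for this toy query
p_at_2 = len(set(order[:2]) & relevant) / 2   # precision at rank 2
```

P@5 and P@10 in the paper are this same ratio computed at cut-offs 5 and 10 over the retrieved list.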
Risk Prediction for Production of an Enterprise (Editor IJCATR)
Despite all preventive measures, risks remain quite possible in any project development as well as in enterprise management, and no standard mechanism or methodology is available to assess the risks in project or production management; with precautionary steps, a manager can only avoid risks as best he can. To address this issue, this paper presents a probabilistic risk assessment model for the production of an enterprise. A Multi-Entity Bayesian Network (MEBN) is used to represent the requirements of production management and to assess the risks inherent in it; MEBN combines the expressivity of first-order logic with the probabilistic features of a Bayesian network. A Bayesian network can represent probabilistic uncertainty and reason over a probabilistic knowledge base, which is used here to represent the probable causes behind each risk. The proposed probabilistic model is discussed with the help of a case study that predicts risks inherent in the production of an enterprise, which depends on various measures such as labour availability, power backup, and transport availability.
Data mining is an analytic process designed to explore data in search of consistent patterns and systematic relationships between variables, and then to validate the results by applying the patterns found to a new subset of data. It is often described as the process of discovering patterns, correlations, trends, or relationships by searching through large amounts of data stored in repositories, databases, and data warehouses. Diabetes, often referred to by doctors as diabetes mellitus, describes a group of metabolic diseases in which the person has high blood glucose (blood sugar), either because insulin production is insufficient, because the body's cells do not respond properly to insulin, or both [3]. This project helps identify whether a person has diabetes; if the person is predicted diabetic [4], the project suggests measures for maintaining normal health, and if not, it predicts the risk of becoming diabetic. In this project a classification algorithm was used to classify the Pima Indian diabetes dataset, and results were obtained using an Android application.
Online Social Network (OSN) sites act as a medium for users to spread their views, activities, and thoughts to their circles of friends. The content of these networks is spread over the web, so it is hard to monitor by human decision alone. Currently, OSNs do not provide any mechanism to ensure privacy of the data associated with each user, and as a result many users lack ownership control. In this paper, we propose AC2P (Activity Control-Access Control Protocol) for controlling information on the web. In addition, a tag-refinement strategy detects illegal tagging of images and sends notification about the spread of a particular image within different communities and groups. These techniques reduce the risk of information flow and avoid unwanted tagging of images.
Security for Effective Data Storage in Multi Clouds (Editor IJCATR)
Cloud computing is a technology that uses the internet and central remote servers to maintain data and applications. It allows consumers and businesses to use applications without installation and to access their personal files from any computer with internet access, enabling much more efficient computing by centralizing data storage, processing, and bandwidth. The use of cloud computing has increased rapidly in many organizations, as it provides many benefits in terms of low cost and accessibility of data. Ensuring security is a major factor in the cloud computing environment, as users often store sensitive information with cloud storage providers, but these providers may be untrusted. Dealing with "single cloud" providers is predicted to become less popular with customers due to the risks of service availability failure and the possibility of malicious insiders in the single cloud. A movement towards "multi-clouds", in other words "inter-clouds" or "cloud-of-clouds", has emerged recently. This paper surveys recent research related to single- and multi-cloud security and addresses possible solutions. It finds that research into the use of multi-cloud providers to maintain security has received less attention from the research community than the use of single clouds. This work aims to promote the use of multi-clouds due to their ability to reduce the security risks that affect cloud computing users.
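One concrete way multi-clouds reduce the malicious-insider risk is secret sharing: split the data so that no single provider's share reveals anything on its own. The XOR scheme below is a standard textbook construction shown as an illustration; the survey covers a range of techniques, and the record contents here are invented.

```python
import os

def split(secret, clouds):
    """XOR secret sharing: each cloud stores one share; all are needed to rebuild."""
    shares = [os.urandom(len(secret)) for _ in range(clouds - 1)]
    last = secret
    for s in shares:
        last = bytes(a ^ b for a, b in zip(last, s))
    return shares + [last]

def combine(shares):
    """XOR every share together to recover the original secret."""
    out = shares[0]
    for s in shares[1:]:
        out = bytes(a ^ b for a, b in zip(out, s))
    return out

shares = split(b"db-record", 3)   # one share per cloud provider
restored = combine(shares)
```

A compromised insider at any one provider sees only uniformly random bytes; availability, however, now requires all providers, which is why threshold schemes are often preferred in practice.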
Image enhancement plays an important role in vision applications, and a lot of work has recently been performed in the field; many techniques have been proposed for enhancing digital images. This paper presents a comparative analysis of various image enhancement techniques and shows that fuzzy logic and histogram based techniques produce quite effective results compared with the other available techniques. The paper ends with suitable future directions for further enhancing the fuzzy-based image enhancement technique. In the proposed technique, images other than low-contrast images are also enhanced by balancing the stretching parameter (K) according to the colour contrast, and the technique is designed to restore edges degraded by the contrast enhancement as well.
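The stretching idea can be sketched as a plain min-max contrast stretch with a gain factor standing in for the paper's parameter K; the paper's actual balancing rule for K is not specified here, so this is an assumption, and the pixel values are invented.

```python
def stretch(pixels, k=1.0):
    """Min-max contrast stretch over 8-bit pixels; k (assumed gain) scales the
    output range, mimicking a tunable stretching parameter."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return list(pixels)           # flat image: nothing to stretch
    out = []
    for p in pixels:
        v = (p - lo) / (hi - lo) * 255 * k
        out.append(int(max(0, min(255, round(v)))))   # clamp to [0, 255]
    return out

enhanced = stretch([50, 60, 70, 80], k=1.0)
```

With k = 1.0 the narrow input range [50, 80] is mapped onto the full [0, 255] range; a smaller k would stretch less aggressively, which is the kind of trade-off a contrast-dependent K balances.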
In this paper, we introduce the notion of intuitionistic fuzzy semipre generalized connected space, intuitionistic fuzzy semipre generalized super connected space and intuitionistic fuzzy semipre generalized extremally disconnected spaces. We investigate some of their properties.
BIG DATA NETWORKING: REQUIREMENTS, ARCHITECTURE AND ISSUESijwmn
A flexible, efficient and secure networking architecture is required in order to process big data. However,
existing network architectures are mostly unable to handle big data. As big data pushes network resources
to the limits it results in network congestion, poor performance, and detrimental user experiences. This
paper presents the current state-of-the-art research challenges and possible solutions on big data
networking theory. More specifically, we present the state of networking issues of big data related to
capacity, management and data processing. We also present the architectures of MapReduce and Hadoop
paradigm with research challenges, fabric networks and software defined networks (SDN) that are used to
handle today’s idly growing digital world and compare and contrast them to identify relevant problems and
solutions.
BIG DATA NETWORKING: REQUIREMENTS, ARCHITECTURE AND ISSUESijwmn
A flexible, efficient and secure networking architecture is required in order to process big data. However, existing network architectures are mostly unable to handle big data. As big data pushes network resources
to the limits it results in network congestion, poor performance, and detrimental user experiences. This paper presents the current state-of-the-art research challenges and possible solutions on big data networking theory. More specifically, we present the state of networking issues of big data related to
capacity, management and data processing. We also present the architectures of MapReduce and Hadoop paradigm with research challenges, fabric networks and software defined networks (SDN) that are used to handle today’s idly growing digital world and compare and contrast them to identify relevant problems and solutions.
A study secure multi authentication based data classification model in cloud ...IJAAS Team
Abstract: Cloud computing is the most popular term among enterprises and news. The concepts come true because of fast internet bandwidth and advanced cooperation technology. Resources on the cloud can be accessed through internet without self built infrastructure. Cloud computing is effectively managing the security in the cloud applications. Data classification is a machine learning technique used to predict the class of the unclassified data. Data mining uses different tools to know the unknown, valid patterns and relationshipsin the dataset. These tools are mathematical algorithms, statistical models and Machine Learning (ML) algorithms. In this paper author uses improved Bayesian technique to classify the data and encrypt the sensitive data using hybrid stagnography. The encrypted and non encrypted sensitive data is sent to cloud environment and evaluate the parameters with different encryption algorithms.
Review of access control models for cloud computingcsandit
The relationship between users and resources is dynamic in the cloud, and service providers
and users are typically not in the same security domain. Identity-based security (e.g.,
discretionary or mandatory access control models) cannot be used in an open cloud computing
environment, where each resource node may not be familiar, or even do not know each other.
Users are normally identified by their attributes or characteristics and not by predefined
identities. There is often a need for a dynamic access control mechanism to achieve crossdomain
authentication. In this paper, we will focus on the following three broad categories of
access control models for cloud computing: (1) Role-based models; (2) Attribute-based
encryption models and (3) Multi-tenancy models. We will review the existing literature on each
of the above access control models and their variants (technical approaches, characteristics,
applicability, pros and cons), and identify future research directions for developing access
control models for cloud computing environments.
BIG DATA IN CLOUD COMPUTING REVIEW AND OPPORTUNITIESijcsit
Big Data is used in decision making process to gain useful insights hidden in the data for business and engineering. At the same time it presents challenges in processing, cloud computing has helped in advancement of big data by providing computational, networking and storage capacity. This paper presents the review, opportunities and challenges of transforming big data using cloud computing resources.
Data Partitioning Technique In Cloud: A Survey On Limitation And BenefitsIJERA Editor
In recent years,increment in the growth and popularity of cloud services has lead the enterprises to an increase in the capability to handle, store and retrieve critical data. This technology access a shared group of configurable computing resources, which are- servers,storage and applications. Cloud computing is a succeeding generation architecture of IT enterprise, which convert the application software and databaseto large data hubs.Data security and storage of data is an essential functionality of cloud services.It allows data storage in the cloud server efficiently without any worry. Cloud services includes request service, wide web access, measured services, just single click away ,easy usage, just pay for the services you use and location independent.All these features poses many security challenges.The data partitioning techniques are used in literature, for privacy conserving and security of data, using third party auditor (TPA). Objective of the current workis to review all available partitioning technique in literature and analyze them. Through this work authors will compare and identify the limitations and benefits of the available and widely used partitioning techniques.
AUTHENTICATION SCHEME FOR DATABASE AS A SERVICE(DBAAS) ijccsa
IT Companies have shifted their resources to the cloud at rapidly increasing rate. As part of this trend companies are migrating business critical and sensitive data stored in database to cloud-hosted and Database as a Service (DBaaS) solutions.Of all that has been written about cloud computing, precious little attention has been paid to authentication in the cloud. In this paper we have designed a new effective authentication scheme for Cloud Database as a Service (DBaaS). A user can change his/her password, whenever demanded. Furthermore, security analysis realizes the feasibility of the proposed model for DBaaS and achieves efficiency. We also proposed an efficient authentication scheme to solve the authentication problem in cloud. The proposed solution which we have provided is based mainly on improved Needham-Schroeder’s protocol to prove the users’ identity to determine if this user is authorized or not. The results showed that this scheme is very strong and difficult to break it.
Cooperative Caching for Efficient Data Access in Disruption Tolerant Networks (LeMeniz Infotech)
Disruption tolerant networks (DTNs) are characterized by low node density, unpredictable node mobility, and lack of global network information. Most current research efforts in DTNs focus on data forwarding, but only limited work has been done on providing efficient data access to mobile users. A novel approach is proposed to support cooperative caching in DTNs, enabling the sharing and coordination of cached data among multiple nodes and reducing data access delay.
The advent of Big Data has seen the emergence of new processing and storage challenges, often solved by distributed processing. Distributed systems are inherently dynamic and unstable, so it is realistic to expect that some resources will fail during use. Load balancing and task scheduling are important steps in determining the performance of parallel applications, hence the need to design load balancing algorithms adapted to grid computing. In this paper, we propose a dynamic and hierarchical load balancing strategy at two levels: intra-scheduler load balancing, in order to avoid the use of the large-scale communication network, and inter-scheduler load balancing, for load regulation of the whole system. The strategy improves the average response time of CLOAK-Reduce application tasks with minimal communication. We focus on three performance indicators, namely response time, process latency and running time of MapReduce tasks.
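The two-level strategy described above can be sketched in a few lines. The cluster layout, the migration threshold and the load model below are illustrative assumptions, not the CLOAK-Reduce implementation:

```python
def assign_task(clusters, task_cost, migrate_threshold=2.0):
    """Two-level load balancing sketch: pick the least-loaded worker inside
    the local cluster (intra-scheduler); if the local cluster is far more
    loaded than the lightest cluster, send the task there instead
    (inter-scheduler). clusters is a list of {worker: load} dicts, with
    clusters[0] being the cluster that received the task."""
    def cluster_load(c):
        return sum(c.values()) / len(c)

    local = clusters[0]
    lightest = min(clusters, key=cluster_load)
    # Inter-scheduler step: only migrate when the imbalance is significant,
    # to keep large-scale communication minimal.
    if cluster_load(local) > migrate_threshold * cluster_load(lightest):
        target = lightest
    else:
        target = local
    worker = min(target, key=target.get)  # least-loaded worker in target
    target[worker] += task_cost
    return worker
```

With a local cluster averaging 11.0 load units and a remote one averaging 1.5, the task migrates to the remote cluster's idlest worker.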
Cyber Infrastructure as a Service to Empower Multidisciplinary, Data-Driven S... (IJCSIT)
In supporting its large-scale, multidisciplinary scientific research efforts across all the university campuses and by research personnel spread over literally every corner of the state, the state of Nevada needs to build and leverage its own cyberinfrastructure. Following the well-established as-a-service model, this state-wide cyberinfrastructure, consisting of data acquisition, data storage, advanced instruments, visualization, computing and information processing systems, and people, all seamlessly linked through a high-speed network, is designed and operated to deliver the benefits of Cyberinfrastructure-as-a-Service (CaaS). There are three major service groups in this CaaS: (i) supporting infrastructural services that comprise sensors, computing/storage/networking hardware, operating system, management tools, virtualization and message passing interface (MPI); (ii) data transmission and storage services that provide connectivity to various big data sources, as well as cached and stored datasets in a distributed storage backend; and (iii) processing and visualization services that give users access to the rich processing and visualization tools and packages essential to various scientific research workflows. Built on commodity hardware and open-source software packages, the Southern Nevada Research Cloud (SNRC) and a data repository in a separate location constitute a low-cost solution for delivering all these services around CaaS. The service-oriented architecture and implementation of the SNRC are geared to encapsulate as much detail of big data processing and cloud computing as possible away from end users; scientists need only learn and access an interactive web-based interface to conduct their collaborative, multidisciplinary, data-intensive research. The capability and ease of use of the SNRC are demonstrated through a use case that derives a solar radiation model from a large data set by regression analysis.
The huge volume of text documents available on the internet has made it difficult for specific users to find valuable information. Efficient applications to extract knowledge of interest from textual documents are therefore vitally important. This paper addresses the problem of responding to user queries by fetching the most relevant documents from a clustered set of documents. For this purpose, a cluster-based information retrieval framework is proposed, in order to design and develop a system for analysing and extracting useful patterns from text documents. In this approach, a pre-processing step is first performed to find frequent and high-utility patterns in the data set, and a Vector Space Model (VSM) is then used to represent the dataset. The system was implemented in two main phases: in phase 1, a clustering analysis process groups documents into several clusters; in phase 2, an information retrieval process ranks clusters according to the user query in order to retrieve the relevant documents from the specific clusters deemed relevant. The results are evaluated using Recall and Precision (P@5, P@10) of the retrieved results: P@5 was 0.660 and P@10 was 0.655.
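The Precision-at-k figures reported above are straightforward to compute. A minimal sketch, using a hypothetical ranking and hypothetical relevance judgments for one query:

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k

# Hypothetical ranking and relevance judgments (illustrative only).
ranking = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2", "d4"}
p5 = precision_at_k(ranking, relevant, 5)  # 2 of the top 5 are relevant
```

Averaging this value over all test queries gives the P@5 and P@10 figures of the kind reported in the paper.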
Risk Prediction for Production of an Enterprise (IJCATR)
Despite all preventive measures, there remains considerable possibility of risk in any project development as well as in enterprise management. No standard mechanism or methodology is available to assess the risks in project or production management; using precautionary steps, a manager can only avoid risks as far as possible. To address this issue, this paper presents a probabilistic risk assessment model for the production of an enterprise. A Multi-Entity Bayesian Network (MEBN) is used to represent the requirements of production management and to assess the risks inherent in it, where MEBN combines the expressivity of first-order logic with the probabilistic features of Bayesian networks. A Bayesian network can represent probabilistic uncertainty and reason over a probabilistic knowledge base, which is used here to represent the probable risks behind each cause of a risk. The proposed probabilistic model is discussed with the help of a case study, which predicts risks inherent in the production of an enterprise depending on various measures such as labour availability, power backup and transport availability.
Data mining is an analytic process designed to explore data in search of consistent patterns and systematic relationships between variables, and then to validate the results by applying the patterns found to a new subset of data. It is often described as the process of discovering patterns, correlations, trends or relationships by searching through large amounts of data stored in repositories, databases and data warehouses. Diabetes, often referred to by doctors as diabetes mellitus, describes a group of metabolic diseases in which the person has high blood glucose (blood sugar) [3], either because insulin production is insufficient, or because the body's cells do not respond properly to insulin, or both. This project helps identify whether a person has diabetes: if the person is predicted diabetic [4], the project suggests measures for maintaining normal health, and if not, it predicts the risk of becoming diabetic. A classification algorithm was used to classify the Pima Indian diabetes dataset, and results were obtained using an Android application.
Online Social Network (OSN) sites act as a medium for users to spread their views, activities and thoughts to their circle of friends. The contents of these networks are spread over the web, so it is hard to evaluate them by human judgment. Currently, these sites do not provide any mechanism to ensure privacy of the data associated with each user, and because of this many users lack ownership control. In this paper, we propose AC2P (Activity Control-Access Control Protocol) for information control on the web. In addition, a tag-refinement strategy detects illegal tagging of images and sends notifications about a particular image spread within different communities/groups. These techniques reduce the risk of information flow and avoid unwanted tagging of images.
Security for Effective Data Storage in Multi-Clouds (IJCATR)
Cloud computing is a technology that uses the internet and central remote servers to maintain data and applications. It allows consumers and businesses to use applications without installation and to access their personal files from any computer with internet access. This technology allows much more efficient computing by centralizing data storage, processing and bandwidth. The use of cloud computing has increased rapidly in many organizations, as it provides many benefits in terms of low cost and accessibility of data. Ensuring security is a major factor in the cloud computing environment, as users often store sensitive information with cloud storage providers, but these providers may be untrusted. Dealing with “single cloud” providers is predicted to become less popular with customers due to the risks of service availability failure and the possibility of malicious insiders in the single cloud. A movement towards “multi-clouds”, in other words “inter-clouds” or “cloud-of-clouds”, has emerged recently. This paper surveys recent research related to single-cloud and multi-cloud security and addresses possible solutions. It finds that the use of multi-cloud providers to maintain security has received less attention from the research community than the use of single clouds. This work aims to promote the use of multi-clouds due to their ability to reduce the security risks that affect cloud computing users.
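To illustrate why spreading data across providers can reduce exposure to a single untrusted cloud, here is a minimal XOR-based 2-of-2 split. This is a generic textbook construction, not the specific scheme of any paper surveyed above:

```python
import os

def split_two_clouds(data: bytes):
    """Split data into two shares, one per cloud provider. Each share alone
    is indistinguishable from uniformly random bytes, so neither provider
    learns anything about the plaintext by itself."""
    pad = os.urandom(len(data))                      # one-time random pad
    share_a = pad                                    # stored on cloud A
    share_b = bytes(x ^ y for x, y in zip(data, pad))  # stored on cloud B
    return share_a, share_b

def recombine(share_a: bytes, share_b: bytes) -> bytes:
    """XOR the two shares back together to recover the original data."""
    return bytes(x ^ y for x, y in zip(share_a, share_b))
```

The obvious trade-off is availability: both providers must respond to reconstruct the data, which is why practical multi-cloud schemes often use threshold sharing instead.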
Image enhancement plays an important role in vision applications, and a lot of work has recently been performed in this field. Many techniques have been proposed for enhancing digital images. This paper presents a comparative analysis of various image enhancement techniques and shows that fuzzy-logic and histogram-based techniques yield quite effective results compared with the other available techniques. The paper ends with suitable future directions for further improving the fuzzy-based image enhancement technique. In the proposed technique, images other than low-contrast images can also be enhanced, by balancing the stretching parameter (K) according to the colour contrast. The proposed technique is also designed to restore edges degraded by contrast enhancement.
In this paper, we introduce the notions of intuitionistic fuzzy semipre generalized connected spaces, intuitionistic fuzzy semipre generalized super connected spaces and intuitionistic fuzzy semipre generalized extremally disconnected spaces, and investigate some of their properties.
Random-Valued Impulse Noise Elimination Using Neural Filter (IJCATR)
A neural filtering technique is proposed in this paper for restoring images extremely corrupted by random-valued impulse noise. The proposed intelligent filter operates in two stages. In the first stage, the corrupted image is filtered with an asymmetric trimmed median filter. In the second stage, the trimmed-median-filtered output is suitably combined with a feed-forward neural network, whose internal parameters are adaptively optimized by training on three well-known images. This is quite effective in eliminating random-valued impulse noise. Simulation results show that the proposed filter is superior in eliminating impulse noise while preserving edges and fine details of digital images, and the results are compared with other existing nonlinear filters.
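The first filtering stage can be sketched as follows. The 3x3 window and the assumption that impulses take the extreme values 0 and 255 are illustrative simplifications, not the paper's exact asymmetric trimmed median filter:

```python
import statistics

def trimmed_median_filter(image, low=0, high=255):
    """Stage-1 sketch: replace each interior pixel with the median of its
    3x3 neighbourhood after trimming values assumed to be impulses
    (exactly `low` or `high`). `image` is a list of lists of ints."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]          # borders copied unchanged
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [image[r + dr][c + dc]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            kept = [v for v in window if low < v < high]  # trim impulses
            if kept:                         # median of uncorrupted pixels
                out[r][c] = int(statistics.median(kept))
    return out
```

In the paper's two-stage design, this output would then be blended with a trained feed-forward network's prediction rather than used directly.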
Wireless data broadcast is an efficient way of disseminating data to users in mobile computing environments. From the server's point of view, how to place the data items on channels is a crucial issue, with the objective of minimizing the average access time and tuning time. Similarly, how to schedule the data retrieval process for a given request at the client side, so that all the requested items can be downloaded in a short time, is also an important problem. In this paper, we investigate multi-item data retrieval scheduling in push-based multichannel broadcast environments. The most important issues in mobile computing are energy efficiency and query response efficiency; however, in data broadcast the objectives of reducing access latency and energy cost can contradict each other. Consequently, we define two new problems, the Minimum Cost Data Retrieval (MCDR) problem and the Large Number Data Retrieval (LNDR) problem, and develop a heuristic algorithm to download a large number of items efficiently. When there is no replicated item in a broadcast cycle, we show that an optimal retrieval schedule can be obtained in polynomial time.
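A greedy client-side retrieval schedule of the kind discussed above can be sketched as follows. The single-tuner, one-download-per-tick model and the first-match channel choice are assumptions for illustration, not the paper's heuristic:

```python
def download_schedule(channels, wanted):
    """Greedy sketch: the client tunes to one channel per tick and grabs
    the first still-needed item it sees. channels[c][t] is the item
    broadcast on channel c at tick t; the cycle repeats."""
    needed = set(wanted)
    plan = []                                # (tick, channel, item)
    tick = 0
    cycle = len(channels[0])
    while needed and tick < 10 * cycle:      # safety bound on waiting
        for c, chan in enumerate(channels):
            item = chan[tick % cycle]
            if item in needed:               # single tuner: one grab per tick
                plan.append((tick, c, item))
                needed.discard(item)
                break
        tick += 1
    return plan
```

The total span of `plan` models access latency, while its length models the number of active listening slots, i.e. energy cost; the tension between the two is exactly the MCDR trade-off.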
In this paper, we consider a criminal investigation into the collective guilt of members of a working group. Assuming that the statistics we use are reliable, we present a PageRank model based on mutual information. First, we use the average mutual information between non-suspicious topics and suspicious topics to score the topics by degree of suspicion. Second, we build the correlation matrix based on the degree of suspicion and derive the corresponding Markov state transition matrix. Then, we set the initial value for all members of the working group based on the degree of suspicion, and finally we calculate the suspected degree of each member. In the small ten-person case, we build the improved PageRank model; calculating the statistics of this case yields a table ranking the members by suspected degree. Comparing with the results given in this issue, we find that the two results basically match, indicating that the model we have built is feasible. In the current case, we first obtain a ranking of 15 topics in order of suspicion via the mutual-information-based PageRank model. Second, we obtain the stable point of the Markov state transition matrix using the Markov chain. Then, we build the connection matrix based on the degree of suspicion and derive the corresponding Markov state transition matrix. Last, we calculate the suspected degree of 83 candidates. The results show that the suspicious members are at the top of the ranking list while the innocent people are at the bottom, again indicating that the model is feasible. When suspicious topics and conspirators change, a relatively good result can also be obtained by this model. In the current case, we have evidence to believe that Dolores and Jerome, who are senior managers, are under significant suspicion, and we recommend that future attention be paid to them.
The PageRank model based on mutual information takes full account of the information flow in the message distribution network. It can not only handle the statistics used in conspiracy detection, but can also be applied to detect infected cells in a biological network. Finally, we present the advantages and disadvantages of this model and directions for improvement.
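The core of any PageRank-style ranking is power iteration over a transition matrix. A minimal sketch of generic PageRank (the paper's mutual-information weighting of the matrix entries is not reproduced here):

```python
def pagerank(matrix, damping=0.85, iters=50):
    """Power iteration on a column-stochastic transition matrix.
    matrix[i][j] is the probability of moving from node j to node i;
    each column must sum to 1. Returns the stationary score vector."""
    n = len(matrix)
    rank = [1.0 / n] * n                     # uniform starting scores
    for _ in range(iters):
        rank = [(1 - damping) / n +
                damping * sum(matrix[i][j] * rank[j] for j in range(n))
                for i in range(n)]
    return rank
```

In the paper's setting, the transition matrix would be built from suspicion-weighted mutual information between members, and the converged vector gives each member's suspected degree.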
One of the most popular areas of research is wireless communication. A mobile ad hoc network (MANET) is a network of wireless mobile nodes that is infrastructure-less and self-organizing. With its wireless and distributed nature it is exposed to several security threats, one of which is the wormhole attack: a pair of attackers forms a virtual link, recording and replaying wireless transmissions. This paper presents the types of wormhole attack and different techniques for detecting wormhole attacks in MANETs.
Grid computing is a modularized way of structuring network resources so as to share information and resources and to solve computationally intensive problems. It is a form of distributed processing which allows a problem to be distributed over multiple computational resources. The formation of the grid determines the computation cost and the performance metrics of the operation to be carried out. Grid computing favours an ad hoc structure formed at the time of request (it does not hold a fixed infrastructure), termed a virtual organization. This paper presents dynamic source routing for modelling virtual organizations.
Assistive Examination System for Visually Impaired (IJCATR)
This paper presents the design of a voice-enabled examination system which can be used by visually challenged students. The system uses Text-to-Speech (TTS) and Speech-to-Text (STT) technology. This web-based academic testing software provides an interface for blind students, enhancing their educational experience by giving them a tool to take exams. The system will aid the differently-abled in appearing for online tests and enable them to come on par with other students. It can also be used by students with learning disabilities, or by anyone who wishes to take an examination in a combined auditory and visual way.
A Formal Machine Learning or Multi-Objective Decision-Making System for Deter... (IJCATR)
Decision-making typically needs mechanisms to compromise among opposing norms. When multiple objectives are involved in machine learning, a vital step is to relate the weights of the individual objectives to system-level performance. Determining the weights of multiple objectives is an analysis process, and it has typically been treated as an optimization problem. However, our preliminary investigation has shown that existing methodologies for managing the weights of multiple objectives have obvious limitations: the determination of weights is treated as a single problem, and results based on such optimization are limited and can even be unreliable when knowledge about the multiple objectives is incomplete, for example owing to poor data. The constraints on weights are also discussed. Variable weights are natural in decision-making processes, so we need to develop a systematic methodology for determining variable weights of multiple objectives. Here, the roles of weights in multi-objective decision-making or machine learning are analysed, and the weights are determined with the help of a standard neural network.
Risk-Aware Response Mechanism with Extended D-S Theory (IJCATR)
Mobile ad hoc networks (MANETs) have a dynamic network infrastructure and are vulnerable to all types of attacks. Among these, routing attacks receive the most attention because they change the whole topology and cause more damage to a MANET. Although many intrusion detection systems are available to diminish such critical attacks, existing ones cause unexpected network partitions and additional damage to the network infrastructure, leading to uncertainty in finding routing attacks in MANETs. In this paper, we propose an adaptive risk-aware response mechanism with extended Dempster-Shafer theory to identify routing attacks and malicious nodes in MANETs. Our technique finds the malicious node using degrees of evidence from expert knowledge and detects the important factors for each node. It creates a blacklist of all malicious nodes so that they cannot enter the network again.
Agent-Based Frameworks for Distributed Association Rule Mining: An Analysis (IJFCST)
Distributed Association Rule Mining (DARM) is the task of generating globally strong association rules from the global frequent itemsets in a distributed environment. The intelligent-agent-based model for scalable mining over large-scale distributed data is a popular approach to constructing Distributed Data Mining (DDM) systems and is characterized by a variety of agents coordinating and communicating with each other to perform the various tasks of the data mining process. This study performs a comparative analysis of the existing agent-based frameworks for mining association rules from distributed data sources.
Fragmentation of Data in Large-Scale Systems for Ideal Performance and Security (IJCATR)
Cloud computing is becoming a prominent trend offering a number of significant advantages. One of its foundational advantages is pay-as-per-use: the customer pays according to the services used. At present, growing data generation requires farming out large amounts of data, and there is an indefinitely large number of Cloud Service Providers (CSPs). Relying on cloud service providers is an increasing trend for many organizations as well as customers, as it decreases the burden of maintenance and local data storage. In cloud computing, however, transferring data to third-party administrative control gives rise to security concerns: within the cloud, data may be compromised by attacks from unauthorized users and nodes. To protect data in the cloud, stronger security measures are required, along with optimization of data retrieval time. The proposed system addresses both security and performance. In the DROPS methodology, files are first divided into fragments and the fragmented data are replicated over cloud nodes. Only a single fragment of a particular file is stored on each node, ensuring that no meaningful information is revealed to an attacker even after a successful attack on a node. Nodes are separated by means of T-coloring in order to prevent an attacker from guessing the locations of the fragments. Complete data security is thus ensured by the DROPS methodology.
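A greedy placement respecting a T-coloring-style separation constraint can be sketched as follows. The graph model and the first-fit strategy are illustrative assumptions, not the DROPS algorithm itself:

```python
def place_fragments(fragments, nodes, adjacency):
    """Greedy sketch of separated fragment placement: each fragment goes to
    a node that is neither already used nor adjacent to a used node, so an
    attacker who compromises one node obtains at most one fragment and
    cannot infer where neighbouring fragments are stored."""
    used = set()
    placement = {}
    for frag in fragments:
        for node in nodes:
            neighbours = adjacency.get(node, set())
            if node not in used and not (neighbours & used):
                placement[frag] = node       # first node satisfying constraint
                used.add(node)
                break
        else:
            raise RuntimeError("not enough sufficiently separated nodes")
    return placement
```

On a path topology A-B-C-D-E, three fragments land on A, C and E: no two chosen nodes are neighbours, mimicking the T-coloring separation DROPS enforces.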
Accessing Secured Data in Cloud Computing Environment (IJNSA)
The number of businesses using cloud computing has increased dramatically over the last few years due to attractive features such as scalability, flexibility, fast start-up and low costs. Services provided over the web range from using the provider's software and hardware to managing security and other issues. Some of the biggest challenges at this point are providing privacy and data security to subscribers of public cloud servers. The efficient encryption technique presented in this paper can be used for secure access to and storage of data on a public cloud server, and for moving and searching encrypted data through communication channels while protecting data confidentiality. The method ensures data protection against both external and internal intruders. Data can be decrypted only with the key provided by the data owner, while the public cloud server is unable to read encrypted data or queries. Answering a query does not depend on its size and is done in constant time. Data access is managed by the data owner, and the proposed scheme allows detection of unauthorized modifications.
Excellent Manner of Using Secure Way of Data Storage in Cloud Computing (IJMTER)
The major challenging issue in cloud computing is security: protecting data from third parties as well as over the Internet. This paper mainly deals with how that security is provided. Various types of services are available in cloud computing to protect data and to use resources effectively, such as Software as a Service (SaaS), Platform as a Service (PaaS) and Hardware as a Service (HaaS). Cloud computing is the use of computing resources (hardware and software) delivered as a service over the Internet. It moves application software and databases to large data centres, where the administration of the data and services may not be fully trustworthy; the third party must therefore be certified and authorized. Since cloud computing shares distributed resources via the network in an open environment, it creates new security risks to the correctness of the data in the cloud. In this paper I propose a flexible data storage mechanism for the distributed environment using homomorphic token generation. In the proposed system, users need only permit auditing of the cloud storage with lightweight communication. Encryption and decryption are a heavy burden for a single processor; the processing capabilities of cloud computing can instead be utilized.
Implementation of Agent-Based Dynamic Distributed Service (CSC Journals)
The concept of distributed computing implies a network or internetwork of independent nodes which are logically configured in such a manner as to be seen as one machine by an application. Such systems have been implemented in many varying forms and configurations for the optimal processing of data. Agents and multi-agent systems are useful in modeling complex distributed processes. They focus on support for (the development of) large-scale, secure and heterogeneous distributed systems, and they are expected to abstract both hardware and software vis-à-vis distributed systems. To make optimal use of the tremendous increase in processing power, bandwidth and memory that technology is placing in the hands of the designer, a Dynamically Distributed Service (to be positioned as a service to a network or internetwork) is proposed. The service conceptually migrates an application onto different nodes. In this paper, we present the design and implementation of an inter-mobility (migration) mechanism for agents, based on FIPA ACL messages, and we evaluate the performance of this implementation.
A Robust and Verifiable Threshold Multi-Authority Access Control System in Pu... (IJARIIT)
Attribute-based encryption (ABE) is regarded as a promising cryptographic tool for guaranteeing data owners' direct control over their data in public cloud storage. Earlier ABE schemes involve only one authority to maintain the whole attribute set, which can introduce a single-point bottleneck for both security and performance. Subsequently, some multi-authority schemes were proposed, in which several authorities separately maintain disjoint attribute subsets; however, the single-point bottleneck problem remains unsolved. In this survey paper, from another perspective, we examine a threshold multi-authority CP-ABE access control scheme for public cloud storage, named TMACS, in which multiple authorities jointly manage a uniform attribute set. In TMACS, taking advantage of (t, n) threshold secret sharing, the master key can be shared among multiple authorities, and a lawful user can generate his or her secret key by interacting with any t authorities. Security and performance analysis shows that TMACS is not only verifiably secure when fewer than t authorities are compromised, but also robust when no fewer than t authorities are alive in the system. Furthermore, by efficiently combining the traditional multi-authority scheme with TMACS, we construct a hybrid scheme that handles attributes coming from different authorities while achieving security and system-level robustness.
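The (t, n) threshold sharing that TMACS relies on is classically realized with Shamir's scheme. A minimal sketch over a prime field; the prime and the demo key size are illustrative choices, not TMACS parameters:

```python
import random

PRIME = 2_147_483_647  # a Mersenne prime, large enough for a demo secret

def make_shares(secret, t, n):
    """Split `secret` into n shares such that any t reconstruct it.
    The secret is the constant term of a random degree-(t-1) polynomial."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(t - 1)]
    def f(x):
        return sum(c * pow(x, k, PRIME) for k, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret from t shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        # Modular inverse of den via Fermat's little theorem.
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret
```

In the TMACS setting, the "secret" corresponds to the master key shared among n authorities: any t cooperating authorities can serve a user's key request, while fewer than t learn nothing.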
Characterizing and Processing of Big Data Using Data Mining Techniques (IJTET)
Big data is a popular term used to describe the exponential growth and availability of data, both structured and unstructured. It concerns large-volume, complex and growing data sets with multiple, autonomous sources. Big data is now rapidly expanding not only in science and engineering but in all domains, including the physical and biological sciences. The main objective of this paper is to characterize the features of big data. The HACE theorem, which characterizes the features of the big data revolution, is used, and a big data processing model is proposed from the data mining perspective. The model involves the aggregation of mining, analysis, information sources, user-interest modeling, privacy and security. Exploring large volumes of data and extracting useful information or knowledge from them is the most fundamental challenge in big data, so we should analyse these problems and the data revolution they entail.
Data mining over diverse data sources is a useful means for discovering valuable patterns, associations, trends and dependencies in data. Many variants of this problem exist, depending on how the data is distributed, what type of data mining we wish to do, how to achieve privacy of the data, and what restrictions are placed on sharing of information. A transactional database owner lacking expertise or computational resources can outsource its mining tasks to a third-party service provider or server; however, both the itemsets and the association rules of the outsourced database are considered private property of the database owner. In this paper, we consider a scenario where multiple data sources are willing to share their data with a trusted third party, called the combiner, who runs data mining algorithms over the union of their data, as long as each data source is guaranteed that its information that does not pertain to another data source will not be revealed. The proposed algorithm is characterized by (1) secret-sharing-based secure key transfer for distributed transactional databases, with lightweight encryption used for preserving privacy, and (2) a rough-set-based mechanism for association rule extraction for an efficient mining task. Performance analysis and experimental results demonstrate the effectiveness of the proposed algorithm.
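One simple way for multiple sources to let a combiner learn only a global support count, without any site revealing its local count, is additive secret sharing. This is a minimal sketch of that generic technique, not necessarily the paper's exact secret-sharing protocol:

```python
import random

MOD = 2**32  # shares live in Z_MOD; any modulus larger than the total works

def share_count(count, n_parties):
    """Split a local support count into n additive shares: n-1 random
    values plus one correction term, so the shares sum to `count` mod MOD.
    Any subset of fewer than n shares looks uniformly random."""
    shares = [random.randrange(MOD) for _ in range(n_parties - 1)]
    shares.append((count - sum(shares)) % MOD)
    return shares

def combine(all_shares):
    """The combiner sums every share it receives; only the global total
    (here, the global support of an itemset) is revealed."""
    return sum(all_shares) % MOD
```

Each site would distribute its shares among the participants before anything reaches the combiner, so no single message leaks a site's local count.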
Text Mining in Digital Libraries Using Okapi BM25 Model (IJCATR)
The emergence of the internet has made vast amounts of information available and easily accessible online. As a result, most libraries have digitized their content in order to remain relevant to their users and to keep pace with the advancement of the internet. However, these digital libraries have been criticized for using inefficient information retrieval models that do not apply relevance ranking to the retrieved results. This paper proposes the use of the Okapi BM25 model in text mining as a means of improving the relevance ranking of digital libraries. Okapi BM25 was selected because it is a probability-based relevance-ranking algorithm. A case-study research design was used, and the model design was based on information retrieval processes. The performance of the Boolean, vector space and Okapi BM25 models was compared for data retrieval, with relevance-ranked documents retrieved and displayed on the OPAC framework search page. The results revealed that Okapi BM25 outperformed both the Boolean and the vector space models. Therefore, this paper proposes using the Okapi BM25 model to reward terms according to their relative frequencies in a document, so as to improve the performance of text mining in digital libraries.
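The BM25 scoring function named above is well documented. A minimal sketch over tokenized documents, using the common k1 and b defaults and one standard IDF variant (implementations differ slightly in the IDF formula):

```python
import math

def bm25_score(query_terms, doc, corpus, k1=1.2, b=0.75):
    """Okapi BM25 score of `doc` for a bag-of-words query.
    `corpus` is a list of token lists; `doc` is one such token list."""
    n = len(corpus)
    avgdl = sum(len(d) for d in corpus) / n          # average doc length
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)     # document frequency
        idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
        tf = doc.count(term)                         # term frequency in doc
        # Term-frequency saturation (k1) and length normalization (b).
        score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avgdl))
    return score
```

Ranking the corpus by this score for each query yields exactly the relevance-ordered result list the paper advocates for OPAC search pages.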
Green Computing, eco trends, climate change, e-waste and eco-friendlyEditor IJCATR
This study focused on the practice of using computing resources more efficiently while maintaining or increasing overall performance. Sustainable IT services require the integration of green computing practices such as power management, virtualization, improving cooling technology, recycling, electronic waste disposal, and optimization of the IT infrastructure to meet sustainability requirements. Studies have shown that costs of power utilized by IT departments can approach 50% of the overall energy costs for an organization. While there is an expectation that green IT should lower costs and the firm’s impact on the environment, there has been far less attention directed at understanding the strategic benefits of sustainable IT services in terms of the creation of customer value, business value and societal value. This paper provides a review of the literature on sustainable IT, key areas of focus, and identifies a core set of principles to guide sustainable IT service design.
Policies for Green Computing and E-Waste in NigeriaEditor IJCATR
Computers today are an integral part of individuals’ lives all around the world, but unfortunately these devices are toxic to the environment given the materials used, their limited battery life and technological obsolescence. Individuals are concerned about the hazardous materials ever present in computers, even if the importance of various attributes differs, and that a more environment -friendly attitude can be obtained through exposure to educational materials. In this paper, we aim to delineate the problem of e-waste in Nigeria and highlight a series of measures and the advantage they herald for our country and propose a series of action steps to develop in these areas further. It is possible for Nigeria to have an immediate economic stimulus and job creation while moving quickly to abide by the requirements of climate change legislation and energy efficiency directives. The costs of implementing energy efficiency and renewable energy measures are minimal as they are not cash expenditures but rather investments paid back by future, continuous energy savings.
Performance Evaluation of VANETs for Evaluating Node Stability in Dynamic Sce...Editor IJCATR
Vehicular ad hoc networks (VANETs) are a favorable area of exploration which empowers the interconnection amid the movable vehicles and between transportable units (vehicles) and road side units (RSU). In Vehicular Ad Hoc Networks (VANETs), mobile vehicles can be organized into assemblage to promote interconnection links. The assemblage arrangement according to dimensions and geographical extend has serious influence on attribute of interaction .Vehicular ad hoc networks (VANETs) are subclass of mobile Ad-hoc network involving more complex mobility patterns. Because of mobility the topology changes very frequently. This raises a number of technical challenges including the stability of the network .There is a need for assemblage configuration leading to more stable realistic network. The paper provides investigation of various simulation scenarios in which cluster using k-means algorithm are generated and their numbers are varied to find the more stable configuration in real scenario of road.
Optimum Location of DG Units Considering Operation ConditionsEditor IJCATR
The optimal sizing and placement of Distributed Generation units (DG) are becoming very attractive to researchers these days. In this paper a two stage approach has been used for allocation and sizing of DGs in distribution system with time varying load model. The strategic placement of DGs can help in reducing energy losses and improving voltage profile. The proposed work discusses time varying loads that can be useful for selecting the location and optimizing DG operation. The method has the potential to be used for integrating the available DGs by identifying the best locations in a power system. The proposed method has been demonstrated on 9-bus test system.
Analysis of Comparison of Fuzzy Knn, C4.5 Algorithm, and Naïve Bayes Classifi...Editor IJCATR
Early detection of diabetes mellitus (DM) can prevent or inhibit complication. There are several laboratory test that must be done to detect DM. The result of this laboratory test then converted into data training. Data training used in this study generated from UCI Pima Database with 6 attributes that were used to classify positive or negative diabetes. There are various classification methods that are commonly used, and in this study three of them were compared, which were fuzzy KNN, C4.5 algorithm and Naïve Bayes Classifier (NBC) with one identical case. The objective of this study was to create software to classify DM using tested methods and compared the three methods based on accuracy, precision, and recall. The results showed that the best method was Fuzzy KNN with average and maximum accuracy reached 96% and 98%, respectively. In second place, NBC method had respective average and maximum accuracy of 87.5% and 90%. Lastly, C4.5 algorithm had average and maximum accuracy of 79.5% and 86%, respectively.
Web Scraping for Estimating new Record from Source SiteEditor IJCATR
Study in the Competitive field of Intelligent, and studies in the field of Web Scraping, have a symbiotic relationship mutualism. In the information age today, the website serves as a main source. The research focus is on how to get data from websites and how to slow down the intensity of the download. The problem that arises is the website sources are autonomous so that vulnerable changes the structure of the content at any time. The next problem is the system intrusion detection snort installed on the server to detect bot crawler. So the researchers propose the use of the methods of Mining Data Records and the method of Exponential Smoothing so that adaptive to changes in the structure of the content and do a browse or fetch automatically follow the pattern of the occurrences of the news. The results of the tests, with the threshold 0.3 for MDR and similarity threshold score 0.65 for STM, using recall and precision values produce f-measure average 92.6%. While the results of the tests of the exponential estimation smoothing using ? = 0.5 produces MAE 18.2 datarecord duplicate. It slowed down to 3.6 datarecord from 21.8 datarecord results schedule download/fetch fix in an average time of occurrence news.
Evaluating Semantic Similarity between Biomedical Concepts/Classes through S...Editor IJCATR
Most of the existing semantic similarity measures that use ontology structure as their primary source can measure semantic similarity between concepts/classes using single ontology. The ontology-based semantic similarity techniques such as structure-based semantic similarity techniques (Path Length Measure, Wu and Palmer’s Measure, and Leacock and Chodorow’s measure), information content-based similarity techniques (Resnik’s measure, Lin’s measure), and biomedical domain ontology techniques (Al-Mubaid and Nguyen’s measure (SimDist)) were evaluated relative to human experts’ ratings, and compared on sets of concepts using the ICD-10 “V1.0” terminology within the UMLS. The experimental results validate the efficiency of the SemDist technique in single ontology, and demonstrate that SemDist semantic similarity techniques, compared with the existing techniques, gives the best overall results of correlation with experts’ ratings.
Semantic Similarity Measures between Terms in the Biomedical Domain within f...Editor IJCATR
The techniques and tests are tools used to define how measure the goodness of ontology or its resources. The similarity between biomedical classes/concepts is an important task for the biomedical information extraction and knowledge discovery. However, most of the semantic similarity techniques can be adopted to be used in the biomedical domain (UMLS). Many experiments have been conducted to check the applicability of these measures. In this paper, we investigate to measure semantic similarity between two terms within single ontology or multiple ontologies in ICD-10 “V1.0” as primary source, and compare my results to human experts score by correlation coefficient.
A Strategy for Improving the Performance of Small Files in Openstack Swift Editor IJCATR
This is an effective way to improve the storage access performance of small files in Openstack Swift by adding an aggregate storage module. Because Swift will lead to too much disk operation when querying metadata, the transfer performance of plenty of small files is low. In this paper, we propose an aggregated storage strategy (ASS), and implement it in Swift. ASS comprises two parts which include merge storage and index storage. At the first stage, ASS arranges the write request queue in chronological order, and then stores objects in volumes. These volumes are large files that are stored in Swift actually. During the short encounter time, the object-to-volume mapping information is stored in Key-Value store at the second stage. The experimental results show that the ASS can effectively improve Swift's small file transfer performance.
Integrated System for Vehicle Clearance and RegistrationEditor IJCATR
Efficient management and control of government's cash resources rely on government banking arrangements. Nigeria, like many low income countries, employed fragmented systems in handling government receipts and payments. Later in 2016, Nigeria implemented a unified structure as recommended by the IMF, where all government funds are collected in one account would reduce borrowing costs, extend credit and improve government's fiscal policy among other benefits to government. This situation motivated us to embark on this research to design and implement an integrated system for vehicle clearance and registration. This system complies with the new Treasury Single Account policy to enable proper interaction and collaboration among five different level agencies (NCS, FRSC, SBIR, VIO and NPF) saddled with vehicular administration and activities in Nigeria. Since the system is web based, Object Oriented Hypermedia Design Methodology (OOHDM) is used. Tools such as Php, JavaScript, css, html, AJAX and other web development technologies were used. The result is a web based system that gives proper information about a vehicle starting from the exact date of importation to registration and renewal of licensing. Vehicle owner information, custom duty information, plate number registration details, etc. will also be efficiently retrieved from the system by any of the agencies without contacting the other agency at any point in time. Also number plate will no longer be the only means of vehicle identification as it is presently the case in Nigeria, because the unified system will automatically generate and assigned a Unique Vehicle Identification Pin Number (UVIPN) on payment of duty in the system to the vehicle and the UVIPN will be linked to the various agencies in the management information system.
Assessment of the Efficiency of Customer Order Management System: A Case Stu...Editor IJCATR
The Supermarket Management System deals with the automation of buying and selling of good and services. It includes both sales and purchase of items. The project Supermarket Management System is to be developed with the objective of making the system reliable, easier, fast, and more informative.
Energy-Aware Routing in Wireless Sensor Network Using Modified Bi-Directional A*Editor IJCATR
Energy is a key component in the Wireless Sensor Network (WSN)[1]. The system will not be able to run according to its function without the availability of adequate power units. One of the characteristics of wireless sensor network is Limitation energy[2]. A lot of research has been done to develop strategies to overcome this problem. One of them is clustering technique. The popular clustering technique is Low Energy Adaptive Clustering Hierarchy (LEACH)[3]. In LEACH, clustering techniques are used to determine Cluster Head (CH), which will then be assigned to forward packets to Base Station (BS). In this research, we propose other clustering techniques, which utilize the Social Network Analysis approach theory of Betweeness Centrality (BC) which will then be implemented in the Setup phase. While in the Steady-State phase, one of the heuristic searching algorithms, Modified Bi-Directional A* (MBDA *) is implemented. The experiment was performed deploy 100 nodes statically in the 100x100 area, with one Base Station at coordinates (50,50). To find out the reliability of the system, the experiment to do in 5000 rounds. The performance of the designed routing protocol strategy will be tested based on network lifetime, throughput, and residual energy. The results show that BC-MBDA * is better than LEACH. This is influenced by the ways of working LEACH in determining the CH that is dynamic, which is always changing in every data transmission process. This will result in the use of energy, because they always doing any computation to determine CH in every transmission process. In contrast to BC-MBDA *, CH is statically determined, so it can decrease energy usage.
Security in Software Defined Networks (SDN): Challenges and Research Opportun...Editor IJCATR
In networks, the rapidly changing traffic patterns of search engines, Internet of Things (IoT) devices, Big Data and data centers has thrown up new challenges for legacy; existing networks; and prompted the need for a more intelligent and innovative way to dynamically manage traffic and allocate limited network resources. Software Defined Network (SDN) which decouples the control plane from the data plane through network vitalizations aims to address these challenges. This paper has explored the SDN architecture and its implementation with the OpenFlow protocol. It has also assessed some of its benefits over traditional network architectures, security concerns and how it can be addressed in future research and related works in emerging economies such as Nigeria.
Measure the Similarity of Complaint Document Using Cosine Similarity Based on...Editor IJCATR
Report handling on "LAPOR!" (Laporan, Aspirasi dan Pengaduan Online Rakyat) system depending on the system administrator who manually reads every incoming report [3]. Read manually can lead to errors in handling complaints [4] if the data flow is huge and grows rapidly, it needs at least three days to prepare a confirmation and it sensitive to inconsistencies [3]. In this study, the authors propose a model that can measure the identities of the Query (Incoming) with Document (Archive). The authors employed Class-Based Indexing term weighting scheme, and Cosine Similarities to analyse document similarities. CoSimTFIDF, CoSimTFICF and CoSimTFIDFICF values used in classification as feature for K-Nearest Neighbour (K-NN) classifier. The optimum result evaluation is pre-processing employ 75% of training data ratio and 25% of test data with CoSimTFIDF feature. It deliver a high accuracy 84%. The k = 5 value obtain high accuracy 84.12%
Hangul Recognition Using Support Vector MachineEditor IJCATR
The recognition of Hangul Image is more difficult compared with that of Latin. It could be recognized from the structural arrangement. Hangul is arranged from two dimensions while Latin is only from the left to the right. The current research creates a system to convert Hangul image into Latin text in order to use it as a learning material on reading Hangul. In general, image recognition system is divided into three steps. The first step is preprocessing, which includes binarization, segmentation through connected component-labeling method, and thinning with Zhang Suen to decrease some pattern information. The second is receiving the feature from every single image, whose identification process is done through chain code method. The third is recognizing the process using Support Vector Machine (SVM) with some kernels. It works through letter image and Hangul word recognition. It consists of 34 letters, each of which has 15 different patterns. The whole patterns are 510, divided into 3 data scenarios. The highest result achieved is 94,7% using SVM kernel polynomial and radial basis function. The level of recognition result is influenced by many trained data. Whilst the recognition process of Hangul word applies to the type 2 Hangul word with 6 different patterns. The difference of these patterns appears from the change of the font type. The chosen fonts for data training are such as Batang, Dotum, Gaeul, Gulim, Malgun Gothic. Arial Unicode MS is used to test the data. The lowest accuracy is achieved through the use of SVM kernel radial basis function, which is 69%. The same result, 72 %, is given by the SVM kernel linear and polynomial.
Application of 3D Printing in EducationEditor IJCATR
This paper provides a review of literature concerning the application of 3D printing in the education system. The review identifies that 3D Printing is being applied across the Educational levels [1] as well as in Libraries, Laboratories, and Distance education systems. The review also finds that 3D Printing is being used to teach both students and trainers about 3D Printing and to develop 3D Printing skills.
Survey on Energy-Efficient Routing Algorithms for Underwater Wireless Sensor ...Editor IJCATR
In underwater environment, for retrieval of information the routing mechanism is used. In routing mechanism there are three to four types of nodes are used, one is sink node which is deployed on the water surface and can collect the information, courier/super/AUV or dolphin powerful nodes are deployed in the middle of the water for forwarding the packets, ordinary nodes are also forwarder nodes which can be deployed from bottom to surface of the water and source nodes are deployed at the seabed which can extract the valuable information from the bottom of the sea. In underwater environment the battery power of the nodes is limited and that power can be enhanced through better selection of the routing algorithm. This paper focuses the energy-efficient routing algorithms for their routing mechanisms to prolong the battery power of the nodes. This paper also focuses the performance analysis of the energy-efficient algorithms under which we can examine the better performance of the route selection mechanism which can prolong the battery power of the node
Comparative analysis on Void Node Removal Routing algorithms for Underwater W...Editor IJCATR
The designing of routing algorithms faces many challenges in underwater environment like: propagation delay, acoustic channel behaviour, limited bandwidth, high bit error rate, limited battery power, underwater pressure, node mobility, localization 3D deployment, and underwater obstacles (voids). This paper focuses the underwater voids which affects the overall performance of the entire network. The majority of the researchers have used the better approaches for removal of voids through alternate path selection mechanism but still research needs improvement. This paper also focuses the architecture and its operation through merits and demerits of the existing algorithms. This research article further focuses the analytical method of the performance analysis of existing algorithms through which we found the better approach for removal of voids
Decay Property for Solutions to Plate Type Equations with Variable CoefficientsEditor IJCATR
In this paper we consider the initial value problem for a plate type equation with variable coefficients and memory in
1 n R n ), which is of regularity-loss property. By using spectrally resolution, we study the pointwise estimates in the spectral
space of the fundamental solution to the corresponding linear problem. Appealing to this pointwise estimates, we obtain the global
existence and the decay estimates of solutions to the semilinear problem by employing the fixed point theorem
This presentation includes basic of PCOS their pathology and treatment and also Ayurveda correlation of PCOS and Ayurvedic line of treatment mentioned in classics.
Safalta Digital marketing institute in Noida, provide complete applications that encompass a huge range of virtual advertising and marketing additives, which includes search engine optimization, virtual communication advertising, pay-per-click on marketing, content material advertising, internet analytics, and greater. These university courses are designed for students who possess a comprehensive understanding of virtual marketing strategies and attributes.Safalta Digital Marketing Institute in Noida is a first choice for young individuals or students who are looking to start their careers in the field of digital advertising. The institute gives specialized courses designed and certification.
for beginners, providing thorough training in areas such as SEO, digital communication marketing, and PPC training in Noida. After finishing the program, students receive the certifications recognised by top different universitie, setting a strong foundation for a successful career in digital marketing.
A review of the growth of the Israel Genealogy Research Association Database Collection for the last 12 months. Our collection is now passed the 3 million mark and still growing. See which archives have contributed the most. See the different types of records we have, and which years have had records added. You can also see what we have for the future.
Macroeconomics- Movie Location
This will be used as part of your Personal Professional Portfolio once graded.
Objective:
Prepare a presentation or a paper using research, basic comparative analysis, data organization and application of economic information. You will make an informed assessment of an economic climate outside of the United States to accomplish an entertainment industry objective.
Introduction to AI for Nonprofits with Tapp NetworkTechSoup
Dive into the world of AI! Experts Jon Hill and Tareq Monaur will guide you through AI's role in enhancing nonprofit websites and basic marketing strategies, making it easy to understand and apply.
it describes the bony anatomy including the femoral head , acetabulum, labrum . also discusses the capsule , ligaments . muscle that act on the hip joint and the range of motion are outlined. factors affecting hip joint stability and weight transmission through the joint are summarized.
Strategies for Effective Upskilling is a presentation by Chinwendu Peace in a Your Skill Boost Masterclass organisation by the Excellence Foundation for South Sudan on 08th and 09th June 2024 from 1 PM to 3 PM on each day.
বাংলাদেশের অর্থনৈতিক সমীক্ষা ২০২৪ [Bangladesh Economic Review 2024 Bangla.pdf] কম্পিউটার , ট্যাব ও স্মার্ট ফোন ভার্সন সহ সম্পূর্ণ বাংলা ই-বুক বা pdf বই " সুচিপত্র ...বুকমার্ক মেনু 🔖 ও হাইপার লিংক মেনু 📝👆 যুক্ত ..
আমাদের সবার জন্য খুব খুব গুরুত্বপূর্ণ একটি বই ..বিসিএস, ব্যাংক, ইউনিভার্সিটি ভর্তি ও যে কোন প্রতিযোগিতা মূলক পরীক্ষার জন্য এর খুব ইম্পরট্যান্ট একটি বিষয় ...তাছাড়া বাংলাদেশের সাম্প্রতিক যে কোন ডাটা বা তথ্য এই বইতে পাবেন ...
তাই একজন নাগরিক হিসাবে এই তথ্য গুলো আপনার জানা প্রয়োজন ...।
বিসিএস ও ব্যাংক এর লিখিত পরীক্ষা ...+এছাড়া মাধ্যমিক ও উচ্চমাধ্যমিকের স্টুডেন্টদের জন্য অনেক কাজে আসবে ...
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...Dr. Vinod Kumar Kanvaria
Exploiting Artificial Intelligence for Empowering Researchers and Faculty,
International FDP on Fundamentals of Research in Social Sciences
at Integral University, Lucknow, 06.06.2024
By Dr. Vinod Kumar Kanvaria
Delivering Micro-Credentials in Technical and Vocational Education and TrainingAG2 Design
Explore how micro-credentials are transforming Technical and Vocational Education and Training (TVET) with this comprehensive slide deck. Discover what micro-credentials are, their importance in TVET, the advantages they offer, and the insights from industry experts. Additionally, learn about the top software applications available for creating and managing micro-credentials. This presentation also includes valuable resources and a discussion on the future of these specialised certifications.
For more detailed information on delivering micro-credentials in TVET, visit this https://tvettrainer.com/delivering-micro-credentials-in-tvet/
The simplified electron and muon model, Oscillating Spacetime: The Foundation...RitikBhardwaj56
Discover the Simplified Electron and Muon Model: A New Wave-Based Approach to Understanding Particles delves into a groundbreaking theory that presents electrons and muons as rotating soliton waves within oscillating spacetime. Geared towards students, researchers, and science buffs, this book breaks down complex ideas into simple explanations. It covers topics such as electron waves, temporal dynamics, and the implications of this model on particle physics. With clear illustrations and easy-to-follow explanations, readers will gain a new outlook on the universe's fundamental nature.
MATATAG CURRICULUM: ASSESSING THE READINESS OF ELEM. PUBLIC SCHOOL TEACHERS I...NelTorrente
In this research, it concludes that while the readiness of teachers in Caloocan City to implement the MATATAG Curriculum is generally positive, targeted efforts in professional development, resource distribution, support networks, and comprehensive preparation can address the existing gaps and ensure successful curriculum implementation.
MATATAG CURRICULUM: ASSESSING THE READINESS OF ELEM. PUBLIC SCHOOL TEACHERS I...
Agent-Driven Distributed Data Mining

International Journal of Science and Engineering Applications
Volume 2 Issue 5, 2013, ISSN-2319-7560 (Online)
www.ijsea.com

Rohini P.
Department of CSE, Malla Reddy College of Engineering, Andhra Pradesh, India.

Sree Lakshmi P.
Department of CSE, Malla Reddy College of Engineering, Andhra Pradesh, India.
Abstract: Multi-agent systems (autonomous agents, or simply agents) and knowledge discovery (or data mining) are two active areas in information technology. Bringing these two communities together has unveiled a tremendous potential for new opportunities and wider applications through the synergy of agents and data mining. Multi-agent systems (MAS) often deal with complex applications that require distributed problem solving. In many applications, the individual and collective behavior of the agents depends on data observed from distributed sources. Data mining technology has emerged for identifying patterns and trends in large quantities of data. The increasing demand to scale up to massive data sets, inherently distributed over networks with limited bandwidth and computational resources, motivated the development of distributed data mining (DDM). DDM originated from the need to mine over decentralized data sources. DDM performs partial analysis of the data at individual sites and then sends the outcome, as a partial result, to other sites where it is sometimes required to be aggregated into a global result.
Keywords: Distributed Data Mining, Multi-Agent Systems, Multi Agent Data Mining, Multi-Agent Based
Distributed Data Mining.
1. INTRODUCTION
Data Mining (DM) originated from knowledge discovery in databases (KDD). The large variety of DM techniques developed over the past decade includes methods for pattern-based similarity search, cluster analysis, decision-tree based classification, generalization taking the data cube or attribute-oriented induction approach, and mining of association rules [7]. DDM is a branch of data mining that offers a framework for mining distributed data, paying careful attention to the distributed data and computing resources. Distributed data mining (DDM) mines data from data sources regardless of their physical locations. The need for this capability arises from the fact that data produced locally at each site often cannot be transferred across the network because of its excessive volume and security concerns. Recently, DDM has become a critical component of knowledge-based systems, because its decentralized architecture can reach networked data sources such as weather databases, financial data portals, and emerging disease information systems. It has been recognized by industrial companies as an opportunity for major revenue from applications such as warehousing, process control, and customer services, where large amounts of data are stored.
In the DDM literature, one of two assumptions is commonly adopted as to how data is distributed across sites: homogeneously (horizontally partitioned) or heterogeneously (vertically partitioned). Both adopt the conceptual view that the data tables at each site are partitions of a single global table.
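The two distribution assumptions can be illustrated with a minimal sketch (illustrative only; the global table, its attributes, and the two sites are hypothetical, not taken from the paper):

```python
# A tiny "global table" of records: (customer_id, age, balance).
global_table = [
    (1, 34, 1200.0),
    (2, 51,  300.0),
    (3, 28, 9800.0),
    (4, 45,  560.0),
]

# Homogeneous (horizontally partitioned): every site holds the same
# attributes but a disjoint subset of the rows.
site_a_rows = global_table[:2]   # records 1-2
site_b_rows = global_table[2:]   # records 3-4

# Heterogeneous (vertically partitioned): every site holds all rows but
# only some of the attributes, joined by a shared key (customer_id).
site_a_cols = [(cid, age) for cid, age, _ in global_table]   # id + age
site_b_cols = [(cid, bal) for cid, _, bal in global_table]   # id + balance

# Both views are partitions of the same single global table:
assert site_a_rows + site_b_rows == global_table
rejoined = [(cid, age, dict(site_b_cols)[cid]) for cid, age in site_a_cols]
assert rejoined == global_table
```

Either way, the conceptual global table can be reconstructed, which is what makes a single global mining result well defined.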
Data mining still poses many challenges to the research community. The main challenges are: 1) Data mining has to deal with huge amounts of data located at different physical sites. 2) Data mining is a computationally intensive process involving very large data sets, often more than terabytes, so it is necessary to partition and distribute the data for parallel processing to achieve acceptable time and space performance. 3) For some domains, the input data changes rapidly; in these cases, knowledge has to be mined fast and efficiently in order to remain usable and up to date.
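Challenge (2), partitioning the data for parallel processing, can be sketched as follows (a minimal illustration; the per-partition "mining" step is a stand-in count-and-sum, not an actual mining algorithm):

```python
from concurrent.futures import ThreadPoolExecutor

def mine_partition(part):
    # Stand-in for a real mining task run on one partition:
    # return the partition's record count and value sum.
    return len(part), sum(part)

data = list(range(1, 101))              # stand-in for a large data set
parts = [data[i::4] for i in range(4)]  # split into 4 disjoint partitions

# Process the partitions in parallel, then merge the partial results.
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(mine_partition, parts))

total_rows = sum(n for n, _ in partials)   # 100
total_sum  = sum(s for _, s in partials)   # 5050
```

The merge step is cheap because only small partial results, not the raw partitions, are combined, which is the same principle DDM applies across network sites.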
2. NEED OF MULTI-AGENTS
Autonomous agents are computational systems that inhabit some complex dynamic environment, sense and act autonomously in this environment, and by doing so realize a set of goals or tasks for which they are designed. Agents are reactive, i.e., they perceive their environment and respond in a timely fashion to changes that occur.
The term multi-agent system is used for all types of systems composed of multiple autonomous components showing the following characteristics:
• each agent has incomplete capabilities to solve a
problem
• there is no global system control
• data is decentralized
• computation is asynchronous
Multi-agent systems have the following features:
• Dividing functionality among many agents provides
modularity, flexibility, modifiability, and
extensibility.
• Knowledge that is spread over various sources
(agents) can be integrated for a more complete view
when needed.
• Applications requiring distributed computing are
better supported by MAS.
• Agent technology supports distributed component
technology.
In a typical distributed environment, analyzing distributed data is a non-trivial problem because of many constraints, such as limited bandwidth (e.g. wireless networks), privacy-sensitive data, and distributed compute nodes, to mention only a few. The field of Distributed Data Mining (DDM) deals with these challenges and offers many algorithmic solutions to perform different data analysis and mining operations in a fundamentally distributed manner that pays careful attention to the resource constraints.
Since MAS are also distributed systems, combining DDM with MAS for data-intensive applications is appealing. DDM is expected to perform partial analysis of the data at individual sites and then to send the outcome, as a partial result, to other sites where it is sometimes required to be aggregated into the global result. Quite a number of DDM solutions are available using various techniques such as distributed association rules, distributed clustering, Bayesian learning, classification (regression), and compression, but only a few of them make use of intelligent agents at all. The main problems challenging any approach to DDM are issues of autonomy and privacy. For example, when data can be viewed at the data warehouse from many different perspectives and at different levels of abstraction, it may threaten the goal of protecting individual data and guarding against invasion of privacy. These issues of privacy and autonomy become particularly important in business application scenarios where, for example, different (often competing) companies may want to collaborate for fraud detection, but without sharing their individual customers’ data or disclosing it to third parties.
DDM is a complex system focusing on the distribution of
data resources over the network as well as the extraction of
data from those resources. Scalability lies at the very core
of DDM systems, since the system configuration may be
altered from time to time; designing DDM systems therefore
involves a great many software engineering issues, such as
reusability, extensibility, compatibility, flexibility, and
robustness. For these reasons, the characteristics of agents
are desirable for DDM systems.
Autonomy of the system: A DM agent here is considered
as a modular extension of a data management system to
deliberatively handle the access to the data source in
agreement with constraints on the required autonomy of the
system, data and model. This is in full compliance with the
paradigm of cooperative information systems [6].
Multi-strategy DDM: For some complex application
settings, an appropriate combination of multiple data mining
techniques may be more beneficial than applying just one.
DM agents may choose a technique depending on the type of
data retrieved from the different sites and the mining tasks
to be performed. Learning to select among multiple DM
methods is similar to the adaptive selection of coordination
strategies in a multi-agent system.
Collaborative DM: DM agents may operate independently
on data they have gathered at local repositories, and then
combine their respective patterns or they may agree to
share potential knowledge as it is discovered.
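The pattern-combination step described above can be illustrated with a minimal sketch. Each agent mines frequent items at its local repository and shares only the resulting patterns; the function names and the item-count representation are illustrative assumptions, not taken from any cited system:

```python
from collections import Counter

def local_frequent_items(transactions, min_support):
    """Mine locally: count item occurrences and keep those meeting local support."""
    counts = Counter(item for t in transactions for item in t)
    return {item: c for item, c in counts.items() if c / len(transactions) >= min_support}

def combine_patterns(local_results, total_transactions, min_support):
    """Combine per-site counts into globally frequent items.  Note the caveat:
    an item pruned at one site contributes zero, so borderline items may be missed."""
    merged = Counter()
    for patterns in local_results:
        merged.update(patterns)
    return {item: c for item, c in merged.items() if c / total_transactions >= min_support}

# Two sites mine independently, then share only their patterns, never raw data.
site1 = [["milk", "bread"], ["milk"], ["bread", "eggs"]]
site2 = [["milk", "eggs"], ["milk", "bread"]]
p1 = local_frequent_items(site1, 0.3)
p2 = local_frequent_items(site2, 0.3)
global_patterns = combine_patterns([p1, p2], len(site1) + len(site2), 0.5)
print(global_patterns)  # -> {'milk': 4, 'bread': 3}
```

Only the compact pattern dictionaries cross the network, which is the point of the collaborative approach.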
Scalability of DM to massive distributed data: To reduce
network and DM application server load, DM agents
migrate to each of the local data sites in a DDM system on
which they may perform mining tasks locally, and then
either return with or send relevant pre-selected patterns to
their originating server for further processing. Experiments
in using mobile information filtering agents in distributed
data environments are encouraging [9].
Security: Any agent-based DDM system has to cope with
the problem of ensuring data security and privacy. Any
mining operation performed by agents of a DDM system
lacking sound security architecture could be subject to
eavesdropping, data tampering, or denial of service attacks.
Agent code and data integrity is a crucial issue in secure
DDM: subverting or hijacking a DM agent turns a trusted
piece of (mobile) software into a threat. In addition, data integration or
aggregation in a DDM process introduces concern
regarding inference attacks as a potential security threat.
Moreover, any failure to implement least privilege at a data
source (that is, endowing subjects with only enough
permissions to discharge their duties) could give a
mining agent unsolicited access to sensitive data. Finally,
selective agent replications may help to prevent malicious
hosts from simply blocking or destroying the temporarily
residing DM agents.
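The least-privilege principle mentioned above can be sketched as a simple authorization check performed before a mining agent touches a data source; the permission table, agent identifiers, and table names below are hypothetical:

```python
# Hypothetical permission table: agent id -> set of tables it may mine.
PERMISSIONS = {
    "dm-agent-1": {"sales"},
    "dm-agent-2": {"sales", "customers"},
}

def authorize(agent_id, table):
    """Least privilege: grant access only if the table is explicitly allowed."""
    return table in PERMISSIONS.get(agent_id, set())

print(authorize("dm-agent-1", "sales"))      # -> True
print(authorize("dm-agent-1", "customers"))  # -> False (not endowed, so denied)
print(authorize("unknown-agent", "sales"))   # -> False (unregistered agents get nothing)
```

The default-deny behavior for unregistered agents is the essential property: absence of an explicit grant is treated as a refusal.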
Trustworthiness: Data mining agents may infer sensitive
information even from partial integration to a certain extent
and with some probability. This problem, known as the
inference problem, occurs especially in settings
where agents may access data sources across trust
boundaries which enable them to integrate implicit
knowledge from different sources using commonly held
rules of thumb.
Furthermore, the decentralization property fits well
with DDM requirements and helps to avoid security
threats. At each data repository, a mining strategy can be
deployed specifically for that repository's domain of data.
3. OPEN PROBLEMS STRATEGY
Several systems have been developed for distributed data
mining. These systems can be classified according to their
strategy into three types: central learning, meta-learning, and
hybrid learning.
3.1 Central learning strategy is when all the
data are gathered at a central site and a single model is
built. The only requirement is to be able to move the
data to a central location in order to merge them and then
apply sequential DM algorithms. This strategy is used
when the geographically distributed data are small. The
strategy is generally very expensive but also more accurate
[10]. The process of gathering data is in general not simply
a merging step; it depends on the original distribution.
Agent technology offers little advantage in this strategy.
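A minimal sketch of the central learning strategy follows; the trivial majority-class "model" stands in for a real sequential DM algorithm, and the tuple-based data layout is an illustrative assumption:

```python
def central_learning(sites):
    """Central strategy: ship all data to one site, merge, then run a single
    sequential algorithm over the merged set (here, a majority-class 'model')."""
    merged = [row for site in sites for row in site]   # the expensive transfer step
    labels = [label for _, label in merged]
    return max(set(labels), key=labels.count)          # one model over all the data

# Three geographically distributed sites, each holding (feature, label) rows.
sites = [[(1, "A"), (2, "A")], [(3, "B")], [(4, "A")]]
model = central_learning(sites)
print(model)  # prints A
```

The merge in the first line is exactly the step that dominates the cost of this strategy; everything after it is ordinary single-site mining.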
3.2 Meta-learning strategy offers a way to mine
classifiers from homogeneously distributed data. Meta-
learning follows three main steps: 1) generate base
classifiers at each site using a classifier learning algorithm;
2) collect the base classifiers at a central site, and
produce meta-level data from a separate validation set and
the predictions generated by the base classifiers on it; 3)
generate the final classifier (meta-classifier) from the meta-
level data via a combiner or an arbiter. Copies of the classifier
agent exist on, or are deployed to, the nodes of the network
being used. Perhaps the most mature agent-based
meta-learning systems are JAM [11] and BODHI
[11].
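The meta-learning steps above can be sketched in miniature. The threshold base classifiers and the majority-vote combiner are deliberately simplified stand-ins for the classifier learning algorithms and combiner/arbiter schemes used in systems such as JAM:

```python
def train_base(site_data):
    """Step 1: train a base classifier at each site.  Here the 'classifier' is a
    threshold rule placed midway between the negative and positive examples."""
    pos = [x for x, y in site_data if y == 1]
    neg = [x for x, y in site_data if y == 0]
    threshold = (min(pos) + max(neg)) / 2
    return lambda x: 1 if x >= threshold else 0

def meta_classify(base_classifiers, x):
    """Steps 2-3: collect base predictions (the meta-level data) and combine
    them with a simple majority-vote combiner."""
    votes = [clf(x) for clf in base_classifiers]
    return 1 if sum(votes) > len(votes) / 2 else 0

# Homogeneously distributed data at three sites: (feature, label) pairs.
sites = [
    [(1, 0), (2, 0), (8, 1), (9, 1)],
    [(0, 0), (3, 0), (7, 1), (10, 1)],
    [(2, 0), (4, 0), (6, 1), (9, 1)],
]
bases = [train_base(s) for s in sites]
print(meta_classify(bases, 8))  # -> 1
print(meta_classify(bases, 1))  # -> 0
```

Only the trained base classifiers travel to the central site; the raw site data never does, which is what distinguishes this from the central learning strategy.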
3.3 Hybrid learning strategy is a technique that
combines local and centralized learning for model building
[15]; for example, Papyrus [12] is designed to support both
learning strategies. In contrast to JAM and BODHI,
Papyrus can not only move models from site to site, but can
also move data when that strategy is desired. Papyrus is a
specialized system which is designed for clusters while
JAM and BODHI are designed for data classification. The
major criticism of such systems is that it is not always
possible to obtain an exact final result, i.e. the global
knowledge model obtained may be different from the one
obtained by applying the one model approach (if possible)
to the same data. Approximate results are not always a
major concern, but it is important to be aware of this.
Moreover, in these systems hardware resource usage is not
optimized. If the heavy computational part is always
executed locally to data, when the same data is accessed
concurrently, the benefits coming from the distributed
environment might vanish due to the possible strong
performance degradation. Another drawback is that
occasionally, these models are induced from databases that
have different schemas and hence are incompatible.
An autonomous agent can be treated as a computing
unit that performs multiple tasks based on a dynamic
configuration. The agent interprets the configuration and
generates an execution plan to complete the tasks. [7],
[14], [8], [6], and [9] discuss the benefits of deploying
agents in DDM systems. The nature of MAS is decentralization,
and therefore each agent has only a limited view of the
system. This limitation allows better security, as
agents do not need to observe irrelevant surroundings.
Agents can thus be programmed to be as compact as
possible, so that light-weight agents are transmitted
across the network rather than the data, which can be far
bulkier. Being able to transmit agents from one host to another
allows dynamic organization of the system. For
example, suppose mining agent ma, located at repository r1,
possesses algorithm alg1, and data mining task t1 at repository
r2 is instructed to mine the data using alg1. In this setting,
transmitting alg1 to r2 is preferable to transferring
all the data from r2 to r1, where alg1 is available.
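The example above can be sketched in code. The Repository class and its accept_agent method are illustrative assumptions, not part of any particular agent platform; the point is only that the algorithm travels while the data stays put:

```python
def alg1(rows):
    """A mining algorithm owned by repository r1 (here: count rows per label)."""
    out = {}
    for label in rows:
        out[label] = out.get(label, 0) + 1
    return out

class Repository:
    def __init__(self, name, data):
        self.name, self.data = name, data
    def accept_agent(self, algorithm):
        """Run the visiting agent's code locally; only the small result travels back."""
        return algorithm(self.data)

r2 = Repository("r2", ["spam", "ham", "spam", "spam"])
result = r2.accept_agent(alg1)   # alg1 travels to r2; r2's bulky data never leaves
print(result)  # -> {'spam': 3, 'ham': 1}
```

In a real mobile-agent system the function would be serialized and shipped over the network, but the bandwidth argument is the same: the code is small and the data is not.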
4. AGENT-BASED DISTRIBUTED
DATA MINING (ADDM)
ADDM is a novel data mining technique that inherits the
powerful properties of agents and, as a result, yields
desirable characteristics. In general, constructing an
ADDM system involves three key concerns:
interoperability, dynamic system configuration, and
performance. Interoperability covers the collaboration
of agents within the system as well as external interaction,
which allows new agents to enter the system seamlessly. The
architecture of the system must be open and flexible so that
it can support this interaction, including the communication
protocol, integration policy, and service directory.
The communication protocol covers message encoding,
encryption, and transport between agents. The integration
policy specifies how the system behaves when an external
component, such as an agent or a data site, requests to enter
or leave. Dynamic system configuration handles the
dynamic configuration of the system; it is a challenging
issue due to the complexity of the planning and mining
algorithms, since a mining task may involve several agents and
data sources, with agents equipped with an algorithm
and assigned to given data sets. In a distributed
environment, tasks can be executed in parallel, but in
exchange concurrency issues arise. Quality of service
control over both data mining performance and system
performance is desired; it can be derived from
both the data mining and agent fields.
Fig.4.1: ADDM System Architecture
An ADDM system can be generalized into a set
of components, as depicted in figure 4.1. We
may generalize the activities of the system into request and
response, each of which involves a different set of
components. The basic components of an ADDM system are as
follows.
Data: Data is the foundation layer of the architecture. In
distributed environment, data can be hosted in various
forms, such as online relational databases, data stream, web
pages, etc., in which purpose of the data might be varied.
Communication: The system chooses the related resources
from the directory service, which maintains a list of data
sources, mining algorithms, data schemas, data types, etc.
The communication protocols may vary depending on the
implementation of the system, e.g. client-server, peer-to-peer, etc.
User Interface: The user interface (UI) receives requests
from the user and responds to them. The interface
presents complex distributed systems through user-friendly
forms such as network diagrams, visual reporting tools,
etc. When a user requests data
mining through the UI, the following components are
involved.
Query optimization: A query optimizer analyses the
request to determine the type of mining task and chooses
proper resources for it. It also determines whether
it is possible to parallelize the tasks, since the data are
distributed and can be mined in parallel.
Discovery Plan: A planner allocates sub-tasks to related
resources. At this stage, mediating agents play an important
role in coordinating the multiple computing units, since
mining sub-tasks are performed asynchronously, as are the
results returned by those tasks. When a mining
task is carried out, the following components take part.
Local Knowledge Discovery (KD): In order to transform
data into patterns that adequately represent the data and
are reasonable to transfer over the network, the mining
process may take place locally at each data site, depending on
the individual implementation.
Knowledge Discovery: Also known as mining, it executes
the algorithm as required by the task to obtain knowledge
from the specified data source.
Knowledge Consolidation: In order to present the user
with a compact and meaningful mining result, it is
necessary to normalize the knowledge obtained from the
various sources. This component involves complex
methodologies for combining knowledge/patterns from
distributed sites. Consolidating homogeneous
knowledge/patterns is promising, yet it remains difficult in
the heterogeneous case.
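One simple consolidation scheme for homogeneous patterns is a support-weighted average of rule confidences across sites. The rule representation below (rule string mapped to a confidence and a backing transaction count) is an illustrative assumption:

```python
def consolidate(rule_sets):
    """Normalize rules mined at different sites: average each rule's confidence,
    weighted by how many transactions backed it at each site."""
    merged = {}
    for rules in rule_sets:
        for rule, (confidence, n) in rules.items():
            c, total = merged.get(rule, (0.0, 0))
            merged[rule] = (c + confidence * n, total + n)
    return {rule: round(c / n, 3) for rule, (c, n) in merged.items()}

# Each site reports (confidence, supporting transaction count) per rule.
site_a = {"bread->milk": (0.8, 100)}
site_b = {"bread->milk": (0.6, 300), "eggs->milk": (0.5, 200)}
print(consolidate([site_a, site_b]))  # -> {'bread->milk': 0.65, 'eggs->milk': 0.5}
```

Weighting by support prevents a small site with an extreme confidence value from dominating the consolidated result; this is one design choice among many, and heterogeneous schemas would require far more machinery.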
5. PROPOSED SCHEMA
Building and managing of large-scale distributed systems is
becoming an increasingly challenging task. Continuous
intervention by user administrators is generally limited in
large-scale distributed environments. System support is
also needed for configuration and reorganization when
systems evolve with the addition of new resources. The
primary goal of the management of distributed systems is
to ensure efficient use of resources and provide timely
service to users. Most distributed system
management techniques still follow the centralized model,
which is based on the client-server model. Centralization
presents several problems: 1) it can cause
traffic overload, and processing at the manager node may
degrade its performance; 2) it does not scale as
the complexity of the network increases; 3) a fault
in the central manager node can leave the system without a
manager.
One alternative model is distributed management, where
management tasks are spread across the managed
infrastructure and are carried out at the managed resources.
The goal is to minimize the network traffic related to
management and to speed up management tasks by
distributing operations across resources. The new trend in
distributed system management involves using multi-agents
to manage the resources of distributed systems. Agents
have the capability to autonomously travel (execution state
and code) among different data repositories to complete
their task. The route may be predetermined or chosen
dynamically depending on the results at each local data
repository.
The concept of multi-agents promises new ways
of designing applications that better use the resources and
services of computer systems and networks. For example,
moving a program (e.g., search engine) to a resource (e.g.,
database) can save a lot of bandwidth and can be an
enabling factor for applications that otherwise would not be
practical due to network latency.
Conceptually, a multi-agent can migrate its whole virtual
machine from host to host; it owns the code, not the
resources. Multi-agents are the basis of an emerging
technology that promises to make it very much easier to
design, implement, and maintain distributed systems. We
have found that multi-agents reduce network traffic,
provide an effective means of overcoming network latency,
and, perhaps most importantly, through their ability to
operate asynchronously and autonomously of the process
that created them help us to construct more robust and fault
tolerant systems. The purpose of the proposed multi-agent
system is to locate, monitor, and manage resources in
distributed systems. The system consists of a set of static
and mobile agents. Some of them reside in each node or
element of the distributed system. There are two mobile
agents, named the delegated and collector agents, that can move
through the distributed system. The role of each agent in
the multi-agent system, the interaction between agents, and
the operation of the system are described below.
5.1 Structure of Multi-Agent Systems
The multi-agent system structure assumes that each node in
the system will have a set of agents residing and running on
that node [1]. These agent types are the following:
Client agent (CA) receives service requests initiated by the
user. The CA may receive the request
from the local user directly; otherwise, it receives
the request from an exporter agent coming from another
node.
Service list agent (SLA) has a list of the resource agents in
the system. This agent will receive the request from the CA
and send it to the resource availability agent. If the reply
indicates that the requested resource is local then the
service list agent will deliver the request to the categorizer
agent. Otherwise, it will return the request to the CA.
Resource availability agent (RAA) indicates whether the
requested resource is free and available for use or not. It
also indicates whether the requested resource is local or
remote. It receives the request from the service list agent
and checks the status of the requested resource through the
access of the MIB. The agent then constructs the reply
depending on the retrieved information from the database.
Resource agent (RSA) is responsible for the operation and
control of the resource. This agent executes the requested
service on the resource. Each node may have zero or more RSAs.
Router agent (RA) provides the path of the requested
resource on the network in case of accessing remote
resources. Before being dispatched, the exporter agent will
ask the router agent for the path of the requested resource.
This in turn delivers it to the exporter agent.
Categorizer agent (CZA) allocates a suitable resource
agent to perform the user request. This agent receives
input from the service list agent. It then tries to
find a suitable free resource agent to perform the requested
service.
Exporter agent (EA) is a mobile agent that can carry the
user request through the path identified by the RA to reach
the node that has the required resource. It passes the
requested resource id to the RA and then receives the reply.
If the router agent has no information about the requested
resource, the EA will try to locate the resource in the
system. There are also two additional mobile agent types
exist in the system.
Delegated agent (DA) is a mobile agent that is launched in
each sub network. It is responsible for traversing the sub
network nodes instead of the exporter agent, doing the
required task and carrying results back to the exporter agent.
Collector agent (CTA) is a mobile agent that is launched
from the last sub network visited by the exporter agent. It is
launched when results from that sub network become
available. This agent traverses the reversed itinerary of
the exporter agent's trip. The CTA collects results from the
delegated agents and carries them to the source node. All
mobile agents used here are of the interrupt-driven type.
5.2 Functionality of the System
The activity cycle of the multi-agent system begins at a
local data repository. The client agent receives the service
requests either from the user or from an exporter agent. The
client agent then asks the service list agent for the existence
of a resource agent that can perform the request. The
service list agent checks the availability of the required
resource agent by consulting a resource availability agent to
perform the requested service. The reply of the resource
availability agent describes whether or not the resource is
locally available and whether or not there is a resource
agent that can perform the requested service. If the resource
availability agent accepts the request, the service list agent
will ask the categorizer agent to allocate a suitable resource
agent to the requested service and the resource agent will
perform the requested service. Otherwise, the service list
agent informs the client agent of the rejection, and the request is
passed to the exporter agent. The exporter agent asks the
router agent for the path of the required resource agent.
Once the path is determined, the exporter agent will be
dispatched through the network channel to the destination
node identified by that path. If the router agent has no
information about the location of the required resource
agent, the exporter agent will search the distributed system
to find the location of the required resource agent and
assign the required task to it.
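The local request-handling flow described above can be condensed into a minimal sketch. The single class and string return values are illustrative only; in the proposed system the CA, SLA, RAA, and CZA are separate communicating agents rather than one object:

```python
class Node:
    """Minimal sketch of the local agent pipeline at one node (assumed names)."""
    def __init__(self, local_resources):
        self.local_resources = local_resources   # resource id -> locally free?

    def handle_request(self, resource_id):
        # SLA consults RAA: is the resource local and available?
        if self.local_resources.get(resource_id):
            return f"served {resource_id} locally"   # CZA allocates a free RSA
        # Otherwise the rejection flows back to the CA and on to the EA,
        # which will ask the RA for a path to a remote resource.
        return f"export {resource_id}"

node = Node({"printer": True, "db": False})
print(node.handle_request("printer"))  # -> served printer locally
print(node.handle_request("db"))       # -> export db (local but busy)
print(node.handle_request("scanner"))  # -> export scanner (unknown here)
```

The essential decision is the same as in the text: a request is either satisfied by a free local resource agent or handed to the exporter agent for remote resolution.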
As shown in Fig. 5.2.2, the exporter agent traverses
the sub networks of the distributed system during its trip.
At each sub network, a delegated agent is launched to
traverse the local nodes of that sub network, doing the
required task and carrying the results of that task.
Fig. 5.2.1: Agents activity.
The agents of the social interface described in Fig. 5.2.1 are
implemented at each node in the system. There are two
approaches to collect results of the required task and send
these results back to the source.
In traditional agent-based management systems that use
mobile agent, the exporter agent will wait at each visited
sub network until the delegated agent finishes its work and
obtains results. Then, the exporter agent will take these
results and go to the next sub network in its itinerary. The
exporter agent will return to its home sub network after
visiting all the sub networks determined in the itinerary.
The home sub network of the exporter agent is the sub
network from which it was initially dispatched. The waiting
of the exporter agent prevents the execution of tasks from being
started in the other sub networks. This approach is used in
most of the previously developed management systems in
which operation is based on mobile agents. In the proposed
multi-agent management system, the exporter agent does
not wait for results from each sub network. It resumes its
trip visiting other sub networks, and at each sub network,
another delegated mobile agent is launched to carry out
management tasks instead of the exporter agent. The
exporter agent will be killed at the last visited sub network
in its itinerary. When results from the last visited sub
network become available, another mobile agent called
collector agent is launched or dispatched from this sub
network to collect results from it and other sub networks.
The collector agent goes through the reversed itinerary of
the exporter agent trip carrying results to the home sub
network. In this manner, operations can be done in a
parallel fashion at different sub networks because there is
no delay of the task submission to local data repositories of
these sub networks.
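The difference between the waiting and non-waiting approaches can be illustrated with a highly simplified timing model. The unit hop cost, the task durations, and the assumption that delegated agents start as soon as the exporter arrives are all arbitrary modeling assumptions, not measurements from the prototype:

```python
def sequential_time(task_times, hop=1):
    """Traditional approach: the exporter waits at each sub network for results,
    so travel and task times simply add up (plus the final hop home)."""
    return sum(t + hop for t in task_times) + hop

def parallel_time(task_times, hop=1):
    """Proposed approach: delegated agents run concurrently while the exporter
    keeps moving; the collector then retraces the route.  Total time is
    dominated by travel plus the slowest sub network, not the sum of tasks."""
    n = len(task_times)
    # Delegated agent i starts after (i + 1) hops; collector needs n hops back.
    finish = max((i + 1) * hop + t for i, t in enumerate(task_times))
    return finish + n * hop

tasks = [5, 7, 6]              # per-sub-network task durations (arbitrary units)
print(sequential_time(tasks))  # -> 22
print(parallel_time(tasks))    # -> 12
```

Even in this toy model the parallel scheme wins whenever the task durations are comparable, because the sum in the sequential case is replaced by a maximum.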
Fig. 5.2.2: Network architecture of ADDM.
6. CONCLUSION
Distributed management for distributed systems is
becoming a reality due to the rapidly growing trend in
internetworking and rapidly expanding connectivity. This
paper describes a new multi-agent system for the
management of distributed systems. The system is
proposed to optimize the execution of management
functions in distributed systems. The proposed system can
locate, monitor, and manage resources in the system. The
new technique in that system allows management tasks to
be submitted to sub networks of the distributed system and
executed in a parallel fashion. The proposed system uses
two multi-agents. The first is used to submit tasks to the
sub networks of the distributed system and the other
collects results from these sub networks. The proposed
system is compared against traditional management
techniques in terms of response time, speedup, and
efficiency. A prototype has been implemented using
performance management as the case study. The
performance results indicate a significant improvement in
response time, speedup, efficiency, and scalability
compared to traditional techniques. The use of JVM in the
implementation of the proposed system gives the system a
certain type of portability. Therefore, it is desirable to use
the proposed system in the management of distributed
systems. The proposed system is limited to
high-speed networks with a bandwidth of 100 Mb/s or
more. Also, the system cannot operate when a failure occurs.
Future research will be related to the security of mobile
agents and of hosts that receive them in the context of
public networks. Mobile agents should be protected against
potentially malicious hosts. The hosts should also be
protected against malicious actions that may be performed
by the mobile code they receive and execute. So, a detailed
design and implementation of the whole secure system
should be considered as a future work. Also, the high
complexity of distributed systems could increase the
potential for system faults. Most of the existing
management systems assume that there is no fault in the
system. It would be interesting to develop a fault tolerant
management system that introduces safety in the system
and attempts to maximize the system reliability without
extra hardware cost.
7. REFERENCES
[1] T. C. Du, E. Y. Li, and A. Chang, "Mobile agents in distributed network management," Commun. ACM, vol. 46, no. 7, pp. 127–132, July 2003.
[2] H. Ku, G. W. R. Ludere, and B. Subbiah, "An intelligent mobile agent framework for distributed network management," in Proc. Globecom '97, Phoenix, pp. 160–164.
[3] N. R. Jennings and S. Bussmann, "Agent-based control systems—Why are they suited to engineering complex systems?" IEEE Control Syst. Mag., vol. 23, no. 3, pp. 61–73, Jun. 2003.
[4] R. B. Patel and Neeraj Goel, "Mobile Agents in Heterogeneous Networks: A Look on Performance," Journal of Computer Science, 2(11): 824–834, 2006.
[5] G. M. P. O'Hare, D. Marsh, A. Ruzzelli, and R. Tynan, "Agents for Wireless Sensor Network Power Management," in Proceedings of the International Workshop on Wireless and Sensor Networks (WSNET-05), Oslo, Norway, IEEE Press, 2005.
[6] S. Bailey, R. Grossman, H. Sivakumar, and A. Turinsky, "Papyrus: a system for data mining over local and wide area clusters and super-clusters," in Supercomputing '99: Proceedings of the 1999 ACM/IEEE Conference on Supercomputing (CDROM), page 63, New York, NY, USA, 1999. ACM.
[7] R. J. Bayardo, W. Bohrer, R. Brice, A. Cichocki, J. Fowler, A. Helal, V. Kashyap, T. Ksiezyk, G. Martin, M. Nodine, and others, "InfoSleuth: agent-based semantic integration of information in open and dynamic environments," ACM SIGMOD Record, 26(2):195–206, 1997.
[8] F. Bergenti, M. P. Gleizes, and F. Zambonelli, Methodologies and Software Engineering for Agent Systems: The Agent-Oriented Software Engineering Handbook. Kluwer Academic Publishers, 2004.
[9] R. Bose and V. Sugumaran, "IDM: an intelligent software agent based data mining environment," 1998 IEEE International Conference on Systems, Man, and Cybernetics, 3, 1998.
[10] J. Dasilva, C. Giannella, R. Bhargava, H. Kargupta, and M. Klusch, "Distributed data mining and agents," Engineering Applications of Artificial Intelligence, 18(7):791–807, October 2005.
[11] S. Datta, K. Bhaduri, C. Giannella, R. Wolff, and H. Kargupta, "Distributed data mining in peer-to-peer networks," Internet Computing, IEEE, 10(4):18–26, 2006.
[12] U. Fayyad, R. Uthurusamy, and others, "Data mining and knowledge discovery in databases," Communications of the ACM, 39(11):24–26, 1996.
[13] Vladimir Gorodetsky, Oleg Karsaev, and Vladimir Samoilov, "Multi-agent technology for distributed data mining and classification," in IAT, pages 438–441. IEEE Computer Society, 2003.
[14] H. Van Dyke Parunak and Sven A. Brueckner, "Engineering swarming systems," Methodologies and Software Engineering for Agent Systems, pages 341–376, 2004.
[15] W. Davies and P. Edwards, "Distributed Learning: An Agent-Based Approach to Data-Mining," in Proceedings of Machine Learning 95 Workshop on Agents that Learn from Other Agents, 1995.