This paper proposes a power-management scheme for a microgrid using a hybrid distributed generator based on photovoltaic (PV) and wind-driven PMDC sources together with an energy storage system. In this generator, the sources are connected to the grid through a single interleaved boost converter followed by an inverter; compared with earlier schemes, the proposed scheme therefore needs fewer power converters. Fuzzy-logic-based MPPT controllers are also proposed for the new hybrid scheme to separately drive the interleaved DC-DC converter and the inverter so that maximum power is tracked from both sources. The integrated operation of the proposed controllers under different conditions is demonstrated through simulation in MATLAB.
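The fuzzy MPPT controller itself is not detailed in this abstract, so as a stand-in the sketch below implements plain perturb-and-observe hill climbing, the baseline technique that fuzzy MPPT refines with rule-based step sizing; the toy PV curve and all names are assumptions for illustration, not the authors' code.

```python
# Minimal perturb-and-observe MPPT sketch (an illustrative stand-in for
# the paper's fuzzy MPPT controller; the toy PV curve is an assumption).

def pv_power(duty: float) -> float:
    """Toy PV power curve with a single maximum near duty = 0.6."""
    return max(0.0, 100.0 * duty * (1.2 - duty))

def run_mppt(steps: int = 200, step: float = 0.01):
    """Hill-climb the duty cycle: keep perturbing in one direction and
    reverse whenever the last perturbation reduced the output power."""
    duty, direction = 0.3, 1.0
    prev_power = pv_power(duty)
    for _ in range(steps):
        duty = min(0.95, max(0.05, duty + direction * step))
        power = pv_power(duty)
        if power < prev_power:   # the perturbation hurt: reverse direction
            direction = -direction
        prev_power = power
    return duty, prev_power

duty, power = run_mppt()
print(f"duty settles near {duty:.2f}, power near {power:.1f} W")  # ~0.60, ~36.0
```

A fuzzy controller replaces the fixed step with a rule base over the power and voltage changes, which reduces the steady-state oscillation this fixed-step loop exhibits around the maximum.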
Elimination of data redundancy before persisting into DBMS using SVM classification (Nalini Manogaran)
Database management systems (DBMS) are one of the growing fields in the computing world. Grid computing, internet sharing, distributed computing, parallel processing, and the cloud all store huge amounts of data in a DBMS to maintain the structure of the data. Memory management is a major concern in a DBMS because of the edit, delete, recover, and commit operations performed on records. To utilize memory efficiently, redundant data should be eliminated accurately. In this paper, redundant data is detected by the Quick Search Bad Character (QSBC) function and reported to the database administrator for removal. The QSBC function compares the data against patterns taken from an index table, built over all data persisted in the DBMS, to make comparison of redundant (duplicate) records easy. The experiment is carried out in SQL Server on a university student database containing records of 15,000 students involved in various activities, and performance is evaluated in terms of time and accuracy.
Keywords—Data redundancy, Data Base Management System,
Support Vector Machine, Data Duplicate.
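The abstract invokes the Quick Search Bad Character (QSBC) function without defining it. As background, the sketch below shows Sunday's Quick Search string matching, whose bad-character shift table is the standard construction that a QSBC-style comparison builds on; this is a generic illustration, not the paper's implementation.

```python
def qsbc_table(pattern: str) -> dict:
    """Quick Search bad-character table: shift distance keyed by character.
    For a pattern of length m, a character's shift is m minus its rightmost
    index in the pattern; characters absent from the pattern shift m + 1."""
    m = len(pattern)
    return {ch: m - i for i, ch in enumerate(pattern)}  # last index wins

def quick_search(text: str, pattern: str) -> list:
    """Return all start offsets where pattern occurs in text."""
    m, n, table = len(pattern), len(text), qsbc_table(pattern)
    hits, j = [], 0
    while j <= n - m:
        if text[j:j + m] == pattern:
            hits.append(j)
        if j + m >= n:            # no character after the window remains
            break
        # Shift by the table entry of the character just after the window.
        j += table.get(text[j + m], m + 1)
    return hits

print(quick_search("GCATCGCAGAGAGTATACAGTACG", "GCAGAGAG"))  # [5]
```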
I. INTRODUCTION
The growing mass of information available in digital media has become a pressing problem for data administrators. Data repositories, such as those used by digital libraries and e-commerce brokers, are usually built on data gathered from distinct sources and therefore contain records with disparate schemata and structures. Problems regarding low response time, availability, security, and quality assurance also become more troublesome to manage as the amount of data grows larger. It is reasonable to say that the quality of the data an organization uses in its systems is proportional to its efficiency in offering beneficial services to its users. In this environment, the decision to maintain repositories with "dirty" data (i.e., with replicas, identification errors, equal patterns, etc.) goes greatly beyond technical discussions such as the overall speed or performance of data administration systems.
Study on potential capabilities of a NoDB system (ijitjournal)
There is a need for an optimal data-to-query processing technique to handle increasing database size, complexity, and diversity of use. With the rise of commercial websites and social networks, the expectation is that highly scalable, more flexible databases will replace the RDBMS. Complex applications and Big Table workloads require highly optimized queries, and users face increasing bottlenecks in their data analysis. A growing part of the database community recognizes the need for significant and fundamental changes to database design. A new philosophy for building database systems, called NoDB, aims at minimizing the data-to-query time, most prominently by removing the need to load data before launching queries: queries are processed without any data preparation or loading step, and there may be no need to store the data at all, since users can pipe raw data from websites, databases, and Excel sheets into the system without storing anything. This study is based on PostgreSQL. A series of baseline experiments is executed to evaluate the performance of such a system in terms of (a) data loading cost, (b) query processing time, (c) avoidance of collision and deadlock, (d) support for big data storage, and (e) query optimization. The study found significant potential capabilities of the NoDB approach over a traditional database management system.
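The core NoDB idea, answering queries directly over raw files with no load step, can be illustrated by the small sketch below, which streams a CSV file and evaluates a filter on the fly. The file name and schema are assumptions for illustration; real NoDB prototypes add positional indexes and adaptive caching on top of this basic in-situ access pattern.

```python
import csv

def query_raw_csv(path, predicate, columns):
    """Answer a projection+selection query directly over a raw CSV file,
    with no prior load step: rows are parsed only when the query runs."""
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            if predicate(row):
                yield {c: row[c] for c in columns}

# Hypothetical usage: SELECT name, grade FROM students.csv WHERE grade = 'A'
for rec in query_raw_csv("students.csv",
                         lambda r: r["grade"] == "A",
                         ["name", "grade"]):
    print(rec)
```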
QUERY OPTIMIZATION IN OODBMS: IDENTIFYING SUBQUERY FOR COMPLEX QUERY MANAGEMENT (csandit)
This paper is based on a relatively new approach to query optimization in object databases, which uses query decomposition and cached query results to improve the execution of a query. The issues focused on here are fast retrieval and high reuse of cached queries: complex queries are decomposed into smaller subqueries so that results can be retrieved quickly. We also try to address another open area of query caching, handling wider queries, by using parts of cached results to answer other (wider) queries and by combining many cached queries while producing the result. Multiple experiments were performed to demonstrate the productivity of this way of optimizing a query. The limitation of the technique is that it is useful mainly in scenarios where the data manipulation rate is very low compared to the data retrieval rate.
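The decomposition-plus-caching strategy described above can be sketched as follows: a complex query is split into subqueries, each subquery's result is memoized, and later queries reuse any cached pieces. The query representation and helper names here are assumptions for illustration only, not the paper's design.

```python
# Sketch of subquery decomposition with a result cache (illustrative only).
subquery_cache = {}

def run_subquery(subquery, execute):
    """Return a cached result if this subquery was seen before,
    otherwise execute it and cache the result for later reuse."""
    if subquery not in subquery_cache:
        subquery_cache[subquery] = execute(subquery)
    return subquery_cache[subquery]

def run_complex_query(subqueries, execute, combine):
    """Decompose a complex query into subqueries, answer each from the
    cache where possible, and combine the partial results."""
    return combine([run_subquery(q, execute) for q in subqueries])

# Hypothetical usage: intersect two cached selections of object ids.
execute = lambda q: {1, 2, 3} if "dept" in q else {2, 3, 4}
result = run_complex_query(["dept='CS'", "year=2015"], execute,
                           lambda parts: set.intersection(*parts))
print(result)  # {2, 3}; rerunning either subquery now hits the cache
```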
BI-TEMPORAL IMPLEMENTATION IN RELATIONAL DATABASE MANAGEMENT SYSTEMS: MS SQL SERVER (lyn kurian)
Traditional database management systems (DBMS) are the computational storage and reservoir of large amounts of information. The data accumulated by these database systems is the information valid at the present time: the data that is true now. Past data is the information that was kept in the database at an earlier time, data held to have existed in the past and valid at some point before now. Future data is the information supposed to be valid at a future time instance, data that will be true and valid at some point after now. The commercial DBMS of today used by organizations and individuals, such as MS SQL Server, Oracle, DB2, Sybase, Postgres, etc., do not provide models to support and process (retrieve, modify, insert, and remove) past and future data.
The implementation of bi-temporal modelling in Microsoft SQL Server is important for understanding how a relational database management system handles the bi-temporal property of data. In a bi-temporal database, saved data is never deleted; additional values are always appended. The paper therefore explores one of the ways we can build bi-temporal handling of data. It aims to cover the core concepts of bi-temporal data storage and the querying techniques used in a bi-temporal relational DBMS, from data structures to normalized storage and on to extraction or slicing of data.
The unlimited growth of data causes relational data to become complicated in terms of management and storage. Thus, developers working on commercial and industrial applications should know how bi-temporal concepts apply to relational databases, especially given the increased flexibility they bring to bi-temporal storage and to analyzing data. Thereby, the paper demonstrates how bi-temporal data structures and their operations are applied in a relational database management system.
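The append-only, never-delete discipline described above can be made concrete with a small sketch: every fact carries a valid-time interval (when it is true in the real world) and a transaction-time stamp (when the database learned it), and an update appends a new version instead of overwriting. The field names and in-memory representation are assumptions; the paper implements the same idea inside SQL Server tables.

```python
from datetime import date, datetime

# Append-only bi-temporal store: each row carries a valid-time interval
# (valid_from, valid_to) plus the transaction time it was recorded at.
rows = []

def assert_fact(key, value, valid_from, valid_to=date.max):
    """Record a new version of a fact; nothing is ever deleted."""
    rows.append({"key": key, "value": value,
                 "valid_from": valid_from, "valid_to": valid_to,
                 "recorded_at": datetime.now()})

def as_of(key, valid_on, known_by):
    """Time-slice query: what value was valid on `valid_on`,
    according to everything recorded up to `known_by`?"""
    candidates = [r for r in rows
                  if r["key"] == key
                  and r["valid_from"] <= valid_on < r["valid_to"]
                  and r["recorded_at"] <= known_by]
    return candidates[-1]["value"] if candidates else None  # latest recording wins

assert_fact("salary:alice", 50000, date(2015, 1, 1))
assert_fact("salary:alice", 55000, date(2016, 1, 1))        # raise, appended
print(as_of("salary:alice", date(2015, 6, 1), datetime.now()))  # 50000
```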
Query optimization in OODBMS: identifying subquery for query management (ijdms)
This paper is based on a relatively new approach to query optimization in object databases, which uses query decomposition and cached query results to improve the execution of a query. The issues focused on here are fast retrieval and high reuse of cached queries: complex queries are decomposed into smaller subqueries so that results can be retrieved quickly. We also try to address another open area of query caching, handling wider queries, by using parts of cached results to answer other (wider) queries and by combining many cached queries while producing the result. Multiple experiments were performed to demonstrate the productivity of this way of optimizing a query. The limitation of the technique is that it is useful mainly in scenarios where the data manipulation rate is very low compared to the data retrieval rate.
A database is generally used for storing related, structured data, w.pdf (angelfashions02)
A database is generally used for storing related, structured data, with well-defined data formats, in an efficient manner for insert, update, and/or retrieval (depending on the application).
On the other hand, a file system is a more unstructured data store for storing arbitrary, probably
unrelated data. The file system is more general, and databases are built on top of the general data
storage services provided by file systems.
A Database Management System (DBMS) is system software for easy, efficient, and reliable data processing and management. It can be used for:
Creation of a database.
Retrieval of information from the database.
Updating the database.
Managing a database.
It provides many functionalities and is more advantageous than the traditional file system in the ways listed below:
1) Processing Queries and Object Management:
In traditional file systems, we cannot store data in the form of objects. In real-world applications, data is stored in objects, not files, so with a file system some application software must map the data stored in files to objects before it can be used further. In a database management system, we can directly store data in the form of objects. With a file system, application-level code needs to be written to handle, store, and scan through the data, whereas a DBMS gives us the ability to query the database.
2) Controlling redundancy and inconsistency:
Redundancy refers to repeated instances of the same data. A database system provides redundancy control, whereas in a file system the same data may be stored multiple times. For example, if a student is enrolled in two different educational programs in the same college, say Engineering and History, then information such as the phone number and address may be stored multiple times, once in the Engineering department and once in the History department. This increases the time taken to access and store data, and may also lead to inconsistent data states between the two copies. A DBMS uses data normalization to avoid redundancy and duplicates.
3) Efficient memory management and indexing:
A DBMS makes complex memory management easy to handle. In file systems, files are indexed in place of objects, so query operations require entire file scans, whereas in a DBMS, object indexing takes place efficiently through the database schema, based on any attribute of the data or a data property. This enables fast retrieval of data based on the indexed attribute.
4) Concurrency control and transaction management:
Several applications allow users to access data simultaneously, which may lead to inconsistency if plain files are used. Consider two withdrawal transactions X and Y in which amounts of 100 and 200 are withdrawn from an account A initially containing 1000. Since these transactions take place simultaneously, they may update the account inconsistently: X reads 1000, debits 100, and updates A to 900, while Y also reads 1000, debits 200, and updates A to 800. In both cases an update is lost; serial execution would leave A at 700. A DBMS prevents such lost updates through concurrency control and transaction management.
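The withdrawal example above is the classic lost-update anomaly. The sketch below reproduces it and shows how serializing the read-modify-write with a lock restores the expected balance of 700; it is a generic illustration of the concurrency-control idea, not tied to any particular DBMS.

```python
import threading, time

balance = 1000
lock = threading.Lock()

def withdraw(amount, use_lock):
    """Read-modify-write on the shared balance; without the lock the
    two reads interleave and one update overwrites the other."""
    global balance
    if use_lock:
        with lock:
            balance -= amount
        return
    read = balance               # both threads read 1000 here
    time.sleep(0.01)             # widen the race window for the demo
    balance = read - amount      # last writer wins: 900 or 800, not 700

for use_lock in (False, True):
    balance = 1000
    threads = [threading.Thread(target=withdraw, args=(a, use_lock))
               for a in (100, 200)]
    for t in threads: t.start()
    for t in threads: t.join()
    print("with lock" if use_lock else "no lock", "->", balance)
# The locked run always prints 700; the unlocked run loses an update
# and prints 900 or 800, depending on which thread writes last.
```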
Data Warehouse System in Cloud Environment (IJERA Editor)
To reduce the cost of data warehouse deployment, virtualization is very important: it can reduce cost as well as the tremendous pressure of managing devices, storage servers, application models, and manpower. At present, the data warehouse is an effective and important concept that can have a large impact on decision support systems in an organization. Compared with a database system, a data warehouse system takes a large amount of time, cost, and effort to deploy and develop as an in-house system. For this reason, people now consider cloud computing as a solution to the problem instead of implementing their own data warehouse system. This paper discusses how a cloud environment can be established as an alternative to an in-house data warehouse system, and offers guidance on the better environment choice for organizational needs. The organizational data warehouse and EC2 (Elastic Compute Cloud) are discussed against different parameters such as ROI, security, scalability, robustness of data, and maintenance of the system.
On multi dimensional cubes of census data: designing and querying (Jaspreet Issaj)
The primary focus of this research is to design a data warehouse that specifically targets OLAP storage, analysis, and querying requirements for multidimensional cubes of census data in an efficient and timely manner.
Enhancement techniques for data warehouse staging area (IJDKP)
Poor performance can turn a successful data warehousing project into a failure. Consequently, several attempts have been made by various researchers to deal with the problem of scheduling the Extract-Transform-Load (ETL) process. In this paper we therefore present several approaches for enhancing the data warehousing extract, transform, and load stages. We focus on enhancing the performance of the extract and transform phases by proposing two algorithms that reduce the time needed in each phase through employing the hidden semantic information in the data. Using the semantic information, a large volume of useless data can be pruned at an early design stage. We also focus on the problem of scheduling the execution of ETL activities, with the goal of minimizing ETL execution time, and investigate three scheduling techniques for ETL. Finally, we experimentally show their behavior in terms of execution time in the sales domain, to understand the impact of implementing each of them and to choose the one leading to the maximum performance enhancement.
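The idea of pruning useless rows as early as possible in the ETL flow can be illustrated with a minimal pipeline sketch: a cheap semantic filter runs inside the extract step, so the transform and load steps never see rows that could not contribute. The filter rule and record layout are assumptions for illustration; the paper's algorithms derive such filters from semantic information in the data.

```python
# Minimal ETL sketch with early semantic pruning (illustrative only).

def extract(rows, keep):
    """Extract step: apply the cheap semantic filter immediately so
    useless rows never reach the transform and load phases."""
    return (r for r in rows if keep(r))

def transform(rows):
    """Transform step: normalize the records that survived pruning."""
    return ({"region": r["region"].upper(), "amount": float(r["amount"])}
            for r in rows)

def load(rows, warehouse):
    """Load step: append the transformed rows to the warehouse table."""
    warehouse.extend(rows)

source = [{"region": "eu", "amount": "10.5"},
          {"region": "eu", "amount": "0"},    # semantically useless: no sale
          {"region": "us", "amount": "7.25"}]
warehouse = []
load(transform(extract(source, keep=lambda r: float(r["amount"]) > 0)),
     warehouse)
print(warehouse)  # only the two non-zero sales rows were transformed and loaded
```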
DATABASE SYSTEMS PERFORMANCE EVALUATION FOR IOT APPLICATIONS (ijdms)
ABSTRACT
The amount of data stored in IoT databases increases as IoT applications extend throughout smart city appliances, industry, and agriculture. Contemporary database systems must process huge amounts of sensor and actuator data in real time or interactively. Facing this first wave of the IoT revolution, database vendors struggle day by day to gain more market share, develop new capabilities, and overcome the disadvantages of previous releases, while providing features for the IoT.
There are two popular database types, relational database management systems and NoSQL databases, with NoSQL gaining ground for IoT data storage. In this paper these two types are examined. Focusing on open source databases, the authors experiment on IoT data sets and answer the question of which one performs better, in a comparative study of the open source databases commonly used in the market, presenting results for the NoSQL MongoDB database and the SQL databases MySQL and PostgreSQL.
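A comparison like this typically times the same workload against each engine. The sketch below shows a minimal timing harness of that shape, with the driver-specific insert left as a pluggable callable so the harness is not tied to MongoDB, MySQL, or PostgreSQL; the record layout, counts, and batch size are assumptions.

```python
import time

def benchmark_inserts(insert_batch, n=10_000, batch=1_000):
    """Time how long an engine takes to insert n synthetic IoT readings,
    sent in batches; insert_batch is a driver-specific callable."""
    readings = [{"sensor_id": i % 50, "value": 20.0 + i % 7, "ts": i}
                for i in range(n)]
    start = time.perf_counter()
    for off in range(0, n, batch):
        insert_batch(readings[off:off + batch])
    return time.perf_counter() - start

# Stand-in engine: an in-memory list instead of a real driver (a MongoDB
# version might pass collection.insert_many here, for example).
store = []
elapsed = benchmark_inserts(store.extend)
print(f"inserted {len(store)} readings in {elapsed:.3f}s")
```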
A Survey of Ontology-based Information Extraction for Social Media Content An... (ijcnes)
The amount of information generated on the Web has grown enormously over the years. This information is significant to individuals, businesses, and organizations: if analyzed, understood, and utilized, it provides valuable insight to its stakeholders. However, much of this information is semi-structured or unstructured, which makes it difficult to draw an in-depth understanding of the implications behind it. This is where Ontology-based Information Extraction (OBIE) and social media content analysis come into play. OBIE has become a popular way to extract information from machine-readable sources. This paper presents a survey of OBIE, ontology languages and tools, and the process of building an ontology model and framework. The author compares two ontology building frameworks and identifies which framework is more complete.
Economic Growth of Information Technology (IT) Industry on the Indian Economy (ijcnes)
Information Technology (IT) is an important emerging sector of the Indian economy. IT in India is an industry comprising two noteworthy segments: IT services and business process outsourcing (BPO). The sector has expanded its contribution to India's GDP from 1.2% in 1998 to 9.3% in 2015. According to NASSCOM, the sector generated revenues of US$147 billion in 2015, with export revenue standing at US$99 billion and domestic revenue at US$48 billion, growing by more than 13%. India's present Prime Minister Narendra Modi has started a venture called "Digital India" to help secure IT a position both inside and outside of India. The IT sector has served as fertile ground for the growth of a new entrepreneurial class with innovative corporate practices, and has been instrumental in reversing the brain drain, raising India's brand equity, and attracting foreign direct investment (FDI), leading to other associated benefits. The size of this sector has increased at a tremendous rate of 35% per year during the last 10 years. This paper examines India's growth in the IT industry and also studies the impact of IT on the Indian economy.
More Related Content
Similar to Power Management in Micro grid Using Hybrid Energy Storage System
An analysis of Mobile Learning Implementation in Shinas College of Technology... (ijcnes)
In the past decade, technology has grown exponentially; the speed of the Internet and mobile technology in particular seem to have reached their peak. This technological advancement has had an impact on all areas, especially the education sector. Researchers have been interested in investigating how these technologies can be exploited for educational purposes, aiming to enhance learning experiences. This has prompted a research trend commonly referred to as Mobile Learning (M-Learning), in which researchers aim to deliver fitting learning experiences to learners, considering their flexibility needs, the ubiquity of portable devices, and the accessibility of information anytime, anywhere. Nevertheless, m-learning is still in its infancy, and great efforts are needed to explore the possibilities of an educational paradigm shift from conventional one-size-fits-all approaches to adaptive and personalized learning delivered via mobile innovations. This paper presents the suitability of and need for a mobile learning facility at Shinas College of Technology (SHCT), and also presents a framework for implementing m-learning at SHCT.
A Survey on the Security Issues of Software Defined Networking Tool in Cloud ... (ijcnes)
The advent of the digital age has led to a rise in different types of data with every passing day; in fact, it is expected that half of the total data used around the world will soon be on the cloud. This complex data needs to be stored, processed, and analyzed to gain information that can be used by many organizations. Cloud computing provides an appropriate platform for Software Defined Networking (SDN) in the communication and computing requirements of the latter, which makes cloud-based networking a viable research field in the current scenario. However, several issues must be addressed and risks mitigated in the L2 cloud server, with virtual networks and cloud federation being considered in network virtualization over the L3 cloud router. This work explores the existing research challenges and discusses open issues for security in cloud computing and its uses in the relevant field, by means of a comparative analysis of the L2 server and L3 router based on SDN tools. An analysis of such issues is discussed and summarized, and finally the best tool for cloud security is identified.
We briefly discuss e-government, which is about completing transactions between the government and the public through the internet. First, we write about the three sectors of e-government, which are between government and (government, citizens, business). Second, we write about the benefits that users can get from using e-government. Third, we write about the challenges that e-government faces.
Holistic Forecasting of Onset of Diabetes through Data Mining Techniques (ijcnes)
Diabetes is one of the modern-day diseases that poses a serious threat to the affected and is ever challenging for the physicians involved in its management and control. Type 2 diabetes mellitus is increasing at an exponential rate day by day. Unawareness of the facts and causes that can lead to the condition, unawareness of diabetic symptoms, and late detection make the diabetic condition unmanageable and a challenging task for all victims. This paper suggests holistic measures and means by which any common person can check whether he or she is a would-be victim of diabetes through simple checking of symptoms that may lead to a diabetic condition, and it analyses the factual causes of the disease. The problem of diagnosing the onset and incidence of diabetes is addressed with a data mining approach in mind. As the success of any data mining approach depends on the underlying dataset upon which learning is based, this paper looks closely at locating prima facie symptoms of diabetes disorder. A careful analysis of the actual causes of diabetes is made, and a comprehensive set of data for the diabetic condition is proposed. Subjecting this data to analysis through simple data mining techniques, namely FP-Growth and Apriori, can yield a holistic inference engine that could help a doctor be more astute in confirming the diabetic condition of patients. Association rules are induced based on both of these approaches, and a heuristic computer-aided diagnosis (CAD) system for diabetes can be built upon this.
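Association-rule mining of the kind mentioned above (Apriori, FP-Growth) starts from frequent itemsets. The sketch below hand-rolls the first two Apriori passes over toy symptom records to show the idea; the symptom names and support threshold are assumptions for illustration.

```python
from itertools import combinations
from collections import Counter

# Toy symptom records (assumed for illustration).
records = [{"thirst", "fatigue", "blurred_vision"},
           {"thirst", "fatigue"},
           {"fatigue", "weight_loss"},
           {"thirst", "fatigue", "weight_loss"}]
min_support = 2  # an itemset must appear in at least 2 records

# Apriori pass 1: frequent single symptoms.
singles = Counter(item for r in records for item in r)
frequent1 = {i for i, c in singles.items() if c >= min_support}

# Apriori pass 2: candidate pairs are built only from frequent singles
# (the Apriori pruning step), then counted against the records.
pairs = Counter()
for r in records:
    for pair in combinations(sorted(r & frequent1), 2):
        pairs[pair] += 1
frequent2 = {p: c for p, c in pairs.items() if c >= min_support}

print(frequent2)
# ('fatigue', 'thirst') appears 3 times, giving the rule "thirst -> fatigue"
# with confidence 3/3: every record containing thirst also contains fatigue.
```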
A Survey on Disease Prediction from Retinal Colour Fundus Images using Image ... (ijcnes)
The aim of this survey is to list the various diseases predicted from retinal fundus images and the various methods used to detect them. The paper gives a detailed description of the various diseases predicted in the retina by comparing retinal fundus image structure. So far, diseases such as diabetic retinopathy, cardiovascular disease, and other eye problems have been predicted using retinal fundus images. Next, a comparative study is provided of the various image processing methods used to detect diseases from retinal fundus images. The basic metrics observed to predict the diseases are the optic disc, nerve cup, and rim. To find differences in these basic metrics, image processing techniques such as mask generation, colour normalization, edge detection, and contrast enhancement are used. The datasets used for retinal image inputs are STARE, DRIVE, ONHSD, ARIA, and IMAGERET. The survey ends by discussing future work on the possibility of predicting gastrointestinal problems via retinal fundus images.
Feature Extraction in Content based Image Retrieval (ijcnes)
This paper considers a technique for Content Based Image Retrieval (CBIR) that generates an image content descriptor by exploiting the low complexity of Ordered Dither Block Truncation Coding (ODBTC). In the encoding step of the ODBTC technique, an image is compressed into quantizers and a bitmap image; decoding is not performed in this method. Two image features are used to index an image, the Color Co-occurrence Feature (CCF) and the Bit Pattern Feature (BPF), both obtained directly from the ODBTC encoded data stream. Experimental results, compared with the BTC image retrieval system and other earlier methods, show that the proposed method is superior. ODBTC is suited for image compression, and it gives a simple and effective descriptor for indexing images in a CBIR system. Content-based image retrieval extracts images on the basis of their content, such as texture, color, shape, and spatial layout; many concepts have been introduced to minimize the semantic gap, and among the various features on which images can be stored and retrieved, texture is one of the most prominent.
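ODBTC builds on Block Truncation Coding. The sketch below implements BTC's common absolute-moment form (AMBTC) with NumPy, reducing each block to a bitmap plus two quantizer levels, which is the kind of encoded stream from which descriptors like CCF and BPF are computed; ODBTC itself replaces the mean threshold with an ordered dither array. The block size and random sample block are assumptions.

```python
import numpy as np

def btc_encode_block(block):
    """AMBTC encoding: threshold a block at its mean to get a bitmap, then
    keep two quantizer levels (mean of highs, mean of lows). ODBTC swaps
    this mean threshold for an ordered dither array."""
    mean = block.mean()
    bitmap = block >= mean
    high = block[bitmap].mean() if bitmap.any() else mean
    low = block[~bitmap].mean() if (~bitmap).any() else mean
    return bitmap, float(low), float(high)

def btc_decode_block(bitmap, low, high):
    """Reconstruct the block from the bitmap and the two quantizers."""
    return np.where(bitmap, high, low)

rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(4, 4)).astype(float)  # one 4x4 block
bitmap, low, high = btc_encode_block(block)
print(bitmap.astype(int), low, high, sep="\n")
print(btc_decode_block(bitmap, low, high))
```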
Challenges and Mechanisms for Securing Data in Mobile Cloud Computing (ijcnes)
Cloud computing enables users to utilize the services of computing resources, and nowadays computing resources for mobile applications are being delivered through cloud computing. As the need for new mobile applications grows, the use of cloud computing cannot be overlooked. Cloud service providers serve data requests from remote servers, and the virtualization aspect of cloud computing in mobile applications facilitates better utilization of resources. However, the industry needs to address the foremost security risks in the underlying technology, as the cloud computing environment for mobile applications is aggravated by various security problems. This paper addresses the challenges of securing data in the cloud for mobile cloud computing and a few mechanisms to overcome them.
Detection of Node Activity and Selfish & Malicious Behavioral Patterns using ... (ijcnes)
Mobile ad-hoc networks (MANETs) assume that mobile nodes voluntarily cooperate in order to work properly. This cooperation is a cost-intensive activity, and some nodes can refuse to cooperate, leading to selfish node behaviour; the overall network performance can thus be seriously affected. The use of watchdogs is a well-known mechanism to detect selfish nodes. However, the detection process performed by watchdogs can fail, generating false positives and false negatives that can induce wrong operations. Moreover, relying on local watchdogs alone can lead to poor performance when detecting selfish nodes, in terms of precision and speed. This is especially important in networks with sporadic contacts, such as delay tolerant networks (DTNs), where watchdogs sometimes lack enough time or information to detect the selfish nodes. We therefore apply a chord algorithm to identify the behavioral pattern of a node using two neighborhood nodes and the nodes themselves; servers then categorize the nature of each node.
Optimal Channel and Relay Assignment in OFDM-based Multi-Relay Multi-Pair Two-... (ijcnes)
Efficient utilization of radio resources in wireless networks is crucial and has been investigated extensively. This letter considers a wireless relay network where multiple user pairs conduct bidirectional communications via multiple relays based on orthogonal frequency-division multiplexing (OFDM) transmission. The joint optimization of channel and relay assignment, including subcarrier pairing, subcarrier allocation, and relay selection, for total throughput maximization is formulated as a combinatorial optimization problem. Using a graph theoretical approach, we solve the problem optimally in polynomial time by transforming it into a maximum weighted bipartite matching (MWBM) problem. Simulation studies are carried out to evaluate the network total throughput versus transmit power per node and the number of relay nodes.
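The reduction to maximum weighted bipartite matching can be illustrated with SciPy's linear_sum_assignment, which solves such assignment problems in polynomial time; the throughput matrix below is a made-up example, not data from the letter.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical throughput (Mbit/s) achieved when user pair i is assigned
# to relay/subcarrier option j; the values are made up.
throughput = np.array([[4.0, 2.5, 3.1],
                       [3.2, 4.4, 1.9],
                       [2.8, 3.0, 3.7]])

# Maximum weighted bipartite matching: pick one option per pair so the
# total throughput is maximized (Hungarian-style, polynomial time).
rows, cols = linear_sum_assignment(throughput, maximize=True)
for pair, option in zip(rows, cols):
    print(f"user pair {pair} -> option {option} "
          f"({throughput[pair, option]} Mbit/s)")
print("total:", throughput[rows, cols].sum())  # 4.0 + 4.4 + 3.7 = 12.1
```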
An Effective and Scalable AODV for Wireless Ad hoc Sensor Networks (ijcnes)
Choosing an appropriate routing protocol for data transfer is a challenging network problem, in terms of lowering the end-to-end delay in the delivery of data packets, improving the packet delivery ratio, and lowering overhead. In this paper we describe an effective and scalable AODV (called AODV-ES) for Wireless Ad hoc Sensor Networks (WASN) that uses a third-party reply model, an n-hop local ring, and time-to-live based local recovery. Our goal is to reduce the delivery delay of data packets, reduce routing overhead, and improve the data packet delivery ratio. The resulting AODV-ES algorithm is simulated with NS-2 under the Linux operating system. The performance of the routing protocol is evaluated under various mobility rates, and the proposed routing protocol is found to be better than AODV.
Secured Seamless Wi-Fi Enhancement in Dynamic Vehicles (ijcnes)
At present, cellular networks provide ubiquitous Internet connectivity, but at a relatively high cost, and they have proven insufficient for the surging amount of data from Internet-enabled mobile devices. Due to the explosive growth in subscriber numbers and mobile data, cellular networks are suffering overload and users are experiencing service quality degradation. This project implements seamless and efficient Wi-Fi based Internet access from moving vehicles. In the proposed implementation, a group of APs is employed to communicate with a client (called AP diversity), and a transmission succeeds if any AP in the group accomplishes the delivery with the client (called opportunistic transmission). AP diversity and opportunistic transmission are exploited to overcome the high packet loss rate, which is achieved by configuring all the APs with the same MAC and IP addresses. With such a configuration, a client gets the graceful illusion that only one (virtual) AP exists, and it will always be associated with this virtual AP. For uplink communications, when the client transmits a packet to the virtual AP, multiple APs within its transmission range are able to receive it, and the transmission is successful as long as at least one AP receives the packet correctly. The proposed implementation is shown to outperform existing schemes remarkably.
Virtual Position based OLSR Protocol for Wireless Sensor Networks (ijcnes)
Geographic routing in wireless sensor networks usually runs into the routing void problem, which causes high control overhead and transmission delay. The routing protocol proposed in this paper is an efficient void-bypassing routing protocol based on virtual coordinates: the random network structure around a void is transformed into a virtual circle composed of the void-edge nodes, by mapping those edge nodes onto the circle. A greedy forwarding algorithm is used that operates on the virtual circle whenever plain greedy forwarding from source to destination fails at a routing void. The proposed protocol finds short paths and maintains long-range, high quality links.
Mitigation and control of Defeating Jammers using P-1 Factorization (ijcnes)
Jamming-resistant broadcast communication is crucial for safety-critical applications such as emergency alert broadcasts or the dissemination of navigation signals in adversarial settings. These applications share the need for guaranteed authenticity and availability of messages broadcast by base stations to a large and unknown number of (potentially untrusted) receivers. Common techniques to counter jamming attacks, such as Direct-Sequence Spread Spectrum (DSSS) and Frequency Hopping, are based on secrets that need to be shared between the sender and the receivers before the start of the communication. We instead consider broadcast anti-jamming communication that relies on Pollard's rho method: we propose a solution, called the P-Rho method, that enables spread-spectrum anti-jamming broadcast communication without the requirement of shared secrets. We complete our work with an experimental evaluation on a prototype implementation.
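The abstract leans on Pollard's rho method, a classic randomized integer-factoring algorithm. The sketch below is the standard textbook version (Floyd cycle detection over x -> x^2 + c mod n), included to make the primitive concrete rather than to reproduce the paper's protocol.

```python
from math import gcd

def pollard_rho(n: int, c: int = 1) -> int:
    """Pollard's rho factoring: walk x -> x^2 + c (mod n) at two speeds
    (Floyd cycle detection); a cycle modulo an unknown factor p shows up
    as gcd(|x - y|, n) > 1. Returns a nontrivial factor of composite n."""
    if n % 2 == 0:
        return 2
    x = y = 2
    d = 1
    while d == 1:
        x = (x * x + c) % n          # tortoise: one step
        y = (y * y + c) % n          # hare: two steps
        y = (y * y + c) % n
        d = gcd(abs(x - y), n)
    if d == n:                       # unlucky cycle: retry with another c
        return pollard_rho(n, c + 1)
    return d

n = 8051                             # 83 * 97
f = pollard_rho(n)
print(f, n // f)                     # one nontrivial factor and its cofactor
```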
An analysis and impact factors on Agriculture field using Data Mining Techniques (ijcnes)
In computing, huge amounts of data are kept in storage, and the task is to extract the required data from the raw data. Data mining is one set of techniques for extracting such data, and it is used in many places. Techniques such as K-means, K-nearest neighbor, support vector machines, biclustering, the naive Bayes classifier, neural networks, and fuzzy C-means have been applied to agricultural data. There are many factors in agriculture; the main ones for the farmer are climate, soil, and yield prediction, and to improve production the farmer must select a suitable crop for a suitable climate. This paper covers various concepts of data mining and their applications, discusses research in agriculture, and examines the different types of factors that have an impact in the agricultural field.
A Study on Code Smell Detection with Refactoring Tools in Object Oriented Lan... (ijcnes)
A code smell is an indication in the source code that hypothetically signals a design problem in the corresponding software. Code smells are certain code segments that cause problems in the source code; they also indicate badly shaped design or code produced by bad coding practices. As structural characteristics of software, code smells may indicate a code or design problem that makes the software hard to evolve and maintain, and they may trigger refactoring of the code. In this paper, we propose some success factors for smell detection tools that can help improve the user experience, and therefore the acceptance, of such tools, since the process of detecting and removing code smells through refactoring can be overwhelming.
Priority Based Multi SenCar Technique in WSN (ijcnes)
In wireless sensor networks (WSN), clustering is an efficient mechanism used to overcome energy-capacity problems, routing, and load balancing. This paper addresses energy utilization and load distribution between the clusters. A load balanced clustering (LBC) algorithm is used to decrease energy utilization and distribute load across clusters. The SenCar uses the multi-user multiple-input multiple-output (MU-MIMO) technique, and multiple SenCars are introduced to collect information from the cluster heads and upload the data to the base station. This method achieves more than 50 percent energy saving per node and 80 percent energy saving on cluster heads, and also achieves a shorter data collection time compared with the existing system.
Semantic Search of E-Learning Documents Using Ontology Based System (ijcnes)
The keyword searching mechanism is traditionally used for information retrieval from Web based systems. However, this approach fails to meet the requirements of Web searching over expert knowledge bases built on the popular semantic systems. Semantic search of e-learning documents based on ontology is increasingly adopted in information retrieval systems. An ontology based system simplifies the task of finding correct information on the Web by building a search system based on the meaning of a keyword instead of the keyword itself. The major function of the ontology based system is the development of a specification of conceptualization, which enhances the connection between the information present in Web pages and the background knowledge. The semantic gap existing between keywords found in documents and those in a query can be matched suitably using an ontology based system. This paper provides a detailed account of the semantic search of e-learning documents using ontology based systems by comparing various ontology systems; based on this comparison, the survey attempts to identify possible directions for future research.
Investigation on Challenges in Cloud Security to Provide Effective Cloud Comp... (ijcnes)
Cloud computing provides the capability to use computing and storage resources on a metered basis and to reduce investment in an organization's computing infrastructure. The spawning and deletion of virtual machines running on physical hardware and controlled by hypervisors is a cost-efficient and flexible computing paradigm. In addition, the integration and widespread availability of large amounts of sanitized information, such as health care records, can be of tremendous benefit to researchers and practitioners. However, as with any technology, the full potential of the cloud cannot be achieved without understanding its capabilities, vulnerabilities, advantages, and trade-offs. We propose a new method of achieving the maximum benefit from cloud computation with minimal risk; issues such as data ownership, privacy protections, data mobility, quality of service and service levels, bandwidth costs, data protection, and support have to be tackled along the way.
Welcome to WIPAC Monthly, the magazine brought to you by the LinkedIn group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news, and to celebrate the 13 years since the group was created, we have articles including:
A case study of the use of Advanced Process Control at the wastewater treatment works at Lleida in Spain.
A look back at an article on smart wastewater networks, to see how the industry has measured up in the interim regarding the adoption of digital transformation in the water industry.
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx (R&R Consult)
CFD analysis is incredibly effective at solving mysteries and improving the performance of complex systems. Here's a great example: at a large natural gas-fired power plant, where waste heat is used to generate steam and energy, the operators were puzzled that their boiler wasn't producing as much steam as expected.
R&R and Tetra Engineering Group Inc. were asked to solve the issue of reduced steam production. An inspection had shown that a significant amount of hot flue gas was bypassing the boiler tubes, where the heat was supposed to be transferred. R&R Consult conducted a CFD analysis, which revealed that 6.3% of the flue gas was bypassing the boiler tubes without transferring heat. The analysis also showed that the flue gas was instead being directed along the sides of the boiler and between the modules that were supposed to capture the heat; this was the cause of the reduced performance.
Based on these results, Tetra Engineering installed covering plates to reduce the bypass flow, which improved the boiler's performance and increased electricity production.
It is always satisfying when we can help solve complex challenges like this. Do your systems also need a check-up or optimization? Give us a call!
Work done in cooperation with James Malloy and David Moelling from Tetra Engineering. More examples of our work: https://www.r-r-consult.dk/en/cases-en/
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdffxintegritypublishin
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
Cosmetic shop management system project report.pdfKamal Acharya
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's thought to interpret those ingredient lists unless you have a background in chemistry.
Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. It includes various function programs to do the above mentioned tasks.
Data file handling has been effectively used in the program.
The automated cosmetic shop management system should deal with the automation of general workflow and administration process of the shop. The main processes of the system focus on customer's request where the system is able to search the most appropriate products and deliver it to the customers. It should help the employees to quickly identify the list of cosmetic product that have reached the minimum quantity and also keep a track of expired date for each cosmetic product. It should help the employees to find the rack number in which the product is placed.It is also Faster and more efficient way.
Final project report on grocery store management system..pdfKamal Acharya
In today’s fast-changing business environment, it’s extremely important to be able to respond to client needs in the most effective and timely manner. If your customers wish to see your business online and have instant access to your products or services.
Online Grocery Store is an e-commerce website, which retails various grocery products. This project allows viewing various products available enables registered users to purchase desired products instantly using Paytm, UPI payment processor (Instant Pay) and also can place order by using Cash on Delivery (Pay Later) option. This project provides an easy access to Administrators and Managers to view orders placed using Pay Later and Instant Pay options.
In order to develop an e-commerce website, a number of Technologies must be studied and understood. These include multi-tiered architecture, server and client-side scripting techniques, implementation technologies, programming language (such as PHP, HTML, CSS, JavaScript) and MySQL relational databases. This is a project with the objective to develop a basic website where a consumer is provided with a shopping cart website and also to know about the technologies used to develop such a website.
This document will discuss each of the underlying technologies to create and implement an e- commerce website.
Overview of the fundamental roles in Hydropower generation and the components involved in wider Electrical Engineering.
This paper presents the design and construction of hydroelectric dams from the hydrologist’s survey of the valley before construction, all aspects and involved disciplines, fluid dynamics, structural engineering, generation and mains frequency regulation to the very transmission of power through the network in the United Kingdom.
Author: Robbie Edward Sayers
Collaborators and co editors: Charlie Sims and Connor Healey.
(C) 2024 Robbie E. Sayers
Immunizing Image Classifiers Against Localized Adversary Attacksgerogepatton
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks
(CNN)s, to adversarial attacks and presents a proactive training technique designed to counter them. We
introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations.
When combined with 3D convolution and deep curriculum learning optimization (CLO), itsignificantly improves
the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach
using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10
and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing
accuracy improvements over previous techniques. The results indicate that the combination of the volumetric
input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating
adversary training.
Saudi Arabia stands as a titan in the global energy landscape, renowned for its abundant oil and gas resources. It's the largest exporter of petroleum and holds some of the world's most significant reserves. Let's delve into the top 10 oil and gas projects shaping Saudi Arabia's energy future in 2024.
Hierarchical Digital Twin of a Naval Power SystemKerry Sado
A hierarchical digital twin of a Naval DC power system has been developed and experimentally verified. Similar to other state-of-the-art digital twins, this technology creates a digital replica of the physical system executed in real-time or faster, which can modify hardware controls. However, its advantage stems from distributing computational efforts by utilizing a hierarchical structure composed of lower-level digital twin blocks and a higher-level system digital twin. Each digital twin block is associated with a physical subsystem of the hardware and communicates with a singular system digital twin, which creates a system-level response. By extracting information from each level of the hierarchy, power system controls of the hardware were reconfigured autonomously. This hierarchical digital twin development offers several advantages over other digital twins, particularly in the field of naval power systems. The hierarchical structure allows for greater computational efficiency and scalability while the ability to autonomously reconfigure hardware controls offers increased flexibility and responsiveness. The hierarchical decomposition and models utilized were well aligned with the physical twin, as indicated by the maximum deviations between the developed digital twin hierarchy and the hardware.
Power Management in Micro grid Using Hybrid Energy Storage System
Integrated Intelligent Research (IIR), International Journal of Business Intelligents, Volume 05, Issue 01, June 2016, pp. 112-115, ISSN: 2278-2400
Support Vector Machine Based Data Classification to Avoid Data Redundancy Removal before Persisting the Data in a DBMS
M. Nalini1, S. Anbu2
1 Research Scholar, Department of Computer Science and Engineering, St. Peter's University, Avadi, Chennai, India
2 Professor, Department of Computer Science and Engineering, St. Peter's College of Engineering and Technology, Avadi, Chennai, India
Email: nalinicseme@gmail.com, anbuss16@gmail.com
Abstract—Database Management Systems (DBMS) form one of the growing fields in the computing world. Grid computing, internet sharing, distributed computing, parallel processing, and the cloud all store huge amounts of data in a DBMS to maintain the structure of the data. Memory management is a major concern in a DBMS because of the edit, delete, recover, and commit operations applied to records. To utilize memory efficiently, redundant data should be eliminated accurately. In this paper, redundant data is fetched by the Quick Search Bad Character (QSBC) function and intimated to the DB admin, who removes the redundancy. The QSBC function compares the entire data against patterns taken from an index table created for all the data persisted in the DBMS, allowing easy comparison of redundant (duplicate) data in the database. The experiment is examined in SQL Server on a university student database, and performance is evaluated in terms of time and accuracy. The database holds the data of 15,000 students involved in various activities.
Keywords—Data redundancy, Database Management System, Support Vector Machine, Data Duplicate.
I. INTRODUCTION
The growing mass of information present in digital media has become a pressing problem for data administrators. Usually built on data gathered from distinct sources, data repositories such as those used by digital libraries and e-commerce agents hold records with disparate schemata and structures. Problems regarding low response time, availability, security, and quality assurance also become more troublesome to manage as the amount of data grows larger. It is reasonable to assume that the quality of the data an organization uses in its systems determines its efficiency in offering beneficial services to its users. In this environment, the decision to maintain repositories with "dirty" data (i.e., with replicas, identification errors, equal patterns, etc.) goes well beyond technical considerations such as the overall speed or performance of data administration systems. The solutions available for addressing this situation need more than technical efforts; they need administrative and cultural changes as well.
Distinguishing and handling replicas is essential to assure the quality of the information made available by systems such as digital libraries and e-commerce agents. These systems may rely on consistent data to offer high-quality services, and they may be harmed by the existence of replicas in their repositories. A Genetic Programming (GP) approach was used for record deduplication [1]; the problem of finding and removing replica entries in a repository is known as record deduplication [2]. This approach combines several dissimilar pieces of evidence extracted from the data content to produce a deduplication function that can recognize whether two or more entries in a database are replicas. Since record deduplication is a time-consuming task even for small databases, our aim is to build a process that finds a suitable combination of the best pieces of evidence, thus yielding a deduplication function that improves performance by comparing the corresponding data during the training process. Finally, this function can be applied to the entire data set, or even to other databases with similar characteristics. Moreover, newly added data can be treated in the same way by the suggested function, as long as there is no unexpected deviation in the data patterns, something that is very important for huge databases. It is worth noting that these functions, which can be considered a combination of several powerful deduplication rules, are easy, fast, and robust to compute, permitting their effective application to the deduplication of huge databases. Record deduplication with the GP approach generates a gene value for each record using genetic operations; if that gene value equals that of any other record, the record is considered a duplicate. These operations aim to increase the quality of a given record, and they are reproduction, mutation, and crossover [1]. From this, it can be understood that genetic operations can affect the performance of the record deduplication task. The experimental results show a significant difference in the effort required to obtain a suitable solution. The main contribution of the existing approach is to eliminate record duplication; the experimental results obtained from the existing approaches, PSO and GA, are compared to evaluate performance, and PSO is shown to be better than GA [3]. Some research methods, such as anomaly detection, are used in various applications like online banking, credit card fraud, insurance, and health care.
The quality of a software product depends on the data models used for managing dynamic changes to the data [4]. Some research works also integrate a testing process with anomaly detection [5]. A few research works remove duplicate records in a file system, either within a sub-directory or across the whole directory [6-9]. Existing de-duplication approaches fall mainly into two categories: file-level de-duplication [10-12] and block-level de-duplication [13-15]; that is, duplicate records are analyzed in terms of the internal and external information of the file system. From the above background study, it is essential to develop a method for removing data duplication to increase the quality of the data.
II. PROBLEM STATEMENT
Data and DBMS-based applications are growing rapidly in most computing fields today. The approaches discussed in earlier research works concentrate on data cleaning, data arrangement, error correction, and duplication removal through anomaly detection. However, it is essential to understand and apply data redundancy removal in advanced data mining and big-data analysis applications to provide customer satisfaction and quality of service, because nowadays data centers and DBMSs are interconnected in cloud environments that integrate parallel as well as distributed computing. Hence, it is necessary to provide fast access, sound memory management, and accurate retrieval of the particular data requested. This paper takes this problem into account: using a pattern-recognition approach, data redundancy is eliminated and the quality of service improved.
The contributions of the proposed approach in this paper are:
1. Read the data and create a custom index for each data item, where the custom index is treated as the pattern for comparison.
2. On every insertion of a new record, deletion of a record, or alteration of the data, the custom index is examined and duplicates are eliminated (a rough sketch follows this list).
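To make the two contributions concrete, the following C# sketch maintains a pattern per record and examines it on every change. All names here are hypothetical, and the naive dictionary lookup merely stands in for the QSBC comparison detailed in Section IV:

    using System;
    using System.Collections.Generic;

    static class CustomIndexMaintenance
    {
        // pattern -> record number that owns it (the custom-index table)
        static readonly Dictionary<string, int> IndexTable =
            new Dictionary<string, int>();

        // Contribution 1: every record gets a custom-index pattern.
        // Contribution 2: the pattern is examined on every insertion.
        public static void OnInsert(int recordNo, string pattern)
        {
            if (IndexTable.ContainsKey(pattern))
                Console.WriteLine("Duplicate pattern " + pattern +
                                  ": intimate the DB admin");
            else
                IndexTable.Add(pattern, recordNo);
        }

        // Deletion (and, likewise, alteration) keeps the examined index current.
        public static void OnDelete(string pattern) => IndexTable.Remove(pattern);
    }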
III. PROPOSED SYSTEM MODEL
The entire functionality of this paper is illustrated in Figure 1. It is assumed that the data is maintained in the form of records stored in a DBMS (as a table). Each record has a number of fields F = {F1, F2, ..., Fn}, and all the fields are unique in terms of attributes. All the attributes are configured while structuring the table. For each table, a new table is created with a single column holding the custom index, and the custom index is created as a pattern. Exact pattern matching finds all occurrences of a particular pattern (P = P1 P2 P3 ... Pm), where the pattern contains a number of characters. Earlier research works considered that the pattern occurs in the data Y = {Y1, Y2, ..., Yn}.
Figure-1. Proposed Model for Data
IV. PREPROCESSING
In this paper the Quick Search Bad Character (QSBC) function is used to verify all the characters in the data set. It reads the data row by row and column by column, examining from left to right. If QSBC finds the same custom-index value more than once, it intimates the DB administrator to eliminate the duplicate record; at the same time, it creates a log file for verifying the performance of the QSBC function. QSBC is selected as the solution for comparing the patterns for the following reasons:
(i) QSBC matches the custom index accurately and quickly, and it is independent of other methods related to the DBMS.
(ii) QSBC examines and compares the patterns character by character using shift operations.
(iii) QSBC compares the pattern from left to right by creating sub-patterns in the entire data string.
This preprocessing phase yields the number of duplicate patterns present in the database. The pattern-searching algorithm, whose three stages are explained after the listing, is given below:
    // Quick Search Bad Character (QSBC) preprocessing and search,
    // reconstructed as runnable C# (the paper's implementation language).
    static int[] BuildQsbcTable(string pattern)
    {
        const int AlphabetSize = 256;        // chosen to cover all ASCII values
        int m = pattern.Length;
        int[] qsbc = new int[AlphabetSize];
        for (int c = 0; c < AlphabetSize; c++)
            qsbc[c] = m + 1;                 // default shift: skip past the window
        for (int i = 0; i < m; i++)
            qsbc[pattern[i] & 0xFF] = m - i; // rightmost occurrence of each character
        return qsbc;
    }

    static void Output(int j) => Console.WriteLine(j);   // print window position j

    static void QsbcSearch(string pattern, string text)
    {
        int m = pattern.Length, n = text.Length;
        int[] qsbc = BuildQsbcTable(pattern);
        char firstCha = pattern[0], lastCha = pattern[m - 1];
        int j = 0;
        while (j + m <= n)
        {
            // compare the last, then the first, character of the current window
            if (lastCha == text[j + m - 1] && firstCha == text[j]
                && pattern == text.Substring(j, m))      // full check confirms a match
                Output(j);
            if (j + m >= n) break;                       // no character after the window
            j += qsbc[text[j + m] & 0xFF];               // QSBC shift rule
        }
    }
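For concreteness, a minimal usage sketch follows; the data string is hypothetical, standing in for rows read from the DBMS and concatenated by the row-by-row scan:

    static void Main()
    {
        string data = "AN001A|BX002C|AN001A";   // concatenated rows, '|' separated
        QsbcSearch("AN001A", data);             // prints 0 and 14: the pattern occurs twice
    }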
Three steps are applied in the preprocessing stage. In step 1, character-by-character comparison is applied between the entire data and the pattern. To find a resemblance between the pattern and the data, the last character of the pattern is compared with the entire data. If it matches at some position, the first character of the pattern and the corresponding character in the data are compared. If these characters match, the algorithm enters the next stage; otherwise, it goes to the last step. Whenever the algorithm finds a match, it maintains a count and log information for further verification and intimation to the admin. The value of Size(DATA) depends on the alphabet size; for efficiency, we have chosen a value that covers all the ASCII values of the characters defined in the present alphabet set. Void OUTPUT(int) is a function used to print the position (j) of the current window on the text. Figure 2 shows the custom index and the data persisted in the DBMS. The custom index is created by combining the first and the second fields from the main table: its first and last characters are taken from the first and the last characters of Field-2, and its middle characters are a k-digit number generated automatically in increasing order (like an auto number). This number indicates the record number, so a record can be located quickly in the database either by the number alone or together with the custom index.
Figure-2. (a) Custom Index Table, (b) Main Table
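As an illustration of this index construction, here is a small C# sketch; the Field-2 value, record number, and choice of k are assumptions for the example, since the paper does not fix k:

    // Build a custom-index pattern: first and last characters of Field-2
    // around a k-digit auto-generated record number.
    static string BuildCustomIndex(string field2, int recordNumber, int k)
    {
        string middle = recordNumber.ToString("D" + k);   // zero-padded number
        return $"{field2[0]}{middle}{field2[field2.Length - 1]}";
    }

    // Example: BuildCustomIndex("NALINI", 42, 4) yields "N0042I".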
The entire proposed approach is implemented with SQL Server 2008 for persisting the data. The front end is designed in .NET to provide user interaction. The proposed QSBC and pattern-matching operations are coded in C#, and the results are obtained in terms of data size, number of duplicate records created, and the time taken to detect and delete duplication.
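As a sketch of how such a duplicate report could be pulled from SQL Server for the admin, the following uses the classic System.Data.SqlClient API; the table and column names (CustomIndexTable, CustomIndex) are assumptions, since the paper does not list its exact schema:

    using System;
    using System.Data.SqlClient;

    static class DuplicateReport
    {
        // List custom-index patterns that occur more than once.
        public static void Run(string connectionString)
        {
            const string sql =
                "SELECT CustomIndex, COUNT(*) FROM CustomIndexTable " +
                "GROUP BY CustomIndex HAVING COUNT(*) > 1";
            using (var conn = new SqlConnection(connectionString))
            using (var cmd = new SqlCommand(sql, conn))
            {
                conn.Open();
                using (var reader = cmd.ExecuteReader())
                    while (reader.Read())
                        Console.WriteLine("Pattern " + reader.GetString(0) +
                                          " occurs " + reader.GetInt32(1) + " times");
            }
        }
    }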
V. RESULTS AND DISCUSSION
To evaluate performance, the experiment is run several times with a different number of records in each round: 1000, 2000, 3000, 4000, and 5000. Efficiency is verified by counting the number of redundant data items eliminated from the database; the time taken to eliminate the redundant data is also measured. Compared with the existing approach, the proposed approach performs better in terms of both time and the elimination of redundant data in the DBMS. This comparison is shown in Figure 3 and Figure 4. Figure 5 shows the performance of the proposed approach in terms of redundant-data detection on different datasets; the number of redundancies detected depends on the dataset. According to both the time taken and the elimination ratio, the proposed approach is better than the existing approach, as shown in Figures 3 to 5.
Figure-3. Performance Evaluation of Finding Redundant Data
Figure-4. Performance Evaluation by the Time Taken to Eliminate the Redundant Data
Figure-5. Performance Evaluation in Terms of Various Data Sets
VI. CONCLUSION
To fulfill the objective of this paper, a custom index is maintained for all the data to be persisted in the DBMS. The proposed QSBC approach can detect and eliminate redundant data in any kind of DBMS quickly and accurately. Some anomaly functions, namely data cleaning and alignment, are also carried out by the proposed approach by comparing the custom index, which is verified before and after data-manipulation processes. From the obtained results and the graphs shown in Figures 3 to 5, it is concluded that the QSBC approach is a suitable and better approach for a DBMS.
REFERENCES
[1] M. G. de Carvalho, A. H. F. Laender, M. A. Goncalves, and A. S. da Silva, "A Genetic Programming Approach to Record Deduplication," IEEE Transactions on Knowledge and Data Engineering, pp. 399-412, 2011.
[2] N. Koudas, S. Sarawagi, and D. Srivastava, "Record linkage: similarity measures and algorithms," in Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, pp. 802-803, 2006.
[3] Ye Qingwei, Wu Dongxing, Zhou Yu, and Wang Xiaodong, "The duplicated of partial content detection based on PSO," in IEEE Fifth International Conference on Bio-Inspired Computing: Theories and Applications, pp. 350-353, 2010.
[4] E. J. Weyuker and F. I. Vokolos, "Experience with Performance Testing of Software Systems: Issues," IEEE Trans. Software Eng., vol. 26, no. 12, pp. 1147-1156, Dec. 2000, doi: 10.1109/32.888628.
[5] G. Denaro, A. Polini, and W. Emmerich, "Early Performance Testing of Distributed Software Applications," in Proc. Fourth Int'l Workshop on Software and Performance (WOSP '04), pp. 94-103, 2004, doi: 10.1145/974044.974059.
[6] D. T. Meyer and W. J. Bolosky, "A study of practical de-duplication," ACM Trans. Storage, 2011, doi: 10.1145/2078861.2078864.
[7] C. Dubnicki, L. Gryz, L. Heldt, M. Kaczmarczyk, W. Kilian, et al., "HYDRAstor: A scalable secondary storage," in Proceedings of the 7th Conference on File and Storage Technologies (FAST '09), pp. 197-210, 2009.
[8] C. Ungureanu, B. Atkin, A. Aranya, S. Gokhale, S. Rago, et al., "HydraFS: A high-throughput file system for the HYDRAstor content-addressable storage system," in Proceedings of the 8th USENIX Conference on File and Storage Technologies (FAST '10), USENIX Association, Berkeley, CA, USA, 2010.
[9] W. J. Bolosky, S. Corbin, D. Goebel, and J. R. Douceur, "Single instance storage in Windows 2000," in Proceedings of the 4th USENIX Windows Systems Symposium (WSS '00), USENIX Association, Berkeley, CA, USA, 2000.
[10] D. Harnik, B. Pinkas, and A. Shulman-Peleg, "Side channels in cloud services: De-duplication in cloud storage," IEEE Security & Privacy, vol. 8, pp. 40-47, 2010, doi: 10.1109/MSP.2010.187.
[11] H. S. Gunawi, N. Agrawal, A. C. Arpaci-Dusseau, R. H. Arpaci-Dusseau, and J. Schindler, "Deconstructing commodity storage clusters," in Proceedings of the 32nd Annual International Symposium on Computer Architecture, Jun. 4-8, 2005, pp. 60-71, doi: 10.1109/ISCA.2005.20.
[12] J. R. Douceur, A. Adya, W. J. Bolosky, D. Simon, and M. Theimer, "Reclaiming space from duplicate files in a serverless distributed file system," in Proceedings of the 22nd International Conference on Distributed Computing Systems (ICDCS '02), 2002.
[13] S. Quinlan and S. Dorward, "Venti: A new approach to archival storage," Bell Labs, Lucent Technologies, 2002.
[14] A. Muthitacharoen, B. Chen, and D. Mazieres, "A low-bandwidth network file system," in Proceedings of the 18th ACM Symposium on Operating Systems Principles, Banff, Canada, Oct. 21-24, 2001, pp. 174-187, doi: 10.1145/502034.502052.
[15] M. Vrable, S. Savage, and G. M. Voelker, "Cumulus: File system backup to the cloud," ACM Trans. Storage, 2009, doi: 10.1145/1629080.1629084.