The document proposes techniques to identify and remove duplicate tuples in relational databases. It begins with an introduction to duplicate tuples and why their removal is important. It then describes two proposed techniques: 1) a union fusion function that replaces duplicate tuples by taking the union of attribute values, and 2) a functional dependency approach that is more generalized. The techniques aim to remove duplicates while maintaining referential integrity. An algorithm is provided to identify duplicate tuples across relations, replace them with single tuples, and propagate the changes to dependent relations in a semantically correct way. Sample relations are used to demonstrate how the techniques replace duplicate tuples and maintain referential integrity.
International Journal of Engineering Research and Applications (IJERA) is a team of researchers, not a publication service or private publication running journals for monetary benefit; we are an association of scientists and academics who focus on supporting authors who want to publish their work. The articles published in our journal can be accessed online, and all articles are archived for real-time access.
Our journal system primarily aims to bring out the research talent and the work of scientists, academics, engineers, practitioners, scholars, and postgraduate students of engineering and science. The journal covers scientific research in a broad sense rather than a niche area, enabling researchers from various verticals to publish their papers. It also aims to give researchers a platform to publish in a shorter time so that they can continue their work. All published articles are freely available to scientific researchers in government agencies, to educators, and to the general public. We are taking serious efforts to promote our journal across the globe in various ways, and we are confident it will act as a scientific platform for all researchers to publish their work online.
Indexing based Genetic Programming Approach to Record Deduplication (idescitation)
In this paper, we present a genetic programming (GP) approach to record deduplication with indexing techniques. Data deduplication is a process in which data are cleaned of duplicate records that arise from misspellings, field swaps, or other mistakes and inconsistencies. This process requires identifying objects that appear in more than one list. Detecting and eliminating duplicated data is one of the major problems in the broad area of data cleaning and data quality in data warehouses, so we need an algorithm that can detect and eliminate as many duplicates as possible. GP with indexing is an optimization technique that helps find the maximum number of duplicates in a database. We use a deduplication function that can identify whether two or more entries in a repository are replicas. Many industries and systems depend on the accuracy and reliability of databases to carry out their operations, so the quality of the information stored in a database can have significant cost implications for a system that relies on that information to function and conduct business. Moreover, clean, replica-free repositories not only allow the retrieval of higher-quality information but also lead to more concise data and to potential savings in the computational time and resources needed to process it.
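The deduplication function described above can be sketched as a pairwise similarity check combined with a cheap blocking index. This is a minimal illustration, not the paper's GP-evolved function: the field names, weights, and threshold are all assumptions.

```python
# Sketch of a pairwise duplicate check with a simple blocking index.
# Field names, equal weighting, and the 0.85 threshold are illustrative.

def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def field_similarity(x, y):
    """Normalised string similarity in [0, 1]."""
    if not x and not y:
        return 1.0
    return 1.0 - levenshtein(x, y) / max(len(x), len(y))

def is_duplicate(rec1, rec2, threshold=0.85):
    """Average per-field similarity; a GP system would evolve this combination."""
    fields = rec1.keys() & rec2.keys()
    score = sum(field_similarity(rec1[f], rec2[f]) for f in fields) / len(fields)
    return score >= threshold

def block_key(rec):
    """Cheap indexing/blocking key: only records sharing a key are compared."""
    return rec["name"][:3].lower()
```

Blocking is what makes this scale: instead of comparing all O(n²) pairs, only records that land in the same block are passed to `is_duplicate`.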
Data mining, or knowledge discovery, is the process of analyzing data from different perspectives and summarizing it into useful information: information that can be used to increase revenue, cut costs, or both. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. The goal of clustering is to determine the intrinsic grouping in a set of unlabeled data. But how do we decide what constitutes a good clustering? It can be shown that there is no absolute “best” criterion that is independent of the final aim of the clustering. Consequently, it is the user who must supply this criterion, in such a way that the result of the clustering suits their needs.
For instance, we could be interested in finding representatives for homogeneous groups (data reduction), in finding “natural clusters” and describing their unknown properties (“natural” data types), in finding useful and suitable groupings (“useful” data classes), or in finding unusual data objects (outlier detection). Of late, clustering techniques have been applied in areas that involve browsing gathered data or categorizing the results returned by search engines in reply to user queries. In this paper, we provide a comprehensive survey of document clustering.
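The user-supplied criterion mentioned above can be very simple. As a minimal sketch of the outlier-detection case, one workable criterion is "a point is unusual if it lies more than k standard deviations from the mean" (the data values and k=2 are illustrative only):

```python
# Flag outliers as points far from the mean of a one-dimensional data set.
# The k-standard-deviations rule is one of many possible user criteria.

def mean(xs):
    return sum(xs) / len(xs)

def outliers(xs, k=2.0):
    """Points more than k standard deviations from the mean."""
    m = mean(xs)
    var = sum((x - m) ** 2 for x in xs) / len(xs)
    sd = var ** 0.5
    return [x for x in xs if sd and abs(x - m) > k * sd]
```

A different final aim (say, data reduction) would lead to a different criterion entirely, which is exactly the point made above.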
A SURVEY ON OPTIMIZATION APPROACHES TO TEXT DOCUMENT CLUSTERING (ijcsa)
Text document clustering is one of the fastest-growing research areas because of the availability of huge amounts of information in electronic form. A number of techniques have been proposed for clustering documents so that documents within a cluster have high intra-similarity and low inter-similarity to other clusters. Many document clustering algorithms perform a localized search while navigating, summarizing, and organizing information; a globally optimal solution can be obtained by applying high-speed, high-quality optimization algorithms that search the entire solution space. In this paper, a brief survey of optimization approaches to text document clustering is presented.
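The intra-similarity/inter-similarity objective above is usually measured with cosine similarity between document vectors. A minimal sketch with raw term-frequency vectors (real systems would use TF-IDF weighting and stemming):

```python
# Cosine similarity between bag-of-words term-frequency vectors.
# Documents inside a good cluster should score high pairwise,
# documents from different clusters low.
from collections import Counter
from math import sqrt

def cosine(d1, d2):
    v1, v2 = Counter(d1.split()), Counter(d2.split())
    dot = sum(v1[t] * v2[t] for t in v1)
    n1 = sqrt(sum(c * c for c in v1.values()))
    n2 = sqrt(sum(c * c for c in v2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0
```

An optimization-based clusterer then searches over assignments of documents to clusters to maximize average intra-cluster cosine while minimizing inter-cluster cosine.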
Many data mining and knowledge discovery methodologies and process models have been developed with varying degrees of success; three main methods are used to discover patterns in data: KDD, SEMMA, and CRISP-DM. They are presented in many publications in the area and are used in practice. To our knowledge, there is no clear methodology developed to support link mining. However, there is a well-known methodology in knowledge discovery in databases, the Cross Industry Standard Process for Data Mining (CRISP-DM), developed by a consortium of several industrial companies, which can be relevant to the study of link mining. In this study, CRISP-DM has been adapted to the field of link mining to detect anomalies. An important goal in link mining is the task of inferring links that are not yet known in a given network. This approach is implemented through a case study of real-world data (co-citation data). The case study uses mutual information to interpret the semantics of anomalies identified in the co-citation dataset, which can provide valuable insights for determining the nature of a given link and potentially identifying important future link relationships.
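The mutual-information step can be sketched as follows: treat "paper X is cited" and "paper Y is cited" as binary variables, tabulate their joint counts from the co-citation data, and compute I(X;Y). The counts below are made up for illustration; the paper's actual preprocessing is not reproduced here.

```python
# Mutual information (in bits) between two binary indicators, estimated
# from a joint count table joint[(x, y)] -> count, for x, y in {0, 1}.
from math import log2

def mutual_information(joint):
    total = sum(joint.values())
    px = {x: sum(c for (a, _), c in joint.items() if a == x) / total for x in (0, 1)}
    py = {y: sum(c for (_, b), c in joint.items() if b == y) / total for y in (0, 1)}
    mi = 0.0
    for (x, y), c in joint.items():
        pxy = c / total
        if pxy:
            mi += pxy * log2(pxy / (px[x] * py[y]))
    return mi
```

A co-citation pair with near-zero mutual information despite many joint occurrences, or unexpectedly high mutual information, is the kind of anomaly the study interprets.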
International Journal of Engineering Research and Development (IJERD) (IJERD Editor)
Hierarchical clustering of multi class data (the zoo dataset) (Raid Mahbouba)
Data mining project
The main goal of this study is to group 101 animals into their natural family types using various features of the animals and a hierarchical clustering algorithm, one of the unsupervised learning algorithms.
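Agglomerative hierarchical clustering of the kind used above can be sketched in a few lines: start with each point as its own cluster and repeatedly merge the two closest clusters. The toy feature vectors below stand in for the zoo dataset's animal features; single linkage is an assumption, as the study's exact linkage is not stated here.

```python
# Single-linkage agglomerative clustering on toy feature vectors.
# Merges the closest pair of clusters until k clusters remain.

def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def single_linkage(c1, c2):
    """Distance between clusters = distance of their closest members."""
    return min(dist(a, b) for a in c1 for b in c2)

def agglomerate(points, k):
    clusters = [[p] for p in points]
    while len(clusters) > k:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] += clusters.pop(j)
    return clusters
```

Stopping at k = 7 on the real dataset would correspond to the seven animal family types in the zoo data.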
Certain Investigation on Dynamic Clustering in Dynamic Datamining (ijdmtaiir)
Clustering is the process of grouping a set of objects into classes of similar objects. Dynamic clustering is a new research area concerned with datasets that have dynamic aspects: it requires updating the clusters whenever new data records are added to the dataset, which may change the clustering over time. When there are continuous updates and a huge amount of dynamic data, rescanning the database is not feasible in static data mining, but it becomes unnecessary in a dynamic data mining process. Dynamic data mining applies when the derived information is kept available for analysis while the environment is dynamic, i.e., many updates occur. Since this need has now been established by most researchers, research is moving toward solving the open problems of mining dynamic databases. This paper surveys existing work on dynamic clustering and incremental data clustering.
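A minimal sketch of the incremental idea surveyed above: a leader-style clusterer assigns each arriving record to the nearest existing centroid if it is close enough, otherwise it starts a new cluster, so the database is never rescanned. The radius threshold and running-mean update are illustrative choices, not a specific surveyed algorithm.

```python
# Leader-style incremental clustering: one pass, no database rescans.

class IncrementalClusterer:
    def __init__(self, radius):
        self.radius = radius
        self.centroids = []   # running mean per cluster
        self.counts = []

    def add(self, point):
        """Assign point to nearest centroid within radius, else new cluster."""
        best, best_d = None, None
        for i, c in enumerate(self.centroids):
            d = sum((a - b) ** 2 for a, b in zip(point, c)) ** 0.5
            if best_d is None or d < best_d:
                best, best_d = i, d
        if best is None or best_d > self.radius:
            self.centroids.append(list(point))
            self.counts.append(1)
            return len(self.centroids) - 1
        # incremental running-mean update of the chosen centroid
        n = self.counts[best] + 1
        self.centroids[best] = [c + (p - c) / n
                                for c, p in zip(self.centroids[best], point)]
        self.counts[best] = n
        return best
```

Each update is O(number of clusters), which is what makes continuous streams of records tractable.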
Privacy Preserving MFI Based Similarity Measure For Hierarchical Document Clu... (IJORCS)
The growth of the World Wide Web has imposed great challenges on researchers seeking to improve search efficiency over the internet. Nowadays, web document clustering has become an important research topic for surfacing the most relevant documents among the huge volumes of results returned in response to a simple query. In this paper, we first propose a novel approach that precisely defines clusters based on maximal frequent itemsets (MFIs) computed with the Apriori algorithm, and then use the same MFI-based similarity measure for hierarchical document clustering. By considering maximal frequent itemsets, the dimensionality of the document set is decreased. Second, privacy preservation for open web documents is provided by avoiding duplicate documents, thereby protecting the individual copyrights of documents. This is achieved using an equivalence relation.
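The MFI-based similarity idea can be sketched as follows. For the tiny example a brute-force enumeration replaces Apriori, and the Jaccard-style similarity over MFI terms is an illustrative stand-in for the paper's measure:

```python
# Maximal frequent itemsets over term-set "transactions", plus an
# MFI-based document similarity. Brute force is fine at this scale;
# the paper uses Apriori for the frequent-itemset step.
from itertools import combinations

def frequent_itemsets(docs, minsup):
    """docs: list of term sets. Returns all itemsets with support >= minsup."""
    items = sorted({t for d in docs for t in d})
    freq = []
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            s = set(combo)
            if sum(s <= d for d in docs) >= minsup:
                freq.append(s)
    return freq

def maximal(freq):
    """Keep only itemsets with no frequent proper superset."""
    return [s for s in freq if not any(s < t for t in freq)]

def mfi_similarity(d1, d2, mfis):
    """Jaccard overlap of the MFI terms present in each document."""
    t1 = {t for s in mfis if s <= d1 for t in s}
    t2 = {t for s in mfis if s <= d2 for t in s}
    return len(t1 & t2) / len(t1 | t2) if t1 | t2 else 0.0
```

Because only terms occurring in some MFI contribute, the effective dimensionality of the document vectors drops, as the abstract notes.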
Enhancing Keyword Query Results Over Database for Improving User Satisfaction (ijmpict)
Relational databases increasingly support keyword queries, but the search results often do not give effective answers to a keyword query and are therefore inflexible from the user's perspective. It would be helpful to recognize queries that yield low-ranking results. Here we estimate query-performance prediction to determine the effectiveness of a search performed in response to a query, and we study the features of such hard queries by taking into account the contents of the database and the result list. A related problem in databases is the presence of missing data, which can be handled by imputation. An inTeractive Retrieving-Inferring data imputation method (TRIP) is used, which alternates retrieving and inferring to fill the missing attribute values in the database. By considering both the prediction of hard queries and imputation over the database, we can obtain better keyword search results.
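As a hedged sketch of the inferring half of such an imputation step only (TRIP additionally retrieves values from external sources, which is not modelled here), a missing attribute can be filled with the most common value among tuples that agree on the known attributes:

```python
# Infer a missing attribute from tuples that agree on the known ones.
# A stand-in for the inferring phase; the retrieving phase is omitted.
from collections import Counter

def infer_missing(rows, target):
    """rows: list of dicts; missing values in target are None."""
    known = {k: v for k, v in target.items() if v is not None}
    for attr, v in target.items():
        if v is None:
            candidates = [r[attr] for r in rows
                          if r.get(attr) is not None
                          and all(r.get(k) == kv for k, kv in known.items())]
            if candidates:
                target[attr] = Counter(candidates).most_common(1)[0][0]
    return target
```

When no agreeing tuple exists, the value stays missing, which is exactly the case where a retrieving step against an external source would be invoked.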
The Statement of Conjunctive and Disjunctive Queries in Object Oriented Datab... (Editor IJCATR)
The entrance of object-oriented concepts into databases has caused relational databases to be gradually replaced by object-oriented databases in various fields. On the other hand, several methods have been presented for handling uncertain real-world data. One of these methods for database modeling is an approach that couples object-oriented database modeling with fuzzy logic. Many queries that users pose are expressed in terms of linguistic variables; because classical databases are unable to support these variables, fuzzy approaches are considered. In this study we investigate database queries in both simple and complex forms; in the complex form, we use conjunctive and disjunctive queries. We then use XML labels to express the queries in fuzzy form, and entering the XML world also lets us communicate reliably with other parts of the software. We further refine conjunctive and disjunctive queries over a fuzzy object-oriented database using the concepts of dependency measure and weight, with weights assigned to different phrases of a query based on user emphasis. Another aim of this research is to map fuzzy queries to fuzzy-XML. The queries are expected to be simple to implement, and the output of query execution to be much closer to users' needs and expectations. The results show that the proposed method expresses the possible conjunctive and disjunctive queries over the database in the form of fuzzy-XML.
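The standard fuzzy semantics the abstract builds on can be shown in a few lines: a linguistic variable such as "young" becomes a membership function, a conjunctive query takes the minimum of the memberships, and a disjunctive query the maximum. The membership functions below are toy assumptions, not the paper's definitions:

```python
# Min/max semantics for fuzzy conjunctive and disjunctive queries.
# Membership functions for 'young' and 'tall' are illustrative.

def young(age):
    """Toy membership function for the linguistic variable 'young'."""
    return max(0.0, min(1.0, (40 - age) / 20))

def tall(height_cm):
    """Toy membership function for the linguistic variable 'tall'."""
    return max(0.0, min(1.0, (height_cm - 160) / 30))

def conjunctive(*memberships):
    """AND of fuzzy predicates: the weakest membership dominates."""
    return min(memberships)

def disjunctive(*memberships):
    """OR of fuzzy predicates: the strongest membership dominates."""
    return max(memberships)
```

The weights mentioned in the abstract would enter here by scaling or exponentiating individual memberships before the min/max is taken.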
Mining academic social networks is becoming increasingly necessary with the growing amount of data, and it is a favorite research topic for many researchers. Data mining techniques are used to mine academic social networks. In this paper, we present an efficient frequent itemset mining technique for academic social networks. The proposed framework first processes the research documents, and then enhanced frequent itemset mining is applied to find the strength of the relationships between researchers. The proposed method is fast in comparison to older algorithms and requires less main memory for computation.
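The relationship-strength computation above amounts to a frequent-pair query over author lists. A minimal sketch, with made-up author sets and a Jaccard-style normalisation that is an assumption rather than the paper's formula:

```python
# Count co-authored papers per researcher pair and normalise by the
# number of papers involving either researcher.
from collections import Counter
from itertools import combinations

def pair_strength(papers):
    """papers: list of author sets. Returns {frozenset({a, b}): strength}."""
    solo = Counter(a for p in papers for a in p)
    pairs = Counter(frozenset(q) for p in papers
                    for q in combinations(sorted(p), 2))
    # Jaccard-style strength: co-papers / papers involving either author
    return {pair: c / (solo[min(pair)] + solo[max(pair)] - c)
            for pair, c in pairs.items()}
```

Counting only pairs keeps memory proportional to the number of observed collaborations, which is the kind of saving the abstract claims over general itemset miners.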
Semantic Conflicts and Solutions in Integration of Fuzzy Relational Databases (ijsrd.com)
Database schema integration is an important discipline for constructing heterogeneous multidatabase systems. Fuzzy information has also been introduced into relational databases and has been extensively studied. However, the issues of integrating local fuzzy relations are rarely addressed. In this paper, we identify new types of conflicts that may occur in schemas and data due to the inclusion of fuzzy relational databases. We propose a methodology that resolves these new types of conflicts in a specific order to minimize the execution time of the integration process.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews across the whole field of engineering, science, and technology, including new teaching methods, assessment, validation, and the impact of new technologies, and it will continue to provide information on the latest trends and developments in this ever-expanding subject. Papers are selected through double peer review to ensure originality, relevance, and readability. The articles published in our journal can be accessed online.
Clustering the results of a search helps the user get an overview of the information returned. In this paper, we treat the clustering task as cataloguing the search results; by catalogue we mean a structured label list that helps the user understand the labels and the search results. Cluster labelling is crucial, because meaningless or confusing labels may mislead users into checking the wrong clusters for their query and losing time. Additionally, labels should accurately reflect the contents of the documents within the cluster. To label clusters effectively, a new cluster labelling method is introduced, with emphasis on producing comprehensible and accurate cluster labels in addition to discovering the document clusters. We also present a new metric for assessing the success of cluster labelling. We adopt a comparative evaluation strategy to derive the relative performance of the proposed method with respect to two prominent search-result clustering methods, Suffix Tree Clustering and Lingo, performing the experiments on the publicly available datasets Ambient and ODP-239.
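A label-scoring heuristic in the spirit described above can be sketched as: a good label term is frequent inside the cluster and rare outside it. The scoring formula and example documents are illustrative, not the paper's actual method or data:

```python
# Score candidate label terms by in-cluster document frequency minus
# out-of-cluster document frequency; pick the highest-scoring term.

def label_scores(cluster_docs, other_docs):
    def df(term, docs):
        return sum(term in d.split() for d in docs)
    terms = {t for d in cluster_docs for t in d.split()}
    return {t: df(t, cluster_docs) / len(cluster_docs)
               - df(t, other_docs) / max(len(other_docs), 1)
            for t in terms}

def best_label(cluster_docs, other_docs):
    scores = label_scores(cluster_docs, other_docs)
    return max(scores, key=scores.get)
```

Terms that are common everywhere (like "design" below) score low even when frequent in the cluster, which is what keeps labels distinctive rather than merely popular.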
The use of Cooking Gas as Refrigerant in a Domestic Refrigerator (IJERA Editor)
The application of cooking-gas refrigerants in refrigeration systems is considered a potential way to improve energy efficiency and to encourage the use of environment-friendly refrigerants. Refrigeration has met many challenges concerning its environmental impact, its effects on humans, and its contribution to society in general. Domestic refrigerators annually consume several metric tons of traditional refrigerants, which have very high Ozone Depletion Potential (ODP) and Global Warming Potential (GWP). The experiment conducted employs two closely related refrigerants, R600a (isobutane) and cooking gas, varied in an ideal refrigerant mixture of 150 g of refrigerant and 15 ml of lubricating oil (to the rating of 40 wt% expected in the compressor). The laboratory process used a gas chromatography machine to ascertain the mole ratio, molecular weight, and critical temperatures, while the Prode Properties and Refprop software packages were used to determine the other refrigerant properties of the mixture. The results for the mixture of R600a with lubricant confirmed mineral oil as the most appropriate for the operation. The experimental results showed that the refrigeration system with cooking-gas refrigerant worked normally, attained a high freezing capacity, and reached a COP of 2.159. It is established that cooking gas is a viable alternative refrigerant to replace R600a in domestic refrigerators; hence its application in refrigerating systems measures up to the current trend of environmental regulations favouring hydrocarbon refrigerants.
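The COP figure quoted above is the ratio of useful cooling effect to compressor work input. A quick check with illustrative numbers (the paper's measured heat loads are not reproduced here):

```python
# Coefficient of performance of a refrigeration cycle:
# COP = cooling effect delivered / work input to the compressor.

def cop(cooling_effect_kj, work_input_kj):
    return cooling_effect_kj / work_input_kj
```

A COP of 2.159 means each kilojoule of compressor work moved roughly 2.16 kJ of heat out of the cabinet.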
SUSTAINABLE BUILDINGS IN HOT AND DRY CLIMATE OF INDIA (IJERA Editor)
The consumption of energy in buildings is increasing as development proceeds at a very fast rate. No evidence is now required to prove that the present climate changes are directly linked to human activities, and concerns regarding the exploitation of fossil fuels have reached a level where the negative effects are felt in the life of the common man. Passive architecture involves blending conventional architectural principles with solar and wind energy and the inherent properties of building materials to ensure that interiors remain warm in winter and cool in summer, creating a year-round comfortable environment. Various solar passive techniques have been studied in detail so that the undesirable impacts of a hot and dry climate can be mitigated. It is concluded that with the application of these techniques a building can be made comfortable with comparatively less use of energy.
Many data mining and knowledge discovery methodologies and process models have been developed, with varying degrees of success, there are three main methods used to discover patterns in data; KDD, SEMMA and CRISP-DM. They are presented in many of the publications of the area and are used in practice. To our knowledge, there is no clear methodology developed to support link mining. However, there is a well known methodology in knowledge discovery in databases, known as Cross Industry Standard Process for Data Mining (CRISPDM), developed by a consortium of several industrial companies which can be relevant to the study of link mining. In this study CRISP-DM has been adapted to the field of Link mining to detect anomalies. An important goal in link mining is the task of inferring links that are not yet known in a given network. This approach is implemented through the use of a case study of real world data (co-citation data). This case study aims to use mutual information to interpret the semantics of anomalies identified in co-citation, dataset that can provide valuable insights in determining the nature of a given link and potentially identifying important future link relationships
International Journal of Engineering Research and Development (IJERD)IJERD Editor
journal publishing, how to publish research paper, Call For research paper, international journal, publishing a paper, IJERD, journal of science and technology, how to get a research paper published, publishing a paper, publishing of journal, publishing of research paper, reserach and review articles, IJERD Journal, How to publish your research paper, publish research paper, open access engineering journal, Engineering journal, Mathemetics journal, Physics journal, Chemistry journal, Computer Engineering, Computer Science journal, how to submit your paper, peer reviw journal, indexed journal, reserach and review articles, engineering journal, www.ijerd.com, research journals,
yahoo journals, bing journals, International Journal of Engineering Research and Development, google journals, hard copy of journal
Hierarchical clustering of multi class data (the zoo dataset)Raid Mahbouba
Data mining project
The main goal of this study is to group 101 animals into their natural family types using various features of animals and by utilizing Hierarchical clustering algorithm which is one of the unsupervised learning algorithms.
Certain Investigation on Dynamic Clustering in Dynamic Dataminingijdmtaiir
Clustering is the process of grouping a set of objects
into classes of similar objects. Dynamic clustering comes in a
new research area that is concerned about dataset with dynamic
aspects. It requires updates of the clusters whenever new data
records are added to the dataset and may result in a change of
clustering over time. When there is a continuous update and
huge amount of dynamic data, rescan the database is not
possible in static data mining. But this is possible in Dynamic
data mining process. This dynamic data mining occurs when
the derived information is present for the purpose of analysis
and the environment is dynamic, i.e. many updates occur.
Since this has now been established by most researchers and
they will move into solving some of the problems and the
research is to concentrate on solving the problem of using data
mining dynamic databases. This paper gives some
investigation of existing work done in some papers related with
dynamic clustering and incremental data clustering
Privacy Preserving MFI Based Similarity Measure For Hierarchical Document Clu...IJORCS
The increasing nature of World Wide Web has imposed great challenges for researchers in improving the search efficiency over the internet. Now days web document clustering has become an important research topic to provide most relevant documents in huge volumes of results returned in response to a simple query. In this paper, first we proposed a novel approach, to precisely define clusters based on maximal frequent item set (MFI) by Apriori algorithm. Afterwards utilizing the same maximal frequent item set (MFI) based similarity measure for Hierarchical document clustering. By considering maximal frequent item sets, the dimensionality of document set is decreased. Secondly, providing privacy preserving of open web documents is to avoiding duplicate documents. There by we can protect the privacy of individual copy rights of documents. This can be achieved using equivalence relation.
Enhancing Keyword Query Results Over Database for Improving User Satisfaction ijmpict
Storing data in relational databases is widely increasing to support keyword queries but search results does not gives effective answers to keyword query and hence it is inflexible from user perspective. It would be helpful to recognize such type of queries which gives results with low ranking. Here we estimate prediction of query performance to find out effectiveness of a search performed in response to query and features of such hard queries is studied by taking into account contents of the database and result list. One relevant problem of database is the presence of missing data and it can be handled by imputation. Here an inTeractive Retrieving-Inferring data imputation method (TRIP) is used which achieves retrieving and inferring alternately to fill the missing attribute values in the database. So by considering both the prediction of hard queries and imputation over the database, we can get better keyword search results.
The Statement of Conjunctive and Disjunctive Queries in Object Oriented Datab...Editor IJCATR
The entrance of object-oriented concepts into databases has caused relational databases to be gradually replaced by object-oriented databases in various fields. At the same time, several methods have been presented for handling uncertain real-world data; one such method for database modeling is an approach that couples object-oriented database modeling with fuzzy logic. Many queries that users pose are expressed in terms of linguistic variables, and because classical databases are unable to support these variables, fuzzy approaches are considered. In this study we investigate database queries in both simple and complex forms; for the complex form, we use conjunctive and disjunctive queries. We then use XML labels to express queries in fuzzy form, and entering the XML world also gives us a reliable way to communicate with other parts of the software. We further correct conjunctive and disjunctive queries over a fuzzy object-oriented database using the concepts of dependency measure and weight, where weights are assigned to the different phrases of a query based on user emphasis. Another aim of this research is mapping fuzzy queries to fuzzy-XML. The queries are expected to be simple to implement, with execution output much closer to users' needs and expectations. The results show that the proposed method expresses the possible conjunctive and disjunctive queries over the database in the form of fuzzy-XML.
Many data mining and knowledge discovery methodologies and process models have been developed with varying degrees of success. Three main methods are used to discover patterns in data: KDD, SEMMA and CRISP-DM. They are presented in many publications in the area and are used in practice. To our knowledge, no clear methodology has been developed to support link mining. However, there is a well-known methodology in knowledge discovery in databases, the Cross Industry Standard Process for Data Mining (CRISP-DM), developed by a consortium of several industrial companies, which is relevant to the study of link mining. In this study, CRISP-DM has been adapted to the field of link mining to detect anomalies. An important goal in link mining is the task of inferring links that are not yet known in a given network. The approach is implemented through a case study of real-world co-citation data. The case study uses mutual information to interpret the semantics of anomalies identified in the co-citation dataset, which can provide valuable insights for determining the nature of a given link and potentially identifying important future link relationships.
Mining academic social networks is becoming increasingly necessary as the amount of data grows, and it is a favorite research topic for many researchers. Data mining techniques are used for mining academic social networks. In this paper, we present an efficient frequent item set mining technique for academic social networks. The proposed framework first processes the research documents, then applies enhanced frequent item set mining to find the strength of the relationships between researchers. The proposed method is faster than older algorithms and also takes less main memory for computation.
Semantic Conflicts and Solutions in Integration of Fuzzy Relational Databases
Database schema integration is an important discipline for constructing heterogeneous multidatabase systems. Fuzzy information has also been introduced into relational databases and has been extensively studied; however, the issues of integrating local fuzzy relations are rarely addressed. In this paper, we identify new types of conflicts that may occur in schemas and data due to the inclusion of fuzzy relational databases. We propose a methodology that resolves these new types of conflicts in a specific order to minimize the execution time of the integration process.
Clustering the results of a search helps the user get an overview of the information returned. In this paper, we treat the clustering task as cataloguing the search results. By catalogue we mean a structured label list that helps the user understand the labels and search results. Cluster labelling is crucial because meaningless or confusing labels may mislead users into checking the wrong clusters for a query and wasting extra time. Additionally, labels should accurately reflect the contents of the documents within the cluster. To label clusters effectively, a new cluster labelling method is introduced, with particular emphasis on producing comprehensible and accurate cluster labels in addition to discovering the document clusters. We also present a new metric for assessing the success of cluster labelling. We adopt a comparative evaluation strategy to derive the relative performance of the proposed method with respect to two prominent search result clustering methods, Suffix Tree Clustering and Lingo, performing the experiments on the publicly available datasets Ambient and ODP-239.
The use of Cooking Gas as Refrigerant in a Domestic Refrigerator
The application of cooking gas refrigerants in refrigeration systems is considered a potential way to improve energy efficiency and to encourage the use of environment-friendly refrigerants. Refrigeration has been met with many challenges concerning its environmental impact, how it affects humans and how it contributes to society in general. Domestic refrigerators annually consume several metric tons of traditional refrigerants, which contribute to very high Ozone Depletion Potential (ODP) and Global Warming Potential (GWP). The experiment conducted employs two closely related refrigerants, R600a (isobutane) and cooking gas, varied in an ideal refrigerant mixture of 150 g of refrigerant and 15 ml of lubricating oil (to a rating of 40 wt % expected in the compressor). The laboratory process involved the use of a gas chromatography machine to ascertain the values of the mole ratio, molecular weight and critical temperatures, while the Prode Properties and Refprop software packages were used to ascertain the other refrigerant properties of the mixture. The results for the mixture of R600a with lubricant confirm mineral oil as the most appropriate for the operation. The experimental results indicated that the refrigeration system with cooking gas refrigerant worked normally, attaining a high freezing capacity and a COP value of 2.159. It is established that cooking gas is a viable alternative refrigerant to replace R600a in domestic refrigerators; hence, its application in refrigerating systems measures up to the current trend of environmental regulations on hydrocarbon refrigerants.
SUSTAINABLE BUILDINGS IN HOT AND DRY CLIMATE OF INDIA
The consumption of energy in buildings is increasing as development takes place at a very fast rate. No evidence is now required to prove that the present climate changes are directly linked to human activities, and concerns regarding exploitation of fossil fuels have reached a level where the negative effects impact the life of the common man. Passive architecture involves blending conventional architectural principles with solar and wind energy and the inherent properties of building materials to ensure that interiors remain warm in winter and cool in summer, thus creating a year-round comfortable environment. Various solar passive techniques have been studied in detail so that their undesirable impacts in a hot and dry climate can be mitigated. It is concluded that with the application of these techniques, buildings can be made comfortable with comparatively less use of energy.
Assessment on the Ecosystem Service Functions of Nansi Lake in China
The assessment of ecosystem service functions is one of the focus areas of modern ecological and environmental research. Nansi Lake, a typical shallow macrophytic lake in China, is selected as the study area. Based on the indicator system and assessment models established in this research, the ecosystem service functions of Nansi Lake are assessed. The results show that the drinking water source function is assessed as medium (2.2), aquatic product supply as good (3.5), ecological habitat as medium (3), entertainment and landscape as medium (2.55), and the water purification function of the lakeside zone as medium (3); the overall ecosystem service function of Nansi Lake can therefore be considered "Medium". Eutrophication control and ecological restoration of the lakeside wetland need to be enhanced in the future.
Speed Control of Induction Motor Using PLC and SCADA System
Automation, or automatic control, is the use of various control systems for operating equipment such as machinery, processes in factories, boilers and heat-treating ovens, switching in telephone networks, and the steering and stabilization of ships and aircraft, with minimal or reduced human intervention; some processes have been completely automated. Here the motor speed is controlled via the driver as an open-loop control. To obtain a more precise closed-loop control of motor speed, we use a tachometer to measure the speed and feed it back to the PLC, which compares it to the desired value and takes a control action; the signal is then transferred to the motor via the driver to increase or decrease the speed. We measure the speed of the motor using an incremental rotary encoder, adjust the parameters (PLC, driver), and also aim to reduce the overall cost of the system. Our control system is implemented on the available Siemens PLC, and motor parameters are monitored via a SCADA system.
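The closed-loop correction described above can be sketched as a simple PI scan loop. This is an illustrative Python model only: the plant, the gains and the `measure`/`drive` callbacks are hypothetical stand-ins for the PLC, tachometer and driver, not the Siemens implementation.

```python
def pi_speed_loop(setpoint, measure, drive, kp=0.05, ki=0.1, dt=0.1, steps=200):
    """Minimal PI scan loop: compare tachometer feedback with the
    setpoint and adjust the drive command each cycle, as a PLC would."""
    integral = 0.0
    command = 0.0
    for _ in range(steps):
        error = setpoint - measure(command)  # feedback from the tachometer
        integral += error * dt
        command = kp * error + ki * integral
        drive(command)  # send the new command to the motor driver
    return command

# Toy first-order plant standing in for motor + driver:
# speed moves halfway toward 10x the command each scan cycle.
speed = {"rpm": 0.0}

def measure(cmd):
    speed["rpm"] += 0.5 * (10.0 * cmd - speed["rpm"])
    return speed["rpm"]

pi_speed_loop(1500.0, measure, drive=lambda c: None)
print(round(speed["rpm"]))  # settles near the 1500 rpm setpoint
```

The integral term is what removes the steady-state error that a purely proportional (or open-loop) drive would leave.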
Design and Simulation of a Fractal Micro-Transformer
Due to advancements in smart technologies, issues such as the integration of renewable energy into existing power systems and the reduced weight and size of power equipment must be addressed. In this regard, this work focuses on the study and design of a fractal-type micro-transformer for day-to-day applications. An air-core transformer is designed using finite element modeling. The obtained results showed far better implementation parameters in comparison to macro transformers.
Database Security Two Way Authentication Using Graphical Password
Data represent a key asset for today's organizations, and how to protect these data from attackers, theft and misuse is at the forefront of any organization's mind. Even though several data security techniques are available today to protect databases and computing infrastructure, many tools, such as network security and firewalls, are unable to prevent attacks from insiders. An insider is a person working in the organization who may try to access sensitive data. This paper proposes a two-way authentication method which fuses a knowledge-based secret with personal trait information.
Analytic Solutions to the Darcy-Lapwood-Brinkman Equation with Variable Perme...
Three exact solutions to the Darcy-Lapwood-Brinkman equation with variable permeability are obtained in this
work. Solutions are obtained for a given vorticity distribution, taken as a function of the streamfunction.
Classification of the flow field is provided and comparison is made with the solutions obtained when
permeability is constant. Interdependence of Reynolds number and variable permeability is emphasized.
A Hypothetical Study on Structural aspects of Indole-3-carbinol (I3C) by Hype...
Indole-3-carbinol (I3C) is a plant compound derived from glucosinolates, found in cruciferous vegetables. Researchers have indicated that I3C shows great promise as a cancer preventative and hormone-balancing agent. HyperChem 7.5 software was used for quantum mechanical calculations. The geometry optimization was carried out using an ab initio method, and QSAR parameters were generated with the semi-empirical single-point AM1 method. The HOMO and LUMO frontier orbital energies were also computed. Conformational analysis and geometry optimization of I3C were performed according to the Hartree-Fock (HF) calculation method with ArgusLab 4.0.1 software, and the minimum heat of formation was calculated with the geometry convergence function of ArgusLab. PM3 semi-empirical quantum mechanical calculations were carried out on the structure of I3C to obtain the geometries, geometric parameters and thermodynamic parameters, and the HOMO and LUMO frontier orbital energies were computed for the optimized molecule. The electron density surface of IDOX was determined using the PM3 geometry with the PM3 wavefunction.
A SECURITY BASED VOTING SYSTEM USING BIOMETRICS
The problem of voting is still critical in terms of safety and security. This paper describes the design and development of a voting system that uses fingerprints to provide high performance and high security. Fingerprint biometrics is widely used for identification; biometric identifiers cannot be misplaced and uniquely represent an individual's identity. Integrating biometrics with an electronic voting machine requires less manpower, saves much time for voters and personnel, and ensures accuracy, transparency and fast results in an election. In this paper a framework for an electronic voting machine based on biometric verification is proposed and implemented. The proposed framework provides secured identification and authentication processes for voters and candidates through the use of fingerprint biometrics.
Finite Element Analysis of Composite Deck Slab Using Perfobond Rib as Shear C...
Nowadays, composite decks are very common in composite and steel construction. In this study the composite slabs were investigated numerically by the Finite Element Method (FEM): five composite slabs were analyzed using the finite element software LUSAS. The deflection of each model was obtained and compared with experimental tests. The results showed good agreement with the experimental data and indicate that the perfobond rib is an appropriate shear connector for bridge decks.
Geostatistical analysis of rainfall variability on the plateau of Allada in S...
The goal of this survey is to contribute to a better understanding of the distribution of rainfall on the plateau of Allada in Benin. The plateau of Allada is the granary of Cotonou and its vicinity, and food production there is more than 62% rainfed; it is therefore important to analyze how rains are spatially distributed over the area in order to deduce the potential rainfall. To achieve this goal, rainfall data from 28 stations were used, and three sub-periods were identified: 1996-2000, 2001-2005 and 2006-2010. The distribution of rainfall was established with the Thiessen and kriging methods. On average, 1,117 mm of rain fell on the study area per year, but three tendencies were observed: less rainy zones, fairly rainy zones, and greatly rainy zones. All the rainfall zones showed an increase in precipitation except Abomey-Calavi and Niaouli, but the variations are not significant. Analysis of the spatial structure for the kriging of precipitation revealed a power variogram model. The direction of the rainfall gradient is oriented southeast to northwest during the three sub-periods. Abomey-Calavi recorded the weakest precipitation, while the strongest values alternated between Toffo, Sékou, Ouidah-North and Ouidah-City.
MIMO Systems for Military Communication/Applications
The military is embracing the communication revolution, turning to a new generation of sophisticated systems to enable communication that is faster, richer, less costly, more flexible, reliable, compact, mobile, jam-resistant, reconfigurable, spectrally efficient and of low probability of detection. Many of these features can be added to a great extent to existing systems by utilising MIMO technology appropriately and judiciously. MIMO finds applications in wireless communication, NLOS communication, satellite communication, HF communication and optical fibre communication. MIMO makes these technologies more suitable by introducing features mandatory for military communication, such as anti-jamming capability, low probability of intercept, and low visibility of satellite earth antennas through reduced aperture area. MIMO can also provide redundancy without employing extra resources, thereby increasing reliability, and it is highly effective for communicating with Unmanned Aerial Vehicles (1).
Comparative Analysis of Different Wavelet Functions using Modified Adaptive F...
Traditional wavelet denoising is inefficient at removing the noise that overlaps with the signal, so a modified adaptive filtering method based on the wavelet transform is introduced. The method used in this paper filters out noise on the basis of wavelet denoising with different wavelet functions. The simulation results give the Signal-to-Noise Ratio (SNR), Mean Square Error (MSE) and signal error power spectral density comparison plots for the different wavelet functions. These comparisons verify that Daubechies wavelets are more efficient than the other wavelet functions at filtering out noise in all respects.
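As a reference for how such comparisons are scored, here is a minimal sketch of the SNR and MSE metrics computed against a clean reference signal; the test signal and the constant residual are illustrative, not the paper's data.

```python
import math

def mse(clean, denoised):
    """Mean squared error between the clean reference and the output."""
    return sum((c - d) ** 2 for c, d in zip(clean, denoised)) / len(clean)

def snr_db(clean, denoised):
    """Output SNR in dB: reference signal power over residual-error power."""
    p_signal = sum(c * c for c in clean) / len(clean)
    return 10 * math.log10(p_signal / mse(clean, denoised))

clean = [math.sin(0.1 * n) for n in range(100)]
denoised = [s + 0.01 for s in clean]  # toy output with a constant 0.01 residual
print(mse(clean, denoised), snr_db(clean, denoised))
```

A better wavelet choice lowers the MSE and raises the SNR on the same noisy input, which is exactly the ranking the comparison plots express.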
The Recent Trend: Vigorous unidentified validation access control system with...
Service providers can grow their business by selling a cloud authentication service that can be fully branded to the service provider. We propose an enhanced decentralized access control scheme for secure data storage in clouds that supports anonymous authentication: the cloud verifies the authenticity of the user without knowing the user's identity before storing data. The scheme also has the added feature of access control in which only valid users are able to decrypt the stored information. It prevents replay attacks, supports the creation, modification and reading of data stored in the cloud, and addresses user revocation. Moreover, our authentication and access control scheme is decentralized and robust, unlike other access control schemes designed for clouds, which are centralized. The communication, computation and storage overheads, with hidden attributes and high access security, are comparable to centralized approaches.
Existing work is based on the attribute-based encryption (ABE) technique, a centralized approach in which a single Key Distribution Centre (KDC) distributes secret keys and attributes to all users using an asymmetric key approach. We propose a new decentralized access control method for storing data securely in clouds that also hides a user's attributes and access rules. The cloud validates the authenticity of the user without knowing the user's characteristics before the data are stored, so only certified users have the right to use the relevant attributes. In the future, a time-based file revocation scheme can be used to assure the deletion of a file: when the time limit of a file expires, we implement policy-based renewal of its time.
Solid Waste Management System: Public-Private Partnership, the Best System fo...
Solid waste management (SWM) is a major public health and environmental concern in the urban areas of many developing countries. Nairobi's solid waste situation, which can be taken as broadly representative, is largely characterized by low coverage of solid waste collection, pollution from uncontrolled dumping of waste, inefficient public services, an unregulated and uncoordinated private sector, and a lack of key solid waste management infrastructure. This paper recapitulates the public-private partnership as the best system for developing countries, covering the challenges, approaches, practices and outcomes of SWM. The literature review surveys information pertaining to existing waste management methodologies, policies, and research relevant to SWM. Information was sourced from peer-reviewed academic literature, grey literature, publicly available waste management plans, and consultation with waste management professionals. Literature pertaining to SWM and municipal solid waste minimization, auditing and management was searched through online journal databases, particularly Web of Science and Science Direct. Legislation pertaining to waste management was also researched using the different databases, and additional information was obtained from grey literature and textbooks on waste management topics. After preliminary research, the reference lists of selected sources were scanned for additional relevant articles, and the research was expanded to include literature on recycling, composting, education, and case studies. The manuscript closes with recommendations: collaboration through public-private partnerships; sensitization of people; privatization to improve processes and modernize urban waste management; contracting the private sector; encouraging integrated waste management; changing the mindset of provincial government leaders; preparing strategic, integrated SWM plans for the cities; enacting strong and adequate legislation at city and national level; evaluating the real impacts of waste management systems; utilizing locally based solutions for SWM service delivery and for the design, location and management of waste collection centers; and encouraging recycling and composting activities.
In this article, the issues that have captured the attention of researchers in multinational corporations (MNC) are
discussed and the emerging research agenda is laid out. The first part focuses on understanding the history, and
contemporary scale and significance of multinationals as economic actors. Two opposing perspectives are
distinguished, the economic and the political. In the past, there was a rigid divide between these but,
increasingly, researchers are using elements of both perspectives to understand the dynamics of multinationals.
The crucial additional feature here is the importation of insights from institutional literature on the relationship
between firms and national contexts. Multinational corporations (MNCs) are playing a large and growing role in
shaping our world, both economically and politically. Public and academic opinion has long been mired in an
inconclusive debate as to whether these phenomena are beneficial things that should be encouraged or harmful
things that need intensive governmental regulation. The integrating thesis of this book is that the question as to
whether they are good or bad is the wrong question and is based on the fundamentally faulty premise that all
foreign subsidiaries are essentially similar, i.e., that MNCs are homogeneous entities. The inevitability of heterogeneity results in the imperatives of disaggregation and the fallacy of generalization if these complex, differentiated phenomena are to be properly understood.
On approximations of trigonometric Fourier series as transformed to alternati...
Trigonometric Fourier series are transformed into: (1) alternating series with terms of finite sums or
(2) finite sums of alternating series. Alternating series may be approximated by finite sums more accurately as
compared to their simple truncation. The accuracy of these approximations is discussed.
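A minimal illustration of why alternating series admit better-than-truncation approximation: averaging the last two partial sums of the Leibniz series for π/4 (a standard alternating series, used here only as a stand-in for the transformed Fourier series).

```python
import math

def partial_sums(terms):
    """Running partial sums of a series."""
    s, out = 0.0, []
    for t in terms:
        s += t
        out.append(s)
    return out

N = 50
# Leibniz series: pi/4 = 1 - 1/3 + 1/5 - 1/7 + ...
terms = [(-1) ** k / (2 * k + 1) for k in range(N)]
S = partial_sums(terms)
exact = math.pi / 4
truncated = S[-1]                 # plain truncation after N terms
averaged = 0.5 * (S[-1] + S[-2])  # mean of the last two partial sums
print(abs(truncated - exact), abs(averaged - exact))
```

For an alternating series with monotonically decreasing terms, the true sum lies between consecutive partial sums, so their mean has a much smaller error than either endpoint; this is the simplest case of the acceleration the paper discusses.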
Comparison of PWM Techniques and Selective Harmonic Elimination for 3-Level I...
In this paper, the Selective Harmonic Elimination (SHE) method is compared with different Pulse Width Modulation (PWM) techniques used to generate the gate pulses for firing the IGBTs. Higher-order harmonics are easily eliminated by selecting properly tuned filters, but lower-order harmonics such as the 3rd, 5th, 7th and 9th are not easily eliminated. This paper uses the Newton-Raphson method with a random initial guess for SHE and shows where solutions of the transcendental nonlinear equations can exist. With this method, the lower-order harmonics as well as the Total Harmonic Distortion (THD) are reduced. A cascaded H-bridge multilevel inverter modeled with MATLAB/Simulink blocks is presented.
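A sketch of the Newton-Raphson iteration for a two-angle SHE system, controlling the fundamental and eliminating the 5th harmonic; the equations, modulation target and initial guess are illustrative, not the paper's inverter model.

```python
import math

def she_angles(m, th=(0.3, 1.0), iters=30):
    """Newton-Raphson for two switching angles t1, t2 of a cascaded
    H-bridge, solving the transcendental system
        F1 = cos(t1) + cos(t2) - m     = 0   (fundamental)
        F2 = cos(5 t1) + cos(5 t2)     = 0   (5th harmonic eliminated)."""
    t1, t2 = th
    for _ in range(iters):
        f1 = math.cos(t1) + math.cos(t2) - m
        f2 = math.cos(5 * t1) + math.cos(5 * t2)
        # Jacobian of (F1, F2) with respect to (t1, t2)
        j11, j12 = -math.sin(t1), -math.sin(t2)
        j21, j22 = -5 * math.sin(5 * t1), -5 * math.sin(5 * t2)
        det = j11 * j22 - j12 * j21
        # Newton step: theta -= J^{-1} F (2x2 inverse written out)
        t1 -= (j22 * f1 - j12 * f2) / det
        t2 -= (-j21 * f1 + j11 * f2) / det
    return t1, t2

t1, t2 = she_angles(1.2)
print(round(t1, 4), round(t2, 4))
```

A random initial guess, as the abstract mentions, matters because the transcendental system has multiple branches; only guesses within the feasible quadrant converge to usable angles.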
Digitization of Speedometer Incorporating Arduino and Tracing of Location Usi...
This paper deals with telemetry, a field which has grown exponentially in recent years. A correct characterization must handle and analyze the signals from the sensor that switches, positions and detects the speed. Placing the sensor on the wheels of the train produces a continuous pulse train proportional to the train speed, which is sent to an Arduino microcontroller to be analyzed and processed; the data are simultaneously stored in memory and displayed. In addition, GPS is used to trace the current location, which is viewed on an LCD placed in every compartment. The simulation has been carried out using MATLAB/Simulink software. Both simulation and experimental results under diverse conditions are presented to clearly describe the characterization of our system.
Elimination of Data Redundancy before Persisting into DBMS Using SVM Classification
Database management systems are one of the growing fields in the computing world. Grid computing, internet sharing, distributed computing, parallel processing and the cloud all store huge amounts of data in a DBMS to maintain the structure of the data. Memory management is a major concern in a DBMS because of the edit, delete, recover and commit operations applied to records. To utilize memory efficiently, redundant data should be eliminated accurately. In this paper, redundant data is detected by the Quick Search Bad Character (QSBC) function and reported to the DB admin for removal. The QSBC function compares the entire data against patterns taken from an index table created for all the data persisted in the DBMS, making it easy to find redundant (duplicate) data in the database. The experiment was conducted in SQL Server on a university student database containing data on 15,000 students involved in various activities, and performance was evaluated in terms of time and accuracy.
Keywords: data redundancy, database management system, support vector machine, duplicate data.
I. INTRODUCTION
The growing mass of information present in digital media has become a pressing problem for data administrators. Data repositories, such as those used by digital libraries and e-commerce agents, are usually shaped from data gathered from distinct origins and hold records with disparate schemata and structures. Problems with response time, availability, security and quality assurance also become more troublesome to manage as the amount of data grows larger. It is reasonable to assume that the quality of the data an organization uses in its systems determines its efficiency in offering beneficial services to its users. In this environment, the consequences of maintaining repositories with "dirty" data (i.e., with replicas, identification errors, duplicate patterns, etc.) go well beyond technical concerns such as the overall speed or performance of data administration systems.
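The Quick Search bad-character shift named above is Sunday's variant of the Boyer-Moore family; a minimal sketch of the table and the search (illustrative, not the paper's SQL Server implementation):

```python
def qs_bad_char(pattern):
    """Quick Search bad-character table (Sunday's algorithm): for the
    character just past the current window, how far the pattern may shift."""
    m = len(pattern)
    shift = {}
    for i, ch in enumerate(pattern):
        shift[ch] = m - i  # rightmost occurrence wins
    return shift

def quick_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    shift = qs_bad_char(pattern)
    hits, i = [], 0
    while i <= n - m:
        if text[i:i + m] == pattern:
            hits.append(i)
        if i + m >= n:
            break  # no character past the window; done
        # shift by the table entry for the char after the window
        i += shift.get(text[i + m], m + 1)
    return hits

print(quick_search("aabcaabcab", "abc"))  # → [1, 5]
```

Applied to deduplication, each stored record pattern is searched against incoming data, and matches are flagged to the admin as duplicates.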
Entity resolution is a problem that occurs in many information integration processes and applications. Its primary goal is to resolve data references to the corresponding same real-world entity. Ambiguity in references arises in various networks, such as social networks, biological networks, citation graphs and many others, and it leads not only to data redundancy but also to inaccuracies in knowledge representation, extraction and query processing. Entity resolution is the solution to this problem. Many approaches exist, such as pair-wise similarity over the attributes of references, a parallel approach for morphing the graph data onto a cluster of nodes (P-Swoosh) [2], and relational clustering that makes use of relational information in addition to attribute similarity. In this article, we make use of relational clustering to resolve author name ambiguities in a subset of a real-world dataset: the US patent network, consisting of more than 650,000 author references.
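The pair-wise attribute similarity mentioned above can be illustrated with a toy greedy resolver using Jaccard similarity over name tokens; the threshold and the reference strings are hypothetical, and the article itself uses relational clustering on the patent network rather than this baseline.

```python
def jaccard(a, b):
    """Jaccard similarity between the token sets of two name strings."""
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b)

def resolve(references, threshold=0.5):
    """Greedy pair-wise entity resolution: a reference joins the first
    cluster containing a member similar enough to it."""
    clusters = []
    for ref in references:
        for cluster in clusters:
            if any(jaccard(ref, member) >= threshold for member in cluster):
                cluster.append(ref)
                break
        else:
            clusters.append([ref])  # no match: start a new entity
    return clusters

refs = ["john a smith", "smith john", "mary jones", "jones mary k"]
print(resolve(refs))
# → [['john a smith', 'smith john'], ['mary jones', 'jones mary k']]
```

Relational clustering improves on this baseline by also comparing the entities each reference links to (co-authors, citations), not just the attribute strings.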
Power Management in Micro grid Using Hybrid Energy Storage System
This paper proposes power management in a microgrid using a hybrid distributed generator based on photovoltaics, a wind-driven PMDC machine and an energy storage system. In this generator, the sources are connected together to the grid through an interleaved boost converter followed by an inverter; thus, compared to earlier schemes, the proposed scheme has fewer power converters. Fuzzy-based MPPT controllers are also proposed for the new hybrid scheme to separately trigger the interleaved DC-DC converter and the inverter so as to track the maximum power from both sources. The integrated operation of the proposed controllers under different conditions is demonstrated through simulation with the help of MATLAB software.
Hybrid approach for generating non overlapped substring using genetic algorithm
TRANSFORMATION RULES FOR BUILDING OWL ONTOLOGIES FROM RELATIONAL DATABASES
Relational databases (RDBs) are used as the backend database by most information systems. An RDB encapsulates the conceptual model and metadata needed for ontology construction. Schema mapping is the technique used by all existing approaches for building ontologies from RDBs; however, most of those methods use poor transformation rules that prevent advanced database mining for building rich ontologies. In this paper, we propose transformation rules for building OWL ontologies from RDBs that can transform all possible cases in an RDB into ontological constructs. The proposed rules are enriched by analyzing the stored data to detect disjointness and totalness constraints in hierarchies and by calculating the participation level of tables in n-ary relations. In addition, our technique is generic, so it can be applied to any RDB. The proposed rules were evaluated using a normalized and open RDB; the obtained ontology is richer in terms of non-taxonomic relationships.
IJERA (International journal of Engineering Research and Applications) is International online, ... peer reviewed journal. For more detail or submit your article, please visit www.ijera.com
Information residing in relational databases and delimited file systems are inadequate for reuse and sharing over the web. These file systems do not adhere to commonly set principles for maintaining data harmony. Due to these reasons, the resources have been suffering from lack of uniformity, heterogeneity as well as redundancy throughout the web. Ontologies have been widely used for solving such type of problems, as they help in extracting knowledge out of any information system. In this article, we focus on extracting concepts and their relations from a set of CSV files. These files are served as individual concepts and grouped into a particular domain, called the domain ontology. Furthermore, this domain ontology is used for capturing CSV data and represented in RDF format retaining links among files or concepts. Datatype and object properties are automatically detected from header fields. This reduces the task of user involvement in generating mapping files. The detail analysis has been performed on Baseball tabular data and the result shows a rich set of semantic information.
Textual Data Partitioning with Relationship and Discriminative AnalysisEditor IJMTER
Data partitioning methods are used to partition the data values with similarity. Similarity
measures are used to estimate transaction relationships. Hierarchical clustering model produces tree
structured results. Partitioned clustering produces results in grid format. Text documents are
unstructured data values with high dimensional attributes. Document clustering group ups unlabeled text
documents into meaningful clusters. Traditional clustering methods require cluster count (K) for the
document grouping process. Clustering accuracy degrades drastically with reference to the unsuitable
cluster count.
Textual data elements are divided into two types’ discriminative words and nondiscriminative
words. Only discriminative words are useful for grouping documents. The involvement of
nondiscriminative words confuses the clustering process and leads to poor clustering solution in return.
A variation inference algorithm is used to infer the document collection structure and partition of
document words at the same time. Dirichlet Process Mixture (DPM) model is used to partition
documents. DPM clustering model uses both the data likelihood and the clustering property of the
Dirichlet Process (DP). Dirichlet Process Mixture Model for Feature Partition (DPMFP) is used to
discover the latent cluster structure based on the DPM model. DPMFP clustering is performed without
requiring the number of clusters as input.
Document labels are used to estimate the discriminative word identification process. Concept
relationships are analyzed with Ontology support. Semantic weight model is used for the document
similarity analysis. The system improves the scalability with the support of labels and concept relations
for dimensionality reduction process.
Similar to Confiscation of Duplicate Tuples in The Relational Databases (20)
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdfKamal Acharya
The College Bus Management system is completely developed by Visual Basic .NET Version. The application is connect with most secured database language MS SQL Server. The application is develop by using best combination of front-end and back-end languages. The application is totally design like flat user interface. This flat user interface is more attractive user interface in 2017. The application is gives more important to the system functionality. The application is to manage the student’s details, driver’s details, bus details, bus route details, bus fees details and more. The application has only one unit for admin. The admin can manage the entire application. The admin can login into the application by using username and password of the admin. The application is develop for big and small colleges. It is more user friendly for non-computer person. Even they can easily learn how to manage the application within hours. The application is more secure by the admin. The system will give an effective output for the VB.Net and SQL Server given as input to the system. The compiled java program given as input to the system, after scanning the program will generate different reports. The application generates the report for users. The admin can view and download the report of the data. The application deliver the excel format reports. Because, excel formatted reports is very easy to understand the income and expense of the college bus. This application is mainly develop for windows operating system users. In 2017, 73% of people enterprises are using windows operating system. So the application will easily install for all the windows operating system users. The application-developed size is very low. The application consumes very low space in disk. Therefore, the user can allocate very minimum local disk space for this application.
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...Amil Baba Dawood bangali
Contact with Dawood Bhai Just call on +92322-6382012 and we'll help you. We'll solve all your problems within 12 to 24 hours and with 101% guarantee and with astrology systematic. If you want to take any personal or professional advice then also you can call us on +92322-6382012 , ONLINE LOVE PROBLEM & Other all types of Daily Life Problem's.Then CALL or WHATSAPP us on +92322-6382012 and Get all these problems solutions here by Amil Baba DAWOOD BANGALI
#vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore#blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #blackmagicforlove #blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #Amilbabainuk #amilbabainspain #amilbabaindubai #Amilbabainnorway #amilbabainkrachi #amilbabainlahore #amilbabaingujranwalan #amilbabainislamabad
Quality defects in TMT Bars, Possible causes and Potential Solutions.PrashantGoswami42
Maintaining high-quality standards in the production of TMT bars is crucial for ensuring structural integrity in construction. Addressing common defects through careful monitoring, standardized processes, and advanced technology can significantly improve the quality of TMT bars. Continuous training and adherence to quality control measures will also play a pivotal role in minimizing these defects.
Democratizing Fuzzing at Scale by Abhishek Aryaabh.arya
Presented at NUS: Fuzzing and Software Security Summer School 2024
This keynote talks about the democratization of fuzzing at scale, highlighting the collaboration between open source communities, academia, and industry to advance the field of fuzzing. It delves into the history of fuzzing, the development of scalable fuzzing platforms, and the empowerment of community-driven research. The talk will further discuss recent advancements leveraging AI/ML and offer insights into the future evolution of the fuzzing landscape.
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Dr.Costas Sachpazis
Terzaghi's soil bearing capacity theory, developed by Karl Terzaghi, is a fundamental principle in geotechnical engineering used to determine the bearing capacity of shallow foundations. This theory provides a method to calculate the ultimate bearing capacity of soil, which is the maximum load per unit area that the soil can support without undergoing shear failure. The Calculation HTML Code included.
Forklift Classes Overview by Intella PartsIntella Parts
Discover the different forklift classes and their specific applications. Learn how to choose the right forklift for your needs to ensure safety, efficiency, and compliance in your operations.
For more technical information, visit our website https://intellaparts.com
Water scarcity is the lack of fresh water resources to meet the standard water demand. There are two type of water scarcity. One is physical. The other is economic water scarcity.
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
Automobile Management System Project Report.pdfKamal Acharya
The proposed project is developed to manage the automobile in the automobile dealer company. The main module in this project is login, automobile management, customer management, sales, complaints and reports. The first module is the login. The automobile showroom owner should login to the project for usage. The username and password are verified and if it is correct, next form opens. If the username and password are not correct, it shows the error message.
When a customer search for a automobile, if the automobile is available, they will be taken to a page that shows the details of the automobile including automobile name, automobile ID, quantity, price etc. “Automobile Management System” is useful for maintaining automobiles, customers effectively and hence helps for establishing good relation between customer and automobile organization. It contains various customized modules for effectively maintaining automobiles and stock information accurately and safely.
When the automobile is sold to the customer, stock will be reduced automatically. When a new purchase is made, stock will be increased automatically. While selecting automobiles for sale, the proposed software will automatically check for total number of available stock of that particular item, if the total stock of that particular item is less than 5, software will notify the user to purchase the particular item.
Also when the user tries to sale items which are not in stock, the system will prompt the user that the stock is not enough. Customers of this system can search for a automobile; can purchase a automobile easily by selecting fast. On the other hand the stock of automobiles can be maintained perfectly by the automobile shop manager overcoming the drawbacks of existing system.
Courier management system project report.pdfKamal Acharya
It is now-a-days very important for the people to send or receive articles like imported furniture, electronic items, gifts, business goods and the like. People depend vastly on different transport systems which mostly use the manual way of receiving and delivering the articles. There is no way to track the articles till they are received and there is no way to let the customer know what happened in transit, once he booked some articles. In such a situation, we need a system which completely computerizes the cargo activities including time to time tracking of the articles sent. This need is fulfilled by Courier Management System software which is online software for the cargo management people that enables them to receive the goods from a source and send them to a required destination and track their status from time to time.
K.V.Ramana, G.V.Ramesh Babu, Int. Journal of Engineering Research and Application, www.ijera.com
ISSN: 2248-9622, Vol. 6, Issue 3, (Part - 1) March 2016, pp. 49-53
Confiscation of Duplicate Tuples in The Relational Databases
Dr.K.VenkataRamana*, Dr.G.V.Ramesh Babu**
*Department of Computer Science, KMM. Institute of Post Graduate Studies, Tirupati
** Department of Computer Science, S.V.University,Tirupati
ABSTRACT
A relational database is a collection of relations. Duplicate tuples are common in many real-time relational databases, yet there is no known simple and direct technique for finding duplicated records in a relational database. In a relational database, if the same real-world entity is represented by more than one tuple, such tuples are called duplicate tuples. Finding duplicate tuples and then replacing them by one best tuple is called a fusion operation. Whenever duplicate tuples are found in the relations of a database, they must be replaced with one best approximate tuple that represents the joint information of all the duplicates. The present study proposes new techniques to find duplicate tuples and then replace them with the correct real-world tuples. In the first step, duplicate tuples in the relation are classified based on the class label; in the second step, for each set of duplicate tuples, either the functional dependency method or the union method is applied to replace the duplicates with the corresponding correct real-world single tuple. One possibility is to replace one set of duplicate tuples with one correct real-world tuple; another is to replace two or more sets of duplicate tuples in the relation by one set of correct real-world tuples. Sometimes the removal of duplicate tuples can create referential integrity violations; all such violations must be controlled and coordinated both syntactically and semantically across the relations.
Keywords: Relational Database, Duplicate Tuples, Joins, Union
I. INTRODUCTION
The purpose of this paper is to focus on duplicate record detection algorithms in relational databases. In many current real-time applications, the data in the relations of a database is inherently associated with duplicate tuples, and finding and removing such duplicates is an important and active research topic. In the context of relational databases, dealing with duplicates comes down to (a) identifying which tuples are duplicates and (b) replacing those tuples by a single tuple [1]. Data duplication is also known as entity resolution or record linkage [1]. Duplicate records do not share a common key and/or they contain errors that make duplicate matching a difficult task [2]. Duplicate tuples are present in one or more relational databases when there exist multiple descriptions of the same real-world entity. Errors are introduced as the result of transcription errors, incomplete information, lack of standard formats, or any combination of these factors [2]. The presence of duplicate tuples causes many database maintenance problems; common causes include missing attribute values, data entry errors, typing errors, and the absence of standards in data entry and maintenance. Often, in the real world, entities have two or more representations in databases [2]. Duplicate tuple detection and replacement with a correct tuple is therefore inevitable in many relations of a relational database. Data fusion is the step of actually merging multiple, duplicate tuples into a single representation of a real-world object [3].
A crucial operation in the maintenance of
data quality in relational databases is to remove tuples
that mutually describe the same entity (i.e., duplicate
tuples) and to replace them with a tuple that
minimizes information loss [4]. A special procedure
is needed to take care of integrity constraint violations
that occur when duplicate tuples are removed from
the relations. One way is to develop a general
procedure that not only covers integrity constraint
violation management but also manages semantic
relationships among the relations in the relational
database. Duplicate record detection is the process of
identifying different or multiple records that refer to
one unique real-world entity or object [2].
Traditional approaches to entity resolution
and de-duplication use a variety of attribute similarity
measures, often based on approximate string-matching criteria [5]. Many mathematical tools or
models are available for efficient management of
syntax as well as semantic modifications in the
relations of any relational database. A correct way of
propagation is to take into account that multiple
tuples are fused, meaning that linked information
should be fused accordingly [1]. Integrity constraints
imposed on the relational database must be satisfied
before and after deleting the original duplicate data
tuples. First determine all such duplicate tuples in the
relations of any relational database and then replace
all such duplicate tuples by a single correct tuple.
Referential integrity in particular must be considered and controlled when propagating data fusion. Several integrity constraint management strategies, such as ON DELETE CASCADE, ON UPDATE CASCADE, SET NULL, SET NOT NULL, and RESTRICT, are available for database modifications; however, for duplicate removal these strategies are syntactically correct but semantically incorrect.
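The semantic gap can be seen in a small sketch. The relation names mirror the running example, but the in-memory structures and values are illustrative assumptions, not the paper's implementation: ON DELETE CASCADE silently discards the child tuples of a removed duplicate, whereas remapping the foreign key to the surviving tuple preserves them.

```python
# Illustrative in-memory relations (hypothetical values).
college = {1: "KMM", 2: "KMM"}                        # tuples 1 and 2 duplicate the same college
conducted = [(1, "Conference1"), (2, "Conference2")]  # (foreign key, conference)

# Syntactic strategy: dropping duplicate key 2 with ON DELETE CASCADE
# also deletes its conference row -- the history is lost.
cascade = [(sno, conf) for (sno, conf) in conducted if sno != 2]

# Semantic strategy: first remap the foreign key to the surviving tuple 1.
remapped = [(1 if sno == 2 else sno, conf) for (sno, conf) in conducted]

print(cascade)    # [(1, 'Conference1')]
print(remapped)   # [(1, 'Conference1'), (1, 'Conference2')]
```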
The present study proposes a new method to eliminate duplicate tuples in the relations of a relational database, called the union fusion function technique, which operates on attribute values. It also proposes a second duplicate-replacement technique based on functional dependencies; this more generalized approach covers both partial preservative functions and complete (full) preservative functions.
The present study also proposes a technique for the removal of duplicate tuples in relations. For example, assume that a relation contains many tuples. To find and remove the duplicates, we first apply a classification technique to classify all the tuples; then, based on the class labels, duplicate tuples are identified and replaced by the correct real-world tuple. The K-nearest neighbor classifier is one of the best tools in machine learning and data mining for finding and removing duplicate records. The decision tree is probably the most important and most interpretable classification technique, and it is often used as a benchmark before applying any other classification technique. The time complexity of building a decision tree is O(n × a × log n), where n is the number of tuples in the relation and a is the number of attributes. Duplicate records slow down the indexing process and significantly increase the cost of saving and managing data [6].
II. PROPOSED WORK
Data duplication is common in many real-time applications, particularly in the relations of relational databases. Finding duplicate tuples and then replacing them by one best tuple is called a fusion operation. During a fusion operation, integrity constraint violations must be controlled carefully, and the relational database must be kept consistent before and after the database modifications as well as after the removal of duplicate tuples.
In the present research paper, a sample set of three relations, COLLEGE, CONFERENCE, and CONDUCTED_CONFERENCES, is considered as a running example. In the relation COLLEGE, tuples 1 and 2 are duplicated because of reasons such as typographic errors, missing values, and the lack of standard data representation procedures.
Both duplicate tuples 1 and 2 describe the same real-world entity. These two duplicates are identified and consequently replaced by one equivalent, correct tuple; finding and then replacing duplicate tuples with one correct real-world tuple is the fusion operation. The present study proposes a new fusion operation called union: it accepts a set of duplicate tuples and replaces them with one correct real-world tuple. The working principle of the union fusion function is shown below:
Union of College_Code = {3G} U {3G} = {3G}
Union of College_Name = {KMM} U {KMM} = {KMM}
Union of Principal_Name = {Rama} U {null} = {Rama}
Union of Affiliated_University = {null} U {JNT University} = {JNT University}
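A minimal sketch of this union fusion function, assuming duplicates are represented as Python dicts with None standing for null (the relation and attribute names follow the running example; the helper name union_fuse is hypothetical):

```python
def union_fuse(duplicates):
    """Fuse a set of duplicate tuples by taking, per attribute,
    the union of the non-null values observed across the duplicates."""
    fused = {}
    for attr in duplicates[0]:
        # Union of attribute values, ignoring nulls; first-seen order kept.
        values = []
        for t in duplicates:
            v = t[attr]
            if v is not None and v not in values:
                values.append(v)
        # A clean fusion yields exactly one value per attribute.
        fused[attr] = values[0] if len(values) == 1 else values
    return fused

t1 = {"College_Code": "3G", "College_Name": "KMM",
      "Principal_Name": "Rama", "Affiliated_University": None}
t2 = {"College_Code": "3G", "College_Name": "KMM",
      "Principal_Name": None, "Affiliated_University": "JNT University"}

print(union_fuse([t1, t2]))
# {'College_Code': '3G', 'College_Name': 'KMM',
#  'Principal_Name': 'Rama', 'Affiliated_University': 'JNT University'}
```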
In the COLLEGE relation, duplicate tuples 1 and 2 are replaced by a single tuple using the proposed union fusion function technique. The replacing function may be either partially preservative or completely preservative. A partially preservative function is defined as follows: there exists t ∈ DupSet such that t[A] = Rep(Dup)[A] for a set of attributes A. When A is a proper subset of the full attribute set, the replace function is called partially preservative; when A equals the full attribute set, it is called a completely (fully) preservative replace function.
For example, let A = {SNo, College_Code, College_Name, Principal_Name}. Here t[A] = Rep(Dup)[A] for tuple 1. However, for B = {Affiliated_University}, the replacement value {JNT University} comes from tuple 2, so no single duplicate agrees with the replacement on all attributes. In this example the replacing function is therefore partially preservative but not completely (fully) preservative, and duplicate tuples 1 and 2 in the COLLEGE relation are mapped to one correct real-world tuple.
In the COLLEGE relation, tuples 5, 6, and 7 are also duplicate tuples; this is an example of complete preservation. These three duplicate tuples are shown in Table-6 and are replaced by the single tuple shown in Table-7. Here t[all attributes] = Rep(Dup)[all attributes]. A completely preservative replacing function replaces a set of tuples with another equivalent and simplified set of tuples.
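The partial/complete distinction can be sketched as a small check; the dict representation and the helper name preservation_kind are assumptions made for illustration:

```python
def preservation_kind(duplicates, replacement):
    """'complete' if some duplicate equals the replacement on every
    attribute, 'partial' if some duplicate matches on at least one
    attribute, 'none' otherwise."""
    attrs = list(replacement)
    if any(all(t[a] == replacement[a] for a in attrs) for t in duplicates):
        return "complete"
    if any(any(t[a] == replacement[a] for a in attrs) for t in duplicates):
        return "partial"
    return "none"

t1 = {"College_Code": "3G", "Principal_Name": "Rama", "Affiliated_University": None}
t2 = {"College_Code": "3G", "Principal_Name": None, "Affiliated_University": "JNT University"}
rep = {"College_Code": "3G", "Principal_Name": "Rama", "Affiliated_University": "JNT University"}

# No single duplicate agrees with rep on all attributes, so this is partial.
print(preservation_kind([t1, t2], rep))   # partial
```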
Consider once again the relation COLLEGE with the functional dependency that holds on it, College_Code
→ {College_Name, Principal_Name, Affiliatedto}. This functional dependency states that when two tuples agree on the attribute College_Code, they also agree on all three attributes on the right-hand side. That is, if t1[College_Code] = t2[College_Code] then t1[College_Name, Principal_Name, Affiliatedto] = t2[College_Name, Principal_Name, Affiliatedto]. Therefore, duplicate tuples 1 and 2 in the COLLEGE relation are replaced by tuple 1 by applying the functional dependency constraint.
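A sketch of the functional dependency approach, under the assumption that tuples agreeing on College_Code conflict only through nulls (dicts with None as null; fd_dedupe is a hypothetical helper name):

```python
def fd_dedupe(rows, lhs, rhs):
    """Group tuples on the FD determinant `lhs`; emit one tuple per
    group, filling each `rhs` attribute from the first non-null value."""
    groups = {}
    for row in rows:
        groups.setdefault(row[lhs], []).append(row)
    result = []
    for key, dups in groups.items():
        merged = {lhs: key}
        for attr in rhs:
            merged[attr] = next((t[attr] for t in dups if t[attr] is not None), None)
        result.append(merged)
    return result

college = [
    {"College_Code": "3G", "College_Name": "KMM", "Principal_Name": "Rama", "Affiliatedto": None},
    {"College_Code": "3G", "College_Name": "KMM", "Principal_Name": None, "Affiliatedto": "JNT University"},
]
print(fd_dedupe(college, "College_Code", ["College_Name", "Principal_Name", "Affiliatedto"]))
# [{'College_Code': '3G', 'College_Name': 'KMM',
#   'Principal_Name': 'Rama', 'Affiliatedto': 'JNT University'}]
```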
Sometimes it may be necessary to take multiple sets of tuples and map them into a single set of tuples. We use the union operation to map multiple sets of tuples into one; this union operation generalizes many database operations such as delete, cascading, and referential integrity enforcement. The first step is to find and replace duplicate tuples; the second step removes the problems that arise after the replacement. The first step is performed using the union of attribute values, and the second by applying the union of sets of tuples. The relation CONDUCTED_CONFERENCES contains nine tuples in total. In this relation, all foreign key values equal to 2 are replaced by the value 1; this follows from the first type of partial preservative function, or from the functional dependency rule. Using the complete (full) preservative function, foreign key values 5, 6, and 7 in the relation CONDUCTED_CONFERENCES are replaced by 5.
Table-9 Tuples showing the CONDUCTED_CONFERENCE relation after tuple propagation
Table-10 Tuples showing the CONDUCTED_CONFERENCE relation after tuple propagation
with respect to the replacement tuple1
Table-11 Tuple of the CONDUCTED_CONFERENCE relation that is not modified, since tuple 3 is not duplicated in the COLLEGE relation
Table-12 Tuples showing the CONDUCTED_CONFERENCE relation after tuple propagation with respect to
the replacement tuple 5
Table-13 Tuples showing the CONDUCTED_CONFERENCE relation after removing all the duplicate tuples
The first proposed method takes the union among the attributes. The second proposed method takes the union among the tuples. The third, the functional dependency method, is a more generalized version of the first two methods, and it also takes care of partial and complete preservative functions.
III. ALGORITHM
Let R be a relation of tuples and denote the set of duplicate tuples by delta, so delta ⊆ R. Let R′ be the child relation corresponding to the parent relation R. The algorithm is executed in two steps. In the first step duplicate records are identified; in the second step the identified duplicate records are replaced with the correct real-world records, and these changes are propagated to the dependent (referencing) relations in a semantically correct way, in addition to preserving the syntactic correctness of the relations with respect to database operations such as insert, delete, and update.
Assume that the sample parent relation is R = COLLEGE and the dependent child relation of the parent relation is R′ = CONDUCTED_CONFERENCES. Also assume tuples t ∈ delta ⊆ R and tuples t* ∈ R′. The relationship between the parent and child relations is one-to-many from COLLEGE to CONDUCTED_CONFERENCES.
Table-9: CONDUCTED_CONFERENCES after tuple propagation
SNo  Conference_Id  Numberof_days  Start_Date
1    Conference1    3              18/6/2009
1    Conference2    1              10/12/2006
1    Conference3    4              3/9/2012
1    Conference1    3              18/6/2009
1    Conference2    1              10/12/2006
3    Conference1    6              3/5/2013
5    Conference1    4              29/12/2010
5    Conference1    5              29/12/2010
5    Conference1    6              29/12/2010

Table-10: tuples propagated with respect to replacement tuple 1
SNo  Conference_Id  Numberof_days  Start_Date
1    Conference1    3              18/6/2009
1    Conference2    1              10/12/2006
1    Conference3    4              3/9/2012
1    Conference1    3              18/6/2009
1    Conference2    1              10/12/2006

Table-11: tuple not modified
SNo  Conference_Id  Numberof_days  Start_Date
3    Conference1    6              3/5/2013

Table-12: tuples propagated with respect to replacement tuple 5
SNo  Conference_Id  Numberof_days  Start_Date
5    Conference1    4              29/12/2010
5    Conference1    5              29/12/2010
5    Conference1    6              29/12/2010

Table-13: CONDUCTED_CONFERENCES after removing all duplicate tuples
SNo  Conference_Id  Numberof_days  Start_Date
1    Conference1    3              18/6/2009
1    Conference2    1              10/12/2006
1    Conference3    4              3/9/2012
3    Conference1    6              3/5/2013
5    Conference1    4              29/12/2010
In the COLLEGE relation, tuples 1 and 2 are duplicated, and this type of duplication is removed using the union operation among the attributes. Tuples 5, 6, and 7 are also duplicated; this type of duplication is removed by taking the union operation among the tuples rather than among the attributes. In the second case, sets of duplicate records are identified and then replaced with one or more sets of real-world, correct records.
INPUT: Relations with duplicated tuples
OUTPUT: Relations with duplicate tuples removed
1. For each tuple t ∈ delta do
2.   In the relation R′, find the set of tuples whose foreign key matches the primary key of tuple t in R.
3.   Let St be the set of such tuples.
4.   For all t* ∈ St, replace the foreign key values in R′ with the primary key of the replacement tuple t ∈ R.
     End for
   End for
5. For each set St, find the projected set of tuples based on their primary key. End for
6. Apply the union operation to all the sets St.
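The steps above can be sketched in Python. The in-memory representation (lists of dicts), the surviving-key convention (the first key of each duplicate group survives), and the helper name fuse_and_propagate are illustrative assumptions; rows that still disagree on non-key attributes after propagation would additionally need the union fusion step described earlier:

```python
def fuse_and_propagate(parent, child, dup_groups, pk="SNo", fk="SNo"):
    """For each group of duplicate primary keys in `parent`, keep the
    first key as the surviving key, rewrite matching foreign keys in
    `child` (steps 1-4), then drop exact duplicate rows (steps 5-6)."""
    remap = {}
    for group in dup_groups:          # e.g. [[1, 2], [5, 6, 7]]
        survivor = group[0]
        for k in group[1:]:
            remap[k] = survivor
    # Steps 1-4: rewrite foreign keys in the child relation.
    for row in child:
        row[fk] = remap.get(row[fk], row[fk])
    # Steps 5-6: union -- keep only the first occurrence of each row.
    def distinct(rows):
        seen, out = set(), []
        for r in rows:
            key = tuple(sorted(r.items()))
            if key not in seen:
                seen.add(key)
                out.append(r)
        return out
    parent = [r for r in distinct(parent) if r[pk] not in remap]
    return parent, distinct(child)

parent = [{"SNo": 1, "College_Name": "KMM"}, {"SNo": 2, "College_Name": "KMM"}]
child = [{"SNo": 1, "Conf": "Conference1"},
         {"SNo": 2, "Conf": "Conference1"},
         {"SNo": 2, "Conf": "Conference2"}]
new_parent, new_child = fuse_and_propagate(parent, child, [[1, 2]])
print(new_parent)   # one surviving COLLEGE tuple
print(new_child)    # remapped, duplicate-free CONDUCTED_CONFERENCES tuples
```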
IV. ANALYSIS OF RESULTS
The COLLEGE relation contains seven tuples. Tuples 1 and 2 are duplicated with respect to the partial preservative function; these two duplicated tuples are shown in Table-4 and are replaced with the one correct tuple shown in Table-5. Similarly, tuples 5, 6, and 7 in the COLLEGE relation are duplicated with respect to the full (complete) preservative function; these duplicated tuples are shown separately in Table-6 and replaced with the one correct tuple shown in Table-7.
The relation COLLEGE_CORRECTED is
shown in Table-8 after removal of duplicate tuples
with the replacement of correct tuples. The relation
CONDUCTED_CONFERENCE is updated based
on the updated details of the relation
COLLEGE_CORRECTED and modified
CONDUCTED_CONFERENCE relation is named as CONDUCTED_CONFERENCES_AFTER_PROPAGATION and is shown in Table-9. With respect to
duplicate tuples 1 and 2 in the COLLEGE relation,
modified tuples in the
CONDUCTED_CONFERENCE relation are shown
separately in the Table-10. Similarly duplicate tuples
5, 6, and 7 are replaced with tuple 5 and are shown
in Table-12 separately. Tuples 3 and 4 in the COLLEGE relation are not duplicated and remain as they are. Tuple 3 is shown in Table-11.
Finally, COLLEGE relation after removal of
duplicate tuples is shown in Table-8 and updated
CONDUCTED_CONFERENCE relation is shown
in Table-13.
V. CONCLUSIONS
When database records are duplicated, both the storage and the processing cost of the records are very high. Data duplication is common in many real-life applications; records are duplicated in relational databases for many reasons, such as the inclusion of null values, non-standard representations, and typographic errors. There is no standard method for identifying duplicated records in the relations of relational databases, and in the absence of such a method it is very difficult to find duplicate records. Hence, there is scope for formulating specific standard methods for duplicate record detection. Future research will aim at good algorithms to handle tuple duplication in real applications such as catalogs and networks.
REFERENCES
[1]. A. Bronselaer, D. Van Britsom, and G. De Tré, "Propagation of Data Fusion," IEEE Transactions on Knowledge and Data Engineering, vol. 27, no. 5, May 2015.
[2]. A. K. Elmagarmid, P. G. Ipeirotis, and V. S. Verykios, "Duplicate Record Detection: A Survey," IEEE Transactions on Knowledge and Data Engineering, vol. 19, no. 1, pp. 1-16, Jan. 2007.
[3]. F. Naumann, A. Bilke, J. Bleiholder, and M. Weis, "Data Fusion in Three Steps."
[4]. A. Bronselaer, M. Szymczak, S. Zadrożny, and G. De Tré, "Dynamical order construction in data fusion," Information Fusion, vol. 27, pp. 1-18, Jan. 2016, Elsevier Science.
[5]. I. Bhattacharya and L. Getoor, "Collective entity resolution in relational data," ACM Transactions on Knowledge Discovery from Data, vol. 1, 2007.
[6]. A. Sitas and S. Kapidakis, "Duplicate detection algorithms of bibliographic descriptions," Library Hi Tech, vol. 26, no. 2, pp. 287-301, 2008, Emerald Group Publishing, DOI 10.1108/07378830810880379.