A model for profit pattern mining based on genetic algorithm
eSAT Journals
Abstract
Mining profit-oriented patterns is a novel technique within association rule mining that focuses on issues central to business. Every business aims to generate profit and to find ways to improve it. Association rule mining was originally used for market basket analysis and addressed only some business and commercial aspects. Researchers later turned to the most prominent element of any business, profit, and devised ways to generate association rules based on it. The profit-oriented pattern mining approach combines statistics-based pattern mining with value-based decision making to generate the patterns with maximum profit, along with recommenders for future strategy. Traditional association rule mining alone is not effective for this goal, so we combine it with a genetic algorithm to enhance its capability. The study shows that the genetic algorithm improves the effectiveness and efficiency of association rule mining, since genetic algorithms can handle problems involving uncertainty, multi-dimensional, non-differentiable, non-continuous, non-parametric and non-linear constraints, and multi-objective optimization. In this paper we apply profit pattern mining with a genetic algorithm to generate profit-oriented patterns that support future business expansion and help fulfill business objectives.
Keywords: Data Mining, Association Rule Mining, Profit Pattern Mining, Genetic Algorithm
One of the most important problems in modern finance is finding efficient ways to summarize and visualize
stock market data so that individuals and institutions obtain useful information about market behavior for
investment decisions. Investment can be considered one of the fundamental pillars of a national
economy. Many investors therefore look for criteria to compare stocks and
select the best ones, and choose strategies that maximize the earning value of the investment
process. The enormous amount of valuable data generated by the stock market has attracted
researchers to explore this problem domain using different methodologies. Research in data
mining has gained considerable attention due to the importance of its applications and the increasing generation
of information. Data mining tools such as association rules, rule induction and the Apriori algorithm
are used to find associations between different scrips of the stock market, and much
research and development has addressed the reasons for fluctuations in the Indian stock exchange.
Nowadays two important factors, gold prices and US dollar prices, increasingly
dominate the Indian stock market. Statistical correlation is used to find the relationship between gold prices, dollar prices and
the BSE index, which helps the activities of stock operators, brokers, investors
and jobbers. These activities are based on forecasting the fluctuation of index share prices, gold prices, dollar
prices and customer transactions. Hence the researcher has taken these problems as a topic for
research.
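The statistical correlation mentioned above can be illustrated with a short sketch. It assumes the Pearson coefficient is the measure intended, which the abstract does not state, and the function names `pearson` and `correlation_table` are hypothetical helpers introduced only for this illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def correlation_table(series):
    """Pairwise Pearson correlations for a dict of name -> series."""
    names = list(series)
    return {(a, b): pearson(series[a], series[b]) for a in names for b in names}
```

In use, one would pass aligned daily or monthly series, for example `correlation_table({"gold": gold, "usd": usd, "bse": bse})`, where the three lists are assumed to hold observations for the same dates. A value near +1 or -1 indicates a strong linear relationship, and a value near 0 indicates little linear relationship.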
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Frequent Item set Mining of Big Data for Social Media
IJERA Editor
Big data is a term for massive data sets with a large, varied and complex structure that are difficult to store, analyze and visualize for further processing or results. Big data includes data from email, documents, pictures, audio, video files, and other sources that do not fit into a relational database. This unstructured data brings enormous challenges. The process of examining massive amounts of data to reveal hidden patterns and correlations is called big data analytics. Big data implementations therefore need to be analyzed and executed as accurately as possible. The proposed model converts unstructured social media data into a structured form so that it can be queried efficiently using the Hadoop MapReduce framework. Big data mining is essential for extracting value from massive amounts of data, and MapReduce handles big data more efficiently than traditional techniques. The proposed combination of the Knuth-Morris-Pratt string matching algorithm and the K-Means clustering algorithm provides a platform for extracting value from massive amounts of data and for producing recommendations for users. Linguistic matching techniques such as Knuth-Morris-Pratt string matching are useful for returning accurate matches to a user query. The K-Means algorithm clusters data using the vector space model and can be an appropriate method for producing user recommendations.
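As a minimal sketch of the Knuth-Morris-Pratt matching step named in the abstract, the following shows the textbook algorithm on plain strings. It is not the paper's MapReduce implementation, and the function names `build_failure` and `kmp_search` are chosen here for illustration.

```python
def build_failure(pattern):
    """Failure table: fail[i] is the length of the longest proper prefix
    of pattern[:i + 1] that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text, pattern):
    """Return the start indices of all (possibly overlapping) matches."""
    if not pattern:
        return []
    fail = build_failure(pattern)
    matches = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]  # fall back without re-reading the text
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]  # continue searching for overlapping matches
    return matches
```

The text is scanned once and never re-read, so the search runs in time linear in the combined length of the text and the pattern, which is what makes it attractive for matching queries against large volumes of posts.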
Enhancement techniques for data warehouse staging area
IJDKP
Poor performance can turn a successful data warehousing project into a failure. Consequently, several
attempts have been made by various researchers to deal with the problem of scheduling the Extract-
Transform-Load (ETL) process. In this paper we present several approaches for
enhancing the data warehousing Extract, Transform and Load stages. We focus on enhancing the
performance of the extract and transform phases by proposing two algorithms that reduce the time needed in
each phase by exploiting the hidden semantic information in the data. Using the semantic
information, a large volume of useless data can be pruned at an early design stage. We also focus on the
problem of scheduling the execution of the ETL activities, with the goal of minimizing ETL execution time.
We explore this area by choosing three scheduling techniques for ETL. Finally, we
experimentally show their behavior in terms of execution time in the sales domain to understand the impact
of implementing any of them and choosing the one leading to maximum performance enhancement.
A SURVEY ON DATA MINING IN STEEL INDUSTRIES
IJCSES Journal
In industrial environments, huge amounts of data are generated and collected in databases and data warehouses from all involved areas, such as planning, process design, materials, assembly, production, quality, process control, scheduling, fault detection, shutdown, customer relationship management, and so on. Data mining has become a useful tool for knowledge acquisition in the industrial process of iron and steel making. With the rapid growth of data mining, various industries have started using it to search for hidden patterns, which can feed new knowledge back into the system and support new models that improve production quality, productivity, cost and maintenance. Continuous improvement of all steel production processes, avoiding quality deficiencies and thereby improving production yield, is an essential task of a steel producer. A zero-defect strategy is therefore popular today, and several quality assurance techniques are used to maintain it. The present report explains the methods of data mining and describes their application in the industrial environment, especially in the steel industry.
Recommendation system using bloom filter in mapreduce
IJDKP
Many clients use the Web to discover product details in the form of online reviews. The reviews are
provided by other clients and specialists. Recommender systems are an important response to the
information overload problem because they present users with more practical and personalized information.
Collaborative filtering methods are a vital component of recommender systems, as they generate high-quality
recommendations by leveraging the preferences of communities of similar users. Collaborative filtering
assumes that people with the same tastes choose the same items. Conventional collaborative
filtering systems suffer from sparse data and a lack of scalability. A new recommender system is
required to deal with the sparse data problem and produce high-quality recommendations in a large-scale
mobile environment. MapReduce is a programming model that is widely used for large-scale data
analysis. The described recommendation mechanism for mobile commerce is user-based
collaborative filtering using MapReduce, which reduces the scalability problem of conventional CF systems.
One of the essential operations in data analysis is the join. However, MapReduce is not very
efficient at executing joins, because it always processes all records in the datasets even when only a small
fraction of them is relevant to the join. This problem can be reduced by applying the
bloomjoin algorithm: Bloom filters are constructed and used to filter out redundant intermediate
records. The proposed algorithm using Bloom filters reduces the number of intermediate results and
improves join performance.
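The filtering idea behind a Bloom-filter join can be sketched in a single process as follows. This is an illustration under stated assumptions, not the paper's MapReduce job: the `BloomFilter` class, its sizing defaults and the `bloom_join` helper are hypothetical, and in a real MapReduce setting the filter built from the smaller dataset would typically be distributed to the mappers that scan the larger one.

```python
import hashlib


class BloomFilter:
    """Simple Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive several hash positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def bloom_join(small, large, num_bits=1024, num_hashes=3):
    """Join two lists of (key, value) pairs, discarding records of `large`
    whose key cannot occur in `small` before the actual join step."""
    bloom = BloomFilter(num_bits, num_hashes)
    index = {}
    for key, value in small:
        bloom.add(key)
        index.setdefault(key, []).append(value)

    joined, skipped = [], 0
    for key, value in large:
        if not bloom.might_contain(key):
            skipped += 1  # filtered out early: never becomes an intermediate record
            continue
        # A false positive passes the filter but finds no partner here.
        for left in index.get(key, []):
            joined.append((key, left, value))
    return joined, skipped
```

The `skipped` count stands in for the intermediate records that would otherwise be shuffled across the cluster. Because a Bloom filter can report false positives, the exact-match lookup after the filter is still required for correctness.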
A statistical data fusion technique in virtual data integration environment
IJDKP
Data fusion in the virtual data integration environment starts after detecting and clustering duplicated
records from the different integrated data sources. It refers to the process of selecting or fusing attribute
values from the clustered duplicates into a single record representing the real world object. In this paper, a
statistical technique for data fusion is introduced based on some probabilistic scores from both data
sources and clustered duplicates.
With the development of databases, the volume of stored data increases rapidly, and much important
information is hidden in these large amounts of data. If this information can be extracted from the
database, it can create a lot of profit for the organization. The question organizations are asking is how to extract
this value. The answer is data mining. There are many technologies available to data mining practitioners,
including artificial neural networks, genetic algorithms, fuzzy logic and decision trees. Many practitioners are
wary of neural networks due to their black-box nature, even though they have proven themselves in many
situations. This paper is an overview of artificial neural networks and questions their position as a
preferred tool of data mining practitioners.
Introduction to feature subset selection method
IJSRD
Data mining is a computational process for discovering patterns in large data sets. It has various important techniques, one of which is classification, which has recently received great attention in the database community. Classification can solve problems in fields such as medicine, industry, business and science. Particle Swarm Optimization (PSO) is an optimization technique based on social behaviour. Feature Selection (FS) involves finding a subset of prominent features to improve predictive accuracy and remove redundant features. Rough Set Theory (RST) is a mathematical tool that deals with uncertainty and vagueness in decision systems.
A new hybrid algorithm for business intelligence recommender system
IJNSA Journal
Business intelligence is a set of methods, processes and technologies that transform raw data into meaningful
and useful information. A recommender system is a business intelligence system that provides
knowledge to the active user for better decision making. Recommender systems apply data mining
techniques to the problem of making personalized recommendations for information. The growth in
the amount of information and the number of users in recent years poses challenges for recommender systems.
Collaborative, content-based, demographic and knowledge-based are four different types of recommender
systems. In this paper, a new hybrid algorithm is proposed for recommender systems which combines
knowledge-based techniques, user profiles and most-frequent-item mining to obtain intelligence.
6 ijaems sept-2015-6-a review of data security primitives in data mining
INFOGAIN PUBLICATION
This paper discusses various issues and security primitives in the area of data mining, such as Spatial Data Handling, Privacy Protection of data, Data Load Balancing and Resource Mining. A 5-stage review process was conducted for 30 research papers published between 1996 and 2013. After an exhaustive review, nine key issues were found, including Spatial Data Handling, Data Load Balancing, Resource Mining, Visual Data Mining, Data Clusters Mining, Privacy Preservation, Mining of gaps between business tools and patterns, and Mining of hidden complex patterns, which have been resolved and explained with proper methodologies. Several solution approaches are discussed in the 30 papers. This paper presents the outcome of the review as a set of findings grouped under the key issues. The findings include the algorithms and methodologies used by researchers, along with their strengths and weaknesses and the scope for future work in the area.
Multiple Minimum Support Implementations with Dynamic Matrix Apriori Algorith...
ijsrd.com
Data mining can be defined as the process of uncovering potentially useful hidden patterns in data. The discovery of interesting association relationships among large amounts of business transactions is vital for making appropriate business decisions. Association rule analysis is the task of discovering association rules that occur frequently in a given transaction data set, that is, finding relationships among sets of items (itemsets) in the database. It has two measures: support and confidence. Confidence is a measure of a rule's strength, while support corresponds to statistical significance. There are currently a variety of algorithms for discovering association rules. Some of these algorithms use a minimum support threshold to weed out uninteresting rules. Other algorithms look for highly correlated items, that is, rules with high confidence. Traditional association rule mining techniques employ predefined support and confidence values. However, specifying the minimum support of the mined rules in advance often leads to either too many or too few rules, which negatively impacts the performance of the overall system. This work proposes a way to efficiently mine association rules over dynamic databases using the Dynamic Matrix Apriori technique and Multiple Support Apriori (MSApriori). A modification of the Matrix Apriori algorithm to accommodate this is proposed. Experiments on a large set of databases have been conducted to validate the proposed framework. The results show a remarkable improvement in the overall performance of the system in terms of run time, the number of generated rules, and the number of frequent items used.
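The two measures described above can be made concrete with a small sketch. It uses the standard definitions, where support is the fraction of transactions containing an itemset and confidence is the support of the whole rule divided by the support of its antecedent. The helper `itemset_min_support` reflects the usual MSApriori convention of taking the lowest per-item threshold in the itemset, and is an assumption about how the multiple-support idea would apply rather than the paper's Dynamic Matrix Apriori code. All function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


def itemset_min_support(itemset, item_thresholds):
    """Per-itemset threshold under multiple minimum supports: the lowest
    minimum item support among the items (assumed MSApriori convention)."""
    return min(item_thresholds[item] for item in itemset)


def is_frequent(transactions, itemset, item_thresholds):
    """True if the itemset meets its own item-dependent threshold."""
    return support(transactions, itemset) >= itemset_min_support(
        itemset, item_thresholds
    )
```

With a single global threshold, a rare but valuable item is either lost (threshold too high) or buried among too many rules (threshold too low). Item-specific thresholds let rare items be kept without lowering the bar for common ones.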
Data mining is an important aspect of any business, and many management-level decisions are based on it. One such aspect is the association between different products on sale, i.e. the actual support of one product with respect to another. This concept is called association mining. Based on it, we define a process for estimating the sale of one product relative to another. We propose an association rule approach based on the concept of hardware support. In this approach we first maintain the database and compare it with a systolic array; a pruning process is then performed to filter the database and remove rarely used items. Finally, the data is indexed using a hashing technique and the decision is made in terms of support count. Krishan Rohilla | Shabnam Kumari | Reema, "Data Mining based on Hashing Technique", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-1 | Issue-4, June 2017, URL: http://www.ijtsrd.com/papers/ijtsrd82.pdf http://www.ijtsrd.com/computer-science/data-miining/82/data-mining-based-on-hashing-technique/krishan-rohilla
Abstract: Learning analytics by nature relies on computational information processing activities intended to extract from raw data interesting aspects that can be used to obtain insights into the behaviors of learners, the design of learning experiences, etc. There is a large variety of computational techniques that can be employed, all with interesting properties, but it is the interpretation of their results that really forms the core of the analytics process. As a rising subject, data mining and business intelligence are playing an increasingly important role in decision support in every walk of life. The Variance Rover System (VRS) focuses on large data sets obtained from online web visits, categorizing them into clusters according to some similarity, and on predicting customer behavior and selecting actions to influence that behavior to benefit the company, so that optimized and beneficial business expansion decisions can be taken. Keywords: Analytics, Business intelligence, Clustering, Data Mining, Standard K-means, Optimized K-means
IJRET: International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academicians, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
Data Mining of Project Management Data: An Analysis of Applied Research Studies.
Gurdal Ertek
Data collected and generated through and posterior to projects, such as data residing in project management software and post project review documents, can be a major source of actionable insights and competitive advantage. This paper presents a rigorous
methodological analysis of the applied research published in academic literature, on the application of data mining (DM) for project management (PM). The objective of the paper is to provide a comprehensive analysis and discussion of where and how data mining is applied for project management data and to provide practical insights for future research in the field.
https://dl.acm.org/citation.cfm?id=3176714
https://ertekprojects.com/ftp/papers/2017/ertek_et_al_2017_Data_Mining_of_Project_Management_Data.pdf
The International Journal of Engineering and Science (The IJES)
theijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
Recommendation system using bloom filter in mapreduceIJDKP
Many clients like to use the Web to discover product details in the form of online reviews. The reviews are
provided by other clients and specialists. Recommender systems provide an important response to the
information overload problem as it presents users more practical and personalized information facilities.
Collaborative filtering methods are vital component in recommender systems as they generate high-quality
recommendations by influencing the likings of society of similar users. The collaborative filtering method
has assumption that people having same tastes choose the same items. The conventional collaborative
filtering system has drawbacks as sparse data problem & lack of scalability. A new recommender system is
required to deal with the sparse data problem & produce high quality recommendations in large scale
mobile environment. MapReduce is a programming model which is widely used for large-scale data
analysis. The described algorithm of recommendation mechanism for mobile commerce is user based
collaborative filtering using MapReduce which reduces scalability problem in conventional CF system.
One of the essential operations for the data analysis is join operation. But MapReduce is not very
competent to execute the join operation as it always uses all records in the datasets where only small
fraction of datasets are applicable for the join operation. This problem can be reduced by applying
bloomjoin algorithm. The bloom filters are constructed and used to filter out redundant intermediate
records. The proposed algorithm using bloom filter will reduce the number of intermediate results and will
improve the join performance.
A statistical data fusion technique in virtual data integration environmentIJDKP
Data fusion in the virtual data integration environment starts after detecting and clustering duplicated
records from the different integrated data sources. It refers to the process of selecting or fusing attribute
values from the clustered duplicates into a single record representing the real world object. In this paper, a
statistical technique for data fusion is introduced based on some probabilistic scores from both data
sources and clustered duplicates
With the development of database, the data volume stored in database increases rapidly and in the large
amounts of data much important information is hidden. If the information can be extracted from the
database they will create a lot of profit for the organization. The question they are asking is how to extract
this value. The answer is data mining. There are many technologies available to data mining practitioners,
including Artificial Neural Networks, Genetics, Fuzzy logic and Decision Trees. Many practitioners are
wary of Neural Networks due to their black box nature, even though they have proven themselves in many
situations. This paper is an overview of artificial neural networks and questions their position as a
preferred tool by data mining practitioners.
Introduction to feature subset selection methodIJSRD
Data mining is a computational process for discovering patterns in large data sets. It has various important techniques, and one of them is classification, which has recently been receiving great attention in the database community. Classification techniques can solve several problems in different fields such as medicine, industry, business and science. Particle Swarm Optimization (PSO) is an optimization technique based on social behaviour. Feature Selection (FS) is a solution that involves finding a subset of prominent features to improve predictive accuracy and to remove redundant features. Rough Set Theory (RST) is a mathematical tool which deals with the uncertainty and vagueness of decision systems.
A new hybrid algorithm for business intelligence recommender systemIJNSA Journal
Business Intelligence is a set of methods, processes and technologies that transform raw data into meaningful
and useful information. A recommender system is one kind of business intelligence system that is used to provide
knowledge to the active user for better decision making. Recommender systems apply data mining
techniques to the problem of making personalized recommendations for information. The growth in
the amount of information and the number of users in recent years poses challenges for recommender systems.
Collaborative, content, demographic and knowledge-based are four different types of recommender
systems. In this paper, a new hybrid algorithm is proposed for recommender systems which combines
knowledge-based techniques, user profiles and most-frequent-item mining to obtain intelligence.
6 ijaems sept-2015-6-a review of data security primitives in data miningINFOGAIN PUBLICATION
This paper discusses various issues and security primitives, such as Spatial Data Handling, Privacy Protection of Data, Data Load Balancing and Resource Mining, in the area of data mining. A 5-stage review process was conducted for 30 research papers published between 1996 and 2013. After an exhaustive review process, nine key issues were found, “Spatial Data Handling, Data Load Balancing, Resource Mining, Visual Data Mining, Data Clusters Mining, Privacy Preservation, Mining of gaps between business tools & patterns, Mining of hidden complex patterns”, which have been resolved and explained with proper methodologies. Several solution approaches have been discussed in the 30 papers. This paper provides the outcome of the review in the form of various findings under the key issues. The findings include the algorithms and methodologies used by researchers, along with their strengths and weaknesses and the scope for future work in the area.
Multiple Minimum Support Implementations with Dynamic Matrix Apriori Algorith...ijsrd.com
Data mining can be defined as the process of uncovering hidden, potentially useful patterns in random data. The discovery of interesting association relationships among large amounts of business transactions is currently vital for making appropriate business decisions. Association rule analysis is the task of discovering association rules that occur frequently in a given transaction data set. Its task is to find certain relationships among a set of data (itemset) in the database. It has two measurements: support and confidence values. The confidence value is a measure of a rule’s strength, while the support value corresponds to statistical significance. There are currently a variety of algorithms to discover association rules. Some of these algorithms depend on the use of minimum support to weed out the uninteresting rules. Other algorithms look for highly correlated items, that is, rules with high confidence. Traditional association rule mining techniques employ predefined support and confidence values. However, specifying the minimum support value of the mined rules in advance often leads to either too many or too few rules, which negatively impacts the performance of the overall system. This work proposes a way to efficiently mine association rules over dynamic databases using the Dynamic Matrix Apriori technique and Multiple Support Apriori (MSApriori). A modification of the Matrix Apriori algorithm to accommodate this approach is proposed. Experiments on a large set of databases have been conducted to validate the proposed framework. The achieved results show that there is a remarkable improvement in the overall performance of the system in terms of run time, the number of generated rules, and the number of frequent items used.
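The two measurements named above can be made concrete with a small sketch. It uses the standard textbook definitions, not the Dynamic Matrix Apriori or MSApriori procedures of the abstract; the function names and the toy transactions are assumptions made for illustration. Support of an itemset is the fraction of transactions containing it, and confidence of a rule A -> B is support(A ∪ B) divided by support(A).

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    # A rule whose antecedent never occurs has no meaningful confidence.
    return both / base if base else 0.0


if __name__ == "__main__":
    # Hypothetical market-basket data.
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    print("support(bread, milk) =", support(baskets, {"bread", "milk"}))
    print("confidence(bread -> milk) =", confidence(baskets, {"bread"}, {"milk"}))
```

A miner with a single minimum support keeps only itemsets whose support reaches that threshold; approaches with multiple minimum supports assign a different threshold to each item instead.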
Data mining is an important aspect of any business. Most management-level decisions are based on the process of data mining. One such aspect is the association between different sale products, i.e. what is the actual support of one product with respect to another product. This concept is called association mining. According to this concept we define the process of estimating the sale of one product with respect to another product. We propose an association rule approach based on the concept of hardware support. In this approach we first maintain the database and compare it with a systolic array; after this, a pruning process is performed to filter the database and remove the rarely used items. Finally the data is indexed using a hashing technique and the decision is made in terms of support count. Krishan Rohilla | Shabnam Kumari | Reema "Data Mining based on Hashing Technique" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-1 | Issue-4 , June 2017, URL: http://www.ijtsrd.com/papers/ijtsrd82.pdf http://www.ijtsrd.com/computer-science/data-miining/82/data-mining-based-on-hashing-technique/krishan-rohilla
Abstract Learning Analytics by nature relies on computational information processing activities intended to extract from raw data some interesting aspects that can be used to obtain insights into the behaviors of learners, the design of learning experiences, etc. There is a large variety of computational techniques that can be employed, all with interesting properties, but it is the interpretation of their results that really forms the core of the analytics process. As a rising subject, data mining and business intelligence are playing an increasingly important role in the decision support activity of every walk of life. The Variance Rover System (VRS) mainly focuses on large data sets obtained from online web visits, categorizing them into clusters according to some similarity, and on the process of predicting customer behavior and selecting actions to influence that behavior to benefit the company, so as to take optimized and beneficial decisions about business expansion. Keywords: Analytics, Business intelligence, Clustering, Data Mining, Standard K-means, Optimized K-means
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academicians, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
Data Mining of Project Management Data: An Analysis of Applied Research Studies.Gurdal Ertek
Data collected and generated through and posterior to projects, such as data residing in project management software and post project review documents, can be a major source of actionable insights and competitive advantage. This paper presents a rigorous
methodological analysis of the applied research published in academic literature, on the application of data mining (DM) for project management (PM). The objective of the paper is to provide a comprehensive analysis and discussion of where and how data mining is applied for project management data and to provide practical insights for future research in the field.
https://dl.acm.org/citation.cfm?id=3176714
https://ertekprojects.com/ftp/papers/2017/ertek_et_al_2017_Data_Mining_of_Project_Management_Data.pdf
The International Journal of Engineering and Science (The IJES)theijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
We have concentrated on a range of strategies, methodologies, and distinct fields of research in this article, all of which are useful and relevant in the field of data mining technologies. As we all know, numerous multinational corporations and major corporations operate in various parts of the world. Each location of business may create significant amounts of data. Corporate decision-makers need access to all of these data sources in order to make strategic decisions.
McKinsey Global Institute Big data The next frontier for innova.docxandreecapon
McKinsey Global Institute
Big data: The next frontier for innovation, competition, and productivity 27
2. Big data techniques and technologies
A wide variety of techniques and technologies has been developed and adapted to aggregate, manipulate, analyze, and visualize big data. These techniques and technologies draw from several fields including statistics, computer science, applied mathematics, and economics. This means that an organization that intends to derive value from big data has to adopt a flexible, multidisciplinary approach. Some techniques and technologies were developed in a world with access to far smaller volumes and variety in data, but have been successfully adapted so that they are applicable to very large sets of more diverse data. Others have been developed more recently, specifically to capture value from big data. Some were developed by academics and others by companies, especially those with online business models predicated on analyzing big data.
This report concentrates on documenting the potential value that leveraging big data can create. It is not a detailed instruction manual on how to capture value, a task that requires highly specific customization to an organization’s context, strategy, and capabilities. However, we wanted to note some of the main techniques and technologies that can be applied to harness big data to clarify the way some
of the levers for the use of big data that we describe might work. These are not comprehensive lists—the story of big data is still being written; new methods and tools continue to be developed to solve new problems. To help interested readers find a particular technique or technology easily, we have arranged these lists alphabetically. Where we have used bold typefaces, we are illustrating the multiple interconnections between techniques and technologies. We also provide a brief selection of illustrative examples of visualization, a key tool for understanding very large-scale data and complex analyses in order to make better decisions.
TECHNIQUES FOR ANALYZING BIG DATA
There are many techniques that draw on disciplines such as statistics and computer science (particularly machine learning) that can be used to analyze datasets. In this section, we provide a list of some categories of techniques applicable across a range of industries. This list is by no means exhaustive. Indeed, researchers continue to develop new techniques and improve on existing ones, particularly in response to the need
to analyze new combinations of data. We note that not all of these techniques strictly require the use of big data—some of them can be applied effectively to smaller datasets (e.g., A/B testing, regression analysis). However, all of the techniques we list here can be applied to big data and, in general, larger and more diverse datasets can be used to generate more numerous and insightful results than smaller, less diverse ones.
A/B testing. A technique in which a control group is compa ...
PERFORMING DATA MINING IN (SRMS) THROUGH VERTICAL APPROACH WITH ASSOCIATION R...Editor IJMTER
This technique is used for efficient data mining in SRMS (Student Records
Management System) through a vertical approach with association rules in distributed databases. The
current leading technique is that of Kantarcioglu and Clifton [1]. In this system I deal with two
challenges: one is computing the union of private subsets that each of the interacting users
holds, and the other is testing the inclusion of an element held by one user in a subset held by another.
The existing system uses different techniques for data mining, such as the Apriori algorithm and the
Fast Distributed Mining (FDM) algorithm of Cheung et al. [2], which is an unsecured distributed
version of the Apriori algorithm. The proposed system offers enhanced privacy and data mining
using encryption techniques and association rules with the FP-Growth algorithm in a private
cloud (the system contains different files of subjects with respect to their branches). With these
techniques, the system is expected to be simpler and more efficient in terms of
communication cost and computational cost. They are also expected to reduce
execution time and code length, speed up data lookup and the extraction of hidden
predictive information from large databases, and increase the efficiency of the proposed system
by 20%.
Extract Business Process Performance using Data MiningIJERA Editor
This paper aims to analyze the performance of a business process using process mining. Performance is very important, especially in large systems. The process of repairing devices was used as a case study, and the Fuzzy Miner algorithm was used to analyze the performance of the process model.
We are living in a world with a vast amount of digital data, which is called big data, and the world is becoming more and more connected via the Internet of Things (IoT). The IoT has been a major influence on the big data landscape. The analysis of such big data takes business competition to the next level of innovation and productivity.
The Survey of Data Mining Applications And Feature Scope IJCSEIT Journal
In this paper we have focused on a variety of techniques, approaches and different areas of research which
are helpful and marked as important in the field of data mining technologies. As we are aware, many MNCs
and large organizations operate in different places in different countries. Each place of operation
may generate large volumes of data. Corporate decision makers require access to all such sources in order to
take strategic decisions. The data warehouse delivers significant business value by improving the
effectiveness of managerial decision-making. In an uncertain and highly competitive business
environment, the value of strategic information systems such as these is easily recognized; however, in
today’s business environment, efficiency or speed is not the only key to competitiveness. Such huge
amounts of data, on the order of terabytes to petabytes, have drastically changed the areas
of science and engineering. To analyze, manage and make decisions from such huge amounts of data,
we need techniques called data mining, which is transforming many fields. This paper presents a
number of applications of data mining and also focuses on the scope of data mining, which will be helpful in
further research.
DATA MINING IN EDUCATION : A REVIEW ON THE KNOWLEDGE DISCOVERY PERSPECTIVEIJDKP
Knowledge Discovery in Databases (KDD) is the process of finding knowledge in massive amounts of data, where
data mining is the core of this process. Data mining can be used to mine understandable, meaningful patterns from large databases, and these patterns may then be converted into knowledge. Data mining is the process of extracting the information and patterns derived by the KDD process, which helps in crucial decision-making. Data mining works with a data warehouse, and the whole process is divided into an action plan to be performed on the data: selection, transformation, mining and results interpretation. In this paper, we have reviewed the Knowledge Discovery perspective in data mining and consolidated different areas of data
mining, its techniques and methods.
Applying K-Means Clustering Algorithm to Discover Knowledge from Insurance Da...theijes
Data mining works to extract information not known in advance from enormous quantities of data, which can lead to knowledge. It provides information that helps to make good decisions. The effectiveness of data mining lies in providing access to knowledge, with the goal of discovering the hidden facts contained in databases through the use of multiple technologies. Clustering is organizing data into clusters or groups such that they have high intra-cluster similarity and low inter-cluster similarity. This paper deals with the K-means clustering algorithm, which groups data based on its characteristics and attributes, and performs the clustering by reducing the distances between the data and the cluster centers. This algorithm is applied using the open source tool WEKA, with the Insurance dataset as its input.
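The K-means procedure described above can be sketched in a few lines. This is a plain-Python illustration of the standard Lloyd iteration, not the WEKA implementation used in the paper; the function names, the random initialization and the toy points are assumptions. Each point is assigned to its nearest center, each center is moved to the mean of its assigned points, and the two steps repeat until the centers stop changing.

```python
import random


def assign(points, centers):
    """Group points by the index of their nearest center (squared Euclidean)."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(
            range(len(centers)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
        )
        clusters[nearest].append(p)
    return clusters


def kmeans(points, k, iterations=20, seed=0):
    """Return (centers, clusters) for a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumed initialization: k random points
    for _ in range(iterations):
        clusters = assign(points, centers)
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                # Move the center to the coordinate-wise mean of its members.
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[i])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, clusters = kmeans(data, k=2)
    print(centers)
    print(clusters)
```

The result depends on the initial centers in general, which is why practical tools usually run several initializations or use a smarter seeding scheme.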
COMBINED MINING APPROACH TO GENERATE PATTERNS FOR COMPLEX DATAcscpconf
In data mining applications, which often involve complex data such as multiple heterogeneous data sources, user preferences, decision-making actions and business impacts, the complete useful information cannot be obtained in the form of informative patterns by using a single data mining method, as that would consume more time and space, if and only if it is possible to join large relevant data sources for discovering patterns consisting of various aspects of useful information. We consider combined mining as an approach for mining informative patterns
from multiple data sources, multiple features or multiple methods as per the requirements. In the combined mining approach, we applied the Lossy-counting algorithm on each data source to get the frequent itemsets and then obtained the combined association rules. In the multi-feature combined mining approach, we obtained pair patterns and cluster patterns and then generated incremental pair patterns and incremental cluster patterns, which cannot be directly generated by the existing methods. In the multi-method combined mining approach, we combined FP-growth and a Bayesian Belief Network to make a classifier to get more informative knowledge.
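For readers unfamiliar with lossy counting, the following is a minimal sketch of the textbook algorithm for single items in a stream. The abstract applies it to itemsets per data source, which this sketch does not attempt; the function names and the example stream are assumptions. The stream is cut into buckets of width ceil(1/epsilon), each tracked item stores a count plus the maximum count it may have missed, and low-count entries are pruned at every bucket boundary.

```python
import math


def lossy_count(stream, epsilon):
    """Approximate item counts; each count is underestimated by at most epsilon * n."""
    width = math.ceil(1 / epsilon)  # bucket width
    counts = {}  # item -> [observed frequency, maximum possible undercount]
    n = 0
    for item in stream:
        n += 1
        bucket = math.ceil(n / width)  # id of the current bucket
        if item in counts:
            counts[item][0] += 1
        else:
            counts[item] = [1, bucket - 1]
        if n % width == 0:
            # Bucket boundary: drop entries that cannot be frequent.
            counts = {k: v for k, v in counts.items() if v[0] + v[1] > bucket}
    return counts, n


def frequent_items(counts, n, min_support, epsilon):
    """Items whose true frequency may reach min_support * n."""
    return {k for k, (freq, _) in counts.items() if freq >= (min_support - epsilon) * n}


if __name__ == "__main__":
    # Hypothetical stream: two heavy items followed by twenty one-off items.
    stream = ["a"] * 50 + ["b"] * 30 + [f"rare{i}" for i in range(20)]
    counts, n = lossy_count(stream, epsilon=0.1)
    print(frequent_items(counts, n, min_support=0.2, epsilon=0.1))
```

The memory used stays bounded because rare items are discarded as the stream advances, at the cost of a controlled counting error.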
The International Journal of Engineering and Science
1. The International Journal of Engineering
And Science (IJES)
||Volume|| 1 ||Issue|| 1 ||Pages|| 95-100 ||2012||
ISSN: 2319 – 1813 ISBN: 2319 – 1805
Actionable Knowledge Discovery using Multi-Step Mining
¹DharaniK, ²Kalpana Gudikandula
¹Department of CS, JNTU H, DRK College of Engineering and Technology
Hyderabad, Andhra Pradesh, India
²Department of IT, JNTU H, DRK Institute of Science and Technology
Hyderabad, Andhra Pradesh, India
--------------------------------------------------------Abstract-------------------------------------------------------------------
Data mining is a process of obtaining trends or patterns in historical data. Such trends form business intelligence
that in turn leads to well-informed decisions. However, data mining with a single technique does not yield
actionable knowledge. This is because enterprises have huge databases that are heterogeneous in nature. They also
have complex data, and mining such data needs multi-step mining instead of single-step mining. When multiple
approaches are involved, they provide business intelligence in all aspects. That kind of information can lead to
actionable knowledge. Recently data mining has seen tremendous usage in the real world. The drawback of
existing approaches is that they yield insufficient business intelligence in the case of huge enterprises. This paper presents a
combination of existing works and algorithms. We work on multiple data sources, multiple methods and
multiple features. The combined patterns thus obtained from complex business data provide actionable
knowledge. A prototype application has been built to test the efficiency of the proposed framework, which
combines multiple data sources, multiple methods and multiple features in the mining process. The empirical results
revealed that the proposed approach is effective and can be used in the real world.
Index Terms: Data mining, actionable knowledge discovery, multi-method mining, multi-feature mining,
multi-source mining
-------------------------------------------------------------------------------------------------------------------------------- -----
Date of Submission: 19, November, 2012 Date of Publication: 30, November 2012
--------------------------------------------------------------------------------------------------------------------------------------
1. Introduction
Data mining at enterprise level operates on huge amounts of data from sources such as government transactions,
banks, insurance companies and so on. Inevitably, these businesses produce complex data that might be
distributed in nature. When mining is performed on such data in a single step, it produces business intelligence on
one particular aspect. However, this is not sufficient in an enterprise, where different aspects and standpoints are to be
considered before taking business decisions. Enterprises are required to perform mining based on multiple
features, data sources and methods. This is known as combined mining. Combined mining can produce
patterns that reflect all aspects of the enterprise. Thus the derived intelligence can be used to take business
decisions that lead to profits. This kind of knowledge is known as actionable knowledge. Actionable
knowledge is discovered through multiple data sources, multiple methods and multiple features. The intelligence
thus obtained is dependable and reliable. As businesses accumulate increasingly complex data, there is a need for mining
on complex data. However, it is challenging to discover actionable knowledge using complex data sources. This
is because the generated business intelligence must be comprehensive and provide enough
knowledge to take enterprise business decisions. There are many traditional methods being used in data mining.
It is very challenging to combine them and use them to generate actionable knowledge. The
challenges in the multi-mining process can be categorized into multiple data sources, multiple methods, multiple
features, post-analysis and mining, joining multiple relational databases, and data sampling. Data sampling
is generally not acceptable in real-world data mining applications. Due to space and time limits, combining
multiple tables or joining them may not be possible. As data mining methods are developed for various data
sources with certain assumptions in mind, it is challenging to combine them for discovering actionable
knowledge in real-time applications. The concepts of combined association rules, combined rule clusters and combined rule
pairs are proposed in [1], [22] and [25]. These papers presented mining on complex data with respect
to multiple data sets and obtaining comprehensive knowledge. A combined association rule is nothing but a set of
www.theijes.com The IJES Page 95
2. Actionable Knowledge Discovery Using Multi-Step…
multiple item sets that are heterogeneous in nature. Combined rule pairs are derived only from combined
association rules, by combining rule clusters. Traditional algorithms such as FPGrowth [15] cannot be used to
derive combined association rules. This paper aims at presenting a new comprehensive framework for combined
mining. In fact, this paper makes use of existing methods or techniques as part of the framework. It therefore
integrates multi-source combined mining, multi-method combined mining, and multi-feature combined mining.
Multiple features might include customer demographics, behavior, business impacts and transactional data.
Multiple methods might include clustering, classification and so on. Multiple data sources mean that the mining
process takes data from multiple related data sources. The deliverables of the proposed framework are combined
patterns or combined association rules. The following are the general aspects of combined mining.
Combined patterns generated by using various features can closely reflect the characteristics of the enterprise,
and such business intelligence is reliable.
Combined mining with respect to multiple data sources can give patterns that cover many aspects of the
enterprise, as data is taken from different data sources.
Mining with multiple methods results in patterns that reflect the in-depth nature of the data, and this kind of
combined mining retains the advantages provided by the various mining algorithms.
Multiple interestingness metrics can be applied to the generated patterns to verify their significance.
Instead of running one specific algorithm and getting results pertaining to that algorithm alone, it is a good
idea to combine algorithms and obtain more accurate and actionable knowledge.
The contributions of this paper include the concept of combined mining, which is based on existing
works. The paper discusses the framework and the various kinds of combined mining, namely multi-source combined
mining, multi-method combined mining, and multi-feature combined mining. It provides various techniques for
pattern interaction and novel patterns such as clustered patterns and combined patterns. Interestingness metrics
are evaluated. Finally, it demonstrates the practical discovery of combined patterns, or actionable knowledge,
from real-world businesses such as banks.
2. Related Work
Mining is a process of extracting trends or patterns from historical data. These trends or patterns can
provide business intelligence that leads to actionable knowledge. Many data mining methods or algorithms exist
for mining data to get patterns. However, the existing algorithms are typically single-step mining algorithms.
This means that they provide inadequate business intelligence: they may not be able to reflect the complex needs
of an enterprise that has to take decisions correctly. When multiple data mining techniques are combined, it is
possible to get actionable knowledge that can cater to the needs of an enterprise. In this paper the combined
mining algorithms of [1] have been implemented in a prototype application that demonstrates the efficiency of
combined mining. The combined actionable knowledge cannot be provided by existing algorithms such as
FPGrowth [15]. The existing works on data mining over complex or enterprise data are generally of different
types, such as direct mining approaches; post mining of patterns; data sets with extra features; integration of
multiple methods; and joining of multiple relational tables. Harmony [19] proposed an approach to mine for
discriminative patterns. Other such work includes contrast patterns [5] and the model-based search tree [6].
These algorithms attempted to use multiple features or mining techniques. Combined mining is best used to provide
actionable knowledge in spite of complex data sets and features. There are four categories of combined mining
approaches in the literature. A commonly used approach [28] is the post mining or post analysis of obtained
patterns. It is best used to prune the rules obtained after mining a database, to reduce redundancy, or to
summarize the patterns obtained [12]. The combined mining proposed in [1] contains direct mining methods. In [1],
multisource combined mining, multi-method combined mining and multi-feature combined mining were
introduced. These algorithms are used in this paper and implemented in the prototype application that is used to
demonstrate the efficiency of combined mining. In these approaches, post mining of patterns is also used
when necessary. Multisource combined mining considers data of various natures. That is, it combines multiple
data sets as required by an enterprise that has data of different kinds. The result of
multisource data mining is actionable knowledge that reflects different angles of an enterprise.
Multi-feature data mining is used when the data has multiple features. This kind of mining also leads to
actionable knowledge that reflects the complex needs of an enterprise. For instance, the prototype application
in this paper has been implemented using banking data containing customers’ demographic data and transactional
data. Multi-method combined mining mixes many existing data mining techniques, each of which is useful
independently for a particular purpose only. However, in multi-method combined
mining, various mining techniques are combined to obtain actionable knowledge. For instance, Apriori and ID3
can be combined. The Apriori algorithm provides association rules, while ID3 builds a decision
tree. The combination of these two can provide actionable knowledge that helps in taking well-informed
decisions. Other existing data mining methods that can be used in combination are clustering, rarity
mining, regression, association rule mining, sequence classification [15], rule-based mining [7], etc. With the
help of combined mining, sequential and cluster patterns can be used further. That is, the result of one
method can be used in another method, and the process can be repeated if required. It is also possible to have a
chain of mining operations until the desired actionable knowledge is derived.
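The idea of feeding the result of one method into another can be sketched in Python as follows. This is only an illustration under stated assumptions: the age-band grouping is a simple stand-in for a real clustering method, and the names `cluster_by_age_band`, `frequent_items` and `chained_mining`, as well as the toy data, are hypothetical and do not come from the prototype described in this paper.

```python
from collections import Counter, defaultdict


def cluster_by_age_band(ages, cutoff=35):
    """Step 1, a stand-in for a real clustering method: group customer ids by age band."""
    clusters = defaultdict(list)
    for customer_id, age in ages.items():
        clusters["young" if age < cutoff else "senior"].append(customer_id)
    return dict(clusters)


def frequent_items(baskets, min_support):
    """Step 2: items whose relative frequency in the given baskets reaches min_support."""
    if not baskets:
        return set()
    counts = Counter(item for basket in baskets for item in set(basket))
    return {item for item, n in counts.items() if n / len(baskets) >= min_support}


def chained_mining(ages, baskets, min_support):
    """Feed the clusters from step 1 into step 2 and mine each cluster separately."""
    return {
        band: frequent_items([baskets[cid] for cid in ids], min_support)
        for band, ids in cluster_by_age_band(ages).items()
    }


# Toy data (hypothetical): customer ages and the products each customer uses.
ages = {"c1": 25, "c2": 30, "c3": 60, "c4": 65}
baskets = {"c1": ["loan", "card"], "c2": ["loan"], "c3": ["deposit"], "c4": ["deposit", "card"]}
patterns = chained_mining(ages, baskets, min_support=1.0)
print(patterns)
```

A longer chain would simply pass `patterns` on to a further step, for example a classifier trained per cluster.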
3. Proposed Mining Framework
The proposed framework combines existing algorithms or methods to integrate multiple features,
multiple data sources and multiple mining methods such as classification. Combined mining can thus be done with
multiple features, multiple data sources and multiple methods. The architecture of the proposed framework is
shown in Fig. 1; it is a common framework for all these combined mining approaches.
Fig. 1 – Architecture of proposed mining framework [1]
The common architecture for multi-feature mining, multi-method mining and multi-source mining is
described here. There are multiple mining approaches, each of which makes use of the given data source(s) and
performs one or more mining algorithms or methods. Each mining approach produces a set of patterns. All the
patterns obtained from the multiple mining approaches are merged together. This is done by a component known as
the pattern merger, a program that combines all pattern sets obtained from the various mining techniques. The
result of the pattern merging process is the final deliverables. The deliverables are actionable knowledge that
facilitates comprehensive decision making. This framework is useful to all enterprises where business data is
complex and business intelligence has to cover many aspects of the business. The framework is especially
suitable for domains like banking and insurance, where monetary transactions are stored and maintained. The
historical data has to be mined and the right decisions have to be made for all monetary matters, such as
issuing loans and other customer-centric services. The algorithms of [1] are used for the experiments made using
a prototype application. The ensuing sections present the algorithms and the experimental results.
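The paper does not list the code of the pattern merger, so the following Python sketch shows only one possible merging policy, under stated assumptions: each mining approach returns patterns of the form "antecedent implies target", and the merger conjoins one pattern from each approach whenever all of them agree on the target. The `Pattern` class, the `merge_patterns` function and the sample patterns are hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Pattern:
    antecedent: frozenset  # conditions such as "age=young"
    target: str            # predicted outcome such as "approve"
    source: str            # which mining approach produced the pattern


def merge_patterns(pattern_sets):
    """Combine one pattern from every set when all of them predict the same target."""
    merged = []
    for combo in product(*pattern_sets):
        targets = {p.target for p in combo}
        if len(targets) == 1:
            antecedent = frozenset().union(*(p.antecedent for p in combo))
            merged.append((antecedent, targets.pop()))
    return merged


# Hypothetical pattern sets from two mining approaches.
demographic = [Pattern(frozenset({"age=young"}), "approve", "demographics")]
transactional = [
    Pattern(frozenset({"balance=high"}), "approve", "transactions"),
    Pattern(frozenset({"overdraft=yes"}), "reject", "transactions"),
]
merged = merge_patterns([demographic, transactional])
print(merged)
```

Other policies are equally plausible, for example a plain union of all pattern sets followed by redundancy pruning; the choice depends on how the deliverables are to be used.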
4. Algorithms
This section describes the algorithms for multisource combined mining, multi-feature
combined mining and multi-method combined mining.
4.1 Multisource Combined Mining
The combination of multiple data sources (D): The combined pattern set P consists of multiple atomic
patterns identified in several data sources. By mining multiple data sources, combined patterns are generated
which reflect multiple aspects of nature across the business lines.
Fig. 2 – Algorithm for multisource combined mining
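Since the algorithm figure is not reproduced in this text, the following Python sketch gives a rough illustration of the multisource idea rather than the algorithm of [1]. It assumes that every data source is keyed by a shared customer id, mines frequent attribute-value items in each source separately (the atomic patterns), and then counts how often atomic patterns from all sources occur together for the same customer. The function names, the `min_count` threshold and the sample records are assumptions made for illustration.

```python
from collections import Counter


def mine_source(records, min_count):
    """Atomic patterns of one source: (attribute, value) items seen at least min_count times."""
    counts = Counter(item for row in records.values() for item in row.items())
    return {item for item, n in counts.items() if n >= min_count}


def multisource_combined(sources, min_count):
    """Mine each source on its own, then combine atomic patterns per shared customer id."""
    if not sources:
        return {}
    frequent = [mine_source(src, min_count) for src in sources]
    shared_ids = set.intersection(*(set(src) for src in sources))
    combined = Counter()
    for cid in shared_ids:
        parts = [
            frozenset(item for item in src[cid].items() if item in freq)
            for src, freq in zip(sources, frequent)
        ]
        if all(parts):  # every source must contribute at least one atomic pattern
            combined[frozenset().union(*parts)] += 1
    return {pattern: n for pattern, n in combined.items() if n >= min_count}


# Hypothetical data sources keyed by customer id.
demographics = {1: {"age": "young"}, 2: {"age": "young"}, 3: {"age": "senior"}}
transactions = {1: {"balance": "high"}, 2: {"balance": "high"}, 3: {"balance": "low"}}
result = multisource_combined([demographics, transactions], min_count=2)
print(result)
```

Mining each source separately avoids joining the full tables, which the introduction notes may not be possible under space and time limits.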
4.2 Multi-Feature Combined Mining
The combination of multiple features (F): The combined pattern set involves multiple features,
e.g., features of customer demographics and behavior. By involving multiple heterogeneous features, combined
patterns are generated which reflect multiple aspects of concerns and characteristics in businesses.
Fig. 3 – Algorithm for multi-feature combined mining
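As the algorithm figure is not available here, the multi-feature idea can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the algorithm of [1]: the attributes of one table are split into feature groups (for instance demographics and behavior), one attribute-value item is taken from each group, and the resulting rule toward a target is kept if its support and confidence reach the given thresholds. The function name, the thresholds and the sample rows are hypothetical.

```python
from itertools import product


def multifeature_rules(rows, feature_groups, target, min_support, min_conf):
    """Rules that take one (attribute, value) item from every feature group."""
    t_attr, t_val = target
    # Candidate items per feature group, sorted for a reproducible order.
    candidates = [
        sorted({(attr, row[attr]) for row in rows for attr in group})
        for group in feature_groups
    ]
    rules = []
    for combo in product(*candidates):
        matching = [r for r in rows if all(r[a] == v for a, v in combo)]
        if not matching:
            continue
        hits = sum(1 for r in matching if r[t_attr] == t_val)
        support = hits / len(rows)        # fraction of all rows satisfying the whole rule
        confidence = hits / len(matching)  # fraction of matching rows reaching the target
        if support >= min_support and confidence >= min_conf:
            rules.append((combo, support, confidence))
    return rules


# Hypothetical banking rows with one demographic and one behavioral feature.
rows = [
    {"age": "young", "activity": "high", "loan": "yes"},
    {"age": "young", "activity": "high", "loan": "yes"},
    {"age": "young", "activity": "low", "loan": "no"},
    {"age": "senior", "activity": "high", "loan": "no"},
]
rules = multifeature_rules(rows, [["age"], ["activity"]], ("loan", "yes"), 0.5, 0.9)
print(rules)
```

In this toy data neither `age=young` nor `activity=high` alone predicts the loan outcome with certainty, while their combination does, which is the kind of pattern that single-feature mining would miss.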
4.3 Multi-method Combined Mining
Multi-method combined mining is another approach to discover more informative knowledge in
complex data. Its focus is on combining multiple data mining algorithms as needed in order to generate more
informative knowledge. The combination of multiple methods (R): The patterns in the combined set reflect the
results mined by multiple data mining methods such as association mining and classification. By applying
multiple methods in pattern mining, combined patterns are generated which disclose a deep and comprehensive
essence of the data by taking advantage of the different methods. The algorithms used for multi-method combined
mining are the Apriori algorithm and the ID3 algorithm, both of which are well-known existing algorithms in the
data mining domain.
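The paper does not spell out how Apriori and ID3 are wired together, so the Python sketch below shows one plausible combination under stated assumptions: a compact level-wise Apriori finds the frequent itemsets, and the information gain criterion that ID3 uses to choose splits then ranks those itemsets by how well they separate the class labels. A full ID3 tree is not built here. The function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import log2


def apriori(transactions, min_support):
    """Level-wise Apriori: return {itemset: support} for all frequent itemsets."""
    tsets = [frozenset(t) for t in transactions]
    current = {frozenset([item]) for t in tsets for item in t}
    frequent, k = {}, 1
    while current:
        level = {}
        for cand in current:
            sup = sum(1 for t in tsets if cand <= t) / len(tsets)
            if sup >= min_support:
                level[cand] = sup
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep candidates whose (k-1)-subsets are all frequent.
        current = {c for c in joined
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent


def entropy(labels):
    n = len(labels)
    return -sum(labels.count(v) / n * log2(labels.count(v) / n) for v in set(labels))


def information_gain(flags, labels):
    """Gain from splitting labels on a boolean feature (the ID3 split criterion)."""
    gain = entropy(labels)
    for value in (True, False):
        subset = [lab for flag, lab in zip(flags, labels) if flag == value]
        if subset:
            gain -= len(subset) / len(labels) * entropy(subset)
    return gain


def rank_itemsets(transactions, labels, min_support):
    """Score each frequent itemset by the information gain of 'itemset is present'."""
    scored = []
    for itemset in apriori(transactions, min_support):
        flags = [itemset <= frozenset(t) for t in transactions]
        scored.append((information_gain(flags, labels), sorted(itemset)))
    return sorted(scored, reverse=True)


# Hypothetical data: products used by four customers and whether a loan was repaid.
transactions = [["salary", "card"], ["salary", "card"], ["salary"], ["card"]]
labels = ["yes", "yes", "no", "no"]
ranking = rank_itemsets(transactions, labels, min_support=0.5)
print(ranking[0])
```

In this toy data the pair of items ranks above either single item, which illustrates why passing Apriori's output to a classification criterion can surface patterns that neither method reports on its own.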
5. Experimental Results
The environment used to build the prototype application, which demonstrates the efficiency of the
combined mining algorithms described in the previous section, is Java Standard Edition (JSE 6.0), the NetBeans
IDE and Oracle 10g Express Edition. The hardware used is a PC with 2 GB RAM and a 2.9x GHz processor. The
dataset used for the experiments pertains to the banking domain.
Fig. 4 – Input Screen for synthetic dataset
The dataset used is related to the banking domain. The data mining makes use of customers’ demographic
and transactional data in order to find patterns and finally provide actionable knowledge.
Fig. 5 – Results of multisource combined mining
As can be seen in Fig. 5, (a) shows patterns while (b) shows actionable knowledge, that is, whether a loan can
be given to a given customer or not.
Fig. 6 – Dynamics of Incremental Cluster Patterns
Fig. 6 visualizes the dynamics of cluster patterns in terms of confidence, support,
contribution and impact. PLN, DOC, DOC, REA and IES represent a series of activities.
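For readers unfamiliar with these measures, the standard definitions of support and confidence, together with the related lift measure, can be sketched in Python as below. The contribution and impact measures are specific to combined patterns and are defined in [1]; they are not reproduced here. The function names and the toy transactions are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = frozenset(itemset)
    if not transactions:
        return 0.0
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of the whole rule divided by the support of its antecedent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), transactions) / base


def lift(antecedent, consequent, transactions):
    """Confidence of the rule relative to the baseline frequency of the consequent."""
    base = support(consequent, transactions)
    if base == 0:
        return 0.0
    return confidence(antecedent, consequent, transactions) / base


# Hypothetical transactions: one customer attribute and one loan outcome each.
tx = [
    frozenset({"young", "repaid"}),
    frozenset({"young", "default"}),
    frozenset({"senior", "repaid"}),
    frozenset({"senior", "repaid"}),
]
print(support({"senior"}, tx))
print(confidence({"senior"}, {"repaid"}, tx))
print(lift({"senior"}, {"repaid"}, tx))
```

A lift above 1 indicates that the antecedent makes the consequent more likely than its baseline frequency in the data.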
6. Conclusion
Enterprises are generally distributed in nature, with multiple databases, multiple servers and so on. The
data in such businesses is very complex. Often the applications exhibit multiple features and are also
heterogeneous in nature. For instance, the database might have details pertaining to business impact,
business appearance, service usage, behavior, preferences and demographics. Considering complex business
scenarios and the requirements of business intelligence, it is time consuming to run multiple single mining
methods, features or data sources separately. A single piece of information from a single mining approach may
not reflect the actual requirements of the enterprise. Therefore, we proposed a mining framework that combines
multiple data sources, multiple methods and multiple features in order to obtain multiple pattern sets and,
finally, the deliverables: actionable knowledge that helps in taking comprehensive and well-informed decisions.
Such combined mining frameworks yield actionable knowledge that can provide profits to the organizations that
use it. Thus the proposed framework can be used in domains like banking and insurance to take effective
decisions in monetary matters. The multiple data sources, features, and methods used in the proposed framework
are not entirely new. They are taken from existing works and customized to obtain the required framework for
combined mining. The data mining framework can be applied to many real-time solutions, and the experiments made
with the prototype application revealed that the framework is effective and able to provide actionable knowledge.
References
[1]. L. Cao, H. Zhang, Y. Zhao, D. Luo, and C. Zhang, “Combined mining: Discovering informative knowledge in
complex data,” IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 41, no. 3, Jun. 2011.
[2]. H. Zhang, Y. Zhao, L. Cao, and C. Zhang, “Combined association rule mining,” in Proc. PAKDD, 2008,
pp. 1069–1074.
[3]. Y. Zhao, H. Zhang, L. Cao, C. Zhang, and H. Bohlscheid, “Combined pattern mining: From learned rules
to actionable knowledge,” in Proc. AI, 2008, pp. 393–403.
[4]. J. Pei, J. Han, B. Mortazavi-Asl, J. Wang, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu, “Mining
sequential patterns by pattern-growth: The PrefixSpan approach,” IEEE Trans. Knowl. Data Eng., vol.
16, no. 11, pp. 1424–1440, Nov. 2004.
[5]. J. Wang and G. Karypis, “HARMONY: Efficiently mining the best rules for classification,” in Proc.
SDM, 2005, pp. 205–216.
[6]. G. Dong and J. Li, “Efficient mining of emerging patterns: Discovering trends and differences,” in Proc.
KDD, 1999, pp. 43–52.
[7]. W. Fan, K. Zhang, J. Gao, X. Yan, J. Han, P. Yu, and O. Verscheure, “Direct mining of discriminative
and essential graphical and itemset features via model-based search tree,” in Proc. KDD, 2008, pp. 230–238.
[8]. Y. Zhao, C. Zhang, and L. Cao, Eds., Post-Mining of Association Rules: Techniques for Effective
Knowledge Extraction. Hershey, PA: Inf. Sci. Ref., 2009.
[9]. B. Liu, W. Hsu, and Y. Ma, “Pruning and summarizing the discovered associations,” in Proc. KDD,
1999, pp. 125–134.
[10]. K. K. R. Hewawasam, K. Premaratne, and M.-L. Shyu, “Rule mining and classification in a situation
assessment application: A belief-theoretic approach for handling data imperfections,” IEEE Trans. Syst.,
Man, Cybern. B, Cybern., vol. 37, no. 6, pp. 1446–1459, Dec. 2007.
AUTHORS
K. Dharani is a student of DRK College of Engineering and Technology, Ranga Reddy, Andhra Pradesh, India,
and has received a B.Tech degree in Computer Science and Engineering and an M.Tech degree in Computer Science.
Main research interests include Software Engineering.
Kalpana Gudikandula is working as Associate Professor at DRK Institute of Science & Technology, Ranga Reddy,
Andhra Pradesh, India. She has received an M.Tech degree in Computer Science. Her main interests include
Cloud Computing and Software Engineering.